← Older posts

MySQL “EXPLAIN” changes after restart

Posted on by Mark

If you don’t work with MySQL you can stop reading, but if you’ve come here from Google and you’re experiencing a problem like the one described in the title, hopefully this will save you some time.

I was performance testing a new system at work the other day and was confused by some huge changes in performance that didn’t seem to be related to the config setting I was playing with. The test would run in 6 minutes, I’d tweak a setting and it would suddenly blow out to 13 minutes. I’d change the setting back and it would stubbornly stay at 13 minutes. I’d change something unrelated and it would drop back to 6 minutes. Come back an hour later and it would it be back at 13 minutes.

This is the last thing you want happening with this sort of testing because you can’t trust anything you think you have found. All the conclusions you think you reached are meaningless. Hours you’ve have been spent are possibly wasted and you’re probably going to have to waste a many more before you get anywhere.

Fortunately this story had a happy ending.

On the bad runs SHOW PROCESSLIST was showing a single query running for ~500 seconds. On the good runs this query was done in a second or two. The test I was doing involved running variants of the query (among many others) about 40 times. I initially suspected it was the query cache being on or off, then I thought it might be that if multiple instance of this query were running in parallel it might saturate the CPU or IO. iostat showed no disk activity, but the system runs on a mix of SPARC and Intel servers and the faster Intels seemed to have the problem more than the SPARCS which have lots of relatively slow processors. But nothing held up to scrutiny, the problem would come back or disappear just when I thought I had it pinned down.

I eventually fell back on using EXPLAIN to look at the execution plan for the query. I’d been shying away from this because being 2 weeks away from go-live I wanted the problem to be config and not code. But EXPLAIN did some strange things. When the query was slow EXPLAIN was showing one of the joins not using a key and doing a full scan of a table with 500k records. Other times it would decide to use a key and only have to look at between 700-1500 records. It seemed that restarting MySQL would make EXPLAIN recalculate the execution plan and almost randomly pick one of these two conclusions.

I typed the title of this post into Google and came across this post which gave me the clues I needed to solve the problem. It seems that by default InnoDB uses 8 random pages to estimate the best execution plan for a query and depending on the data it finds the plan can vary. Once it has chosen this data it will cache it until one of several events occur, most likely being a server restart or when ANALYZE TABLE is run.

Once I learned this I was fairly confident I’d found the culprit. Next step was to work out how to coax it into picking the right plan. The number of pages it uses to calculate the plan is controlled by the innodb_stats_sample_pages variable. The downside of setting it too high is that MySQL spends more CPU and potentially IO figuring out the optimal execution plan so I needed to find the lowest number that gave consistently provides the optimal execution plan.

The routine I used looked something like this, with multiple ANALYZE/EXPLAIN runs to check see if the results remained consistent:
mysql> SET GLOBAL innodb_stats_sample_pages = 800;
mysql> SHOW VARIABLES LIKE "innodb_stats_sample_pages";
mysql> ANALYZE TABLE TableName;
mysql> EXPLAIN ....
mysql> ANALYZE TABLE TableName;
mysql> EXPLAIN ....
mysql> ANALYZE TABLE TableName;
mysql> EXPLAIN ....
mysql> ANALYZE TABLE TableName;
mysql> EXPLAIN ....
mysql> ANALYZE TABLE TableName;
mysql> EXPLAIN ....

Each ANALYZE/EXPLAIN run gave the good execution plan, but the ANALYZE TABLE was (unsurprisingly) 100x slower than previously. Next I dropped it to 80 which also gave good execution plan results and was only 10x slower on the ANALYZE. Being fairly suspicious at this point I set it back to 8 and was relieved to get mixed results again. So now I was confident that the right number was between 8 & 80. By repeating this process I eventually found that 10 gave consistently good results. To be on the safe side I eventually bumped it up to 12 and got back to my testing.

Posted in Google Fodder, MySQL, Under The Hood | Leave a comment


A short story about Earned Value Management

Posted on by Mark

Here is a screencast of lightening presentation I gave at What Do You Know Sydney a few weeks ago.

It’s interesting and slightly mortifying to relive a public speaking gig. I have a tendency to focus excessively on what went wrong. My fifth word was “um”, I was pretty wooden, I managed to make a mess of the build up of a key slide (twice) and I completely forgot to deliver a the central line that I had be rehearsing over & over prior to stepping on stage (something about being surprised by how much work is left to do and how little time is left to do it). But it seems to be getting slightly less mortifying each time and I’m slowly learning to evaluate my performance impartially. This talk needed at extra 24 hours prep, but was ok considering.

It was a fun night with a packed house, cold beer, lots of friendly faces and some great talks.

Maxine has uploaded most of the talks that night and many more from the other events run around the country to http://www.youtube.com/user/webdirections?feature=watch. Some highlights for me are John Bristowe – Just Because It’s Ridiculous Doesn’t Mean It’s StoopidJoseph Gentle – 2D Physics with Chipmunk.jsx and Andy Clark – On the Grid. Carl Woodward also gave an excellent talk on using animation to aid usability and Jeremy Keith gave a hilarious talk which has changed the way I use headphones forever, but unfortunately neither are online (yet?).

Thanks John & Maxine for a great event.

Posted in Management, Talks, Uni | Leave a comment


Uni isn’t hard, it just takes a lot of time

Posted on by Mark

I recently started the second year of a Masters of Business and Technology at the Australian Graduate School of Management at the University of New South Wales.

Uni isn’t hard, it just takes a lot of time and organisation. If the material is interesting, you look forward to spending the time and the time passes quickly. If it’s dull then you just have to slog through it. Putting in the hours and staying on top of the work is a challenge but is doable. Making the work interesting, relevant and fun makes this less onerous. Good routine and ritual are a huge help.

This is what my routine looks like.

Reading

Every week you have to read about 100 pages, give or take. About half of this is course-specific material written by uni staff. The other half is readings (journal articles, extracts from books) they’ve picked out for you. Other people I know doing masters-level study have had to source most of the material for themselves – we’re encouraged to read outside of the material, especially when writing essays & reports – but 90% of the material is given to us.

The notes contain some reflection exercises which are worth doing as you go. I also highlight, annotate and scribble extensively as I read.

I find I can get through this by reading on the train every day (about 2 hours per week) and by finding another hour or two at other times.

Real World

Case studies for what you are learning are all around you if you pay attention. Books I read, links I find on Twitter, blogs I subscribe to, articles in the news, issues at work and so on all relate back to what I’m learning. Casting and recasting all this in light of the theory I’m learning helps the ideas stick.

Again when the material is relevant to my interests it is easier because suddenly the examples are all around you. Because I’m doing an MBT, not a straight MBA nearly everything I do and see online is relevant.

Class time

Each subject has an hour and a half of class time per week on a weeknight from 6-7:30pm. I always go to class & pay attention. It’s not a big time commitment, but it anchors everything else. Class size has varied from 10 (last semester) to 26 (this semester).

The AGSM is pushing hard to get us to take online classes instead of face-to-face. I wouldn’t be surprised if by the time I’m graduating face-to-face will be close to non-existent. I think this is a shame because the free wheeling discussion and debate of a good class is the highlight of my study week. I guess I’ll try an online class at some point but I doubt I’d get as much out of it and I loath the idea of spending so much time on Blackboard (more on this later).

In previous semesters the class has been mostly powerpoint slides following the format of the notes frequently interrupted by questions, comments, debate and anecdotes from students. If you hadn’t read all the notes, you could pick most of it up in the class.

This semester the lecturer guides a 90 minute conversation which may or may not loosely follow some slides which may or may not relate to some ideas from the notes. It is always relevant and entertaining, but it doesn’t cover your ass if you haven’t read the notes.

Unit Summaries

Exams are generally open book, but taking 1200-1500 pages of notes into an exam is pretty pointless. I need quick access to details and for this I write unit summaries.

Last semester one of the exam questions was to “discuss Tuckman’s theory in relation to your workplace” or something similar. I had no recollection of who Tuckman was and thought 25% of my exam mark was shot. Luckily I spotted a one-liner in a unit summary I had written: “Tuckman: Forming – Storming – Norming – Performing” this was enough make the link to stages of group development which I knew enough about to write a few hundred words and pick up most of the marks.

I use Google Docs for my unit summaries (and the rest of my uni work). I aim to have 2-4 pages for each unit (example unit). I also create a one or two page page matrix as a quick reference all the units (example matrix).

I don’t write the summaries in the same week as the unit. I usually start 3 or 4 weeks behind. I fall further behind when I am working on assignments. This is good for a couple of reasons – I’m having to recall what I’ve read a few weeks back and I’m able to reassess what is important in a unit based on what I’ve covered in the subsequent weeks. This means that at the end of the semester I’m 4-6 units behind, but having to write 6 unit summaries in the week before the exam is much less daunting than having to write all 12.

Unit summaries take an hour or two to write if I’ve highlighted all the bits I want to cover during my initial reading of the notes.

Assignments

There are usually two assignments per semester, an easy one and a hard one. Most have been individual but some are group based. I’d say they take 15-30 hours each. Some are essays, some are reports with very strict format requirements (like a Project Management Plan), others are more free form.

I always work in Google Docs for assignments. I start as early as I can considering they cover theory that is only covered in week prior to the due date. I start rough and iterate, dumping ideas in and then researching and refining them. I’m not the sort of person who can start at the introduction keep writing until I get to the conclusion (not sure if anyone is).

Google Docs is made for group assignments. If you are not using Google Docs to collaborate on documents and spreadsheets you are doing it wrong. The concurrent edits, version history and comment features are simple and just work.

I usually take the day before the assignment is due to pull it all together and make sure I have loads of time for the last minute stuff. I try to leave enough time to go for a surf after I hit submit, that way if I finish late I’m not missing anything major, but if I finish early I get a reward.

Exams

At the end of semester there is an exam. So far these have mostly been in the form of short and long essays, although I think the PM exam had a calculation in it.

All exams so far have been open-book so most of my exam prep is getting organised so that I walk in with all the right info in an easy-to-find format. This mostly means unit summaries, but also cheat-sheets with formulas or key quotes and ideas.

I also do practice questions if I can find them, usually my answers are just mind maps or bullet points. The main thing here is getting familiar with where to find answers and to sanity check the level of detail in the unit summaries. If detail is missing I go back and fill it in.

I try to take the day of my exam and the day before my exam off from work. I spend the day before studying and getting organised. The morning of the exam I try to go for a surf, double check I have everything I need, have a nice relaxed breakfast or lunch and get to the exam location early. No study on exam day – everything is focused on being relaxed (which for me means organised) and in a good frame of mind. After the exam I go for a beer with my classmates.

Blackboard

The UNSW uses Blackboard as its learning management system (LMS). This is an official information channel so as part of the MBT we have to login once per week to check that our lecturer hasn’t posted something important. We are also supposed to “chat” in the “coffee shop” and generally network and interact online. Some courses actually grade you on the quality of your interactions. As far as I can tell the online classes are all Blackboard based – you watch online video lectures and use the forums for “discussion”.

The user experience in Blackboard is horrible, the whole system is slow, the “community engagement” feature is the worst implementation of a forum that I have ever seen, the “blogging” feature is a slightly modified version of the forum. I could go on. The fact that something so bad can be widely considered to be a leader in the LMS field is a massive market failure.

I spend as little time as I can on Blackboard and loath the time I do spend there.

Overall

On average I’d say I spend 6-8 hours per week in a normal week. Around exams and assignments this can double or triple. I’m doing one subject per semester and 2 semesters per year. This means I get about 5 months off/7 months on per year.

At this rate it will take me 5 more years to finish the degree. I might take a summer semester here and there to speed things up, but I’m not in a rush. As long as I can continue to pace myself and spend the time I need I’m pretty sure I can manage to have a life and keep studying.

And in the meantime blogging will be a healthy form of procrastination.

Posted in Personal, Uni | 2 Comments


Flash drops 10%

Posted on by Mark

Steve Souders is a hero of mine. His tireless evangelism of a simple rule has had a massive impact on performance across the web over the past few years. He created YSlow, Page Speed, SpriteMe and several other tools, wrote a great book and then another, he provided the inspiration and concepts for a talk of mine and most recently he created the HTTP Archive. All this focused on one idea: how to make a faster web experience for end users. If you are making websites and you are not following his work, you are simply doing it wrong.

The HTTP Archive recently had it first birthday and this means it now has two significant data points to compare. There are loads of interesting take aways in Steve’s analysis of the results. However the thing that jumped out at me was that while the average page size had increased by 21%, the average size of Flash content had decreased by 10%. Apparently there wasn’t a major drop in the proportion of sites using Flash, just a drop in the volume of Flash content served by these sites.

This isn’t particularly surprising given the increase in mobile web access coupled with the end of Adobe’s support for Flash for mobile, not to mention the great strides being made in the non-Flash world. We were always going to see a drop. The real questions are how far and how fast?

I guess we’ll see next year.

Posted in Adobe, Industry, Trends | Leave a comment


Interacting with Government: 2011 – Communication technologies

Posted on by Mark

A few weeks ago AGIMO released Interacting with Government, the “sixth and final report into Australian’s use & satisfaction with e-government services”. Below are some highlights from the 90-odd page document. I’ve focused here on the respondents overall digital communication preferences. I may come back to the government specific numbers in another post.

Mobile use up from 18% to 38%  Mobile web use has more than doubled in twelve months. This is probably due to a spike people getting in Android based smart phones for the first time, however I would expect iPhones would also be well represented.

Use a tablet 12% This is a first time question so there is nothing to compare it to. I would have loved to have seen an iOS/Android breakdown here.

Internet phone calls up from 20% to 36% Skype has been around for a long time so it is surprising to see such a jump.

Email 84% Unsurprisingly email is still the most commonly used digital communication tool. However with 16% of people surveyed still not using email it is far from ubiquitous.

Social networking 47% This is a big jump from previous years but still less than half the population are using social networks. However making a couple of fairly safe assumptions you can deduce that close to 60% of people who use any form of online communications also use social networking.

RSS 40% I would have expected the use of RSS to be similar to the use of Twitter (i.e. < 20%).

69% of 65+ are online Use of digital comms decreases with age. 100% of 18-24 year olds using at least one of the technologies, compared with 88% of 55-64 year olds and only 69% people in the 65+ ago bracket. Median age for email users is 44, median age for social bookmarks it is 32.

Posted in Gov 2.0, Industry, Trends | Leave a comment


← Older posts