There’s only one person I know who doesn’t love chocolate and that’s Tom Pilgrem in The Data School. Granted, he does have a decent reason in that he’s allergic. So given this is Easter week, I provided a data set about chocolate preferences in the UK. Being an American living in the UK, I’ve observed that the Brits can be very snobbish about their chocolate and have a well-known disdain for American sweets. What they don’t realize is that by making it so obvious, they make it more fun to antagonize them, another one of my favorite pastimes.
The data provided this week was super simple: three age groups with candy bars ranked 1-10 in each age group. One of the things I hear from people quite often is how difficult it is to visualize and analyze “small” data. Isn’t that part of the beauty of Makeover Monday though? You have to learn to work with lots and lots of different shapes, sizes and subjects of data.
This week also featured the latest Makeover Monday Live. I visited the Netherlands Tableau User Group and 80+ people turned up to visualize the data in 45 minutes (that’s right, I didn’t give them the standard hour).
The bonus for me for hosting this event was getting to meet the great Klaus Schulte. Klaus is a regular contributor and never ceases to deliver super high quality content.
Klaus must have heard of my reputation for “volunteering” people to present, so when I made him go first, he wasn’t very surprised. I had asked him while he was working on his viz if he worked ahead and he told me no. Yet when he delivered his outstanding viz (which replicated the original), he confessed that he prepped ahead of time knowing I was going to make him present. Sneaky, sneaky!
If you are interested in having Eva or me help run a Makeover Monday Live at your TUG or company, give us a shout and we’ll see how we can help.
LESSON 1: RANKS SHOULDN’T BE USED TO MEASURE VARIANCE
During the NL TUG, one of the presenters made a really good point: ranks don’t really tell us anything about how much one group prefers one chocolate over another. Ranked data is ordinal, that is, the data is categorical and has a sequence. That’s it! That’s all ranked data can be used for, to show the order of the data points.
Let me explain this using Superstore. In this first example, I’m showing the rank of sales for each region across product categories.
If I add a box plot to show the spread of the data, I see that the statistical median (the place where the two colors meet in the box plot) within each category is identical. That is exactly what you would expect with ordinal data: every category contains the same set of ranks, so the spread of the ranks is identical too. In other words, you really have no idea what the true variance is between the regions.
If we look at the sales instead of the rank of sales, we start to gain an understanding of the variance across the regions.
By adding a box plot, we can now better see the variance across the regions within each category and also which category has the highest median.
Keep this in mind the next time you work with ranked data. You can’t make inferences about the variance in the data; all you can really definitively state is which is higher than the others, not how much higher.
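If you’d like to check this outside of Tableau, here’s a minimal Python sketch of the same idea. The sales figures are made up for illustration (they are not the actual Superstore values), and the `rank_desc` helper is just a name I’m using here, so treat it as a rough demonstration rather than a recreation of the charts above.

```python
from statistics import median, pvariance

# Hypothetical sales by category and region (illustrative numbers only,
# not the real Superstore figures).
sales = {
    "Furniture": {"Central": 160, "East": 210, "South": 120, "West": 250},
    "Office Supplies": {"Central": 140, "East": 190, "South": 110, "West": 200},
    "Technology": {"Central": 170, "East": 265, "South": 150, "West": 252},
}


def rank_desc(values_by_region):
    """Rank regions by sales, 1 = highest. Assumes there are no ties."""
    ordered = sorted(values_by_region, key=values_by_region.get, reverse=True)
    return {region: position for position, region in enumerate(ordered, start=1)}


# Ranks of the regions within each category.
ranks = {category: rank_desc(regions) for category, regions in sales.items()}

for category in sales:
    rank_values = list(ranks[category].values())
    sales_values = list(sales[category].values())
    print(
        category,
        "| median rank:", median(rank_values),
        "| rank variance:", pvariance(rank_values),
        "| median sales:", median(sales_values),
        "| sales variance:", pvariance(sales_values),
    )
```

With four regions and no ties, every category ends up with the ranks 1 through 4, so the median rank and the variance of the ranks come out the same for every category. The median and variance of the sales values differ from category to category, which is the information the ranks throw away.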
LESSON 2: USING EFFECTIVE TITLES
If you tune into Viz Review, you’ll notice that the first thing Eva and I look at is the title of the viz. This is what captures the attention of your audience and tells them what the viz is about. If you’re unsure what the title should be, consider putting it in the form of a question. You can always change it to a declarative statement later; however, starting with a question lets your audience know EXACTLY what they should get from the viz. The title can and should be used to inform the audience.
The added benefit of using a question is that it helps you decide what to include and exclude. If you include a viz that doesn’t answer or relate to the question, consider taking it out. If you’ve added color to your viz, does it enhance or distract from the title? Basically, the title serves as the framework and provides structure to your viz.
Consider this visualization by Andrew Herman:
Starting with a simple question informs the viewer about the content of the viz and what’s in it for them. It’s a simple title that frames the entire viz.