I signed up to Lynda.com today and watched a number of tutorials on data visualisation. They covered a lot of the basics initially, but they also opened my mind to what separates a good visualisation from a bad one. This blog examines one example of each, in my own opinion.
The first thing to note, though, is that a good visualisation tends to tackle three main attributes successfully:
A – Accurate – Data is visualised accurately as a whole.
S – Story – It tells a compelling story.
K – Knowledge – It’s interesting and you learn something from viewing the data.
There are plenty of examples of data visualisation on the internet, some basic, some more complex, and some more successful than others. When it comes to good visualisation, no matter how hard I looked I couldn’t find a better example than the one I’m going to feature below.
Figure 1 – Charles Minard – Napoleon’s 1812 Russian campaign (Tufte and Finley, 2002)
Charles Minard’s 1869 visualisation of Napoleon’s 1812 Russian campaign remains, even after more than 140 years, one of the most compelling and interesting examples of data visualisation in the public domain today.
Primarily, the visualisation shows the size of Napoleon’s army at different stages of the campaign. The passage of time is conveyed through geography, with the army’s progress drawn over the landscape between France and Moscow.
Figures along the route state the size of the army at each stage, and the band is drawn in beige while the army is on the offensive and in black while it retreats back to France.
Figure 2 – William Playfair – “Commercial and Political Atlas” (Playfair, 1786)
Minard’s inspiration may have come from other sources, but there are few examples of visualisation from before his time. He may well have been inspired by the work of William Playfair, often credited as the originator of visualising data. You can see elements of mapping over time, and the use of blocks of colour to show a narrative flow, in this work from 1786.
During the retreat, the temperature is recorded at various points along the journey. It plummets as the army retreats, reducing numbers that were already heavily depleted. The crossing of the Berezina river stands out: the air was already cold and the river colder still, and an already weakened army suffered great losses there.
What makes his data so compelling is that it details real human loss: a considerable number of deaths that might have been avoided had the campaign been more successful or better planned.
Figure 3 – William Playfair – time series graph of prices, wages, and ruling monarch over a 250-year period (Playfair, 1821)
Minard may also have found inspiration for combining multiple data streams in this 1821 work by Playfair, which sets a line graph, much like Minard’s temperature graph, beneath the key data, in this case the building-like bar chart.
The campaign was clearly expected to be a sweeping success, as suggested by two clear junctions early on where parts of the army were asked to break off from the main force and take no direct part in the advance.
Figure 4 – John Snow – Mapping Cholera, 1854 (Mappinglondon.co.uk, 2014)
Though less visually comparable, John Snow’s work to map the source of a cholera outbreak was groundbreaking: it made an impact on the world and changed the way people think. Its use of geography, in this case the streets of London rather than the larger geography of France and Europe, might have been an inspiration for Minard.
The flow of people shows considerable losses at almost every stage. Compare the original width of the band with its width at the end, where the advance and the retreat line up side by side: it is assumed that roughly 98% of the army perished during this campaign, which the figures put at 410,000 people.
Also interesting is that less than a quarter of the initial army made it to Moscow after leaving France: 420,000 at the start, 100,000 by the time they reached Moscow. The chart doesn’t appear to show the army suffering any great losses in Moscow itself. The two groups that branched off suffered as well. Of the 60,000 who split off in what looks like Poland, 30,000 didn’t survive, and of the 22,000 in an even earlier branch that went north, only 6,000 returned. Those 6,000 are among the final 10,000 survivors, meaning only 4,000 of the other 398,000 members of the army survived after heading east through Poland to Moscow. That is a compelling and huge loss of human life.
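The percentages above are easy to check. The short Python sketch below only redoes the arithmetic, using the rounded figures quoted in this post as I read them off the chart; the variable names are my own and nothing here is new data.

```python
# Figures as quoted in this post (my reading of Minard's chart, rounded).
START = 420_000           # army size at the start of the campaign
AT_MOSCOW = 100_000       # army size on reaching Moscow
SURVIVORS = 10_000        # army size at the end of the retreat
NORTH_BRANCH = 22_000     # early detachment that went north
NORTH_RETURNED = 6_000    # members of that detachment who returned


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


losses = START - SURVIVORS                    # total losses over the campaign
loss_rate = share(losses, START)              # losses as a share of the starting army
reached_moscow = share(AT_MOSCOW, START)      # share of the army that reached Moscow
main_force = START - NORTH_BRANCH             # everyone who did not go north
main_survivors = SURVIVORS - NORTH_RETURNED   # survivors not from the north branch

print(f"Losses: {losses:,} ({loss_rate:.1f}% of the starting army)")
print(f"Reached Moscow: {reached_moscow:.1f}% of the starting army")
print(f"Main force survivors: {main_survivors:,} of {main_force:,}")
```

Run as written, this gives losses of 410,000, which is 97.6% and so rounds to the 98% quoted, with 23.8% of the army reaching Moscow and 4,000 survivors out of the 398,000 who did not go north.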
Figure 5 – Charles Minard – Map of Port and River Tonnage (Joseph Minard, 1855)
Minard’s other works also tend to be built around geography, such as France in his work about cattle and Paris, and the rivers around Europe in this example from the 1850s.
This very real cost, combined with a well-presented visualisation, is what makes this example so effective and helps it stand the test of time, even with my noted criticism that it shows no losses in Moscow itself, which could be accurate depending on what happened there. I’m genuinely surprised there are not more strong visualisations showing the human cost of war out there on the internet, and I found none presented so informatively.
Finding a bad example of data visualisation was far easier, as a number of online resources have very kindly compiled lists of bad examples. I quickly found the example below, which I’m going to explore in further detail.
Figure 6 – A Visualisation that makes no sense (Viz.wtf, 2017)
This visualisation is a confusing mess of poorly executed data. There are so many basic elements wrong with it, and very little you can actually take away from it.
First, let’s explore what it does show: some of the biggest companies in the world compared with each other in a number of different categories, shown across the X axis.
Figure 7 – Modern London Underground Map (Graham-Smith, 2016)
The use of colour in this example might have been inspired by the much celebrated London Underground Map, originally created by Harry Beck in 1933. Though it has multiple streams of colour and complex data to present, the map succeeds because it simplifies what would be even more complicated if drawn too literally, and it is relatively easy for almost anyone to understand.
You can make vague comparisons, saying one company has a higher percentage of women in the workplace than another, but it all means nothing because there are no actual values. Was this information taken from the entire workforce of each company or from a sample of it? Are they actual percentages, or just figures being compared with each other?
Figure 8 – The Daily Routines of Famous Creative People (Currey, 2013)
This recent example also takes a colourful approach. Perhaps putting all the varied information into one long bar in the same way, adjusted proportionately to keep its shape and hierarchy, would have provided a better visual result for the company data, making the streams more individual and easier to compare across the companies while still using colour, just far fewer colours.
Higher and lower also mean nothing because we don’t know the top and bottom values of the scale. It could run from 0 to 100%, from 95 to 100% or from 0 to 10%. We literally don’t know what the scale is, so it is completely open to interpretation and means nothing at all.
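To show why the missing scale matters, here is a small Python sketch. The two percentages are made up purely for illustration and are not taken from the chart, and `position_on_axis` is a hypothetical helper of my own. It works out how far up the axis each value would be drawn under two of the possible scales.

```python
def position_on_axis(value: float, axis_min: float, axis_max: float) -> float:
    """Return where a value sits on an axis, from 0.0 (bottom) to 1.0 (top)."""
    return (value - axis_min) / (axis_max - axis_min)


# Hypothetical percentages for two made-up companies (NOT read from the chart).
company_a, company_b = 96.0, 98.0

# The same two values drawn on two different, equally plausible scales.
for axis_min, axis_max in [(0, 100), (95, 100)]:
    a = position_on_axis(company_a, axis_min, axis_max)
    b = position_on_axis(company_b, axis_min, axis_max)
    print(f"Axis {axis_min}-{axis_max}%: A at {a:.2f}, B at {b:.2f}, gap {b - a:.2f}")
```

On a 0 to 100% axis the two companies sit almost on top of each other, while on a 95 to 100% axis the same two-point difference fills 40% of the chart height. Without labelled values, a reader has no way of knowing which of those pictures they are looking at.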
Figure 9 – Semantic Web (Chibana, 2015)
This example shows the hierarchy of a single stream of data by its size. Perhaps the data above would have worked better in this manner: by separating out each category on the X axis and only using data that could be quantified by a value, the result would have been clearer and stronger than trying to show too much all at once.
I also question what a low value in high % job satisfaction actually means. Does it mean that only a percentage of the workforce has high job satisfaction, or that the whole workforce is only a certain % satisfied with the job? If that satisfaction were, say, 92%, would that be a bad figure at all, aside from the fact it could be improved? Either way, it would be made a lot clearer with some actual values.
Next, the worrying statement ‘0 values indicate there was not enough data to determine a value’ is right there in the corner. What does this even mean? Is the sample itself flawed, or impossible to value or compare, rendering this chart useless? Are the samples even consistent? How can you compare something as a percentage of its total if you only speak to a small portion of the full sample and assume the rest? This alone probably means you shouldn’t take anything away from this visualisation.
Finally, I want to tackle the rainbow of companies featured. Eighteen companies have made it into this visualisation, identified by a full spectrum of colours. That alone isn’t awful, but in the chart itself it becomes confusing and messy. It’s nearly impossible to follow the data all the way across without confusing the eyes, let alone compare different companies easily. Even if you pick two specific companies and follow them all the way across, there is too much information, it all moves around very erratically and it’s extremely difficult to conclude anything from it. Then there are sections where data lands on the same point, so certain colours vanish and only one remains in the foreground.
Figure 10 – Crayola – Evolution of Crayon colours (Von Worley, 2010)
Now this example is the perfect use of multiple colours because it is completely factual. It communicates the data clearly with only a timeline of years required, highlighting key points where more colours were introduced, and it shows that a spectrum of colours can indeed work when used correctly. This wouldn’t be possible to replicate with the data streams above, but at least it makes for a very visual and appealing piece of informative data.
Figure 11 – GE – Health Infoscape (GE, 2013)
This example shows what it’s like to use a spectrum of colours for data from different sources, but the detail in the layout and the clean presentation make it far more successful. There are likewise connections and comparisons to be made, but here they are much easier to follow than in the confused example above. With some better choices in using the data, perhaps the original chart could have turned out far more informative and successful in giving knowledge to its reader.
Overall, there is no sign of fine tuning or editing here. Why feature so many companies? Why compare such ranges of data where it’s impossible to compare them side by side? Why not include any values to give the data any grounding or meaning? This is a truly awful example of data visualisation.
So in conclusion, there are plenty of conscious decisions we make when handling data that can either create a great piece of visualisation or produce something with no useful value at all, so we have to be strict with ourselves to be creative and present the information we have as smartly and clearly as possible. Finally, Lynda.com offers a great array of workshops and information on all the key creative design programmes, and I highly recommend giving them a try to learn some new skills.
Chibana, N. (2015). 15 Stunning Data Visualizations (And What You Can Learn From Them). [online] Visual Learning Center by Visme. Available at: http://blog.visme.co/examples-data-visualizations/ [Accessed 2 May 2017].
GE (2013). 20 Inspiring Big Data Visualization Examples. [online] Available at: http://www.keywebmetrics.com/2013/07/big-data-visualizations/ [Accessed 2 May 2017].
Graham-Smith, D. (2016). The History Of The Tube Map. [online] Londonist. Available at: http://londonist.com/2016/05/the-history-of-the-tube-map [Accessed 2 May 2017].
Joseph Minard, C. (1855). Charles Joseph Minard | Cartographia. [online] Cartographia.wordpress.com. Available at: https://cartographia.wordpress.com/category/charles-joseph-minard/ [Accessed 2 May 2017].
Mappinglondon.co.uk. (2014). Mapping Cholera | Mapping London. [online] Available at: http://mappinglondon.co.uk/2014/mapping-cholera/ [Accessed 2 May 2017].
Playfair, W. (1786). William Playfair | Humantific. [online] Humantific.com. Available at: http://www.humantific.com/tag/william-playfair/ [Accessed 2 May 2017].
Playfair, W. (1821). [online] Available at: https://www.researchgate.net/figure/226400313_fig6_Figure-7-William-Playfair%27s-1821-time-series-graph-of-prices-wages-and-ruling-monarch [Accessed 2 May 2017].
Tufte, V. and Finley, D. (2002). Edward Tufte: New ET Writings, Artworks & News. [online] Edwardtufte.com. Available at: https://www.edwardtufte.com/tufte/minard [Accessed 1 May 2017].
Viz.wtf. (2017). WTF Visualizations. [online] Available at: http://viz.wtf/ [Accessed 1 May 2017].
Von Worley, S. (2010). Color Me A Dinosaur – The History Of Crayola Crayons, Charted. [online] Datapointed.net. Available at: http://www.datapointed.net/2010/01/crayola-crayon-color-chart/ [Accessed 2 May 2017].