Introduction
A pet interest of mine is how you take raw data and turn it into actionable insights. It is quietly one of the hardest things in business. Even a very skilled analyst often can’t bridge the gap: they struggle to communicate findings to less technical people, or to combine them with the business context to make a strong case for change. Meanwhile, managers often haven’t developed the data skills to do that work either. So seemingly great analysis ends up sitting in a dashboard or a PowerPoint deck without going anywhere.
A big differentiator between good and great is what you do with data. Do you know how to define a good metric? Can you interrogate a dataset to understand the anomalies, edge cases, variability? Do you know how to present data well so that your intended audience can quickly understand the key point? There is a lot of hidden skill and hidden impact here.
I’ve collected several book recommendations and some tips to help you up your game. This article is aimed at Product Managers, CEOs and data professionals (data scientists, BI, analysts). It’s very much applicable if you’re using Excel or Google Sheets, although those tools tend to become limiting as you move into more complex analysis or richer visualisations.
The material is fairly light on maths and statistics.
Book Recommendations
These are books that have greatly helped me. Before reading them, I did extensive research to find the ones I thought would be strongest.
Many are not that new. It turns out - and I validated this somewhat before writing this - that the classics are timeless. The authors have often written several books, and I’ve tried to give some indication of their other work here too.
Over the years, I’ve recommended many of these to colleagues and mentees.
What Makes a Good Book?
There can be a lot of fluff in the data visualisation space in particular. Books about infographics show you pretty pictures that you look at and think “that’s cool”, but you don’t learn the logic behind the construction of good charts or diagrams. Worse, the examples are often poor visualisations: they prioritise novelty over clear communication, and so aren’t suitable for most serious scenarios.
So I’ve found books that:
- Go beyond the introductory level.
- Explain the thought process behind decisions, not just show the output.
- Encourage clear, logical thinking over abstract creativity.
- Are fairly easy reads, focused less on algorithms and more on overall process and communication.
The Visual Display of Quantitative Information
by Edward Tufte
Best for: thinking consciously about how you present data.
This is the real classic of the art.
At first glance, this looks like something that should be relegated to the shelves of a retired professor. It’s actually very approachable and readable once you’ve got over the dated cover.
It takes you through how to make clear charts with minimal visual noise (“chartjunk”). It works through lots of examples, explaining why they are good or how to improve them.
Edward Tufte has written several books that you can move on to after this. I’ve also read his Visual Explanations, and his famous deconstruction of why you should avoid PowerPoint slides if you want to encourage thinking (The Cognitive Style of PowerPoint) remains timeless.
The Truthful Art
by Alberto Cairo
Best for: Thinking through the whole process from data collection through analysis to presentation.
Quite a long read. It covers not just how to present data, and why you should do it a certain way, but also how to analyse it. It’s written from the perspective of data journalism, but is widely applicable.
He’s also written The Functional Art on information graphics and visualization.
Information Dashboard Design
by Stephen Few
Best for: building dashboards, as opposed to business communication or data exploration.
This deals with how to create a dashboard for seeing status at a glance, rather than, say, exploring the root cause of a problem or communicating a message through charts. While it is to some extent a classic, it is the one I’m most hesitant to recommend: some of the dated examples may be an eyesore, even if the content is still relevant.
The other book that Stephen Few is famous for is Show Me the Numbers, which is more about the design of tables and charts.
How to Make an Impact
by Jon Moon
Best for: learning about structuring and formatting your written communication, with less focus on data presentation.
This is not really a book about data; it’s about writing with clarity and thought. It covers how you should structure your communication, including when to use tables or bullet points and what to put in headings. So it’s a continuation of the theme of making conscious decisions about everything you present, rather than putting out what feels intuitively right.
There are other books about writing that are more popular, but if you want a concise introduction to good logical writing, in particular for business writing, then this is my choice.
Other Books / Honourable Mentions
These overlap somewhat with the recommendations above, but either I haven’t read them or they are beyond the core scope:
- Storytelling with Data by Cole Nussbaumer Knaflic - similar concepts to other books like The Truthful Art and The Visual Display of Quantitative Information, but a bit more focused on business and getting the job done over theory / understanding.
- Visualize This by Nathan Yau.
- Books on Cartography (maps) - deciding what to show on a map is one of the great design challenges that most people overlook, partly because it’s been well optimised over centuries. But it’s not a problem with one solution. Consider questions like: should you show buildings, roads, footpaths, contours, vegetation, place names? How should you show them?
- How to Lie with Statistics by Darrell Huff - know your enemy.
Further Reading: Interactive, exploratory charts
Many of the book recommendations are oriented towards static charts, as you would print in a book. There are levels beyond that:
- Charts with basic interactivity, like hover to reveal more information, or zoom. See the Plotly examples, for instance.
- Interactive charts that allow you to actively dig into data. For example, you click on elements of the chart, or use widgets alongside it, to filter the data.
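To make the first of those levels concrete, here is a minimal sketch of a bar chart with extra detail on hover, assuming Plotly is installed. The sample figures, field names and the `to_columns` and `make_hover_figure` helpers are invented for illustration; `customdata` and `hovertemplate` are standard Plotly trace properties.

```python
# Illustrative monthly figures; the numbers and field names are made up.
SAMPLE_ROWS = [
    {"month": "Jan", "revenue": 12000, "orders": 310},
    {"month": "Feb", "revenue": 15500, "orders": 402},
    {"month": "Mar", "revenue": 14100, "orders": 377},
]


def to_columns(rows):
    """Split row dicts into the column lists that a bar trace expects."""
    return {
        "x": [r["month"] for r in rows],
        "y": [r["revenue"] for r in rows],
        # Extra per-bar values that only appear in the hover tooltip.
        "customdata": [[r["orders"]] for r in rows],
    }


def make_hover_figure(rows):
    """Build a bar chart whose tooltip reveals the order count on hover."""
    # Imported here so the data preparation above works without Plotly.
    import plotly.graph_objects as go

    cols = to_columns(rows)
    fig = go.Figure(
        go.Bar(
            x=cols["x"],
            y=cols["y"],
            customdata=cols["customdata"],
            hovertemplate=(
                "%{x}<br>Revenue: %{y:,.0f}"
                "<br>Orders: %{customdata[0]}<extra></extra>"
            ),
        )
    )
    fig.update_layout(title="Monthly revenue", yaxis_title="Revenue")
    return fig
```

In a Jupyter notebook, `make_hover_figure(SAMPLE_ROWS).show()` should render the chart. The static view stays uncluttered, and the secondary number (orders) only appears when the reader asks for it by hovering.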
In my experience the basic hover behaviour is great, and plotting libraries often do it pretty well out of the box. Dashboarding tools (like Sisense, Power BI, Tableau) tend to encourage adding filters, which work pretty well, but I tend not to add them when doing exploratory analysis in Jupyter; I just plot more charts.
There are fairly fundamental reasons why it’s rare to go too deep on interactivity. If you really want to explore, then having access to the raw data and generating a fully custom chart each time can be the right path forward. If you want a glanceable dashboard, then it’s often better to have a series of charts (e.g. a top-level trend, then the same metric broken down by common segments). Adding some basic filters (e.g. country) works well as a middle ground. Too much interactivity leads to hidden functionality that users either don’t use or find confusing, and that can take a lot of time to implement, so you get diminishing returns compared to just adding more basic visualisations.
I don’t have specific recommendations for going deep in this area, but I would recommend looking at materials published by the dashboarding tool companies (Looker, Sisense, Power BI, Tableau), and there are some accessible products you could explore, like Google Analytics. D3.js is a common JavaScript library for rich visualisations. If you use Python, look at Plotly (and Dash), Bokeh and Jupyter Widgets. My hesitation is that these will show you what you can do, but not teach you when you should do it.
Using ChatGPT / AI
LLMs are getting very capable at helping with data analysis and building charts, especially when using tools like Python. That doesn’t take away from what you would learn through the other recommendations here, but it is a complement, and potentially a good, interactive way of learning.
The biggest improvement you can make is to start to be very conscious about your decisions on how to analyse and present data. Once you do this, and start asking what the intent of the analysis or chart is, you can start to ask ChatGPT good questions.
A danger is to lazily prompt “make me a bar chart of this data” without thinking it through further. We want to add more context and be less prescriptive, seeking to use the LLM as a sparring partner, not just a lazy plotting tool. Consider for your prompt:
- What is the origin of the data (is it very raw or high quality)?
- Who is the audience? A board of directors, or an analyst?
- Are you looking to project a message or enable exploration?
- What is the question we are trying to answer overall?
The more of this context you can add, and the more directed your questions, the better. Mentioning the books listed here will nudge it to a higher level of analysis.
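If you prompt from code or a notebook, one way to make that context habitual is to fill in a small template each time. The sketch below is only an illustration: the `AnalysisContext` class, its field names, the `build_prompt` helper and the example values are all hypothetical, and the wording of the final instruction is just one option.

```python
from dataclasses import dataclass


@dataclass
class AnalysisContext:
    """Context worth stating up front; all field names are illustrative."""
    data_origin: str
    audience: str
    goal: str
    question: str


def build_prompt(ctx: AnalysisContext) -> str:
    """Turn the context into a prompt that invites critique, not just a chart."""
    return "\n".join([
        f"Data origin and quality: {ctx.data_origin}",
        f"Audience: {ctx.audience}",
        f"Goal: {ctx.goal}",
        f"Question to answer: {ctx.question}",
        "Before plotting anything, critique my choice of metrics and "
        "suggest how best to present this for that audience.",
    ])


ctx = AnalysisContext(
    data_origin="raw CSV export from the billing system, not yet cleaned",
    audience="board of directors",
    goal="project a clear message rather than enable exploration",
    question="Is customer churn getting worse, and in which segments?",
)
prompt = build_prompt(ctx)
print(prompt)
```

The point of the template is less the code than the discipline: if you can’t fill in one of the fields, that is usually a sign the analysis itself isn’t yet well defined.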
Then we can use the LLM to:
- Critique our thinking e.g. have we selected the right metrics?
- Help us explore whether the data looks sufficiently clean (high quality / consistent) to summarise simply.
- Perform the data processing and analysis.
- Help construct a story surrounding the data.
- Create a visualisation, or critique existing ones.
- Review our final output that we will present.
At each stage you can ask for critique and inspiration to increase the probability of finding something you’ve missed.
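As an example of the “is the data clean enough?” step, here is a minimal sketch of the kind of check you might run yourself, or ask the LLM to write, before summarising anything. It uses only the Python standard library; the sample rows, column names and the `quality_report` helper are hypothetical. In a pandas workflow, `DataFrame.isna()` and `DataFrame.duplicated()` cover similar ground.

```python
from collections import Counter

# Hypothetical sample: note the missing amount, the duplicate row,
# and the inconsistent spellings of the same country.
ROWS = [
    {"order_id": "1", "country": "UK", "amount": "120.0"},
    {"order_id": "2", "country": "uk ", "amount": ""},
    {"order_id": "3", "country": "France", "amount": "80.5"},
    {"order_id": "3", "country": "France", "amount": "80.5"},
]


def quality_report(rows, category_field):
    """Count missing values, exact duplicate rows and inconsistent labels."""
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None or str(value).strip() == "":
                missing[field] += 1

    # A row is a duplicate if an identical row appeared earlier.
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)

    # Group raw labels by a normalised form to spot spelling variants.
    variants = {}
    for row in rows:
        raw = row[category_field]
        variants.setdefault(raw.strip().lower(), set()).add(raw)
    inconsistent = {k: sorted(v) for k, v in variants.items() if len(v) > 1}

    return {
        "missing": dict(missing),
        "duplicate_rows": duplicates,
        "inconsistent_labels": inconsistent,
    }


report = quality_report(ROWS, "country")
print(report)
```

A report like this gives you something specific to put back into the conversation, for example asking how the duplicate rows or the label variants should be handled before any chart is drawn.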
Which tools to use is out of scope for this article, but in short, I normally use a fairly classic Python stack (Jupyter, Pandas, Plotly etc.), and then you can prompt an LLM through the IDE (VS Code or JupyterLab).
Increasingly, you can also upload your data to an LLM web interface (ChatGPT, Claude, Gemini) and do the analysis there.