How to avoid dumb data visualization mistakes

An important part of big data is turning the results into information stakeholders can easily grasp. An obvious tactic is data visualization, but there are oh-so-many ways to get it wrong. Here are 11 data viz practices you absolutely should avoid, along with the right way to present data visually.

Everything is surveyed, quantified, and packaged as data these days—so much so that it's easy to feel as though we're drowning in numbers. We so often see lists of statistics like this:

  • 43,346,000 households own cats.
  • 6 percent of Americans don't know how to ride a bike, while 69 percent of 19-year-olds drive cars.
  • Pizzerias are expected to spend more than $4 billion on cheese.
  • 68.8 percent of adult smokers want to stop smoking, 52.4 percent have made an attempt to quit in the past year, and 6.2 percent have recently quit.

The problem is that numbers on their own are not only difficult to conceptualize but also offer little guidance as to how to interpret and compare them.

Making sense of the numbers is especially important in the context of big data. Businesses are motivated to collect, manage, and exploit data assets in order to make better decisions, whether it's to cut healthcare costs or improve IT security. But before you can use phrases like "apply analytics for richer insights," you need to package the dataset in a way that humans can understand at a glance.

That's where data visualization comes in. Patterns, trends, and correlations might go undetected in text-based data, such as in a long list of numbers. A visual context, even with simple graphs and charts, illustrates and communicates relationships and ideas. Data visualization—data viz to its friends—tames the vast volume of numbers we have to deal with and presents information in a way humans understand: storytelling. It makes the information accessible so we can decode, correlate, and otherwise elucidate interrelationships.

[Image. Source: Randy Krum, www.infonewt.com]

At least, in the best-case scenario, data visualization can achieve all that. But it works only if the data visualization is intelligently and effectively designed. That doesn't always happen. It may help if you engage a "creative" to help you with the presentation, but be sure your expert understands the purpose of the underlying data and isn't just designing pretty graphs.

[Image. Source: Randy Krum, www.infonewt.com]

These expert suggestions can help you avoid the most common data visualization mistakes:

1. Using the wrong tool for the job

There's nothing wrong with a bar chart or a pie chart. But they are not always the right way to impart information, says Randy Krum, president of InfoNewt and author of the book Cool Infographics. In one expensive research report Krum examined, for example, every one of the 150 graphs was a bar chart. Not only was it boring and eminently forgettable, he says, but the charts didn't represent what the authors wanted to convey about their research data.

Every type of data viz graphic has a specific strength and purpose. Learn which is the best tool for the job.

For instance, Krum says, bar charts are good for comparing sets of data, such as the number of cars sold by each salesperson in the dealership. However, to show the geographic component of a set of data, such as the spread of Lyme disease by county, state, or region, a color-coded map is more appropriate. (See Alaska Fried Chicken as a good example of how to present regional data.)

[Image. Source: Randy Krum, www.infonewt.com]

Choose pie charts for comparing parts to a whole. For example, of all the students in the graduating senior class, x percent are going to college, y percent are enlisting in the military, z percent are going directly into the blue-collar workforce, and so forth.

Line graphs are best at showing trends or other timeline progressions, such as the growth (or loss) in profit over the past 10 years. Scatter plots chart two variables against each other across a large set of data points to show how one is affected by the other.

If you're uncertain how to choose the right kind of graphic for your data, check out the free chart chooser on extremepresentation.com or an interactive version on ChartChooser.com. Both suggest data viz styles based on the type of data and the relationships or patterns to demonstrate.
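As a rough sketch, the rules of thumb in this section can be written down as a simple lookup. The function name `suggest_chart` and the purpose labels are made up for illustration; they are not taken from either chart chooser.

```python
# Rules of thumb from this section, keyed by what the data needs to show.
# The keys are illustrative labels, not an official taxonomy.
CHART_FOR_PURPOSE = {
    "comparison": "bar chart",         # e.g., cars sold by each salesperson
    "geography": "color-coded map",    # e.g., Lyme disease by county
    "part_to_whole": "pie chart",      # e.g., where graduating seniors go next
    "trend_over_time": "line graph",   # e.g., profit over the past 10 years
    "relationship": "scatter plot",    # e.g., how one variable affects another
}


def suggest_chart(purpose: str) -> str:
    """Return the chart type this section recommends for a given purpose."""
    try:
        return CHART_FOR_PURPOSE[purpose]
    except KeyError:
        known = ", ".join(sorted(CHART_FOR_PURPOSE))
        raise ValueError(
            f"Unknown purpose {purpose!r}; expected one of: {known}"
        ) from None
```

A lookup like this is only a starting point. Real datasets often serve more than one purpose, which is a reason to use several focused charts rather than one.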

2. Throwing everything but the kitchen sink into a single graphic

When you present data, you presumably know a lot about it. You have a deep foundation of facts that supports your knowledge and analysis.

However, that doesn't mean you should show everything you know in a single graphic. Each image should have a clear and finely focused purpose that provides an easy-to-understand narrative. Otherwise, your audience never catches on to the single point you need to convey with that graphic.

An audience typically pays attention to a graphic for no longer than five seconds, Krum says. So ask yourself: What will convey my message within that short time? "You can have supporting and follow-up information," Krum explains, "but your key message has to be what your design communicates in those first five seconds."

Don't create "a single chart that tries to do everything and ends up doing nothing well," says Alberto Cairo, the Knight Chair in Visual Journalism at the University of Miami and author of The Truthful Art: Data, Charts, and Maps for Communication. Instead, he suggests creating a dashboard or page with several kinds of charts that look at the same or related data from various viewpoints.

3. Making a mountain out of a molehill, and vice versa

The shapes you use in your data viz should be proportional to the data values they represent. Otherwise, your visuals will contradict your data.

For instance, in a circle chart, a circle that symbolizes 50 apples must have twice the area of one for 25 apples. If you make the 50-apple circle the smaller one, your viewers automatically assume that the data value is also smaller.

Also, make sure your size variations are appropriate to the graph type. For instance, with a bar chart, you can vary the height as long as you keep the width of the bars all the same. That avoids sending a mixed message about what is larger or smaller.

Font sizes are also relevant. The headline, a concise description of your intended take-away, should be the only large or bold text. But don't allow the headline to overwhelm the graphic, which is where viewers should focus. Footnotes must retreat from the eye, so use a small font for them. Text labels within the graphic should all be the same size, to avoid visual confusion and keep from distracting from the data.

In fact, any time you can avoid using labels, do so. A color key is often preferable; keep its text small, though larger than the footnote text.

4. Forgetting arithmetic and geometry

When things don't add up, readers doubt their eyes and question your analyses. So always check your arithmetic. For instance, the primary pie chart mistake is that the components don't add up to 100 percent.
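A quick check like the following can catch a pie chart whose slices do not total 100 percent before it is drawn. This is a minimal sketch: the function name, the rounding tolerance, and the sample figures are assumptions made for illustration.

```python
def check_pie_slices(slices: dict[str, float], tolerance: float = 0.5) -> float:
    """Return the total of the slice percentages, or raise if it is not ~100.

    `tolerance` allows for small rounding differences in the source figures.
    """
    total = sum(slices.values())
    if abs(total - 100.0) > tolerance:
        raise ValueError(f"Pie slices add up to {total:g} percent, not 100")
    return total


# Hypothetical breakdown of a graduating class, in percent.
plans = {"college": 62.0, "military": 8.0, "workforce": 30.0}
check_pie_slices(plans)  # passes: the slices total 100
```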

Every chart type has an internal logic that we understand almost intuitively. For example, in a line graph that represents a progression over time, the timeline is always the horizontal axis (the X axis). A graph in which the timeline reaches upward vertically instead is much harder to interpret and understand.

Circle graphs are particularly difficult. In most data viz software, the drawing tools have fields only for height and width, not for area. Yet, it's the area of the circle that represents the data.

Say you want to show one circle's value as three times that of another, suggests Krum. Typically, the would-be data analyst triples the diameter, since that's the physical variable available in the drawing tools. "But that results in a circle that is nine times larger, rather than three times," he says. Happily, Krum's Visualizing Circles cheat sheet makes the task a little easier.
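The arithmetic behind that warning is short: a circle's area grows with the square of its diameter, so the diameter should be scaled by the square root of the value ratio. The sketch below is not Krum's cheat sheet, and the function names are hypothetical.

```python
import math


def circle_area(diameter: float) -> float:
    """Area of a circle with the given diameter."""
    return math.pi * (diameter / 2) ** 2


def scaled_diameter(base_diameter: float, value_ratio: float) -> float:
    """Diameter of a circle whose AREA is value_ratio times the base circle's."""
    if value_ratio < 0:
        raise ValueError("value_ratio must not be negative")
    return base_diameter * math.sqrt(value_ratio)


small = 10.0                           # diameter of the reference circle
correct = scaled_diameter(small, 3)    # about 17.3: three times the area
wrong = small * 3                      # 30.0: nine times the area
```

Entering the `correct` value in the height and width fields of a drawing tool produces a circle that honestly represents three times the data value.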

5. Being color blind

Using too many colors, or ones that clash, distracts the viewer's eye. More important, improper color use buries your infographic's message in a profusion of stimuli so your audience doesn't know where to look first.

The human mind focuses on bright colors, so use them for the important data and reserve dull colors for secondary areas. For instance, in a scatter graph, put all the points in greyscale except the one section that is key to your purpose. That section should be in red, orange, or a similar color that would pull the eye.

[Image. Source: Randy Krum, www.infonewt.com]

Also, to avoid confusion, keep colors consistent in related charts. In other words, if you're comparing the sales, reach, and growth of companies A, B, and C, the color you use for each company should remain the same in all three graphics.
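Both color rules in this section can be sketched in a few lines: mute everything except the key group, and assign each company one color that every chart reuses. The function names, the palette, and the color names are assumptions chosen for illustration.

```python
def highlight_colors(labels, key_label, highlight="orangered", muted="lightgrey"):
    """Return one color per point: `highlight` for the key group, `muted` otherwise."""
    return [highlight if label == key_label else muted for label in labels]


# Hypothetical palette: one fixed color per company, reused in every chart.
COMPANY_COLORS = {"A": "steelblue", "B": "seagreen", "C": "darkorange"}


def colors_for(companies, palette=COMPANY_COLORS):
    """Look up each company's color so related charts stay consistent."""
    missing = [name for name in companies if name not in palette]
    if missing:
        raise KeyError(f"No color assigned for: {missing}")
    return [palette[name] for name in companies]
```

The resulting lists are plain color names that can be handed to whatever charting tool is in use. Because the palette is defined once, the sales, reach, and growth charts cannot drift apart.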

6. Piling on shapes, colors, and text

When a graphic has too many lines, shapes, colors, or text, it's unattractive, unprofessional, and, most important, unfocused.

Keep it visually simple and clear. Avoid visual or textual noise. Give your audience only enough detail to support the purpose and message you are trying to convey—and no more.

After you create the first draft of your graphic, look at each individual component and ask yourself, "What does this do? Is it necessary for the purpose of this illustration?" Then remove anything that isn't. If a removed element still matters to your overall interpretation and discussion, give it a separate image.

7. Using shaky data comparisons

Just because a chart or graph looks good and appears to be designed effectively doesn't mean it's telling the truth. The fault, most often, is the chart designer's choice of data.

"We need to make sure that the data we're visualizing is actually measuring what we think it is measuring," Cairo points out. He cites the example of a chart about violence against women, in which a statistician used results from multiple sources. "In one source, the rate [of violence in a certain region] was really low, and in another it was really high," he says. "What was going on?" The problem was that the studies used different definitions of violence. The one that focused on physical violence had lower numbers than one that also included verbal abuse.

In other words, when you use a data source, be sure you understand exactly what the original study is measuring and how the researcher came up with their results. In the context of big data, be sure you're reporting from equivalent datasets.

8. Seemingly pulling numbers out of the air

Valid, relevant data viz is based on credible data. You may have a firm handle on the validity of your data and its trustworthiness. However, unless your audience knows your sources and can check them, it may have difficulty trusting your analyses, comparisons, and conclusions.

A trustworthy visualization has a footnote citing the source of the data. An excellent one provides a website, a book, or a publicly available paper where your audience can check the original source material. That tiny bit of text at the bottom of your graphic alleviates your audience's doubts so they can focus on your interpretation.

When we report on big data, whether gathered by IoT sensors or from a DNA database, the audience assumes that the information is vast. It's important to be explicit about the sample size and the source. Even if it is in tiny type at the bottom of the chart, include text like, "The survey studied 750 IT executives" or "Based on temperatures gathered in June 2017 in San Francisco." Otherwise, you risk supplying examples for an updated edition of the classic How to Lie with Statistics.

9. Turning the newest presentation tech into meaningless gimmicks

It's fun to use the hot new graphics styles in your data viz, but be sure they add to your intended functionality and purpose. For instance, making people tap buttons merely to trigger screen redraws adds no value to the graphic. A better use of interactivity is to allow viewers to extract different patterns of information by changing a variable to see how that affects the results. (See the New York Times' Rent or Buy calculator for an example.)

Similarly, animation that shows the relationships of two different views of the data can be powerful. But often, animations are used to draw a laugh or to try to make a rather dull presentation appear (very temporarily) interesting. Be sure that any animation conveys meaningful information about your data analysis. A great example of animation that is appropriate, meaningful, and beautiful is the global map of wind, weather, and ocean conditions.

3D is a special case. Krum's reaction: "Don't!" While immersive 3D virtual reality has a lot of potential, most applications of 3D in data visualization are charts where the distorted perspectives involved can create false visual representations of comparative sizes and spaces. That's because in a two-dimensional representation of a 3D model, if you have two objects that represent the same data value, the one that is "closer" to the viewer appears physically larger. That, in turn, undermines the viewers' ability to use the charts to effectively interpret the data represented.

[Image. Source: Randy Krum, www.infonewt.com]
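To see why the "closer" object looks larger, consider a simple pinhole-style perspective model, in which drawn size shrinks in proportion to distance from the viewer. This is an assumption for illustration; actual 3D chart renderers may use other projections, and the function name is hypothetical.

```python
def apparent_height(true_height: float, distance: float, focal_length: float = 1.0) -> float:
    """Height as drawn on screen under a simple perspective projection."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return focal_length * true_height / distance


# Two bars that represent the SAME data value, at different depths.
front = apparent_height(100, distance=2)   # 50.0
back = apparent_height(100, distance=4)    # 25.0
```

Under this model the front bar is drawn twice as tall as the back one, even though both stand for the same number.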

10. Believing in "pretty is as pretty does"

Making your data viz attractive is useful in catching and holding the audience's eye. But when you worry first about the aesthetics of the graphic, you're putting the data analysis on the back burner.

Data viz is an analytical tool, so it needs to be precise, with a great deal of clarity and no ambiguity. So after you've simplified and groomed your graphics to make them appealing, check your data representation again.

11. Assuming you're saying what you think you're saying

When we create a chart or graph, we often see much more than is presented. That's because we're close to the data and our purpose.

When you think you've finished designing your chart or graph, Cairo recommends, test it. Ask others to interpret it for you. Specifically, he suggests, ask them, "What did you get from the graphic? What did you learn?" You may be surprised how different their take-away is from what you were trying to convey.

When you avoid these 11 dumb data visualization mistakes, your graphics will keep your audience's attention and effectively and convincingly support the purpose of your presentation.

This article/content was written by the individual writer identified and does not necessarily reflect the view of Hewlett Packard Enterprise Company.