Data Visualization Myth

by Amanda Leung

Common tricks to manipulate data:

  1. Truncated Y-axis
  2. Omitting Data
  3. Fake Cumulative Graph
  4. Non-Correlating Causation


Type 1: Truncated Y-axis

Interest rates are skyrocketing! The bar heights imply that rates in 2012 are several times higher than those in 2008.

But something seems wrong... The y-axis does not start at 0.00%.

What if the y-axis starts at 0.00%? Now the interest rates look almost flat.
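The effect can be put into numbers. The sketch below is illustrative only: the two rates and the truncated baseline are hypothetical values, not read from the chart, and `apparent_ratio` is a helper name made up for this example. It compares how tall the 2012 bar looks relative to the 2008 bar for a given y-axis starting point.

```python
def apparent_ratio(first, last, baseline=0.0):
    """Ratio of the drawn bar heights when the y-axis starts at `baseline`."""
    if baseline >= min(first, last):
        raise ValueError("baseline must lie below both values")
    return (last - baseline) / (first - baseline)


# Hypothetical interest rates in percent (assumed for illustration).
rate_2008 = 3.140
rate_2012 = 3.152

# Y-axis starting at 0.00%: the bars are almost the same height.
honest = apparent_ratio(rate_2008, rate_2012)

# Y-axis starting at a hypothetical 3.138%: the same data looks dramatic.
truncated = apparent_ratio(rate_2008, rate_2012, baseline=3.138)

print(f"axis from 0.00%:  2012 bar is {honest:.3f}x the 2008 bar")
print(f"axis from 3.138%: 2012 bar is {truncated:.1f}x the 2008 bar")
```

With these assumed numbers, the truncated axis makes the second bar several times taller even though the underlying rates differ by well under one percent.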




Type 2: Omitting Data

The graph above is accurate and includes data from each year.

When half of the data points are removed, the data looks like a steady march upward.

Plotting only every second year instead of every year makes the graph appear to rise steadily, while the real data is more volatile.
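A small sketch of the same trick, using invented yearly values (the numbers and the helper name `is_steadily_increasing` are assumptions for illustration, not the data behind the graph): dropping every other point turns a zigzag into a clean upward line.

```python
# Hypothetical yearly values, invented for illustration only.
years = list(range(2005, 2015))
values = [10, 14, 12, 17, 13, 19, 15, 22, 16, 24]


def is_steadily_increasing(series):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(series, series[1:]))


# Keep only every second year (2005, 2007, 2009, ...).
kept_years = years[::2]
kept_values = values[::2]

print("all years steadily increasing?        ", is_steadily_increasing(values))
print("every second year steadily increasing?", is_steadily_increasing(kept_values))
```

One way to guard against this is to check whether a chart's x-axis has evenly spaced points and whether any years are missing.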




Type 3: Fake Cumulative Graph

The cumulative annual revenue is moving up and to the right, so things must be going well!

When we draw a non-cumulative graph, the result is totally different...

Revenues have been declining for the past ten years!
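The reason is arithmetic: a running total of positive numbers can only go up, no matter how the individual years behave. The following sketch uses hypothetical revenue figures (assumed for illustration) to show a cumulative series rising while the yearly series falls.

```python
from itertools import accumulate

# Hypothetical annual revenue for ten years, declining every year.
yearly_revenue = [100, 92, 85, 77, 70, 64, 57, 51, 46, 40]

# Running total: each entry is the sum of all years up to that point.
cumulative_revenue = list(accumulate(yearly_revenue))

yearly_falling = all(a > b for a, b in zip(yearly_revenue, yearly_revenue[1:]))
cumulative_rising = all(
    a < b for a, b in zip(cumulative_revenue, cumulative_revenue[1:])
)

print("yearly revenue falls every year:    ", yearly_falling)
print("cumulative revenue rises every year:", cumulative_rising)
```

Because every yearly value is positive, the cumulative line climbs even though each year is worse than the last.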




Type 4: Non-Correlating Causation

Does ice-cream consumption lead to murder? NO!

We are seeing correlation mistaken for causation more and more in big data analyses.

Data scientists are finding statistical patterns in data and sometimes care more about correlation than causation.
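A strong correlation is easy to produce when two quantities both depend on a third one. The sketch below assumes, purely for illustration, that temperature drives both series; all numbers are invented, and `pearson` is a hand-written helper computing the standard Pearson correlation coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly temperatures (the assumed common driver).
temperature = [2, 4, 9, 14, 19, 24, 27, 26, 21, 15, 8, 3]

# Small fixed offsets so the series are not perfectly linear.
noise_a = [1, -2, 2, -1, 0, 2, -2, 1, -1, 0, 2, -1]
noise_b = [-1, 1, 0, 1, -1, 0, 1, -1, 0, 1, -1, 0]

# Both series are built from temperature alone; neither depends on the other.
ice_cream_sales = [20 + 3 * t + e for t, e in zip(temperature, noise_a)]
incidents = [30 + t + e for t, e in zip(temperature, noise_b)]

r = pearson(ice_cream_sales, incidents)
print(f"correlation between ice-cream sales and incidents: {r:.2f}")
```

The two series correlate strongly even though, by construction, neither one influences the other.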




Methods to avoid the pitfalls of misleading data:

  1. Use standard models for visualizations
  2. Make the data visualization clear and easy to understand
  3. Be sure that all data and visualizations have been scrutinized before they go public



References:

Data Visualizations Designed to Mislead by Agata Kwapien

How to Lie With Data Visualization by Ravi Parikh

Don’t Let Your Trial Graphics Go Beyond Advocacy to Misleading by Morgan Smith