
Ethics of Graphing

Principles of Graphing

Dr. Michael E. Olson
MATH 3080 - Foundations of Data Science


What is wrong with this graph?

<img src="./images/12_EthicsOfGraphing/bad_scatter.png" width=100% alt="Lack of Visual Cues">

What is wrong with this graph?

<img src="./images/12_EthicsOfGraphing/bad_scatter.png" width=100% alt="Lack of Visual Cues">
* Numbers are not on a scale
* (Numbers are likely strings)
* No labels
* No title
* No legend
* Font size (Unreadable)

Corrected graph

<img src="./images/12_EthicsOfGraphing/good_scatter.png" width=100% alt="Lack of Visual Cues">
* Numbers are not on a scale
* (Numbers are likely strings)
* No labels
* No title
* No legend
* Font size (Unreadable)
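
One way the "numbers are likely strings" problem can arise: values read in as text are ordered character by character instead of being placed on a numeric scale. The sketch below uses made-up values (they are not taken from the graph) to show the difference between sorting text and sorting numbers.

```python
# Hypothetical axis values as they might arrive from a text file (all strings).
raw_values = ["10", "9", "100", "25"]

# Sorting strings compares character by character, so "100" lands before "25".
as_text = sorted(raw_values)

# Converting to numbers first restores a true numeric order.
as_numbers = sorted(float(v) for v in raw_values)

print(as_text)     # ['10', '100', '25', '9']
print(as_numbers)  # [9.0, 10.0, 25.0, 100.0]
```

A plotting library that receives the string version will usually treat each value as a category label, which is one likely reason the axis spacing looks wrong.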

What is wrong with this graph?

<img src="https://ticktockmaths.co.uk/wp-content/uploads/2024/09/10.png" width=80%>

(from https://ticktockmaths.co.uk/badgraphs/)


What is wrong with this graph?

<img src="./images/12_EthicsOfGraphing/no_labels.jpeg" height=40%>
(from https://ticktockmaths.co.uk/badgraphs/)

Mistake or Misleading?

Some graphs give readers incorrect impressions.

Sometimes, people honestly make mistakes.

Sometimes, people purposefully use bad statistics to mislead readers.


Example of Misleading Statistics

In the 2020 election season, a news article said:

> Older, white voters are significantly more likely to vote by mail and have those ballots counted, studies show, while voters of color and younger voters are significantly more likely to have their ballots rejected.
>
> (NBC News, Aug 9, 2020)

Could this be deceptive?


Example of Misleading Statistics

It likely isn’t purposefully deceptive, just a wording issue. But here is a scenario in which the statement is true, yet completely misleading.

| | White | Non-White |
| :------ | :---: | :---: |
| Older | 45% | 25% |
| Younger | 20% | 10% |

The younger and non-white groups combined (everyone who is younger, non-white, or both) total 20% + 25% + 10% = 55%, a majority, even though the single largest group is older white voters at 45%.
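
The arithmetic in this scenario can be checked with a short sketch. The percentages are copied from the scenario table; the variable names are only illustrative.

```python
# Percentages copied from the hypothetical scenario table.
shares = {
    ("Older", "White"): 45,
    ("Older", "Non-White"): 25,
    ("Younger", "White"): 20,
    ("Younger", "Non-White"): 10,
}

# Everyone who is younger, non-white, or both.
younger_or_non_white = sum(
    pct for (age, race), pct in shares.items()
    if age == "Younger" or race == "Non-White"
)

# The single largest cell in the table.
largest_group = max(shares, key=shares.get)

print(younger_or_non_white)  # 55
print(largest_group)         # ('Older', 'White')
```

Grouping three cells together produces a majority, while the largest individual cell is the one left out of the grouping.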


Example of Misleading Statistics

<img src="./images/12_EthicsOfGraphing/Trump_employment.png" width=85% alt="">
From the 2026 State of the Union address

Example of Misleading Statistics

<img src="./images/12_EthicsOfGraphing/Trump_employment_full.png" width=85% alt="">
From the 2026 State of the Union address

Fact-checking article published by BBC News on Wednesday, February 25, 2026: https://www.bbc.com/news/articles/cgmlzg0p8k2o

Mistake or Misleading?

In the following slides, we will look at a few ways a graph can convey information incorrectly, and the principles for avoiding them.

Note that these figures come from our textbook:


Visual Cues

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/piechart-1.png" width=100% alt="Lack of Visual Cues">

Visual Cues

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/piechart-1.png" width=100% alt="Lack of Visual Cues">
* Which browser had the largest share?
* Did the share of Firefox users increase or decrease?

Visual Cues

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/piechart-1.png" width=100% alt="Lack of Visual Cues">
Issues:
* No clear way to compare the size of each area between the two figures
* Sections are determined by both angle and area (two different dimensions)

Visual Cues

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/donutchart-1.png" width=100% alt="Results with only area">
Solutions:
* Add labels to the graph
* Use only angle or only area

Here is an example using a donut graph with the same data, but using only area. Without labels, it is still not clear.

Visual Cues

| Browser | 2000 | 2015 |
| :------ | :---: | :---: |
| Opera | 3 | 2 |
| Safari | 21 | 22 |
| Firefox | 23 | 21 |
| Chrome | 26 | 29 |
| IE | 28 | 27 |

Sometimes, just giving the data in a table is clearer.
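
With the numbers in hand, the questions that the pie charts made hard become simple lookups. A minimal sketch using the values from the table (the variable names are illustrative):

```python
# Browser shares (percent) copied from the table.
share_2000 = {"Opera": 3, "Safari": 21, "Firefox": 23, "Chrome": 26, "IE": 28}
share_2015 = {"Opera": 2, "Safari": 22, "Firefox": 21, "Chrome": 29, "IE": 27}

# Which browser had the largest share in each year?
largest_2000 = max(share_2000, key=share_2000.get)
largest_2015 = max(share_2015, key=share_2015.get)

# Change in percentage points from 2000 to 2015 for each browser.
change = {browser: share_2015[browser] - share_2000[browser] for browser in share_2000}

print(largest_2000, largest_2015)  # IE Chrome
print(change["Firefox"])           # -2
```

Differences of one to three percentage points like these are very hard to judge from angles and areas alone.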

Visual Cues

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/two-barplots-1.png" width=100% alt="Pie chart compared to a bar graph">
... or use a bar graph.

Extra (unneeded) perception

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/two-barplots-1.png" width=100% alt="Pie chart compared to a bar graph">
... or use a bar graph.

When to include 0

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/img/class2_8.jpg" width=100% alt="Bar graph without a 0 base">

When to include 0

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/img/class2_8.jpg" width=100% alt="Bar graph without a 0 base">
Issues:
* Bars are not drawn in proportion to the values they represent

It looks as if the 2013 value is triple the 2011 value, but the actual increase is only 16%.
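
A short sketch of why a truncated axis exaggerates differences. The values and the baseline here are hypothetical, chosen only to reproduce a 16% increase that looks like a tripling; they are not the numbers in the figure.

```python
def apparent_ratio(small, large, baseline):
    """Ratio of the drawn bar heights when the axis starts at `baseline`."""
    return (large - baseline) / (small - baseline)

# Hypothetical values: the later bar is 16% larger than the earlier one.
value_2011 = 100
value_2013 = 116

true_increase = (value_2013 - value_2011) / value_2011

# Axis starting at 92 (hypothetical): the bars are drawn 8 and 24 units tall.
truncated = apparent_ratio(value_2011, value_2013, baseline=92)

# Axis starting at 0: the drawn heights match the values.
from_zero = apparent_ratio(value_2011, value_2013, baseline=0)

print(true_increase)  # 0.16
print(truncated)      # 3.0
print(from_zero)      # 1.16
```

The closer the baseline sits to the smallest value, the larger the apparent ratio becomes, even though the data have not changed.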


When to include 0

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/barplot-from-zero-1-1.png" width=100% alt="Bar graph with a 0 base">
Solution:
* When the distance from 0 matters, make sure 0 is displayed in the graph

When to include 0

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/img/venezuela-election.png" width=100% alt="Election results without a 0 base">

When to include 0

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/img/venezuela-election.png" width=100% alt="Election results without a 0 base">
<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/barplot-from-zero-3-1.png" width=100% alt="Election results with a 0 base">

When including 0 isn’t needed

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/points-plot-not-from-zero-1.png" width=70% alt="Life Expectancy by continent - Clustered data">

When we are looking at a distribution of values rather than distances from 0, the axis does not need to include 0.


Distorting Quantities

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/img/state-of-the-union.png" width=100% alt="World economy shown by radius, but area stands out">

Distorting Quantities

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/area-not-radius-1.png" width=100% alt="World economy shown by radius and again by area">
Issues:
* The most obvious visual measure is the area (the US looks about 5x as large as China)
* The values were actually mapped to the radius/diameter (the US is really about 3x as large as China)
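
The distortion comes from geometry: when the radius is proportional to the value, the area grows with the square of the value. A minimal sketch, using a hypothetical pair of values where one is twice the other (not the figures from the chart):

```python
import math

def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2

# Hypothetical values: one quantity is twice the other.
value_small, value_large = 1.0, 2.0

# Misleading encoding: radius proportional to value, so area grows with the square.
misleading_area_ratio = circle_area(value_large) / circle_area(value_small)

# Honest encoding: radius proportional to sqrt(value), so area is proportional to value.
honest_area_ratio = (
    circle_area(math.sqrt(value_large)) / circle_area(math.sqrt(value_small))
)

print(round(misleading_area_ratio, 6))  # 4.0
print(round(honest_area_ratio, 6))      # 2.0
```

If circles must be used, scaling the radius by the square root of the value keeps the visible area in proportion to the data.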

Distorting Quantities

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/barplot-better-than-area-1.png" width=55% alt="World economy shown as a bar graph">

A bar graph is easier to read.


Distorting Quantities

<img src="https://i0.wp.com/ticktockmaths.co.uk/wp-content/uploads/2024/09/15-2-1024x530.png?ssl=1" width=55% alt="World economy shown as a bar graph">

(from https://ticktockmaths.co.uk/badgraphs/)


Distorting Quantities

<img src="./images/12_EthicsOfGraphing/GunDeaths.jpg" height=67%>

Distorting Quantities

<img src="./images/12_EthicsOfGraphing/GunDeaths.jpg" height=67%>
Issue: The y-axis is inverted, giving the impression that fewer deaths occurred after Florida enacted its "Stand Your Ground" law.

Meaningful Order

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/do-not-order-alphabetically-1.png" width=55% alt="Murder rates by state alphabetically and again by rate order">


Meaningful Order

<img src="https://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles_files/figure-html/reorder-boxplot-example-1.png" width=65% alt="Income distributions by region alphabetically and again by income median">
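
Ordering categories by value rather than alphabetically is a small step before plotting. The sketch below uses hypothetical category names and numbers (not the data in the figures) to order single values by size and groups of values by their median.

```python
from statistics import median

# Hypothetical rates per category (not the data in the figures).
rates = {"Alpha": 2.1, "Bravo": 7.4, "Charlie": 0.9, "Delta": 4.3}

# Alphabetical order carries no information about the values.
alphabetical = sorted(rates)

# Ordering by value makes the ranking visible at a glance.
by_value = sorted(rates, key=rates.get, reverse=True)

# For distributions (such as boxplots), order groups by a summary like the median.
incomes = {"East": [3, 9, 4], "North": [8, 12, 10], "South": [1, 2, 6]}
by_median = sorted(incomes, key=lambda group: median(incomes[group]))

print(alphabetical)  # ['Alpha', 'Bravo', 'Charlie', 'Delta']
print(by_value)      # ['Bravo', 'Delta', 'Alpha', 'Charlie']
print(by_median)     # ['South', 'East', 'North']
```

The resulting list can then be passed to whatever plotting tool is in use as the category order.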