Here is an image from Information Anxiety, p. 286. This time the focus is on over-designing the graphic. Trying to make the graphic 'exciting' makes it harder to get information from it.

3 kinds of lies: lies, damn lies, and statistics (quote attributed to
several different people)
Here is a page with nice examples: http://www.math.yorku.ca/SCS/Gallery/lie-factor.html
Here is a comparison of 3 graphics of the same data.
The first two have a high lie factor. Part of the lie in the
first figure is not taking inflation into account, but the figure
itself 'lies' by using 3-dimensional figures to represent a change in a
single dimension. The extra dimensions make the difference seem larger
- similar to starting a graph with the axis not at the origin. It also
uses foreshortening - pushing the past further back makes it seem
smaller than the present in the front.
In the second figure the use of a line graph makes the data
more truthful, but look at the labelling of the price axis - it's not a
linear scale. Also the second chart isn't really giving the price of
crude oil - it's giving the change in price after setting the price in
1972 to 100.
The modern graphic below from inflationdata.com is a much more truthful
representation of the data. Both scales are linear and in easy to
understand units. The source of the data is cited.
First - Pop vs Soda vs Coke from http://www.popvssoda.com/

lie factor = (size of effect shown in graphic) / (size of effect in data)
- clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data.
- show data variation not design variation
- the number of information-carrying dimensions depicted should not exceed the number of dimensions in the data
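As a rough illustration, the lie factor can be computed by taking it as the ratio of the two effect sizes and assuming that "size of effect" means the relative change between two values. The function names and the numbers below are made up for this sketch.

```python
def effect_size(first, second):
    """Relative change from the first value to the second (an assumed definition)."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Ratio of the effect shown in the graphic to the effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


# Made-up numbers, for illustration only: the data doubles (10 -> 20).
# Drawn as a shape whose area grows from 1 to 4 units:
print(lie_factor(1.0, 4.0, 10.0, 20.0))  # 3.0
# Drawn as a 3D object whose volume grows from 1 to 8 units:
print(lie_factor(1.0, 8.0, 10.0, 20.0))  # 7.0
```

A value near 1 means the graphic shows about the same effect as the data; the extra dimensions in the made-up example push it well above 1.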
The purpose of visualization is to make it easy for the user to see the
patterns, the similarities, the differences in the data. This involves
both the variation in the data itself and the ability of a human being
to perceive variation.
In general you do not want to let the computer use its default values.
Unless you are using a specific program for a specific field the
default values will not be right for your work. This is especially true
of programs like Word and Excel (though both have improved a lot in the
last couple of years in this regard).
Let's start with tables -
the format of a table can greatly enhance or reduce its readability.
Here is a table from the EPA - the Total Emissions column of data is
centered, making it very hard to compare the values within.
| Source Sector | Total Emissions |
|---|:---:|
| Electricity Generation | 652,314 |
| Fires | 14,520,530 |
| Fossil Fuel Combustion | 1,499,367 |
| Industrial Processes | 2,414,055 |
| Miscellaneous | 33,786 |
| Non Road Equipment | 22,414,896 |
| On Road Vehicles | 62,957,908 |
| Residential Wood Combustion | 2,704,197 |
| Road Dust | 0 |
| Solvent Use | 3,294 |
| Waste Disposal | 2,018,496 |
A better version of the table would be the following where both the
sources and the amount of emissions are easier to see and quickly grasp:
| Source Sector | Total Emissions |
|---|---:|
| Electricity Generation | 652,314 |
| Fires | 14,520,530 |
| Fossil Fuel Combustion | 1,499,367 |
| Industrial Processes | 2,414,055 |
| Miscellaneous | 33,786 |
| Non Road Equipment | 22,414,896 |
| On Road Vehicles | 62,957,908 |
| Residential Wood Combustion | 2,704,197 |
| Road Dust | 0 |
| Solvent Use | 3,294 |
| Waste Disposal | 2,018,496 |
Here is a made-up table - it's hard to see any pattern in the Yes/No
values.
|  |  |  |
|---|---|---|
| Yes | No | Yes |
| Yes | No | No |
| No | Yes | Yes |
| No | No | No |
| Yes | Yes | No |
A better version of the table would be:
|  |  |  |
|---|---|---|
| Yes | - | Yes |
| Yes | - | - |
| - | Yes | Yes |
| - | - | - |
| Yes | Yes | - |
A different better version of the table using colour to help highlight
the pattern would be:
|  |  |  |
|---|---|---|
| Yes | No | Yes |
| Yes | No | No |
| No | Yes | Yes |
| No | No | No |
| Yes | Yes | No |
Here is a table from the Nielsen Games page:
http://blog.nielsen.com/nielsenwire/media_entertainment/top-pc-game-titles-and-consoles-october-2008/
The rightmost
column of numbers is hard to read because it's left justified.

This version is easier to read because the right column of numbers is
right justified. The decimal points line up and bigger numbers look bigger.
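If you are producing a column of numbers as plain text, one way to get this effect is to give every value the same number of decimals and then pad on the left. This is only a sketch: the helper name `right_justify` and the values are made up, not taken from the Nielsen table.

```python
def right_justify(values, decimals=1):
    """Format numbers with a fixed number of decimals, padded to a common width."""
    formatted = [f"{v:,.{decimals}f}" for v in values]
    width = max(len(s) for s in formatted)
    return [s.rjust(width) for s in formatted]


# Made-up values, for illustration only.
for line in right_justify([4.2, 17.5, 103.0, 0.8]):
    print(line)
```

Because every entry has the same number of decimals, right justifying also lines up the decimal points, so the longer numbers really are the bigger ones.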

Be careful of significant digits
Your table should not show more accuracy than the accuracy of the data
collection. The computer will happily compute an average out to an
alarming number of digits, but if you only took measurements to one
decimal place then that's as far as you should show any derived
(average, min, max, median, etc.) values.
Programs may also reduce your significant digits by eliminating
trailing zeros (turning 4.20 into 4.2) so you will want to force all
the data of the same type collected in the same way to have the same
number of significant digits.
For presentations, your tables should only show as much accuracy as
needed to get your point across. If two values differ by 100 then you
don't need to show those values to the third decimal place. The
additional detail in the numbers gets in the way of seeing the bigger
trend.
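A small sketch of both points, treating "the same number of significant digits" as "the same number of decimal places", which is a simplification. The helper name `format_like_measurements` and the readings are made up for illustration.

```python
from statistics import mean


def format_like_measurements(value, decimals):
    """Format a derived value with the same number of decimal places as the measurements."""
    return f"{value:.{decimals}f}"


# Made-up measurements, each recorded to one decimal place.
readings = [4.2, 3.9, 4.4]
average = mean(readings)

print(average)                               # far more digits than the data supports
print(format_like_measurements(average, 1))  # 4.2
print(format_like_measurements(4.2, 2))      # 4.20 (trailing zero kept)
```

Fixing the number of decimals in the format string handles both problems: it trims the digits the computer invents and it keeps the trailing zeros the computer would otherwise drop.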
Here is another table from the same Nielsen page. Again left justifying
the numbers makes things harder to read, but there is also an issue of
significant digits. We can presume since they have been in the survey
business a long time that they do have faith in their data out to that
degree of significance, and very likely that number of digits is
necessary to disambiguate data further down the table, but since they
are just presenting the top 10, the extra digits get in the way.


I should point out that if I were creating these tables myself
then I would use white text on a black background, since these web
pages have a black background.
Simple charts
Here is an example charting the population of the USA over the last 8
years. First up is an overly dynamic 3D chart with a hard to read set
of population numbers and a trend that is made even more pronounced by
the 3D viewpoint. Please do not create charts like this.
Here is a less exciting but much more useful version where the data is shown in 2D and the population values have commas to make it easier to see what the numbers actually are. Another good possibility would be to label the vertical axis "Population (in Millions)" and then have 270, 275, 280 etc. as the vertical values.
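Both labelling options are easy to produce if you generate the axis labels yourself. The sketch below uses hypothetical helper names, and the tick values are just the 270, 275, 280 mentioned above written out in full, not numbers read from the chart.

```python
def with_commas(population):
    """Label a raw population count with thousands separators."""
    return f"{population:,}"


def in_millions(population):
    """Label a population count in whole millions, for an axis titled 'Population (in Millions)'."""
    return f"{population / 1_000_000:.0f}"


ticks = [270_000_000, 275_000_000, 280_000_000]
print([with_commas(t) for t in ticks])  # ['270,000,000', '275,000,000', '280,000,000']
print([in_millions(t) for t in ticks])  # ['270', '275', '280']
```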

Here are a couple of variants using lines with the actual data points
highlighted. The big difference is in the Y-axis. One chart suggests
there is slow steady growth; the other suggests rapid steady growth.
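One way to see why the Y-axis matters so much is to work out how much of the chart's height the same change occupies under different axis ranges. This is a sketch with made-up values (in millions), not the numbers from the charts, and `fraction_of_axis` is a hypothetical helper.

```python
def fraction_of_axis(first, last, axis_min, axis_max):
    """Fraction of the plotted axis height covered by the change from first to last."""
    return (last - first) / (axis_max - axis_min)


# Made-up population values in millions, for illustration only.
first, last = 280.0, 300.0

# Axis starting at zero: the rise uses a small part of the chart height.
print(fraction_of_axis(first, last, 0.0, 320.0))    # 0.0625
# Axis cropped tightly around the data: the same rise fills about two thirds of it.
print(fraction_of_axis(first, last, 275.0, 305.0))
```

The data is identical in both cases; only the axis range changes how steep the line looks.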

And now let's go back to the video game console data from above.
First let's see a couple of charts from the older version of Excel. The
older Excel just wasn't very good at making charts - the colours hurt
your eyes, the odd grey background shouldn't be there, etc. It's best
just to avoid using the older Excel to make charts. It takes too much
time to fix everything that is wrong.

The latest version of Excel is much better at dealing with colours and layout, but has also included lots of 3D bling that should be avoided. 3D distorts the data and adds in unnecessary details that make it harder to see what's really going on. Please do not create charts like this.

Instead we can display the data without the 3D. By default Excel will
pick the colours for the various data values as seen above. If the data
values are unrelated then that's fine, but here we could also use the
colour to relate consoles to their manufacturers (blue for Sony,
red for Nintendo, green for Microsoft, and grey for Other, with
the more saturated colours for their latest releases).
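If the chart colours are set from a script rather than by hand, the same scheme can be written down as one hue per manufacturer plus a saturation that depends on whether the console is the latest release. This is a sketch only: the function name, the 0.4 saturation for older consoles, and the exact grey are arbitrary choices, not the colours used in the charts here.

```python
import colorsys

# Hues follow the scheme in the text: blue for Sony, red for Nintendo, green for Microsoft.
MANUFACTURER_HUE = {"Sony": 240 / 360, "Nintendo": 0.0, "Microsoft": 120 / 360}


def console_colour(manufacturer, latest):
    """Return a hex colour: one hue per manufacturer, more saturated for the latest release."""
    if manufacturer not in MANUFACTURER_HUE:
        return "#808080"  # grey for Other
    saturation = 1.0 if latest else 0.4  # 0.4 is an arbitrary choice for older consoles
    r, g, b = colorsys.hsv_to_rgb(MANUFACTURER_HUE[manufacturer], saturation, 1.0)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


print(console_colour("Nintendo", latest=True))   # #ff0000
print(console_colour("Nintendo", latest=False))  # a paler red
print(console_colour("Other", latest=False))     # #808080
```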


It would be good if the colours you choose also work for people who
are colour blind. You should at least make sure that your data doesn't
blend together or disappear for people who are colour blind. The
colours I chose in the last couple of graphs are OK, but an even better
way is to avoid using green in your charts, since red/green is the most
common form of colour blindness.
A good site to check your graphics is: http://colorfilter.wickline.org/
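A rough check you can also run yourself is to compare how light or dark two colours are, since colours that differ clearly in lightness tend to stay distinguishable even when their hues are confused. The sketch below uses the WCAG relative luminance and contrast ratio formulas as that lightness measure. It is not a simulation of colour blindness and does not replace checking the graphic with a tool like the one above.

```python
def relative_luminance(hex_colour):
    """WCAG relative luminance of an sRGB colour given as '#rrggbb'."""
    def linear(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (int(hex_colour[i:i + 2], 16) for i in (1, 3, 5))
    return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b)


def contrast_ratio(colour_a, colour_b):
    """WCAG contrast ratio between two colours (1.0 means identical lightness)."""
    lighter, darker = sorted(
        (relative_luminance(colour_a), relative_luminance(colour_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


print(contrast_ratio("#ffffff", "#000000"))  # about 21, the maximum
print(contrast_ratio("#ff0000", "#00ff00"))  # pure red vs pure green
```

Two colours with a ratio close to 1 rely on hue alone to tell them apart, which is the situation to avoid.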