Week 4

Information Visualization



First, let's talk about the Project 1 presentations next week.

You will be demonstrating your application to the class.

Since there are 36 students and 4 presentation days, there will be 9 presentations per day. Each person will have 4 minutes to present his/her solution.
You will get a warning at the 3-minute mark. Everyone is doing the same project, so you should focus your talk on your solution and not the problem.

After each day's talks you should update your Project 1 web page with a list of the things you would improve in your project based on what you saw in the presentations that day. This list should be on a separate web page, in an obvious location so I can find it easily.
 



There are a lot of different ways to visualize different types of information so we are going to spend some time looking at the variety ...

Let's start with a nice static visualization of different espressos

Visualization of the contents of different types of espresso
http://www.lokeshdhakar.com/2007/08/20/an-illustrated-coffee-guide/

a related chart with more data relating caffeine and calories is http://www.informationisbeautiful.net/visualizations/caffeine-and-calories/
The Buzz vs the Bulge - 2D chart of caffeine vs calories for different foods and drinks


and how about the growth in the number of Crayola Crayons where the number of colours doubles every 28 years

Growth in the number of colors for Crayola crayons over the years
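
To make "doubles every 28 years" concrete, here is a minimal Python sketch of the growth rule N(t) = N0 * 2^(t/28). The function name and the starting count of 8 are assumptions chosen for illustration, not values read off the chart.

```python
def projected_colors(start_count, years_elapsed, doubling_period=28.0):
    """Exponential growth: the count doubles once per doubling_period years."""
    return start_count * 2 ** (years_elapsed / doubling_period)

# Illustrative starting value only (an assumption, not taken from the chart).
for years in (0, 28, 56, 84):
    print(years, projected_colors(8, years))
```

Plotted on a log scale, a rule like this shows up as a straight line, which is a quick way to check whether a "doubles every N years" claim fits the data.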


a couple casual info-viz tools:

wordle - http://www.wordle.net/
tagCrowd - http://www.tagcrowd.com/

e.g. a wordle comparison of Obama's Inaugural address to Bush's first inaugural address where words that are said more often are larger.
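
The basic idea of "more frequent words are larger" can be sketched in a few lines of Python. This is only a rough illustration: the linear scaling, the point-size range, and the function name `word_sizes` are assumptions, and Wordle's own sizing and layout rules are not described in these notes.

```python
import re
from collections import Counter

def word_sizes(text, min_pt=10, max_pt=72):
    """Map each word to a font size that grows linearly with its count."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    low, top = min(counts.values()), max(counts.values())
    span = max(top - low, 1)  # avoid dividing by zero when all counts match
    return {w: min_pt + (max_pt - min_pt) * (c - low) / span
            for w, c in counts.items()}

sizes = word_sizes("we the people we the we")
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(word, size)
```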

Wordle of Obama's inaugural address
Wordle of Bush Jr's first inaugural address

and here is a tagCrowd version of the same speeches focusing on the top 50 uncommon English words in those speeches.

TagCrowd of Obama's inaugural address
TagCrowd of Bush Jr's first inaugural address
full texts can be found at: http://www.presidency.ucsb.edu/inaugurals.php
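
Picking out the top uncommon words, as in the tagCrowd version, amounts to dropping very common words before counting. Here is a minimal Python sketch. The tiny stop list and the function name `top_uncommon_words` are assumptions for illustration; this is not tagCrowd's actual word list or algorithm.

```python
import re
from collections import Counter

# A tiny hand-picked stop list (an assumption); real tools use much longer lists.
COMMON_WORDS = {"the", "a", "an", "and", "of", "to", "in", "we", "our", "is", "that"}

def top_uncommon_words(text, n=50, common=COMMON_WORDS):
    """Return the n most frequent words that are not in the common-word list."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in common)
    return counts.most_common(n)

print(top_uncommon_words("the nation and the people of the nation", n=2))
```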

and a site doing similar things to US presidential speeches over time - http://chir.ag/projects/preztags/

Ben Fry has a site looking at changes to Darwin's On the Origin of Species through its various editions - http://benfry.com/traces/

and here was a very nice political one looking at words in the congressional record: http://www.capitolwords.org/congress/111/
here is an archive link: http://web.archive.org/web/20091125090034/http://capitolwords.org/congress/111/


Which leads us into some more dynamic information visualization tools that allow the user to interact with the data.

DiskInventoryX and WinDirStat let you see relative file sizes on disk using treemaps, flattening out the hierarchy and colour-coding by file type, as one example of treemaps - http://www.cs.umd.edu/hcil/treemap-history/index.shtml

Processing has a nice built-in example to do this as well.

Once the map is drawn the user can click on a large (or small) box and see it identified in the hierarchy, or click on part of the hierarchy and see its area. It's easy to explore the larger files, much harder to explore the smallest ones, unless one restricts the map to only a subset of the hierarchy.
Treemap of files in a multi-level directory
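
To show how a hierarchy gets flattened into rectangles whose areas match file sizes, here is a minimal Python sketch of the simplest treemap layout (slice-and-dice, alternating the split direction at each level). The tools named above may well use different layout rules; the tree below and the names `size_of` and `slice_and_dice` are hypothetical.

```python
def size_of(node):
    """Total size of a node: a leaf's own size, or the sum of its children."""
    name, payload = node
    if isinstance(payload, list):
        return sum(size_of(child) for child in payload)
    return payload

def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Append one (name, x, y, w, h) rectangle per leaf file to out."""
    if out is None:
        out = []
    name, payload = node
    if not isinstance(payload, list):      # leaf: takes the whole rectangle
        out.append((name, x, y, w, h))
        return out
    total = size_of(node)
    offset = 0.0
    for child in payload:
        share = size_of(child) / total if total else 0.0
        if depth % 2 == 0:                 # even depth: children side by side
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:                              # odd depth: children stacked
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out

# Hypothetical directory: leaves are (name, size), directories are (name, [children]).
tree = ("root", [("a.txt", 50), ("src", [("b.c", 30), ("c.c", 20)])])
rects = slice_and_dice(tree, 0.0, 0.0, 100.0, 100.0)
for rect in rects:
    print(rect)
```

Because each leaf's area is proportional to its size, tiny files end up as slivers, which is why the smallest files are hard to explore unless the map is restricted to a subtree.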


A similarly styled chart looking at relative amounts of dollars spent (or lost) on various things is The Billion Dollar Gram -
http://www.informationisbeautiful.net/visualizations/the-billion-dollar-gram/

The Billion Dollar Gram

The BBC has a nice treemap of the top 100 sites on the internet - http://news.bbc.co.uk/2/hi/technology/8562801.stm
BBC treemap of top internet sites


Newsmap shows the news of the moment in a similar style - http://newsmap.jp

Newsmap of currently covered news topics


theme river style:
http://www.nytimes.com/interactive/2008/02/23/movies/20080223_REVENUE_GRAPHIC.html
Theme River of Hollywood films

What are people doing in Japan?
http://www.xoxosoma.com/tokyo-tuesday/

and a similar one showing how people in the US spend their days:
http://www.nytimes.com/interactive/2009/07/31/business/20080801-metrics-graphic.html?hp
How Americans are spending their time by hour in the day


name voyager - http://www.babynamewizard.com/voyager
Popularity of different first names over time

job voyager - http://flare.prefuse.org/apps/index
Popularity of different jobs over time

and something similar with stocks from ManyEyes
http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations/stock-data

another similar interactive visualization is Google's ngrams
http://ngrams.googlelabs.com
Google ngrams screenshot

NY Times Billboard Rankings http://www.nytimes.com/interactive/2009/06/25/arts/0625-jackson-graphic.html?hp
NY Times Billboard ranking comparison

and another nice one related to the media of music http://www.nytimes.com/imagepages/2009/08/01/opinion/01blow.ready.html

Popularity of different media for playing music


Traffic Fatality Visualization by John Nelson
http://uxblog.idvsolutions.com/2012/12/five-years-of-traffic-fatalities.html





Here is a visualization of 50 years of space exploration

Visualization of space probes to other places in the solar system



and the new Google Trends site is interesting http://www.google.com/trends

and more modern stuff here - http://www.smashingmagazine.com/2007/08/02/data-visualization-modern-approaches/

Growth of Target (earthquake data is often visualized the same way) - http://projects.flowingdata.com/target/
We will talk more about animation later in the course. This one is very good for getting a visceral feel for the rate of expansion and the locations, but for more numeric comparisons it would be good to augment this with a graph showing how many stores open per year across the country or in different regions.
Growth of Target stores around the US
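
The kind of companion graph suggested above only needs counts of openings per year (or per region and year). Here is a minimal Python sketch with a text bar chart. The records are made up for illustration and are not real Target data; the function names are also assumptions.

```python
from collections import Counter

# Hypothetical (opening_year, region) records; NOT real store data.
openings = [(1962, "Midwest"), (1962, "Midwest"), (1966, "West"),
            (1966, "Midwest"), (1969, "South")]

def openings_per_year(records):
    """Count store openings for each year, sorted by year."""
    return sorted(Counter(year for year, _ in records).items())

def openings_per_region(records):
    """Count store openings for each (region, year) pair."""
    return Counter((region, year) for year, region in records)

for year, count in openings_per_year(openings):
    print(year, "#" * count)
```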



here is a site with lots of nice examples http://flowingdata.com/




London Underground Map by Harry Beck

It's more of a diagram than a map, as geography is less important than visibility and consistency.

London Underground Diagram

A nice 25-minute video used to be available at:
http://smashingtelly.com/2008/10/16/design-classics-the-london-underground-map/

and a shorter 4-minute excerpt:
http://www.youtube.com/watch?v=Bg3pfUqdLp4

and you can see the history of the maps at:
http://homepage.ntlworld.com/clivebillson/tube/tube.htm

Compare this map of the CTA
http://www.transitchicago.com/assets/1/clickable_system_map/200806C.htm

to this map of the CTA
CTA L train map

and to this map of the CTA
CTA Map

Line Map

London Underground Line Map



Here is an interesting way of visualizing the distance to nearby stars from http://strangemaps.wordpress.com/ in terms of which Earth TV programs they are just now receiving (now a few years out of date):

What TV programs are other star systems receiving
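
The arithmetic behind this map is simple: a broadcast travels one light-year per year, so a star d light-years away is only now receiving what was broadcast d years ago. Here is a small Python sketch; the distances and the year used below are hypothetical round numbers, not values from the map.

```python
def year_now_arriving(distance_light_years, current_year):
    """Broadcast year of the signal currently reaching a star at that distance."""
    return current_year - distance_light_years

# Hypothetical distances in light-years (assumptions for illustration only).
for distance in (5, 10, 25, 50):
    print(distance, "light-years away: receiving broadcasts from",
          year_now_arriving(distance, 2015))
```

This also explains why such a map goes out of date: every year, each star "moves on" to the next year's programs.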

and a little closer to home, the history of Earth reduced to 24 hours from http://www.geology.wisc.edu (though clocks are usually 12 hours per cycle)
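
Reducing Earth's history to a 24-hour clock is a linear rescaling of time. Here is a minimal Python sketch, assuming an age of about 4.54 billion years (a commonly cited figure, not one stated in these notes) and a hypothetical function name `clock_time`.

```python
EARTH_AGE_YEARS = 4.54e9  # assumed approximate age of the Earth

def clock_time(years_ago, total_years=EARTH_AGE_YEARS):
    """Map an event to (hour, minute, second) on a 24-hour clock.

    00:00:00 is the formation of the Earth and 24:00:00 is the present.
    """
    fraction = 1.0 - years_ago / total_years
    total_seconds = int(round(fraction * 24 * 60 * 60))
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds

# An event halfway through Earth's history lands at noon.
print(clock_time(EARTH_AGE_YEARS / 2))
# An event 66 million years ago lands late in the final hour.
print(clock_time(66e6))
```

On this scale one second stands for roughly 52,500 years, which is why all of recorded human history fits into a fraction of the last second.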




Here is a nicely varied set of visualizations of US immigration data over time from FlowingData.
http://flowingdata.com/2008/12/11/winner-of-tufte-books-and-many-other-good-entries/

and a nice visualization of current migration patterns (which would work better on a much larger screen)
http://peoplemov.in




MIT's eyebrowse had some interesting visualizations of browser history in 2010 - http://web.archive.org/web/20100107030938/http://eyebrowse.csail.mit.edu/

and there are a variety of things at chartporn.org



There are many info-viz tools with similar abilities. The one we will take a look at is XmdvTool, and it has a homepage at: http://davis.wpi.edu/~xmdv/

Parallel coordinates in XmdvTool
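
In a parallel coordinates plot each variable gets its own vertical axis, and each record becomes a polyline crossing every axis at a height given by its value. Here is a minimal Python sketch of that mapping. The axis ranges and the sample record are borrowed from the cars file header shown later in these notes; the function name `polyline` is an assumption, and this is not XmdvTool's own code.

```python
# (name, min, max) per axis, taken from the cars file header later in the notes.
AXES = [("MPG", 8.0, 50.0), ("Cylinders", 2.8, 8.2),
        ("Horsepower", 40.0, 250.0), ("Weight", 1500.0, 5500.0)]

def polyline(record, axes):
    """Return one (axis_index, height) point per axis, height scaled to [0, 1]."""
    points = []
    for index, (value, (_, low, high)) in enumerate(zip(record, axes)):
        points.append((index, (value - low) / (high - low)))
    return points

# First four values of the sample record from the cars file.
print(polyline([18.0, 8.0, 130.0, 3504.0], AXES))
```

Scaling every axis to its own range is what lets variables with very different units (MPG vs. pounds) sit side by side.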

This next one, a scatter plot matrix, is particularly useful since it lets a person get a quick overview of the relationships between sets of variables to know where to look deeper.

Scatterplot matrix in XmdvTool
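
A scatter plot matrix is just one small scatter plot for every pair of variables, arranged in a grid. Here is a minimal Python sketch that builds the point lists for each panel. The first value in each column comes from the sample record in the cars file later in these notes; the second value is made up for illustration, and the function name is an assumption.

```python
from itertools import product

def scatterplot_matrix(columns):
    """columns: dict of name -> list of values.

    Returns {(row_var, col_var): [(x, y), ...]} with col_var on the x axis.
    """
    names = list(columns)
    return {(row, col): list(zip(columns[col], columns[row]))
            for row, col in product(names, repeat=2)}

cars = {
    "MPG": [18.0, 30.0],         # second value is hypothetical
    "Horsepower": [130.0, 70.0],  # second value is hypothetical
    "Weight": [3504.0, 2100.0],   # second value is hypothetical
}
panels = scatterplot_matrix(cars)
print(len(panels), "panels")
print(panels[("MPG", "Weight")])
```

With n variables the grid has n x n panels, so the overview stays readable only for a modest number of variables.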

Another similar application that hasn't been updated in a while is Mirage - http://www.bell-labs.com/project/mirage/

One that is still under development is GGobi http://www.ggobi.org/

nView in the early 90s had an interesting twist on this by embedding a 2D map into the parallel coordinates - http://www.youtube.com/watch?v=FI2Wm5CgHSE


Here are some notes introducing Xmdv from a study done by one of our PhD students, Kyoung Park, a couple years ago.

We will take a look at the cars dataset to get a feel for the tool. The dataset is described at:
http://lib.stat.cmu.edu/datasets/cars.desc

Overall, there are 406 observations on the following 8 variables:

Some of these are continuous variables (MPG, displacement, horsepower, weight, acceleration time)
Some are discrete variables (# cylinders, model year)
And one is categorical with no natural ordering (origin)


The XmdvTool site has a downloadable version of the car dataset from the examples. In order to run it with the lite version, however, you will need to edit the fcars.okc file so that the top of the file looks like this:

7 392
MPG
Cylinders
Horsepower
Weight
Acceleration
Year
Origin
8. 50. 4
2.8 8.2 4
40. 250. 4
1500. 5500. 4
5. 30. 4
69.5 82.5 4
.8 3.2 3
18.000000 8.000000 130.000000 3504.000000 12.000000 70.000000 1.000000
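
If you want to check your edited file before loading it, here is a minimal Python sketch that reads a header laid out like the one above. The interpretation is inferred from the sample only: the first line is taken as the number of variables and the number of records, followed by one name per variable, then one line per variable with a minimum, a maximum, and a third integer that is read but not interpreted here. The function name `parse_okc` is hypothetical and this is not XmdvTool's own reader.

```python
def parse_okc(text):
    """Parse okc-style text into (n_dims, n_records, names, ranges, rows)."""
    lines = [line.split() for line in text.splitlines() if line.split()]
    n_dims, n_records = int(lines[0][0]), int(lines[0][1])
    names = [" ".join(tokens) for tokens in lines[1:1 + n_dims]]
    ranges = [(float(t[0]), float(t[1]), int(t[2]))
              for t in lines[1 + n_dims:1 + 2 * n_dims]]
    rows = [[float(v) for v in tokens] for tokens in lines[1 + 2 * n_dims:]]
    return n_dims, n_records, names, ranges, rows

sample = """7 392
MPG
Cylinders
Horsepower
Weight
Acceleration
Year
Origin
8. 50. 4
2.8 8.2 4
40. 250. 4
1500. 5500. 4
5. 30. 4
69.5 82.5 4
.8 3.2 3
18.000000 8.000000 130.000000 3504.000000 12.000000 70.000000 1.000000
"""

n_dims, n_records, names, ranges, rows = parse_okc(sample)
print(n_dims, n_records, names)
# Simple sanity check: every value in a row should fall inside its axis range.
for row in rows:
    print(all(low <= v <= high for v, (low, high, _) in zip(row, ranges)))
```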



Coming Next Time

Project 1 Presentations


last revision 9/14/15