Project 1 will be an individual project to give people practice with writing
a web-based application in JavaScript, HTML, and D3 and get
everyone ready to contribute to the group projects to come. In
this project everyone will learn how to import data, how to write
an interactive application, and how to create an effective user
interface for visualizing and analyzing this data on the classroom
wall. This will give everyone a common basis for communication in
the later group projects where people will start to specialize in
different tasks.
Events shape attitudes and opinions. This project will focus on
using basic graphs to show information on the ages of people
around the US and around the world, and relate that age
information to various events. Your application should help
someone to investigate differences in the populations in different
regions, and to get some answers to the question 'how many people
were alive (or would remember) when X happened' where X could be the Dust
Bowl in the US, or when The Police broke up, or when England won
the World Cup, etc.
The data files to start with are:
- data for the US as a whole and information on all of the
individual states in 2010-2014:
http://factfinder.census.gov
and in particular (unless they change the link again)
http://www.census.gov/popest/data/state/asrh/2014/SC-EST2014-AGESEX-CIV.html
- full data on populations by year for every country:
https://www.census.gov/population/international/data/idb/region.php
Note that the data in the data files can be rather detailed. You
will probably want to come up with an intelligent way to cluster
the data into fewer categories for some of your visualizations.
Dealing with data files is a major part of visualization, and most
are not as well formatted as these. If there are too many slices
in a pie chart, too many bars in a bar chart, or so much text that
it is unreadable, then you need to do something smart to make it
usable, and document those decisions in your writeup. Saying 'that
is what the computer did' is never an acceptable answer from a
computer scientist.
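As one way to think about clustering, here is a minimal sketch that collapses single-year age counts into fixed-width bins. It assumes rows of the hypothetical form { age, count }; the function name, the bin width, and the open-ended top bin are illustrative choices, not requirements. Check how your data file encodes totals and its oldest age group before trusting the result.

```javascript
// Collapse single-year age counts into fixed-width bins.
// `rows` is assumed to look like [{ age: 0, count: 1234 }, ...] (hypothetical shape).
// `topAge` should be a multiple of `width`; everything at or above it
// is merged into one open-ended bin such as "80+".
function binByAge(rows, width = 10, topAge = 80) {
  const bins = new Map();
  for (const { age, count } of rows) {
    const start = Math.min(Math.floor(age / width) * width, topAge);
    bins.set(start, (bins.get(start) || 0) + count);
  }
  // Return the bins ordered from youngest to oldest, with readable labels.
  return [...bins.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([start, count]) => ({
      start,
      label: start >= topAge ? `${start}+` : `${start}-${start + width - 1}`,
      count,
    }));
}
```

The labels here use non-overlapping ranges (0-9, 10-19, ...) so that each age lands in exactly one bin. Whatever grouping you pick, explain it in your writeup.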
You will be writing your code to run in a web browser and it
should run on all current browsers (Chrome, Safari, Firefox,
Internet Explorer, Edge, etc.) but the main evaluation and demonstration
will be done on our classroom wall which runs the latest stable
versions of Chrome and Firefox under Windows 10. A big advantage of
using scalable graphics is that your visualizations and interface
should naturally scale up to the larger display. The screen size
will be roughly 8196 x 2188 (which is almost the same aspect ratio
as two HD monitors side by side) but assume some space will be
lost for borders, tool bars etc.
The application should have obvious and intuitive controls. We will
use the touch overlay on the wall so your features should be
accessible assuming the user only has a single-button mouse, and
you should make sure any
controls you have are reachable and an appropriate size for a
person to touch.
While scalable graphics scale pretty well, user interaction is a
bit different on a large wall so you should plan on spending
some time testing on the actual wall during office hours to make
sure your application works as expected.
One of the major goals here is to experiment with different ways
to visualize the same data so no other libraries can be used
without prior permission (e.g. no xCharts, D3plus, rickshaw,
etc.). You are in control of the visualization and interaction
and you should not feel limited by what some other libraries
provide. You can use a database to store the data if you wish or
flat files or the cloud. You can use external tools to process
the data as long as you have a pipeline you can document in your
writeup.
All of your graphs should be well labelled and share common
axes and colors to make comparison easier.
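One possible way to get a common axis is to compute a single maximum across every dataset that is currently visible and use it for all of the charts. The sketch below assumes rows of the hypothetical form { age, count }; both function names are illustrative. Converting counts to shares is a design option, not a requirement: it keeps a small state readable next to a large one, at the cost of hiding absolute numbers.

```javascript
// Convert raw counts to each row's share of its own dataset's total.
function toShares(rows) {
  const total = rows.reduce((sum, row) => sum + row.count, 0);
  return rows.map(row => ({
    ...row,
    share: total === 0 ? 0 : row.count / total,
  }));
}

// Largest value of `key` across every visible dataset.
// Use it as the upper end of one y-axis shared by all charts.
function sharedMax(datasets, key) {
  let max = 0;
  for (const rows of datasets) {
    for (const row of rows) {
      if (row[key] > max) max = row[key];
    }
  }
  return max;
}
```

Recompute the shared maximum whenever the user changes one of the visible datasets, then redraw every chart against it.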
The user should be able to bring up information on who wrote the
project, what libraries are being used to visualize it,
where the data came from, etc.
For a C you need:
- user can choose the entire US or a single US state from a
list or menu of all 50 states and you will generate a column
or bar graph showing the distribution of ages in that state
(i.e. how many people are 1, how many are 2 etc.) and also a
pie chart showing the same data (the pie chart will be very
hard to read). The user should be able to change to a
different dataset and all of your charts should update.
- user can choose to cluster the data in your bar/column
chart and the pie chart into age ranges based on 10 year
blocks (e.g. <10, 10-20, 20-30, 30-40, etc.). This
should make the pie chart easier to read, but hides some
details in the data.
- user can choose to show one event (this event could be
social, political, entertainment, sports, science, but in
general should qualify as something that someone would
remember) and your graphs should highlight the people in
both the bar/column chart and the pie chart that were
conscious of that event (age 12 or older - and yes this is
specifically not on the category boundary so you will have
to deal with a partial slice of the pie chart). You should
also give the percentage of people as a number.
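The 'age 12 or older' rule above becomes a cutoff age in the year of the data: someone who was 12 at a 1990 event is 36 in 2014 data. Here is a minimal sketch of that arithmetic, assuming single-year rows of the hypothetical form { age, count }; all function names are illustrative. Working from the single-year rows is what lets a 10-year bin be split into a highlighted part and an unhighlighted part.

```javascript
// Youngest age, in the data year, of someone who was at least
// `minAgeAtEvent` when the event happened. An event after the data
// year yields Infinity, so nobody is counted as remembering it.
function thresholdAge(dataYear, eventYear, minAgeAtEvent = 12) {
  if (eventYear > dataYear) return Infinity;
  return dataYear - eventYear + minAgeAtEvent;
}

// Fraction of the whole population at or above the threshold age.
function awareShare(rows, threshold) {
  let total = 0;
  let aware = 0;
  for (const { age, count } of rows) {
    total += count;
    if (age >= threshold) aware += count;
  }
  return total === 0 ? 0 : aware / total;
}

// Per-bin split into people above and below the threshold, so a
// bar or pie slice can be drawn as two adjacent pieces.
function splitBins(rows, threshold, width = 10) {
  const bins = new Map();
  for (const { age, count } of rows) {
    const start = Math.floor(age / width) * width;
    const bin = bins.get(start) || { start, aware: 0, unaware: 0 };
    if (age >= threshold) bin.aware += count;
    else bin.unaware += count;
    bins.set(start, bin);
  }
  return [...bins.values()].sort((a, b) => a.start - b.start);
}
```

If the data file lumps the oldest ages into a single open-ended group, an event old enough to push the cutoff past that group cannot be resolved exactly, so decide how to handle that case and document it.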
For a B you need to add:
- ability to show data on multiple US states and/or the
entire US simultaneously - for the C part you had one pair
of charts - here the user should be able to show 3
independent pairs of charts to help compare. This makes use
of the extra space afforded by the wide display, and makes
it easier for the user to compare two or three different
datasets. What US states are the most different
population-wise?
- allow the user to choose from a list of 10 different
events and show people who were conscious of that event
across all of the currently visible charts.
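For the three side-by-side pairs of charts, one simple approach is to divide the available width into equal panels and draw each pair inside its own panel. The sketch below is only an illustration: the function name and the margin and gap defaults are hypothetical placeholders, and you should pass in the size you actually measure in the browser rather than hard-coding the wall resolution.

```javascript
// Split a wide display into `count` equal side-by-side panels.
// All sizes are in pixels; `margin` surrounds the whole layout and
// `gap` separates neighbouring panels.
function panelLayout(width, height, count = 3, margin = 40, gap = 40) {
  const panelWidth = (width - 2 * margin - (count - 1) * gap) / count;
  const panelHeight = height - 2 * margin;
  const panels = [];
  for (let i = 0; i < count; i++) {
    panels.push({
      x: margin + i * (panelWidth + gap),
      y: margin,
      width: panelWidth,
      height: panelHeight,
    });
  }
  return panels;
}
```

Each returned rectangle can become the origin and size of one group of charts, so the same drawing code runs once per panel.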
For an A you need to add:
- allow the user to also choose data from at least 5 other
countries around the world, in addition to the data
available on the US and its states, to show up in the 3-way
comparison. Pick countries that are interesting to you, or
that have interesting events, or interesting age patterns.
Given a particular historical event (be it a conflict, a
disaster, or a football match), the age ranges of different
countries may strongly affect how the population in general
views it, or how much of each country even remembers it.
- document several interesting findings on the dataset from
using your application.
You should start by getting D3 installed, running through the
demos, and doing some initial tests to load in the data and start
displaying it.
Once you have a basic shell working you should then start to draw
some sketches of what the interface might look like and how you
want to arrange and display the data. You can use other software
to generate statistics about the data if you find that useful, but
be sure to document that process. Be careful of missing data when
you generate statistics. Look at the actual data and make sure
your statistics make sense.
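Missing data is easy to lose in JavaScript because Number("") evaluates to 0, so a blank cell silently becomes a count of zero. A minimal sketch of one way to guard against that, assuming cells arrive as strings and rows have the hypothetical form { age, count }; both function names are illustrative.

```javascript
// Parse one cell into a number, or null if the cell is blank or
// not numeric. Thousands separators such as "1,234" are accepted.
function parseCount(cell) {
  if (cell === null || cell === undefined) return null;
  const text = String(cell).trim().replace(/,/g, "");
  if (text === "") return null;
  const value = Number(text);
  return Number.isFinite(value) ? value : null;
}

// Population-weighted mean age that skips rows with a missing count
// and reports how many rows were skipped.
function weightedMeanAge(rows) {
  let people = 0;
  let ageSum = 0;
  let skipped = 0;
  for (const { age, count } of rows) {
    if (count === null) {
      skipped += 1;
      continue;
    }
    people += count;
    ageSum += age * count;
  }
  return { mean: people === 0 ? null : ageSum / people, skipped };
}
```

Reporting the number of skipped rows alongside the statistic makes it easier to notice when a result rests on incomplete data.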
Your application should start out showing some data (e.g. data for
the entire US in this case) - a blank screen is not very inviting
and doesn't tell a user what your visualization can do. However,
in past terms the students have shown a desire to show an
overwhelming amount of the data to the user right away. You should
be careful not to overwhelm the user either. As Shneiderman said
"overview first, zoom and filter, then details-on-demand." Appropriate
levels of aggregating data will be very important here.
It is also important to note that 'getting it to work' is just a
prerequisite to using the application to find answers to your
questions. It is that usage that will give you ideas on how to
improve your app to make it easier and more intuitive to find
those things. Writing the application at the last minute pretty
much guarantees that you will not come up with an intuitive
interface.
Many of the routines you write for this project will be used again
and expanded upon in the upcoming projects - e.g. all of the
projects will need graphs, so it is a good idea to write your code
in a way that it is reusable so you can modify it rather than
totally rewriting it later.