Project 1 is an individual project that gives everyone practice writing a
web-based application that visualizes the same data in multiple ways using
R, Shiny, ggplot2, and Shiny Dashboard, and prepares everyone to contribute
to the group projects to come. In this project everyone will learn how to import data,
use R to manipulate the data, and create an effective user interface for
visualizing and analyzing this data on the classroom wall. This will give
everyone a common basis for communication in the later group projects
where people will start to specialize in different tasks.
This project will focus on using basic graphs to visualize annual air
quality data, to see how air quality around the US has changed since the
early 1980s. We will expand upon this topic in Projects 2 and 3.
The data files can be downloaded from
https://aqs.epa.gov/aqsweb/airdata/download_files.html
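
As a starting point, reading the downloaded files might look like the sketch below. It assumes the zip files have already been extracted into a single folder and that the CSVs are named annual_aqi_by_county_<year>.csv; the function name `read_annual_aqi` and the folder name are hypothetical.

```r
# Read every annual_aqi_by_county CSV in a folder into one data frame.
# ASSUMPTION: the zip files have been extracted into `data_dir` and the CSVs
# keep names of the form annual_aqi_by_county_<year>.csv.
read_annual_aqi <- function(data_dir = "data") {
  files <- list.files(
    data_dir,
    pattern = "^annual_aqi_by_county_[0-9]{4}\\.csv$",
    full.names = TRUE
  )
  if (length(files) == 0) {
    stop("No annual_aqi_by_county files found in ", data_dir)
  }
  # read.csv() turns spaces in column names into dots, which is how a
  # header such as "Days with AQI" becomes Days.with.AQI.
  yearly <- lapply(files, read.csv, stringsAsFactors = FALSE)
  do.call(rbind, yearly)
}
```

Calling `aqi <- read_annual_aqi("data")` once at app startup, outside the server function, avoids re-reading the files for every user interaction.
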
You will be writing your code to run full screen in a web browser and it
should run on all current browsers (Chrome, Safari, Firefox, Explorer,
Edge, etc.) but the main evaluation and demonstration will be done on our
classroom wall which runs the latest stable version of Chrome under
Windows 10. The screen size is 11520 by 3240 but assume some space will be
lost for borders, tool bars etc.
The fonts and visualization primitives you create should
work effectively at that scale. The user should not need to scroll the
window, ever, so you should experiment with different ways to organize
the information and controls to find the most effective combinations.
You can (and should) develop your solution on a typical laptop or
desktop, but be sure to test on the classroom wall regularly before
turning your solution in to make sure it works by default at that scale
and resolution.
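
One possible way to keep the same chart code usable on both a laptop and the wall is to route all text sizing through a single scale factor. The sketch below is only an illustration: the helper name `wall_theme` and the default factor of 3 are assumptions, and the right value has to be found by testing on the wall itself.

```r
library(ggplot2)

# HYPOTHETICAL helper: scale text and legend sizes by one factor so the same
# chart code works on a laptop (scale = 1) and on the wall (larger scale).
# The default of 3 is a guess to be tuned on the actual display.
wall_theme <- function(scale = 3) {
  theme_minimal(base_size = 11 * scale) +
    theme(
      legend.key.size = unit(0.5 * scale, "cm"),
      plot.title = element_text(face = "bold")
    )
}
```

Adding `+ wall_theme()` to each plot then leaves one number to change when moving between displays.
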
The demonstration project from week 2 in class should give you a good
starting point.
You should use ggplot2 for all of your chart plotting.
For a C you need to:
- download and read in all of the annual_aqi_by_county files.
- create a Shiny dashboard using R scaled to the classroom wall
  allowing the user to:
  - pick any year 1980-2018 (and 2019 if the data becomes available)
    from a menu and see for Cook County Illinois:
    - pie chart showing the percentage of days, and a bar chart, and
      table showing the number of days where the AQI was good /
      moderate / unhealthy for sensitive / unhealthy / very unhealthy
      / hazardous
    - pie chart for each individual pollutant (CO, NO2, Ozone, SO2,
      PM2.5, PM10) showing the percentage of days in the year with
      that pollutant as the main pollutant
    - bar chart and table showing the number of CO, NO2, Ozone, SO2,
      PM2.5, PM10 days as the main pollutant
    - note that the percentages should only include days where there
      is data (Days.with.AQI), so the percentages are not necessarily
      out of 365 or 366 days, and the charts and tables need to be
      obvious about the missing data
  - bring up information on the dashboard about who wrote the project,
    what libraries are being used to visualize it, where the
    data came from, etc.
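
The percentage requirement above can be sketched as follows. This is a minimal illustration, not a required design: the function names are hypothetical, and the category column names are assumed to be what read.csv() produces from the file headers, so check them against names() of your data.

```r
library(ggplot2)

# ASSUMPTION: these column names are what read.csv() produces from the
# headers in the annual_aqi_by_county files; verify with names(aqi).
category_cols <- c(
  "Good" = "Good.Days",
  "Moderate" = "Moderate.Days",
  "Unhealthy for Sensitive" = "Unhealthy.for.Sensitive.Groups.Days",
  "Unhealthy" = "Unhealthy.Days",
  "Very Unhealthy" = "Very.Unhealthy.Days",
  "Hazardous" = "Hazardous.Days"
)

# Build a small table of day counts and percentages for one county-year row.
# Percentages use Days.with.AQI as the denominator, not 365 or 366.
aqi_category_summary <- function(row) {
  days <- unlist(row[1, category_cols], use.names = FALSE)
  data.frame(
    category = factor(names(category_cols), levels = names(category_cols)),
    days = days,
    percent = round(100 * days / row$Days.with.AQI, 1)
  )
}

# Pie chart in ggplot2: a stacked bar drawn in polar coordinates.
# The subtitle states how many days actually had data.
aqi_category_pie <- function(summary_df, days_with_aqi, year) {
  ggplot(summary_df, aes(x = "", y = percent, fill = category)) +
    geom_col(width = 1) +
    coord_polar(theta = "y") +
    labs(
      title = paste("AQI categories,", year),
      subtitle = paste(days_with_aqi, "days with AQI data"),
      x = NULL, y = NULL, fill = "AQI category"
    ) +
    theme_void()
}
```

The same summary table can feed the bar chart and the table view, so the three displays always agree with each other.
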
For a B you need to add:
- The user can now choose from a list of 12 US counties including
Cook, Illinois; Hawaii, Hawaii; New York, New York; Los Angeles,
California; King, Washington; Harris, Texas; Miami-Dade, Florida;
San Juan, New Mexico; Hennepin, Minnesota; Wake, North Carolina and
two others that you choose
- for the given year and county show the "C" range visualizations
for that county and year
- show a line graph using the annual data from 1980-2018 showing
lines for the median, 90th percentile, and max AQI over those
years (i.e. the graph should have 3 lines)
- show a line graph and table showing the percentages over the
years for days CO / days NO2 / days Ozone / days SO2 / days PM2.5 /
days PM10 (i.e. the graph should have 6 lines)
- show location of the chosen county on a pannable and zoomable
world map with an appropriate background (that is reasonably
centered and scaled on the US). You need to get the latitude and
longitude values from the aqs_sites.csv file (under the Site
Listing link), and not hard code them. Note that there are NAs and
0s in that file that could affect your results, so be careful.
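
For the map requirement, one simple option is to average the usable site coordinates within a county. The sketch below assumes that aqs_sites.csv, once read with read.csv(), has columns named State.Name, County.Name, Latitude and Longitude; the function name `county_location` is hypothetical, and averaging is only one of several reasonable choices.

```r
# ASSUMPTION: aqs_sites.csv read with read.csv() has the columns State.Name,
# County.Name, Latitude and Longitude; verify against the downloaded file.
county_location <- function(sites, state, county) {
  rows <- sites[which(sites$State.Name == state &
                        sites$County.Name == county), ]
  # Drop sites with missing or zero coordinates before averaging.
  ok <- !is.na(rows$Latitude) & !is.na(rows$Longitude) &
    rows$Latitude != 0 & rows$Longitude != 0
  rows <- rows[ok, ]
  if (nrow(rows) == 0) {
    return(NULL)  # no usable site for this county
  }
  c(lat = mean(rows$Latitude), lng = mean(rows$Longitude))
}
```

The returned pair can then be handed to whichever map widget you use to place a marker, and a NULL result gives the interface a clear case to report when a county has no usable coordinates.
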
For an A you need to add:
- The user can choose any state from a list, and see all the
counties in that state and then choose any of those counties, and be
able to see an alphabetical list of all the counties in the US
(including state names), and be able to search by typing in the
county name, and see the "B" data visualizations on that county,
including its location on the map. Note that there are counties with
the same name in different states.
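
Because county names repeat across states, the menu choices need to carry the state with them. A minimal sketch, assuming the combined AQI data frame has State and County columns (the function names are hypothetical):

```r
# Counties within one state, sorted alphabetically, for a second-level menu.
counties_in_state <- function(aqi, state) {
  sort(unique(aqi$County[aqi$State == state]))
}

# One label per county for the nationwide list. The state is appended so that
# same-named counties in different states stay distinct.
all_county_labels <- function(aqi) {
  pairs <- unique(aqi[, c("County", "State")])
  sort(paste(pairs$County, pairs$State, sep = ", "))
}
```

In Shiny, these vectors could be supplied as the choices of a selectInput(), which by default uses selectize and therefore lets the user type to search; updateSelectInput() can refresh the county menu whenever the chosen state changes.
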
Graduate students need to add:
- The ability for the user to easily compare data from 3 selected
counties on the screen at the same time.
In all of these cases you need to make sure that your visualizations are
well constructed with good color and font choices, proper labeling, and
that they effectively reveal the truth about the data to the user.
Note that for the web page portion of the grade you will need to use
your interface to show your findings, so make sure that the way your
interface displays information is clear.
For this project you should host your solution using Shinyapps.io. For
later projects we will move to a local server. This kind of deployment is
covered in the 'Learn Shiny' tutorials.
It is important to note that 'getting it to work' is just a prerequisite
to using the application to find answers to your questions. It is that
usage that will give you ideas on how to improve your app to make it
easier and more intuitive to find those things. Writing the application at
the last minute pretty much guarantees that you will not come up with an
intuitive interface.
Many of the routines you write for this project will be used again and
expanded upon in the upcoming projects - e.g. all of the projects will
need graphs, so it is a good idea to write your code in a way that it is
reusable so you can modify it rather than totally rewriting it later.