Project 1 will be an individual
project to give people practice with writing a web-based application that
visualizes the same data in multiple ways using R and Shiny and ggplot2
and Shiny Dashboard, and get everyone ready to contribute to the group
projects to come. In this project everyone will learn how to import data,
use R to manipulate the data, and create an effective user interface for
visualizing and analyzing this data on the classroom wall. This will give
everyone a common basis for communication in the later group projects
where people will start to specialize in different tasks.
Litterati is a nice example of modern
citizen science projects with 'average people' collecting and analyzing
data around the planet. In this case people are using smartphones to
photograph and mark the location of litter being picked up, and then
tagging (with the help of AI and ML) the kind of litter. Some people are
better at tagging the type of litter than others. Sometimes the GPS
coordinates are more accurate than other times. Sometimes data gets
corrupted. Sometimes people are just bad at spelling or typing.
In this case the dataset is for a
particular 'challenge' where a group of people try to meet a particular
goal, in this case a 2019 challenge from the Go Green Forest Park group in
the western suburb of Forest Park.
Here we will be creating a tool to
take a look at this data for this one particular project, though it should
be applicable to all the Litterati projects.
The data file can be downloaded from
https://www.evl.uic.edu/aej/424/litterati challenge-65.csv
You should open up the file as a text
file, or as a spreadsheet, to take a quick look and get a sense of what the
data looks like and how much there is. I think the fields are pretty
self-explanatory.
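As a first look from the R console, something like the sketch below may help. It assumes the file was saved next to your script under the name used in the URL, and it falls back to a tiny stand-in when the file is not there. The column names in the stand-in (litterId, litterTimestamp, lat, lon, tags, username) are hypothetical, so check them against the real file.

```r
# Assumption: the CSV was downloaded into the working directory under this name.
csv_path <- "litterati challenge-65.csv"

# Hypothetical two-row stand-in so the sketch runs without the real file.
sample_csv <- "litterId,litterTimestamp,lat,lon,tags,username
1,2019-04-06 16:21:45,41.87,-87.81,\"plastic,wrapper\",picker_a
2,2019-04-07 13:02:10,41.88,-87.82,,picker_b
"

litter <- if (file.exists(csv_path)) {
  read.csv(csv_path, stringsAsFactors = FALSE)
} else {
  read.csv(text = sample_csv, stringsAsFactors = FALSE)
}

str(litter)      # column names and types
head(litter)     # first few rows
nrow(litter)     # how much data there is
summary(litter)  # ranges, which helps spot out-of-range locations
```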
You will be writing your code to run full screen in a web browser and it
should run on all current browsers (Chrome, Safari, Firefox, Explorer,
Edge, etc.) but the main evaluation and demonstration will be done on our
classroom wall which runs the latest stable version of Chrome under
Windows 10. The screen size is 11520 by 3240 but assume some space will be
lost for borders, tool bars etc. The fonts and visualization primitives you create should
work effectively at that scale. The user should not need to scroll the
window, ever, so you should experiment with different ways to organize
the information and controls to find the most effective combinations.
Users will be using touch to interact so make sure your controls are
reachable and at an appropriate size for people to use touch. You can
(and should) develop your solution on a typical laptop / desktop
computer, just be sure to test on the classroom wall regularly before
turning your solution in to make sure it works by default at that scale
and resolution. The project will be graded in terms of how it works on
the classroom wall.
The demonstration project from week 2 in class should give you a good
starting point.
You should use ggplot2 for all of your chart plotting. You should use
leaflet for your map work. If you use another library without permission
you will lose points.
You should do a little cleanup to remove lines with errors
(i.e. locations out of range). Any items that are not tagged should get
an 'untagged' tag. Any users without a user name should have an
appropriate (and unique) user name generated for them.
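One possible shape for this cleanup is sketched below in base R. The column names (lat, lon, tags, username, user_id), the bounding box, and the "user_" naming scheme are all assumptions made for illustration, so adjust them to what you actually find in the file.

```r
# Hypothetical sample rows; the real column names may differ.
litter <- data.frame(
  lat      = c(41.87, 0, 41.88, 95.0),
  lon      = c(-87.81, 0, -87.82, -87.80),
  tags     = c("plastic,wrapper", "paper", "", NA),
  username = c("picker_a", "", NA, "picker_b"),
  user_id  = c(101, 102, 103, 104),
  stringsAsFactors = FALSE
)

# Assumption: a loose bounding box around the Forest Park area.
lat_range <- c(41.0, 43.0)
lon_range <- c(-89.0, -87.0)

# Keep only rows whose location falls inside the box.
in_range <- !is.na(litter$lat) & !is.na(litter$lon) &
  litter$lat >= lat_range[1] & litter$lat <= lat_range[2] &
  litter$lon >= lon_range[1] & litter$lon <= lon_range[2]
clean <- litter[in_range, ]

# Items with no tag get an 'untagged' tag.
no_tag <- is.na(clean$tags) | trimws(clean$tags) == ""
clean$tags[no_tag] <- "untagged"

# Missing user names are generated from the (assumed) numeric user id,
# which keeps them unique as long as the ids are unique.
no_name <- is.na(clean$username) | trimws(clean$username) == ""
clean$username[no_name] <- paste0("user_", clean$user_id[no_name])

clean
```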
You will probably want to convert the timestamp to something
R prefers and change it from GMT to Chicago time. While as.Date is handy
for dates, you will probably want to use lubridate's conversion routines
to also take the time into account. You may want to convert the tags
into a more searchable form to get access to all the individual tags -
some nice functions for that include head(), sort(), strsplit(), sum(), table() and unlist(). Note that a tag may be more than one word
separated by a space (e.g. 'hair wrap'). You may find the library
stringr useful and in particular str_count().
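A minimal sketch of both steps follows. It assumes the timestamps look like "2019-04-06 16:21:45" and that multiple tags are separated by commas; both are guesses to verify against the real file, and the sample values are made up.

```r
library(lubridate)

# Assumed timestamp format, stored in GMT/UTC.
stamps <- c("2019-04-06 16:21:45", "2020-01-07 03:00:00")
utc    <- ymd_hms(stamps, tz = "UTC")

# Same instants, shown in Chicago time (handles daylight saving).
local <- with_tz(utc, tzone = "America/Chicago")

hour(local)                 # hour of the day, 0-23
wday(local, week_start = 1) # day of the week as a number, Monday = 1
as_date(local)              # calendar date in Chicago time

# Assumed tag format: comma-separated, a tag may contain spaces.
tags <- c("plastic,wrapper", "paper", "untagged", "plastic, hair wrap", "plastic")

# Split on commas (plus any following spaces) and flatten to one long vector.
all_tags <- trimws(unlist(strsplit(tags, ",\\s*")))

# Count each individual tag and keep the 10 most common.
top_tags <- head(sort(table(all_tags), decreasing = TRUE), 10)
top_tags
```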
You may find that given the number of markers you will want to
use clusterOptions or something similar to keep the map responsive
to interaction. You will probably also need to look into maxNativeZoom and
maxZoom, and possibly the MiniMap in leaflet. You may also want to look
into the various marker possibilities described at https://rstudio.github.io/leaflet/markers.html
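As a rough starting point, the options above can be combined as in the sketch below. The sample points, the column names, and the zoom values (18 and 20) are assumptions to tune for your own map.

```r
library(leaflet)

# Hypothetical sample points near Forest Park.
points <- data.frame(
  lat  = c(41.870, 41.872, 41.875),
  lon  = c(-87.810, -87.812, -87.815),
  tags = c("plastic", "paper", "untagged"),
  stringsAsFactors = FALSE
)

litter_map <- leaflet(points) %>%
  # maxNativeZoom: deepest zoom the tile server provides;
  # maxZoom: how far the user may zoom (tiles are upscaled past native).
  addTiles(options = tileOptions(maxNativeZoom = 18, maxZoom = 20)) %>%
  # Clustering groups nearby markers so the map stays interactive.
  addMarkers(lng = ~lon, lat = ~lat, popup = ~tags,
             clusterOptions = markerClusterOptions()) %>%
  # Small overview map in the corner.
  addMiniMap()

litter_map  # printing the object shows the map in the RStudio viewer
```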
I would highly recommend playing with the data first in the RStudio
console to work out the order of operations to get the data you need in
an appropriate form before starting to use Shiny to interact with it.
Most of the effort in R is getting your data into an appropriate form to
apply the functions that you need to apply to it (this is pretty much
true of all data visualization libraries). Plot the data in RStudio to
get a sense of how much space each of the visualizations will take up.
Then start looking into setting up the overall dashboard in RStudio
with the initial visualizations, and then one by one use Shiny to
control the data presented in the visualizations.
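For example, a first bar chart by hour of day could be tried in the console before any Shiny code exists. This is only a sketch: the hours vector is made up, and the font size is a placeholder that would likely need to be larger on the classroom wall.

```r
library(ggplot2)

# Hypothetical hours of the day (0-23), one value per item picked up.
hours <- c(9, 9, 10, 14, 14, 14, 18, 23)

# Count items per hour, keeping all 24 hours even when a count is zero.
counts <- as.data.frame(
  table(hour = factor(hours, levels = 0:23)),
  responseName = "items"
)

hour_plot <- ggplot(counts, aes(x = hour, y = items)) +
  geom_col(fill = "steelblue") +
  labs(x = "Hour of day", y = "Items picked up",
       title = "Litter picked up by hour of day") +
  theme(text = element_text(size = 18))  # placeholder size

hour_plot  # printing the object draws the chart
```

The counts data frame is the same data the matching table view would show, so it can be reused for both.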
For a C you
need to:
read in the datafile, clean it up, and create a touch based
interactive visualization for the classroom wall in R and Shiny that
initially shows a summary with:
an appropriately centered and scaled map showing the location
of all the litter picked up as part of the project that can be
panned and zoomed by the user
text giving the total number of items of litter picked up
a list of the top 10 pickers by user name and how many items
each has picked
a bar chart showing the amount of litter picked up each day
from the beginning of the data file to the end (i.e. from Apr
4, 2018 to Jan 7, 2020)
a bar chart showing the amount of litter picked up by day of
the week (i.e. from Monday to Sunday)
a bar chart showing the amount of litter picked up by hour of
the day (all 24 of them)
a bar chart showing the number of pieces picked up by tag
(e.g. the individual tags like 'plastic', 'paper', 'wrapper',
'bag', etc., including 'untagged', not the combined tags like
'plastic, balloon') for the top 10 tags
The user should also be able to see the data for each of the bar
charts in table form
The user should be able to choose a name from the top 10 pickers
and see the rest of the charts update to show only the litter
that person picked up on the map, only the times and days that
person was picking, and only the types they picked up in the
bar charts and tables
The user should be able to choose a tag from the top 10 tags and see
the rest of the charts update to show only that type of litter
on the map and in the bar charts and tables
The user should be able to bring up information on the dashboard
about who wrote the project, what libraries are being used to
visualize it, where the data came from, etc. (i.e. a typical
'about' screen)
For the C range the 'top 10' lists can be statically created; for
the B and A ranges they need to be created dynamically from
the datafile.
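The selection behavior in the list above can be sketched in Shiny roughly as follows. This is a minimal illustration, not the week 2 demonstration project: it uses plain shiny rather than Shiny Dashboard for brevity, the data and column names are made up, and filter_by_picker and count_by_hour are hypothetical helper names.

```r
library(shiny)
library(ggplot2)

# Hypothetical cleaned data: one row per item picked up.
litter <- data.frame(
  username = c("picker_a", "picker_a", "picker_b", "picker_c", "picker_a"),
  hour     = c(9, 9, 14, 18, 14),
  stringsAsFactors = FALSE
)

# 'Top 10' list created dynamically from the data.
top_pickers <- names(head(sort(table(litter$username), decreasing = TRUE), 10))

# Keep every row for "All", otherwise only the chosen picker's rows.
filter_by_picker <- function(df, picker) {
  if (is.null(picker) || picker == "All") df else df[df$username == picker, ]
}

# Items per hour, with all 24 hours present.
count_by_hour <- function(df) {
  as.data.frame(table(hour = factor(df$hour, levels = 0:23)),
                responseName = "items")
}

ui <- fluidPage(
  selectInput("picker", "Picker", choices = c("All", top_pickers)),
  plotOutput("hour_plot"),
  tableOutput("hour_table")
)

server <- function(input, output, session) {
  # Every output reads this one reactive, so they all update together.
  selected <- reactive(filter_by_picker(litter, input$picker))

  output$hour_plot <- renderPlot({
    ggplot(count_by_hour(selected()), aes(x = hour, y = items)) +
      geom_col() +
      labs(x = "Hour of day", y = "Items picked up")
  })
  output$hour_table <- renderTable(count_by_hour(selected()))
}

app <- shinyApp(ui, server)
# In an interactive session: runApp(app)
```

Routing the map and the other charts through the same reactive is one way to keep everything in sync when the selection changes.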
For a B you
need to add:
When the user picks a person or tag from the 'top 10' lists the
user should see that person or tag compared to the totals (i.e. for the C
range this data replaced the summary in the charts, for the B range
you see the summary as well as the individual to make it easy to
compare them)
The user should be able to pick a time of day (morning, afternoon,
evening, night) and see how the data from that time period compares
to the summary in the maps, bar charts, and tables
The user should be able to pick a month and see how the data from
that month compares to the summary in the maps, bar charts, and
tables
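One way to set up these comparisons is sketched below. The hour boundaries for morning, afternoon, evening, and night are not defined by the assignment, so the ones used here are an assumption you should choose and document yourself; the data is made up.

```r
# Hypothetical data: hour of day (0-23) and month number (1-12) per item.
litter <- data.frame(
  hour  = c(7, 9, 13, 13, 20, 2),
  month = c(4, 4, 5, 5, 5, 6)
)

# Assumed buckets: night 0-5, morning 6-11, afternoon 12-17, evening 18-23.
litter$time_of_day <- cut(
  litter$hour,
  breaks = c(-1, 5, 11, 17, 23),
  labels = c("night", "morning", "afternoon", "evening")
)

# The user's selection, here the afternoon items.
selection <- litter[litter$time_of_day == "afternoon", ]

# Side-by-side counts per month: the summary next to the selection.
comparison <- data.frame(
  month          = month.abb,
  all_items      = as.vector(table(factor(litter$month, levels = 1:12))),
  selected_items = as.vector(table(factor(selection$month, levels = 1:12)))
)

comparison
```

A data frame in this shape can feed both a table view and a grouped bar chart that shows the selection against the summary.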
For an A
you need to add:
let the person change the map style in leaflet from 3 useful
choices
instead of limiting the user to the top 10 pickers and tags, give
the user access to a list of all pickers and tags to use in their
selection
While the B range lets the user pick one person, tag, time of day, or
month and compare that data to the summary, in the A range the
user can select up to 3 and see them compared to each other and the
summary
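The map style choice and the 'up to 3' selection might be wired up along the lines below. The three tile providers, the input ids, the picker names, and the make_map helper are assumptions for illustration; pick styles that are actually useful for this data.

```r
library(shiny)
library(leaflet)

# Three assumed map styles, as label = provider name.
map_styles <- c(
  "Street map" = "OpenStreetMap",
  "Satellite"  = "Esri.WorldImagery",
  "Light"      = "CartoDB.Positron"
)

# Hypothetical helper: a base map in the chosen style, centered near Forest Park.
make_map <- function(style, lng = -87.81, lat = 41.87) {
  leaflet() %>%
    addProviderTiles(style) %>%
    setView(lng = lng, lat = lat, zoom = 14)
}

# Control for choosing the map style.
style_input <- radioButtons("map_style", "Map style", choices = map_styles)

# Control that allows up to 3 selections; the choices here are made up
# and would come from the full list of pickers in the data file.
picker_input <- selectizeInput(
  "pickers", "Pickers (up to 3)",
  choices  = c("picker_a", "picker_b", "picker_c", "picker_d"),
  multiple = TRUE,
  options  = list(maxItems = 3)
)

# Inside a server function the map could be rendered with:
#   output$map <- renderLeaflet(make_map(input$map_style))
light_map <- make_map(map_styles[["Light"]])
```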
Graduate
Students need to add:
The user should be able to see images of the actual litter that was
picked - note this is intentionally vague in terms of how you show the
litter. Come up with a useful, viable way and defend it.
In all of these cases you need to make sure that your visualizations are
well constructed with good color and font choices, proper labeling, and
that they effectively reveal the truth about the data to the user.
Note that as part of the web page part of the grade you will need to use
your interface to show your findings, so make sure that the way your
interface displays information is clear.
For this project you should host your solution using Shinyapps.io. For
later projects we will move to a local server. This kind of deployment is
covered in the 'Learn Shiny' tutorials.
Your code should be made available on
GitHub (https://github.com/)
in a public repository for the project. You can keep the repository
private while doing your development. I would suggest setting up the
GitHub project early and regularly pushing code to it as a backup.
It is important to note that
'getting it to work' is just a prerequisite to using the application to
find answers to your questions. It is that usage that will give you ideas
on how to improve your app to make it easier and more intuitive to find
those things. Writing the application at the last minute pretty much
guarantees that you will not come up with an intuitive interface.
Many of the routines you write for this project will be used again and
expanded upon in the upcoming projects - e.g. all of the projects will
need graphs, so it is a good idea to write your code in a way that it is
reusable so you can modify it rather than totally rewriting it later.
Chrome's Developer Tools allow you to
emulate screens of different sizes (view / developer / developer tools /
settings / devices) and while the current max size is 9999 pixels wide, it
may help you to do more of your development remotely.
You should create a set of public web pages that describe your
work on the project. This should include:
1 page/tab with a link to your visualization solution optimized for
the classroom wall, and a description of how to use your application and
the things you can do with it.
1 page/tab on the data you used, including where you got it and what you
did to it.
1 page/tab with links to your project on GitHub giving access to your
well commented source code, any necessary data files, and any
instructions necessary to run it. These instructions should start from
the assumption that the reader has a web browser on their computer and
tell the reader everything else they need to know and do to get it
running using RStudio, including installing all the required software.
1 page/tab on what interesting things you found about the data using
your application. Are there particular people picking more than others?
Is there a time of day or a time of year when more people are picking
more litter? Are there more of a particular kind of litter in particular
areas?
all
of which should have plenty of screenshots with meaningful captions. Web
pages like this can be very helpful later on in helping you build up a
portfolio of your work when you start looking for a job so please put
some effort into it.
Be sure to document any external libraries, tools, etc. that you make
use of - give credit where credit is due for everything that you didn't
create yourself.
You should also create a 2-3 minute YouTube video showing the use of
your application including narration with decent audio quality. That
video should be in a very obvious place on your main project web page.
The easiest way to create the video is to use a screen-capture tool
while interacting with your application, or using a camera while
interacting with the classroom wall, though you will most likely find
it useful to do some editing afterwards to tighten the video up. If
you do decide to use your phone or tablet to make the video, then
please shoot the video in landscape rather than portrait orientation.
It's also a good idea to have a video like this available as a backup
during your presentation just in case of gremlins.
I will be linking your web page to
the course notes so please send Andy and the TA a nice jpg image of your
visualization for the web along with the link to your website before the
deadline. The image should be named
p1.<your_last_name>.<your_first_name>.jpg and be roughly
1920 x 540.
I would prefer that every student
presents their work to the class, but given the class size that will be
impractical, so we will spend the Tuesday class after the project is due
meeting in groups of <3-5> where each student will show their
solution to their local group, and the group will discuss the merits of
each solution. By the end of Tuesday's class each group will produce a
report (including names of all the group members) on what the group
liked about each solution and make that available on one of the group
member's public web pages and email the location of that page to Andy.
The group will also choose an overall favorite solution to present on
Thursday.
On Thursday each group will have <5> minutes to present their
group's favorite solution and discuss what they liked about it on the
classroom wall.
This week is also a very good time to find people to work with on
Projects 2 and 3 based on the work they show in class and all of the
solutions posted on the course web pages.
last
revision 2/6/2020 - clarified the valid dates in the file