Project 1 will be an individual project to give people practice writing an application in Processing and to get everyone ready to contribute to the group projects to come.
The project will focus on household utility data, which could be used either to decrease waste or to cyber-stalk someone.
Below you will find a link to a table containing 11 years of data on:
- Year
- Month
- Average temperature (according to ComEd)
- Electricity usage (kWh / day)
- Natural gas usage (therms / day)
- Water usage (gallons / day)
The data is available here:
http://www.evl.uic.edu/aej/491/10p1/data.csv
In this case the data has already been pre-processed to an appropriate size and an easily readable form. That won't hold true for the later assignments.
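As a starting point for loading the table, here is a minimal sketch of parsing one row. It assumes the file is comma-separated, that the columns appear in the order listed above, and that the month is stored as a number; none of this is confirmed here, so check the real file first. The class and field names are hypothetical.

```java
// Hypothetical record type for one row of the utility table.
// Assumed column order: year, month, average temperature,
// electricity (kWh/day), natural gas (therms/day), water (gallons/day).
class UtilityRecord {
    final int year;
    final int month;
    final double avgTemp;
    final double electricityKwh;
    final double gasTherms;
    final double waterGallons;

    UtilityRecord(int year, int month, double avgTemp,
                  double electricityKwh, double gasTherms, double waterGallons) {
        this.year = year;
        this.month = month;
        this.avgTemp = avgTemp;
        this.electricityKwh = electricityKwh;
        this.gasTherms = gasTherms;
        this.waterGallons = waterGallons;
    }

    // Parses one comma-separated line. Returns null for any line that
    // does not hold six numeric fields (for example a header or blank line).
    static UtilityRecord parse(String line) {
        String[] f = line.trim().split("\\s*,\\s*");
        if (f.length < 6) {
            return null;
        }
        try {
            return new UtilityRecord(
                Integer.parseInt(f[0]), Integer.parseInt(f[1]),
                Double.parseDouble(f[2]), Double.parseDouble(f[3]),
                Double.parseDouble(f[4]), Double.parseDouble(f[5]));
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

In a Processing sketch the lines could come from `loadStrings()`, with each line passed to `parse` and the null results skipped.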
Electricity is typically billed once a month, but every other meter reading is usually an estimate. Water is typically billed every 2 to 3 months, and natural gas every 1 to 2 months. This makes it harder to see short-term trends, but longer-term trends should still be visible.
The temperature data in the file comes from the electricity company. The data seems reasonable, but you might want to get temperature data from another source, and probably look at the minimum and maximum in addition to the average.
You should start by looking at the data itself and doing some simple plots in your favourite spreadsheet / plotting program. There will be some obvious cyclical patterns, such as air conditioning driving up electricity usage dramatically in the summer and the furnace driving up natural gas usage dramatically in the winter.
Your job is to look beyond the cyclical patterns for longer term trends and aberrations that are hiding there, and see what changes in the real world could have caused them. Some of these changes are related to how hot a summer was or how cold a winter was; others are related to human behavior. This is where the cyber-stalking / privacy issues part of the project comes in. If you have access to this utility data and can filter out the repeating patterns and the general environmental changes, can you find interesting events or trends that tell you about the people?
The goal here is to create an interactive visualization tool to aid in your analysis and to back up any conclusions you draw.
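One simple way to filter out the repeating pattern is to average each calendar month across all years and then look at how far each reading sits from its month's average. The sketch below is only one possible approach, not a required method; the class and method names are hypothetical, and it assumes months are numbered 1 to 12.

```java
// Hypothetical helper: seasonal baseline by calendar month.
class SeasonalBaseline {
    // months[i] is the calendar month (1..12) of reading i,
    // values[i] is the usage for that reading.
    // Returns 12 means, index 0 = January. Months with no readings get NaN.
    static double[] monthlyMeans(int[] months, double[] values) {
        double[] sum = new double[12];
        int[] count = new int[12];
        for (int i = 0; i < values.length; i++) {
            sum[months[i] - 1] += values[i];
            count[months[i] - 1]++;
        }
        double[] mean = new double[12];
        for (int m = 0; m < 12; m++) {
            mean[m] = (count[m] == 0) ? Double.NaN : sum[m] / count[m];
        }
        return mean;
    }

    // Difference between each reading and the mean for its calendar month.
    // Large positive or negative values are candidates for a closer look.
    static double[] deviations(int[] months, double[] values) {
        double[] mean = monthlyMeans(months, values);
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i] - mean[months[i] - 1];
        }
        return out;
    }
}
```

This baseline removes only the average yearly cycle. It does not account for an unusually hot summer or cold winter, so a large deviation may still have an environmental cause.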
Here is some more information that will help you with this:
- The house stayed basically the same throughout the 10 years of data collection.
- The house uses electricity to run the air conditioner for cooling, and to run the blower on the furnace for heating.
- The house uses gas for heating, cooking, and drying laundry.
I did some rough measurements of my electricity usage recently:
- Electronic gear (computers, TV, videogame consoles): 4.5 kWh / day
- Cooking & cleaning: 3.5 kWh / day
- Furnace fan (not counting air conditioning): 2 kWh / day
- Pond hardware: 2 kWh / day
In a typical house:
- taking a bath takes 50 gallons
- taking a shower takes 2 gallons per minute
- flushing a toilet takes 3 gallons
- a dishwasher uses 20 gallons
- a top loading clothes washing machine uses 60 gallons, a front loading machine uses 30 gallons
This data should allow you to break up the daily usage of two people into components that can vary over the months and years.
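To illustrate the kind of breakdown meant here, the sketch below turns the per-use water figures above into an estimated gallons-per-day total. The per-use constants come from the list above; the activity counts passed in are purely hypothetical inputs that you would tune against the data, and the class and method names are made up for this example.

```java
// Hypothetical estimator for household water use, in gallons per day.
class WaterEstimate {
    // Per-use figures taken from the list above.
    static final double BATH_GALLONS = 50;
    static final double SHOWER_GALLONS_PER_MINUTE = 2;
    static final double FLUSH_GALLONS = 3;
    static final double DISHWASHER_GALLONS = 20;
    static final double TOP_LOADER_GALLONS = 60;
    static final double FRONT_LOADER_GALLONS = 30;

    // All counts are household totals. Weekly activities are spread over 7 days.
    static double gallonsPerDay(double showerMinutesPerDay,
                                double flushesPerDay,
                                double bathsPerWeek,
                                double dishwasherLoadsPerWeek,
                                double washerLoadsPerWeek,
                                boolean frontLoading) {
        double washer = frontLoading ? FRONT_LOADER_GALLONS : TOP_LOADER_GALLONS;
        return showerMinutesPerDay * SHOWER_GALLONS_PER_MINUTE
             + flushesPerDay * FLUSH_GALLONS
             + bathsPerWeek * BATH_GALLONS / 7.0
             + dishwasherLoadsPerWeek * DISHWASHER_GALLONS / 7.0
             + washerLoadsPerWeek * washer / 7.0;
    }
}
```

For example, with an assumed 20 shower minutes and 10 flushes per day, no baths, 3.5 dishwasher loads and 3 washer loads per week, switching from a top loading to a front loading machine lowers the estimate by 90 / 7, or about 13 gallons per day.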
Here are some dates of things that might affect the utility usage:
- April 2000 - Started an outdoor pond (400 gallons)
- June 2000 - Replaced the air conditioner
- April 2002 - Enlarged pond (total of 650 gallons); had a major infection in the pond fish requiring many water changes
- July 2004 - Installed new programmable thermostat
- July 2006 - 2 week trip out of the country
- Fall 2006 - Started using mulch and less water in the garden
- 2007 - Replaced incandescent bulbs with CF bulbs
- July 2008 - 2 visitors for a month
- March 2008 - Replaced top loading clothes washing machine with front loading model
- September 2009 - Doubled the insulation in the attic
- December 2009 - New high-efficiency furnace and water heater
- June 2010 - 2 week trip out of the country
You will very likely need to look up some other reference material on the web about water, gas, and electricity usage. I did. I also found that some sites over-inflate the usage numbers by 50 to 100% compared to the numbers that I was able to measure. Those numbers also depend quite a bit on where you live and the size of your home. I would recommend looking at a few different sites to get a better feeling for what good average numbers might be. Be sure to cite the websites that you use.
Your visualization and analysis tool should be written in Processing. The Milk / Tea / Coffee example should give you a nice head start on this. You should start by getting Processing installed and doing some initial tests to load in the data and start displaying it. You should then start to draw some sketches of what the interface might look like and how you want to arrange and display the data. How are you going to make use of the screen real estate?
You can use other software to generate statistics about the data if you find that useful. The data here is the kind of thing you would typically see in a graph, which is probably a good way to visualize it. This will give you a chance to write a set of graphing code that you can reuse in future assignments, since graphs are so useful.
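A reusable piece of almost any graphing code is the conversion from a data value to a pixel position. The sketch below shows that linear mapping as a plain Java function; the class and method names are hypothetical. Processing's built-in `map()` function performs the same kind of range conversion, so in a sketch you can use that instead.

```java
// Hypothetical helper: linear mapping from a data range to a pixel range.
class PlotScale {
    // Maps value from [dataMin, dataMax] to [pixelMin, pixelMax].
    // Values outside the data range are extrapolated, not clamped.
    static float toPixel(float value, float dataMin, float dataMax,
                         float pixelMin, float pixelMax) {
        float t = (value - dataMin) / (dataMax - dataMin);
        return pixelMin + t * (pixelMax - pixelMin);
    }
}
```

Screen y coordinates grow downward, so for the vertical axis pass the bottom of the plot area as `pixelMin` and the top as `pixelMax`. With a plot area 400 pixels tall, `PlotScale.toPixel(50, 0, 100, 400, 0)` places a mid-range value halfway up, at y = 200.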
The application you create should help the user perform the following tasks:
Task #1 - document the repeating seasonal utility usage patterns and what affects those patterns (e.g. temperature)
Task #2 - given the results from task #1, document the long term trends and short term variations in those patterns (e.g. given temperature readings you should be able to work out expected values for gas and electricity usage, and then any variation you see might suggest a human cause).
Task #3 - given the results from task #2, try to see if the events listed above had an obvious effect. Also look for other possible effects on the data. Are there other dates where interesting things happen?
Bonus Task #4 - predict Andy's average daily electricity, natural gas, and water usage for the next 4 months of the table. The table of data ends at July 2010. Predict the values for the next four rows: August, September, October, November. At the end of the course we will see who was the best predictor.
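For Task #2, one possible way to work out expected values from temperature is an ordinary least-squares line, with the difference between actual and expected usage (the residual) as the variation to investigate. This is a sketch under a strong simplifying assumption, namely that usage varies roughly linearly with temperature. That may be plausible for gas in the heating season but is unlikely to hold for electricity across the whole year, so you may need to fit seasons separately or choose another model. The class and method names are hypothetical.

```java
// Hypothetical helper: straight-line fit of usage against temperature.
class TemperatureFit {
    // Ordinary least squares for y = intercept + slope * x.
    // Returns {intercept, slope}. Needs at least two distinct x values.
    static double[] fitLine(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        return new double[] { meanY - slope * meanX, slope };
    }

    // Actual usage minus the usage the fitted line expects at that temperature.
    static double[] residuals(double[] temperature, double[] usage) {
        double[] line = fitLine(temperature, usage);
        double[] out = new double[usage.length];
        for (int i = 0; i < usage.length; i++) {
            out[i] = usage[i] - (line[0] + line[1] * temperature[i]);
        }
        return out;
    }
}
```

Plotting the residuals over time, next to the list of known events, is one way to check whether an event lines up with a shift in usage that temperature does not explain.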