RIT COVID Poller
Tracking and graphing historical data is fun and useful for showing trends, but RIT provides no historical data: our COVID dashboard only shows the current numbers, with no way of tracking them over time or seeing trends. To fill that gap, I wrote a Python app that scrapes data in real time from the dashboard and provides an API to access either the current data in JSON format or the historical data. Scraping the site is easy enough, and to make it easier I used code from the RIT COVID Tracking Discord Bot, which pulls data in real time from the dashboard for a few Discord servers.
Once that data is retrieved, it's inserted into a SQLite database. However, because the dashboard is updated incrementally, every dashboard update left me with between 2 and 10 new entries in the database. That gets messy fast, since a single day can end up with multiple entries. To avoid this, every time the database is updated, any previous entries from the same date are dropped and only the latest entry is retained.
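Here's a rough sketch of that "keep only the latest entry per day" step. It isn't the app's actual code: the snapshots table, its column names, and the store_snapshot helper are all made up for illustration, and it assumes naive local timestamps.

```python
import sqlite3
from datetime import datetime

# Hypothetical schema: the real table and column names may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS snapshots (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    taken_at TEXT NOT NULL,           -- 'YYYY-MM-DD HH:MM:SS' scrape time
    total_students INTEGER NOT NULL,  -- illustrative dashboard fields
    total_staff INTEGER NOT NULL
)
"""


def store_snapshot(conn, taken_at, total_students, total_staff):
    """Insert a snapshot, dropping any earlier rows from the same calendar day."""
    day = taken_at.date().isoformat()
    # The connection context manager wraps both statements in one transaction,
    # so the old rows are only dropped if the new row is inserted too.
    with conn:
        conn.execute("DELETE FROM snapshots WHERE date(taken_at) = ?", (day,))
        conn.execute(
            "INSERT INTO snapshots (taken_at, total_students, total_staff) "
            "VALUES (?, ?, ?)",
            (taken_at.isoformat(sep=" ", timespec="seconds"),
             total_students, total_staff),
        )


# Demo: two scrapes on the same day, then one on the next day.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
store_snapshot(conn, datetime(2020, 10, 5, 9, 0), 3, 1)
store_snapshot(conn, datetime(2020, 10, 5, 16, 0), 5, 1)
store_snapshot(conn, datetime(2020, 10, 6, 16, 0), 6, 2)

rows = conn.execute(
    "SELECT taken_at, total_students FROM snapshots ORDER BY taken_at"
).fetchall()
print(rows)
```

Only one row per date survives, and it's always the most recent one written for that day.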
Two endpoints are exposed to access this data at the moment, though more may be added as the semester goes on. The first is /api/v0/latest, which simply shows the current state of the dashboard (with a delay of up to 5 minutes) in JSON format for easy consumption by any other apps that don't want to parse the HTML from the page. The other endpoint is /api/v0/history, which contains all the historical data from the semester.
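If you want to pull from those endpoints in your own code, something like the following minimal sketch should work. The host in BASE_URL is a placeholder (swap in wherever the app is actually running), the helper names are made up, and it assumes both endpoints return JSON; I'm not showing a response body here because the exact fields depend on the dashboard.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Placeholder host, not the real deployment.
BASE_URL = "https://example.invalid/"


def endpoint_url(name, base_url=BASE_URL):
    """Build the full URL for an endpoint such as 'latest' or 'history'."""
    return urljoin(base_url, f"/api/v0/{name}")


def fetch_json(name, base_url=BASE_URL, timeout=10):
    """Fetch an endpoint and decode its JSON body (assumes a JSON response)."""
    with urlopen(endpoint_url(name, base_url), timeout=timeout) as response:
        return json.load(response)


# Usage, once BASE_URL points at a real instance:
#   latest = fetch_json("latest")
#   history = fetch_json("history")
```

Since /api/v0/latest can lag the dashboard by up to 5 minutes, there's no point polling it more often than that.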
For most of the dates in the history endpoint, you'll notice the time is 16:00:00. This is because I wrote the app partway through the semester, so I had to backfill the earlier dates with data sourced from the r/rit community.
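If you need to tell backfilled entries apart from live scrapes, checking for that 16:00:00 time is a reasonable heuristic. This sketch assumes each history entry carries a timestamp string in 'YYYY-MM-DD HH:MM:SS' form, which may not match the real response; is_backfilled is a hypothetical helper, not part of the API.

```python
from datetime import datetime, time

# Backfilled entries were all stamped with this time of day.
BACKFILL_TIME = time(16, 0, 0)


def is_backfilled(timestamp):
    """Guess whether an entry was backfilled, based on its time of day.

    Assumes a 'YYYY-MM-DD HH:MM:SS' string. A live scrape that happened to
    land on exactly 16:00:00 would be misclassified, so treat this as a hint.
    """
    parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return parsed.time() == BACKFILL_TIME


# Made-up timestamps, purely to show the check.
sample = ["2020-09-14 16:00:00", "2020-11-02 16:05:12"]
flags = [is_backfilled(ts) for ts in sample]
```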
As always, the source code for this project is available on GitHub!