March 11, 2021
My Background & Expertise
As an admitted numbers junkie with time on my hands, I became immediately obsessed with COVID-19 data as the pandemic’s first wave exploded in March 2020. I have no training in epidemiology (or any other clinical field for that matter) and therefore avoid speculating about the data in a way that would require this expertise.
My degree is in mathematics and I’ve made a career of finding insights by crunching large datasets. So, that’s what I’ve been focusing on. To the extent that my work includes projections and forecasts, the rudimentary models I’ve built are limited by my field of expertise — math and data.
My Focus
Even within the limits of my expertise, there is still a vast opportunity to provide useful information. Simply crunching the data to give people a sense of where the numbers are right now, and where they’ve been, can be a useful perspective. Admittedly, the computations I make aren’t even particularly sophisticated. But I enjoy doing the work and pointing out observations that haven’t yet punctured the news cycle.
From time to time, I have made some short-term forecasts, but even then, these forecasts are limited by principles of mathematical modeling and statistics. As such, the farthest-out forecast I think I’ve made in the past year has been three-and-a-half weeks.
Why 3-1/2 weeks? Because there’s a rough correlation between the COVID-19 deaths reported on a given day and the number of new cases reported 24 days prior. This forecast, therefore, is a model based only on math, not epidemiology or any other field.
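A minimal Python sketch of a lag-based projection like this one follows. It illustrates the idea only and is not the actual model behind these forecasts. The function names, the plain Pearson correlation, and the single deaths-to-cases ratio are all assumptions; only the 24-day lag comes from the text above.

```python
from math import sqrt

LAG_DAYS = 24  # lag between reported cases and reported deaths, per the text


def lagged_pairs(cases, deaths, lag=LAG_DAYS):
    """Align new cases on day t - lag with deaths on day t."""
    if len(cases) != len(deaths) or len(cases) <= lag:
        raise ValueError("need two equal-length series longer than the lag")
    return cases[:-lag], deaths[lag:]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def forecast_deaths(cases, deaths, lag=LAG_DAYS):
    """Project deaths for the next `lag` days.

    Assumption: deaths on day t are a fixed fraction of the new cases
    reported on day t - lag, with the fraction estimated from history.
    """
    past_cases, later_deaths = lagged_pairs(cases, deaths, lag)
    ratio = sum(later_deaths) / sum(past_cases)
    # The most recent `lag` days of cases drive the next `lag` days of deaths.
    return [ratio * c for c in cases[-lag:]]
```

Because the projection uses only cases that have already been reported, it cannot reach further ahead than the lag itself, which is why the horizon stops at roughly three and a half weeks.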
The vast majority of my work, though, has simply been processing the day’s data, and looking for the stories it tells.
My Data Sources
I’m currently relying on the New York Times dataset hosted on GitHub:
- Recent County Data: https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties-recent.csv
- State Data: https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv
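For readers who want to work with these files directly, here is a small Python sketch using only the standard library. It assumes the state file has the columns date, state, fips, cases, and deaths, with cases and deaths stored as running totals. The function names are made up for illustration.

```python
import csv
import io
from urllib.request import urlopen

STATES_URL = (
    "https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv"
)


def fetch_text(url=STATES_URL):
    """Download a CSV file and return its contents as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def daily_new_cases(csv_text, state):
    """Turn one state's cumulative case counts into daily new cases.

    Assumes columns named date, state and cases, with cases cumulative.
    The first row's value is simply the cumulative total on that date.
    """
    rows = [r for r in csv.DictReader(io.StringIO(csv_text)) if r["state"] == state]
    rows.sort(key=lambda r: r["date"])  # ISO dates sort correctly as strings
    result = []
    previous_total = 0
    for row in rows:
        total = int(row["cases"])
        result.append((row["date"], total - previous_total))
        previous_total = total
    return result
```

A call such as daily_new_cases(fetch_text(), "Ohio") would return a list of (date, new cases) pairs. Downward revisions in the source data can produce negative daily values, which this sketch leaves as they are.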
Until it discontinued daily data collection a couple of weeks ago, I was using the Covid Tracking Project (covidtracking.com) for state-level data. For international data, I rely on 91-divoc.com.
I have been using the US Census Bureau’s 2019 estimates for county-level population values when adjusting COVID data for local area case rates.
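The population adjustment itself is simple arithmetic. The Python sketch below assumes a rate expressed per 100,000 residents, which is a common convention but not one stated above. The function name is hypothetical.

```python
def rate_per_100k(count, population):
    """Scale a raw count to a rate per 100,000 residents.

    The per-100,000 denominator is an assumption made for illustration.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population
```

For example, 50 new cases in a county of 200,000 people works out to 25 cases per 100,000, the same rate as 5 cases in a county of 20,000.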