Deliverables
- Project Code - Full Repo: GitHub Repository
- Project Code - Tarball: Tarball
- Project Poster: Google Slides Poster
- Project Final Report: Final Report
Introduction
A superspreader event (SSE) is an incident in which a single infected individual transmits a contagious disease (in this case COVID-19) to far more people than an infected person typically does. Factors such as close contact, large crowds, and prolonged exposure can contribute to an event becoming a superspreader event.
During the COVID-19 pandemic, events like weddings, religious services, and political rallies often led to spikes in cases. In this project, we analyze how political SSEs uniquely contributed to the spread of COVID-19, with a focus on their dynamics and consequences.
Problem Definition
We aimed to model the spread of COVID-19 at political rallies and compare it to other types of SSEs. Using SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous variables) and SEIR models, we determined whether political events of a specific valence exhibited unique characteristics that amplified disease transmission.
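For reference, the two model families named above are commonly written as follows. This is generic textbook notation, not the specification fitted in this project: the orders $(p,d,q)(P,D,Q)_s$, the regression coefficient $\kappa$, and the rates $\beta$, $\sigma$, $\gamma$ are placeholder symbols, and $x_t$ stands for an exogenous regressor such as event valence.

```latex
% SARIMAX as a regression with SARIMA errors (B is the backshift operator)
y_t = \kappa^{\top} x_t + u_t, \qquad
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,u_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t

% SEIR compartmental model with total population N = S + E + I + R
\frac{dS}{dt} = -\frac{\beta S I}{N}, \qquad
\frac{dE}{dt} = \frac{\beta S I}{N} - \sigma E, \qquad
\frac{dI}{dt} = \sigma E - \gamma I, \qquad
\frac{dR}{dt} = \gamma I
```

Here $y_t$ is the case count on day $t$ and $\varepsilon_t$ is white noise. In the SEIR system, $\beta$ is the transmission rate, $1/\sigma$ the mean incubation period, and $1/\gamma$ the mean infectious period.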
Literature Survey
Superspreader events have long influenced pandemic dynamics. Existing literature highlights their significance in accelerating transmission but often lacks specificity regarding political rallies. Our survey synthesizes insights from key studies and identifies gaps we aim to address.
- Strengths: Papers like Kumar et al. (2020) and Stein (2011) provide foundational insights into SSE dynamics.
- Weaknesses: Limited definitions of SSEs and retrospective analyses hinder proactive prevention.
Proposed Method
Intuition
We hypothesize that political rallies exhibit unique transmission patterns due to crowd behavior and ideology-driven non-compliance with safety protocols.
Approach
Data from reliable sources like the Crowd Counting Consortium and NYT COVID-19 dataset will be analyzed using ARIMA and SEIR models. Our findings will be visualized to identify trends and correlations.
Experiments
Questions
- Do political rallies differ significantly from other SSEs in disease spread?
- How do different political affiliations impact the dynamics of transmission?
Experiment Details
Using ARIMA for time-series forecasting and SEIR for epidemiological modeling, we aim to correlate event characteristics with case spikes. Visualizations will compare infection dynamics across event types.
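The SEIR side of this setup can be illustrated with a minimal sketch. It is not the project's implementation: the function name `simulate_seir`, the forward-Euler integration, and every parameter value below are illustrative assumptions, chosen only to show how a higher transmission rate (for example, from a crowded event) changes the infection curve.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Integrate the SEIR equations with forward-Euler steps.

    beta: transmission rate, sigma: 1 / incubation period,
    gamma: 1 / infectious period. Returns one (S, E, I, R) tuple per day,
    including day 0.
    """
    n = s0 + e0 + i0 + r0
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical parameter values, not estimates from this project's data.
baseline = simulate_seir(beta=0.30, sigma=1 / 5, gamma=1 / 10,
                         s0=9990, e0=0, i0=10, r0=0, days=120)
high_contact = simulate_seir(beta=0.45, sigma=1 / 5, gamma=1 / 10,
                             s0=9990, e0=0, i0=10, r0=0, days=120)

peak_baseline = max(state[2] for state in baseline)
peak_high_contact = max(state[2] for state in high_contact)
print(f"peak infectious, baseline:     {peak_baseline:.0f}")
print(f"peak infectious, high contact: {peak_high_contact:.0f}")
```

Comparing the two runs shows the qualitative effect of interest: with all else equal, a larger `beta` produces an earlier and higher infectious peak.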
Results
Our outputs point to two results. First, no single event had a large effect on case counts or produced a major spike. Second, across all events, Republican-leaning events were the most likely to spread COVID-19, followed by control events and then Democratic-leaning events. Figure 3 above, from Los Angeles County, California, shows that predicted cases are higher when the exogenous valence variable equals 2 (Republican-leaning events in our dataset) than when it equals 0 (no political leaning) or 1 (Democratic-leaning). This supports our initial hypothesis that Republican-leaning events would be more likely to spread the disease. Still, it was somewhat surprising how much lower the projected Democratic-leaning spread was, even compared to the projected control spread. These results are consistent with the general conclusion that ignoring safe disease practices leads to greater spread of the disease.
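The valence coding used above (0 for no leaning, 1 for Democratic-leaning, 2 for Republican-leaning) can be sketched as below. The raw label strings and both function names are hypothetical, since the format of the source labels is not described here. The second helper shows one possible alternative encoding, indicator columns with control events as the reference category, which avoids implying that the three categories are evenly spaced on a numeric scale. It is offered as an option, not as what the project did.

```python
# Codes follow the text: 0 = no leaning, 1 = Democratic, 2 = Republican.
# The raw label strings on the left are assumed for illustration.
VALENCE_CODES = {"none": 0, "democratic": 1, "republican": 2}


def recode_valence(labels):
    """Map raw valence labels to the integer codes 0, 1, 2."""
    return [VALENCE_CODES[label.strip().lower()] for label in labels]


def to_indicator_columns(codes):
    """Turn integer codes into [is_democratic, is_republican] rows.

    Control events (code 0) are the reference category: [0, 0].
    """
    return [[1 if c == 1 else 0, 1 if c == 2 else 0] for c in codes]


codes = recode_valence(["None", "Republican ", "democratic"])
print(codes)
print(to_indicator_columns(codes))
```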
Conclusion
The findings of this study show a straightforward but important point: those who did not follow expert recommendations regarding the pandemic and public health likely contributed the most to increased case counts. Our results suggest that no single event was a major driver of significant spikes in case counts. It’s likely that each event contributed incrementally, compounding the case counts over time.
The analysis presented several challenges, particularly in working with the datasets. For instance, the valence variable had to be recoded so that it could differentiate between the effects of different political events as an exogenous variable in the ARIMA model. The COVID-19 case data was also incomplete, with some dates showing zero case counts. To address this, we applied data cleaning and imputation strategies to interpolate the missing values, aiming to produce more realistic projections. Another complexity was tuning the ARIMA model. The various underlying mechanisms of COVID-19 made it challenging to select parameters that captured the full dynamics of the data, so model optimization took longer than expected.
Looking ahead, we aim to expand our analysis to a broader set of counties, as time constraints limited us to a subset of the 1,736 counties available. We also plan to explore alternative models to check whether our conclusions remain consistent across different methodologies. By refining both our approach and the data, we hope to gain deeper insights into the relationships between political events, public health behaviors, and disease spread.
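The handling of zero-count dates mentioned above can be sketched as follows. The exact imputation strategy is not specified in this write-up, so this is one plausible version under two assumptions: that a zero daily count marks a missing report, not a true zero, and that linear interpolation between the nearest reported days is acceptable. The function name `interpolate_zero_counts` is hypothetical.

```python
def interpolate_zero_counts(counts):
    """Fill zero entries by linear interpolation between nonzero neighbours.

    Zeros before the first or after the last nonzero value are filled with
    the nearest nonzero value. A series with no nonzero values is returned
    unchanged (as floats).
    """
    filled = [float(c) for c in counts]
    known = [idx for idx, c in enumerate(counts) if c > 0]
    if not known:
        return filled

    # Edges: carry the nearest reported value outward.
    for idx in range(known[0]):
        filled[idx] = filled[known[0]]
    for idx in range(known[-1] + 1, len(filled)):
        filled[idx] = filled[known[-1]]

    # Interior gaps: straight line between the surrounding reported days.
    for left, right in zip(known, known[1:]):
        span = right - left
        rise = filled[right] - filled[left]
        for idx in range(left + 1, right):
            filled[idx] = filled[left] + rise * (idx - left) / span
    return filled


daily_cases = [10, 0, 0, 40, 0, 50]  # made-up example series
print(interpolate_zero_counts(daily_cases))
```

In counties where true zero-case days are plausible (for example, early in the pandemic), this rule would overstate cases, so the zero-as-missing assumption should be checked per county before applying it.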