It won't be easy finding the relevant and precise data needed to end COVID, but that's part of the work we're doing with the Pandemic Response Challenge.
As an advisory board member for the Pandemic Response Challenge launched by Cognizant and XPRIZE, I’m all too aware of the complexities involved with accurately predicting how policy interventions will change the trajectory of COVID-19. There’s so much data that needs to be considered to accurately make these predictions, and that data can be hard to find – if it exists at all.
The fact is, government policy is just the tip of the iceberg when it comes to understanding the pandemic. We already know that similar policies can have vastly different outcomes in different countries. Compare Singapore and France, two countries with similar overall responses as measured by the Oxford COVID-19 Government Response Tracker (OxCGRT).
Further, even if a certain policy worked well in the past, that doesn’t mean it will work again in future contexts. Predicting the impact of today’s policy choices will only grow more complicated as vaccines slowly roll out in the next 12 months. In short, context matters, which makes it crucial to get the data on that context.
It’s all about the data
This is why the Pandemic Response Challenge is so important, as we’ll learn more from participants about the best data to strengthen their models. Here are a few areas where I believe additional data will make a big difference in producing highly accurate prediction models of daily COVID-19 cases and the best prescription models for intervention plans:
- Data on individual behavior. Understanding the actual behavior changes that occur as the result of a particular policy intervention is the holy grail when it comes to dissecting the causal chain of viral transmission. After all, when we see stay-at-home orders correlate with lower transmission, that’s only because those orders affect individual behavior. So, what if we could measure this activity directly? There are many factors at play: people density, contact time, mixing between regions. Such data – whether from cell phones, public transit companies or elsewhere – could shed light on the extent to which restrictions actually impact behavior. In some countries, similar levels of government restrictions have much less impact on individual behavior. Tracking how actual activities change over time would help models better predict what will work “tomorrow” vs. what worked yesterday.
- Data on compliance and enforcement. Our OxCGRT database might report a mask mandate or stay-at-home order, but are people actually complying with these rules? And how far is the government going to enforce them? The fact is, both compliance and enforcement vary greatly across the world. There’s also reason to believe compliance decays over time. Indeed, the effectiveness of a policy is heavily dependent on what has been happening in that country before the policy was implemented.
- Data on public sentiment. One of my untested hypotheses is that public attitudes and sentiment – the public zeitgeist – play a massive role in overall outcomes. In countries where there’s strong community acceptance of policy measures, we can likely assume these measures will be more effective than in areas where large segments of the community oppose government restrictions. In Australia, for example, Melbourne spent months in lockdown from July to October to eliminate the virus. In November, when a small cluster of 20 cases appeared in a different part of the country, there was a robust (and willing) public response. The city did not want to be the next Melbourne.
- Data on government messaging. While the OxCGRT describes the rules made by governments, it’s not the rules themselves that will help shape public sentiment but the messages and communications from government leaders. Few citizens will read the text of an emergency declaration or executive order – rather, they take their cues from press conferences, the news or mass public communications. In some countries, the messaging of senior government leaders has been in direct opposition to the country’s containment and health policies. The president of Madagascar, for instance, promoted an herbal tea as a COVID-19 cure at the same time as public health officials were trying to enforce restrictions on gatherings. In the U.S., President Trump publicly rejected mask wearing even when many states were trying to enforce mask mandates.
- Data on epidemic mechanics. The goal of Pandemic Response Challenge competitors is to accurately predict how many cases will be reported by countries. This is the real-world outcome their predictions will be tested against. Of course, the number of cases reported is not the same thing as the actual number of cases in a country. Estimates suggest that in some places the reported statistics have captured just 5% of real transmission, while in other places the figure is close to 100%. And, of course, this changes over time: As cases rise in a country and capacity for testing and contact tracing is overwhelmed, it becomes much more likely that cases go unreported. A model could be more successful if it is based on an underlying estimate of "true" epidemiological transmission, something we can never directly measure or know. Data like testing rates or hospital admissions may be useful proxies here.
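To make two of these ideas concrete, here is a minimal Python sketch. It is an illustration only, not a method used by OxCGRT or by any Challenge team. The function names, the exponential half-life form of compliance decay, and every default value are assumptions made for the example. The first function discounts a policy stringency score by a compliance level that fades over time; the second scales reported cases by an assumed ascertainment rate (the share of true infections that end up in official statistics).

```python
import math


def effective_stringency(stringency, days_in_force,
                         initial_compliance=0.9, half_life_days=60.0):
    """Discount a policy stringency score by decaying compliance.

    Assumption: compliance halves every `half_life_days`. The functional
    form and the default values are hypothetical, chosen for illustration.
    """
    if not 0.0 <= initial_compliance <= 1.0:
        raise ValueError("initial_compliance must be between 0 and 1")
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    compliance = initial_compliance * 0.5 ** (days_in_force / half_life_days)
    return stringency * compliance


def estimate_true_cases(reported_cases, ascertainment_rate):
    """Scale reported cases up by an assumed ascertainment rate.

    `ascertainment_rate` is the assumed fraction of true infections that
    are reported (for example, 0.05 where only 5% are captured).
    """
    if not 0.0 < ascertainment_rate <= 1.0:
        raise ValueError("ascertainment_rate must be in (0, 1]")
    return reported_cases / ascertainment_rate


if __name__ == "__main__":
    # Same reported count, very different implied epidemics.
    for rate in (0.05, 0.5, 1.0):
        print(rate, estimate_true_cases(500, rate))
    # A stringency score of 80 after 0, 60 and 120 days in force.
    for days in (0, 60, 120):
        print(days, effective_stringency(80, days))
```

In practice the ascertainment rate and the compliance level would themselves have to be estimated, for instance from the testing rates, hospital admissions or mobility data mentioned above, and both would vary by country and over time.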
Finding the data
None of these ideas are easy to pursue. If they were, someone else would already be pursuing them.
With the exception of aggregate-level mobility data (such as that published by Google and Apple) and testing statistics (such as those collected by Our World in Data), I don't know of any good data sources on these issues that cover every country in the world. That is part of the challenge, and any contribution of additional data would be meaningful.
But I am optimistic that teams will rise to the challenge. I could imagine, for instance, creative teams scraping social media posts to analyze public sentiment and willingness to comply with stricter measures. Or perhaps using satellite imagery to track patterns of activity (cars on the road; night-time lights) or creating maps of population density over time. Air traffic and import records might show how international flows correlate with transmission. Financial transaction microdata might indicate the extent to which people in a village will comply with a lockdown. Global survey responses might tell us how values differ around the world, or how much the general population actually comprehends public health campaigns.
The Pandemic Response Challenge isn't about figuring out what worked in the past. It’s much harder than that. It's about trying to predict what will work in the future. I’m convinced that the best entrants won't just have the most efficient machine-learning algorithm to crunch through our OxCGRT policy data – they will also find the most novel, relevant and precise additional data to strengthen their models.
This blog was adapted from a post that originally appeared on the XPRIZE website.
Learn more about our Pandemic Response Challenge with XPRIZE, a $550k, four-month challenge that focuses on the development of data-driven AI systems to predict COVID-19 infection rates and to prescribe intervention plans.
Or listen to our podcast, in which author and entrepreneur Jason Stoughton speaks to blog author Toby Phillips and Bret Greenstein, Senior Vice President of AI and Analytics at Cognizant, about the XPRIZE Pandemic Response Challenge.
Take a look at the finalists, the winners and the technology behind the Pandemic Response Challenge, and see Parts 1, 2, 3, 5, 6, 7 and 8 of our blog series.