Existing public health surveillance systems typically classify disease cases (for example, chief complaint data from hospital Emergency Departments) into predefined syndromes (respiratory illness, GI illness, etc.) and look for emerging clusters of cases which may be indicative of an outbreak. These "syndromic surveillance" systems are effective at monitoring known illnesses or those with commonly occurring symptom patterns (e.g., influenza-like illness).
But what happens when a new bio-threat emerges with rare or previously unseen symptomology?
Existing systems would simply map these cases into existing syndrome groupings, failing to recognize that something new and different is occurring, and often missing the signal entirely. Thus, there is a critical need for innovation in “pre-syndromic” surveillance for early detection of novel, emerging bio-threats which do not correspond to already known and monitored illness types (Faigen et al., 2015; see footnote 1 below).
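To make this failure mode concrete, here is a minimal sketch of keyword-based syndrome classification. The keyword lists and the `classify` function are hypothetical and far simpler than any real syndrome definition. They are not taken from MUSES or from any deployed surveillance system.

```python
# Hypothetical keyword lists, for illustration only.
SYNDROME_KEYWORDS = {
    "respiratory": ["cough", "shortness of breath", "wheezing"],
    "gi": ["vomiting", "diarrhea", "nausea"],
}

def classify(chief_complaint: str) -> str:
    """Return the first predefined syndrome whose keyword appears in the text."""
    text = chief_complaint.lower()
    for syndrome, keywords in SYNDROME_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return syndrome
    return "other"

complaints = [
    "cough and fever x3 days",
    "nausea, vomiting",
    "blue lips and nausea after eating",  # unusual presentation
    "smoke inhalation",
]
labels = [classify(c) for c in complaints]
print(labels)
```

In this toy setup the unusual "blue lips" case is absorbed into the GI syndrome because it mentions nausea, and the smoke inhalation case falls into a catch-all "other" bucket. Neither would stand out as something new.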
To address this need, we have developed the Multidimensional Semantic Scan (MUSES). MUSES is a data-driven, automated machine learning approach for pre-syndromic surveillance that provides practitioners with personalized, actionable decision support by:
- Learning newly emerging syndromes from free-text ED chief complaints;
- Identifying localized case clusters among subpopulations; and
- Incorporating practitioner feedback to distinguish between relevant and irrelevant clusters.
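As rough intuition for the first step only, the sketch below flags terms that appear much more often in recent chief complaints than in a baseline period. This is a toy term-frequency comparison, not the MUSES algorithm (see Nobles et al., 2022 for the actual method). The function names, thresholds, and example complaints are all assumptions made for illustration.

```python
import re
from collections import Counter

def term_counts(complaints):
    """Count, for each term, how many complaints mention it at least once."""
    counts = Counter()
    for complaint in complaints:
        counts.update(set(re.findall(r"[a-z]+", complaint.lower())))
    return counts

def emerging_terms(baseline, recent, min_recent=2, ratio=3.0):
    """Flag terms whose recent rate is at least `ratio` times the baseline rate."""
    base = term_counts(baseline)
    rec = term_counts(recent)
    n_base, n_rec = max(len(baseline), 1), max(len(recent), 1)
    flagged = []
    for term, k in rec.items():
        recent_rate = k / n_rec
        # Add-one smoothing so previously unseen terms do not divide by zero.
        base_rate = (base[term] + 1) / (n_base + 1)
        if k >= min_recent and recent_rate / base_rate >= ratio:
            flagged.append(term)
    return sorted(flagged)

# Synthetic example data (hypothetical).
baseline = ["cough fever", "chest pain", "vomiting", "fever", "ankle injury", "cough"]
recent = ["synthetic drug overdose", "drug reaction agitated", "cough", "agitated after drug"]

flagged = emerging_terms(baseline, recent)
print(flagged)
```

On this synthetic data the terms "agitated" and "drug" are flagged, while "cough" is not, since it is already common in the baseline. A real system would also need to account for location, time, and patient subpopulations, as the second bullet above describes.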
MUSES is intended for public health practitioners (particularly, but not limited to, state and local health departments) who already collect hospital Emergency Department chief complaints (or similar free-text data) and are interested in using these data for pre-syndromic surveillance, e.g., to complement existing syndromic surveillance practice.
Please see our paper (Nobles et al., 2022) for full details of the MUSES approach. Blinded evaluations and case studies demonstrate that MUSES identifies more events of public health interest and produces fewer false positives than a state-of-the-art baseline. These results show that our detection methodology accurately identifies novel clusters that are meaningful and relevant to public health, substantially improving the accuracy and specificity of detection.
Code:
Open-source Python code for MUSES (Chen et al., 2022) is available on GitHub under a BSD-3-Clause open-source license. Please see the license, documentation, and source code available here. If you have any questions, please feel free to reach out!
Presentations:
Prof. Neill will be presenting two talks at the 2022 Syndromic Surveillance Symposium, an event convened by the Council of State and Territorial Epidemiologists and the CDC’s National Syndromic Surveillance Program, on December 7, 2022. Presentation slides are posted here!
Daniel B. Neill, Mallory Nobles, Ramona Lall, and Robert W. Mathes, “Pre-syndromic surveillance for improved detection of emerging public health threats”. (pdf)
Daniel B. Neill, Boyuan Chen, Yi Wei, and Mallory Nobles, “MUSES Open-Source Software for Pre-Syndromic Disease Surveillance” (training session). (pdf)
Many state and local health departments already collect free-text ED chief complaint data from hospitals in their jurisdiction, for syndromic surveillance and other purposes. This session is geared toward public health practitioners who wish to implement practical, day-to-day pre-syndromic surveillance on this type of data. Training objectives include:
- Understand the purpose and potential value of pre-syndromic surveillance, as distinct from standard syndromic surveillance approaches.
- Gain a high-level understanding of the methodology used for MUSES.
- Download and install MUSES, and prepare data for analysis.
- Run MUSES on sample (synthetic) data or on one’s own collected data.
- Visualize detected clusters using our graphical interface.
- Provide feedback on newly discovered syndromes for MUSES “to monitor” or “to ignore” in future runs.
- Interpret and use detection results.
- Understand practical considerations and limitations for using MUSES in day-to-day practice.
Press coverage:
“A new machine learning model could help public health officials get ahead of the next crisis,” Marketplace Tech - Daily Tech Broadcast, podcast hosted by Kimberly Adams, November 4, 2022. (link)
Ruth Reader, “NYC experiments with artificial intelligence to help predict disease outbreaks”, Politico, November 4, 2022. Featured in Politico’s Future Pulse Newsletter, November 7, 2022, and New York Health Care Newsletter, November 7, 2022. (link)
Shania Kennedy, “Researchers develop machine-learning system to detect public health threats”, HealthITAnalytics, November 4, 2022. (link)
Robert Polner, “Researchers unveil data-driven, automated machine-learning system for detecting emerging public health threats”, New York University, November 4, 2022. (link)
Robert Polner, “Machine learning could warn us about the next public health threat”, Futurity, November 11, 2022. (link)
Robert Polner, “Data-driven, automated machine-learning system for detecting emerging public health threats”, MedicalXpress, November 14, 2022. (link)
Scott Andes, “Public trust in data could have helped China contain the coronavirus”, The Hill, February 19, 2020. (link)
Department of Homeland Security Hidden Signals Challenge, “Announcing the Challenge winners,” May 30, 2018. (link)
References:
Mallory Nobles, Ramona Lall, Robert W. Mathes, and Daniel B. Neill. Presyndromic surveillance for improved detection of emerging public health threats. Science Advances 8(44): eabm4920, 2022. (open access) (pdf)
Boyuan Chen, Yi Wei, and Daniel B. Neill. MUSES: A Pre-Syndromic Approach to Disease Surveillance: A Python Implementation, 2022.
Zachary Faigen, Lana Deyneka, Amy Ising, Daniel B. Neill, Mike Conway, Geoffrey Fairchild, Julia Gunn, David Swenson, Ian Painter, Lauren Johnson, Chris Kiley, Laura Streichert, and Howard Burkom. Cross-disciplinary consultancy to bridge public health technical needs and analytic developers: asyndromic surveillance use case. Online Journal of Public Health Informatics, 7(3):e228, 2015. (pdf)
Footnotes:
1) While others including Faigen et al. (2015) have used the term “asyndromic surveillance” to refer to the detection of case clusters which do not correspond to existing syndrome groupings, we prefer “pre-syndromic surveillance”, as the discovered case clusters can be incorporated into new syndromes and fed back into the system for improved detection of similar clusters going forward. For example, our MUSES software visualization interface allows users to indicate that a cluster should be included as a new syndrome either “to monitor” or “to ignore” in future runs.
Acknowledgements:
We wish to thank the BCD Syndromic Surveillance Unit at NYC Department of Health and Mental Hygiene for providing retrospective data for this study and for participating in the blinded evaluations. This work was partially supported by NSF grants IIS-0916345, IIS-0911032, and IIS-0953330, and by a prize from the Department of Homeland Security Hidden Signals Challenge.
Last update: 11/17/2022.