Here is a selection of recent papers by the ML4G Lab, arranged by topic. Additional publications are available from our faculty web pages (Neill, Adhikari, Cerda, Shroff).
Our application areas include disease surveillance, combatting the opioid crisis, crime and justice, healthcare, and development. Our methodological areas include event and pattern detection, causal inference, algorithmic fairness, systems modeling, and spatial machine learning.
Applications – Disease surveillance
Mallory Nobles, Ramona Lall, Robert W. Mathes, and Daniel B. Neill. Presyndromic surveillance for improved detection of emerging public health threats. Science Advances 8(44): eabm4920, 2022. (open access) (pdf)
Daniel Zeng, Zhidong Cao, and Daniel B. Neill. AI-enabled public health surveillance: from local detection to global epidemic monitoring and control. In L. Xing, M. L. Giger, and J. K. Min, eds., Artificial Intelligence in Medicine, 437-453, 2021. (pdf)
Roberto C.S.N.P. Souza, Renato M. Assuncao, Daniel B. Neill, and Wagner Meira Jr. Detecting spatial clusters of disease infection risk using sparsely sampled social media mobility patterns. Proc. 27th ACM SIGSPATIAL Intl. Conf. on Advances in Geographic Information Systems, 359-368, 2019. (pdf)
Roberto C.S.N.P. Souza, Renato M. Assuncao, Derek M. Oliveira, Daniel B. Neill, and Wagner Meira Jr. Where did I get dengue? Detecting spatial clusters of infection risk with social network data. Spatial and Spatio-temporal Epidemiology 29: 163-175, 2019. (pdf)
Sriram Somanchi and Daniel B. Neill. Graph structure learning from unlabeled data for early outbreak detection. IEEE Intelligent Systems 32(2): 80-84, 2017. (pdf) (extended version on arXiv)
Applications – Combatting the opioid crisis
B. Allen, R. C. Schell, V. A. Jent, M. Krieger, C. Pratty, B. D. Hallowell, M. Basta, W. C. Goedel, J. L. Yedinak, Y. Li, A. R. Cartus, B. D. L. Marshall, M. Cerda, J. Ahern, and D. B. Neill. PROVIDENT: Development and validation of a machine learning model to predict neighborhood-level overdose risk in Rhode Island. Epidemiology 35(2): 232-240, 2024. (pdf)
B. Allen, D. B. Neill, R. C. Schell, J. Ahern, B. Hallowell, M. Krieger, V. A. Jent, W. C. Goedel, A. R. Cartus, J. L. Yedinak, C. Pratty, B. D. L. Marshall, and M. Cerda. Translating predictive analytics for public health practice: a case study of overdose prevention in Rhode Island. American Journal of Epidemiology 192(10): 1659-1668, 2023. (pdf)
Katie Rosman and Daniel B. Neill. Detecting anomalous networks of opioid prescribers and dispensers in prescription drug data. Proc. 37th AAAI Conf. on Artificial Intelligence, 14470-14477, 2023. (pdf) (supplement)
R. C. Schell, B. Allen, W. C. Goedel, B. D. Hallowell, R. Scagos, Y. Li, M. S. Krieger, D. B. Neill, B. D. L. Marshall, M. Cerda, and J. Ahern. Identifying predictors of opioid overdose death at a neighborhood level with machine learning. American Journal of Epidemiology 191(3): 526-533, 2022. (pdf)
B. D. L. Marshall, N. Alexander-Scott, J. L. Yedinak, B. Hallowell, W. C. Goedel, B. Allen, R. C. Schell, Y. Li, M. S. Krieger, C. Pratty, J. Ahern, D. B. Neill, and M. Cerda. Preventing overdose using information and data from the environment (PROVIDENT): Protocol for a randomised, population-based, community intervention trial. Addiction 117(4): 1152-1162, 2022. (pdf)
B. Allen, J. M. Feldman, and D. Paone. Public health and police: building ethical and equitable opioid responses. Proceedings of the National Academy of Sciences 118(45): e2118235118, 2021. (link)
A. Castillo-Carniglia, W.R. Ponicki, A. Gaidus, P.J. Gruenewald, B.D.L. Marshall, D.S. Fink, S.S. Martins, A. Rivera-Aguirre, G.J. Wintemute, and Magdalena Cerdá. Prescription drug monitoring programs and opioid overdoses: exploring sources of heterogeneity. Epidemiology 30(2): 212-220, 2019. (pdf)
Daniel B. Neill and William Herlands. Machine learning for drug overdose surveillance. Journal of Technology in Human Services 36(1): 8-14, 2018. (pdf) (journal version)
Applications – Crime and justice
Konstantin Klemmer, Daniel B. Neill, and Stephen A. Jarvis. Understanding spatial patterns in rape reporting delays. Royal Society Open Science 8: 201795, 2021. (pdf)
Dylan J. Fitzpatrick, Wilpen L. Gorr, and Daniel B. Neill. Keeping score: predictive analytics in policing. Annual Review of Criminology 2: 473-491, 2019. (link)
Konstantin Klemmer, Daniel B. Neill, and Stephen A. Jarvis. Modeling rape reporting delays using spatial, temporal and social features. Proc. NeurIPS 2018 Workshop on Modeling and Decision-Making in the Spatio-Temporal Domain, 2018. (pdf)
Sharad Goel, Justin M. Rao, and Ravi Shroff. Precinct or prejudice? Understanding racial disparities in New York City’s stop-and-frisk policy. Annals of Applied Statistics 10(1): 365-394, 2016. (pdf)
Brad J. Bushman, Katherine Newman, Sandra L. Calvert, Geraldine Downey, Mark Dredze, Michael Gottfredson, Nina G. Jablonski, Ann S. Masten, Calvin Morrill, Daniel B. Neill, Daniel Romer, and Daniel W. Webster. Youth violence: what we know and what we need to know. American Psychologist 71(1): 17-39, 2016. (pdf) (APA press release)
Magdalena Cerdá, Melissa Tracy, Katherine M. Keyes, and Sandro Galea. To treat or to prevent? Reducing the population burden of violence-related post-traumatic stress disorder. Epidemiology 26(5): 681-689, 2015. (pdf)
Applications – Healthcare
Ougni Chakraborty, Kacie L. Dragan, Ingrid Gould Ellen, Sherry A. Glied, Renata E. Howland, Daniel B. Neill*, and Scarlett Wang (listed alphabetically; *corresponding author). Housing-sensitive health conditions can predict poor-quality housing. Health Affairs 43(2): 297-304, 2024. (open access)
C. A. Koziatek, I. Bohart, R. Caldwell, J. Swartz, P. Rosen, S. Desai, K. Krol, D. B. Neill, and D. C. Lee. Neighborhood-level risk factors for severe hyperglycemia among Emergency Department patients without a prior diabetes diagnosis. Journal of Urban Health 100: 802-810, 2023. (pdf)
Said A. Ibrahim, Mary E. Charlson, and Daniel B. Neill. Big data analytics and the struggle for equity in health care: the promise and perils. Health Equity 4(1): 99-101, 2020. (pdf)
Sriram Somanchi, Daniel B. Neill, and Anil V. Parwani. Discovering anomalous patterns in large digital pathology images. Statistics in Medicine 37: 3599-3615, 2018. (pdf)
Sriram Somanchi, Samrachana Adhikari, Allen Lin, Elena Eneva, and Rayid Ghani. Early prediction of cardiac arrest (code blue) using electronic medical records. Proc. 21st ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2119-2126, 2015. (link)
Applications – Development
Konstantin Klemmer, Godwin Yeboah, Joao Porto de Albuquerque, and Stephen A. Jarvis. Population mapping in informal settlements with high-resolution satellite imagery and equitable ground-truth. Proc. ML-IRL Workshop at ICLR 2020, Addis Ababa, Ethiopia, 2020. (pdf)
Maria de Arteaga, William Herlands, Daniel B. Neill, and Artur Dubrawski. Machine learning for the developing world. ACM Transactions on Management Information Systems 9(2): 9.1-9.14, 2018. (pdf)
Methodology – Event and pattern detection
Charles A. Pehlivanian and Daniel B. Neill. Efficient optimization of partition scan statistics via the Consecutive Partitions Property. Journal of Computational and Graphical Statistics 32(2): 712-729, 2023. (pdf)
Chunpai Wang, Daniel B. Neill, and Feng Chen. Calibrated nonparametric scan statistics for anomalous pattern detection in graphs. Proc. 36th AAAI Conf. on Artificial Intelligence, 4201-4209, 2022. (pdf) (technical appendix)
Dylan J. Fitzpatrick, Yun Ni, and Daniel B. Neill. Support vector subset scan for spatial pattern detection. Computational Statistics and Data Analysis 157: 107149, 2021. (pdf)
Daniel B. Neill. Bayesian scan statistics. In J. Glaz and M. V. Koutras, eds., Handbook of Scan Statistics, 2019. (pdf)
William Herlands, Edward McFowland III, Andrew Gordon Wilson, and Daniel B. Neill. Gaussian process subset scanning for anomalous pattern detection in non-iid data. Proc. 21st International Conference on Artificial Intelligence and Statistics, PMLR 84: 425-434, 2018. (pdf)
Daniel B. Neill. Subset scanning for event and pattern detection. In S. Shekhar and H. Xiong, eds., Encyclopedia of GIS, 2nd ed., Springer, 2017, pp. 2218-2228. (pdf)
Skyler Speakman, Sriram Somanchi, Edward McFowland III, and Daniel B. Neill. Penalized fast subset scanning. Journal of Computational and Graphical Statistics 25(2): 382-404, 2016. Selected for “Best of JCGS” invited session by the journal’s editor in chief. (pdf)
Daniel B. Neill. Fast subset scan for spatial pattern detection. Journal of the Royal Statistical Society (Series B: Statistical Methodology) 74(2): 337-360, 2012. (pdf)
Methodology – Causal inference
Benjamin Jakubowski, Sriram Somanchi, Edward McFowland III, and Daniel B. Neill. Exploiting discovered regression discontinuities to debias conditioned-on-observable estimators. Journal of Machine Learning Research 24(133): 1-57, 2023. (link) (pdf)
Samrachana Adhikari, Sherri Rose, and Sharon-Lise Normand. Nonparametric Bayesian instrumental variable analysis: evaluating heterogeneous effects of arterial access sites for opening blocked blood vessels. Journal of the American Statistical Association, 2019, in press. (arXiv)
William Herlands, Daniel B. Neill, Hannes Nickisch, and Andrew Gordon Wilson. Change surfaces for expressive multidimensional changepoints and counterfactual prediction. Journal of Machine Learning Research 20(99): 1-51, 2019. (pdf)
William Herlands, Edward McFowland III, Andrew Gordon Wilson, and Daniel B. Neill. Automated local regression discontinuity design discovery. Proceedings of the 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1512-1520, 2018. (pdf)
Seth R. Flaxman, Daniel B. Neill, and Alexander J. Smola. Gaussian processes for independence tests with non-iid data in causal inference. ACM Transactions on Intelligent Systems and Technology 7(2): 22:1-22:23, 2015. (pdf)
Methodology – Algorithmic fairness
Pavan Ravishankar, Qingyu Mo, Edward McFowland III, and Daniel B. Neill. Provable detection of propagating sampling bias in prediction models. Proc. 37th AAAI Conf. on Artificial Intelligence, 9562-9569, 2023. (pdf) (supplement)
Jongbin Jung, Sam Corbett-Davies, Ravi Shroff, and Sharad Goel. Omitted and included variable bias in tests for disparate impact. Working paper, 2019. (pdf)
Zhe Zhang and Daniel B. Neill. Identifying significant predictive bias in classifiers. Presented at NIPS Workshop on Interpretable Machine Learning for Complex Systems, 2016, and 4th Workshop on Fairness, Accountability, and Transparency in Machine Learning, 2017. (Interpret ML version) (FAT ML version)
Methodology – Systems modeling
Magdalena Cerdá and Katherine M. Keyes. Systems modeling to advance the promise of data science in epidemiology. American Journal of Epidemiology 188(5): 862-865, 2019. (pdf)
Katherine M. Keyes, Aaron Shev, Melissa Tracy, and Magdalena Cerdá. Assessing the impact of alcohol taxation on rates of violent victimization in a large urban area: an agent-based modeling approach. Addiction 114(2): 236-247, 2019. (pdf)
Methodology – Spatial machine learning
Konstantin Klemmer, Nathan S. Safir, and Daniel B. Neill. Positional encoder graph neural networks for geographic data. Proc. 26th Intl. Conf. on Artificial Intelligence and Statistics, PMLR 206: 1379-1389, 2023. Also presented at NeurIPS Workshop on Tackling Climate Change with Machine Learning, 2022. (pdf)
K. Klemmer, T. Xu, B. Acciaio, and D. B. Neill. SPATE-GAN: Improved generative modeling of dynamic spatio-temporal patterns with an autoregressive embedding loss. Proc. 36th AAAI Conf. on Artificial Intelligence, 4523-4531, 2022. (pdf) (technical appendix)
Konstantin Klemmer and Daniel B. Neill. Auxiliary-task learning for geographic data with autoregressive embeddings. Proc. 29th ACM SIGSPATIAL Intl. Conf. on Advances in Geographic Information Systems, 141-144, 2021. (short version) (long version)