Machine Learning for Complex Scientific and Social Data

Machine learning methods such as neural networks have revolutionized prediction in many fields, but they are often designed for unstructured data like images or text. Many scientific and social science datasets have more complex structure. For example, observations may be correlated across space, time, or networks, and researchers often need uncertainty quantification and interpretability in addition to prediction.

My research develops machine learning methods that are tailored to these types of datasets. In particular, I study how neural networks and other modern learning algorithms can be combined with statistical modeling to analyze structured scientific data. This work includes both fully Bayesian models that integrate machine learning components and more classical machine learning approaches designed for spatial or time-series problems.

Students working in this area learn how to combine tools from

  • machine learning
  • Bayesian statistics
  • spatial and temporal modeling

to analyze complex real-world datasets.

Typical applications include environmental data, social science data, and other large scientific datasets where modern AI methods must be adapted to respect statistical structure.

Selected publications

  • Pawl, E., Ribalet, F., Parker, P.A., and Hyun, S. (2026+) Mixtures of Neural Network Experts with Application to Phytoplankton Flow Cytometry Data.
  • Kei, Y.L., Wang, Q., Parker, P.A., Ribalet, F., and Hyun, S. (2026+) Change Point Detection for Cell Populations Measured via Flow Cytometry.
  • Parker, P.A. and Eideh, A. (2026+) BART-FH: Flexible Nonlinear Modeling for Small Area Estimation. Journal of Survey Statistics and Methodology.
  • Wang, Q., Parker, P.A., and Lund, R. (2025) Hierarchical Count Echo State Network Models with Application to Graduate Student Enrollments. Data Science in Science, Special Issue on "Data Science in the Federal Government", 4(1), https://doi.org/10.1080/26941899.2025.2565243.
  • Wang, Z., Parker, P.A., and Holan, S.H. (2025) Variational Autoencoded Multivariate Spatial Fay-Herriot Models. Spatial Statistics, 70, 100929.
  • Wang, Q., Parker, P.A., and Lund, R. (2025) Spatial Deep Convolutional Neural Networks. Spatial Statistics, 66, 100883.
  • Parker, P.A. (2024) Nonlinear Fay-Herriot Models for Small Area Estimation using Random Weight Neural Networks. Journal of Official Statistics, 40(2), 317-332.
  • Parker, P.A., and Holan, S.H. (2023) Computationally Efficient Bayesian Unit-Level Random Neural Network Modeling of Survey Data under Informative Sampling for Small Area Estimation. Journal of the Royal Statistical Society Series A, 184(4), 722-737.
  • Parker, P.A., Holan, S.H., and Ravishanker, N. (2020) Nonlinear Time Series Classification Using Bispectrum-based Deep Convolutional Neural Networks. Applied Stochastic Models in Business and Industry, 36, 877-890.
Paul A. Parker
Assistant Professor

My research interests include Bayesian methods, especially when applied to dependent data scenarios, often using survey data.