An obvious practical difficulty is that biological questions are explanatory in nature, asking how and why things are the way they are, but statistics is for the most part descriptive. I am interested in developing analysis strategies for genomic data that bridge the gap between description and explanation. I'm currently focused on spatial transcriptomics and behavioral genomics, and I've also worked on:
- Zhu, H., Zhao, S.D., Ray, A., Zhang, Y., and Li, X. (2022).
  A comprehensive temporal patterning gene network in Drosophila medulla neuroblasts revealed by single-cell RNA sequencing.
  Nature Communications, 13, 1247.
- Avalos, A., Fang, M., Pan, H., Lluch, A. R., Lipka, A. E., Zhao, S. D., Giray, T., Robinson, G. E., Zhang, G., and Hudson, M. E. (2020).
  Genomic regions influencing aggressive behavior in honey bees are defined by colony allele frequencies.
  Proceedings of the National Academy of Sciences 117, 17135–17141.
A hallmark of genomics research is that it asks a huge number of biological questions simultaneously, made possible by high-throughput technology. But more questions also lead to more inductive uncertainty. I am interested in understanding the fundamental statistical principles that make it possible to reduce this uncertainty. I'm currently studying empirical Bayesian inference and mediation analysis.
- Barbehenn*, A. and Zhao, S.D.
  A nonparametric regression approach to asymptotically optimal estimation of normal means.
  arXiv:2205.00336
- Zhou*, R. R., Wang, L., and Zhao, S. D. (2020).
  Estimation and inference for the indirect effect in high-dimensional linear mediation models.
  Biometrika 107, 573–589.
  (An earlier version of this paper was a winner of a 2018 American Statistical Association Section of Statistics in Genomics and Genetics distinguished student paper award.)
Finally, I am interested in developing new statistical inference procedures. My work includes multiple hypothesis testing, precision matrix estimation, and high-dimensional survival analysis methods.
My research has been supported by:
- Foss (PI), Zhao (Co-PI).
  A Statistical Approach to Nonlocal Compression for Supervised Learning, Semi-Supervised Learning, and Anomaly Detection.
  Sandia National Labs, LDRD22-0599, 2021–2023.
- Robinson (PI), Zhao (Co-PI).
  Gut Microbiome Effects on Brain and Behavior.
  NSF IOS-2120378, 2021–2024.
- Li (PI), Zhao (Co-PI).
  Computational Reconstruction of Gene-Gene Dynamics in Temporal Patterning of Drosophila Medulla Neuroblasts from Single-Cell RNA-Seq.
  NSF-Simons Center for Quantitative Biology at Northwestern University, 2018–2019.
- Zhao (PI).
  Theory and Methods for Simultaneous Signal Analysis in Integrative Genomics.
  NSF DMS-1613005, 2016–2019.