Morning Half-Day Courses

| Title | Instructors |
|---|---|
| Graphical multiple comparison procedures: Combining flexibility with optimality | Yao Chen, Novartis Pharmaceuticals Corp.; Dong Xi, Gilead Sciences; Frank Bretz, Novartis Pharma AG |
| Functional Data Analysis and Its Applications | Pang Du, Virginia Tech |
| A Selective Introduction to the Statistical Foundations of Transfer Learning | Yang Feng, New York University |
| Sufficient Dimension Reduction: Linear, Nonlinear, and Deep Learning Methods | Yuexiao Dong, Temple University; Yin Tang, University of Kentucky |
Afternoon Half-Day Courses

| Title | Instructors |
|---|---|
| Unlocking the Power of Semiparametric Models: A Practical Tutorial for Analyzing Complex Data with Minimum Assumptions | Xin Tu, UC San Diego; Jinyuan Liu, Vanderbilt University |
| Practical Methods for Treatment Switching in Oncology Trials: Applications Using the trtswitch R Package | Kaifeng Lu, BeOne Medicines; Ya Wang, Gilead Sciences; Yu-Che Chung, Takeda Pharmaceuticals; Jincheng (Jeni) Zhou, BeOne Medicines |
| Selective inference: methods and applications | Zhimei Ren, University of Pennsylvania |
| From Record Linkage to Post-Linkage Analysis in R and Python | Martin Slawski, University of Virginia; Priyanjali Bukke, University of Virginia |
Instructors: Yao Chen, Novartis Pharmaceuticals Corp.; Dong Xi, Gilead Sciences; Frank Bretz, Novartis Pharma AG
Category: Methodology
Target Audience: Statisticians in pharmaceutical industry, regulatory agencies, and academia.
Prerequisites: Experience with multiple testing and an understanding of the importance of multiplicity adjustment for controlling the family-wise error rate.
Computer and Software Requirements: R. Course materials, including R code, will be provided to participants.
Course Description
Addressing multiplicity is essential in confirmatory clinical trials to ensure valid statistical inference. Various multiple comparison procedures (MCPs) have been developed, including fixed-sequence, fallback, and gatekeeping procedures, which allow trialists to reflect the relative importance and interrelationships of study objectives in a tailored multiple testing procedure.
This course focuses on graphical approaches that enable the construction and exploration of tailored MCPs to meet specific study objectives, such as comparisons of multiple treatments against a common control and multiple endpoint analyses. In these approaches, MCPs are represented by directed, weighted graphs, where each node corresponds to an elementary hypothesis. A simple algorithm then facilitates the sequential testing of hypotheses.
Optimizing MCPs to maximize the probability of success is often a key concern for clinical trial teams. We will discuss clinically relevant objective functions for optimization and introduce an efficient algorithm, based on constrained nonlinear optimization, to identify optimal graphs.
Case studies will illustrate the flexibility and practicality of these approaches in clinical trial settings. We will also introduce the graphicalMCP R package, which implements weighted Bonferroni tests, weighted parametric tests that account for correlations between test statistics, and weighted Simes' tests, and we will briefly consider power and sample size calculations.
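The sequential testing algorithm behind these graphs is compact enough to sketch directly. The Python sketch below (illustrative only; the course materials use R and the graphicalMCP package, and this is not that package's implementation) applies the standard weight-propagation updates for a Bonferroni-based graphical procedure, instantiated here as the Holm procedure for two hypotheses:

```python
import numpy as np

def graphical_test(p, w, G, alpha=0.025):
    """Sequentially rejective graphical procedure (Bonferroni-based).

    p: p-values; w: initial hypothesis weights (sum <= 1);
    G: transition matrix (zero diagonal, row sums <= 1).
    Returns a boolean rejection vector.
    """
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float).copy()
    G = np.asarray(G, dtype=float).copy()
    m = p.size
    active = np.ones(m, dtype=bool)
    rejected = np.zeros(m, dtype=bool)
    while True:
        cand = np.where(active & (p <= w * alpha))[0]
        if cand.size == 0:
            return rejected
        i = cand[0]                      # reject any rejectable hypothesis
        rejected[i], active[i] = True, False
        w_new, G_new = np.zeros(m), np.zeros((m, m))
        for j in np.where(active)[0]:    # propagate the weight of H_i
            w_new[j] = w[j] + w[i] * G[i, j]
            for k in np.where(active)[0]:
                if k == j:
                    continue
                denom = 1.0 - G[j, i] * G[i, j]
                G_new[j, k] = (G[j, k] + G[j, i] * G[i, k]) / denom if denom > 0 else 0.0
        w, G = w_new, G_new

# Holm's procedure for two hypotheses, written as a graph
holm_w = [0.5, 0.5]
holm_G = [[0.0, 1.0], [1.0, 0.0]]
rej = graphical_test([0.01, 0.02], holm_w, holm_G, alpha=0.025)  # both rejected
```

With p = (0.01, 0.02), H1 is rejected first (0.01 <= 0.5 × 0.025), its weight passes to H2, and H2 is then rejected at the full level (0.02 <= 0.025).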
Teaching Plan / Outline
Instructor Bio(s)
Dr. Yao Chen (Novartis Pharmaceuticals Corp.) is a Statistical Consultant in the Advanced Methodology and Data Science group at Novartis. He has supported the development and implementation of innovative statistical methodologies in multiple comparisons and treatment effect heterogeneity. He also supports exploratory projects on multimodal data using deep learning techniques.
Dr. Dong Xi (Gilead Sciences) is a Senior Director in the Biostatistics Innovation Group at Gilead Sciences. He has supported the development and implementation of innovative statistical methodologies in multiple comparisons, dose finding, group sequential designs, estimands, and causal inference. He is an Associate Editor of Statistics in Biopharmaceutical Research and a committee member of the International Conference of Multiple Comparison Procedures.
Dr. Frank Bretz (Novartis Pharma AG) is a Distinguished Quantitative Research Scientist at Novartis. He has contributed to methodological development in several areas of pharmaceutical statistics, including dose finding, estimands, multiple comparisons, and adaptive designs. He is an Adjunct Professor at Hannover Medical School in Germany and the Medical University of Vienna in Austria, and he is a Fellow of the American Statistical Association.
Instructors: Pang Du, Virginia Tech
Category: Methodology
Target Audience: Practitioners or researchers with an interest in understanding and using functional data analysis.
Prerequisites: A general knowledge of linear regression and multivariate statistics.
Computer and Software Requirements: Not required, but a laptop with R (and RStudio) installed is recommended.
Course Description
This course aims to introduce the modern field of functional data analysis to a general audience, with an emphasis on how the relevant techniques can be applied to real examples. Functional data generalize traditional data concepts from numbers and vectors of numbers to curves and surfaces; over the past few decades, they have attracted considerable attention from statisticians and found many interesting applications in a variety of fields.
The course begins with real examples of functional data. Based on these examples, common functional data analysis techniques such as function smoothing, functional principal component analysis, and functional linear regression models will be presented. R implementation of these techniques will also be introduced and demonstrated.
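As a rough illustration of one of these techniques: for curves observed densely on a common grid, functional principal component analysis reduces to an eigendecomposition of the sample covariance surface. A minimal NumPy sketch on simulated curves follows (the course itself demonstrates R implementations; the data-generating setup here is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 200, 101
t = np.linspace(0.0, 1.0, T)

# simulate curves built from two basis functions plus small noise
scores_true = rng.normal(size=(n, 2)) * np.array([2.0, 0.7])
basis = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
X = scores_true @ basis + 0.05 * rng.normal(size=(n, T))

Xc = X - X.mean(axis=0)                   # center the curves
C = Xc.T @ Xc / n                         # sample covariance surface on the grid
vals, vecs = np.linalg.eigh(C)            # eigenanalysis (ascending order)
vals, vecs = vals[::-1], vecs[:, ::-1]    # reorder to descending
explained = vals[:2].sum() / vals.sum()   # variance explained by two FPCs
fpc_scores = Xc @ vecs[:, :2]             # functional PC scores for each curve
```

Because the simulated curves live in a two-dimensional function space, the first two eigenfunctions capture nearly all of the variation.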
Teaching Plan / Outline
Instructor Bio(s)
Dr. Pang Du (Virginia Tech) is a Professor in the Department of Statistics at Virginia Tech, which he joined in 2006 after receiving his Ph.D. in Statistics from Purdue University. His research covers both statistical methodology and applications, with more than 60 publications in statistics and scientific journals. His recent methodological interests include functional data analysis, big data analytics, lifetime data analysis, and ROC curve methodology. He collaborates with researchers in biology, computer science, engineering, and public health. He is a Lifetime Member of the International Chinese Statistical Association (ICSA) and an active participant in ICSA conferences and activities.
Originally trained in smoothing methods such as splines and kernel estimation, Dr. Du developed a natural interest in functional data analysis as the area emerged more than two decades ago. He began teaching FDA at Virginia Tech in 2012 and developed it into a graduate course in 2018. To date, 44 graduate students from statistics and other disciplines have taken the course. This short course has been taught at three conferences/workshops since 2022.
Instructors: Yang Feng, New York University
Category: Methodology / Statistical machine learning (transfer learning, domain adaptation)
Target Audience: Graduate students, researchers, and data science professionals with a statistics/ML background who want a principled, statistical view of transfer learning and related distribution-shift problems.
Prerequisites: Basic probability and mathematical statistics (expectation, conditioning, concentration basics). Familiarity with supervised learning concepts (risk minimization, generalization). Comfort with linear algebra and high-level asymptotic reasoning is helpful but not strictly required.
Computer and Software Requirements: None.
Course Description
This half-day short course provides a selective, statistics-first introduction to the foundations of transfer learning (TL) and related multi-task learning ideas. The course focuses on how distribution shift affects generalization and how statistical assumptions enable principled knowledge transfer.
Topics include domain adaptation bounds via divergence measures, covariate shift and density-ratio based reweighting, and posterior drift with biased regularization as a tool for “safe transfer.” The course emphasizes clear problem formulations, key theoretical results, and intuition for when transfer helps and when it can hurt.
The teaching style is lecture-based with guided derivations and short conceptual check-ins. Participants will leave with a unified view of divergence-based analysis, covariate shift, and biased regularization that can inform both methodological choices and theoretical work.
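To make the covariate-shift idea concrete, here is a small Python sketch of density-ratio reweighting: when only the covariate distribution differs between source and target, target-domain expectations can be recovered by weighting source samples with the density ratio, which is analytically known in this contrived Gaussian setup (an illustrative assumption, not course material):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
# source covariates x ~ N(0, 1); target covariates x ~ N(1, 1)
x = rng.normal(0.0, 1.0, size=n)
# density ratio p_target(x) / p_source(x) for these two Gaussians
w = np.exp(x - 0.5)

f = x ** 2                              # estimate E_target[X^2] = 1 + 1^2 = 2
naive = f.mean()                        # source-only estimate, ~ E_source[X^2] = 1
reweighted = np.sum(w * f) / np.sum(w)  # self-normalized importance estimate, ~ 2
```

The self-normalized form divides by the sum of the weights, which trades a small bias for considerably lower variance than the plain importance-weighted average; in practice the density ratio itself must be estimated, which is exactly where the statistical assumptions discussed in the course enter.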
Teaching Plan / Outline
Instructor Bio(s)
Dr. Yang Feng (New York University) is a Professor of Biostatistics in the School of Global Public Health at New York University, where he is also affiliated with the Center for Data Science. He earned his Ph.D. in Operations Research from Princeton University in 2010.
Dr. Feng’s research focuses on the theoretical and methodological foundations of machine learning, high-dimensional statistics, network models, and nonparametric statistics. His work addresses applications in Alzheimer’s disease prognosis, cancer subtype classification, genomics, electronic health records, and biomedical imaging, with the goal of enabling more accurate risk assessment and clinical decision-making. He has published over 70 peer-reviewed papers in leading journals across statistics, machine learning, econometrics, and medicine.
His research has been supported by grants from the National Institutes of Health (NIH) and the National Science Foundation (NSF), including the NSF CAREER Award. He currently serves as the Review Editor for the Journal of the American Statistical Association (JASA) and The American Statistician (2026–2028), and as an Associate Editor for several journals, including JASA Theory and Methods, the Journal of Business & Economic Statistics, the Journal of Computational & Graphical Statistics, and the Annals of Applied Statistics. He is a Fellow of the American Statistical Association (2022) and the Institute of Mathematical Statistics (2023), and has been an elected member of the International Statistical Institute since 2017.
Instructors: Yuexiao Dong, Temple University; Yin Tang, University of Kentucky
Category: Methodology
Target Audience: Practitioners or researchers with an interest in sufficient dimension reduction and deep learning.
Prerequisites: A general knowledge of linear regression and multivariate statistics.
Computer and Software Requirements: Not required, but a laptop with R (and RStudio) installed is recommended.
Course Description
This short course provides a gentle introduction to both classical and modern approaches to linear and nonlinear sufficient dimension reduction (SDR). Real-world applications are used to illustrate the effectiveness of dimension reduction for visualization, modeling, and statistical inference.
Core ideas in SDR, along with connections to deep learning, are introduced through hands-on R-based toy examples. The course begins with classical regression and machine learning methods—such as ordinary least squares, logistic regression, linear discriminant analysis, and support vector machines—and places them within the SDR framework.
It then gradually transitions to modern nonlinear SDR approaches, including kernel-based methods and deep-learning-based techniques such as StoNet, GMDDNet, and BENN. The course concludes with an application of SDR to knowledge distillation and edge computing within a large language model (LLM) workflow.
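As one concrete instance of linear SDR, sliced inverse regression (SIR) estimates the central subspace from the between-slice covariance of the standardized predictors. A minimal NumPy sketch under an assumed single-index model (illustrative only; the course demonstrations use R):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, H = 2000, 5, 10                      # samples, predictors, slices
X = rng.normal(size=(n, p))
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
y = np.exp(X @ beta) + 0.1 * rng.normal(size=n)   # single-index model

# whiten X, then average the whitened predictors within slices of sorted y
mu = X.mean(axis=0)
L = np.linalg.cholesky(np.linalg.inv(np.cov(X.T)))
Z = (X - mu) @ L                           # whitened predictors, Cov ~ I
slice_means = np.array([Z[idx].mean(axis=0)
                        for idx in np.array_split(np.argsort(y), H)])
M = slice_means.T @ slice_means / H        # SIR kernel matrix
vals, vecs = np.linalg.eigh(M)
b_hat = L @ vecs[:, -1]                    # top direction, back-transformed
b_hat /= np.linalg.norm(b_hat)
alignment = abs(b_hat @ beta)              # |cos angle| with the true direction
```

Under this monotone link, the top SIR direction aligns closely with the true index direction; the nonlinear and deep-learning methods in the course relax the linearity of the reduction itself.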
Teaching Plan / Outline
Instructor Bio(s)
Dr. Yuexiao Dong (Temple University) is an Associate Professor and Gilliland Research Fellow in the Department of Statistics, Operations, and Data Science in the Fox School of Business at Temple University. His primary research focus is on sufficient dimension reduction and high-dimensional data analysis. His work has been published in journals such as The Annals of Statistics, the Journal of the American Statistical Association, and Biometrika.
Dr. Dong’s other research interests include machine learning and business analytics. His collaborative work has also appeared in Journal of Machine Learning Research, IEEE Transactions on Information Theory, IEEE Transactions on Network Science and Engineering, Pattern Recognition, and Journal of Product Innovation Management. He has served as an Associate Editor for the Journal of Systems Science and Complexity since 2015.
Dr. Yin Tang (University of Kentucky) is an Assistant Professor in the Dr. Bing Zhang Department of Statistics at the University of Kentucky. Before joining UK, he earned his Ph.D. in Statistics from Pennsylvania State University under the supervision of Dr. Bing Li. His research interests include sufficient dimension reduction, nonparametric statistics, deep learning, and causal inference.
Instructors: Xin Tu, UC San Diego; Jinyuan Liu, Vanderbilt University
Category: Methodology
Target Audience: All levels of (bio)statisticians and data scientists are welcome. The course covers both fundamental and more advanced topics in semiparametric models, accompanied by diverse real-world applications.
Prerequisites: Knowledge of statistical inference and a basic understanding of large sample theory.
Computer and Software Requirements: Basic knowledge of R programming.
Course Description
This short course gives biostatisticians and data scientists an engaging overview of semiparametric modeling through real-world applications with complex structures, such as high-throughput sequencing and network data. Both classical and cutting-edge semiparametric techniques are explored, highlighting their roles in balancing robustness, flexibility, and efficiency with minimal assumptions.
The foundation of statistical inference relies on models with explicit or implicit assumptions about the underlying data-generating process. Often, these models are characterized by finite-dimensional parameters and have only limited robustness in practice. This has motivated the advancement of semiparametric modeling, which blends finite-dimensional parameters of interest with infinite-dimensional nuisance parameters. Such flexibility has led to emerging applications in many research disciplines, especially in causal inference, missing data, survival, and survey studies.
This short course is divided into two halves. The first half introduces the fundamental concepts of semiparametric models and outlines their roles in robust inference with and without missing data. The second half discusses recent advances and applications, including settings that scale up to high-dimensional microbiome data and HIV viral genetic linkage networks, while also scaling down to inference problems involving outliers and small sample sizes.
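One of the simplest semiparametric tools for missing data, inverse probability weighting, illustrates the robustness theme: if the observation propensity is modeled correctly, a weighted mean remains consistent even though the outcome distribution is left unspecified. A toy Python sketch with a known propensity (purely illustrative; the course examples use R):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)                      # true population mean of Y is 0

# outcomes go missing at random given x: larger x => more often observed
prop = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * x)))   # known observation propensity
obs = rng.uniform(size=n) < prop

cc_mean = y[obs].mean()                         # complete-case mean: biased upward
# stabilized IPW mean: reweight observed outcomes by inverse propensity, ~ 0
ipw_mean = np.sum(y[obs] / prop[obs]) / np.sum(1.0 / prop[obs])
```

The complete-case mean is pulled upward because large-x (hence large-y) subjects are over-represented among the observed, while the inverse-propensity-weighted mean recovers the population target without any model for the distribution of Y.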
Teaching Plan / Outline
Instructor Bio(s)
Dr. Xin Tu (UC San Diego) is a Professor of Biostatistics at the Herbert Wertheim School of Public Health and Human Longevity Science at UC San Diego. He is also Co-Director of the UCSD Clinical and Translational Research Institute Biostatistics, Epidemiology and Research Design Core, as well as the Stein Institute for Research on Aging Biostatistics Core. He has 30 years of teaching experience and has coauthored over 320 peer-reviewed publications, along with two textbooks and two edited volumes, in areas including U-statistics, categorical data analysis, clinical trials, and social network analysis.
His methodological research spans semiparametric models for longitudinal data with informative missing follow-up, causal inference, and high-throughput data. He has served as lead biostatistician for many studies involving longitudinal data and complex modeling challenges, including doubly robust estimators, structural equation models, and structural mean models.
Dr. Jinyuan Liu (Vanderbilt University) is an Assistant Professor of Biostatistics and Psychiatry & Behavioral Sciences at Vanderbilt University. Her research focuses on effective dimension reduction and efficient integrative modeling of high-dimensional data arising from microbiome studies, imaging, wearable devices, behavior, and psychiatric science, with an emphasis on deriving causal and mediation insights from such complex data.
Instructors: Kaifeng Lu, BeOne Medicines; Ya Wang, Gilead Sciences; Yu-Che Chung, Takeda Pharmaceuticals; Jincheng (Jeni) Zhou, BeOne Medicines
Category: Methodology / Technology Training
Target Audience: Biostatisticians, clinical trial statisticians, statistical programmers, and quantitative scientists working in oncology clinical trials who want a practical introduction to treatment switching methods and hands-on experience using the trtswitch R package.
Prerequisites: Familiarity with survival analysis concepts (e.g., hazard ratio, censoring, Kaplan–Meier). Basic R programming skills (installing packages, running scripts). Knowledge of randomized clinical trial design is helpful but not required.
Computer and Software Requirements: Participants should bring a laptop with R version 4.0 or later, RStudio (recommended), the trtswitch R package installed (installation instructions will be provided prior to the course), and the ability to run example R scripts and load datasets.
Course Description
Treatment switching, in which patients cross over from the control arm to receive experimental therapy, or patients in both control and active arms take alternative treatments, is common in oncology trials and can substantially bias intention-to-treat analyses of overall survival. This short course provides a practical overview of major statistical approaches for handling treatment switching, including Rank Preserving Structural Failure Time Model (RPSFTM), Simple Two-Stage Estimation (TSEsimp), Iterative Parameter Estimation (IPE), Inverse Probability of Censoring Weights (IPCW), and Marginal Structural Models (MSM).
Participants will gain a conceptual understanding of each method and learn how to implement them using the trtswitch R package. Through demonstrations and guided hands-on exercises, attendees will work with example oncology trial datasets to estimate adjusted survival outcomes, compare methods, and interpret results for reporting. The course emphasizes practical application, reproducible code, and real-world considerations when implementing treatment switching adjustments in randomized clinical trials.
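To give a flavor of one of these methods, the RPSFTM idea can be sketched in a few lines: counterfactual untreated survival times are reconstructed as U(psi) = (time off treatment) + e^psi × (time on treatment), and psi is estimated as the value at which the reconstructed times are balanced across arms. The following Python toy (no censoring, one fully treated arm, balance measured by a mean difference rather than a rank test) is a deliberate simplification for intuition, not the trtswitch package's implementation:

```python
import numpy as np

rng = np.random.default_rng(3)
n, psi0 = 4000, -0.5                  # true log time-ratio (treatment extends life)
u = rng.exponential(1.0, size=2 * n)  # latent untreated survival times
arm = np.repeat([1, 0], n)            # 1 = experimental arm; 0 = control, no switching
# experimental patients spend all follow-up on treatment, stretched by e^{-psi0}
t_obs = np.where(arm == 1, np.exp(-psi0) * u, u)

def balance(psi):
    """Difference in mean reconstructed untreated times between arms."""
    u_hat = np.where(arm == 1, np.exp(psi) * t_obs, t_obs)
    return u_hat[arm == 1].mean() - u_hat[arm == 0].mean()

# g-estimation by grid search: pick psi where the arms are balanced
grid = np.arange(-1.5, 0.5, 0.01)
psi_hat = grid[np.argmin([abs(balance(g)) for g in grid])]  # ~ psi0
```

Randomization guarantees that the latent untreated times have the same distribution in both arms, so the balancing value of psi recovers the true acceleration factor; handling censoring, partial switching, and recensoring is where the real implementations earn their keep.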
Teaching Plan / Outline
Instructor Bio(s)
Dr. Kaifeng Lu (BeOne Medicines) is Senior Director of Biostatistics and Head of Statistical Modeling and Simulations at BeOne Medicines. He earned his Ph.D. in Statistics from North Carolina State University and has more than 23 years of experience in the pharmaceutical industry, including roles at Merck, Forest Labs, Allergan, AbbVie, and BeOne Medicines. He has extensive expertise in clinical trials across many disease areas and has authored 40 papers, mostly as first author, in leading statistical journals, along with more than 30 coauthored publications in clinical journals. He has also developed widely used software packages for group sequential designs, sample size calculations, multiplicity control, event prediction, and treatment switching.
Dr. Ya Wang (Gilead Sciences) is an Associate Director of Biostatistics at Gilead. She earned her doctorate in Biostatistics from Columbia University and has been with Gilead since 2018. Her work focuses on statistical innovation and software development, including R package development, interactive Shiny applications, methodological research, and statistical consultation across therapeutic areas. She is especially interested in open-source software, accessible analytics tools, and scientific collaboration.
Dr. Yu-Che Chung (Takeda Pharmaceuticals) is a Senior Manager of Statistics at Takeda Pharmaceuticals. He primarily works on late-phase oncology studies and has been involved in multiple regulatory submissions. His research interests include Bayesian methods, adaptive seamless designs, treatment switching, and dose optimization.
Dr. Jincheng (Jeni) Zhou (BeOne Medicines) is a Senior Manager of Biostatistics at BeOne Medicines, where she primarily supports late-phase solid tumor regulatory submissions and early-phase gastrointestinal drug development. She received her Ph.D. in Biostatistics from the University of Minnesota. Before joining BeOne Medicines in 2023, she worked at Amgen and Gilead, supporting clinical development in inflammation. Her research interests include Bayesian methods, causal inference, and meta-analysis, and she has authored multiple first-author publications in these areas. She has also contributed to cross-functional initiatives, methodological working groups, and AI-related efforts across the organizations where she has worked.
Instructors: Zhimei Ren, University of Pennsylvania
Category: Methodology
Target Audience: Researchers and practitioners interested in the field of selective inference.
Prerequisites: Basic probability theory and introductory statistical inference.
Computer and Software Requirements: R.
Course Description
Selective inference asks a simple question: how do we provide inferential guarantees for the patterns selected from the data? This short course introduces practical tools for two common sources of selection: running many tests and making data-driven decisions.
The course begins with multiple testing, covering global null testing, family-wise error rate, and false discovery rate. It then introduces procedures such as Benjamini–Hochberg and knockoffs. After the break, the course shifts to inference after decision-making, covering adaptive inference ideas and concluding with post-selection inference.
The course is aimed at a broad audience and will include running examples and lightweight code demonstrations.
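As a taste of the procedures covered, the Benjamini–Hochberg step-up rule is short enough to write out in full. A Python sketch (the course demonstrations use R):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Step-up BH procedure; returns a boolean rejection mask (FDR <= q)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # compare the i-th smallest p-value against q * i / m
    below = p[order] <= q * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True             # reject the k smallest p-values
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
rejected = benjamini_hochberg(pvals, q=0.05)   # rejects the two smallest
```

With m = 6 the step-up thresholds are 0.0083, 0.0167, 0.025, ..., 0.05; the third-smallest p-value (0.039) exceeds 0.025, so the procedure stops at k = 2.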
Teaching Plan / Outline
Instructor Bio(s)
Dr. Zhimei Ren (University of Pennsylvania) is an Assistant Professor in the Department of Statistics and Data Science at the Wharton School of the University of Pennsylvania. From 2021 to 2023, she was a postdoctoral researcher in the Statistics Department at the University of Chicago, advised by Professor Rina Foygel Barber. She received her Ph.D. in Statistics from Stanford University, advised by Professor Emmanuel Candès.
Her research interests include selective inference, distribution-free inference, and data-driven decision making. She has taught multiple classes at the University of Pennsylvania and is currently teaching a Ph.D. course on selective inference.
Instructors: Martin Slawski, University of Virginia; Priyanjali Bukke, University of Virginia
Category: Data Quality and Management
Target Audience: This course is designed for applied statisticians, data scientists, and quantitative researchers in government, academia, or industry. It is particularly relevant for researchers and practitioners interested in creating or analyzing linked data; individuals who aim to leverage data linkage involving administrative, clinical, survey, or private-sector sources and need to account for the associated uncertainty in their analysis; data producers and stewards responsible for data integration pipelines who wish to provide downstream users with actionable guidance and tools for error correction; and students and trainees seeking advanced skills in conducting inference using integrated data or experience in open-source R or Python software development.
Prerequisites: Familiarity with regression analysis and R or Python is recommended. No prior experience with record linkage is required.
Computer and Software Requirements: Bringing a laptop is encouraged for hands-on practice, but it is not mandatory. All course materials, including slides, code, data, and setup instructions, will be made available in advance. Analyses and output will be walked through step by step during the course.
Course Description
Data integration is a cornerstone of modern statistics, yet merging disparate files is inherently error-prone. While record linkage enriches datasets and reduces data collection costs, mismatches and missed matches introduce systematic bias and sample selection issues that can fundamentally compromise downstream statistical inference. This challenge is particularly acute in secondary analysis, where analysts must derive insights from linked data without participating in the linkage process or possessing identifiers to validate match quality.
In this course, the focus moves beyond simply matching records to the propagation of linkage uncertainty and how to adjust analysis accordingly. Participants will learn how to conduct probabilistic record linkage, diagnose possible sources of linkage errors and understand their impact on inference, and perform adjustments for such errors in the analysis of linked data.
Open-source pipelines for probabilistic matching, including fastLink and RecordLinkage in R and Splink in Python, along with post-linkage adjustment using the new postlink software, will be used to enable an interactive hands-on learning experience. The course concludes with a discussion of practical limitations in current methodologies and software, with the goal of encouraging future community-driven development of open-source resources.
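Underlying all of these probabilistic-matching tools is the Fellegi–Sunter idea of scoring candidate record pairs by log likelihood ratios of field agreement. A toy Python sketch with made-up m/u probabilities (illustrative only; this is not the API of fastLink, RecordLinkage, Splink, or postlink):

```python
import math

# per-field agreement probabilities among true matches (m) and non-matches (u);
# these values are invented for illustration
FIELDS = ("first", "last", "dob")
M_PROB = {"first": 0.95, "last": 0.97, "dob": 0.99}
U_PROB = {"first": 0.10, "last": 0.05, "dob": 0.01}

def match_weight(rec_a, rec_b):
    """Fellegi-Sunter match weight: sum of log2 likelihood ratios over fields."""
    total = 0.0
    for f in FIELDS:
        if rec_a[f] == rec_b[f]:
            total += math.log2(M_PROB[f] / U_PROB[f])
        else:
            total += math.log2((1.0 - M_PROB[f]) / (1.0 - U_PROB[f]))
    return total

ann1 = {"first": "ann", "last": "lee", "dob": "1980-01-02"}
ann2 = {"first": "ann", "last": "lee", "dob": "1980-01-02"}
bob = {"first": "bob", "last": "kim", "dob": "1975-06-30"}
w_match = match_weight(ann1, ann2)       # large positive weight
w_nonmatch = match_weight(ann1, bob)     # negative weight
```

Pairs above an upper weight threshold are declared links and pairs below a lower threshold non-links; the errors made at this step are exactly the linkage uncertainty that the post-linkage adjustment methods in this course propagate into downstream analysis.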
Teaching Plan / Outline
Instructor Bio(s)
Dr. Martin Slawski (University of Virginia) has published broadly in leading statistics and computer science venues on topics including high-dimensional data analysis, data compression, dimensionality reduction, biometric recognition, and data integration. His research has been funded by the National Science Foundation and other federal agencies, including the National Institute of Justice and the National Institutes of Health. He was a Summer at Census Scholar in 2019 and 2024 and currently serves as an Associate Editor of the Electronic Journal of Statistics.
Dr. Slawski holds a Diplom degree in statistics from Ludwig-Maximilians-University of Munich and a doctoral degree in computer science from Saarland University in Germany. His academic career in the United States began with a postdoctoral fellowship at Rutgers University. Before joining the University of Virginia, he was an Assistant Professor and later Associate Professor in George Mason University’s Department of Statistics, and he also held visiting positions at Columbia University and Baidu Research USA.
A current focus of his research is methodology and computational tools for data integration and record linkage in the presence of linkage uncertainty, including connections to data privacy.
Miss Priyanjali Bukke (University of Virginia) is a Ph.D. student in Statistics at the University of Virginia. Her research interests include data integration and its relation to data privacy and quality. Supported by the NSF and advised by Martin Slawski, she is involved in collaborative work to develop software and methodology for a more comprehensive cyberinfrastructure for post-linkage data analysis.
