Abstracts

Design of Cost-Effective Experiments

Speaker: David Banks (Duke University/SAMSI)

Abstract:

Traditional experimental design focuses on balance, rather than accounting for the costs of specific treatment applications. When some observations are much more expensive than others, experimenters should seek designs that trade off information for cost. Such designs are typically unbalanced, but that is not a serious challenge for modern software packages. This talk describes issues that arise in cost-effective DOE and response surface analysis, and shows how the results differ from those obtained under conventional alphabetic-optimality criteria.

Biography:

David Banks is a professor in the Department of Statistical Science at Duke University and the director of the Statistical and Applied Mathematical Sciences Institute. He is the past-president of the International Society for Business and Industrial Statistics and of the Classification Society. He is a former coordinating editor of the Journal of the American Statistical Association and a founding editor of Statistics and Public Policy. Previously, he held positions at the University of Cambridge, Carnegie Mellon, the National Institute of Standards and Technology, the U.S. Department of Transportation, and the FDA. He obtained his Ph.D. in 1984 from Virginia Tech and works on adversarial risk analysis, agent-based models, and text networks.


Global Optimization of Expensive Functions Using Adaptive Radial Basis Functions Based Surrogate Model via Uncertainty Quantification

Speaker: Ray-Bing Chen (National Cheng Kung University, Taiwan)

Abstract:

Global optimization of expensive functions has important applications in physical and computer experiments. It is a challenging problem to develop efficient optimization schemes, because each function evaluation can be costly and the derivative information of the function is often not available. We propose a novel global optimization framework using an adaptive radial basis function (RBF) based surrogate model via uncertainty quantification. The framework iterates between two steps. It first employs an RBF-based Bayesian surrogate model to approximate the true function, where the parameters of the RBFs are adaptively estimated and the prediction uncertainty is also obtained. Then it utilizes a model-guided selection criterion to identify a new point from a candidate set for function evaluation. The selection criterion adopted here is a sample version of the expected improvement (EI) criterion. We conduct simulation studies with standard test functions and compare the performance with two existing methods to illustrate the advantages of the proposed method.
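To make the two-step loop concrete for readers less familiar with surrogate-assisted optimization, the sketch below shows the generic structure the abstract describes: fit a surrogate to the evaluated points, then score a candidate set with an expected improvement (EI) criterion. The speakers' adaptive Bayesian RBF surrogate is not reproduced here; a scikit-learn Gaussian process stands in for any model that returns a predictive mean and standard deviation, and the function names are illustrative.

```python
# Minimal sketch of surrogate-guided selection via expected improvement (EI).
# NOTE: the Gaussian process below is a stand-in for the speakers' adaptive
# Bayesian RBF surrogate; any model giving a predictive mean and sd would do.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(mu, sigma, f_best):
    """EI for minimization at candidates with predictive mean mu and sd sigma."""
    sigma = np.maximum(sigma, 1e-12)          # guard against zero predictive sd
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def propose_next_point(X, y, candidates):
    """One iteration: fit the surrogate on evaluated points, return the EI-maximizing candidate."""
    surrogate = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    mu, sigma = surrogate.predict(candidates, return_std=True)
    return candidates[np.argmax(expected_improvement(mu, sigma, y.min()))]
```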

Biography:

Ray-Bing Chen is a professor in the Department of Statistics at National Cheng Kung University. He received his Ph.D. in statistics from the University of California, Los Angeles. His research interests include statistical and machine learning, statistical modeling, computer experiments, and optimal design.


Modeling and Monitoring Methods for 3D Spatial and Image Data

Speaker: Bianca Maria Colosimo (Politecnico di Milano, Italy)

Abstract:

Intelligent sensing and computerized data analysis are inducing a paradigm shift in industrial statistics. Emerging technologies (e.g., additive manufacturing, micro-manufacturing) combined with new inspection solutions (e.g., noncontact systems, X-ray computed tomography) and multi-stream high-speed sensors (e.g., videos and images; acoustic, thermal, power, and pressure signals) are paving the way for a new generation of industrial big-data sets, requiring novel solutions for data modeling and monitoring. Starting from real industrial problems, some of the main challenges to be faced in relevant industrial sectors are discussed.

Biography:

Bianca Maria Colosimo is a professor in the Department of Mechanical Engineering at Politecnico di Milano, where she is also deputy head for research. She is currently editor of the Journal of Quality Technology (ASQ – Taylor & Francis). Her research interests include statistical modeling, monitoring, and control of complex data (3D point clouds, multi-stream functional data, images, and videos).


Fisher Randomization Test: A Confidence Distribution Perspective and Applications to Massive Experiments

Speaker: Tirthankar Dasgupta (Rutgers University)

Abstract:

Fisher randomization tests (FRT) are flexible tools because they are model free, permitting assessment of causal effects of interventions on any type of response, for any assignment mechanism, using any test statistic. The tremendous development of computing resources has recently sparked a huge interest in using FRT to test complex causal hypotheses that can arise from massive studies. In spite of their wide applicability and the recent surge of interest, several aspects of the theoretical properties of randomization tests remain unclear, which somewhat limits their use. This research provides a theoretical inferential framework for FRT by combining two fundamental ideas: potential outcomes and confidence distributions. It also demonstrates how such a connection can be exploited to combine causal inference from multiple experiments with different structures and complexities, and to “divide and conquer” randomization-based inference arising from massive experiments.
(This is joint work with Minge Xie, Xiaokang Luo, and Regina Liu.)
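As a simple illustration of the mechanics behind an FRT (a minimal sketch, not the authors' confidence-distribution framework): under the sharp null hypothesis of no effect for any unit, the treatment labels can be re-randomized and the test statistic recomputed to form a randomization distribution. The example below assumes a completely randomized assignment and a difference-in-means statistic; both choices are illustrative.

```python
# Minimal sketch of a Fisher randomization test under the sharp null of no effect.
# Assumes a completely randomized assignment, so re-randomization is a permutation
# of the observed treatment labels; the test statistic is a difference in means.
import numpy as np

def frt_p_value(y, treat, n_draws=10_000, seed=0):
    rng = np.random.default_rng(seed)
    stat = lambda t: y[t == 1].mean() - y[t == 0].mean()
    observed = stat(treat)
    draws = np.array([stat(rng.permutation(treat)) for _ in range(n_draws)])
    return np.mean(np.abs(draws) >= np.abs(observed))   # two-sided randomization p-value
```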

Biography:

Tirthankar Dasgupta is an associate professor and co-graduate director in the Department of Statistics at Rutgers University. His research interests include experimental design, causal inference, statistical modeling of physical and engineering systems, and quality engineering. Dasgupta obtained his Ph.D. from Georgia Institute of Technology in 2007 under the supervision of Professor C.F. “Jeff” Wu. He then served as a faculty member in the Department of Statistics at Harvard University from 2008 to 2016 before moving to Rutgers University in 2017. Currently, he serves on the editorial boards of the Journal of the American Statistical Association, the Journal of the Royal Statistical Society, Technometrics, and the Journal of Quality Technology.


Computer Experiments for Reliability Assessment in Wind Energy Applications

Speaker: Yu Ding (Texas A&M University)

Abstract:

The principal challenge in reliability assessment of wind turbines is rooted in the fact that a small tail probability needs to be estimated. If one opted to collect enough high load values from physical turbine systems, it would take tens of years, as high load values are, by definition, rare events. Wind engineers have been developing aeroelastic simulators that can produce reasonably trustworthy bending moment responses under a wind force input. The availability of these simulators lends a degree of convenience to load analysis, as a simulator can be steered, at least in principle, toward the region of high load responses so as to produce more high load data points. Of course, running aeroelastic turbine load simulators can be computationally expensive. This raises the need for new ideas and methods that enable efficient computer experiments and adequate estimation of ultra-small failure probabilities.

Biography:

Dr. Yu Ding is the Mike and Sugar Barnes Professor of Industrial & Systems Engineering and Professor of Electrical & Computer Engineering at Texas A&M University and a member of Texas A&M Institute of Data Science, Texas A&M Energy Institute, and TEES Institute of Manufacturing Systems. Dr. Ding received his Ph.D. degree from the University of Michigan in 2001. Dr. Ding is a Fellow of IISE and ASME and a recipient of the 2018 Texas A&M Engineering Research Impact Award.


On Analyzing Data in Industrial Statistics Applications

Speaker: Michael Hamada (Los Alamos National Laboratory)

Abstract:

When analyzing data, we want to develop a model that captures the nature of the data and incorporates any known science and engineering. We use a Bayesian inferential approach, so that prior distributions also need to incorporate any known science and engineering, or what might be known empirically from other data sources. We briefly introduce an incomplete taxonomy for a Bayesian statistical model that consists of a data type and structural and stochastic components at the data level, and hierarchical models and prior distributions at the prior level. We illustrate the challenges of analyzing data in industrial statistics applications through several examples whose models fall into this taxonomy.

Biography:

Michael Hamada is a scientist at Los Alamos National Laboratory and holds a Ph.D. in statistics from the University of Wisconsin-Madison. He is a fellow of the American Statistical Association and of the American Society for Quality and winner of the Gerald J. Hahn Quality and Productivity Achievement Award. His research interests include design and analysis of experiments, reliability, quality improvement, and measurement assessment.


Evolving Paradigms of Manufacturing and the Role of Statistics

Speaker: S. Jack Hu (University of Michigan)

Abstract:

Manufacturing has evolved over time, from craft production to mass production and mass customization. The emergence of each paradigm was driven by market dynamics and enabled by the technologies of the time. Currently, diversified customer needs are driving an increased tendency towards a new paradigm of product realization: the incorporation of customer-tailored modules or functions into products. We call this emerging paradigm “personalization,” through which customers are able to participate in the value creation of innovative products by collaborating with manufacturers. Statistical methods play an important role by providing the scientific foundation for each and every paradigm. These methods will be reviewed, and opportunities for developing new statistical methods supporting personalization will be highlighted.

Biography:

S. Jack Hu is the J. Reid and Polly Anderson Professor of Manufacturing and vice president for research at the University of Michigan. He is also professor of mechanical engineering and professor of industrial and operations engineering. Hu’s teaching and research interests are in manufacturing systems and statistical quality methods. He is the recipient of various awards, including the William T. Ennor Manufacturing Technology Award from the American Society of Mechanical Engineers (ASME), the Gold Medal from the Society of Manufacturing Engineers (SME), and several best paper awards. He is a fellow of ASME, SME, and the International Academy for Production Engineering (CIRP). Hu was elected a member of the U.S. National Academy of Engineering in 2015 and a foreign member of Chinese Academy of Engineering in 2017.


Computer Experiments with Binary Time Series and Applications to Cell Biology: Modeling, Estimation and Calibration

Speaker: Ying Hung (Rutgers University)

Abstract:

Computer experiments have become ubiquitous in various applications, from rocket injector designs to weather forecasts. Although extensive research has been devoted to computer experiments in the literature, those with binary time-series outputs have received scant attention. Motivated by the analysis of a class of cell adhesion experiments, we introduce a new emulator, as well as a new calibration framework, for binary time-series outputs. More importantly, we provide their theoretical properties to ensure the estimation performance in an asymptotic setting. The application to the cell adhesion experiments illustrates that the proposed emulator and calibration framework not only provide an efficient alternative for the computer simulation, but also reveal important insight into the underlying adhesion mechanism, which cannot be directly observed through existing methods.

Biography:

Ying Hung is an associate professor in the Department of Statistics at Rutgers University. She received her Ph.D. from Georgia Institute of Technology in 2008. Her research interests are experimental design, analysis of computer experiments, and uncertainty quantification.


Construction, Properties, and Analysis of Supersaturated Designs Based on Kronecker Products

Speaker: Bradley Jones (SAS Institute)

Abstract:

This talk introduces a new method for constructing supersaturated designs (SSDs) that is based on the Kronecker product of two carefully chosen matrices. The construction leads to a partitioning of the columns of the design such that columns within a group are correlated with one another, but are orthogonal to every column in any other group. We leverage this structure to obtain an unbiased estimate of the error variance, and to develop an effective, design-based model selection procedure.
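The group structure described above follows from a basic property of Kronecker products: the inner product of two Kronecker columns factorizes, so choosing one ingredient matrix with orthogonal columns forces exact orthogonality across groups. The numerical check below illustrates this property with small, hypothetical ingredient matrices; it is not the specific construction of the talk.

```python
# Illustration of the column-grouping property of Kronecker-product designs:
# <a_i (x) b_j, a_k (x) b_l> = <a_i, a_k> * <b_j, b_l>, so columns with different
# B-indices are orthogonal whenever B has mutually orthogonal columns.
import numpy as np
from scipy.linalg import hadamard

A = np.array([[ 1,  1,  1],    # hypothetical +/-1 matrix with correlated columns
              [ 1, -1,  1],
              [-1,  1,  1],
              [-1, -1, -1]])
B = hadamard(4)                # +/-1 matrix with mutually orthogonal columns

D = np.kron(A, B)              # 16 runs, 12 columns; column i*4 + j equals a_i (x) b_j
G = D.T @ D                    # matrix of column inner products

for c1 in range(D.shape[1]):
    for c2 in range(D.shape[1]):
        if c1 % 4 != c2 % 4:   # columns in different groups (different B-columns)
            assert G[c1, c2] == 0
```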

Simulation results show that the use of these designs, in conjunction with the associated model selection procedure, enables the identification of larger numbers of active main effects than have previously been reported for supersaturated designs.

This talk is, in some sense, an homage to Jeff Wu, who sparked new interest in SSDs through his own seminal work in this area.

Biography:

Bradley Jones is a distinguished research fellow in the JMP division of the SAS Institute. At JMP he developed the custom designer, a general and powerful tool for generating optimal experimental designs. Jones is the inventor of the prediction profile plot, an interactive graph for exploring multivariate response surfaces. He is co-discoverer with Chris Nachtsheim of definitive screening designs. From the American Society for Quality, he has won the Youden Prize and, twice each, the Brumbaugh and Lloyd S. Nelson awards. Jones is a past editor of the Journal of Quality Technology and is a fellow of the American Statistical Association.


Learning From Big Medical Data: Statistical Analysis on Electronic Health Data

Speaker: Samuel Kou (Harvard University)

Abstract:

Big data have attracted significant interest from business, government agencies, academic communities and the general public. They offer the potential to transform knowledge discovery and decision making. We consider in this talk big medical data, in particular electronic health insurance data, which have been widely adopted over the last two decades to give healthcare and insurance providers faster and easier access to record, retrieve and process patient information. The massive health insurance data also present opportunities for studying the causal relationship between diseases and treatments. We will use our analysis of the causal relationship between cancer immunotherapy and autoimmune diseases as an illustration. Immunotherapy is one of the most exciting cancer treatments developed in the last five years; it works by enhancing the body’s own immune system to fight cancer and has been shown to extend patients’ life expectancy. An unintended side effect observed anecdotally by several doctors is that immunotherapy seems to lead to more autoimmune diseases. We analyzed an electronic health insurance data system, which covers over 44 million members, to study the potential causal relationship between cancer immunotherapy and autoimmune diseases. Mining the massive data allows us to answer the causal question. We will also discuss the complications and lessons we learned from working on big medical data.

Biography:

Samuel Kou is professor of statistics at Harvard University. He received a bachelor’s degree in computational mathematics from Peking University in 1997, followed by a Ph.D. in statistics from Stanford University in 2001. After completing his Ph.D., he joined Harvard University as an assistant professor of statistics and was promoted to a full professor in 2008. His research interests include big data analytics; digital disease tracking; stochastic inference in biophysics, chemistry, and biology; protein folding; Bayesian inference for stochastic models; nonparametric statistical methods; model selection and empirical Bayes methods; and Monte Carlo methods. He is the recipient of the COPSS (Committee of Presidents of Statistical Societies) Presidents’ Award, the highest honor for a statistician under the age of 41; the Guggenheim Fellowship; a U.S. National Science Foundation CAREER Award; the Institute of Mathematical Statistics Richard Tweedie Award; the Raymond J. Carroll Young Investigator Award; and the American Statistical Association Outstanding Statistical Application Award. He is an elected fellow of the American Statistical Association, an elected member of the International Statistical Institute, and an elected fellow and a medallion lecturer of the Institute of Mathematical Statistics.


Using Prior Information for Intelligent Factor Allocation and Design Selection

Speaker: William Li (Shanghai Jiao Tong University, China)

Abstract:

While the literature on constructing efficient experimental designs is plentiful, how best to incorporate prior information when assigning factors to columns has received little attention. This talk summarizes a series of recent studies that focus on information about individual columns. For regular designs, we propose the individual word length pattern (iWLP), which can be used to rank columns. With prior information on a factor’s likely importance, the iWLP can be used to assign factors to columns intelligently and to select the best designs to accommodate such prior information. This criterion is then extended to nonregular designs, yielding the individual generalized word length pattern (iGWLP). We illustrate how the iGWLP helps to identify important differences in aliasing that would likely otherwise be missed. Given the complexity of characterizing partial aliasing, the iGWLP will help practitioners make a more informed assignment of factors to columns when utilizing nonregular fractions. Theoretical justifications of the proposed iGWLP are provided in terms of statistical models and projection properties. In the third part, we consider clear effects involving an individual column (iCE). Motivated by a real application, we introduce the clear effects pattern, derived from the iCE, and propose a class of designs called maximized clear effects pattern (MCEP) designs. We compare MCEP designs with commonly used minimum aberration designs and with MaxC2 designs, which maximize the number of clear two-factor interactions. We also extend the definition of iCE and MCEP designs by considering blocking schemes.

Biography:

William Li is a professor of management at the Shanghai Institute of Advanced Finance (SAIF) at Shanghai Jiao Tong University. Prior to joining SAIF on a full-time basis, Li was a tenured full professor and Eric Jing Professor in the Carlson School of Management at the University of Minnesota. He was also an invited professor at the School of Management at Fudan University. Li’s papers are published in leading journals in the field of applied statistics, especially in the top journals recognized in the field of experimental design (e.g., Technometrics and the Journal of the American Statistical Association). He is co-author of a widely used statistics textbook, Applied Linear Statistical Models (5th edition), which has also been cited many times in academia (993 citations in the latest SCI count). He has been an associate editor for Technometrics. In recognition of Li’s academic contributions, he was elected a fellow of the American Statistical Association in 2013. He has extensive teaching experience and has won the Excellent Teaching Award five times at the University of Minnesota, as well as a teaching award in the EMBA program at Fudan University in 2018. Li received his bachelor’s degree in applied mathematics from Tsinghua University and his master’s and doctorate degrees in statistics from the University of Waterloo.


Optimal Designs for Generalized Linear Models

Speaker: Abhyuday Mandal (University of Georgia)

Abstract:

Generalized linear models have been used widely for modeling the mean response of both discrete and continuous random variables, with an emphasis on categorical responses. We first consider the problem of obtaining D-optimal designs for factorial experiments with a binary response and k qualitative factors, each at two levels. Next, we extend the results to designs with both discrete and continuous factors and propose a quantum-behaved particle swarm technique, called d-QPSO, for identifying such designs. We conclude the talk with some results on optimal designs for ordered categorical responses. For all such cases, the uniform allocation of experimental units is commonly used in practice, but it often suffers from a lack of efficiency. For a predetermined set of design points, we derive the necessary and sufficient conditions for an allocation to be locally D-optimal and develop efficient algorithms for obtaining approximate and exact designs. Although we focus on locally D-optimal designs, we also provide EW D-optimal designs as a highly efficient surrogate for Bayesian D-optimal designs. Both can be much more robust than uniform designs.
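For a binary response with a logit link, the local D-criterion underlying such comparisons has a simple closed form: the information matrix for an allocation of units over fixed design points is X' W X with W = diag(w_i p_i (1 - p_i)), evaluated at assumed parameter values. The sketch below computes this criterion for a 2^2 factorial with a hypothetical parameter vector; it is illustrative and is not the authors' d-QPSO algorithm.

```python
# Local D-criterion for a logistic model at a fixed set of design points.
# The design points, allocation, and assumed beta below are hypothetical.
import numpy as np

def local_d_criterion(X, w, beta):
    """det(X' W X) with W = diag(w * p * (1 - p)), p evaluated at the assumed beta."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    W = np.diag(w * p * (1.0 - p))
    return np.linalg.det(X.T @ W @ X)

X = np.array([[1, -1, -1],      # intercept plus two two-level factors
              [1, -1,  1],
              [1,  1, -1],
              [1,  1,  1]], dtype=float)
beta = np.array([0.5, 1.0, -0.5])      # assumed (local) parameter values
uniform = np.full(4, 0.25)             # uniform allocation of experimental units
print(local_d_criterion(X, uniform, beta))
```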

Biography:

Abhyuday Mandal is a professor in the Department of Statistics at the University of Georgia. He received his bachelor’s and master’s degrees from the Indian Statistical Institute, Kolkata, another master’s degree from the University of Michigan, and a Ph.D. in 2005 from Georgia Institute of Technology. His research interests include design of experiments, industrial statistics, optimization techniques, small area estimation, drug discovery, and fMRI data analysis. Currently he is an associate editor of Statistics and Probability Letters, Sankhya – Series B, and the Journal of Statistical Theory and Practice.


Is Jeff Wu a Data Scientist?

Speaker: Xiao-Li Meng (Harvard University)

Abstract:

Banquet speech.

Biography:

Xiao-Li Meng is Whipple V. N. Jones Professor and former chair of statistics at Harvard University, editor in chief of the Harvard Data Science Review, an honorary professor of the University of Hong Kong, and a faculty affiliate at the Center of Health Statistics at the University of Chicago. He is well known for his depth and breadth in research, his innovation and passion in pedagogy, and his vision and effectiveness in administration, as well as for his engaging and entertaining style as a speaker and a writer.


Central Composite Designs for Multiple Responses with Different Models

Speaker: Max Morris (Iowa State University)

Abstract:

The central composite design (CCD) [Box and Wilson, 1951] is a popular and effective experimental plan for fitting second-order polynomial regression models in all controlled variables. In some large-scale engineering problems involving multiple responses, it is known a priori, through knowledge of the system or preliminary experimentation, that only a subset of the controlled variables is needed to model the behavior of each response. Such information can be used to reduce the size of the design if the overlap in model forms excludes some terms. We present a procedure for modifying the CCD in a way that maintains the basic form of the design for each response, while reducing the overall number of experimental runs required.
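For reference, the standard (unmodified) CCD combines 2^k factorial points, 2k axial points at distance alpha from the center, and center runs; the sketch below generates that baseline design. The talk's modified construction for multiple responses with different models is not reproduced, and the spherical choice alpha = sqrt(k) is just one common convention.

```python
# Standard central composite design: 2^k factorial points, 2k axial points, center runs.
# The modified, reduced-run construction discussed in the talk is not shown here.
import itertools
import numpy as np

def central_composite_design(k, alpha=None, n_center=3):
    alpha = k ** 0.5 if alpha is None else alpha          # "spherical" axial distance
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    axial = np.vstack([alpha * np.eye(k), -alpha * np.eye(k)])
    center = np.zeros((n_center, k))
    return np.vstack([factorial, axial, center])

print(central_composite_design(3).shape)   # (8 + 6 + 3) runs by 3 factors
```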

Biography:

Max Morris is a professor and chair of the Department of Statistics at Iowa State University and is a statistical consultant affiliated with Los Alamos National Laboratory. His research program focuses on the design and analysis of experiments, with special emphasis on those that involve computer models. He has received the Jack Youden Prize and the Frank Wilcoxon Prize from the American Society for Quality and was the 2002 recipient of the Jerome Sacks Award for Cross-Disciplinary Research from the National Institute of Statistical Sciences.


Two Examples of Partial Experimentation

Speaker: Art Owen (Stanford University)

Abstract:

Randomized controlled experimentation is the most reliable way to establish and quantify causal relations. It often happens that complete randomization is infeasible, which motivates causal inference methods that are less compelling than randomized experiments. This talk looks at two settings in which some experimentation can be incorporated into observational data. In the first setting, joint with Evan Rosenman of Stanford University, one merges a small randomized experiment, such as a clinical trial, into a large observational database. The key idea is to employ the propensity that the experimental data would have had, had they been in the observational data. We illustrate the method on data from the Women’s Health Initiative relating hormone replacement therapy to coronary heart disease. The second setting, joint with Hal Varian of Google, is a hybrid between regression discontinuity designs and randomized experimentation. A company might offer some benefit, such as an upgrade, to their best customers. They might also want to measure the effects of that benefit. In a tie-breaker design they offer the upgrade to their top customers, withhold it from their lowest-ranked customers, and randomize in between. We quantify both the information gain and the allocation cost of having more experimentation. Most of the work on the tie-breaker problem was done for Google and was not part of my Stanford responsibilities.
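The assignment rule of a tie-breaker design can be stated very compactly, as in the hedged sketch below: units above an upper score threshold always receive the offer, units below a lower threshold never do, and units in the middle band are randomized. The thresholds and the fair-coin randomization are illustrative choices, not details from the talk.

```python
# Sketch of a tie-breaker assignment: treat the top scorers, withhold from the
# bottom scorers, randomize in the middle band. Thresholds here are hypothetical.
import numpy as np

def tie_breaker_assign(score, lower, upper, seed=0):
    rng = np.random.default_rng(seed)
    z = np.where(score >= upper, 1, np.where(score < lower, 0, -1))
    middle = z == -1                                   # the randomized band
    z[middle] = rng.integers(0, 2, size=middle.sum())  # fair coin for in-between units
    return z

scores = np.random.default_rng(1).normal(size=1000)    # e.g., a customer ranking score
treatment = tie_breaker_assign(scores, lower=-0.5, upper=0.5)
```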

Biography:

Art Owen is professor and chair of statistics at Stanford University. His research interests are in nonparametric methods such as empirical likelihood, sampling methods such as randomized quasi-Monte Carlo sampling, and bioinformatics such as the plaid model.


Measuring Educational Inequality in the Brazilian Basic Education System

Speaker: José Francisco Soares (Federal University of Minas Gerais, Brazil)

Abstract:

Brazil is a particularly unequal country in several dimensions. Thus, social inequalities are a topic of interest in several areas of the social sciences. This presentation introduces a new measure of the inequality of the performance of Brazilian students in the national tests. It is argued that the proposed indicator considers all forms of inequality and, therefore, should be part of the monitoring indicators of the Brazilian basic education system.

Throughout the presentation, the importance of the teachings of Jeff Wu in the production of this and other works is shown and emphasized.

Biography:

José Francisco Soares holds a Ph.D. in statistics from the University of Wisconsin-Madison and completed postdoctoral work in education at the University of Michigan-Ann Arbor. He is professor emeritus of the Federal University of Minas Gerais. He was the first elected president of ABAVE, the Brazilian Association of Evaluation, which honored him with a special homage in 2011. In 2012 he received the Bunge Foundation Award for his contributions in the area of educational evaluation. From February 2014 to February 2016 he was president of INEP, the Education Ministry branch in charge of the assessment of basic and higher education. Currently, he is a member of the Brazilian National Education Council and of the Technical Council of INEE – the National Institute for Educational Evaluation of Mexico. His academic work is concentrated on studying measures of educational outcomes, calculating and explaining the effect of elementary schools, and indicators of educational inequalities.


Machine Learning in Financial Modeling

Speaker: Agus Sudjianto (Wells Fargo Bank)

Abstract:

Mathematical models are extensively used by financial institutions for various purposes, such as to run the business or to fulfill regulatory requirements. Among the most critical uses of mathematical models is the evaluation of a bank’s financial health and its ability to sustain adverse economic scenarios. Mathematical models are also used to perform credit underwriting, portfolio management, derivative valuation and pricing, and risk measurement, and to prevent financial crime.

Historically, banks have employed traditional mathematical and statistical models. More recently, machine-learning algorithms have gained strong adoption in both model development and validation, particularly due to their ability to deal with very large structured and unstructured data sets. In this talk, I will discuss the breadth of mathematical modeling applications in financial institutions, the role of machine and deep learning, and current challenges.

Biography:

Agus Sudjianto is an executive vice president and head of corporate model risk for Wells Fargo Bank, where he is responsible for enterprise model risk management. Prior to his current position, Sudjianto was the modeling and analytics director and chief model risk officer at Lloyds Banking Group in the United Kingdom. Before joining Lloyds, he was a senior credit risk executive and head of quantitative risk at Bank of America. Prior to his career in banking, he was a product design manager in the powertrain division of Ford Motor Company. Sudjianto holds several U.S. patents in both finance and engineering. He has published numerous technical papers and is a co-author of Design and Modeling for Computer Experiments. His technical expertise and interests include quantitative risk, particularly credit risk modeling, machine learning, and computational statistics. He holds master’s and doctorate degrees in engineering and management from Wayne State University and the Massachusetts Institute of Technology.


Design of Experiments with Functional Independent Variables

Speaker: David Woods (University of Southampton, UK)

Abstract:

In this talk, some novel methodology will be presented for the optimal design of experiments when at least one independent variable is a function (e.g., of time) and can be varied continuously during a single run of the experiment. Hence, finding a design becomes a question of choosing functions to define this variation for each run in the experiment. The work is motivated by, and applied to, experiments in the pharmaceutical industry.

Biography:

David Woods is a professor of statistics in the Southampton Statistical Sciences Research Institute and School of Mathematical Sciences at the University of Southampton. From 2012 to 2017, he held a five-year fellowship from the Engineering and Physical Sciences Research Council to conduct research into the design of experiments for the complex nonparametric and mechanistic models required for modern scientific and industrial problems. He is an associate editor for Technometrics and the SIAM/ASA Journal on Uncertainty Quantification. His research interests are in the statistical design and analysis of experiments, particularly the development of new methods and criteria for design selection and assessment under linear and nonlinear models. A particular emphasis is on finding efficient designs when there is uncertainty in one or more aspects of the model for the response. Much of his work develops methodology that combines theory and computation to solve problems motivated by real experiments in science and industry. Application areas include engineering, chemistry, and the pharmaceutical, automotive, and aeronautics industries.


Recent Advances on Space-filling Designs

Speaker: Hongquan Xu (University of California, Los Angeles)

Abstract:

Space-filling designs such as maximin distance designs and uniform designs are commonly used for computer and physical experiments, but the construction of such designs is challenging. We introduce a series of new methods for constructing space-filling Latin hypercube designs via level permutation and good lattice point sets. We also propose a new class of space-filling designs, called uniform projection designs, which scatter points uniformly in all dimensions and have good space-filling properties in terms of distance, uniformity, and orthogonality. Theoretical results and numerical examples are presented to show that our methods outperform existing ones.
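As a small illustration of the good lattice point idea mentioned above (a sketch under simplifying assumptions, not the authors' full algorithms): each column of the design takes entries (i * h) mod n for a generator h coprime to n, so every column is a permutation of the n levels, and simple linear level permutations (cyclic shifts) can then be tried to improve the maximin distance.

```python
# Good lattice point construction plus a simple linear level permutation (cyclic shift).
# Illustrative only; the level-permutation schemes in the talk are more elaborate.
import numpy as np
from itertools import combinations
from math import gcd

def good_lattice_points(n, generators):
    """n-run design with column entries (i * h) mod n; each column permutes 0..n-1."""
    return np.array([[(i * h) % n for h in generators] for i in range(n)])

def min_pairwise_distance(D):
    return min(np.linalg.norm(a - b) for a, b in combinations(D.astype(float), 2))

n, rng = 11, np.random.default_rng(0)
gens = [h for h in range(2, n) if gcd(h, n) == 1][:4]           # four generators
D = good_lattice_points(n, gens)
D_shift = (D + rng.integers(0, n, size=D.shape[1])) % n         # per-column level shift
print(min_pairwise_distance(D), min_pairwise_distance(D_shift))
```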

Biography:

Hongquan Xu joined UCLA after obtaining his Ph.D. from the University of Michigan in 2001. He is currently professor and vice chair of graduate studies at UCLA’s Department of Statistics. His research interests include experimental design, computer experiments, drug combination experiments, and functional linear models.