Statistics (STA)
Development of statistical concepts and theory underlying procedures used in statistical process control applications and reliability.
Development and application of two-sample inferences, analysis of variance, multiple comparison procedures, and nonparametric methods.
Introduction to the fundamentals of probability theory, random variables and their distributions, expectations, transformations of random variables, moment generating functions, special discrete and continuous distributions, multivariate distributions, order statistics, and sampling distributions.
Theory of statistical estimation and hypothesis testing. Topics include point and interval estimation, properties of estimators, properties of tests of hypotheses, including most powerful and likelihood ratio tests, and decision theory, including Bayes and minimax criteria.
Concepts in SAS programming, including methods to establish and transform SAS data sets, perform statistical analyses, and create general customized reports. Methods from both Base SAS and SAS SQL are considered. Successful completion of the course prepares students to take the SAS certification exam.
Introduction to descriptive and inferential statistics. Topics may be selected from the following: descriptive statistics and graphs, probability, regression, correlation, tests of hypotheses, interval estimation, measurement, reliability, experimental design, analysis of variance, nonparametric methods, and multivariate methods.
Simple and complex analysis of variance and analysis of covariance designs. The general linear model approach, including full-rank and less-than-full-rank models, will be emphasized.
Topics include simple linear regression, multiple regression, logistic regression, and Poisson regression. The statistical programming language R is used.
The course examines a variety of complex experimental designs that are available to researchers including split-plot factorial designs, confounded factorial designs, fractional factorial designs, incomplete block designs, and analysis of covariance. The designs are examined within the framework of the general linear model. Extensive use is made of computer software.
The focus of this course is on advanced tools using various multivariate regression techniques, statistical modeling, machine learning, and simulation for forecasting. Practical applications are emphasized.
Fundamental topics of machine learning including supervised/unsupervised learning, cost function optimization, feature selection and engineering, and bias/variance trade-off. Learning algorithms including classification methods, support vector machines, decision trees, neural networks, and deep learning are covered.
Introduction to mathematics of statistics. Fundamentals of probability theory, convergence concepts, sampling distributions, and matrix algebra.
Theory of random variables, distribution and density functions, statistical estimation, and hypothesis testing. Topics include probability, probability distributions, expectation, point and interval estimation, and sufficiency.
Topics include sampling distributions, likelihood and sufficiency principles, point and interval estimation, loss functions, Bayesian analysis, asymptotic convergence, and tests of hypotheses.
Overview of analytic and computational methods in Bayesian inference, beginning with two-sample t-inference procedures and extending through regression, with a focus on state-of-the-art software for Bayesian computation.
Statistical methods of analyzing time series including autocorrelation, model identification, estimation, forecasting, and spectral analysis. Applications in a variety of areas including economics and environmental science will be considered. Credit cannot be earned for both this course and STA 5362.
Statistical methods for analyzing time series. Topics include autocorrelation function and spectrum, stationary and non-stationary time series, linear filtering, trend elimination, forecasting, general models, and autoregressive integrated moving average models with applications in economics and engineering. Students cannot receive credit for this course and for STA 5361.
Advanced topics and theoretical underpinnings of modern data-driven methods are presented, including supervised and unsupervised methods from both statistical and machine learning perspectives; uncertainty analysis, model selection and development; and both nonlinear and linear methods.
Basic concepts of lifetime distributions. Topics include types of censoring; inference procedures for exponential, Weibull, and extreme value distributions; parametric and nonparametric estimation of the survival function; and accelerated life testing.
Traditional designs of experiments are presented within the framework of the general linear model. Also included are the latest designs and analyses for clinical trials and longitudinal studies. Credit cannot be received for this course and STA 5375.
An introduction to the methods and practices of data mining and management. Concepts, principles, methods, implementation techniques, and applications of data mining, with a focus on modeling, pattern discovery, and cluster analysis.
Development of statistical concepts and theory underlying procedures used in statistical process control applications. Topics include sampling inspection procedures, continuous sampling procedures, theory of process control procedures, and experimental design and response surface analysis to design and analyze process experiments.
Methods, programming, and algorithms used in computational statistics. Topics include, but are not limited to, Monte Carlo simulation, bootstrap, cross-validation, and MCMC. Students program in R and write R functions.
Planning, execution, and analysis of sampling from finite populations. Simple random, stratified random, ratio, systematic, cluster, subsampling, regression estimates, and multi-frame techniques are covered. Use of computer software for analyzing data collected from designs covered in class.
A survey of methods of data analysis for biostatisticians in the biomedical and pharmaceutical fields. Regression analysis, experimental design, categorical data analysis, clinical trials, longitudinal data, and survival analysis.
Exploratory spatial data analysis using both graphical and quantitative descriptions of spatial data, including the empirical variogram. Topics include several theoretical isotropic and anisotropic variogram models and various methods for fitting variogram models, such as maximum likelihood, restricted maximum likelihood, and weighted least squares. Techniques for prediction of spatial processes will include simple, ordinary, universal, and Bayesian kriging. Spatial sampling procedures, lattice data, and spatial point processes will also be considered. Existing software and case studies involving data from the environmental, geological, and social sciences will be discussed.
Descriptive methods, together with parametric and nonparametric inferential methods, for qualitative and quantitative data from a single population. Parametric and nonparametric inferential methods for qualitative and quantitative data from two populations. Linear regression using matrix notation, including topics in multiple regression, modeling diagnostic procedures, and model selection.
A continuation of STA 5380 with robust regression, quantile regression, and regression trees. K-population descriptive and inferential methods. A matrix approach to one-way analysis of variance and least squares in balanced designs with fixed and random effects. Multiple comparison procedures, power, and sample size. A brief introduction to generalized linear models.
Statistical models and procedures for describing and analyzing random vector response data. Supporting theoretical topics include matrix algebra, vector geometry, the multivariate normal distribution, and inference on multivariate parameters. Various procedures are used to analyze multivariate data sets.
Discriminant analysis, canonical correlation analysis, and multivariate analysis of variance.
Methods for analyzing high-dimensional multivariate data. Topics include matrix computation of summary statistics, graphical techniques using linear dimension reduction, statistical inference of high-dimensional multivariate parameters, high-dimensional principal components analysis and singular value decompositions, and supervised classification methods for high-dimensional sparse data.
The study of probability theory as motivated by applications from a variety of subject matters. Topics include Markov chains, branching processes, Poisson processes, continuous-time Markov chains with applications to queuing systems, and renewal theory.
Selected topics in statistics. May be repeated once with change of topic.
Consulting, research, and teaching in statistics.
Selected topics in statistics. May involve texts, current literature, or an applied data model analysis. This course may be repeated up to four times with change of topic.
Supervised research for the master's thesis. A maximum of three semester hours will count for the degree.
Large sample theory, including convergence concepts, laws of large numbers, central limit theorems, and asymptotic concepts in inference.
Bayesian statistical inference, including foundations, decision theory, prior construction, Bayesian point and interval estimation, and other inference topics. Comparisons between Bayesian and non-Bayesian methods are emphasized throughout.
Semiparametric inference, with an emphasis on regression models applicable to a wider class of problems than can be addressed with parametric regression models. Topics include scatterplot smoothing, mixed models, additive models, interaction models, and generalized regression. Models are implemented using various statistical computing packages.
Bayesian methods for data analysis. Includes an overview of the Bayesian approach to statistical inference, performance of Bayesian procedures, Bayesian computational issues, model criticism, and model selection. Case studies from a variety of fields are incorporated into the study. Implementation of models using Markov chain Monte Carlo methods is emphasized.
Critical evaluation of current statistical methodology used for the analysis of genomic and proteomic data.
A comprehensive introduction to computing for statisticians. Topics range from information technology and fundamentals of scientific computing to computing environments and workflows, statistical document preparation for reproducible research, and programming languages. Students cannot receive credit for this course and for STA 5373.
A continuation of STA 6374 with an emphasis on computational and applied mathematics, pseudo-random variate generation, and Monte Carlo methods. Credit cannot be received for this course and for STA 5373.
A hands-on survey of practical data science technologies and tools used in industry. Topics vary and may include version control systems and collaborative software development; distributed computing; data storage and access; cloud computing; web technologies, applications, and dashboards; and workflow and pipelining tools.
Theory of general linear models, including regression models, experimental design models, and variance component models. Least squares estimation, the Gauss-Markov theorem, and less-than-full-rank hypotheses.
Multivariate normal and related distributions. Topics include generalizations of classical test statistics, including Wilks' lambda and Hotelling's T², discriminant analysis, canonical variate analysis, and principal component analysis.
Theory of generalized linear models, including logistic, probit, and log-linear models, with special application to categorical and ordinal categorical data analysis.
Supervised research for the doctoral dissertation. A maximum of nine semester hours will count for the degree. A student may register for one to six semester hours in one semester.