Course Description
This 2-day hands-on workshop gives engineers, researchers, financial analysts, and statisticians a head start in using MATLAB and the Statistics Toolbox for data analysis.
Topics include data file input and output, handling large and incommensurate data sets, computing descriptive statistics, statistical plotting and visualization, fitting distributions to data, bivariate and multivariate regression, random number generators, simulation, and basic inferential methods.
The workshop is packed with examples and exercises that cover a cross-section of application areas in science, engineering, and finance.
Prerequisites
Working knowledge of the MATLAB language and basic statistics
Course Outline
Introduction
Objective:
- Obtain a quick overview of The MathWorks and the family of products
- Discuss course set-up, materials, and logistics
- Provide a “big picture” view of the course ahead
Data and Statistics
Objective: Learn to work with data in the MATLAB environment, compute basic descriptive statistics, and visualize data in a variety of ways
What is Statistics?
- Statistical sampling and modeling
- Statistical questions
- Data analysis
Working with data
- Data I/O
- Tabular data and case lists
- Incommensurate data
- Missing data
Descriptive statistics
- Measures of center, spread, and shape
Statistical plotting
- Histograms, scatter plots, and box plots
- Grouped data
- Preprocessing and reexpression
Exercise
Probability and Distributions
Objective: Review the basics of probability and random variables and explore the variety of probability distributions available in the Statistics Toolbox
Probability concepts
- Probability measures
- Random variables
- Probability distributions
Distribution concepts
- Discrete distributions
- Continuous distributions
- Distributions in the Statistics Toolbox
- Distribution parameters
- Computing probabilities
Data and distributions
- Sampling distributions
- Choosing a distribution
- Parameter estimation
- Nonparametric density functions
- Bootstrapping and simulation
- Distribution testing
Exercise
- Distribution in diagnostics
Regression Analysis
Objective: Explore regression analysis for bivariate data
Regression concepts
- Predictors and responses
- Linear and nonlinear models
- Scatter plots
- Correlation and covariance
Linear methods
- Quantiles and quantile plots
- Solving systems of linear equations with the backslash operator
- Linear least squares
- Polynomial fitting
- Graphical user interface tools for linear regression
- Curve Fitting Toolbox
- Generalized linear models
Nonlinear methods
- Nonlinear fitting
- Graphical user interface tools for nonlinear regression
- Using the Curve Fitting Toolbox for nonlinear regression
Exercise
Multivariate Statistics
Objective: Extend the concepts of the previous section to data sets with many variables and introduce specialized techniques for multivariate analysis and visualization
Multivariate plotting
- 3-D scatter plots
- Response surfaces
Principal component analysis
- Concepts
- Set-up and analysis
Factor analysis
- Concepts
- Set-up and analysis
Cluster analysis
- Concepts
- Set-up and analysis
- Hierarchical clustering and k-means clustering
Exercise
Random Numbers and Simulation
Objective: Understand the random number generators in MATLAB and the Statistics Toolbox and their use in Monte Carlo methods
Pseudorandom numbers
- Randomness
- Multiplicative congruential algorithms
Uniform random numbers
Gaussian random numbers
Writing new generators
- Inverse transform method
- Acceptance-rejection method
- Random number generators in the Statistics Toolbox
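As an illustration of the inverse transform method listed above, the following minimal sketch draws exponential random numbers by applying the inverse of the exponential CDF to uniform draws. It is written in Python for illustration only and is not taken from the course materials; the function name `exponential_sample`, the rate of 2.0, and the seed are assumptions chosen for the example.

```python
import math
import random


def exponential_sample(rate, rng):
    """Draw one exponential variate using the inverse transform method.

    The exponential CDF is F(x) = 1 - exp(-rate * x), so its inverse is
    F^{-1}(u) = -ln(1 - u) / rate for u in [0, 1).
    """
    u = rng.random()  # uniform draw in [0, 1)
    return -math.log(1.0 - u) / rate


# Hypothetical settings: rate 2.0 (theoretical mean 0.5) and a fixed seed
rng = random.Random(42)
samples = [exponential_sample(2.0, rng) for _ in range(50_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean)  # should be near the theoretical mean 1 / rate = 0.5
```

The same pattern applies to any distribution whose CDF can be inverted in closed form; when it cannot, the acceptance-rejection method is the usual alternative.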
Monte Carlo methods
Exercises
- Writing a random number generator
- Monte Carlo integration
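To give a sense of what the Monte Carlo integration exercise involves, here is a minimal sketch that estimates a definite integral by averaging the integrand at uniformly distributed random points. It is written in Python for illustration only and is not the course's own exercise; the function name `monte_carlo_integral`, the test integrand, and the sample size are assumptions.

```python
import math
import random


def monte_carlo_integral(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] from n uniform random points.

    The estimate is (b - a) times the average value of f at the sampled
    points; its standard error shrinks in proportion to 1 / sqrt(n).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n):
        total += f(rng.uniform(a, b))
    return (b - a) * total / n


# Example: the integral of sin(x) over [0, pi] is exactly 2
estimate = monte_carlo_integral(math.sin, 0.0, math.pi, 100_000)
print(estimate)  # close to 2, up to sampling error
```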
Inferential Statistics
Objective: Explore hypothesis testing and its application to analysis of variance
Hypothesis tests
- Terminology
- Assumptions
- Tests in the Statistics Toolbox
One-way analysis of variance
Two-way analysis of variance
N-way analysis of variance
Multivariate analysis of variance
Exercise