Studies
Admissions
The Institute
Resources

DS302BKK

Statistical Data Analysis

Bangkok Campus
Sep 11, 2023 - Sep 29, 2023
During the Statistical Data Analysis course, students learn the basic concepts of theory and application for statistical inference, including descriptive statistics and probability.

Faculty

Sasha Shapoval

Professor of Applied Mathematics at the University of Lodz, Poland

Course length

3 weeks

Duration

3 hours per day

Total hours

45 hours

Credits

6 ECTS

Language

English

Course type

Offline

Fee for single course

€1500

Fee for degree students

€750

Skills you’ll learn

  • Big Data
  • Statistical Hypothesis Testing
  • Monte-Carlo Methods
  • Numeric Python (NumPy)
  • Variables and Functions
  • Mathematical Concepts

Overview

This is a “practical” course that covers selected concepts and tools that are most needed for working with the data arising in economics, finance, sociology, and physics. Although students may be familiar with the introductory concepts from the course, we will soon jump into topics that do not get enough attention in the curriculum of any undergraduate math program, including the best ones. In this course, we will rarely discuss proofs. Methods and results will be described in a loose manner, with the goal of starting to think like practitioners do and getting the statistical intuition required to choose appropriate methods to solve actual problems. This will lead us to important insights about the world around us.

Learning highlights

  • Emphasise statistical literacy and develop statistical thinking.
  • Use real data in teaching statistics.
  • Stress conceptual understanding.
  • Foster active learning.
  • Use technology to develop conceptual understanding.
  • Deeply understand statistical methods, be able to choose an appropriate test, and construct reliable mathematical models.

Course outline

15 classes

Dive into the details of the course and get a sense of what each class will cover.
Monday
1

Session 1

Foundations of statistics. Basic probability distributions: exponential, Poisson, normal, Cauchy, gamma, and beta. Central limit theorem. The sum of exponential random variables and the gamma distribution. Waiting time paradox. (PROB)
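
The link between sums of exponential variables and the gamma distribution can be checked by simulation. The following is a minimal sketch, not taken from the course materials: the shape, rate, sample size, and seed are arbitrary assumptions, and it uses only the Python standard library so that it runs without extra packages.

```python
import random
import statistics


def sum_of_exponentials(k, rate, rng):
    """Draw one value of the sum of k independent Exponential(rate) variables."""
    return sum(rng.expovariate(rate) for _ in range(k))


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
k, rate, n_samples = 5, 2.0, 20000  # illustrative values only
samples = [sum_of_exponentials(k, rate, rng) for _ in range(n_samples)]

# A Gamma(shape=k, rate) variable has mean k / rate and variance k / rate**2.
print("sample mean:", statistics.fmean(samples), "theory:", k / rate)
print("sample variance:", statistics.variance(samples), "theory:", k / rate**2)
```

The sample moments should land close to the theoretical gamma moments, with the gap shrinking as n_samples grows.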

Tuesday
2

Session 2

Bayesian approach. Example of the problem: probability of success in the Bernoulli trials. Bayes rule. Application to the above example. Expected value and the deviation of the estimate. Practical example: Estimate of the centre of the Cauchy distribution. (Bayesian)
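
For the Bernoulli example, one common choice (an assumption here, not necessarily the one used in class) is a Beta prior, which gives a Beta posterior in closed form. The sketch below computes the posterior mean and standard deviation of the success probability; the function name and the uniform Beta(1, 1) default prior are illustrative.

```python
import math


def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior Beta(a, b) for a Bernoulli success probability.

    The default prior Beta(1, 1) is the uniform prior on [0, 1].
    Returns (a, b, posterior mean, posterior standard deviation).
    """
    a = prior_a + successes
    b = prior_b + (trials - successes)
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return a, b, mean, math.sqrt(variance)


# Example: 7 successes in 10 trials under a uniform prior.
a, b, mean, sd = beta_posterior(7, 10)
print(f"posterior Beta({a}, {b}): mean={mean:.4f}, sd={sd:.4f}")
```

With a uniform prior the posterior mean is (successes + 1) / (trials + 2), which pulls the raw frequency slightly towards 1/2.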

Wednesday
3

Session 3

Bayesian approach II. Metropolis algorithm. (Bayesian) Practical example: detection of spam. (PYTHON)
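
As an illustration of the Metropolis algorithm, the sketch below samples the posterior of the centre of a Cauchy distribution, the practical example named in Session 2. It is a hedged sketch rather than the course's own code: the flat prior, unit scale, proposal step, chain length, and synthetic data are all assumptions made for the example.

```python
import math
import random
import statistics


def cauchy_log_posterior(centre, data):
    """Log-posterior of the Cauchy centre (unit scale, flat prior), up to a constant."""
    return -sum(math.log1p((x - centre) ** 2) for x in data)


def metropolis(log_target, start, step, n_steps, rng):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    chain = [start]
    current, current_lp = start, log_target(start)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


rng = random.Random(1)  # arbitrary seed
true_centre = 3.0       # used only to generate synthetic data
data = [true_centre + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(200)]

# Start at the sample median, a robust first guess for the centre.
chain = metropolis(lambda c: cauchy_log_posterior(c, data),
                   statistics.median(data), step=0.3, n_steps=5000, rng=rng)
posterior_mean = statistics.fmean(chain)
print("posterior mean of the centre:", posterior_mean)
```

The sample mean of Cauchy data does not converge, so the posterior mean from the chain is a useful alternative estimate of the centre.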

Thursday
4

Session 4

Hypothesis testing. Motivating examples. General framework. Significance level and quantiles. P-values. One-sided and two-sided tests. Statistical errors. Size and power of the tests. Neyman-Pearson lemma. (CDA)

Exercises: examples of tests: the mean of a normal sample; the equality of the means of two samples; is the fraction larger than a given level? (PSE)
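
The first exercise type can be sketched as a z-test for the mean of a normal sample. This sketch assumes the standard deviation is known (with an unknown standard deviation a t-test would be used instead); the function name and the example numbers are hypothetical.

```python
import math


def z_test_mean(sample, mu0, sigma, two_sided=True):
    """z-test of H0: mean == mu0 for normal data with known sigma.

    Returns (z statistic, p-value). The one-sided alternative is mean > mu0.
    """
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    if two_sided:
        p_value = math.erfc(abs(z) / math.sqrt(2))   # P(|Z| >= |z|)
    else:
        p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z)
    return z, p_value


# Example: 16 observations averaging 0.49, tested against mu0 = 0 with sigma = 1.
z, p = z_test_mean([0.49] * 16, mu0=0.0, sigma=1.0)
print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```

Here z is 1.96, so the two-sided p-value sits right at the conventional 5% significance level.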

Friday
5

Session 5

Hypothesis testing II. Kolmogorov-Smirnov statistics. Independence from the true distribution. Comparison between a sample and a continuous probability distribution. Are two samples from the same distribution? Pearson’s test. (PSE) Practical example: does the first baby come late? (ThinkStats)
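
The one-sample Kolmogorov-Smirnov statistic is the largest gap between the empirical distribution function and a hypothesised continuous one. A minimal sketch follows; the function name and the toy sample are assumptions for illustration.

```python
def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of sample and a continuous cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so both sides of the jump have to be checked.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))


print(ks_statistic([0.1, 0.4, 0.7], uniform_cdf))
```

For this toy sample the largest gap occurs at the last point, where the empirical CDF reaches 1 while the uniform CDF is 0.7.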

Monday
6

Session 6

Bootstrap and jackknife. Manual reconstruction of the inverse pdf. Approximation of the bootstrapped distribution with a continuous one: percentile and kernel approximation. Bootstrapped estimates and bias reduction.
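
The percentile approach to the bootstrap can be sketched as follows. The data, the number of resamples, and the helper name are assumptions for the example, not part of the course materials.

```python
import random
import statistics


def bootstrap_percentile_ci(data, stat, n_boot, alpha, rng):
    """Percentile bootstrap confidence interval for stat(data)."""
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    lo_index = max(0, int((alpha / 2) * n_boot))
    hi_index = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return reps[lo_index], reps[hi_index]


rng = random.Random(2)  # arbitrary seed
data = [rng.expovariate(1.0) for _ in range(50)]  # synthetic skewed data
lo, hi = bootstrap_percentile_ci(data, statistics.fmean, n_boot=2000, alpha=0.05, rng=rng)
print(f"sample mean = {statistics.fmean(data):.3f}, 95% interval = ({lo:.3f}, {hi:.3f})")
```

The same function works for other statistics, such as statistics.median, by swapping the stat argument.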

Tuesday
7

Session 7

Point estimates. Bias and mean square error of the estimators. Natural estimates of the mean, sample variance, covariance, higher moments. Variance of the estimator. Non-parametric estimates. Method of moments. Application to basic distributions. (PSE)

Wednesday
8

Session 8

Maximum likelihood estimates. Examples: exponential and normal distributions. Variance of MLE. Rao-Cramér inequality and Fisher information. MLE and goodness-of-fit: likelihood ratio. Pearson test to estimate the parameter as an alternative to the likelihood ratio. MLE and Bayesian estimators. (CDA)
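
For the exponential example, the maximum likelihood estimate and its approximate standard error have closed forms. The sketch below assumes the rate parametrisation, with density rate * exp(-rate * x); the function name is hypothetical.

```python
import math


def exponential_mle(data):
    """MLE of the rate of an exponential distribution and its approximate standard error.

    The log-likelihood n * log(rate) - rate * sum(data) is maximised at
    rate = n / sum(data). The Fisher information of the sample is n / rate**2,
    so the Rao-Cramér bound gives a variance of about rate**2 / n.
    """
    n = len(data)
    rate_hat = n / sum(data)
    std_error = rate_hat / math.sqrt(n)
    return rate_hat, std_error


rate_hat, std_error = exponential_mle([1.0, 2.0, 3.0])
print(f"rate estimate = {rate_hat:.3f}, approximate standard error = {std_error:.3f}")
```

The standard error uses the estimated rate in place of the unknown true rate, which is reasonable only for moderately large samples.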

Thursday
9

Session 9

The method of least squares. Connection with maximum likelihood. Testing goodness-of-fit with chi-squared. (CDA)

Friday
10

Session 10

Confidence intervals. Examples: the probability of the Bernoulli trials, parameter of the exponential distribution, Gaussian distributed estimator. MLE for the uniform distribution. (CDA)
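
One standard interval for the probability in Bernoulli trials is the normal-approximation (Wald) interval. Whether the course uses this interval or another one is an assumption; the sketch below relies only on the standard library.

```python
import math
from statistics import NormalDist


def wald_interval(successes, trials, confidence=0.95):
    """Normal-approximation confidence interval for a Bernoulli probability."""
    p_hat = successes / trials
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for 95%
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials)
    # Clip to [0, 1], since a probability cannot leave that range.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


lo, hi = wald_interval(40, 100)
print(f"95% interval for p: ({lo:.3f}, {hi:.3f})")
```

The approximation is poor when the number of trials is small or the estimated probability is close to 0 or 1.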

Monday
11

Session 11

Consistency of data. Tests about standard deviations. A/B testing: usage of the Mann–Whitney test under an unknown distribution. Practical example: computation of the mean trend of the sea-level rise rates at the locations where the rise is observed.
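
The Mann–Whitney statistic counts how often a value from one group exceeds a value from the other, which is why it needs no assumption about the underlying distribution. A minimal sketch follows, with hypothetical function names and a normal approximation that ignores the correction for ties.

```python
import math


def mann_whitney_u(a, b):
    """U statistic for group a: pairs (x, y) with x > y, counting ties as 1/2."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def normal_approx_z(u, n1, n2):
    """z-score of U under H0, using mean n1*n2/2 and variance n1*n2*(n1+n2+1)/12."""
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean) / sd


group_a, group_b = [1, 2, 3], [4, 5, 6]  # toy A/B data
u = mann_whitney_u(group_a, group_b)
print("U =", u, "z =", normal_approx_z(u, len(group_a), len(group_b)))
```

The double loop is quadratic in the sample sizes; for large samples a rank-based computation is faster, and for samples as small as this toy one exact tables are preferable to the normal approximation.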

Tuesday
12

Session 12

Linear regression I. Simple linear regression. Unbiasedness of the regression coefficients. The variance and the confidence interval of the slope of the response variable. Discussion on an appropriate model choice. (ECON)
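
The slope of a simple linear regression and its standard error can be computed directly from sums of squares. The sketch below uses hypothetical names and toy data; a confidence interval for the slope would be the estimate plus or minus a Student t quantile with n - 2 degrees of freedom times the standard error.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, standard error of the slope).
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope_se


intercept, slope, slope_se = simple_linear_regression([0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8])
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, se(slope) = {slope_se:.3f}")
```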

Wednesday
13

Session 13

Linear regression II. Hypothesis: the slope is zero. Instrumental variables. Multiple linear regression. Basic equations. Qualitative issues (ECON). Practical example: Description of the bitcoin dynamics.

Thursday
14

Session 14

Causal inference. Treatment variable, treatment group, control group, outcome. Measurement problem. Average treatment effect. Average treatment effect for the treated. Independence of the choice of the individuals to be treated with respect to the outcomes. Practical computation with the sample quantities. Alternative computation with the regression. (CAUSAL)
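
The two computations mentioned here give the same number when treatment is assigned independently of the outcomes: the difference in group means equals the least-squares slope of the outcome on a 0/1 treatment indicator. The sketch below checks this on made-up data; the names and values are assumptions.

```python
def difference_in_means(outcomes, treated):
    """Mean outcome of the treated group minus mean outcome of the control group."""
    t = [y for y, d in zip(outcomes, treated) if d == 1]
    c = [y for y, d in zip(outcomes, treated) if d == 0]
    return sum(t) / len(t) - sum(c) / len(c)


def regression_slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


outcomes = [5.0, 7.0, 9.0, 2.0, 4.0]  # toy outcomes
treated = [1, 1, 1, 0, 0]             # 1 = treatment group, 0 = control group

print("difference in means:", difference_in_means(outcomes, treated))
print("regression slope:   ", regression_slope(treated, outcomes))
```

Both lines print the same estimate of the average treatment effect, up to floating-point rounding.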

Friday
15

Session 15

Exam.

Prerequisites

A good understanding of calculus and probability theory

Preliminary knowledge of Python helps, as my calculations are performed in Python. Nevertheless, you may acquire the necessary knowledge of Python during our course.

The knowledge of Linear Algebra is a plus, as rare references to this discipline are expected.

Methodology

Lectures

Labs

Homework

Brainstorming

Question and answer sessions

Final exam

Grading

The final grade will be composed of the following criteria:
60% - Problem sets
40% - Final exam
Students' achievements will be evaluated based on problem sets and the final exam. The exam and the problem sets are graded on a scale of 0 to 100 points. The final score is also computed on a 100-point scale. The weights of the problem sets and the exam are 0.6 and 0.4, respectively. This yields the following equation for the final score: Final_Score = 0.6 * (PS1 + PS2 + PS3 + PS4) / 4 + 0.4 * Exam
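
As a worked example of the Final_Score equation, the short sketch below averages four problem-set scores and combines them with the exam score. The function name and the sample scores are made up for illustration.

```python
def final_score(problem_sets, exam):
    """Final score on a 100-point scale: 60% problem-set average, 40% exam."""
    ps_average = sum(problem_sets) / len(problem_sets)
    return 0.6 * ps_average + 0.4 * exam


# Hypothetical student: problem sets 80, 90, 70, 100 and an exam score of 75.
print(final_score([80, 90, 70, 100], 75))
```

For these scores the problem-set average is 85, so the final score is 0.6 * 85 + 0.4 * 75 = 81.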
Faculty

Sasha Shapoval

Professor of Applied Mathematics at the University of Lodz, Poland

Dr. Shapoval's area of expertise is complex systems. In interdisciplinary research, he applies modern mathematical, computational, machine learning, and statistical tools to analyze data, construct predictions, detect anomalies, and assess scenarios of real-life processes. His papers are published in Scientific Reports, Physical Review, Astrophysical Journal, Chaos, Journal of Mathematical Economics, and other professional first-tier outlets. Dr. Shapoval joined the Department of Mathematics and Computer Science at the University of Lodz in 2021. Prior to coming to Lodz, he was a Professor at HSE University. As a visiting researcher, he worked one month per year at the Paris Institute of Earth Physics from 2011 to 2019.

Dr. Shapoval obtained his PhD from Lomonosov Moscow State University and was a postdoctoral research fellow at the International Centre for Theoretical Physics in Italy, as well as an invited lecturer at the New Economic School.

Apply for this course

Snap up your chance to enroll before all spaces fill up.

How to secure your spot

Complete the form below to kickstart your application

Schedule your Harbour.Space interview

If successful, get ready to join us on campus

FAQ

Will I receive a certificate after completion?

Yes. Upon completion of the course, you will receive a certificate signed by the director of the program your course belonged to.

Do I need a visa?

This depends on your case. Please check with the Spanish or Thai consulate in your country of residence about visa requirements. We will do our part to provide you with the necessary documents, such as the Certificate of Enrollment.

Can I get a discount?

Yes. The easiest way to enroll in a course at a discounted price is to register for multiple courses. Registering for multiple courses will reduce the cost per individual course. Please ask the Admissions Office for more information about the other kinds of discounts we offer and what you can do to receive one.