
CS410BKK

Data Storages

Bangkok Campus
Mar 11, 2024 - Mar 29, 2024
During this course we will study which problems of modern software can be solved by data storages, and we will survey the whole spectrum of existing data storages.

Faculty

Nikolay Golov

CPO of Tengri Data Platform

Course length

3 weeks

Duration

3 hours
per day

Total hours

45 hours

Credits

4 ECTS

Language

English

Course type

Offline

Fee for single course

€2999

Fee for degree students

€1999

Skills you’ll learn

  • SQL
  • Database Engines
  • Query Optimization
  • Landscape of Modern Databases
  • Design a Multi-Database System
  • Building a Big Data Analytical Infrastructure

Overview

Any modern application, whether mobile, web or an online service, must be able to store and retrieve data. This need has produced many tools, theories and practices with a single responsibility: data storage.

This course examines the basics of modern database technologies and their use in data-intensive applications. A particular focus is on how to choose the right database technology according to tradeoffs between performance, ease of use, data integrity, and other considerations. Hands-on lessons will be taught using the SQLite, PostgreSQL, MongoDB, Redis and Snowflake databases.

The course starts with a summary of how data storage works for an application. We proceed with a list of essential requirements for data storage tools, such as ACID, transactions, and data access languages (SQL). Afterwards, we’ll study classical relational databases (Oracle, MS SQL, PostgreSQL, MySQL). Later, we’ll describe how this century's technological advancements gave birth to a set of non-classical databases, such as in-memory storage, document storage, columnar storage, etc. The bulk of the remaining course focuses on the tradeoffs to be considered during technology selection and database design. We will emphasize the difference between OLAP (analytical) and OLTP (transactional) tasks. We’ll spend some time on data modeling for BI and OLTP.

A significant amount of time will be dedicated to the practical skills that developers and analysts need to work with most of the databases listed above. The course aims to show which skills are useful regardless of the database selected: SQLite, PostgreSQL, MySQL, Snowflake, CockroachDB or Spanner. It will illustrate the similarities between all these databases and the important differences between them. Practical tasks will help students learn how to master a new database and discover its capabilities and limitations.

Learning highlights

  • Select an appropriate database engine for a particular task.
  • Understand the pros and cons of different existing database engines.
  • Know and efficiently use SQL, both for OLTP and analytical tasks.
  • Create a data model for practical tasks for a selected database.
  • Know how to autonomously estimate performance of a database in a given environment.

Course outline

15 classes

Dive into the details of the course and get a sense of what each class will cover.
Monday
1

Session 1

Introduction. Data storage in general. CRUD. Evolution of approaches, the birth of relational models. Data Normalization. SQLite. Task 01 - SQLite.

Tuesday
2

Session 2

SQLite. SQL - Create, Insert, Select, Group by. ACID - Durability, Isolation, risks of improper Isolation. Google sheet ⇔ SQLite table.
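
The Create, Insert, Select and Group by statements named in this session can be tried with Python's built-in sqlite3 module. The following is only a minimal sketch: the orders table and its rows are invented sample data, not course material.

```python
import sqlite3

# In-memory database; the table and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 5.0), ("alice", 7.5)],
)
conn.commit()

# Total amount per customer, ordered by name so the result is stable.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Passing values through ? placeholders rather than string formatting keeps the sample safe against SQL injection.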

Wednesday
3

Session 3

ACID - atomicity. SQL - transactions, commit/rollback. SQLite. Rollback journal, SQL to answer questions about task 1.
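
Atomicity with commit/rollback can be sketched in a few lines of Python and SQLite. The accounts table, the transfer helper and the CHECK constraint below are assumptions made for illustration; the point is only that a failed transaction leaves no partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a balance may never go below zero.
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)
# The debit violates the CHECK constraint, so the credit to bob is rolled back too.
failed = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```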

Thursday
4

Session 4

Classical client-server databases: PostgreSQL, MSSQL, Oracle, MySQL. Master/Slave replication. ACID - Isolation. Transaction isolation levels. Replication techniques.

Friday
5

Session 5

PostgreSQL. Indexes: LSM/B-tree, hashtable, projection. SQL - Join, View. Isolation levels, isolation practice. Task 02: PostgreSQL.
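
A Join, a View and an index can be sketched as follows. This sample uses SQLite through Python as a stand-in so that it runs without a server; the session itself uses PostgreSQL, where the same statements should need little or no change. All table, view and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    amount INTEGER
);
-- Index on the join column, so lookups by customer avoid a full scan.
CREATE INDEX idx_orders_customer ON orders (customer_id);

INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders VALUES (1, 1, 10), (2, 1, 20), (3, 2, 5);

-- A view stores the query, not the data: it is re-evaluated on each read.
CREATE VIEW customer_totals AS
    SELECT c.name AS name, SUM(o.amount) AS total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name;
""")

rows = conn.execute(
    "SELECT name, total FROM customer_totals ORDER BY name"
).fetchall()
print(rows)
```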

Monday
6

Session 6

PostgreSQL. Analytics, BI, OLAP. Kimball vs Inmon. “Star” schema, “Snowflake” schema. Analytical SQL: window functions. Modern BI tools - Tableau, Looker. Slowly Changing Dimensions - practice.
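
As an illustration of analytical SQL, a window function can compute a running total per group without collapsing the rows the way Group by does. This sketch assumes a Python build bundled with SQLite 3.25 or newer (the first version with window functions) and uses an invented sales table; the same query should also work in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-03-11", "north", 10),
        ("2024-03-12", "north", 20),
        ("2024-03-11", "south", 5),
        ("2024-03-12", "south", 15),
    ],
)

# Running total of amount within each region, ordered by day.
running = conn.execute("""
    SELECT region, day, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY day) AS running_total
    FROM sales
    ORDER BY region, day
""").fetchall()

for row in running:
    print(row)
```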

Tuesday
7

Session 7

Redis. Limits of classical databases: single master, row storage, inefficiency. Key-value storages. Rethinking everything. Memcached. Using key-value storage as a cache. Sharding approach. Task 03: Redis.
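
The two ideas in this session, a key-value store used as a cache and sharding by key, can be sketched in plain Python. This is a toy model and not the Redis or Memcached API: ShardedCache, slow_lookup and get_user are hypothetical names, and in-memory dictionaries stand in for the cache servers.

```python
import hashlib

class ShardedCache:
    """Toy key-value cache split across several in-memory shards."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # A stable hash, so the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shards)
        return self.shards[index]

    def get(self, key):
        return self._shard_for(key).get(key)

    def set(self, key, value):
        self._shard_for(key)[key] = value

db_reads = []  # records every trip to the (pretend) primary database

def slow_lookup(user_id):
    db_reads.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(cache, user_id):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = slow_lookup(user_id)
    cache.set(key, value)
    return value

cache = ShardedCache(num_shards=4)
first = get_user(cache, 42)   # cache miss: reads the database
second = get_user(cache, 42)  # cache hit: no database read
print(first, second, db_reads)
```

Simple modulo sharding like this remaps most keys when the number of shards changes, which is one reason real systems often use consistent hashing or fixed hash slots instead.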

Wednesday
8

Session 8

MongoDB. Document-oriented databases. JSON and SQL, document storage in a classical database such as PostgreSQL. Sharding. Risks of inconsistency.

Thursday
9

Session 9

Column storages. OLTP vs OLAP tasks for columnar databases. Vertica, Greenplum, ClickHouse, Snowflake. Sharding approach. SQL - window functions.
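
The difference between row and column layouts can be shown with ordinary Python structures. This is a conceptual sketch with invented data, not how any of the listed engines store data internally: an analytical query that sums one field has to touch every whole row in the row layout, but only one list in the column layout.

```python
# Row layout: one record per entity, typical for OLTP reads and writes.
rows = [
    {"id": 1, "region": "north", "amount": 10},
    {"id": 2, "region": "south", "amount": 5},
    {"id": 3, "region": "north", "amount": 20},
]

# Column layout: one list per field, typical for OLAP scans.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Same answer either way; the column layout reads only the "amount" values.
total_from_rows = sum(row["amount"] for row in rows)
total_from_columns = sum(columns["amount"])
print(total_from_rows, total_from_columns)
```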

Friday
10

Session 10

Snowflake. Modern Data Modeling: Data Vault, Anchor Modeling. Big data. Clickstream analytics.

Monday
11

Session 11

Databuses. Kafka, Pulsar and other tools. Databuses for OLTP (event-based architecture) and OLAP (data streaming).
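
The core idea behind a databus, an append-only log that several consumers read at their own pace, can be sketched in plain Python. This is a toy model and not the Kafka or Pulsar API: the Topic class and the consumer names are hypothetical.

```python
class Topic:
    """Toy append-only log with an independent read offset per consumer."""

    def __init__(self):
        self.log = []
        self.offsets = {}

    def publish(self, event):
        self.log.append(event)
        return len(self.log) - 1  # position of the event in the log

    def poll(self, consumer, max_events=10):
        # Each consumer resumes from where it stopped; events are not deleted.
        start = self.offsets.get(consumer, 0)
        batch = self.log[start:start + max_events]
        self.offsets[consumer] = start + len(batch)
        return batch

topic = Topic()
for event in ["order_created", "order_paid", "order_shipped"]:
    topic.publish(event)

billing_first = topic.poll("billing", max_events=2)
billing_second = topic.poll("billing", max_events=2)
analytics_all = topic.poll("analytics")
print(billing_first, billing_second, analytics_all)
```

Because events stay in the log, an OLTP service and an analytical pipeline can consume the same stream without interfering with each other.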

Tuesday
12

Session 12

Polyglot persistence. Modern databases on a Performance/Complexity/Delay graph. Proper roles of classical, key-value, document-oriented and columnar databases, and of databuses. Event-driven architecture. SAGA pattern.
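
The SAGA pattern replaces one cross-database transaction with a sequence of local steps, each paired with a compensating action that undoes it. The sketch below is one simple way to express this in Python; run_saga and the order steps are hypothetical, and a real saga would also need durable state and retries.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action fails, run the compensations of the completed steps
    in reverse order and report failure.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True

events = []

def fail_shipping():
    raise RuntimeError("no courier available")

steps = [
    (lambda: events.append("reserve stock"), lambda: events.append("release stock")),
    (lambda: events.append("charge card"), lambda: events.append("refund card")),
    (fail_shipping, lambda: events.append("cancel shipment")),
]

succeeded = run_saga(steps)
print(succeeded, events)
```

The failed step itself is not compensated, since it never completed; only the steps before it are undone.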

Wednesday
13

Session 13

Risks of polyglot persistence. Eventual consistency. CAP theorem - meaning and applications.

Thursday
14

Session 14

Databases of the future. Clouds change everything. Managed databases. Serverless databases. BigQuery, Snowflake, YDB, CockroachDB.

Friday
15

Session 15

Final Quiz

Prerequisites

Python coding experience.

Basic understanding of algorithms or set theory.

Basic knowledge of SQL helps but is not required.

Methodology

Lectures

Discussions

Practice with a database using SQL, an IDE, or Python applications.

Develop a few projects to try new skills and tools for practical tasks.

Grading

There will be 3 tasks to do after classes (small projects) and a final quiz. The final grade will be composed of the following criteria:
30% - Final quiz
50% - Practice tasks combined
20% - Participation
Faculty

Nikolay Golov

CPO of Tengri Data Platform

Nikolay got his M.S. degree in applied mathematics and cybernetics from Moscow State University, Russia. He then spent 15 years building data platforms for various startups and enterprises. From 2013 until 2019, he headed the Data Platform of Avito, the Craigslist of Russia, which grew from a small startup into a multi-billion-dollar company. At Avito, he was responsible for analytical databases (Vertica, ClickHouse), OLTP engines (PostgreSQL, Redis, MongoDB), and data buses (Kafka) for analytics and microservices. Later he was Head of Data Platform at ManyChat (a California and Barcelona-based SaaS startup), responsible for the implementation and growth of its Data Platform (AWS+Redis+Snowflake+Tableau), which is used for analytics and AI. Currently Nikolay is CPO of Tengri Data Platform, a startup creating a new analytical database.

Apply for this course

Snap up your chance to enroll before all spaces fill up.

How to secure your spot

Complete the form below to kickstart your application

Schedule your Harbour.Space interview

If successful, get ready to join us on campus

FAQ

Will I receive a certificate after completion?

Yes. Upon completion of the course, you will receive a certificate signed by the director of the program your course belonged to.

Do I need a visa?

This depends on your case. Please check with the Spanish or Thai consulate in your country of residence about visa requirements. We will do our part to provide you with the necessary documents, such as the Certificate of Enrollment.

Can I get a discount?

Yes. The easiest way to enroll in a course at a discounted price is to register for multiple courses. Registering for multiple courses will reduce the cost per individual course. Please ask the Admissions Office for more information about the other kinds of discounts we offer and what you can do to receive one.