What is OLTP and OLAP: Difference between revisions
(Created page with "There are basically two types database workloads leading to different database designs: {| class="wikitable" |+ ! !OLTP !OLAP |- |abbr. |Online Transaction Processing |Online ''Analytical'' Processing |- |Example applications |ERP, Ticket System |BI suite, Statistical programs |- |Typical SQL Queries |large amounts of small UPDATE, INSERT |few SELECT with big computation demand |- |Best Storage Format |Row Store |Columnar store |- |Database Engines |MySQL, Postgres |Mone...") |
|||
Line 53: | Line 53: | ||
== See also == |
== See also == |
||
+ | |||
− | [[Shards, RecordIDs, Main Storage, Delta Storage]] |
+ | * [[Shards, RecordIDs, Main Storage, Delta Storage]] |
Latest revision as of 19:27, 20 November 2024
There are basically two types database workloads leading to different database designs:
OLTP | OLAP | |
---|---|---|
abbr. | Online Transaction Processing | Online Analytical Processing |
Example applications | ERP, Ticket System | BI suite, Statistical programs |
Typical SQL Queries | large amounts of small UPDATE, INSERT | few SELECT with big computation demand |
Best Storage Format | Row Store | Columnar store |
Database Engines | MySQL, Postgres | MonetDB |
Users | Many Small Users | One Data Operator |
OLAP databases use ETL (Extract, Transform, Load) processes to load data from an OLTP database. This allows for the data to be transformed into a format suitable for data analysis and reporting. The ETL process enables data to be extracted from the source database, transformed into the format desired for the analysis, and then loaded into the OLAP database.
However, wouldn’t it be nice to have a DBMS that can handle both, OLAP and OLTP workloads well?
Marrying both worlds
OLAP databases work best with columnar storage, which allows for large data sets to be stored and accessed quickly. This makes it ideal for data analysis and reporting. On the other hand, OLTP databases are optimized for record-wise data storage, which is better suited for read/write operations.
The ideal DBMS should be able to handle both OLAP and OLTP workloads well. It should be able to store and access data quickly and efficiently, while also being able to handle the read/write operations associated with OLTP workloads. It should also provide a robust set of features for data analysis and reporting.
So we have two kinds of data in the database:
- A large set of old records where we want to do analysis on (called main storage) stored in the OLAP-optimized columnar storage
- A small set of recent changes that came from the OLTP workloads (called delta storage) stored as a OLTP-optimized row storage
While the main storage can be compressed and optimized for read operations, the delta storage must be optimized for write operations.
So, the main storage is a column store while delta storage is a row store.
Whenever there is enough „old“ data in the delta storage, the main storage can be rebuilt from main storage + delta storage and delta storage can start empty again.
Of course, queries on the database have to consider both: delta and main storage.