Persistency and Performance Guarantees
Latest revision as of 22:23, 8 October 2024
MemCP gives the user several guarantees for persistency and performance, which are described below.
Performance Guarantees
MemCP guarantees that:
- Inserts and updates without any unique key check or foreign key check will scale across shards
- Aggregate functions like SUM and AVG will scale perfectly with the number of CPU cores
With these guarantees, you can add more CPU cores whenever you have more data to ensure the same query time even with growing data.
Please consider that memory bandwidth might still be a bottleneck. Even modern 128-core CPUs do not scale better than 8 cores for uncompressed data (usually 40 GiB/s bandwidth on tested machines). MemCP's Columnar Storage scales beyond that limit.
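The interplay between core count and memory bandwidth can be illustrated with a back-of-the-envelope model. This is only a sketch: the 40 GiB/s cap and the 8-core saturation point come from the paragraph above, while the linear per-core scaling, the function name `scan_seconds` and the `compression_ratio` parameter are assumptions made for illustration, not measured MemCP behavior.

```python
# Back-of-the-envelope model, not a benchmark.
BANDWIDTH_CAP_GIB_S = 40.0  # memory bandwidth quoted in the text
SATURATION_CORES = 8        # cores that saturate it for uncompressed data
# Assumption: each core scans at a fixed rate and cores scale linearly.
PER_CORE_GIB_S = BANDWIDTH_CAP_GIB_S / SATURATION_CORES  # 5 GiB/s


def scan_seconds(data_gib: float, cores: int, compression_ratio: float = 1.0) -> float:
    """Estimate the time to scan `data_gib` of logical data once.

    `compression_ratio` is hypothetical: 4.0 means a column occupies a
    quarter of its uncompressed size in RAM, so the memory bus can
    deliver four times as much logical data per second.
    """
    cpu_limit = cores * PER_CORE_GIB_S
    bus_limit = BANDWIDTH_CAP_GIB_S * compression_ratio
    return data_gib / min(cpu_limit, bus_limit)


# Doubling data and cores keeps the query time constant below the cap.
print(scan_seconds(80, cores=4))    # 4.0
print(scan_seconds(160, cores=8))   # 4.0
# Uncompressed data gains nothing from more than 8 cores.
print(scan_seconds(160, cores=128))  # 4.0
# With a hypothetical 4x compression, 32 cores are still useful.
print(scan_seconds(160, cores=32, compression_ratio=4.0))  # 1.0
```

In this model, compression moves the point at which the memory bus saturates, which is one plausible reading of why a compressed columnar layout can keep scaling where uncompressed data cannot.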
Persistency Guarantees
There are four persistency modes per table:
- ENGINE = memory
- ENGINE = sloppy
- ENGINE = logged
- ENGINE = safe
In contrast to other RDBMSs, MemCP lets you choose this setting per table. MySQL/MariaDB, for instance, only allow a global setting, which is sometimes not what you want.
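As a sketch of what a per-table setting could look like in practice, the helper below builds one CREATE TABLE statement per table, each with its own mode. It assumes that MemCP accepts a MySQL-style `ENGINE = <mode>` table option, which the notation on this page suggests but does not spell out. The helper `create_table_sql` and the table and column names are hypothetical.

```python
VALID_ENGINES = ("memory", "sloppy", "logged", "safe")  # the four modes listed above


def create_table_sql(table: str, columns: dict, engine: str) -> str:
    """Build a CREATE TABLE statement with a per-table persistency mode.

    Assumption: the database accepts a MySQL-style `ENGINE = <mode>`
    table option after the column list.
    """
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown engine {engine!r}, expected one of {VALID_ENGINES}")
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns.items())
    return f"CREATE TABLE {table} ({cols}) ENGINE = {engine}"


# Hypothetical schema: each table gets the durability it needs.
print(create_table_sql("sessions", {"id": "INT", "token": "TEXT"}, "memory"))
print(create_table_sql("page_hits", {"url": "TEXT", "hits": "INT"}, "sloppy"))
print(create_table_sql("invoices", {"id": "INT", "amount": "DECIMAL(10,2)"}, "safe"))
```

Mixing modes in one schema lets cheap, reproducible data stay fast while critical tables pay the cost of full durability.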
ENGINE = memory
- all data is held in memory and only in memory
- in case of a crash, all data is gone
- the schema is saved on disk
- after a recovery, the table starts empty
- fastest way to store data
- use it for session data, observer handles, caches and other data that can be recreated by the software
ENGINE = sloppy
- all data is held in memory
- the main storage is mirrored on disk
- the delta storage is RAM-only
- a main storage rebuild is triggered every 15 minutes, so data older than 15 minutes are guaranteed to be persistent
- in case of a crash, the delta storage is gone, the main storage is recovered
- after a recovery, some datasets or deletions that happened in the last 15 minutes before the crash may be gone
- extremely fast way to store data without the fear of losing it in normal operation
- use it for frequently updated tables with unimportant data like usage statistics or sensor data
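The 15-minute window of the sloppy mode can be made concrete with a small model. It is a simplification under stated assumptions: rebuilds are taken to fire at exact multiples of the interval since start-up and to complete instantly, and the function name `surviving_writes` is invented for this sketch.

```python
REBUILD_INTERVAL_S = 15 * 60  # main storage rebuild period from the text


def surviving_writes(write_times_s, crash_time_s, interval_s=REBUILD_INTERVAL_S):
    """Return the writes that survive a crash in this simplified model.

    Assumptions (not from the source): rebuilds fire at exact multiples
    of `interval_s` since start-up and complete instantly. Writes made
    after the last rebuild live only in the RAM-only delta storage.
    """
    last_rebuild = (crash_time_s // interval_s) * interval_s
    return [t for t in write_times_s if t <= last_rebuild]


# Writes at these seconds after start-up, crash at second 1795.
writes = [100, 850, 950, 1700, 1790]
print(surviving_writes(writes, crash_time_s=1795))  # [100, 850]
```

The last rebuild before the crash happened at second 900, so the write at second 950 is lost although it is 845 seconds old. Only writes older than the full 15 minutes are certain to be on disk.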
ENGINE = logged
- all data is held in memory
- the main storage is mirrored on disk
- changes to the delta storage are logged on disk files
- an operation succeeds even if data is not synced to disk permanently yet
- in case of a crash, whether data is recovered depends on the kind of crash
- in case of a crash of the process, all data will be recovered
- in case of a power outage or kernel crash, data might be lost
- allows buffering of the log file in return for the risk of data loss
- use it for data where you need high update performance but cannot afford to lose the last 15 minutes of your work
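The difference between a process crash and a power outage follows from where buffered log data lives. The toy log below is a sketch of the general technique, not MemCP's implementation or on-disk format. The class `DeltaLog`, the function `replay` and the JSON line format are assumptions for illustration.

```python
import json
import os
import tempfile


class DeltaLog:
    """Toy append-only change log; not MemCP's on-disk format."""

    def __init__(self, path):
        self.path = path
        self.file = open(path, "a", encoding="utf-8")

    def append(self, change: dict) -> None:
        # write() + flush() hands the bytes to the operating system.
        # They survive a crash of this process, but may still sit in the
        # OS page cache and be lost on a power outage or kernel crash,
        # because no fsync is issued here.
        self.file.write(json.dumps(change) + "\n")
        self.file.flush()

    def close(self) -> None:
        self.file.close()


def replay(path):
    """Rebuild the list of changes after a restart by reading the log."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


path = os.path.join(tempfile.mkdtemp(), "delta.log")
log = DeltaLog(path)
log.append({"op": "insert", "id": 1})
log.append({"op": "delete", "id": 1})
# A restarted process would read the same file and see both changes.
recovered = replay(path)
print(recovered)
log.close()
```

Because `append` returns without waiting for the disk, it stays fast, which matches the trade-off described above: high update performance in return for a small risk of loss when the whole machine goes down.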
ENGINE = safe
- all data is held in memory
- the main storage is mirrored on disk
- changes to the delta storage are logged on disk files
- an operation only succeeds after the data is synced to disk permanently
- in case of a crash, all data will be recovered
- introduces delays for each transaction since the system has to wait for a write fence
- I/O bound (limited to ~1,700 write operations per second)
- use it for accounting data or any data that must not be lost
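Waiting for the disk on every operation is what makes the safe mode I/O bound. The sketch below shows the general pattern with `os.fsync` and turns the ~1,700 operations per second figure into a time estimate. The function names are hypothetical, and the estimate assumes one sync per transaction with no batching, which is an assumption and not a statement about how MemCP schedules its writes.

```python
import os
import tempfile

MAX_SYNCED_WRITES_PER_S = 1700  # approximate figure quoted above


def durable_append(fd: int, record: bytes) -> None:
    """Return only after the record has been pushed to the storage device."""
    os.write(fd, record)  # small records are assumed to be written in one call
    os.fsync(fd)          # blocks until the OS reports the data as written out


def min_seconds_for(transactions: int, rate: float = MAX_SYNCED_WRITES_PER_S) -> float:
    """Lower bound on wall-clock time if every transaction waits for its own sync."""
    return transactions / rate


path = os.path.join(tempfile.mkdtemp(), "safe.log")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
durable_append(fd, b"insert id=1\n")
os.close(fd)

print(min_seconds_for(1700))
print(round(min_seconds_for(100_000), 1))
```

Under these assumptions, each synced write costs roughly 0.6 milliseconds, and 100,000 individually synced transactions need close to a minute, however many CPU cores are available.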