system-designnotesDatabases

Relational databases

  1. Relational databases, or relational database management system (RDMS), or SQL database represent and store data in tables. Data across different database tables can be combined using SQL join operations.
  2. Examples include MySQL, Oracle database, PostgreSQL.

Non-relational database

  1. Non-relational databases are grouped into four types: key-value stores, graph stores, column stores, and document stores. Join operations are generally not supported in non-relational databases.
  2. Examples include CouchDB, Cassandra, HBase, Amazon DynamoDB.
  3. Might be the right choice if:
    1. the application requires super-low latency
    2. application data is unstructured, or is not relational
    3. your use case involves storing and retrieving serialized data (e.g., JSON objects) and you don’t need complex joins, foreign keys, or structured queries
    4. you need to store a massive amount of data

Database replication

  1. A master generally only supports write operations. A slave database gets copies of the data from the master and only supports read operations. All the insert, delete, or update must be sent to the master database.
  2. Most applications require a much higher ratio of reads to writes. Thus, the number of slave databases in a system is usually larger than the number of master databases.
  3. Benefits:
    1. Better performance: read opeartions are distributed across slave nodes, which allows more queries to be processed in parallel
    2. Reliability: data is replicated across multiple locations, so even if one database is lost, data is still preserved
    3. High availability: database remains in operation even if a database is offline as you can access data stored in another database server
  4. If only one slave database is available and it goes offline, read operations will be routed to the master database temporarily until the issue is resolved.
  5. If the master database goes offline, a slave database will be promoted to be the new master, and all the database operations will be temporarily executed on the new master database, and a new slave database will replace the old one for data replication.
  6. In a production setting, data on a slave database might not be up-to-date. The missing data would have to be updated by running data recovery scripts before promoting a slave database to a master database.