Last modified: May 16, 2025

This article is written in: ๐Ÿ‡บ๐Ÿ‡ธ

Choosing Database

Choosing the right database can significantly influence your project's success. It requires careful evaluation of factors such as the data model, performance requirements, scalability, availability, and cost. Understanding your specific use case and its limitations helps ensure that your choice supports both immediate needs and future growth.

Data Models

Each model handles specific workloads and data shapes, so understanding their strengths helps you make informed architectural decisions.

Choose the model that matches the shape of your data and the way it will be queriedโ€”the engine comes later.

Relational (Row-Store SQL)

Relational databases structure data into tables with fixed schemas, enforcing strong consistency and rich relational integrity through ACID transactions.

Structured rows, strong consistency, rich joins.

Best for Typical engines Key traits
Financial ledgers, inventory, order processing PostgreSQL, MySQL, SQL Server ACID, referential integrity, mature tooling

+----+----------+----------+
| id | customer | balance  |
+----+----------+----------+
|  1 | Omar     |  1 250.00|
|  2 | Layla    |    93.70 |
+----+----------+----------+

Document & Key-Value (NoSQL)

Document and key-value stores offer flexible schema designs, storing JSON-like documents or simple key-value pairs for scalable and agile development.

Flexible JSON blobs or simple keys, line-out scalability.

Best for Typical engines Key traits
Content feeds, user profiles, catalogues MongoDB, DynamoDB, Couchbase Schema-optional, sharding built-in, eventual or tunable consistency

{
  "user": "Omar",
  "posts": [
    {"id": 1, "text": "Hello"},
    {"id": 2, "text": "World"}
  ]
}

Time-Series

Time-series databases are optimized for storing and querying time-stamped or sequential data, enabling efficient analysis of trends and temporal patterns.

Append-only metrics indexed by timestamp.

Best for Typical engines Key traits
IoT telemetry, server metrics, market ticks TimescaleDB, InfluxDB, Amazon Timestream High ingest rate, roll-ups, window queries

value
20 โ”ค                              *
18 โ”ค                         *
16 โ”ค                    *
14 โ”ค               *
12 โ”ค          *
10 โ”ค     *
 8 โ”ค *
 0 โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ time

Graph

Graph databases represent data as nodes and edges, enabling fast traversals and complex relationship queries for connected data scenarios.

Nodes and edges with millisecond traversals.

Best for Typical engines Key traits
Social graphs, fraud rings, recommendations Neo4j, Amazon Neptune, ArangoDB Index-free adjacency, path finding, pattern matching

(Omar)โ”€โ”€โ”€knowsโ”€โ”€โ”€(Layla)
  โ”‚ โ–ฒ              โ”‚
  โ”‚ โ””โ”€โ”€knowsโ”€โ”€โ”€โ”€โ”  โ”‚
  โ–ผ             โ–ผ  โ–ผ
(Zayn)       (Noura)

In-Memory / Cache

In-memory and caching systems load data directly into RAM, delivering sub-millisecond access times for high-performance and transient workloads.

Entire datasetโ€”or the hot subsetโ€”in RAM.

Best for Typical engines Key traits
Session state, leaderboard counters, low-latency look-ups Redis, Memcached ยตs read/write, TTL eviction, optional persistence

Client โ†’ [Redis] โ†’ (TTL 60 s) โ†’ Primary DB

Scalability

Scalability involves deciding between horizontal scaling (adding nodes) or vertical scaling (enhancing existing nodes). Managed services often offer auto-scaling to adjust capacity dynamically based on demand, supporting seamless growth without manual intervention.

Scale-Up (Vertical)

Vertical scaling, or scaling up, increases the capacity of a single node by adding more CPU, RAM, or IOPS, simplifying operations at the expense of hitting hardware limits.

Best for How it scales Trade-offs
Early MVPs, small OLTP apps Add CPU / RAM / IOPS to one box Simpler ops, but hard ceiling and cost curve is exponential

+------------+
| 4 vCPU, 8G |
+------------+
      โ”‚   upgrade
      โ–ผ
+------------+
| 32 vCPU,   |
| 256 G RAM  |
+------------+

Read Replication

Read replication distributes query load by creating copies of the primary database, improving read throughput while maintaining a single write master.

Best for How it scales Trade-offs
Read-heavy workloads, dashboards Fans out read traffic to N replicas Write master can still bottleneck; replica lag

Clients
  โ”‚
  โ”œโ”€โ–บ Replica 1  (SELECT)
  โ”œโ”€โ–บ Replica 2  (SELECT)
  โ””โ”€โ–บ Primary    (RW)

Sharding / Partitioning

Sharding splits data across multiple nodes based on key ranges or hash functions, enabling vast datasets to be partitioned for parallel processing.

Best for How it scales Trade-offs
Large OLTP sets, multitenant SaaS Split data by key range or hash across nodes Cross-shard joins/txns become complex

Shard Key = user_id % 3
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Shard 0   โ”‚  Shard 1   โ”‚  Shard 2   โ”‚
โ”‚ users 0,3โ€ฆ โ”‚ users 1,4โ€ฆ โ”‚ users 2,5โ€ฆ โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Geo-Distributed / Active-Active

Geo-distributed active-active configurations place writable nodes across regions to reduce latency and enhance resilience, requiring careful consensus and CAP trade-offs.

Best for How it scales Trade-offs
Low-latency global apps, BCDR Writes land in nearest region; consensus syncs across sites Network partitions โ†’ CAP decisions; higher cost

โ•ญโ”€โ”€ us-east โ”€โ”€โ•ฎ        โ•ญโ”€โ”€ eu-west โ”€โ•ฎ
โ”‚  RW node    โ”‚โ—€โ”€โ”€โ”€โ”€โ”€โ”€โ–ถโ”‚  RW node   โ”‚
โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ        โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ

Serverless / Elastic Autoscaling

Serverless and elastic autoscaling platforms automatically adjust compute resources in response to traffic spikes, offering rapid scaling at the cost of cold start delays.

Best for How it scales Trade-offs
Spiky, unpredictable traffic Capacity grows & shrinks in seconds Cold starts, price premium per request

Load
  โ”‚
  โ”œโ”€โ”€โ”€โ”€ low โ”€โ”€โ–บ 1 container
  โ””โ”€โ”€โ”€โ”€ burst โ–บ 50 containers

Dedicated Cache Layer

Adding a dedicated caching layer offloads frequent reads to in-memory stores, accelerating data access but introducing extra hops and eviction complexities.

Best for How it scales Trade-offs
Hot-read amplification, session state Offload reads to in-memory tier; add nodes linearly Extra hop, eviction strategy required

Client โ†’ Redis Cluster โ†’ Primary DB
          (1 ms)          (10 ms)

Availability and Reliability

Maintaining uptime and reliability requires features like Multi-AZ deployments, read replicas, and robust backup and recovery options. Managed services with built-in failover capabilities can ensure your database remains accessible during hardware failures or other disruptions.

Resiliency is architecture, not luckโ€”choose a pattern that meets your SLA and your budget.*

Single-Node + Local Backup

Local backups provide a basic level of protection by dumping data to the same infrastructure, suitable for non-critical environments with relaxed recovery objectives.

Best for How it protects Trade-offs
Dev, POCs, throw-away demos Nightly dump to same data center RTO hours, RPO โ‰ค 24 h; hardware failure = downtime

App โ”€โ”€โ–บ DB โ”‚โ”€โ”€โ–บ local backup.gz
           โ”‚
      same rack

Synchronous Failover (Same Zone)

Synchronous failover replicates data in real time within the same availability zone, providing fast recovery but remaining vulnerable to zone-level outages.

Best for How it protects Trade-offs
Small prod apps, cost-sensitive OLTP Sync replica in same AZ; auto-promote on failure AZ outage brings both down; write latency โ†‘ โ€ข

Primary  โ”€โ”€โ–บ Sync Replica
   โ”‚ 1-RTT write ack  โ–ฒ
   โ””โ”€โ”€โ”€โ”€ health checkโ”€โ”˜

Cross-Zone Async Replication

Cross-zone asynchronous replication ships updates to a standby in another zone, balancing recovery speed with geographic fault tolerance at the cost of potential data lag.

Best for How it protects Trade-offs
Standard web & mobile backends Replica in different AZ; failover script RPO seconds-minutes (async); split-brain risk if promote too early

AZ-A (RW)  โ”€โ”€โ”€asyncโ”€โ”€โ–บ  AZ-B (Stand-by)

Multi-Region Active-Passive (Warm Stand-by)

Warm stand-by setups continuously ship logs to a remote region, enabling fast DNS-based failover while controlling costs compared to fully active designs.

Best for How it protects Trade-offs
Compliance-driven, 99.95 % SLA Continuous log shipping to remote region; DNS cut-over RTO 1-15 min; extra 50โ€“100 % cost; app tier must also fail over

us-east-1  โ”€โ”€logsโ”€โ”€โ–บ  eu-west-1
 RW                Warm Stand-by

Multi-Region Active-Active (Consensus / Global Txn)

Global active-active architectures accept writes in all regions, coordinating via consensus protocols to maintain consistency across geographies at scale.

Best for How it protects Trade-offs
24ร—7 global products, <100 ms writes All regions accept writes; paxos/raft ensures quorum Higher write latency (N/2+1), complex conflict resolution, premium pricing

โ•ญโ”€us-eastโ”€โ”€โ•ฎ  โ†” Paxos โ†”  โ•ญโ”€eu-westโ”€โ”€โ•ฎ
โ”‚  RW node โ”‚             โ”‚  RW node โ”‚
โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ             โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ
          โ†•
        ap-southeast

Periodic Immutable Backups + Point-in-Time Restore

Immutable backups combined with WAL-based point-in-time restore protect against data corruption and human error by ensuring snapshots cannot be altered.

Best for How it protects Trade-offs
Ransomware, operator error Snapshots to object store; PITR via WAL No live HA; downtime during restore

DB WAL โ–บ S3 Glacier (immutability)

Chaos & Observability Layer

Integrating chaos testing and observability into your stack validates resilience by proactively injecting failures and monitoring key signals against SLOs.

Best for How it protects Trade-offs
Enforced SLOs, audits Fault-injection + golden signals (latency, error %, saturation) Requires culture & tooling investment

[Chaos Monkey] โ†’ inject fault
      โ”‚
      โ–ผ
Alerts (SLO burn) โ”€โ–บ PagerDuty

Decision-Making Trees

Quick-start: decide which branch to follow before you think about engines or vendors. Answer the questions in this one diagramโ€”then jump straight to the matching decision tree.

#
                       โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                       โ”‚   What does your application store?    โ”‚
                       โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                       โ”‚
                โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                โ–ผ                       โ–ผ                        โ–ผ
   Rows / columns fit a rigid   JSON, XML, key-value with    Raw binaries, logs,
        schema that rarely        evolving fields and       audio-video, images,
          changes? (Yes)          partial structure? (Yes)      sensor dumps?
                โ”‚                       โ”‚                         โ”‚
                โ–ผ                       โ–ผ                         โ–ผ
        โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”        โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”       โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
        โ”‚  STRUCTURED  โ”‚        โ”‚ SEMI-STRUCTURED  โ”‚       โ”‚  UNSTRUCTURED   โ”‚
        โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜        โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜       โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                โ”‚                       โ”‚                         โ”‚
     โ†’ Go to the **Structured**   โ†’ Go to the **Semi-**     โ†’ Go to the **Un-**
          decision tree                decision tree              decision tree

Rule of thumb: if even 10 % of your data canโ€™t live in a fixed table, treat the set as semi-structured; if you mostly store files or streams, treat it as unstructured.

Structured Data

#
                       โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                       โ”‚ Structured    โ”‚
                       โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                               โ”‚
             โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
             โ–ผ                                   โ–ผ
         OLTP / ACID                    Analytics / OLAP
   (high-QPS, row lookups)           (large scans, aggregates)
             โ”‚                                     โ”‚
 โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”               โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
 โ–ผ                       โ–ผ               โ–ผ                   โ–ผ
< 1 TB single node   > 1 TB or global   Column-store    Serverless
 MySQL, Postgres     NewSQL (TiDB,      (Snowflake,     column-store
 + read replicas     CockroachDB,       Redshift,       (BigQuery,
 or partitions       Spanner)           ClickHouse)     Athena)

Rule of thumb: start simple (row-store) until analytics queries dominate or data tops \~1 TB.

Unstructured Data

#
                       โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                       โ”‚       Unstructured          โ”‚
                       โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                      โ”‚
                โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                โ–ผ                     โ–ผ                       โ–ผ
        POSIXโ€like FS?         Huge blobs?          ยตs-latency blocks?
            โ”‚ Yes                  โ”‚ Yes                      โ”‚ Yes
            โ–ผ                      โ–ผ                          โ–ผ
   Distributed File FS      Object Store (S3, GCS)      Block Store (EBS)
            โ”‚ No                   โ”‚ No                       โ”‚ No
            โ””โ”€โ–ถ Need global CDN? โ”€โ–ถ CDN โ”€โ–ถ otherwise FS / OS

Translation: choose by access semantics (files vs. objects vs. blocks) and latency vs. size constraints.

Semi-Structured Data

#
                      โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                      โ”‚   Semi-Structured      โ”‚
                      โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                   โ”‚
          โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
          โ–ผ                        โ–ผ                         โ–ผ
      Document-style?        Simple KV / cache?        Graph relations?
          โ”‚ Yes                    โ”‚ Yes                     โ”‚ Yes
          โ–ผ                        โ–ผ                         โ–ผ
   Document DB (Mongo,       In-Memory KV (Redis)       Graph DB (Neo4j,
   Couchbase) 10 GB-10 TB     ยตs latency, TTL keys       Neptune)
          โ”‚ No                     โ”‚ No                      โ”‚ No
          โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                        โ–ผ                      โ–ผ
               Wide/sparse cols?        Full-text search?
                    โ”‚ Yes                     โ”‚ Yes
                    โ–ผ                         โ–ผ
         Wide-Column (Cassandra)     Search DB (Elastic,
             Time-series OK           OpenSearch)
                    โ”‚ No
                    โ–ผ
           Time-ordered writes?
                    โ”‚ Yes
                    โ–ผ
        Time-Series DB (InfluxDB,
             TimescaleDB)

Shortcut: pick by dominant access patternโ€”JSON blobs, key-value, graph traversals, etc.

How to Use These Trees

  1. Locate your data shape (structured, semi-, unstructured).
  2. Walk the branch that matches latency, size, and query style.
  3. Prototype quicklyโ€”run real workloads for a day.
  4. Re-check every 6-12 months as traffic and cloud services evolve.

Stick to the diagram path and youโ€™ll land on a database that fits today without painting yourself into a corner tomorrow.

Table of Contents

    Choosing Database
    1. Data Models
      1. Relational (Row-Store SQL)
      2. Document & Key-Value (NoSQL)
      3. Time-Series
      4. Graph
      5. In-Memory / Cache
    2. Scalability
      1. Scale-Up (Vertical)
      2. Read Replication
      3. Sharding / Partitioning
      4. Geo-Distributed / Active-Active
      5. Serverless / Elastic Autoscaling
      6. Dedicated Cache Layer
    3. Availability and Reliability
      1. Single-Node + Local Backup
      2. Synchronous Failover (Same Zone)
      3. Cross-Zone Async Replication
      4. Multi-Region Active-Passive (Warm Stand-by)
      5. Multi-Region Active-Active (Consensus / Global Txn)
      6. Periodic Immutable Backups + Point-in-Time Restore
      7. Chaos & Observability Layer
    4. Decision-Making Trees
      1. Structured Data
      2. Unstructured Data
      3. Semi-Structured Data
      4. How to Use These Trees