Data has become one of the most valuable assets over the years, and storing huge volumes of it in an optimised way is
one of the greatest challenges engineers face.
Today we are going to learn about seven database paradigms and which one you should use for
your use case.
Choosing the right database for the job is critical: we don't want to over-engineer or under-engineer the
project.
So let's get started...
In this type of database, data is stored as key-value pairs, just like a JSON object or a Python dictionary. Every key is unique and maps to one value. Redis, Memcached and AWS DynamoDB fall under this paradigm. Redis and Memcached keep their data in system memory instead of on disk, so it is not persistent by default (Redis can optionally persist to disk), but this also gives them very low operation latency. DynamoDB, by contrast, is a managed service that stores data durably on disk.
As the name suggests, this type of NoSQL database implements a hash table that stores unique keys along with pointers to the corresponding data values. Values can be scalar types such as integers and strings, or complex structures such as JSON, lists and BLOBs, and the key is used to reference the value. It typically offers excellent performance and can be optimized to fit an organization's needs. In their basic form, key-value stores have no query language, but they do provide a way to add, read and remove key-value pairs. Values cannot be queried or searched; lookups are by key only.
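The behaviour described above can be sketched with a plain Python dictionary, which is itself a hash table. This is only an illustration, not how Redis or any real store is implemented; the class name `KeyValueStore` and the key names are made up for the example.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a dict (a hash table)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Keys are unique: writing to an existing key overwrites its value.
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup by key is the only access path the store offers.
        return self._data.get(key, default)

    def delete(self, key):
        # Returns True if the key existed and was removed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:1:name", "Asha")
store.put("user:1:cart", ["book", "pen"])  # values can be complex structures
store.put("user:1:name", "Asha K.")        # same key, so the old value is replaced

print(store.get("user:1:name"))
print(store.get("user:2:name", "not found"))
```

Note that there is no method for "find all users whose cart contains a pen": answering that would mean scanning every value, which is exactly the limitation mentioned above.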
Fig 1. Key-Value database example
Let's list when to use these databases and when not to.
When to use
When not to use
The key-value database is easy to implement, but it limits the complexity of the data you can store and offers no way to query the values. Wide-column databases overcome this limitation. A wide-column database is like a key-value database with a second dimension added to it. At the outer layer, you have a keyspace that holds one or more column families, and each column family holds a set of ordered rows, making it possible to group related data together. Unlike a relational database, it doesn't require every row to follow the same fixed schema, so it can handle unstructured data. This is nice for developers because you get a query language, such as Cassandra's CQL, that is very similar to SQL, although much more limited: you can't do joins. In exchange, it is much easier to scale up. Unlike a typical SQL database, it is decentralized and can scale horizontally.
Cassandra and Apache HBase fall under this category.
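As a rough mental model of the keyspace, column family and row layering, here is a small sketch using nested dictionaries. The function names (`insert`, `read_row`, `scan`) and the sample data are hypothetical, and real systems such as Cassandra add partitioning, replication and on-disk storage that this toy ignores.

```python
# keyspace -> column family -> row key -> {column name: value}
keyspace = {}


def insert(column_family, row_key, columns):
    """Add or update columns on one row; rows need not share the same columns."""
    rows = keyspace.setdefault(column_family, {})
    rows.setdefault(row_key, {}).update(columns)


def read_row(column_family, row_key):
    """Fetch all columns of one row, or an empty dict if the row is absent."""
    return keyspace.get(column_family, {}).get(row_key, {})


def scan(column_family):
    """Return (row key, columns) pairs ordered by row key."""
    rows = keyspace.get(column_family, {})
    return sorted(rows.items(), key=lambda item: item[0])


insert("users", "u2", {"name": "Ben", "city": "Pune"})
insert("users", "u1", {"name": "Asha"})
insert("users", "u1", {"email": "asha@example.com"})  # adds a column to an existing row

print(read_row("users", "u1"))
print([row_key for row_key, _ in scan("users")])
```

The second dimension is the per-row set of columns: `u1` and `u2` live in the same column family but carry different columns.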
Fig 2. Wide-column database example
Let's list when to use these databases and when not to.
When to use
When not to use
In this paradigm we have documents, where each document is a container for key-value pairs. They are unstructured and don't require a schema. Documents are grouped into collections, which can be indexed and organized into a logical hierarchy, allowing us to model and retrieve relational data to a pretty significant degree.
They generally don't support joins, so instead of normalizing your data into a bunch of small parts, you're encouraged to embed the data into a single document. This creates a trade-off where reads from a front-end application are much faster, but writing or updating data tends to be more complex. These databases are far more general-purpose than the other options we've looked at so far. From a developer's perspective, they're very easy to use.
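The read/write trade-off of embedding can be illustrated with plain Python dictionaries standing in for documents. This is a sketch under simple assumptions: the `posts` collection, its field names and the helper functions are invented for the example and do not correspond to any particular database's API.

```python
# A toy "collection": a list of self-contained documents.
posts = [
    {
        "_id": 1,
        "title": "Choosing a database",
        "author": {"id": 7, "name": "Asha"},  # embedded instead of joined
        "comments": [
            {"author": {"id": 9, "name": "Ben"}, "text": "Nice overview"},
        ],
    },
    {
        "_id": 2,
        "title": "Indexing basics",
        "author": {"id": 9, "name": "Ben"},
        "comments": [],
    },
]


def find_post(post_id):
    """A single lookup returns everything needed to render the post."""
    return next((p for p in posts if p["_id"] == post_id), None)


def rename_user(user_id, new_name):
    """Updating embedded data means touching every copy of it."""
    touched = 0
    for post in posts:
        embedded = [post["author"]] + [c["author"] for c in post["comments"]]
        for author in embedded:
            if author["id"] == user_id:
                author["name"] = new_name
                touched += 1
    return touched


print(find_post(1)["title"])
print(rename_user(9, "Benjamin"))  # number of embedded copies that were rewritten
```

Reading a post needs no join because the author and comments travel with it, while renaming one user has to rewrite every document that embeds that user.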
Fig 3. Document database
Amazon DocumentDB, MongoDB, Cosmos DB, ArangoDB and CouchDB are some of the popular options in this paradigm.
Let's list when to use these databases and when not to.
When to use
When not to use
A relational database is a type of database that stores and provides access to data points that are related to one another. Relational databases are based on the relational model, an intuitive, straightforward way of representing data in tables. In a relational database, each row in the table is a record with a unique ID called the key. The columns of the table hold attributes of the data, and each record usually has a value for each attribute, making it easy to establish the relationships among data points.
Document databases fall short when it comes to holding relational data and joining it, whereas relational databases cover this limitation. Relationships are established by having a key in one table (the foreign key) reference the unique key (the primary key) of another table. A relationship can be one-to-one, one-to-many or many-to-many.
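A one-to-many relationship with a primary key and a foreign key can be tried out with Python's built-in `sqlite3` module. SQLite is used here only because it ships with Python; the `authors` and `books` tables are made-up examples, and other relational databases follow the same idea with slightly different SQL dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
    CREATE TABLE authors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE books (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)  -- foreign key
    );
""")

conn.executemany("INSERT INTO authors (id, name) VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ben")])
conn.executemany("INSERT INTO books (title, author_id) VALUES (?, ?)",
                 [("Graphs 101", 1), ("Indexes", 1), ("Caching", 2)])

# One author has many books: the join follows the foreign key back to the primary key.
rows = conn.execute("""
    SELECT authors.name, books.title
    FROM books
    JOIN authors ON books.author_id = authors.id
    ORDER BY authors.name, books.title
""").fetchall()
print(rows)

# A book pointing at a non-existent author is rejected by the database itself.
rejected = False
try:
    conn.execute("INSERT INTO books (title, author_id) VALUES (?, ?)", ("Orphan", 99))
except sqlite3.IntegrityError:
    rejected = True
print("orphan row rejected:", rejected)
```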
Fig 4. Relational database
MySQL, MariaDB, PostgreSQL, AWS RDS and SQL Server are some of the popular relational databases.
Let's list when to use these databases and when not to.
When to use
When not to use
Graph databases are purpose-built to store and navigate relationships. Relationships are first-class citizens in graph databases, and most of the value of graph databases is derived from these relationships. Graph databases use nodes to store data entities, and edges to store relationships between entities. An edge always has a start node, end node, type, and direction, and an edge can describe parent-child relationships, actions, ownership, and the like. There is no limit to the number and kind of relationships a node can have. A graph in a graph database can be traversed along specific edge types or across the entire graph. In graph databases, traversing the joins or relationships is very fast because the relationships between nodes are not calculated at query time but are persisted in the database.
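To make the node and edge model concrete, here is a minimal sketch of a directed graph with typed edges and a traversal that can be limited to one edge type. The `Graph` class, the edge types `FOLLOWS` and `OWNS`, and the sample nodes are all hypothetical; real graph databases such as Neo4j expose this through their own query languages rather than a class like this.

```python
from collections import defaultdict, deque


class Graph:
    def __init__(self):
        # start node -> list of (edge type, end node); edges are stored, not computed
        self._out = defaultdict(list)

    def add_edge(self, start, edge_type, end):
        """Store a directed, typed edge from start to end."""
        self._out[start].append((edge_type, end))

    def neighbours(self, node, edge_type=None):
        """Nodes one hop away, optionally following only one edge type."""
        return [end for etype, end in self._out.get(node, [])
                if edge_type is None or etype == edge_type]

    def reachable(self, start, edge_type=None):
        """Breadth-first traversal; returns nodes in the order they are discovered."""
        seen, queue, order = {start}, deque([start]), []
        while queue:
            node = queue.popleft()
            for nxt in self.neighbours(node, edge_type):
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
        return order


g = Graph()
g.add_edge("alice", "FOLLOWS", "bob")
g.add_edge("bob", "FOLLOWS", "carol")
g.add_edge("alice", "OWNS", "laptop")
g.add_edge("carol", "FOLLOWS", "alice")

print(g.reachable("alice", "FOLLOWS"))  # only along FOLLOWS edges
print(g.reachable("alice"))             # across the entire graph
```

Because each node keeps its outgoing edges, following a relationship is a direct lookup rather than a join computed at query time.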
Fig 5. Graph database
Neo4j and Dgraph fall under this category.
Let's list when to use these databases and when not to.
When to use
When not to use
A search-engine database is a type of nonrelational database that is dedicated to the search of data content. Search-engine databases use indexes to categorize the similar characteristics among data and facilitate search capability. Search-engine databases are optimized for dealing with data that may be long, semistructured, or unstructured, and they typically offer specialized methods such as full-text search, complex search expressions, and ranking of search results.
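The core idea of indexing text for search can be sketched as a tiny inverted index with a naive ranking. This is an illustration only: the `SearchIndex` class and the term-frequency scoring are assumptions made for the example, and engines such as Elasticsearch or Solr use far more sophisticated analysis and relevance scoring.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    def __init__(self):
        # term -> Counter mapping doc id to how often the term occurs in that doc
        self._postings = defaultdict(Counter)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self._postings[term][doc_id] += 1

    def search(self, query):
        """Return ids of docs matching any query term, highest score first."""
        scores = Counter()
        for term in tokenize(query):
            for doc_id, frequency in self._postings.get(term, {}).items():
                scores[doc_id] += frequency
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [doc_id for doc_id, _ in ranked]


index = SearchIndex()
index.add(1, "Redis is an in-memory key-value store")
index.add(2, "Graph databases store relationships; graph traversal is fast")
index.add(3, "Elasticsearch offers full-text search over a store of documents")

print(index.search("graph store"))
print(index.search("full-text"))
```

The index is built once when documents are added, so a query only looks up its own terms instead of scanning every document.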
Fig 6. Search Engine Database
Sphinx, Elasticsearch, Splunk and Solr are some of the popular options for this paradigm.
Let's list when to use these databases and when not to.
When to use
When not to use
There are a few different multi-model options out there, but the database we will focus on here is FaunaDB, which is very different from anything else we've looked at so far. With FaunaDB, you describe how you want to access your data using GraphQL.
Consider an example where we have a Todo model. If we upload our GraphQL schema into Fauna, it automatically creates collections where we can store data and an index to query that data. Behind the scenes, it figures out how to take advantage of multiple database paradigms, such as graph, relational and document, and determines how best to use them based on the GraphQL code you provided.
Fig 7. Multi-model database
On top of that, it's ACID-compliant, extremely fast, and you never have to worry about provisioning the actual infrastructure. Decide how to consume the data, and let the cloud figure everything else out.
There are many databases to choose from for the job, and choosing the right one is critical. I hope this article gives you a brief overview of all the paradigms and helps you choose wisely. You can even use multiple databases to take advantage of different paradigms.
©2023 MNJ Tech., All rights reserved