MNJ Tech.

7 Database Paradigms: Which to Use and When

Data has become one of the most valuable assets over the years, and storing huge volumes of it in an optimised way is one of the greatest challenges engineers face. Today we are going to learn about 7 different database paradigms and which one you should use for your use case. Choosing the right database for the job is critical: we don't want to over- or under-engineer the project.

So let's get started...

1. Key-Value Database

In this type of database, data is stored as key-value pairs, just like a JSON object or a Python dictionary. Every key is unique and has a value assigned to it. Redis, Memcached and AWS DynamoDB fall under this paradigm. Stores such as Redis and Memcached keep their data in system memory instead of on disk, so they are not persistent by default (DynamoDB, by contrast, persists data durably). Because the data lives in memory, database operations have very low latency.

As the name suggests, this type of NoSQL database implements a hash table that stores unique keys along with pointers to the corresponding data values. The values can be scalar data types such as integers and strings, or complex structures such as JSON, lists and BLOBs. It typically offers excellent performance and can be optimized to fit an organization's needs. Key-value stores generally have no query language, but they do provide a way to add and remove key-value pairs. Values cannot be queried or searched; only the key can be looked up.

Fig 1. Key-Value database example
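The hash-table idea described above can be sketched in a few lines of Python. This is only an illustration, not how Redis or Memcached are implemented: the `KeyValueStore` class, its method names and the sample keys are all hypothetical.

```python
class KeyValueStore:
    """Hypothetical in-memory key-value store backed by a hash table (a dict)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Insert or overwrite the value stored under a unique key.
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup is by key only; there is no way to search by value.
        return self._data.get(key, default)

    def delete(self, key):
        # Remove the pair if present and report whether anything was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42:cart", ["book", "pen"])  # a value can be a list
store.put("page:home:views", 1024)          # or a scalar

cart = store.get("user:42:cart")
missing = store.get("user:99:cart")
removed = store.delete("page:home:views")
```

Note that the store exposes nothing like "find all carts containing a pen": answering that would require scanning every value, which is exactly the limitation described above.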

Let's list when to use these databases and when not to.

When to use

When your project requires low-latency data manipulation
When some data loss is acceptable. Although these databases offer strategies to restore data on restart, some data may still be lost.
When the application has simple data models
Example: message queues, game leaderboards, shopping carts, socket connections.

When not to use

When your project has complex data models
When your project requires querying and searching values
When data persistence is the top priority

2. Wide-Column Database

The key-value database is easy to implement, but it limits the complexity of the data that can be stored and the ways values can be queried. Wide-column databases overcome this limitation. A wide-column database is like a key-value database with a second dimension added to it. At the outer layer, you have a keyspace that holds one or more column families, and each column family holds a set of ordered rows, making it possible to group related data together. Unlike a relational database, it doesn't force every row to have the same fixed set of columns, so it can also handle unstructured data. This is nice for developers because, with Cassandra, you get a query language called CQL that's very similar to SQL, although much more limited: you can't do joins. In exchange, it's much easier to scale up. Unlike a typical SQL database, it's decentralized and can scale horizontally.

Cassandra and Apache HBase fall under this category.

Fig 2. Wide-column database example
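The keyspace, column family and ordered-row layout can be modelled with nested dictionaries. This is a rough sketch under simplifying assumptions (everything in memory on a single node); the `write` and `read_rows` helpers and the sensor data are made up for illustration and are not the Cassandra or HBase API.

```python
# keyspace -> column family -> row key -> {column name: value}
# (hypothetical in-memory model; names are illustrative only)
keyspace = {}


def write(column_family, row_key, columns):
    # Rows in the same family may carry different columns (sparse rows).
    family = keyspace.setdefault(column_family, {})
    family.setdefault(row_key, {}).update(columns)


def read_rows(column_family, start_key, end_key):
    # Rows are treated as ordered by key, so a key range can be scanned in order.
    family = keyspace.get(column_family, {})
    return [(key, family[key]) for key in sorted(family) if start_key <= key <= end_key]


write("sensor_readings", "dev1#2024-01-01", {"temp": 21.5})
write("sensor_readings", "dev1#2024-01-02", {"temp": 22.0, "humidity": 40})
write("sensor_readings", "dev2#2024-01-01", {"temp": 19.0})

# Range scan over every row whose key starts with "dev1#".
dev1_rows = read_rows("sensor_readings", "dev1#", "dev1#~")
```

The second row carries a `humidity` column that the others lack, which is the "second dimension" in action: related rows are grouped by family and key, but each row decides which columns it stores.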

Let's list when to use these databases and when not to.

When to use

When your project has many writes and comparatively few reads and updates
When you have high scalability requirements, because these databases scale horizontally easily
When the data is unstructured
When a basic query language is enough to fetch data
Example: data warehousing, OLAP (Online Analytical Processing), real-time analysis, big data, IoT, etc.

When not to use

When your project has complex data models
When your project requires complex queries to fetch data
When your project has frequent updates and reads

3. Document Database

In this paradigm we have documents, where each document is a container for key-value pairs. They are unstructured and don't require a schema. Documents are grouped into collections. A collection can be indexed and organized into a logical hierarchy, allowing us to model and retrieve relational data to a pretty significant degree.

They typically don't support joins, so instead of normalizing your data into a bunch of small parts, you're encouraged to embed the data into a single document. This creates a trade-off where reads from a front-end application are much faster, but writing or updating data tends to be more complex. These databases are far more general-purpose than the other options we've looked at so far. From a developer's perspective, they're very easy to use.

Fig 3. Document database

Amazon DocumentDB, MongoDB, Cosmos DB, ArangoDB and CouchDB are some of the popular options in this paradigm.
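The embed-instead-of-join trade-off can be illustrated with plain Python dictionaries standing in for a collection. This is a sketch, not the API of MongoDB or any other product; the `orders_collection` data and both helper functions are hypothetical.

```python
# A hypothetical "orders" collection: each document embeds the customer and
# the line items instead of referencing rows in other tables.
orders_collection = {
    "o1": {
        "customer": {"id": "c1", "name": "Asha"},
        "items": [
            {"sku": "book", "qty": 1},
            {"sku": "pen", "qty": 3},
        ],
    },
    "o2": {
        "customer": {"id": "c1", "name": "Asha"},
        "items": [{"sku": "lamp", "qty": 2}],
    },
}


def total_quantity(order_id):
    # Fast read: one lookup by id returns everything, with no join needed.
    document = orders_collection[order_id]
    return sum(item["qty"] for item in document["items"])


def rename_customer(customer_id, new_name):
    # Slower write: the embedded copy must be updated in every document.
    updated = 0
    for document in orders_collection.values():
        if document["customer"]["id"] == customer_id:
            document["customer"]["name"] = new_name
            updated += 1
    return updated


quantity = total_quantity("o1")
touched = rename_customer("c1", "Asha K")
```

Reading an order touches one document, while renaming a customer touches every order that embeds them, which is why updates tend to be the more complex side of the trade-off.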

Let's list when to use these databases and when not to.

When to use

When you are not sure about the structure of the data to be stored
When you have little relational data
When the data is nested
Example: content management, catalogs, analytics and many more

When not to use

When the data has complex relations with other data
When you require frequent join operations
When atomicity is a priority

4. Relational Database

A relational database is a type of database that stores and provides access to data points that are related to one another. Relational databases are based on the relational model, an intuitive, straightforward way of representing data in tables. In a relational database, each row in the table is a record with a unique ID called the key. The columns of the table hold attributes of the data, and each record usually has a value for each attribute, making it easy to establish the relationships among data points.

Document databases fall short on holding relational data and joining it, whereas relational databases cover this limitation. Relationships are established by a column in one table (known as the foreign key) that references the unique key (known as the primary key) of another table. This relationship can be one-to-one, one-to-many or many-to-many.

Fig 4. Relational database

MySQL, MariaDB, PostgreSQL, AWS RDS and SQL Server are some of the popular relational databases.
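A small example may make primary keys, foreign keys and joins concrete. The sketch below uses SQLite through Python's built-in `sqlite3` module purely because it needs no server; the `authors` and `posts` tables and their contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE authors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    );
""")

conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Asha')")
conn.executemany(
    "INSERT INTO posts (title, author_id) VALUES (?, ?)",
    [("Intro to SQL", 1), ("Joins explained", 1)],
)

# One-to-many: a join follows the foreign key back to the primary key.
rows = conn.execute("""
    SELECT authors.name, posts.title
    FROM posts
    JOIN authors ON posts.author_id = authors.id
    ORDER BY posts.id
""").fetchall()

# A post pointing at a non-existent author violates the foreign key.
try:
    conn.execute("INSERT INTO posts (title, author_id) VALUES ('Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The database itself refuses the orphaned post, so the relationship stays consistent without any application-side checks.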

Let's list when to use these databases and when not to.

When to use

When the data to be stored has a well-defined structure
When stored data is linked with each other
When you need complex queries to retrieve data
When you want an ACID-compliant database
Example: social apps, financial and analytics data, etc.

When not to use

When the structure of the data is not fixed
When the database schema requires frequent migrations
When the project has to scale horizontally to a very large size

5. Graph Database

Graph databases are purpose-built to store and navigate relationships. Relationships are first-class citizens in graph databases, and most of the value of graph databases is derived from these relationships. Graph databases use nodes to store data entities, and edges to store relationships between entities. An edge always has a start node, end node, type, and direction, and an edge can describe parent-child relationships, actions, ownership, and the like. There is no limit to the number and kind of relationships a node can have. A graph in a graph database can be traversed along specific edge types or across the entire graph. In graph databases, traversing the joins or relationships is very fast because the relationships between nodes are not calculated at query time but are persisted in the database.

Fig 5. Graph database

Neo4j and Dgraph fall under this category.
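The node-and-edge model, and traversal along one edge type, can be sketched with an adjacency index in Python. This is an illustrative toy, not how Neo4j or Dgraph store data; the edge list and the `reachable` function are hypothetical.

```python
from collections import deque

# Each edge has a start node, a type, and an end node (direction: start -> end).
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("alice", "OWNS", "laptop"),
    ("carol", "FOLLOWS", "dave"),
]

# Store the relationships up front so that a traversal is a lookup,
# not a join computed at query time.
adjacency = {}
for start, edge_type, end in edges:
    adjacency.setdefault((start, edge_type), []).append(end)


def reachable(start, edge_type, max_hops):
    """Breadth-first traversal along a single edge type, up to max_hops away."""
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get((node, edge_type), []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, hops + 1))
    return found


friends_of_friends = reachable("alice", "FOLLOWS", 2)
belongings = reachable("alice", "OWNS", 1)
```

Each hop is a dictionary lookup keyed by node and edge type, so the cost grows with the part of the graph actually visited rather than with the total size of the data.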

Let's list when to use these databases and when not to.

When to use

When stored data is linked with each other
When the project requires frequent queries on relations
Example: social apps, recommendation engines, fraud detection, etc.

When not to use

When your project requires a standardized query language
When you require a transaction-based system
When you need a large community: graph databases have a smaller user base, which makes support harder to find

6. Search Engine Database

A search-engine database is a type of nonrelational database that is dedicated to the search of data content. Search-engine databases use indexes to categorize the similar characteristics among data and facilitate search capability. Search-engine databases are optimized for dealing with data that may be long, semistructured, or unstructured, and they typically offer specialized methods such as full-text search, complex search expressions, and ranking of search results.

Fig 6. Search Engine Database

Sphinx, Elasticsearch, Splunk and Solr are some of the popular options for this paradigm.
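The index-based approach described above is usually built around an inverted index, which maps each term to the documents containing it. The Python sketch below is a deliberately simplified assumption of how that works (plain term counts for ranking, no stemming or relevance tuning); the sample documents and the `tokenize` and `search` helpers are hypothetical.

```python
import re
from collections import Counter, defaultdict

documents = {
    1: "Redis is an in-memory key-value database",
    2: "PostgreSQL is a relational database with full-text search",
    3: "Elasticsearch is a search engine built for full-text search",
}


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


# Inverted index: term -> {document id: number of occurrences}
index = defaultdict(Counter)
for doc_id, text in documents.items():
    for term in tokenize(text):
        index[term][doc_id] += 1


def search(query):
    # Score each document by how often the query terms occur in it,
    # then return ids from highest to lowest score (ties broken by id).
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [doc_id for doc_id, _ in ranked]


results = search("full-text search")
```

Because the index is built ahead of time, a query only touches the entries for its own terms instead of scanning every document, which is also why heavy write traffic is costly: every write has to update the index.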

Let's list when to use these databases and when not to.

When to use

When the project needs performant text-based search
When data can be structured or unstructured
Example: text-based search engines

When not to use

When data needs to be updated or written frequently
When you need anything other than search: these databases are not meant to be a general-purpose data store

7. Multi Model Database

There are a few different options out there, but the database we will focus on here is FaunaDB, which is very different from anything else we've looked at so far. With FaunaDB you describe how you want to access your data using GraphQL.

Consider an example where we have a Todo model. If we upload our GraphQL schema into Fauna, it automatically creates collections where we can store data and an index to query that data. Behind the scenes, it figures out how to take advantage of multiple database paradigms, such as graph, relational and document, and determines how best to use them based on the GraphQL code you provided.

Fig 7. Multi Model Database

On top of that, it's ACID compliant, extremely fast, and you never have to worry about provisioning the actual infrastructure. Decide how to consume the data, and let the cloud figure everything else out.

Conclusion

There are many databases to choose from, and choosing the right one for the job is critical. I hope this article gives you a brief overview of all the paradigms and helps you choose wisely. You can even use multiple databases together to take advantage of different paradigms.