Minal Wad

NoSQL, Azure Cosmos DB and DocumentDB – Storage, Retrieval, and More

In my previous blog post, I introduced NoSQL. This post gives further insight into how data is stored and retrieved in NoSQL, using Microsoft’s DocumentDB as an example.

Here is a quick recap comparing NoSQL with traditional relational databases:

In relational databases, we first normalize data and save it in different tables. Then, when data is needed, we join these tables to retrieve information. As you would imagine, joining adds extra time and effort to each retrieval. On the other hand, data is not duplicated, so more information fits in the same amount of space. A relational database typically runs on a single physical server, and to scale up, more memory, processors, and storage need to be added to that server.

In NoSQL, data is not stored in related tables; instead, it is stored as individually wrapped pieces of information. Information can be stored as key-value pairs, columns, documents, or graphs. It is not rigid and does not have to follow a schema. Each piece of information has a unique id that distinguishes it from the others. The data structures used to store data in NoSQL are different from those in relational databases: for example, a key-value store uses a dictionary, and a document database uses JSON. Data retrieval is very fast because each piece of information carries all the data it needs, without having to locate other pieces of information. NoSQL can be scaled out as far as needed by adding more servers, rather than by upgrading a single one.
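To make the difference between these data structures concrete, here is a minimal Python sketch. The key format (`customer:1`), the `id` field, and the in-memory dictionaries are assumptions made for illustration; a real store would keep this data on a server.

```python
import json

# Key-value store: an opaque value looked up by a unique key,
# modeled here with a plain dictionary.
key_value_store = {
    "customer:1": "Minal Wad",
    "customer:2": "Roma Kole",
}

# Document store: each unique id maps to a self-contained JSON document.
document_store = {
    "1": json.dumps({"id": "1", "FirstName": "Minal", "LastName": "Wad"}),
    "2": json.dumps({"id": "2", "FirstName": "Roma", "LastName": "Kole"}),
}

# A single lookup by id returns everything stored about that customer;
# no other piece of information has to be located first.
customer = json.loads(document_store["2"])
print(key_value_store["customer:1"])
print(customer["FirstName"], customer["LastName"])
```

In both cases the lookup is a single step by id, which is where the fast retrieval described above comes from.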

What is Azure Cosmos DB?

Azure Cosmos DB is a globally distributed, multi-model database service that lets you elastically scale throughput and storage across any number of geographical regions, while offering low latency, high availability, and consistency. DocumentDB is part of Cosmos DB.

What is DocumentDB?

DocumentDB is Microsoft’s flavor of a nonrelational document database. The name may give the false impression that it is a collection of documents, such as a SharePoint Document Library, but it is far from that. You can think of it much like a traditional SQL database, in that it saves information which can later be retrieved. The only difference is that it is nonrelational and schema-free. DocumentDB stores data as “documents,” which are actually JSON objects.

As I mentioned earlier, DocumentDB is a NoSQL database that saves data as “documents”. A document, in this context, is data saved as a JSON object. JSON stands for JavaScript Object Notation, and it represents data as a collection of name-value pairs.

Let’s consider an example in a relational database and compare its storage with DocumentDB. Data in an RDBMS is structured in a tabular format with a fixed number of columns, and each piece of information is saved as a row. A relational database table has a fixed structure, so any change to it, such as adding a new column, changing a column to allow NULL values, or changing a data type, requires modifying the table’s schema.

FirstName	LastName	Gender
Minal	Wad	F
Sid	Atreya	M
Roma	Kole	Null

In a JSON representation of the same data, each row becomes a JSON object, and the table becomes a collection of JSON objects. In the third row above, Gender is Null; because it is an optional field, it can simply be left out of that document. This is in accordance with the schema-free behavior of DocumentDB.
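As a rough illustration of that mapping, the following Python sketch converts the rows of the table above into JSON documents. The helper name `row_to_document` is hypothetical, and dropping Null values in application code is one possible way to get the behavior described, not necessarily how DocumentDB itself handles it.

```python
import json

# Rows of the customer table above; None stands in for a Null cell.
columns = ["FirstName", "LastName", "Gender"]
rows = [
    ("Minal", "Wad", "F"),
    ("Sid", "Atreya", "M"),
    ("Roma", "Kole", None),
]


def row_to_document(row):
    """Turn one table row into a dict, leaving out columns whose value is None."""
    return {name: value for name, value in zip(columns, row) if value is not None}


# The table becomes a collection (here, a list) of JSON objects.
customers = [row_to_document(row) for row in rows]
print(json.dumps(customers, indent=2))
```

The third document ends up with only FirstName and LastName, while the first two also carry Gender.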

Some customers may have additional information such as Age which can simply be appended to the JSON without having to change the original definition of Customer object.

This means that different Customer objects may have different schemas and will still be valid customers. It may not be completely wrong to call them amorphous objects.
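Code that reads such documents has to tolerate fields that are present in one customer and absent in another. Here is a small Python sketch of that idea; the Age value and the `describe` helper are invented for illustration.

```python
# Two valid customers with different shapes: only the first one has Age.
customers = [
    {"FirstName": "Minal", "LastName": "Wad", "Gender": "F", "Age": 30},
    {"FirstName": "Roma", "LastName": "Kole"},
]


def describe(customer):
    """Build a one-line summary, skipping optional fields that are absent."""
    parts = [customer["FirstName"], customer["LastName"]]
    if "Age" in customer:
        parts.append(f"age {customer['Age']}")
    return " ".join(parts)


for customer in customers:
    print(describe(customer))
```

Adding Age to one document required no change to the other document or to any shared definition.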

Now, let’s see how related data is stored in DocumentDB. In a relational database, a customer’s address is saved in a separate table and linked to the customer through a foreign key, as shown below.

FirstName	LastName	Gender	AddressId (FK)
Minal	Wad	F	Null
Sid	Atreya	M	1
Roma	Kole	Null	2

AddressId	AddressLine1	State	Country	Zipcode
1	123 Land Rd	NJ	USA	33345
2	456 Sunset Blvd	VA	USA	22278
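To see what retrieving this data involves on the relational side, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The table names `Customer` and `Address` and the column types are assumptions; the rows are the ones from the two tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Address (
        AddressId INTEGER PRIMARY KEY,
        AddressLine1 TEXT, State TEXT, Country TEXT, Zipcode TEXT
    );
    CREATE TABLE Customer (
        FirstName TEXT, LastName TEXT, Gender TEXT,
        AddressId INTEGER REFERENCES Address(AddressId)
    );
    INSERT INTO Address VALUES
        (1, '123 Land Rd', 'NJ', 'USA', '33345'),
        (2, '456 Sunset Blvd', 'VA', 'USA', '22278');
    INSERT INTO Customer VALUES
        ('Minal', 'Wad', 'F', NULL),
        ('Sid', 'Atreya', 'M', 1),
        ('Roma', 'Kole', NULL, 2);
""")

# The address lives in another table, so reading it back needs a join.
# LEFT JOIN keeps customers that have no address at all.
rows = conn.execute("""
    SELECT c.FirstName, c.LastName, a.AddressLine1, a.State
    FROM Customer AS c
    LEFT JOIN Address AS a ON a.AddressId = c.AddressId
    ORDER BY c.FirstName
""").fetchall()

for row in rows:
    print(row)
```

Every read of a customer together with an address goes through this join, which is the extra retrieval step the relational model trades for avoiding duplication.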

In DocumentDB, the related address information is saved together with the customer information, as part of the same JSON document. It is also not uncommon in document databases to repeat some data so that each document has the data it needs without having to locate other documents.
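The following Python sketch shows one way the customer and address rows above could look once the address is embedded. The property name `Address` is an assumption; any name would work, since there is no schema to satisfy.

```python
import json

# Each customer document carries its own address as a nested object.
# Minal has no address, so the property is simply absent.
customers = [
    {"FirstName": "Minal", "LastName": "Wad", "Gender": "F"},
    {
        "FirstName": "Sid", "LastName": "Atreya", "Gender": "M",
        "Address": {"AddressLine1": "123 Land Rd", "State": "NJ",
                    "Country": "USA", "Zipcode": "33345"},
    },
    {
        "FirstName": "Roma", "LastName": "Kole",
        "Address": {"AddressLine1": "456 Sunset Blvd", "State": "VA",
                    "Country": "USA", "Zipcode": "22278"},
    },
]

print(json.dumps(customers, indent=2))

# Reading a customer's state needs no second lookup.
print(customers[1]["Address"]["State"])
```

The AddressId foreign key disappears entirely, because nothing needs to point at another record.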

Even multiple addresses can be saved without breaking the schema, because there is no schema ☺
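As a small illustration, a customer document can hold a list of addresses instead of a single one. In this Python sketch the property name `Addresses` is an assumption, and the second address reuses a row from the address table above purely as sample data.

```python
# A customer document whose addresses are stored as a JSON array.
customer = {
    "FirstName": "Sid",
    "LastName": "Atreya",
    "Gender": "M",
    "Addresses": [
        {"AddressLine1": "123 Land Rd", "State": "NJ",
         "Country": "USA", "Zipcode": "33345"},
    ],
}

# Adding another address only changes this one document.
customer["Addresses"].append(
    {"AddressLine1": "456 Sunset Blvd", "State": "VA",
     "Country": "USA", "Zipcode": "22278"}
)

states = [address["State"] for address in customer["Addresses"]]
print(states)
```

In a relational design, the same change would typically mean restructuring the foreign key relation between the two tables.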

Querying DocumentDB

One of the best features of DocumentDB is that its native query language is very similar to SQL. Let’s build our first query.

SELECT * FROM Customers

Here, Customers is the alias for the active collection.
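For readers who want to run such a query from code, here is a hedged Python sketch. It assumes the `container` argument is a container client from the `azure-cosmos` Python package, whose `query_items` method accepts a query string and a list of parameters. The WHERE clause, the `@lastName` parameter, and the function name are illustrative additions, and `FakeContainer` is only a stand-in that ignores the query text so the example runs without an Azure account.

```python
def find_customers_by_last_name(container, last_name):
    """Run a parameterized query and return the matching documents as a list."""
    items = container.query_items(
        query="SELECT * FROM Customers c WHERE c.LastName = @lastName",
        parameters=[{"name": "@lastName", "value": last_name}],
        enable_cross_partition_query=True,
    )
    return list(items)


class FakeContainer:
    """Stand-in for a real container; it filters in memory instead of parsing SQL."""

    def __init__(self, documents):
        self.documents = documents

    def query_items(self, query, parameters=None, enable_cross_partition_query=None):
        wanted = parameters[0]["value"]
        return (doc for doc in self.documents if doc.get("LastName") == wanted)


sample = [
    {"FirstName": "Minal", "LastName": "Wad", "Gender": "F"},
    {"FirstName": "Roma", "LastName": "Kole"},
]
matches = find_customers_by_last_name(FakeContainer(sample), "Wad")
print(matches)
```

Passing the value as a parameter rather than concatenating it into the query string follows the same practice used with SQL databases.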

In my next blog post, I will demonstrate how to create your first DocumentDB database in Cosmos DB.
