Diving deep into the Cosmos DB resource model

First, we must understand the Cosmos DB resource model, which is used by all supported NoSQL data models and some APIs. When we provision a new Cosmos DB account, we are given a URI that acts as the endpoint for the account and allows clients to establish a connection. When we provision the account, we must also select the API that we want to use. Among other things that we will learn about later, this selection determines the type of NoSQL database that we will be creating. The following list shows the available APIs with the names used in the Azure portal and the type of NoSQL database that each of them creates:

  • SQL: Document
  • MongoDB: Document
  • Cassandra: Wide-column
  • Azure Table: Key/value
  • Gremlin (graph): Graph

Once we have an account provisioned, we can create a new database that will use the API that was selected for the account. An account can have many databases of the same NoSQL type that use the same API.
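To make this account-level rule concrete, here is a minimal Python sketch of the relationship between an account, its API, and its databases. It only models the rule locally. The class names `CosmosAccount` and `Database`, the `create_database` method, the placeholder URI, and the database names are all hypothetical and are not part of any Cosmos DB SDK.

```python
from dataclasses import dataclass, field

# API names as shown in the Azure portal, mapped to the NoSQL database type.
API_TO_DATABASE_TYPE = {
    "SQL": "Document",
    "MongoDB": "Document",
    "Cassandra": "Wide-column",
    "Azure Table": "Key/value",
    "Gremlin (graph)": "Graph",
}


@dataclass(frozen=True)
class Database:
    """Hypothetical stand-in for a database; it inherits the account's API."""
    name: str
    api: str
    database_type: str


@dataclass
class CosmosAccount:
    """Hypothetical local model of an account; this is not an SDK class."""
    uri: str
    api: str  # selected once, when the account is provisioned
    databases: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.api not in API_TO_DATABASE_TYPE:
            raise ValueError(f"Unknown API: {self.api!r}")

    @property
    def database_type(self) -> str:
        return API_TO_DATABASE_TYPE[self.api]

    def create_database(self, name: str) -> Database:
        # Every database uses the API that was selected for the account.
        database = Database(name, self.api, self.database_type)
        self.databases[name] = database
        return database


# Placeholder URI and database names, used for illustration only.
account = CosmosAccount("https://example-account.documents.azure.com:443/", "SQL")
account.create_database("Competition")
account.create_database("Inventory")
print({db.name: db.database_type for db in account.databases.values()})
```

Because the API is stored on the account rather than on each database, the sketch cannot produce an account that mixes NoSQL database types, which mirrors the rule described above.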

The following diagram shows the generalized hierarchy of elements that belong to a Cosmos DB account:

Each database has a set of containers, and the name used for a container depends on the NoSQL database type and API. In fact, the NoSQL database type also determines how the containers are projected onto the underlying data storage. The following list specifies the container name for each NoSQL database type:

  • Document: Collection
  • Graph: Graph
  • Key/value: Table
  • Wide-column: Table

For example, when we work with a document database with either the SQL API or the MongoDB API, we will organize documents into containers known as collections. Whenever we create a new collection, we are able to provision the desired throughput, which we can then scale up or down on demand. We will also be able to specify a hint for how we want to distribute the data on the underlying partition sets. We will analyze each of these topics in detail later, as we want to stay focused on the Cosmos DB resource model for now.
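As a rough intuition for how a distribution hint can spread data, the following Python sketch hashes a partition key value to pick one of a fixed number of partitions. This is an illustrative assumption, not the mechanism Cosmos DB uses internally: the hash function, the `PARTITION_COUNT` value, the `partition_for` helper, and the `country` key are all hypothetical.

```python
import hashlib
from collections import defaultdict

# Assumption for illustration only; the service manages the real partitions.
PARTITION_COUNT = 4


def partition_for(partition_key_value: str,
                  partition_count: int = PARTITION_COUNT) -> int:
    """Map a partition key value to a partition index with a stable hash."""
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# Hypothetical documents that use "country" as the partition key.
documents = [
    {"id": "1", "country": "Spain"},
    {"id": "2", "country": "Japan"},
    {"id": "3", "country": "Spain"},
    {"id": "4", "country": "Brazil"},
]

partitions = defaultdict(list)
for document in documents:
    partitions[partition_for(document["country"])].append(document["id"])

# Documents that share a partition key value always land together.
print(dict(partitions))
```

The point of the sketch is only that documents with the same partition key value end up in the same place, while different values can be spread across partitions.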

Once we have a container provisioned, we can start storing data on it. One of the latest enhancements added in 2018 for this database service was the introduction of a multi-master capability. When we enable this feature, Cosmos DB allows us to write to our Cosmos DB containers in multiple regions at the same time with a latency of less than 10 milliseconds at the 99th percentile when we consume the Cosmos DB service within the Azure network. The multi-master feature makes it possible to use the provisioned throughput for databases and containers in all the available regions.

Each container will have a set of items whose names will be different based on the NoSQL database type and API. As is the case with the containers, based on the NoSQL database type, the items will be projected in a different way to the underlying data storage. The following list specifies the item name for each NoSQL database type:

  • Document: Documents
  • Graph: Vertexes and edges
  • Key/value: Entities
  • Wide-column: Rows
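The container and item names listed so far can be combined into a single lookup table. The following Python sketch does that; the `ResourceNames` type and the `describe` helper are hypothetical names introduced only for this illustration.

```python
from typing import NamedTuple


class ResourceNames(NamedTuple):
    """Names used for containers and items in one NoSQL database type."""
    container: str
    item: str


# Container and item names for each NoSQL database type, as listed above.
RESOURCE_NAMES = {
    "Document": ResourceNames("Collection", "Documents"),
    "Graph": ResourceNames("Graph", "Vertexes and edges"),
    "Key/value": ResourceNames("Table", "Entities"),
    "Wide-column": ResourceNames("Table", "Rows"),
}


def describe(database_type: str) -> str:
    """Return the generalized hierarchy with the type-specific names."""
    names = RESOURCE_NAMES[database_type]
    return f"{database_type}: database -> {names.container} -> {names.item}"


for database_type in RESOURCE_NAMES:
    print(describe(database_type))
```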

The following diagram shows the generalized hierarchy of elements that belong to a Cosmos DB account with the appropriate names based on the NoSQL database type on the right-hand side:

In addition, there are other container-level resources for server-side programmability that enable multi-record transactions within the partition key. We can write these resources in JavaScript (ECMAScript 2015):

  • Stored procedures
  • Triggers
  • User-defined functions, also known as UDFs

When we work with document databases, stored procedures allow us to operate on any document in the collection in which the stored procedure is defined.

We can write triggers that will be executed when specific operations are performed on a document. We can define pre-triggers, which are executed before the operation is performed; and post-triggers, which are executed after the operation is performed.
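The ordering of pre-triggers and post-triggers can be illustrated with a toy in-memory simulation. The sketch below is written in Python purely to show the sequence of events; it is not Cosmos DB code. The `ToyCollection` class, its `create_document` method, and both trigger functions are hypothetical.

```python
from typing import Callable, Dict, List

Document = Dict[str, object]
Trigger = Callable[[Document], None]


class ToyCollection:
    """Hypothetical in-memory collection that only models trigger ordering."""

    def __init__(self, pre_triggers: List[Trigger],
                 post_triggers: List[Trigger]) -> None:
        self.pre_triggers = pre_triggers
        self.post_triggers = post_triggers
        self.documents: Dict[str, Document] = {}

    def create_document(self, document: Document) -> None:
        for trigger in self.pre_triggers:   # executed before the operation
            trigger(document)
        self.documents[str(document["id"])] = document  # the operation itself
        for trigger in self.post_triggers:  # executed after the operation
            trigger(document)


log: List[str] = []


def stamp_created_by(document: Document) -> None:
    # Pre-trigger: can adjust the document before it is stored.
    document.setdefault("createdBy", "pre-trigger")
    log.append("pre")


def record_insert(document: Document) -> None:
    # Post-trigger: observes the document after it has been stored.
    log.append(f"post:{document['id']}")


collection = ToyCollection([stamp_created_by], [record_insert])
collection.create_document({"id": "1", "title": "Example"})
print(log)
print(collection.documents["1"])
```

The sketch runs every trigger on every call to keep the ordering visible. How triggers are registered and invoked in the real service is outside its scope.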

We can declare user-defined functions to extend the Cosmos DB query language's grammar and provide functions that implement custom business logic.

If a version conflict occurs on a resource for any operation, the conflicting resource will be persisted in a conflict feed within the container.

The following diagram shows the generalized container-level resources that belong to any Cosmos DB container:

The following diagram shows the generalized collection-level resources that belong to a Cosmos DB collection for a document database that uses either the SQL API or the MongoDB API:

The following diagram illustrates the way Cosmos DB projects the data stored in the atom-record-sequence (ARS) format to the appropriate individual item for the different supported NoSQL database types and APIs:

It is very important to understand the Cosmos DB resource model and the names used to identify each element, because we will be working with its different components throughout this book and in its examples.