How are graphs distributed with Azure Cosmos DB’s Gremlin API, and how can we use that information while designing a scale-out graph application?

In this article, we will look at one such graph database platform, Azure Cosmos DB’s Gremlin API, a distributed, geo-replicated, self-managed graph database, and at how understanding its operational philosophy can help us make informed design choices while building a scalable graph application.

What we look to cover: We will start with an overview of Cosmos DB partitioned containers, followed by how graphs are distributed and how that distribution affects the cost of various graph operations. Finally, we will look at some of the important design considerations curated from a set of practical use-cases.

What we will not attempt to cover: For the…


Azure Cosmos DB’s Graph API supports turn-key geo-distribution, which means that at the click of a button, one can replicate the entire graph to another Azure region. Moreover, because Azure Cosmos DB is a ring-zero service, one can pick a region from a large and ever-growing number of Azure data centers. In a world where “apps” are becoming increasingly globalized, geo-replication provides excellent benefits such as:

  1. Read scalability by distributing the reads to a number of read regions.
  2. Low latency by directing the users to the nearest read region.
  3. Easy roll-out of a product/service to a new geographic user base, with the local…


Multiple writes via the Gremlin API often need to be executed as an atomic unit. One common example is adding a new vertex together with an edge that connects it, in either direction, to an existing vertex.

To make the example more explicit, let’s say that the task at hand is: add a ‘tweet vertex’ and create an ‘edge’ between the tweet and the ‘user vertex’ of the user who created it. The application semantics here are that a Tweet vertex must always be connected to the user that created it.

The standard way to write this in Gremlin…
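As an illustration of the task described above (not necessarily the query the article goes on to show), here is a minimal Python sketch that assembles one Gremlin traversal creating both the tweet vertex and the edge from its author. The function name `build_add_tweet_query`, the edge label `created`, the property names `pk` and `text`, and the binding names are all hypothetical. Passing values as bindings assumes the server accepts parameterized scripts, and whether a single traversal is applied atomically depends on the database; neither is verified here.

```python
from typing import Any, Dict, Tuple


def build_add_tweet_query(
    user_id: str, tweet_id: str, text: str, partition_key: str
) -> Tuple[str, Dict[str, Any]]:
    """Return one Gremlin traversal (as a script string) plus its bindings.

    The traversal looks up the existing user vertex, adds the tweet vertex,
    and then adds an edge from the user to the new tweet, all in one request.
    Names such as 'pk' and 'created' are illustrative assumptions.
    """
    query = (
        "g.V(userId).as('u')"            # existing user vertex, labelled 'u'
        ".addV('tweet')"                 # new tweet vertex
        ".property('id', tweetId)"
        ".property('pk', pk)"            # hypothetical partition key property
        ".property('text', text)"
        ".addE('created').from('u')"     # edge: user -> tweet
    )
    bindings = {
        "userId": user_id,
        "tweetId": tweet_id,
        "pk": partition_key,
        "text": text,
    }
    return query, bindings


query, bindings = build_add_tweet_query("user-1", "tweet-42", "hello graph", "user-1")
print(query)
print(bindings)
```

With a driver such as gremlinpython, the pair could then be sent as `client.submit(query, bindings)`. If the user vertex does not exist, `g.V(userId)` yields nothing and the rest of the traversal does not run, so no orphan tweet vertex is created by this query.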


Why is pagination a hard problem for TinkerPop graph databases? What can we do as application developers? What else can we achieve as a by-product?

If you have looked for efficient ways to paginate the results of your Gremlin queries, you may have stumbled upon this post on Stack Overflow, perhaps only to realize that pagination support is a difficult problem to solve for most TinkerPop-enabled graph databases. Here, I am referring to the response by Stephen, who is one of the most prolific contributors to the graph technology community. While there doesn’t appear to be any silver bullet for the problem, we as application developers can do much better by shouldering some of the responsibilities ourselves.

TLDR: Supporting efficient pagination on generic…


Gremlin is one of the most popular query languages for exploring and analyzing data modeled as property graphs. There are many graph-database vendors out there that support Gremlin as their query language, but in this article, we will focus on Azure Cosmos DB which is one of the world’s first self-managed, geo-distributed, multi-master capable graph databases.

To set expectations, this article is not aimed at teaching Gremlin; rather, it should be seen as a self-help article. …


Prerequisites: Cosmos DB Partitioning, Bulk Executor Library Overview, Sample: Bulk Importing Documents, Sample: Bulk Importing Graphs

Before addressing the primary topic, it is worth mentioning that bulk-loading data into a distributed, fault-tolerant, auto-scaled, and auto-indexed database poses a completely different set of challenges than uploading data to a centralized database. The fact that each ingested data point needs to be stored, 4-way replicated, indexed, and automatically load-balanced makes it a lot more complicated than loading data into centralized systems that defer index builds. …

Jayanta Mondal

These opinions are my own and not the views of my employer (Microsoft).
