We can achieve multi-tenancy by partitioning at either the database level or the collection level, depending on the size of each tenant's data. Microsoft Azure's recommendation is that when tenants are small and numerous, we should store data for multiple tenants in the same collection to reduce the overall resources the application needs. We can identify a tenant by a property in each document and issue filter queries to retrieve tenant-specific data. We can also use Document DB's users and permissions to isolate tenant data and restrict access at the resource level via authorization keys.
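To make the "property in the document plus filter query" idea concrete, here is a minimal sketch. The property name `tenantId`, the sample documents, and both helper functions are assumptions made for illustration; the query spec follows the parameterized SQL shape that Document DB accepts, but nothing here talks to a real account.

```python
from typing import Any, Dict, List


def tenant_query(tenant_id: str) -> Dict[str, Any]:
    """Build a parameterized query spec that selects one tenant's documents."""
    return {
        "query": "SELECT * FROM c WHERE c.tenantId = @tenantId",
        "parameters": [{"name": "@tenantId", "value": tenant_id}],
    }


def filter_by_tenant(docs: List[Dict[str, Any]], tenant_id: str) -> List[Dict[str, Any]]:
    """In-memory equivalent of the query above, useful for local tests."""
    return [d for d in docs if d.get("tenantId") == tenant_id]


# Hypothetical documents from two tenants sharing one collection.
docs = [
    {"id": "1", "tenantId": "contoso", "total": 10},
    {"id": "2", "tenantId": "fabrikam", "total": 25},
    {"id": "3", "tenantId": "contoso", "total": 7},
]
contoso_docs = filter_by_tenant(docs, "contoso")
```

Passing the tenant id as a parameter rather than concatenating it into the query text keeps one tenant's input from changing the shape of the query.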
In cases where tenants are larger, need dedicated resources, or require further isolation, we can allocate dedicated collections or databases for tenants. In either case, Document DB will handle a significant portion of the operational burden for us as we scale out our application's data store.
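The choice between shared and dedicated resources can be written down as a small rule. This is only a sketch: the `Tenant` fields, the 10 GB threshold, and the returned labels are hypothetical and should be replaced by whatever limits and naming apply to your account.

```python
from dataclasses import dataclass

# Hypothetical threshold; tune it to the collection storage limit that applies to you.
SHARED_COLLECTION_MAX_GB = 10.0


@dataclass
class Tenant:
    tenant_id: str
    data_size_gb: float
    needs_dedicated_throughput: bool = False
    needs_strict_isolation: bool = False


def choose_placement(tenant: Tenant) -> str:
    """Return where a tenant's data should live, from most to least isolated."""
    if tenant.needs_strict_isolation:
        return "dedicated-database"
    if tenant.needs_dedicated_throughput or tenant.data_size_gb > SHARED_COLLECTION_MAX_GB:
        return "dedicated-collection"
    return "shared-collection"


small = choose_placement(Tenant("contoso", data_size_gb=0.5))
large = choose_placement(Tenant("fabrikam", data_size_gb=40.0))
```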
Approaches to achieve multi-tenancy:
There are several approaches to achieving multi-tenancy in an application:
1. Single database with one collection: In this approach, we have one database for the application and create a single collection under it, which is the container that stores all the JSON documents. Because a single collection has storage and throughput limits, we need to partition it. Using the data-set id as the partition key gives us that partitioning and, with it, multi-tenancy. Security can also be enforced at the Document DB level by creating a user for each tenant, assigning that user permissions on the tenant's data-set, and querying the tenant's data-set via the user's authorization key.
   - Pros: The major benefits of storing tenant data in a single collection are reduced complexity, transactional support across application data, and the lowest storage cost.
   - Cons: A single collection cannot hold a large amount of data under this model, and once the collection reaches its storage or throughput limit, Document DB will throttle further requests.
2. Single database with multiple collections: In this approach, we have one database for the application and create multiple collections under it, one per tenant id. We can then partition the data within each collection by data-set id. Fortunately, we do not need to set a limit on the size of the database or on the number of collections; Document DB lets us add collections and capacity dynamically as the application grows.
   - Pros: The major benefits of storing tenant data in multiple collections are increased resource capacity and higher throughput. Moreover, a tenant that needs dedicated throughput can be placed in its own collection, with access controlled by the permissions given to the user on each resource (Collection, Document, Attachment, Stored Procedure, Trigger, UDF).
   - Cons: High cost, because Document DB pricing increases with every new collection, based on the throughput required in a region.
3. Multiple databases with multiple collections: In this approach, we have one database per tenant id and create multiple collections under each database, one per tenant data-set. For the most part, placing tenants across databases works much like placing tenants across collections. The major distinction between Document DB collections and databases is that users and permissions are scoped at the database level. Each database has its own set of users and permissions, which we can use to isolate specific collections and documents.
   - Pros: The major benefits of storing tenant data in multiple databases are increased resource capacity and higher throughput. Each tenant can be placed in its own database, with user permissions managed at the database level.
   - Cons: Again, high cost, because Document DB pricing increases with every new collection created in each database, based on the throughput required in a region.
4. Multiple Azure accounts with their respective databases/collections: In this approach, we have one Azure account per tenant id and create an individual database for each tenant in its account. Collections are then created under that database, one per tenant data-set. Each tenant's data is fully separated from every other tenant's.
   - Pros: The major benefits of storing tenant data in multiple accounts are security enforced at the account level, increased resource capacity, and higher throughput.
   - Cons: Again, very high cost, because a new account subscription is needed for every tenant.
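The four approaches differ mainly in where a given tenant data-set ends up. The sketch below maps each approach to an account, a collection link, and a partition key. The account names, the database name `app`, the collection name `tenants`, and the `:` separator are all hypothetical; only the `dbs/<database>/colls/<collection>` link shape comes from Document DB itself.

```python
from typing import NamedTuple, Optional


class Placement(NamedTuple):
    account: str                  # which Document DB account to connect to
    collection_link: str          # "dbs/<database>/colls/<collection>"
    partition_key: Optional[str]  # set only when a collection is shared or partitioned


def resolve(approach: int, tenant_id: str, dataset_id: str) -> Placement:
    """Locate a tenant data-set under each of the four approaches listed above."""
    if approach == 1:  # one database, one shared collection
        return Placement("shared-account", "dbs/app/colls/tenants",
                         f"{tenant_id}:{dataset_id}")
    if approach == 2:  # one database, one collection per tenant
        return Placement("shared-account", f"dbs/app/colls/{tenant_id}", dataset_id)
    if approach == 3:  # one database per tenant, one collection per data-set
        return Placement("shared-account", f"dbs/{tenant_id}/colls/{dataset_id}", None)
    if approach == 4:  # one account per tenant
        return Placement(f"{tenant_id}-account", f"dbs/{tenant_id}/colls/{dataset_id}", None)
    raise ValueError(f"unknown approach: {approach}")


shared = resolve(1, "contoso", "orders")
per_tenant_db = resolve(3, "contoso", "orders")
```

Keeping this mapping in one function means a tenant can later be moved from a shared collection to a dedicated one by changing the approach recorded for that tenant, without touching the calling code.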
Comparison of the first 3 approaches above on a few parameters:
| Parameter | 1st approach | 2nd approach | 3rd approach |
|---|---|---|---|
| Storage of tenant's data-set | A single collection is partitioned by the partition key (Tenant Id + Data-Set Id). | A single database is divided into collections by Tenant Id, and each collection is further partitioned by the tenant's Data-Set Id. | Multiple databases are created by Tenant Id, and each database is divided into collections by the tenant's Data-Set Id. |
| Handling a very large data-set | Not easily feasible because of the storage and throughput limits at the collection level. | Easily feasible by using multiple partitions of a collection, but we can't enforce throughput at the partition level. | Very easily feasible by creating a separate collection for each data-set. Here we can't enforce throughput at the database level. |
| Handling of hot spots (generic) | Document DB lets us scale throughput up or down as required, at different price points. | Same | Same |
| Cost as per storage/throughput (generic) | A single-partition collection is limited to 250, 1000, or 2500 RU/s of throughput and 10 GB of storage on an S1 (₹2.58/hr), S2 (₹5.07/hr), or S3 (₹10.14/hr) collection respectively. | Under the new Standard pricing model, we are charged based on the throughput level we choose, between 400 and 10,000 RU/s. We can still switch back to S1, S2, or S3 collections if we change our mind. | |
| Throttling of requests (generic) | Our application sits behind Platform IO, which takes care of throttling requests. | Apart from Platform IO, Document DB lets us throttle requests at the collection level, but not at the database level. | |
| Primary key | By default, Document DB generates a unique id (string) for each document, but we can build it from the combination of tenant id, data-set id, and timestamp (discussion in progress). | Same | Same |
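The primary-key row mentions building the document id from tenant id, data-set id, and timestamp. Since that design is still under discussion, the following is only one possible sketch: the function name, the `:` separator, and the timestamp format are assumptions, not a settled scheme.

```python
from datetime import datetime, timezone


def make_document_id(tenant_id: str, dataset_id: str, created_at: datetime) -> str:
    """Combine tenant id, data-set id, and a UTC timestamp into one id string."""
    # Expects a timezone-aware datetime; it is normalized to UTC so ids sort
    # consistently regardless of where they were created.
    stamp = created_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    return f"{tenant_id}:{dataset_id}:{stamp}"


doc_id = make_document_id(
    "contoso", "orders",
    datetime(2017, 1, 2, 3, 4, 5, 6, tzinfo=timezone.utc),
)
```

Two documents written for the same tenant and data-set within the same microsecond would collide under this scheme, so a random suffix may be worth adding if write rates are high.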
Hope this works for you and that you can now implement a multi-tenant scenario in your application easily. Enjoy 🙂