Architecture design is a must while developing a SaaS application to ensure its scalability and optimising infrastructure costs. In this blog, Lets discuss the implementation of one such architecture with Quarkus java framework and Hibernate ORM
Unblocking The Main Thread Solving ANRs and Frozen Frames
Schema-based multi-tenant SaaS architecture using Quarkus & Hibernate-ORM
1. Schema-based multi-tenant architecture using Quarkus
& Hibernate-ORM
Architecture design is a must while developing a SaaS application to ensure its
scalability and optimising infrastructure costs. In this blog, Lets discuss the
implementation of one such architecture with Quarkus java framework and Hibernate
ORM
By CoffeeBeans Consulting
Suppose you have a great product idea, you started working on it and developed a cool
SaaS product. You get new tenants to use your product and they liked it. In a nutshell,
your product is a hit.
Everything seems to be going well until you realize you are bleeding money with every
increasing tenant. You want to scale up but you are unable to do so because you have
already exhausted all your cash.
You try to figure out what is going wrong and eventually look into your application
infrastructure costs.
You understood that every time a new tenant onboards your application you deploy a
new server, database, etc. for that tenant. Separate servers and/or databases are only
useful for computing or data-intensive applications but yours is neither of that.
2. In such scenarios, multi-tenant architecture can be very useful in designing applications
with shared resources. Multiple combinations are possible some of which are:
1. Separate servers and shared database
2. Shared server and separate databases
3. Shared server and shared database
How should I decide which resource to keep separate and
which to be shared?
The decision should be made on each resource after looking at its constraints for
scaling up. It’s good if each of your resources scales individually based on the utilization
rather than complete application scaling up and remaining underutilized.
There are a few things one should keep in mind while deciding.
For servers:
If your business logic is very customized for each tenant then shared servers are not an
optimal solution.
OR
If your application is compute-intensive or each tenant is limited by their respective
compute then separate servers are a good solution.
For databases:
Generally, it’s a trade-off between data isolation and resource utilization. Development
complexity and scaling constraints are also key decision factors.
A separate database provides tenant-specific customization freedom and ease in
monitoring per tenant usage and billing them for backup and disaster recovery.
Multi-tenant database patterns
There are three types of patterns:
1. Database per tenant
2. Schema per tenant
3. Row per tenant
3. Database per tenant
Separated databases will be provisioned for each tenant. It achieves maximum data
isolation and will give you freedom for high customization and tenant-specific operations
to the database.
The development complexity and cost of this approach are high.
Schema per tenant
Separate schemas are created for each tenant. It achieves data isolation at schema
level.
High customization and tenant-specific operations like backup, monitoring, and
encryption are not supported in many databases.
The development complexity is high but the cost is low.
Row per tenant
In this approach, all the tenants’ data are stored in the same table. Each row is
identified by the tenant identifier. Some databases like PostgreSQL support RLS (Row-
Level Security) where the rows are implicitly filtered based on the tenant and there is no
need to explicitly write the WHERE clause with the tenant identifier.
There is no data isolation and more care has to be put into writing database queries at
the application level.
The development complexity and cost are low.
In this blog, we will discuss the implementation of the shared server and schema-based
shared database architecture.
What we will build?
We will build an app that will store the information of the users of the tenants.
Each tenant will have its separate schema which will store its users’ details. A public
schema that will be accessed by every tenant will contain the pricing detail of the
product.
Database pattern
4. schema-based multi-tenancy database pattern
Tables
We will build an app that will store the information of the users of the tenants.
Each tenant will have its separate schema which will store its users’ details. A public schema
that will be accessed by every tenant will contain the pricing detail of the product.
Database pattern
6. One important factor while approaching this solution is whether or not frameworks or
tools are available that support this architecture and pattern.
For the implementation part, we will be using the Quarkus framework to deploy the
REST APIs and perform the business logic.
The Quarkus is an open-source java framework developed by Red Hat which
empowers the developers to build cloud-native and modern java applications. You can
learn more about Quarkus here.
PostgreSQL will be used as the database. It is an open-source SQL database.
Hibernate-ORM supports schema-based multi-tenancy.
Let’s start
Go to https://code.quarkus.io/ and fill in the application configuration.
Application configuration
Add the following extensions
Selected extensions
Open in an IDE and run the following command from the terminal:
mvn compile quarkus:dev
7. It will download all the dependencies and start the server with the dev profile activated.
In dev, profile Quarkus provides the support for live reloading.
After the server has started, perform a GET request to the following URL:
http://localhost:8080/hello
It should return ‘Hello RESTEasy’
Create entities
Package information is important, all the entities of the public (shared) schema should
reside in the same package. Similarly for the tenant schema. We will discuss this in
detail later while discussing multiple persistence units.
https://medium.com/media/ef34cf9db5e75b5fc6c7b508d068bd1e
Hibernate does not support database generation (create or update) in the multi-tenant
mode so we need to create the databases ourselves.
Create pricing_details table on public schema
CREATE TABLE public.pricing_details
(
id integer,
product_name character varying NOT NULL,
price integer NOT NULL,
created_at date,
updated_at date,
PRIMARY KEY (id)
);
Create schema for Tenant A (GOOGL) and Tenant B (AAPL)
CREATE SCHEMA googl
AUTHORIZATION postgres;
CREATE SCHEMA aapl
AUTHORIZATION postgres;
Create user_details table on both schema
CREATE TABLE googl.user_details
(
id integer,
first_name character varying NOT NULL,
last_name character varying NOT NULL,
created_at date,
updated_at date,
PRIMARY KEY (id)
8. );CREATE TABLE aapl.user_details
(
id integer,
first_name character varying NOT NULL,
last_name character varying NOT NULL,
created_at date,
updated_at date,
PRIMARY KEY (id)
);
Insert values in the tables.
Create repositories for entities
https://medium.com/media/b7ad2829e2f9a60984a3d216724202c5
Configuring application properties
write the following configuration properties in the application.properties
https://medium.com/media/f8ff107dfcfa6acdebc505b21b946d81
Only a single data source is configured because we have only one database to connect
to.
A persistence unit defines a set of all entity classes that are managed by
EntityManager instances in an application
We have separate persistence units for public schema (<default>) and tenant-specific
schema (tenant)
The data source is set to <default> so that they connect to the default data source.
Multi-tenant mode is set to SCHEMA
But we need to configure which persistence unit to manage which entities. For that, we
need to create package-info.java files in the packages containing the entity files of
public and tenant-specific files.
https://medium.com/media/2ee973e87d56c5fec30db9633af8fa78Inside the package
containing the entities of the ‘public’ schema
https://medium.com/media/89a6ee2e71ad4bb1c359c7e059b38b6bInside the package
containing the entities of tenant-specific schema
Okay, we configured which persistence should handle which entities but what about
schema information?
9. Each tenant will have its own schema name then how will the persistence unit know
which schema to go for. For this, we will create a Tenant Resolver which will resolve the
schema name based on the incoming request.
We need to create a Tenant Resolver for all the persistence units.
We will implement the hibernate’s TenantResolver interface which provides a method to
resolve schema names.
https://medium.com/media/c0c771f1f279bde9dc869a368e7883e9
The @PersistenceUnitExtension takes the name of the persistence unit for which this
TenantResolver implementation is.
@PersistenceUnitExtension is for the default persistence unit.
The @RequestScoped is for defining the lifecycle of this bean same to that of the
request the application receives (Bean is created when the application receives the
request and is destroyed when the request is completed). Once the request is
completed it will be destroyed and will be recreated with every incoming request.
what about the below code?
public String resolveTenantId() {
String tenantName = context.request().getHeader("tenant");
return tenantName;
}
To keep the implementation simpler we will pass the tenant identifier as the header in
the HTTP request with the key as “tenant” and the value as the identifier. The above
code will retrieve that value and return it as the schema name for that tenant.
Now all the configuration is done but where are the APIs for fetching data?
https://medium.com/media/bb3dc31fe4641c24aefcab1dd47d3b1d
‘/pricing/all’ endpoint will fetch the list of pricing details from the public schema and
return the list.
‘/user/all’ endpoint requires a tenant identifier as a header and will fetch the data from
that schema and will return the list of users.
Let’s insert some data into the tables
13. list of users from aapl schema
Great! we finally achieved schema-based multi-tenancy for our application.
Important
For the simplicity of the implementation, we passed the tenant identifier in the header. It
should not be passed in the header. A better approach will be to include it in the JWT
tokens which can’t be modified and retrieve that by intercepting the HTTP request and
storing the identifier in a TenantContext.
14. Now I hope you will apply this architecture to your SaaS product and save on your
infrastructure costs.
Source code
You can find the source code here.
Pros
1. Efficient use of resources.
2. A better architecture for multi-tenancy compared to the row per tenant approach
in terms of data isolation.
3. Low infrastructure costs.
4. Can easily scale.
Cons
1. Development complexity is high.
2. High customization per tenant will be difficult.
3. Data backup for each tenant is not possible.
4. Monitoring for each tenant may not be supported.
5. Frameworks supporting this pattern may not be available in all languages.
To know more visit
Website :- https://coffeebeans.io/
Blog :- https://coffeebeans.io/blogs