The Art of Splitting Read and Write Database Traffic for Optimal Performance
The performance of a distributed application can be greatly improved by correctly separating database read and write traffic. To achieve this, we need to understand the differences between the two types of traffic and develop a strategy for routing each of them appropriately.
Read Traffic
A read request only retrieves records from the database and does not modify any data through operations such as insert, update, or delete. It is exclusively focused on read operations; no other type of operation is performed on the database.
Write Traffic
A write request creates, updates, or deletes at least one record in the database. The request may also perform any number of read operations, but it must perform at least one CUD (create, update, delete) operation.
Why do we need to split Read/Write Traffic?
Splitting database read and write traffic is important for optimizing the performance of a database system and the application on top of it. Separating the two types of traffic allows different resources and configurations to handle each type, leading to more efficient processing and faster response times. Since reads and writes have different requirements, handling them independently can prevent bottlenecks and improve overall system performance. By splitting read and write traffic correctly, application developers can ensure that their systems run at maximum potential, delivering optimal results and a better user experience.
Distributed Database Setup
A distributed database setup is a type of database system that stores data across multiple servers or nodes, allowing for greater scalability, improved performance, and increased fault tolerance. In this setup, each server or node is responsible for storing a portion of the data or a complete copy of it, and a distributed database management system (DBMS) is used to coordinate data access and ensure consistency across the system. Distributed database setups are commonly used in large-scale applications where data needs to be accessible from multiple locations, and they can also provide higher availability and faster response times.
In many cases, a Multi-AZ (Availability Zone) database configuration is used, in which a primary database is replicated (synchronously or asynchronously, depending on the configuration) to a standby database in a different Availability Zone within the same or a different region. This setup is designed to provide high availability and automatic failover, ensuring that the database remains accessible even in the event of a hardware or software failure, or an entire zone failure (disaster recovery).
The diagram depicts two availability zones, A and B, each consisting of three database nodes, with one primary (master) node and two secondary (slave) nodes in each zone. The number of primary and secondary nodes may vary depending on the database software in use; for instance, DynamoDB may have multiple primary nodes, whereas a standard MySQL setup would usually have just one primary node at any given time. The communication between the primary and secondary nodes is represented by dotted lines, with the replication process determining how data is replicated. The replication process can be either synchronous or asynchronous, depending on the database system and configuration in use.
A Typical Application Write Process
Application -> Query Router -> DB Node
The application can send a query to either the query router or the primary database. Different databases, such as MongoDB and DynamoDB, have their own mechanisms for handling requests: mongod and a request router, respectively. The replication mechanism in use determines how data is synchronized with the secondary nodes. Although secondary nodes often have stale data, they eventually catch up to the primary node’s data due to eventual consistency. However, secondary nodes can lag behind the primary node by zero or more seconds, with the worst-case scenario being several minutes. This delay can be caused by a variety of factors, including a large number of write requests, slow connectivity, or hardware issues such as low CPU or memory.
One common mistake developers make is to read from the secondary, or a combination of primary and secondary, in order to make a final decision on the write path. Since secondary nodes can lag behind the primary node, a decision made using this approach may not be accurate.
Record = { id: String, ... }

PROCEDURE SaveRecord( Record record )
BEGIN
    entry = secondaryDB.findById( record.id )
    IF entry != nil THEN
        RAISE "Record already exists"
    ENDIF
    primaryDB.save( record )
END
The code snippet above defines a Record object that includes a primary key represented as a String. The type of primary key can vary depending on the specific database in use and the application, such as an embedded key, partition key, or a combination of partition and range keys.
The above code snippet has two problems:
- Inconsistency: This code will function properly under normal circumstances, but if the secondary nodes begin to lag behind the primary node, errors may occur when multiple calls are made to save the same record.
- Reliability: This code relies on both the primary and the secondary database. If the secondary database experiences an outage for any reason, the code will raise an error, which is an unintended consequence of the secondary database’s failure. Additionally, this code puts extra load on the secondary database.
Although this code is only a simple one-liner, in a typical application we may be making numerous decisions based on the results from the secondary database. However, this approach should not be used in the write path, as it is an anti-pattern. Instead, it is recommended to use a different approach that will rarely fail, even when the secondary node is lagging by several minutes. The above code can be rewritten as:
Record = { id: String, ... }

PROCEDURE SaveRecord( Record record )
BEGIN
    entry = primaryDB.findById( record.id )
    IF entry != nil THEN
        RAISE "Record already exists"
    ENDIF
    primaryDB.save( record )
END
This solution will work in most cases, but it can fail if parallel calls are made to save the same record. One significant disadvantage is that we must be very careful when selecting the primary or secondary database for each database operation; a small mistake can lead to inconsistencies in a large application. For example, suppose Service A calls Service B to retrieve records. If Service A is in the write path and Service B uses the secondary database to retrieve the records, inconsistencies can arise.
PROCEDURE CreateOrder( MerchantId, ExternalOrderId, OrderDetails )
BEGIN
    ...
END

PROCEDURE FindMerchantOrder( MerchantId, ExternalOrderId )
BEGIN
    order = secondaryDB.find( MerchantId=MerchantId, ExternalOrderId=ExternalOrderId )
    RETURN order
END

PROCEDURE PlaceOrder( MerchantId, ExternalOrderId, OrderDetails )
BEGIN
    order = FindMerchantOrder( MerchantId, ExternalOrderId )
    IF order != nil THEN
        RAISE "Duplicate order"
    ENDIF
    order = CreateOrder( MerchantId, ExternalOrderId, OrderDetails )
    primaryDB.save( order )
END
The existing code functions correctly only when the secondary database remains synchronized with the primary database. However, when the secondary database begins to lag behind the primary database, this code may begin to throw errors (setting aside the parallel-call case discussed above). The most effective way to fix this code path is to avoid defining operations that are specific to either the primary or the secondary database. Instead, we should select the primary or secondary database at the request level and ensure that all underlying methods use the same database. A revised implementation might resemble the following:
PROCEDURE DB()
BEGIN
    // retrieve DB from request context
END

PROCEDURE FindMerchantOrder( MerchantId, ExternalOrderId )
BEGIN
    order = DB().find( MerchantId=MerchantId, ExternalOrderId=ExternalOrderId )
    RETURN order
END

PROCEDURE PlaceOrder( MerchantId, ExternalOrderId, OrderDetails )
BEGIN
    order = FindMerchantOrder( MerchantId, ExternalOrderId )
    IF order != nil THEN
        RAISE "Duplicate order"
    ENDIF
    order = CreateOrder( MerchantId, ExternalOrderId, OrderDetails )
    DB().save( order )
END
How to handle multiple secondary nodes?
- We can keep a list of all secondary nodes and pick one at random for each request. Alternatively, other selection methods such as weighted or simple round-robin can be used instead of random selection (a minimal sketch of this follows the note below).
- The second method uses DNS to map the various nodes to a single endpoint, which resolves to one of them. Services such as AWS RDS Aurora abstract this complexity by providing a single reader endpoint.
NOTE: Certain libraries such as JDBC and ORMs like GORM may enable load balancing among secondary nodes. However, it’s still necessary to manage pool size, max connections, connection lifetime, and idle timeout per pool, since each type of traffic has unique requirements; i.e., we still need a way to select the right database for the current request.
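As an illustration of the first approach, here is a minimal Go sketch of round-robin selection over a list of replica endpoints. The type and the endpoint values are purely illustrative and not tied to any particular driver or library.

package main

import (
    "fmt"
    "sync/atomic"
)

// ReplicaSelector cycles through a fixed list of secondary endpoints
// using a simple round-robin counter; it is safe for concurrent use.
type ReplicaSelector struct {
    endpoints []string
    counter   uint64
}

// Next returns the next secondary endpoint in round-robin order.
func (s *ReplicaSelector) Next() string {
    n := atomic.AddUint64(&s.counter, 1)
    return s.endpoints[(n-1)%uint64(len(s.endpoints))]
}

func main() {
    selector := &ReplicaSelector{endpoints: []string{
        "replica-1.db.local:3306", // hypothetical endpoints
        "replica-2.db.local:3306",
    }}
    for i := 0; i < 4; i++ {
        fmt.Println(selector.Next())
    }
}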
How do we do it in different frameworks?
For testing, let's create MySQL read and write users with the required permissions. The write DB username is my_app and the read username is my_app_reader.
mysql> CREATE USER 'my_app'@'%' IDENTIFIED WITH mysql_native_password BY 'my_app_pass';
Query OK, 0 rows affected (0.01 sec)
mysql> GRANT CREATE, ALTER, DROP, INDEX, INSERT, UPDATE, DELETE, SELECT, REFERENCES on *.* TO 'my_app'@'%' WITH GRANT OPTION;
Query OK, 0 rows affected (0.01 sec)
mysql> CREATE USER 'my_app_reader'@'%' IDENTIFIED WITH mysql_native_password BY 'my_app_reader_pass';
Query OK, 0 rows affected (0.01 sec)
mysql> GRANT SELECT on *.* TO 'my_app_reader'@'%' WITH GRANT OPTION;
Query OK, 0 rows affected (0.01 sec)
Ruby on Rails (RoR)
Ruby on Rails (RoR) provides the Octopus Gem, which is a database wrapper that enables developers to easily switch between different databases within an RoR application. RoR 6.0 also includes a built-in feature that supports both the read and write paths, allowing developers to specify which database connection to use for each operation. This feature simplifies the development process and helps ensure consistency in the application’s data.
Using Octopus gem
database.yml file
production:
  host: localhost
  port: 3306
  database: my_app_db
  username: my_app
  password: my_app_pass
config/shards.yml
octopus:
  replicated: true
  fully_replicated: false
  environments:
    - production
  production:
    replica:
      host: localhost
      port: 3306
      database: my_app_db
      username: my_app_reader
      password: my_app_reader_pass
Using Octopus Gem to handle the request
class OrderController < ActionController::Base
  ...
  def create_order
    Octopus.using(:master) do
      # code goes here
    end
  rescue => ex
    # handle exception
  end

  def order_detail
    Octopus.using(:replica) do
      # code goes here
    end
  rescue => ex
  end
end
Using Rails 6.0
config/database.yml
production:
  primary:
    host: localhost
    port: 3306
    database: my_app_db
    username: my_app
    password: my_app_pass
  primary_replica:
    host: localhost
    port: 3306
    database: my_app_db
    username: my_app_reader
    password: my_app_reader_pass
    replica: true # mark this configuration as a read replica
class OrderController < ActionController::Base
  ...
  def create_order
    ActiveRecord::Base.connected_to(role: :writing) do
      # code goes here
    end
  rescue => ex
    # handle exception
  end

  def order_detail
    ActiveRecord::Base.connected_to(role: :reading) do
      # code goes here
    end
  rescue => ex
  end
end
The provided code snippet can be utilized either at the individual block level or at the method level of a controller. However, it is recommended to use it at the controller’s method level to prevent the mixing of primary and secondary database queries. This approach ensures that all queries within the controller are consistently directed toward the same database, thereby improving the overall integrity and consistency of the application’s data.
Django
To implement read/write splitting in Django, we can make use of ThreadLocal, Middleware, and a DatabaseRouter. In this article, we will place these files in the CoreApp; if you wish to add them to another app, such as a common-app library, make sure to update the settings file accordingly. An example project could look something like this:
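The layout below is only a hypothetical sketch; the file and app names follow the prose in this section, and the repository linked at the end of this section may organize things differently.

my_app/
    my_app/
        settings.py      # databases, middleware, and router configuration
        urls.py
    coreapp/
        db_utils.py      # db_from_the_request, thread-local helpers, db_wrapper
        middleware.py    # DatabaseMiddleware
        router.py        # DatabaseRouter
    orders/
        models.py
        views.py
    manage.py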
In our example, we have included two databases: primary and secondary. The db_from_the_request function identifies the appropriate database based on the request object.
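A minimal sketch of this function, together with the thread-local helpers used by the router and middleware below, might look like the following. The routing rule (write HTTP methods go to the primary) and the module name coreapp/db_utils.py are assumptions; the linked repository contains the author's actual implementation.

# coreapp/db_utils.py (hypothetical module name)
import threading

PRIMARY = "primary"
SECONDARY = "secondary"

_thread_local = threading.local()


def db_from_the_request(request):
    # Route anything that can modify data to the primary database and
    # everything else to the secondary (read replica).
    if request.method in ("POST", "PUT", "PATCH", "DELETE"):
        return PRIMARY
    return SECONDARY


def set_current_db(alias):
    _thread_local.db_alias = alias


def get_current_db():
    # Returns None when nothing has been set, so the router can fall
    # back to the default database.
    return getattr(_thread_local, "db_alias", None)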
We’ll use a database router to route the traffic to the individual databases; this will be configured in the settings file later.
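A sketch of such a router, assuming the thread-local helpers above (the allow_relation/allow_migrate behavior is simplified for illustration):

# coreapp/router.py (hypothetical module name)
from coreapp.db_utils import get_current_db


class DatabaseRouter:
    def db_for_read(self, model, **hints):
        # Use whatever the middleware or decorator selected; returning
        # None lets Django fall back to the default database.
        return get_current_db()

    def db_for_write(self, model, **hints):
        return get_current_db()

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases point at the same logical database.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations only against writable databases.
        return db in ("default", "primary")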
The database middleware will locate the database and assign it to a thread-local object. To ensure that this middleware is invoked by Django, add the DatabaseMiddleware to the settings file. The db_wrapper can be applied to any method to enforce the selection of the database, and it can be used in worker or consumer methods to specify the database.
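A sketch of the middleware and the decorator, again assuming the helper module above:

# coreapp/middleware.py (hypothetical module name)
import functools

from coreapp.db_utils import db_from_the_request, set_current_db


class DatabaseMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Decide the database once per request and remember it in the
        # thread-local object so that the router can see it.
        set_current_db(db_from_the_request(request))
        try:
            return self.get_response(request)
        finally:
            set_current_db(None)


def db_wrapper(alias):
    # Decorator for worker/consumer methods that run outside the
    # request/response cycle.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            set_current_db(alias)
            try:
                return func(*args, **kwargs)
            finally:
                set_current_db(None)
        return wrapper
    return decorator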
The settings file needs to include the databases: default, primary, and secondary. We have included the default database so that, in the event the application has not set a database, it will use the default one. Additionally, we have added the DatabaseMiddleware to the middleware list, and we need to set DATABASE_ROUTERS so that Django can use the DatabaseRouter to identify the appropriate database based on the current context/request.
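The relevant settings entries could look like this; the hosts and credentials reuse the MySQL users created earlier, and the dotted module paths assume the hypothetical layout shown above.

# settings.py (only the parts relevant to read/write splitting)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_app_db",
        "USER": "my_app",
        "PASSWORD": "my_app_pass",
        "HOST": "localhost",
        "PORT": "3306",
    },
    "primary": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_app_db",
        "USER": "my_app",
        "PASSWORD": "my_app_pass",
        "HOST": "localhost",
        "PORT": "3306",
    },
    "secondary": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_app_db",
        "USER": "my_app_reader",
        "PASSWORD": "my_app_reader_pass",
        "HOST": "localhost",
        "PORT": "3306",
    },
}

MIDDLEWARE = [
    # ... Django's default middleware ...
    "coreapp.middleware.DatabaseMiddleware",
]

DATABASE_ROUTERS = ["coreapp.router.DatabaseRouter"]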
This setting will work for all the use cases; for worker/consumer methods, we can do something like this:
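A sketch of how the db_wrapper decorator could be used in a worker or consumer; the function names are illustrative only.

# workers.py (hypothetical worker/consumer module)
from coreapp.db_utils import PRIMARY, SECONDARY, db_wrapper


@db_wrapper(PRIMARY)
def process_order_message(message):
    # Runs on a consumer thread, so all queries inside go to the primary.
    ...


@db_wrapper(SECONDARY)
def generate_daily_report():
    # Read-only job that can safely be served by the secondary database.
    ...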
As for the view code, we do not have to make any changes to the view logic.
The complete code is available at https://github.com/sonus21/django-read-write-split
Go
In Go, we can use middleware and context to pass/set the DB for the entire request path. We’ll use the chi router to handle the HTTP requests. An example Go project structure is shown below.
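The layout is illustrative; the file names follow the prose in this section, but a real project may differ.

read-write-split/
    main.go              // wires the router, middleware, and databases
    database/
        database.go      // builds the *sql.DB objects and context helpers
    middleware/
        database.go      // chi middleware that puts the chosen DB in the context
    service/
        order.go         // order service that reads the DB from the context
    utils/
        utils.go         // picks the database name for a request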
Here we’re creating the sql.DB object based on the application configuration, in the database.go file. We’ll store the *sql.DB objects in the context, and they will be used in the service and middleware layers to route the database traffic.
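A minimal sketch of database.go under these assumptions (MySQL driver, DSNs supplied by the caller, and a plain context key); the real configuration handling may differ.

// database/database.go
package database

import (
    "context"
    "database/sql"

    _ "github.com/go-sql-driver/mysql" // registers the MySQL driver
)

type contextKey string

const dbKey contextKey = "db"

// DB holds one connection pool per role.
type DB struct {
    Primary   *sql.DB
    Secondary *sql.DB
}

// New opens the primary and secondary connection pools from their DSNs.
func New(primaryDSN, secondaryDSN string) (*DB, error) {
    primary, err := sql.Open("mysql", primaryDSN)
    if err != nil {
        return nil, err
    }
    secondary, err := sql.Open("mysql", secondaryDSN)
    if err != nil {
        return nil, err
    }
    return &DB{Primary: primary, Secondary: secondary}, nil
}

// WithDB stores the selected *sql.DB in the context.
func WithDB(ctx context.Context, db *sql.DB) context.Context {
    return context.WithValue(ctx, dbKey, db)
}

// FromContext returns the *sql.DB stored by the middleware.
func FromContext(ctx context.Context) *sql.DB {
    db, _ := ctx.Value(dbKey).(*sql.DB)
    return db
}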
A chi middleware is used to set the database in the request context, and a small database utils method (utils.go) is used to find the database name for a given request.
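A sketch of both pieces, assuming the database package above. The HTTP-method-based routing rule and the module path example.com/my_app are assumptions.

// utils/utils.go
package utils

import "net/http"

const (
    Primary   = "primary"
    Secondary = "secondary"
)

// DatabaseName decides which database a request should use based on
// its HTTP method.
func DatabaseName(r *http.Request) string {
    if r.Method == http.MethodGet || r.Method == http.MethodHead {
        return Secondary
    }
    return Primary
}

// middleware/database.go
package middleware

import (
    "net/http"

    "example.com/my_app/database" // hypothetical module path
    "example.com/my_app/utils"
)

// DatabaseSelector picks the primary or secondary pool per request and
// stores it in the request context for the downstream handlers/services.
func DatabaseSelector(db *database.DB) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            selected := db.Primary
            if utils.DatabaseName(r) == utils.Secondary {
                selected = db.Secondary
            }
            ctx := database.WithDB(r.Context(), selected)
            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}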
The order service uses the database from the request context that was set by the middleware.
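A sketch of such a service; the table and column names are illustrative.

// service/order.go
package service

import (
    "context"
    "database/sql"

    "example.com/my_app/database" // hypothetical module path
)

type Order struct {
    ID              int64
    MerchantID      string
    ExternalOrderID string
}

// FindMerchantOrder queries through whichever *sql.DB the middleware put
// into the context (primary on the write path, secondary otherwise).
func FindMerchantOrder(ctx context.Context, merchantID, externalOrderID string) (*Order, error) {
    db := database.FromContext(ctx)
    row := db.QueryRowContext(ctx,
        "SELECT id, merchant_id, external_order_id FROM orders WHERE merchant_id = ? AND external_order_id = ?",
        merchantID, externalOrderID)
    var o Order
    if err := row.Scan(&o.ID, &o.MerchantID, &o.ExternalOrderID); err != nil {
        if err == sql.ErrNoRows {
            return nil, nil // no such order
        }
        return nil, err
    }
    return &o, nil
}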
Finally, the main package ties everything together; the database package is initialized from the main method.
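A sketch of main.go under the same assumptions; the DSNs reuse the MySQL users created earlier in the article, and the HTTP handlers are left out.

// main.go
package main

import (
    "log"
    "net/http"

    "github.com/go-chi/chi/v5"

    "example.com/my_app/database" // hypothetical module path
    appmiddleware "example.com/my_app/middleware"
)

func main() {
    db, err := database.New(
        "my_app:my_app_pass@tcp(localhost:3306)/my_app_db",
        "my_app_reader:my_app_reader_pass@tcp(localhost:3306)/my_app_db",
    )
    if err != nil {
        log.Fatal(err)
    }

    r := chi.NewRouter()
    r.Use(appmiddleware.DatabaseSelector(db))
    // r.Post("/orders", createOrderHandler)     // handlers omitted for brevity
    // r.Get("/orders/{id}", orderDetailHandler)

    log.Fatal(http.ListenAndServe(":8080", r))
}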
Spring Boot
To achieve a read/write split in Spring Boot, we can utilize ThreadLocal, AOP, a database router, and an annotation. In the Spring Boot app, we’ll add the annotation either on the class, on the method, or on both; method-level annotations are preferred over class-level ones.
An annotation will guide the application to select an appropriate database.
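A sketch of such an annotation; the package name com.example.app, the value element, and its default are assumptions, and the linked repository may define it differently.

// Database.java
package com.example.app;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target({ElementType.TYPE, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
public @interface Database {
    // Name of the target database, e.g. "primary" or "secondary".
    String value() default "primary";
}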
In the AOP handler, we’re going to add the Around annotation to execute our custom logic.
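A sketch of the aspect under the same assumptions; it resolves the method-level annotation first, falls back to the class-level one, and sets and clears the thread-local database context around the call.

// DatabaseAspect.java
package com.example.app;

import java.lang.reflect.Method;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.reflect.MethodSignature;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class DatabaseAspect {

    @Around("@within(com.example.app.Database) || @annotation(com.example.app.Database)")
    public Object selectDatabase(ProceedingJoinPoint joinPoint) throws Throwable {
        Method method = ((MethodSignature) joinPoint.getSignature()).getMethod();

        // Method-level annotation wins over the class-level one.
        Database annotation = method.getAnnotation(Database.class);
        if (annotation == null) {
            annotation = joinPoint.getTarget().getClass().getAnnotation(Database.class);
        }

        DatabaseContext.set(annotation.value());
        try {
            return joinPoint.proceed();
        } finally {
            DatabaseContext.clear();
        }
    }
}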
DatabaseContext stores the current database context in a thread-local object.
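A minimal sketch of that holder:

// DatabaseContext.java
package com.example.app;

public final class DatabaseContext {

    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private DatabaseContext() {
    }

    public static void set(String database) {
        CURRENT.set(database);
    }

    public static String get() {
        return CURRENT.get();
    }

    public static void clear() {
        CURRENT.remove();
    }
}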
A database router that will provide a look-up key based on the database context.
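In Spring this can be built on top of AbstractRoutingDataSource; a sketch under the same assumptions:

// DatabaseRouter.java
package com.example.app;

import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

public class DatabaseRouter extends AbstractRoutingDataSource {

    @Override
    protected Object determineCurrentLookupKey() {
        // Returning null makes Spring use the default target data source.
        return DatabaseContext.get();
    }
}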
The SqlDatabaseConfiguration class configures the data sources for the two instances based on the app configuration, and these are utilized in the DatabaseRouter. A default data source is also established, to be used when no database context has been set; for instance, if a thread pool runs code without passing a database context, it will fall back to the default database.
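A sketch of such a configuration class; the property prefixes spring.datasource.primary and spring.datasource.secondary are assumptions, and the actual class in the linked repository may differ.

// SqlDatabaseConfiguration.java
package com.example.app;

import java.util.HashMap;
import java.util.Map;

import javax.sql.DataSource;

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class SqlDatabaseConfiguration {

    // Bound from spring.datasource.primary.* in the app configuration.
    @Bean
    @ConfigurationProperties("spring.datasource.primary")
    public DataSource primaryDataSource() {
        return DataSourceBuilder.create().build();
    }

    // Bound from spring.datasource.secondary.*.
    @Bean
    @ConfigurationProperties("spring.datasource.secondary")
    public DataSource secondaryDataSource() {
        return DataSourceBuilder.create().build();
    }

    // The routing data source that all database access goes through.
    @Bean
    @Primary
    public DataSource dataSource() {
        DatabaseRouter router = new DatabaseRouter();
        Map<Object, Object> targets = new HashMap<>();
        targets.put("primary", primaryDataSource());
        targets.put("secondary", secondaryDataSource());
        router.setTargetDataSources(targets);
        // Used when no database context has been set (e.g. thread pools).
        router.setDefaultTargetDataSource(primaryDataSource());
        return router;
    }
}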
The Controller class employs the Database annotation to set the primary and secondary databases. The secondary database is set at the controller (class) level, while the primary database is specified on the createOrder method. Method-level annotations take precedence over class-level annotations, and using a class-level annotation eliminates the need to add the same annotation multiple times.
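A sketch of what such a controller could look like; the request/response types, the mappings, and OrderService are hypothetical.

// OrderController.java
package com.example.app;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
@Database("secondary") // class level: reads default to the secondary
public class OrderController {

    private final OrderService orderService; // hypothetical service

    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    @PostMapping("/orders")
    @Database("primary") // method level wins: writes must use the primary
    public Order createOrder(@RequestBody CreateOrderRequest request) {
        return orderService.createOrder(request);
    }

    @GetMapping("/orders/{id}")
    public Order orderDetail(@PathVariable("id") long id) {
        // No method-level annotation, so the class-level "secondary" applies.
        return orderService.findOrder(id);
    }
}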
The complete Code is available at https://github.com/sonus21/spring-read-write-split
There’s another technique known as CQRS (Command Query Responsibility Segregation); that will be the topic of a future article.
If you found this post helpful, please share, follow, and give a thumbs up.