The Art of Splitting Read and Write Database Traffic for Optimal Performance
The performance of a distributed application can be greatly improved by correctly separating database read and write traffic. To achieve this, we need to understand the differences between the two types of traffic and develop a strategy for routing each of them appropriately.
Read Traffic
A read request only retrieves records from the database and does not modify any data through operations such as insert, update, or delete. It is exclusively focused on read operations; no other type of operation is performed on the database.
Write Traffic
A write request creates, updates, or deletes at least one record in the database. The request may also perform any number of read operations, but it must perform at least one CUD (create, update, delete) operation.
Why do we need to split Read/Write Traffic?
Splitting database read and write traffic is important for optimizing the performance of a database system and the application on top of it. Separating the two types of traffic allows different resources and configurations to handle each type, leading to more efficient processing and faster response times. Since reads and writes have different requirements, handling them independently can prevent bottlenecks and improve overall system performance. By splitting read and write traffic correctly, application developers can ensure that their systems run at maximum potential, delivering optimal results and a better user experience.
Distributed Database Setup
A distributed database setup is a type of database system that stores data across multiple servers or nodes, allowing for greater scalability, improved performance, and increased fault tolerance. In this setup, each server or node is responsible for storing a portion of the data or a complete copy of it, and a distributed database management system (DBMS) is used to coordinate data access and ensure consistency across the system. Distributed database setups are commonly used in large-scale applications where data needs to be accessible from multiple locations, and they can also provide higher availability and faster response times.
In many cases, a Multi-AZ (Availability Zone) database configuration is used, in which a primary database is replicated (synchronously or asynchronously, depending on the configuration) to a standby database in a different Availability Zone within the same or a different region. This setup is designed to provide high availability and automatic failover, ensuring that the database remains accessible even in the event of a hardware or software failure, or an entire zone failure (disaster recovery).
The diagram depicts two availability zones, A and B, each consisting of three database nodes, with one primary (master) node and two secondary (slave) nodes in each zone. The number of primary and secondary nodes may vary depending on the database software in use; for instance, DynamoDB may have multiple primary nodes, whereas a standard MySQL setup would usually have just one primary node at any given time. The communication between the primary and secondary nodes is represented by dotted lines, with the replication process determining how data is replicated. The replication process can be either synchronous or asynchronous, depending on the database system and configuration in use.
A Typical Application Write Process
Application -> Query Router -> DB Node
The application can send a query to either the query router or the primary database. Different databases, such as MongoDB and DynamoDB, have their own mechanisms for handling requests: mongod and a request router, respectively. The replication mechanism in use determines how data is synchronized with the secondary nodes. Although secondary nodes often have stale data, they eventually catch up to the primary node’s data due to eventual consistency. However, secondary nodes can lag behind the primary node by zero or more seconds, with the worst-case scenario being several minutes. This delay can be caused by a variety of factors, including a large number of write requests, slow connectivity, or hardware issues such as low CPU or memory.
One common mistake developers make is to read from the secondary, or a combination of primary and secondary, in order to make a final decision on the write path. Since secondary nodes can lag behind the primary node, a decision made using this approach may not be accurate.
Record = { id: String, ... }

PROCEDURE SaveRecord( Record record )
BEGIN
    entry = secondaryDB.findById( record.id )
    IF entry != nil THEN
        RAISE "Record already exists"
    ENDIF
    primaryDB.save( record )
END
The code snippet above defines a Record object that includes a primary key represented as a String. The type of primary key can vary depending on the specific database in use and the application, such as an embedded key, partition key, or a combination of partition and range keys.
The above code snippet has two problems:
- Inconsistency: This code will function properly under normal circumstances, but if the secondary nodes begin to lag behind the primary node, errors may occur when multiple calls are made to save the same record.
- Reliability: This code relies on both the primary and the secondary database. If the secondary database experiences an outage for any reason, the code will raise an error, which is an unintended consequence of the secondary database’s failure. Additionally, this code puts extra load on the secondary database.
Although this code is only a simple one-liner, in a typical application we may be making numerous decisions based on the results from the secondary database. However, this approach should not be used in the write path, as it is an anti-pattern. Instead, it is recommended to use a different approach that will rarely fail, even when the secondary node is lagging by several minutes. The above code can be rewritten as:
Record = { id: String, ... }

PROCEDURE SaveRecord( Record record )
BEGIN
    entry = primaryDB.findById( record.id )
    IF entry != nil THEN
        RAISE "Record already exists"
    ENDIF
    primaryDB.save( record )
END
This solution will work in most cases, but it can fail if parallel calls are made to save the same record. One significant disadvantage is that we must be very careful when selecting the primary or secondary database for each database operation; a small mistake can lead to inconsistencies in a large application. For example, suppose Service A calls Service B to retrieve records. If Service A is in the write path and Service B uses the secondary database to retrieve the records, inconsistencies can arise.
PROCEDURE CreateOrder( MerchantId, ExternalOrderId, OrderDetails )
BEGIN
    ...
END

PROCEDURE FindMerchantOrder( MerchantId, ExternalOrderId )
BEGIN
    order = secondaryDB.find( MerchantId=MerchantId, ExternalOrderId=ExternalOrderId )
    RETURN order
END

PROCEDURE PlaceOrder( MerchantId, ExternalOrderId, OrderDetails )
BEGIN
    order = FindMerchantOrder( MerchantId, ExternalOrderId )
    IF order != nil THEN
        RAISE "Duplicate order"
    ENDIF
    order = CreateOrder( MerchantId, ExternalOrderId, OrderDetails )
    primaryDB.save( order )
END
The existing code functions correctly only when the secondary database remains synchronized with the primary database. However, when the secondary database begins to lag behind the primary database, this code may begin to throw errors (setting aside the parallel-call case discussed above). The most effective way to fix this code path is to avoid defining operations that are specific to either the primary or the secondary database. Instead, we should select the primary or secondary database at the request level and ensure that all underlying methods use the same database. A revised implementation might resemble the following:
PROCEDURE DB()
BEGIN
    // retrieve DB from request context
END

PROCEDURE FindMerchantOrder( MerchantId, ExternalOrderId )
BEGIN
    order = DB().find( MerchantId=MerchantId, ExternalOrderId=ExternalOrderId )
    RETURN order
END

PROCEDURE PlaceOrder( MerchantId, ExternalOrderId, OrderDetails )
BEGIN
    order = FindMerchantOrder( MerchantId, ExternalOrderId )
    IF order != nil THEN
        RAISE "Duplicate order"
    ENDIF
    order = CreateOrder( MerchantId, ExternalOrderId, OrderDetails )
    DB().save( order )
END
How to handle multiple secondary nodes?
- We can keep a list of all secondary nodes and pick one at random for each request. Alternatively, other selection methods such as weighted or simple round-robin can be used instead of random selection (a minimal sketch of this follows the note below).
- The second method uses DNS to map the various nodes to a single endpoint, which resolves to one of them. Services such as AWS RDS Aurora abstract this complexity by providing a single reader endpoint.
NOTE: Certain libraries such as JDBC and ORMs like GORM may enable load balancing among secondary nodes. However, it’s still necessary to manage pool size, max connections, connection lifetime, and idle timeout per pool, since each type of traffic has unique requirements; i.e., we still need a way to select the right database for the current request.
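As an illustration of the first approach, here is a minimal Go sketch of round-robin selection over a list of replica endpoints. The type and the endpoint values are purely illustrative and not tied to any particular driver or library.

package main

import (
    "fmt"
    "sync/atomic"
)

// ReplicaSelector cycles through a fixed list of secondary endpoints
// using a simple round-robin counter; it is safe for concurrent use.
type ReplicaSelector struct {
    endpoints []string
    counter   uint64
}

// Next returns the next secondary endpoint in round-robin order.
func (s *ReplicaSelector) Next() string {
    n := atomic.AddUint64(&s.counter, 1)
    return s.endpoints[(n-1)%uint64(len(s.endpoints))]
}

func main() {
    selector := &ReplicaSelector{endpoints: []string{
        "replica-1.db.local:3306", // hypothetical endpoints
        "replica-2.db.local:3306",
    }}
    for i := 0; i < 4; i++ {
        fmt.Println(selector.Next())
    }
}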
How do we do it in different frameworks?
For testing, let's create MySQL read and write users with the required permissions. The write DB username is my_app and the read username is my_app_reader.
mysql> CREATE USER 'my_app'@'%' IDENTIFIED WITH mysql_native_password BY 'my_app_pass';
Query OK, 0 rows affected (0.01 sec)
mysql> GRANT CREATE, ALTER, DROP, INDEX, INSERT, UPDATE, DELETE, SELECT, REFERENCES on *.* TO 'my_app'@'%' WITH GRANT OPTION;
Query OK, 0 rows affected (0.01 sec)
mysql> CREATE USER 'my_app_reader'@'%' IDENTIFIED WITH mysql_native_password BY 'my_app_reader_pass';
Query OK, 0 rows affected (0.01 sec)
mysql> GRANT SELECT on *.* TO 'my_app_reader'@'%' WITH GRANT OPTION;
Query OK, 0 rows affected (0.01 sec)
Ruby on Rails (RoR)
Ruby on Rails (RoR) provides the Octopus Gem, which is a database wrapper that enables developers to easily switch between different databases within an RoR application. RoR 6.0 also includes a built-in feature that supports both the read and write paths, allowing developers to specify which database connection to use for each operation. This feature simplifies the development process and helps ensure consistency in the application’s data.
Using Octopus gem
database.yml file
production:
  host: localhost
  port: 3306
  database: my_app_db
  username: my_app
  password: my_app_pass
config/shards.yml
octopus:
  replicated: true
  fully_replicated: false
  environments:
    - production
  production:
    replica:
      host: localhost
      port: 3306
      database: my_app_db
      username: my_app_reader
      password: my_app_reader_pass
Using Octopus Gem to handle the request
class OrderController < ActionController::Base
  ...
  def create_order
    Octopus.using(:master) do
      # code goes here
    end
  rescue => ex
    # handle exception
  end

  def order_detail
    Octopus.using(:replica) do
      # code goes here
    end
  rescue => ex
  end
end
Using Rails 6.0
config/database.yml
production:
  primary:
    host: localhost
    port: 3306
    database: my_app_db
    username: my_app
    password: my_app_pass
  primary_replica:
    host: localhost
    port: 3306
    database: my_app_db
    username: my_app_reader
    password: my_app_reader_pass
    replica: true # mark this configuration as a read replica
class OrderController < ActionController::Base
  ...
  def create_order
    ActiveRecord::Base.connected_to(role: :writing) do
      # code goes here
    end
  rescue => ex
    # handle exception
  end

  def order_detail
    ActiveRecord::Base.connected_to(role: :reading) do
      # code goes here
    end
  rescue => ex
  end
end
The provided code snippet can be utilized either at the individual block level or at the method level of a controller. However, it is recommended to use it at the controller’s method level to prevent the mixing of primary and secondary database queries. This approach ensures that all queries within the controller are consistently directed toward the same database, thereby improving the overall integrity and consistency of the application’s data.
Django
To implement read/write splitting in Django, we can make use of ThreadLocal, Middleware, and a DatabaseRouter. In this article, we will place these files in the CoreApp; if you wish to add them to another app, such as a common-app library, make sure to update the settings file accordingly. An example project could look something like this:
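The layout below is only a hypothetical sketch; the file and app names follow the prose in this section, and the repository linked at the end of this section may organize things differently.

my_app/
    my_app/
        settings.py      # databases, middleware, and router configuration
        urls.py
    coreapp/
        db_utils.py      # db_from_the_request, thread-local helpers, db_wrapper
        middleware.py    # DatabaseMiddleware
        router.py        # DatabaseRouter
    orders/
        models.py
        views.py
    manage.py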
In our example, we have included two databases: primary and secondary. The db_from_the_request function identifies the appropriate database based on the request object.
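A minimal sketch of this function, together with the thread-local helpers used by the router and middleware below, might look like the following. The routing rule (write HTTP methods go to the primary) and the module name coreapp/db_utils.py are assumptions; the linked repository contains the author's actual implementation.

# coreapp/db_utils.py (hypothetical module name)
import threading

PRIMARY = "primary"
SECONDARY = "secondary"

_thread_local = threading.local()


def db_from_the_request(request):
    # Route anything that can modify data to the primary database and
    # everything else to the secondary (read replica).
    if request.method in ("POST", "PUT", "PATCH", "DELETE"):
        return PRIMARY
    return SECONDARY


def set_current_db(alias):
    _thread_local.db_alias = alias


def get_current_db():
    # Returns None when nothing has been set, so the router can fall
    # back to the default database.
    return getattr(_thread_local, "db_alias", None)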
We’ll use a database router to route the traffic to the individual databases; this will be configured in the settings file later.
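A sketch of such a router, assuming the thread-local helpers above (the allow_relation/allow_migrate behavior is simplified for illustration):

# coreapp/router.py (hypothetical module name)
from coreapp.db_utils import get_current_db


class DatabaseRouter:
    def db_for_read(self, model, **hints):
        # Use whatever the middleware or decorator selected; returning
        # None lets Django fall back to the default database.
        return get_current_db()

    def db_for_write(self, model, **hints):
        return get_current_db()

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases point at the same logical database.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations only against writable databases.
        return db in ("default", "primary")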
The database middleware will locate the database and assign it to a thread-local object. To ensure that this middleware is invoked by Django, add the DatabaseMiddleware to the settings file. The db_wrapper can be applied to any method to enforce the selection of the database, and it can be used in worker or consumer methods to specify the database.
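A sketch of the middleware and the decorator, again assuming the helper module above:

# coreapp/middleware.py (hypothetical module name)
import functools

from coreapp.db_utils import db_from_the_request, set_current_db


class DatabaseMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Decide the database once per request and remember it in the
        # thread-local object so that the router can see it.
        set_current_db(db_from_the_request(request))
        try:
            return self.get_response(request)
        finally:
            set_current_db(None)


def db_wrapper(alias):
    # Decorator for worker/consumer methods that run outside the
    # request/response cycle.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            set_current_db(alias)
            try:
                return func(*args, **kwargs)
            finally:
                set_current_db(None)
        return wrapper
    return decorator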
The settings file needs to include the databases: default, primary, and secondary. We have included the default database so that, in the event the application has not set a database, it will use the default one. Additionally, we have added the DatabaseMiddleware to the middleware list, and we need to set DATABASE_ROUTERS so that Django can use the DatabaseRouter to identify the appropriate database based on the current context/request.
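The relevant settings entries could look like this; the hosts and credentials reuse the MySQL users created earlier, and the dotted module paths assume the hypothetical layout shown above.

# settings.py (only the parts relevant to read/write splitting)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_app_db",
        "USER": "my_app",
        "PASSWORD": "my_app_pass",
        "HOST": "localhost",
        "PORT": "3306",
    },
    "primary": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_app_db",
        "USER": "my_app",
        "PASSWORD": "my_app_pass",
        "HOST": "localhost",
        "PORT": "3306",
    },
    "secondary": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_app_db",
        "USER": "my_app_reader",
        "PASSWORD": "my_app_reader_pass",
        "HOST": "localhost",
        "PORT": "3306",
    },
}

MIDDLEWARE = [
    # ... Django's default middleware ...
    "coreapp.middleware.DatabaseMiddleware",
]

DATABASE_ROUTERS = ["coreapp.router.DatabaseRouter"]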
This setting will work for all the use cases; for worker/consumer methods, we can do something like this:
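A sketch of how the db_wrapper decorator could be used in a worker or consumer; the function names are illustrative only.

# workers.py (hypothetical worker/consumer module)
from coreapp.db_utils import PRIMARY, SECONDARY, db_wrapper


@db_wrapper(PRIMARY)
def process_order_message(message):
    # Runs on a consumer thread, so all queries inside go to the primary.
    ...


@db_wrapper(SECONDARY)
def generate_daily_report():
    # Read-only job that can safely be served by the secondary database.
    ...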
As for the view code, we do not have to make any changes to the view logic.
The complete code is available at https://github.com/sonus21/django-read-write-split
Go
In Go, we can use middleware and context to pass/set the DB for the entire request path. We’ll use the chi router to handle the HTTP requests. An example Go project structure is shown below.
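The layout is illustrative; the file names follow the prose in this section, but a real project may differ.

read-write-split/
    main.go              // wires the router, middleware, and databases
    database/
        database.go      // builds the *sql.DB objects and context helpers
    middleware/
        database.go      // chi middleware that puts the chosen DB in the context
    service/
        order.go         // order service that reads the DB from the context
    utils/
        utils.go         // picks the database name for a request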
Here we’re creating the sql.DB object based on the application configuration, in the database.go file. We’ll store the *sql.DB objects in the context, and they will be used in the service and middleware layers to route the database traffic.
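A minimal sketch of database.go under these assumptions (MySQL driver, DSNs supplied by the caller, and a plain context key); the real configuration handling may differ.

// database/database.go
package database

import (
    "context"
    "database/sql"

    _ "github.com/go-sql-driver/mysql" // registers the MySQL driver
)

type contextKey string

const dbKey contextKey = "db"

// DB holds one connection pool per role.
type DB struct {
    Primary   *sql.DB
    Secondary *sql.DB
}

// New opens the primary and secondary connection pools from their DSNs.
func New(primaryDSN, secondaryDSN string) (*DB, error) {
    primary, err := sql.Open("mysql", primaryDSN)
    if err != nil {
        return nil, err
    }
    secondary, err := sql.Open("mysql", secondaryDSN)
    if err != nil {
        return nil, err
    }
    return &DB{Primary: primary, Secondary: secondary}, nil
}

// WithDB stores the selected *sql.DB in the context.
func WithDB(ctx context.Context, db *sql.DB) context.Context {
    return context.WithValue(ctx, dbKey, db)
}

// FromContext returns the *sql.DB stored by the middleware.
func FromContext(ctx context.Context) *sql.DB {
    db, _ := ctx.Value(dbKey).(*sql.DB)
    return db
}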
A chi middleware is used to set the database in the request context, and a small database utils method (utils.go) is used to find the database name for a given request.
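A sketch of both pieces, assuming the database package above. The HTTP-method-based routing rule and the module path example.com/my_app are assumptions.

// utils/utils.go
package utils

import "net/http"

const (
    Primary   = "primary"
    Secondary = "secondary"
)

// DatabaseName decides which database a request should use based on
// its HTTP method.
func DatabaseName(r *http.Request) string {
    if r.Method == http.MethodGet || r.Method == http.MethodHead {
        return Secondary
    }
    return Primary
}

// middleware/database.go
package middleware

import (
    "net/http"

    "example.com/my_app/database" // hypothetical module path
    "example.com/my_app/utils"
)

// DatabaseSelector picks the primary or secondary pool per request and
// stores it in the request context for the downstream handlers/services.
func DatabaseSelector(db *database.DB) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            selected := db.Primary
            if utils.DatabaseName(r) == utils.Secondary {
                selected = db.Secondary
            }
            ctx := database.WithDB(r.Context(), selected)
            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}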
The order service uses the database from the request context that was set by the middleware.
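A sketch of such a service; the table and column names are illustrative.

// service/order.go
package service

import (
    "context"
    "database/sql"

    "example.com/my_app/database" // hypothetical module path
)

type Order struct {
    ID              int64
    MerchantID      string
    ExternalOrderID string
}

// FindMerchantOrder queries through whichever *sql.DB the middleware put
// into the context (primary on the write path, secondary otherwise).
func FindMerchantOrder(ctx context.Context, merchantID, externalOrderID string) (*Order, error) {
    db := database.FromContext(ctx)
    row := db.QueryRowContext(ctx,
        "SELECT id, merchant_id, external_order_id FROM orders WHERE merchant_id = ? AND external_order_id = ?",
        merchantID, externalOrderID)
    var o Order
    if err := row.Scan(&o.ID, &o.MerchantID, &o.ExternalOrderID); err != nil {
        if err == sql.ErrNoRows {
            return nil, nil // no such order
        }
        return nil, err
    }
    return &o, nil
}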
Finally, the main package ties everything together; the database package is initialized from the main method.
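A sketch of main.go under the same assumptions; the DSNs reuse the MySQL users created earlier in the article, and the HTTP handlers are left out.

// main.go
package main

import (
    "log"
    "net/http"

    "github.com/go-chi/chi/v5"

    "example.com/my_app/database" // hypothetical module path
    appmiddleware "example.com/my_app/middleware"
)

func main() {
    db, err := database.New(
        "my_app:my_app_pass@tcp(localhost:3306)/my_app_db",
        "my_app_reader:my_app_reader_pass@tcp(localhost:3306)/my_app_db",
    )
    if err != nil {
        log.Fatal(err)
    }

    r := chi.NewRouter()
    r.Use(appmiddleware.DatabaseSelector(db))
    // r.Post("/orders", createOrderHandler)     // handlers omitted for brevity
    // r.Get("/orders/{id}", orderDetailHandler)

    log.Fatal(http.ListenAndServe(":8080", r))
}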
Spring Boot
To achieve a read/write split in Spring Boot, we can utilize ThreadLocal, AOP, a database router, and an annotation. In the Spring Boot app, we’ll add the annotation either on the class, on the method, or on both; method-level annotations are preferred over class-level ones.
An annotation will guide the application to select an appropriate database.
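A sketch of such an annotation; the package name com.example.app, the value element, and its default are assumptions, and the linked repository may define it differently.

// Database.java
package com.example.app;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target({ElementType.TYPE, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
public @interface Database {
    // Name of the target database, e.g. "primary" or "secondary".
    String value() default "primary";
}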
In the AOP handler, we’re going to add the Around annotation to execute our custom logic.
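A sketch of the aspect under the same assumptions; it resolves the method-level annotation first, falls back to the class-level one, and sets and clears the thread-local database context around the call.

// DatabaseAspect.java
package com.example.app;

import java.lang.reflect.Method;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.reflect.MethodSignature;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class DatabaseAspect {

    @Around("@within(com.example.app.Database) || @annotation(com.example.app.Database)")
    public Object selectDatabase(ProceedingJoinPoint joinPoint) throws Throwable {
        Method method = ((MethodSignature) joinPoint.getSignature()).getMethod();

        // Method-level annotation wins over the class-level one.
        Database annotation = method.getAnnotation(Database.class);
        if (annotation == null) {
            annotation = joinPoint.getTarget().getClass().getAnnotation(Database.class);
        }

        DatabaseContext.set(annotation.value());
        try {
            return joinPoint.proceed();
        } finally {
            DatabaseContext.clear();
        }
    }
}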
DatabaseContext stores the current database context in a thread-local object.
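A minimal sketch of that holder:

// DatabaseContext.java
package com.example.app;

public final class DatabaseContext {

    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private DatabaseContext() {
    }

    public static void set(String database) {
        CURRENT.set(database);
    }

    public static String get() {
        return CURRENT.get();
    }

    public static void clear() {
        CURRENT.remove();
    }
}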
A database router that will provide a look-up key based on the database context.
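In Spring this can be built on top of AbstractRoutingDataSource; a sketch under the same assumptions:

// DatabaseRouter.java
package com.example.app;

import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

public class DatabaseRouter extends AbstractRoutingDataSource {

    @Override
    protected Object determineCurrentLookupKey() {
        // Returning null makes Spring use the default target data source.
        return DatabaseContext.get();
    }
}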
The SqlDatabaseConfiguration class configures the data sources for the two instances based on the app configuration, and these are utilized in the DatabaseRouter. A default data source is also established, to be used when no database context has been set; for instance, if a thread pool runs code without passing a database context, it will fall back to the default database.
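A sketch of such a configuration class; the property prefixes spring.datasource.primary and spring.datasource.secondary are assumptions, and the actual class in the linked repository may differ.

// SqlDatabaseConfiguration.java
package com.example.app;

import java.util.HashMap;
import java.util.Map;

import javax.sql.DataSource;

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class SqlDatabaseConfiguration {

    // Bound from spring.datasource.primary.* in the app configuration.
    @Bean
    @ConfigurationProperties("spring.datasource.primary")
    public DataSource primaryDataSource() {
        return DataSourceBuilder.create().build();
    }

    // Bound from spring.datasource.secondary.*.
    @Bean
    @ConfigurationProperties("spring.datasource.secondary")
    public DataSource secondaryDataSource() {
        return DataSourceBuilder.create().build();
    }

    // The routing data source that all database access goes through.
    @Bean
    @Primary
    public DataSource dataSource() {
        DatabaseRouter router = new DatabaseRouter();
        Map<Object, Object> targets = new HashMap<>();
        targets.put("primary", primaryDataSource());
        targets.put("secondary", secondaryDataSource());
        router.setTargetDataSources(targets);
        // Used when no database context has been set (e.g. thread pools).
        router.setDefaultTargetDataSource(primaryDataSource());
        return router;
    }
}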
The Controller class employs the Database annotation to set the primary and secondary databases. The secondary database is set at the controller (class) level, while the primary database is specified on the createOrder method. Method-level annotations take precedence over class-level annotations, and using a class-level annotation eliminates the need to add the same annotation multiple times.
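A sketch of what such a controller could look like; the request/response types, the mappings, and OrderService are hypothetical.

// OrderController.java
package com.example.app;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
@Database("secondary") // class level: reads default to the secondary
public class OrderController {

    private final OrderService orderService; // hypothetical service

    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    @PostMapping("/orders")
    @Database("primary") // method level wins: writes must use the primary
    public Order createOrder(@RequestBody CreateOrderRequest request) {
        return orderService.createOrder(request);
    }

    @GetMapping("/orders/{id}")
    public Order orderDetail(@PathVariable("id") long id) {
        // No method-level annotation, so the class-level "secondary" applies.
        return orderService.findOrder(id);
    }
}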
The complete Code is available at https://github.com/sonus21/spring-read-write-split
There’s another technique known as CQRS (Command Query Responsibility Segregation); that will be the topic of a future article.
If you found this post helpful, please share, follow, and give a thumbs up.