We are India’s leading payment and financial services platform. As pioneers of mobile payments and QR technology, we serve 84 million monthly transacting users (MTU) making billions of transactions, and we support over 28.3 million merchants. Millions of Indians rely on us every day to manage their finances online and pay for everything from groceries to utilities to movie tickets.
Our ecosystem is supported by a few thousand microservices that use multiple Redis clusters at scale. These Redis clusters serve a few hundred thousand transactions per second (TPS) over a keyspace of a few hundred million keys. We provide an enriching payments experience to the customer across multiple payment instruments. We are on a mission to bring half a billion Indians into the mainstream economy through payments, commerce, banking, investments, and financial services.
However, managing, upgrading, and performance-tuning such large-scale clusters while eliminating outages is a big challenge. We found no open-source or enterprise tool that can create, manage, and update clusters while taking care of security, audit, and compliance.
We recognized the need to design an in-house, self-diagnostic, and self-healing Redis Manager that can handle such high workloads while taking care of availability and security for our ever-growing customer base.
Redis Manager for Paytm
As described, we operate at scale and handle many business offerings, so we need a cache layer. Redis is our main cache layer: we have a few hundred AWS accounts, and most of them run multiple Redis clusters. Redis being open source was one of the reasons we selected it.
Redis is an in-memory data structure store. It can be used as a key-value store, a cache, a message broker, and more. Redis supports multiple data structures, e.g. strings, lists, hashes, sets, sorted sets, bitmaps, and HyperLogLogs. It is popular for many use cases, e.g. caching, session storage, gaming, leaderboards, and real-time analytics, and client libraries exist for most leading programming languages, such as Python, Java, PHP, Go, and Node.js.
Managing so many Redis clusters across accounts is a tedious task, and we could not find any open-source Redis manager, so our DevOps team designed and built one in-house from scratch. Because Redis executes commands largely on a single thread, a related task was to move our systems to AWS Graviton for more computing power and lower cost.
Redis Cluster is the best way for us to scale our system horizontally, with data sharded across nodes. A cluster can have multiple nodes, so it’s important to set it up following industry best practices and to manage it at run time.
We have designed an internal tool that helps teams create and manage Redis clusters easily. It’s an in-house solution written entirely in Python and ReactJS, using the Python redis module.
Features of Paytm Redis Manager
Easy Redis cluster setup with best practices – Ability to set up homogeneous Redis clusters that follow recommended industry best practices and can serve high-scale traffic.
Highly available – Redis Manager takes care of the availability of the Redis cluster by making sure that the right configs are baked in.
All in one place – A single place to update configs in any cluster across accounts.
Runtime changes – Redis Manager makes it easy to change settings at run time without interrupting operations or affecting performance.
Fine-grained monitoring dashboard – Redis Manager provides fine-grained monitoring dashboards and integrates simply with Prometheus and Grafana.
Cluster failover – Failover is the most important feature of any distributed system. Redis Manager lets users perform all three types of failover without impacting traffic.
Manage existing clusters – Ability to bring legacy Redis clusters under Redis Manager with a few simple clicks, in less than 60 seconds.
Redis Manager also offers a benchmarking feature to test a cluster before it goes live, and it can add or remove nodes at any time, protect clusters with a password, and GET or delete keys. We are planning to include many more features.
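To make the "single place to update configs" and "runtime changes" features more concrete, here is a minimal sketch of fanning a CONFIG SET out to every node of a cluster and reading the value back. It is not the actual Redis Manager code. The names `ConfigClient`, `apply_config`, and `FakeNode` are hypothetical and introduced only for illustration. The sketch assumes each node is reached through a client that exposes `config_set` and `config_get` and returns string values, as redis-py's `Redis` class does.

```python
from typing import Dict, Protocol


class ConfigClient(Protocol):
    """Subset of a Redis client used by this sketch (hypothetical interface)."""

    def config_set(self, name: str, value: str) -> bool: ...
    def config_get(self, pattern: str) -> Dict[str, str]: ...


def apply_config(nodes: Dict[str, ConfigClient], name: str, value: str) -> Dict[str, bool]:
    """Run CONFIG SET on every node and report whether each node now holds the value."""
    results = {}
    for node_id, client in nodes.items():
        client.config_set(name, value)
        # Read the setting back so a silently ignored change is caught.
        results[node_id] = client.config_get(name).get(name) == value
    return results


class FakeNode:
    """In-memory stand-in for a node connection, used only to exercise the sketch."""

    def __init__(self) -> None:
        self.settings: Dict[str, str] = {}

    def config_set(self, name: str, value: str) -> bool:
        self.settings[name] = value
        return True

    def config_get(self, pattern: str) -> Dict[str, str]:
        # Exact-name lookup only; real CONFIG GET also accepts glob patterns.
        return {k: v for k, v in self.settings.items() if k == pattern}


nodes = {"node-a": FakeNode(), "node-b": FakeNode()}
print(apply_config(nodes, "maxmemory-policy", "allkeys-lru"))
# {'node-a': True, 'node-b': True}
```

Against a live cluster, the `FakeNode` objects would be replaced by one real client connection per node. The read-back step is a design choice in this sketch: it makes a partially applied change visible instead of assuming every node accepted it.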
Redis Operation Modes
Redis can be operated in multiple modes: standalone master/slave, Sentinel, and Redis Cluster.
Use Redis Cluster to scale your system
Redis scales horizontally with a deployment topology called Redis Cluster. Redis Cluster provides a way to run a Redis installation where data is automatically sharded across multiple Redis nodes. Redis Cluster also provides some degree of availability during partitions—in practical terms, the ability to continue operations when some nodes fail or are unable to communicate.
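The automatic sharding works by mapping every key to one of 16384 hash slots, with each node owning a subset of the slots. Per the Redis Cluster specification, the slot is the CRC16 of the key modulo 16384. If the key contains a non-empty `{...}` hash tag, only the text inside the first such tag is hashed. The sketch below reproduces that mapping in plain Python. The names `key_slot` and `HASH_SLOTS` are illustrative, and the standard library's `binascii.crc_hqx` stands in for the CRC16 variant Redis uses.

```python
from binascii import crc_hqx

HASH_SLOTS = 16384  # fixed number of hash slots in a Redis Cluster


def key_slot(key: bytes) -> int:
    """Return the hash slot a key maps to, honouring hash tags."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Hash only the tag content, and only when the tag is non-empty.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc_hqx(key, 0) % HASH_SLOTS


print(key_slot(b"foo"))  # 12182

# Keys sharing a hash tag land in the same slot, so they live on the same node.
print(key_slot(b"{user1000}.following") == key_slot(b"{user1000}.followers"))  # True
```

Hash tags matter in practice because multi-key operations in a cluster require all of the keys involved to be in the same slot.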