At the beginning of week 35, we observed slower-than-usual performance in our web services. We started investigating and soon determined that the root cause was twofold: increasing usage of our services, combined with the ever-growing amount of data in our database, left the system unable to keep up with demand.
For all of last week, our development and operations teams have been working hard on multiple improvements to reduce response times and raise the overall performance of our web services. The following mitigating actions have been implemented so far:
- Code changes have been implemented to optimize several database queries that were found to be inefficient. The same operations are performed as before, only faster (see the sketch after this list).
- Unnecessary data has been cleaned up or removed from the database. Decreasing the overall size of the database frees resources for core system functionality.
- Horizontal scaling has been introduced at the database layer: multiple servers now share the workload, so system performance is less likely to be affected by individual clients or users.
- We have deployed new performance monitoring software that gives us fine-grained analysis of performance and system workloads. These tools let our operations team not only monitor the system but also proactively detect and mitigate anomalies before they become problems.
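To give a sense of the query optimization mentioned above, here is a minimal, hypothetical sketch; it is not our actual code, and the table names are invented for illustration. A classic improvement of this kind is replacing an N+1 query pattern, where one query is issued per row, with a single joined query:

```python
import sqlite3

# Hypothetical schema and data; the real tables and queries are not shown here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE registrations (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES events(id),
        attendee TEXT
    );
    INSERT INTO events VALUES (1, 'Launch'), (2, 'Workshop');
    INSERT INTO registrations VALUES (1, 1, 'Alice'), (2, 1, 'Bob'), (3, 2, 'Carol');
""")

# Slow pattern: one extra query per event (N+1 round trips to the database).
def attendees_slow(conn):
    result = {}
    for (event_id, name) in conn.execute("SELECT id, name FROM events"):
        rows = conn.execute(
            "SELECT attendee FROM registrations WHERE event_id = ?", (event_id,)
        ).fetchall()
        result[name] = [r[0] for r in rows]
    return result

# Faster pattern: a single joined query, grouped in application code.
def attendees_fast(conn):
    result = {}
    query = """
        SELECT e.name, r.attendee
        FROM events e JOIN registrations r ON r.event_id = e.id
    """
    for (name, attendee) in conn.execute(query):
        result.setdefault(name, []).append(attendee)
    return result

# Both produce the same result for this data set.
assert attendees_slow(conn) == attendees_fast(conn)
```

Changes of this general flavor reduce the number of round trips to the database, which matters most under heavy load.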
We are confident that these actions will help us avoid similar incidents in the future. If you have any further questions regarding this incident, do not hesitate to contact our customer support.
Lyyti Operations Team