Why You Need A Separate System for Analytics
TL;DR Separating Analytic and Transactional workloads has positive impact on system usability.
The production transactional (OLTP) database is the lifeblood of software systems, and it houses all the real-time transactional data that business operations depend on. It is a critical aspect of your data infrastructure that has significant implications for the performance, stability, and security of our production software systems.
Any undue strain on the OLTP database can result in slow response times for critical applications, downtime, and an increased risk of data corruption/loss.
In this article, references to analytic workloads encompass: Reporting, Business Intelligence (BI), and Machine Learning (ML).
Operational Risks
Several operational risks are introduced by commingling analytic workloads on the OLTP database.
- Performance Impact: Analytic workloads are complex and resource intensive. They consume significant CPU, memory, and disk I/O. Running them directly on the production OLTP database will consume resources required for essential business operations, and cause bottlenecks that negatively affect system performance and user experience.
- Workload Execution and Scheduling: One of the key features of analytics systems is the ability for end users to execute and schedule workloads. If these workloads are executed or scheduled to execute without proper coordination for business operations, it will create performance degradation.
- Downtime Risk: If not properly architected, resource-intensive analytic workloads can create performance degradation to the extent that it results in outages. Also, when the analytics workloads are tied to the production database, whenever the production database is down for patching or upgrades the analytics workloads must also be down.
- Inefficient Cost Allocation: Commingling analytic workloads on a production OLTP database prevents the ability to track the cost of these resources independently.
- Security: Separation of concerns is best practice from a security perspective. It enables the ability to secure the systems independently, and separate sets of users into roles/groups based on use case. Analytic users would not require access to the production OLTP database, so it could have a higher level of restriction.
Benefits of Workload Separation
It is common practice in database management to maintain a dedicated read-only system for analytic purposes separate from the OLTP database. This separation would eliminate the risks described above, and create the following advantages:
- Performance Isolation: A separate analytics system can handle analytics (reporting, BI, and ML) workloads, ensuring that the production database remains responsive for transactional operations.
- Enhanced Scalability: As analytics needs grow, you can scale the analytics system infrastructure independently, without affecting production database performance.
- Data Integrity: The analytics system is a read-only copy, minimizing the risk of data corruption due to performance degradation.
- Improved Security: Access to the analytics system can be tightly controlled and restricted to authorized users and applications, reducing the risk of unauthorized access or data breaches on the production OLTP database.
- Disaster Recovery: In case of a catastrophic failure in the production database the analytics system can serve as a backup, helping recover critical data.
- Improved Availability (for Analytics): When your source system is down for maintenance (upgrades, patching, etc.), the analytics system can remain up. The enables analytics workloads to run, and users to perform analytics job functions while the source system is down.
- Analytics Workload Independence: When you separate the analytics system from the production database you divorce the analytics systems from the production systems. If there is a need to change production databases (platforms, vendors, architectures, etc.) there is no requirement to rebuild the analytics systems. There may be additional re-integration work if the production system changes, but the analytics system will largely remain the same.
Adoption of a dedicated system for analytic purposes will help us ensure the stability, performance, and security of the production OLTP database. It will allow analytics teams to access the data they need without compromising the core operations of the business.
Separating analytic and transactional workloads is an opportunity to improve performance, rather than degrading it.