“We deploy multiple times per day” is a new badge of honor at companies across the world. But what you don’t often hear about are the problems caused by moving so fast as a result of continuous integration and continuous deployment (CI/CD).
Problems Caused By CI/CD
It wasn’t long ago that such rapid change was reserved for an elite few disruptive companies. It’s generally accepted that to succeed with continuous deployment you need monitoring and observability in both test and production. As more companies adopt this rapid release approach, they’re learning that the continuous change generated by CI/CD brings some unique problems:
- Rapid introduction of performance problems and errors
- Rapid introduction of new endpoints causing monitoring issues
- Lengthy root cause analysis as the number of services expands
Why Does CI/CD Cause Application Issues?
As DevOps teams shift to a continuous deployment model, there is a significant increase in the amount of dynamism in production: more software is updated or added, more frequently, on more infrastructure. That’s great for solving business problems but hard on monitoring tools that were designed for a much slower rate of change. Most monitoring tools, including open source tools, employ many, if not ALL, of the following manual processes:
- Manually configure data consumers and producers, and instrument code for tracing
- Manually configure service discovery, map dependencies, and decide how to correlate data
- Manually configure and define alerting rules, thresholds, and policies
- Manually build dashboards to visualize correlation and transform data into service quality KPIs
When you implement CI/CD, you realize there can’t be ANY manual intervention at any step or the entire pipeline slows down. Manual monitoring drags down your deployment pipeline and increases the risk of performance problems propagating into production.
An example continuous deployment scenario
- Source Code Management
- Automated Build
- Automated Unit Test
- Automated Integration Test
- Automated Deployment
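The stages above can be sketched as a minimal pipeline driver in which every stage is automated and the first failing stage halts everything. The stage functions here are illustrative stand-ins, not any particular CI tool’s API.

```python
# Minimal sketch of an automated pipeline: each stage is a function,
# and the first failure halts the run -- no manual step anywhere.
# Every stage body here is a hypothetical stand-in.

def build():            return True  # e.g. invoke Maven or NPM
def unit_test():        return True  # e.g. run the JUnit suite
def integration_test(): return True  # check errors AND performance
def deploy():           return True  # e.g. trigger Chef/Puppet/Ansible

PIPELINE = [build, unit_test, integration_test, deploy]

def run_pipeline():
    for stage in PIPELINE:
        if not stage():
            print(f"halted at stage: {stage.__name__}")
            return False
    print("deployed to production")
    return True

run_pipeline()
```

In a real pipeline each stage would shell out to the relevant tool; the point is only that a boolean pass/fail from each automated stage is what keeps the flow moving without human gatekeeping.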
Developers are always under tight deadlines. The business wants its new functionality and bug fixes right now. Taking time to manually decorate code for monitoring is not the highest priority and, in many cases, a complete waste of time, since automated solutions will do this and more after deployment.
After a developer commits changes to the code repository, their CI tool (Jenkins, Bamboo) detects the commit and instructs the automated build tool (Maven, NPM) to build the software, also initiating unit testing. Typical unit tests (JUnit, Unity) look for failures, not performance, at this point. Assuming the unit tests pass, the build tooling then integrates the new code for a more complete integration test. This is the point where you can look for errors AND performance issues in the changes. If there are failures, whether from errors or poor performance, the developer is notified and must figure out what went wrong.
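An integration-stage check that gates on both correctness and performance might look like the following sketch; the `checkout` function and the 2-second latency budget are hypothetical.

```python
import time

def checkout(cart):
    """Hypothetical service call under test."""
    time.sleep(0.01)  # stand-in for real work
    return {"status": "ok", "items": len(cart)}

def test_checkout_correct_and_fast():
    start = time.perf_counter()
    result = checkout(["book", "pen"])
    elapsed = time.perf_counter() - start
    # Correctness check, as a unit test would do:
    assert result["status"] == "ok"
    # Performance check, which the integration stage adds:
    assert elapsed < 2.0, f"checkout took {elapsed:.2f}s, budget is 2s"

test_checkout_correct_and_fast()
```

A test like this slots into the same runner as the functional tests (pytest would collect it by name), so a performance regression fails the build exactly the way a logic error does.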
This is a critical juncture where having proper monitoring in place can save significant time. Consider the difference between being told that the checkout function was too slow, having taken 5 seconds to complete, and being told that it was too slow, having taken 5 seconds, because one of your revised data queries now takes 4 seconds to return. The latter provides a much faster path to fixing the problematic code.
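One way to get that second, more useful answer is to time every sub-call, so the trace shows where the 5 seconds actually went. A minimal sketch, with illustrative function names and sleeps standing in for real work:

```python
import functools
import time

TIMINGS = {}

def traced(fn):
    """Record each call's duration so slow sub-calls stand out."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            TIMINGS[fn.__name__] = time.perf_counter() - start
    return wrapper

@traced
def fetch_prices():      # the revised data query that got slow
    time.sleep(0.04)

@traced
def render_receipt():    # the rest of the checkout work
    time.sleep(0.01)

def checkout():
    fetch_prices()
    render_receipt()

checkout()
slowest = max(TIMINGS, key=TIMINGS.get)
print(f"slowest call: {slowest}")  # → slowest call: fetch_prices
```

Real tracing tools do this automatically across process boundaries; the sketch only shows why per-call timing turns “checkout is slow” into “this query is slow.”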
Getting to this level of detail can be non-trivial. If you have to manually add monitoring to infrastructure components or add instrumentation to trace your transactions’ calls, you will delay the integration testing stage and potentially introduce bugs or hurt performance.
Now let’s assume that new code has been checked in and we’ve passed the integration testing phase. It’s time for automated deployment to production (Chef, Puppet, Ansible). The latest version is deployed and monitoring must be applied.
Manually adding monitoring at this point is slow and introduces the risk that a newly introduced performance defect will impact users and, worse, revenue.
Now imagine repeating this entire process many times per day for each new code push!
Additional Monitoring Complications Caused By Kubernetes (and Other Orchestration Tools)
As if continuous deployment wasn’t complicated enough, companies are now starting to use Kubernetes at the core of their deployment pipelines. Kubernetes is a great tool that adds significant deployment capabilities, but as was recently discussed, Kubernetes is COMPLEX. Using a complex orchestration tool in the CI/CD pipeline adds another layer to the stack that must be monitored.
Automatic Monitoring – The Solution to CI/CD Issues
Monitoring must be automated in the same way integration, testing, and deployment have become automated. In highly dynamic and scaled environments, the process of monitoring microservices must adapt to changes without manual intervention and configuration.
Monitoring automation must be applied to infrastructure, application, and orchestration technologies. The critical capabilities are as follows:
- Automatic discovery and correlation of full stack and dependencies
- Automatic instrumentation and monitoring of application components
- Automatic visualization of all components and their dependencies
- Automatic alerting of infrastructure and application performance problems
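As a rough illustration of the automatic alerting capability, a monitor can derive its threshold from a recent baseline rather than a hand-configured rule. The 3-standard-deviation cutoff used here is one common, illustrative choice, not a prescription.

```python
import statistics

def auto_alert(baseline, latest, sigmas=3.0):
    """Alert if `latest` deviates more than `sigmas` standard
    deviations from the baseline mean -- no hand-tuned threshold."""
    mean = statistics.fmean(baseline)
    stdev = statistics.pstdev(baseline)
    return abs(latest - mean) > sigmas * stdev

baseline_ms = [102, 98, 101, 99, 100, 103, 97, 100]  # recent latencies
print(auto_alert(baseline_ms, 100))  # → False (normal sample)
print(auto_alert(baseline_ms, 450))  # → True  (regression, fire alert)
```

Because the threshold is computed from observed behavior, it follows each newly deployed version without anyone editing an alerting rule, which is the property that matters in a many-deploys-per-day pipeline.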
Automated continuous monitoring will keep your continuous deployment pipelines flowing smoothly and efficiently. You’ll be able to deploy faster with confidence; confidence that you will know immediately if a performance regression has been introduced; confidence that your full infrastructure and application stack is monitored; confidence that you’ll be able to quickly resolve any incidents that arise over time.