Cloud Native – Seeing through the hype

Is “Cloud Native” just another overhyped, marketing-driven buzzword, or is there real meaning and importance behind the term?

You’ve probably heard at least a dozen people say the words “Cloud Native” in the last week – in conversation or in a blog post. And you probably have some picture in your head of what that means. But do you really, truly know what Cloud Native computing is?

Because Cloud Native is more than just a buzzword. It’s a complete methodology – of design, of technology choices, of architecture – and of operations. To better understand how this is so different, let’s quickly review how we arrived at the current state of Cloud Native.

Cloud Native Evolution

Traditionally, software applications were developed by a single project team working in one shared codebase and deployed as a single application. These monolithic applications were hard to maintain – changes had to be tightly managed because they were risky, with many potential side effects. Updates, new features, and changes were massively slowed down by this approach and architecture.

Agile came to the rescue, replacing these waterfall methodologies. Shorter release cycles and smaller teams were used to gain speed. The manual process of building applications from the codebase was automated (Continuous Integration), and a whole craft grew up around automating everything from code check-in to production deployment (Continuous Delivery). As a consequence, Development and Operations had to work closely together and establish a culture of shared responsibility (DevOps).

Agile was a great leap forward, but teams found that they were still slowed down by a shared codebase and monolithic architectures. This is why companies like Netflix established new architecture patterns now known as microservices. Microservices are small, independent components that can be developed by a small team without having to sync with other teams. They can also be deployed independently, so changes can be made without impacting other parts of an application. Finally, they are easier to scale to meet the rising demand of the internet and mobile age.

On the infrastructure side, two trends have helped teams become more agile and fast – Cloud and Containers.

The cloud made it fast and easy to provision servers and storage with a few mouse clicks. Cloud infrastructure and platforms allow users to operate applications without maintaining all the underlying software components.

Containers were the next revolution in infrastructure – small, isolated environments that make it easy to deploy and scale microservices, orchestrated by schedulers like Kubernetes that manage the containers at runtime. Serverless is the next level of microservices, where the infrastructure is abstracted away entirely from the execution of microservice functions.
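To make the container-plus-scheduler idea concrete, here is a minimal sketch of a Kubernetes Deployment manifest. The service name and image are hypothetical; the point is that you declare a desired state (three replicas) and the scheduler keeps that many containers running, restarting or rescheduling them as needed.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service            # hypothetical microservice name
spec:
  replicas: 3                      # the scheduler maintains three running containers
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
      - name: payment-service
        image: example/payment-service:1.4.2   # hypothetical container image
        ports:
        - containerPort: 8080
```

Scaling the service up or down is then just a change to `replicas` – no servers to provision by hand.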

This is Cloud Native

[Figure: cloud-native-diagram]

This new way of developing and deploying software is called Cloud Native. It combines the process and methodologies of Agile, DevOps and Continuous Delivery with the architecture and technologies of Microservices, Cloud, Containers and Serverless.

Companies that want to perform in the new digitized world have to adopt Cloud Native to release software more frequently. Cloud Native gives companies a “superpower”: delivering software faster, at higher scale, and at potentially lower cost! As with most superpowers, Cloud Native also comes with some nasty side effects: higher complexity and the pain of cultural change. Therefore, we want to examine the complexity of Cloud Native by taking a deeper look into monitoring, since complexity can only be managed with sufficient visibility.

Monitoring is essential to successfully managing applications in Cloud Native environments. Deployments are more frequent, with changes happening multiple times per day or hour. Every change has the potential to introduce new errors or performance issues, and these must be identified quickly to repair them either by applying a fix or by rolling back to the last working version of the microservice. Microservices bring a new level of scale, and operating these types of applications introduces greater complexity and dynamism, as schedulers can start and stop containers on demand, depending on the workload.

These new challenges require a fresh approach to performance monitoring and problem detection. The core concepts, which we will explain in detail, are:

1. Continuous Monitoring
Monitoring must be automated in the same way integration, testing, and deployment have been highly automated. In highly dynamic and scaled environments, the process of monitoring microservices must adapt to changes without manual interaction or configuration.

2. AI Driven Incident Prediction and Root Cause Analysis
The complexity of microservice applications is at a level where even the best expert cannot understand and map all the dependencies and constant change. Modern machine learning and data analytics must come to the rescue.

3. Automatic Monitoring vs. Monitoring as Code
As developers have become a leading stakeholder in DevOps-motivated organizations, monitoring has been added as code, much as configuration and automation are added as code. The concept of monitoring as code grew out of necessity, as there were no good commercial tools that could automate monitoring in rapidly changing applications. However, fully automatic monitoring for these rapidly changing applications is now readily available (in fact, that is exactly what Instana does), so the practice of monitoring as code should be minimized, as it does not scale.

Monitoring Cloud Native applications requires collecting business metrics, telemetry data, and distributed application traces, then correlating and analyzing them in near real-time. This ensures that the impact on the business can be observed whenever a change happens, and that time to resolution is reduced to a minimum because the root cause is identified using intelligent machine learning algorithms and tactics.
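As a toy illustration of trace correlation, the sketch below groups simplified trace spans by service and flags any service whose mean latency exceeds a threshold. The data model, service names, and threshold are all hypothetical – real tracing systems carry far richer span data – but the grouping-and-aggregation idea is the same.

```python
from collections import defaultdict
from statistics import mean

# Simplified, hypothetical trace spans: (trace_id, service, duration_ms)
spans = [
    ("t1", "frontend", 120), ("t1", "checkout", 95),  ("t1", "payments", 80),
    ("t2", "frontend", 130), ("t2", "checkout", 400), ("t2", "payments", 85),
    ("t3", "frontend", 125), ("t3", "checkout", 410), ("t3", "payments", 90),
]

def slow_services(spans, threshold_ms=200):
    """Group span durations by service and flag services whose
    mean latency exceeds a (hypothetical) threshold."""
    by_service = defaultdict(list)
    for _trace_id, service, duration in spans:
        by_service[service].append(duration)
    return {s: mean(d) for s, d in by_service.items() if mean(d) > threshold_ms}

print(slow_services(spans))  # only "checkout" exceeds the threshold here
```

In production this analysis runs continuously over live trace streams, so a regression introduced by a deployment surfaces within minutes rather than after users complain.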

Why Traditional Tools Struggle to Monitor Cloud Native Applications

Traditional monitoring strategies DO NOT WORK with Cloud Native applications for the following reasons:

  • Manual instrumentation is not efficient
    • Writing tracing/monitoring as code takes real effort and time – this works against rapid release strategy
    • Frequent CI/CD releases make it difficult to keep manual monitoring accurate and up to date
    • Having to write monitoring as code slows down release cycles
  • Humans can’t keep up with constant change to re-configure monitoring
  • Manual health rules rapidly go stale – because of CI/CD, and especially in orchestrated environments (Kubernetes, OpenShift, Mesos, Swarm), the rules break down and no longer correlate properly.

Cloud Native Maturity

As the 2017 State of DevOps Report outlines, companies should measure their IT performance using four metrics:

  • Deployment frequency
  • Lead time for changes
  • Mean Time to Recover (MTTR)
  • Change failure rate
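These four measures are straightforward to compute once deployment data is captured. The sketch below works through the arithmetic on a few hypothetical deployment records (timestamps, a failure flag, and recovery time are all made up for illustration):

```python
from datetime import datetime

# Hypothetical records: (deployed_at, commit_at, failed, recovery_minutes)
deployments = [
    (datetime(2024, 1, 1, 10), datetime(2024, 1, 1, 8),  False, 0),
    (datetime(2024, 1, 2, 14), datetime(2024, 1, 2, 9),  True,  45),
    (datetime(2024, 1, 3, 11), datetime(2024, 1, 3, 10), False, 0),
    (datetime(2024, 1, 4, 16), datetime(2024, 1, 4, 12), True,  30),
]

days = 4
# Deployment frequency: deploys per day over the window
deployment_frequency = len(deployments) / days
# Lead time for changes: average hours from commit to deployment
lead_time = sum((d - c).total_seconds() / 3600
                for d, c, _, _ in deployments) / len(deployments)
# Change failure rate: share of deployments that caused a failure
failures = [r for _, _, failed, r in deployments if failed]
change_failure_rate = len(failures) / len(deployments)
# MTTR: average minutes to recover from failed changes
mttr = sum(failures) / len(failures) if failures else 0.0

print(deployment_frequency, lead_time, change_failure_rate, mttr)
```

The hard part is not the arithmetic but collecting the underlying events reliably – which is exactly where an automated monitoring pipeline earns its keep.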

Only with a modern application monitoring strategy that can handle the rate of change in these dynamic applications will companies be able to measure and improve these metrics to reach the required excellence and speed. Companies like Netflix and Uber, which are thought leaders in Cloud Native software development, have already adopted new and modern monitoring strategies – including automation, machine learning, and distributed tracing – to achieve their excellence in development and operations.

Instana’s mission is to enable organizations to achieve Cloud Native excellence by providing automated monitoring that delivers actionable information to everyone involved with application delivery and support.