In my previous blog post we talked about how the world is changing around us. Kubernetes is everywhere, and more and more companies have already started reworking their applications to take advantage of its orchestration features.
I also mentioned that, from my perspective, it is important for developers to understand the basics of Kubernetes. I understand this is a controversial opinion; however, I want to take the chance to clarify why I think that way.
Face Your Challenge
Building scalable and fault-tolerant systems is only one part of the story; the other is deploying services in a way that achieves those goals. To support scalability and reliability we could deploy multiple instances on a single dedicated machine, on multiple VMs, or in lots of containers. No matter how we try to reach our target, we have to make it efficient.
Efficient in the sense that every single service will have peak times and, you guessed it, slow times. Services have to be designed to scale up and down depending on current needs. Scaling down in particular is often forgotten when developers think about scalability, but we also have to be efficient in terms of cost. Believe me, your CTO will love you for that.
To achieve that, I see three specific challenges, which directly correspond to what, as a developer, I need to know about Kubernetes.
Services, Networking and Security
The first challenge is services, networking and security. It basically relates to how microservice applications are laid out.
As already mentioned, services should be able to scale up and down dynamically based on the expected or current load. Scaling requires that services keep as little internal state as possible, since other service instances might have to take over at almost any point in time.
Apart from that, another challenge in modern microservice-style applications is that a lot more communication happens than in the past. From a user’s perspective there’s still a single call to the backend, but the backend is now split up into many services that together serve this one request.
That said, we’re facing additional latency for every single call between the different services: each extra network hop adds a bit to the overall response time, and those latencies vary based on many factors. Another important fact is that a lot of environments like Kubernetes use virtual networks, which look like a local LAN but might be spread out not only over different physical network segments but even over multiple data centers.
To mitigate the increasing latency, many move to a more asynchronous architecture, which adds additional complexity.
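One small flavor of "more asynchronous" is issuing independent backend calls concurrently instead of one after another. The sketch below only simulates that idea: the service names and latencies are made up, and asyncio.sleep stands in for real network I/O. It is meant to show why the total time moves from roughly the sum of all latencies toward the slowest single call, not how any particular system does it.

```python
import asyncio
import time

# Hypothetical downstream services with simulated latencies in seconds.
SIMULATED_LATENCY = {"accounts": 0.05, "orders": 0.08, "recommendations": 0.03}


async def call_service(name: str) -> str:
    """Stand-in for a network call; sleeps instead of doing real I/O."""
    await asyncio.sleep(SIMULATED_LATENCY[name])
    return f"{name}: ok"


async def sequential() -> list:
    # Each call waits for the previous one, so the latencies add up.
    return [await call_service(name) for name in SIMULATED_LATENCY]


async def concurrent() -> list:
    # All calls are in flight at once, so the slowest call dominates.
    return await asyncio.gather(*(call_service(name) for name in SIMULATED_LATENCY))


def timed(coro_fn):
    """Run a coroutine function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (sequential, concurrent):
        result, elapsed = timed(fn)
        print(f"{fn.__name__}: {elapsed:.3f}s {result}")
```

The price is the added complexity mentioned above: error handling, timeouts and partial results all get harder once calls overlap.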
Furthermore, we now have to find out how and where to call a specific backend service. In dynamic environments with multiple instances of the same service and automatic scaling up and down, service endpoints change too. That means we need to find the currently available endpoints, which is commonly known as service discovery and heavily depends on the underlying infrastructure that developers build on. In Kubernetes we commonly use internal domain names consisting of the service name, the namespace and an internal domain identifier; other systems offer REST APIs or similar solutions. Finding services from within our application therefore ranges from fairly easy (a simple domain name) to more advanced (a client that registers, updates and unregisters each service instance).
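To make the domain name scheme a bit more tangible, here is a minimal sketch that builds such a name and asks the resolver for the addresses behind it. The function names are mine, "cluster.local" is the usual default cluster domain but can be configured differently, and the lookup will only return something useful when run inside a cluster where that service exists.

```python
import socket


def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster name <service>.<namespace>.svc.<cluster domain>."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolve(hostname: str, port: int) -> list:
    """Return the IP addresses the name currently resolves to (empty if none)."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "orders" and "shop" are hypothetical service and namespace names.
    name = service_dns_name("orders", namespace="shop")
    print(name, resolve(name, 8080))
```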
Last but not least, there’s the security aspect. In common microservice environments data isolation is often a big topic: not every service is allowed to access or mutate all data anymore. Often systems have specific services to handle account-related data and use tokens or certificates to authorize certain operations in the context of a logged-in user. Breaking this isolation is easy if developers don’t build the necessary limitations into the service.
I already hear people screaming at me: Istio to the rescue. And it’s kind of true. Istio takes a lot of those challenges away from the user by adding a proxy between the services that handles authorization between them. It also takes care of service discovery and of routing requests to currently available instances. On the other hand, it adds not one but two more network hops (outgoing and incoming) to every single service-to-service interaction. Furthermore, developers have to understand why all their service calls use a localhost connection, since this normally doesn’t seem to make sense 🙂
Configuration
After we have designed our application with dynamism and scalability in mind, we want to deploy it. Applications, however, often need some kind of configuration. The most basic reason is having separate databases for development, staging and production.
But configuring an application in a container that is created from an immutable image is far away from what we did in the past with simple configuration files on a filesystem, even though there are ways to simulate that style for applications.
A common way, coming from the early Docker days, is to use environment variables that are configured at container start and available inside the container’s OS environment. It mostly works and is simple to debug using CLI access. From an application developer’s point of view, however, it feels like a hack: there is very little we’d normally read from environment variables.
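If you go the environment variable route, it helps to read everything in one place at startup and fail fast when something required is missing. The following is only a sketch; the variable names (DATABASE_URL, REQUEST_TIMEOUT_SECONDS, DEBUG) and defaults are hypothetical and not taken from any real application.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    request_timeout_seconds: float
    debug: bool


def load_settings(env=os.environ) -> Settings:
    """Read settings from environment variables (all names are hypothetical)."""
    try:
        # Required value: refuse to start without it.
        database_url = env["DATABASE_URL"]
    except KeyError:
        raise RuntimeError("DATABASE_URL must be set") from None
    return Settings(
        database_url=database_url,
        # Optional values fall back to defaults chosen for this sketch.
        request_timeout_seconds=float(env.get("REQUEST_TIMEOUT_SECONDS", "5")),
        debug=env.get("DEBUG", "false").lower() in {"1", "true", "yes"},
    )
```

Passing the environment in as a parameter keeps the function easy to test with a plain dictionary instead of the real process environment.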
Another way would be to mount an NFS share (or any other shareable path) on each of the Kubernetes hosts and to mount parts of it, like a subdirectory, into the container with a volume mount. As developers we have our files as expected, but now DevOps feels the urge to hand in their resignation. It’s a PITA to configure, to keep up to date automatically, and to monitor or analyze in case of failures.
The most Kubernetes-ish way is to use ConfigMaps, which support both approaches: environment variables, or a mounted volume whose configuration files are created and updated from the ConfigMap.
While configuring ConfigMaps is not a developer responsibility, it is important to understand that, as a developer, you either have to make yourself familiar with using environment variables or make sure DevOps knows the file path at which to mount the ConfigMap into your application’s container.
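When a ConfigMap is mounted as a volume, each key shows up as a file in the mount directory, with the value as the file content. A minimal sketch of a lookup that prefers such a file and falls back to an environment variable could look like this. The mount path /etc/app-config and the precedence (file first, then environment) are assumptions of mine, not something Kubernetes prescribes.

```python
import os
from pathlib import Path

# Hypothetical mount path; it has to match whatever the deployment mounts.
DEFAULT_MOUNT_DIR = "/etc/app-config"


def read_setting(key: str, mount_dir: str = DEFAULT_MOUNT_DIR,
                 env=os.environ, default=None):
    """Look up a setting in a mounted config directory, then the environment."""
    path = Path(mount_dir) / key
    if path.is_file():
        # One file per key; strip the trailing newline editors tend to add.
        return path.read_text(encoding="utf-8").strip()
    return env.get(key, default)
```

Whichever order you pick, agree on it with the people who write the deployment manifests, because that is exactly the contract the paragraph above is about.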
Observability and Debugging
Possibly the biggest problem when running in Kubernetes-managed environments is observability, not only of your own service, but of all services and especially of the interactions between the user, the backend and its multiple services. Furthermore, we should understand how to debug our services when there are production issues. Log files are nice, but often enough they are not the best or fastest way to solve a problem.
As discussed earlier, latencies between services are an important part of the overall response time and directly affect the user’s experience with your product, even if low response time is not immediately connected to the business use case.
Optimizing applications is an iterative group effort. Sometimes decreasing latency is the important bit; sometimes it is replacing a loop that requests values from a database one at a time with a single batch request. Distributed tracing helps find the sweet spots, the quick wins, that will improve the user’s experience as much as possible. It also helps us developers move the blame away from our application or service to the underlying environment and its bad round-trip times. Even if it were only for this specific reason, that’d be enough for me to make sure I understand the observability part 😉
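The loop-versus-batch fix is the kind of thing a trace makes painfully visible: many tiny database spans in a row where one would do. Here is a small, self-contained sketch of both variants using an in-memory SQLite database. The table and data are invented for the example, and with a real remote database every query in the loop would also cost a network round trip.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "ada"), (2, "grace"), (3, "linus")])
    return conn


def names_one_by_one(conn, ids):
    # One query per id: N ids mean N queries.
    names = {}
    for user_id in ids:
        row = conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        if row is not None:
            names[user_id] = row[0]
    return names


def names_batched(conn, ids):
    # A single query for all ids; placeholders keep the values parameterized.
    ids = list(ids)
    if not ids:
        return {}
    placeholders = ", ".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", ids
    ).fetchall()
    return dict(rows)
```

Both functions return the same mapping of id to name; only the number of queries differs.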
Famous Last Words
All that said, there are plenty of reasons why developers should be interested in the environment their applications will run in. Most of those reasons are not even specific to Kubernetes, AWS or any other particular environment; they’re general rules we all observed in the past but tend to forget as environments become more and more of a black box.
As a quick closing note, the content of this post is not an exhaustive list but covers the most obvious points when thinking about it honestly. As a more advanced example I could bring health checks into the game, which support the scaling algorithm and make sure important operations are not killed in the middle of a processing step or before reaching a safe point for interruption.
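To give a rough idea of what a health check can look like from the application side, here is a minimal sketch with two endpoints: one that says "the process is alive" and one that says "I am ready for traffic". Kubernetes can probe such endpoints via liveness and readiness probes. The paths /healthz and /readyz, the port and the ready flag are conventions and assumptions of this example, not requirements.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set once startup work is done; clear it again when shutdown begins
# so no new traffic is routed to this instance.
ready = threading.Event()


def health_status(path: str, is_ready: bool):
    """Map a request path to (HTTP status, body) for the health endpoints."""
    if path == "/healthz":
        return 200, "ok"  # liveness: the process can answer requests
    if path == "/readyz":
        return (200, "ready") if is_ready else (503, "not ready")
    return 404, "not found"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, text = health_status(self.path, ready.is_set())
        body = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_health_server(host: str = "0.0.0.0", port: int = 8080):
    """Serve the health endpoints on a background thread."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A service would call ready.set() after its startup work is done and ready.clear() as the first step of shutting down, then finish whatever it is processing before it exits.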
I hope I managed to make myself clear on why I think developers should care a bit more about the environment than many seem to like. And remember, we’re not talking about that one colleague every one of us has worked with, but about a big part of engineering departments worldwide.
That said, happy Kubernetes-ing 🙂