Service meshes are a hot technology and sometimes even touted as a requirement to succeed with microservices. However, as with many abstractions, service meshes save time doing, but don’t save time learning. Indeed, many small platform teams are overwhelmed with the added complexity of service meshes, especially when it comes to day 2 operations.
It is natural to ask the question: Does the extra complexity truly outweigh the benefits?
In this post, we present alternatives to consider before investing in service meshes.
The most popular benefits of service meshes are:
- authentication;
- Ingress encryption;
- in-cluster network encryption;
- segregation of communication.
For each of these benefits, we will show alternatives which – in our experience – are closer to what administrators are already familiar with. These may be more attractive for organizations where expertise or platform engineering bandwidth is scarce.
In some cases, you will need a service mesh, for example when you need secure Pod-to-Pod communication across multiple Kubernetes clusters. Even then, having ruled out the alternatives that don’t fulfill your needs, you will be more confident about why you opted for a service mesh to begin with.
Service Mesh Benefit 1: Authentication with OAuth2 Proxy
Many application teams need to add an authentication layer in front of their microservices. For example, fully implementing OAuth2 or OpenID Connect involves quite some “dancing around”. Instead of writing a lot of boilerplate code (which is neither application-specific nor business-differentiating), teams prefer if “something” just hands their application a JWT token with the right claims, so that they can focus on application-specific access control.
We previously blogged about how Istio can be integrated with OAuth2-proxy to achieve just that. However, if this is the only thing you need from Istio, it might be overkill to adopt it.
Alternative to Service Meshes: Nginx Ingress Controller
Let me illustrate a solution which I consider simpler, especially for teams already used to Nginx. If all you need is some good ol’ oauth2-proxy, the Nginx Ingress Controller readily integrates with it. Just use the auth-url annotation and the controller will do the rest. The diagram below illustrates how this works.
The reason I see this solution as simpler is that this really only influences how traffic gets into your Kubernetes cluster. Pod-to-Pod communication works as before.
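To make this concrete, here is a minimal sketch of such an Ingress. The auth-url and auth-signin annotations are the ones documented by the Nginx Ingress Controller. The resource name, host name, Service name and port are hypothetical placeholders, and the sketch assumes oauth2-proxy is reachable under /oauth2 on the same host.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app                      # hypothetical name
  annotations:
    # Nginx asks oauth2-proxy whether each request is authenticated.
    nginx.ingress.kubernetes.io/auth-url: "https://$host/oauth2/auth"
    # Unauthenticated users are redirected here to start the login flow.
    nginx.ingress.kubernetes.io/auth-signin: "https://$host/oauth2/start?rd=$escaped_request_uri"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com         # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app        # hypothetical Service
                port:
                  number: 80
```

In a setup like this, a second Ingress for the same host, without the two auth annotations, would typically route the /oauth2 path to the oauth2-proxy Service.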
As a bonus, if you are familiar with Nginx, but are scared of the automation that the Ingress Controller adds around it, you can directly inspect how Nginx is configured by typing:
kubectl exec -ti \
    -n ingress-nginx \
    $(kubectl get pod \
        -n ingress-nginx \
        -l app.kubernetes.io/component=controller \
        -o name \
        | head -1 \
    ) \
    -- \
    cat /etc/nginx/nginx.conf
By default, your application will not get the JWT token. To ensure your application gets the claims for fine-grained access control, you must do two things:
- First, add to oauth2-proxy the --set-authorization-header command-line option. This ensures that oauth2-proxy generates an HTTP Authorization header.
- Second, add to your Ingress the following annotation: nginx.ingress.kubernetes.io/auth-response-headers: "authorization". This ensures that Nginx Ingress Controller forwards the HTTP Authorization header from oauth2-proxy to your application.
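The two changes above could look roughly like the sketch below. The flag and the annotation come from the text; the Deployment name, labels, listen address and image reference are assumptions for illustration, and the provider, client and cookie settings that oauth2-proxy also needs are left out.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: oauth2-proxy                # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: oauth2-proxy
  template:
    metadata:
      labels:
        app: oauth2-proxy
    spec:
      containers:
        - name: oauth2-proxy
          # Assumed image location; pin a specific version tag in practice.
          image: quay.io/oauth2-proxy/oauth2-proxy
          args:
            - --http-address=0.0.0.0:4180       # assumed listen address
            # Step 1: make oauth2-proxy emit an Authorization header.
            - --set-authorization-header=true
            # Provider, client ID/secret and cookie secret omitted here.
---
# Fragment of the application's Ingress metadata (not a complete object).
metadata:
  annotations:
    # Step 2: forward the Authorization header to the application.
    nginx.ingress.kubernetes.io/auth-response-headers: "authorization"
```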
Service Mesh Benefit 2: Ingress Encryption
Many regulations require encryption of network traffic over untrusted networks. For example, PCI-DSS Requirement 4 states: “Transmit cardholder data by encrypting it over open, public networks.” GDPR and Swedish Healthcare (“Behandling av personuppgifter i öppna nät” / “Processing of personal data over open networks”) include similar provisions.
The solution is simple: Add TLS termination. However, TLS termination is not business differentiating, nor application-specific. Ideally, the platform should “just do it”. I often see teams adopting service meshes just for this one feature, but there is a simpler alternative.
Alternative to Service Meshes: Cert-manager
You can install cert-manager, create a ClusterIssuer and configure your Ingress to use that ClusterIssuer through an annotation. Cert-manager will do the magic of talking to Let’s Encrypt, provisioning a TLS certificate and rotating it before it expires.
This is illustrated in the diagram above and covered by the same ready-to-use code snippets; look for the cert-manager.io/cluster-issuer annotation.
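For readers who want to see the moving parts, here is a minimal sketch of a ClusterIssuer and an Ingress that refers to it. It assumes the Let’s Encrypt production ACME endpoint and an HTTP-01 solver going through the Nginx Ingress Controller. All names, the email address and the host are hypothetical placeholders.

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod            # hypothetical name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com        # hypothetical contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key   # Secret storing the ACME account key
    solvers:
      - http01:
          ingress:
            class: nginx            # solve challenges via the Nginx Ingress Controller
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app                      # hypothetical name
  annotations:
    # Tells cert-manager which issuer to use for this Ingress.
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app.example.com           # hypothetical host
      secretName: my-app-tls        # cert-manager stores the certificate here
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app        # hypothetical Service
                port:
                  number: 80
```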
Service Mesh Benefit 3: In-Cluster Network Encryption
Buckle up for some controversy. 😄
Too often I see organizations adding service meshes because “mTLS and Pod-to-Pod encryption is cool and it might be required by some regulation”. Here is my view on the topic.
First, you rarely, if ever, need Pod-to-Pod encryption. As quoted above, both PCI-DSS and Swedish Healthcare require encryption only over open (i.e., untrusted) networks. Too often I hear teams arguing for Pod-to-Pod encryption “just in case” the underlying network is not trusted. If you cannot trust your infrastructure provider, change providers. No amount of encryption will stop them from tapping into your data while it is unencrypted in memory.
Second, say you really need in-cluster encryption. For example, you want to stretch a Kubernetes cluster over two data-centers connected via an untrusted network. Alternatively, you want to avoid signing yet another GDPR-style Data Protection Agreement (DPA) with the network provider between the two data centers.
Alternative to Service Meshes: CNI-level Encryption
In that case, simply enable WireGuard and/or IPsec in your Container Network Interface (CNI) provider. This encrypts network traffic Node-to-Node. At least Calico and Flannel support this. For example, if you set up Calico with Kubespray, this takes only a small addition to your Kubespray configuration.
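As a sketch of what that addition might look like, the fragment below assumes the calico_wireguard_enabled variable from Kubespray's Calico documentation. The file path is a hypothetical inventory location, and the nodes are assumed to have WireGuard support in their kernel.

```yaml
# Hypothetical location: inventory/mycluster/group_vars/k8s_cluster/k8s-net-calico.yml
# Assumption: variable name as documented by Kubespray for its Calico role.
calico_wireguard_enabled: true
```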
Service Mesh Benefit 4: Segregation of Network Communication
Service meshes bring another feature: They give each Pod an identity, then enforce identity-based access control via mutual authentication (mTLS). This brings two benefits: First, your Pods “don’t talk to strangers”, which is good for making some vulnerabilities harder to exploit, like the infamous Log4Shell. Second, it reduces blast radius: In case a Pod gets broken into, an attacker will find it harder to move laterally.
Alternative to Service Meshes: NetworkPolicies
However, the same benefits can be achieved more simply and in a more standardized way with NetworkPolicies. They are like firewall rules or SecurityGroups in the containerized world. Since Pods get new IP addresses on every deployment, NetworkPolicies essentially translate Pod identities into IP-based firewall rules, enforced by the Linux kernel.
A NetworkPolicy consists of two parts: a selector and rules. The selector chooses which Pods a NetworkPolicy applies to, by matching Pod labels within the policy’s Namespace. The rules specify what ingress and egress traffic is allowed to/from the selected Pods, where the other side of the traffic can be matched by Pod labels or Namespace labels. Safeguards can be put in place to ensure each Pod is selected by a NetworkPolicy.
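Here is a minimal sketch of the selector-plus-rules structure described above. It combines a default-deny policy with one policy that allows a single ingress path. The Namespace, labels and port are hypothetical.

```yaml
# Deny all ingress and egress for every Pod in the Namespace by default.
# Note that this also blocks DNS lookups until an egress rule allows them.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: database               # hypothetical Namespace
spec:
  podSelector: {}                   # empty selector: all Pods in this Namespace
  policyTypes:
    - Ingress
    - Egress
---
# Allow Pods labelled app=api (same Namespace) to reach PostgreSQL Pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-postgres
  namespace: database
spec:
  podSelector:                      # selector: Pods this policy applies to
    matchLabels:
      app: postgres                 # hypothetical label
  policyTypes:
    - Ingress
  ingress:                          # rules: what traffic is allowed in
    - from:
        - podSelector:
            matchLabels:
              app: api              # hypothetical label
      ports:
        - protocol: TCP
          port: 5432
```

NetworkPolicies are additive, so the second policy opens exactly one path through the default-deny baseline.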
In some organizations, network security and application security are the responsibility of different teams. This can be technically enforced with NetworkPolicies and Kubernetes RBAC. Only the network team is given permission to edit NetworkPolicies, while the application teams are only given permission to deploy in selected Namespaces.
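One way to express this split is sketched below: a ClusterRole that grants write access to NetworkPolicies, bound to the network team. The role name and the group name network-team are hypothetical; the group is assumed to come from your identity provider.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: networkpolicy-editor        # hypothetical name
rules:
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: network-team-networkpolicy-editor
subjects:
  - kind: Group
    name: network-team              # hypothetical group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: networkpolicy-editor
  apiGroup: rbac.authorization.k8s.io
```

For the split to hold, the roles given to application teams must not include write access to NetworkPolicies. Keep in mind that the built-in edit and admin ClusterRoles do include it, so a narrower custom role is likely needed for application teams.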
Final piece of advice: Treat Namespaces as an “internal API”. Since Namespaces end up becoming part of in-cluster DNS names, it is best to name Namespaces by the service they provide (e.g., “auth”, “database”, “licensing”), as opposed to by team name (“team-green”, “team-yellow”, etc.). This practice also simplifies setting up NetworkPolicies for the network security team.
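To illustrate why service-oriented Namespace names help, here is a sketch of a NetworkPolicy that reads almost like a sentence: Pods in “auth” accept traffic from “licensing”. It assumes a Kubernetes version that automatically sets the kubernetes.io/metadata.name label on every Namespace; the policy name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-licensing        # hypothetical name
  namespace: auth
spec:
  podSelector: {}                   # all Pods in the "auth" Namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Label assumed to be set automatically by Kubernetes.
              kubernetes.io/metadata.name: licensing
```

With team-based names, the same policy would say that “team-green” may talk to “team-yellow”, which tells a reviewer little about what the rule is for.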
Simplicity and understandability are key to security. While service meshes bring great benefits, consider the simpler alternatives before adopting them. My experience is that networking and network security are already complex enough. Adding another layer risks overwhelming your platform team and giving them “on-call anxiety”.
Of course, there are many great service mesh features which lack simpler alternatives, such as multi-cluster secure communication and federated network observability. If you do need the more advanced features, we hope this blog post helps you make an informed decision and embrace the added tech.
Both Kubernetes networking and service meshes are moving fast. Just in the last few months, Calico added an eBPF data plane and Istio was donated to the CNCF. Such events can quickly tilt the decision for or against adopting service meshes. We will certainly monitor the situation and continue writing on Kubernetes networking. Follow us on LinkedIn to stay up-to-date.