The Elastisys Tech Blog


How to Strike a Healthy Balance Between Dev and Ops (DevOps?)


Companies that want to adopt a DevOps mindset and culture soon start asking themselves a tough question:

Are Dev and Ops engineers the same?

And if not, how do we strike a healthy balance between Dev and Ops?

In this post, I share my thoughts on the psyche of Dev and Ops engineers: what motivates them, how they prefer to work, and how to help them strike a balance between Dev and Ops tasks so that you can adopt a DevOps way of working.

Developers: all about control

I find that a typical developer is strongly motivated by the ability to create their own world, as represented in a computer. They take in the realities of the real world and model them in a program according to business needs. Take, for instance, a ticket booking system for an airline. The developer will understand the complexities of that reality, and turn the parts that are relevant to model into code.

If something goes wrong in a program, the developer mindset is to immediately go back to the code and dive deep into it. Because the problem is certainly going to be in there, somewhere. This means that they are used to being in complete control of how “everything” works. And, also, that they are used to taking on full responsibility for any error that occurs. Because the error is likely to be with their code. Either directly or indirectly, in how they use some code library. Errors are mismatches between reality and the implementation of their model of it.

Edsger W. Dijkstra is often credited with saying that “if debugging is the process of removing software bugs, then programming must be the process of putting them in.”

Developers know and feel this.

Operators: all about trust and fundamentals

The typical operator or system administrator, on the other hand, is someone who, to me, has completely adopted an attitude of trusting that computer systems will work as long as fundamental needs are met. Operators might not even have access to the underlying source code of most of the systems they operate, but they work from a place of trust. They trust that as long as the software has what it fundamentally needs (hard drive space, enough resources to run, network connectivity, …), it will function correctly. Or, at worst, that the reason it is not working right is simply due to some configuration error.

The scope of what an operator will manage is too large to have complete control over. It includes hardware (virtual or physical), an operating system, supporting software, background processes, and finally the actual application that the computer is there to run. It’s simply too large a scope to have everything under perfect control!

Instead of working out of a place of control, an operator is therefore comfortable with their own ability to “just figure it out” if something goes wrong. Given enough caffeine and Google, they assume that they will be able to handle any problem that comes up.

Tools of their respective trades

A developer, who assumes responsibility over a piece of software they wrote, will default to using both tests (to recreate the erroneous behavior) and a debugger to find out the source of an error. An operator, who is responsible for a large cluster of servers, will rely on monitoring tools to show if there are any errors that can be predicted or have already occurred.

Learning to use these tools effectively takes time, especially if they are to become second nature and guide the engineer’s intuition.

A developer is like an architect, who will model a building on paper in great detail. An operator is more like a doctor, who will inspect a very complex system, and knows how to maintain it for peak performance.

DevOps: a healthy balance between Dev and Ops?

I find that forcing a typical developer to do Ops work, or the other way around, is not going to work. There is a large difference in mindset and even, possibly, in personality type. If you are used to being in total control, and you get dropped into a context where control is impossible, you will get stressed. And if you are used to maintaining systems on a broad level, you might get frustrated dealing with the minutiae of development work.

“But is this not what DevOps is?”, I hear you ask. I’ll get there, I promise!

In my mind, DevOps is a mindset of “doing Ops in a Dev-inspired way”. As in, it fosters a culture that is supported by tools common to developers. Version control systems track all configuration changes and servers/containers are thrown out and replaced instead of being upgraded in place (like how we restart processes in a computer, rather than try to live-patch them during runtime), to name a few examples.
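As a rough illustration of the “throw out and replace” idea, here is a minimal Python sketch. Everything in it is hypothetical and made up for this example: the Server class, the replace_fleet function, the server names, and the version strings. It models servers as immutable values, so the only way to “upgrade” is to build new ones from the desired version, which is the kind of value you would keep in version control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical immutable server or container.

    frozen=True means its fields cannot be changed after creation,
    which mirrors "never upgrade in place".
    """
    name: str
    image_version: str


def replace_fleet(fleet, new_version):
    """Build a brand-new fleet running new_version.

    The old Server objects are left untouched and simply discarded
    by the caller; nothing is patched in place.
    """
    return [Server(name=s.name, image_version=new_version) for s in fleet]


# Desired state, as it might be recorded in a version-controlled
# configuration file (the value here is an assumption for the example).
desired_version = "2.0"

old_fleet = [Server("web-1", "1.0"), Server("web-2", "1.0")]
new_fleet = replace_fleet(old_fleet, desired_version)
```

Trying to assign a new image_version to an existing Server raises an error, so a change of version can only happen by changing the desired state and rolling out replacements.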

And to be able to do both effectively, engineers working in this manner have to learn the tools that are appropriate for each task (debugger vs. monitoring systems), and learn how to adopt the attitude that is appropriate.

However, the sad truth is that doing so will only get you so far. You can’t force a square peg into a round hole, and you can’t put a person who lives for control into an uncontrollable situation and have it all work out fine.

And you will encounter situations where you have to do the hard Ops tasks. Because the “throw out and replace with new” approach works wonderfully for application containers, but not for, e.g., database servers. Or entire Kubernetes clusters. The containerized applications on top of Kubernetes, yes. But not the actual entire cluster itself.

Outsourcing the hard Ops tasks via a managed service

So to companies that want to transition their typically large team of Devs and smaller team of Ops into a DevOps culture, my suggestion is to outsource the hardest Ops tasks to a managed service provider. What remains is perfectly amenable to handle in a “DevOps” way. Because what remains is about operating what you have developed: your application.

Let’s take Kubernetes as an example. In a fully-managed service, one where not just the control plane (like what you get at the major cloud providers) but also the worker nodes are managed, your newly formed DevOps team is not responsible for the difficult problems related to maintaining an entire cluster. The provider manages networks, storage, operating systems, and the Kubernetes platform itself. This means your team only manages the applications that run on top of it.

It also means that those engineers who have more of an affinity for Ops are going to excel at using monitoring tools to understand your application. And those who come more from the Dev side will use their tools to develop your application.

In this way, all engineers are working within their comfort zone and are fully focused on increasing the value of your application to your end users.

Three tips to get you started

Tip #1: Figure out if you have developers or operators. I like to draw a line or spectrum on a whiteboard, with “frontend” on one side, “backend” around the halfway point, “database” at about three quarters, and “Linux” on the opposite end. Then ask people to draw their own line showing which parts they are comfortable with. The results may surprise you.

Tip #2: Prioritize business value. If you have a market, you also have competitors. You should focus all your efforts and resources on what makes your offering better suited for your market, rather than just what keeps your team busy.

Tip #3: Train or recruit to strengthen your team. Once you know what team composition you have, determine how to make maximum use of your current skill sets and direct your efforts into increasing the skills you may be lacking, either via training or new recruitment.

Failure to do the above means you run the risk of being outrun by your competitors, who know how to direct their efforts more effectively.

For most software-as-a-service companies, there is a solution. And that is to do considerably less operations work on the underlying tech stack, and to leverage managed services as much as possible. This way, you can automate the ordinary so you have more time for the extraordinary that is your service.