Don’t Throw Your Code Over The Wall: 5 Ways To Work With Ops Engineers

While the tech world is shifting toward a DevOps structure, in many places the invisible wall that separates developers and operations engineers is still pretty high. Communication between the two teams is necessary to make sure the application is deployed correctly and achieves high availability for end users. Here are some suggestions from different Rackers (Rackspace employees) on how best to hand off your app to the ops team instead of just throwing the code over the wall.

The Devil is in the Documentation Details

Documenting a well-defined list of dependencies is one of the most important things you can do when handing off your application. Rather than being open-ended with what the app needs to run properly, it is important to be specific. Let’s say you’ve developed an app that relies on MongoDB for storing and retrieving data.

“Saying that your app needs MongoDB leaves many things to interpretation. For instance, what is your scalability strategy? Should this application need to be sharded? Do you want replica sets? What version do you want installed? What are your indexing needs?” Rackspace Deployment Services engineer BK Box says. “If your app requires a MySQL backend, do you want Master-Master replication or Master-Slave? How much memory and disk space do you need? These are a couple examples of the types of details operations engineers need to have. Be sure to keep the communication open to talk about requirements needed to get the application running.”
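One way to make those details explicit is to write them down in a structured form that can be checked for gaps before the handoff. The sketch below is only an illustration: the `DatastoreRequirement` class, its field names, and the sample values (including the version string) are assumptions made for this example, not a Rackspace format.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class DatastoreRequirement:
    """One backing service, described the way an ops engineer needs it.

    Hypothetical structure for illustration; adapt the fields to your stack.
    """
    engine: str                         # e.g. "MongoDB" or "MySQL"
    version: Optional[str] = None       # exact version the app was tested against
    topology: Optional[str] = None      # e.g. "replica set", "master-slave"
    sharded: Optional[bool] = None      # does the data need to be sharded?
    memory_gb: Optional[int] = None
    disk_gb: Optional[int] = None
    indexes: List[str] = field(default_factory=list)

    def unanswered(self) -> List[str]:
        """Return the names of the fields that were left unspecified."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# "The app needs MongoDB" leaves almost everything open.
vague = DatastoreRequirement(engine="MongoDB")

# A specific handoff answers the questions up front (values are placeholders).
specific = DatastoreRequirement(
    engine="MongoDB",
    version="2.4",
    topology="3-member replica set",
    sharded=False,
    memory_gb=16,
    disk_gb=200,
    indexes=["users.email"],
)

if __name__ == "__main__":
    print("Still to clarify:", vague.unanswered())
    print("Still to clarify:", specific.unanswered())
```

Anything returned by `unanswered()` is a question the ops team would otherwise have to come back and ask.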

Set Up Descriptive Logging

Having detailed logs is one way to help operations engineers quickly diagnose and fix a problem when an alert comes in. “Looking at the application logs is one of the first ways you can tell if a developer came from an ops background,” says Farid Saad, Cloud Servers engineer. Saad says that ops engineers have to quickly determine what is going on if an application begins behaving strangely, and having to constantly debug logs that are not meaningful can cause frustration and delays in getting the app back online. Overcome this issue by ensuring that your logs adequately describe the errors that the app is encountering.
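As a rough illustration of what "descriptive" can mean in practice, the sketch below uses Python's standard `logging` module to record what failed, which resource was involved, why, and what to check first. The service name `orders`, the `save_order` function, and the simulated connection failure are hypothetical stand-ins, not code from any real application.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical service name


def describe_failure(operation, resource, error, hint):
    """Build a message that says what failed, where, why, and what to check."""
    return (
        f"{operation} failed for {resource}: "
        f"{type(error).__name__}: {error}. Check: {hint}"
    )


def save_order(order_id, db_host):
    """Pretend to save an order; the database call is simulated to fail."""
    try:
        # Stand-in for a real database write.
        raise ConnectionError("connection refused")
    except ConnectionError as exc:
        # Compare with an unhelpful message such as "Error saving data".
        logger.error(
            describe_failure(
                "save_order",
                f"order {order_id} on {db_host}",
                exc,
                "is the database reachable from this host?",
            )
        )
        return False


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    save_order(42, "db1.example.internal")
```

The point is less the helper itself than the habit: every error line should let someone who has never seen the code decide where to look next.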

Fully Tested Code

Major Hayden, the Rackspace Chief Security Architect, advises that developers fully test the code to help mitigate any surprises on launch day. Hayden emphasizes that the testing should extend beyond the typical unit testing that happens in most QE processes. “When testing, it is important to understand what happens when you put all the pieces together,” Hayden says. “With integration testing, you should make sure that the code works well with other dependent applications, such as an authentication system.” Be certain to verify the integration points between your code and other applications.

Have a Rollback Plan

Everyone expects code deployments to go smoothly, but you have to assume that something could go wrong. “We see it all the time where a new piece of code is rolled out and the application starts behaving poorly,” says Kenny Gorman, co-founder of ObjectRocket, a MongoDB solution by Rackspace. “Many times it could be a simple missing database index, but other times the new application could actually be logically corrupting the database. In these cases it’s great to have a rollback plan, which not only mitigates the risk of a bad deployment but also minimizes the amount of downtime if something goes awry.”
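The simplest form of a rollback plan can be sketched in a few lines: activate the new release, check its health, and restore the last known-good release if the check fails. The function and parameter names below are hypothetical, and the sketch covers only the code side; it does not undo data changes, so a release that could corrupt the database also needs a tested backup and restore procedure.

```python
def deploy_with_rollback(previous_version, new_version, activate, is_healthy):
    """Activate new_version; fall back to previous_version if it is unhealthy.

    `activate` and `is_healthy` are callables supplied by the caller
    (hypothetical hooks into whatever deployment tooling is in use).
    Returns the version that is left running.
    """
    activate(new_version)
    if is_healthy(new_version):
        return new_version
    # The new release misbehaved: restore the last known-good version.
    activate(previous_version)
    return previous_version


if __name__ == "__main__":
    activated = []  # records each activation so the sequence is visible
    running = deploy_with_rollback(
        "1.4.2",
        "1.5.0",
        activate=activated.append,
        is_healthy=lambda version: version != "1.5.0",  # simulate a bad release
    )
    print("activations:", activated)
    print("running:", running)
```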

Communicate Availability Expectations

Rackspace Cloud Servers engineer Richard Maynard wants to know what the developer’s expectations are for the app. “How often should it be available? How often will it be updated? What is the anticipated usage?” Maynard asked. “To me, it is about requirements for availability and understanding what the operations team can do to meet those requirements.”

There is a difference between 99.99 percent and 99.999 percent uptime. While it may not sound like a lot, the number of nines can affect how you architect the infrastructure for the app (in the above example, it is the difference between roughly 52 minutes and 5 minutes of downtime per year). “What it comes down to is an understanding of the cost versus complexity and the impact of downtime to the business,” Maynard says. “Are you developing a payment gateway and could lose money out the window if it goes down, or is it a non-critical app that could simply upset a handful of users?” Communicating this expectation to the ops team can help ensure that the infrastructure is architected to achieve the required level of uptime.
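The downtime figures follow directly from the uptime percentage. As a quick check, assuming a 365-day year (525,600 minutes), the allowed downtime is the year's minutes multiplied by the fraction of time the app may be unavailable:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(uptime_percent):
    """Minutes per year an app may be down while still meeting uptime_percent."""
    return MINUTES_PER_YEAR * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Four nines works out to about 52.6 minutes per year and five nines to about 5.3 minutes, while three nines allows roughly 8 hours and 46 minutes.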

While industries are beginning to tear down the wall between operations and developers, these tips can help in the meantime, until your organization arrives at a true DevOps structure.

Looking to host your newest app? Rackspace offers performance cloud hosting on all-SSD storage so you can get the most out of your app, as well as ObjectRocket, a Database as a Service solution that runs MongoDB on custom, fine-tuned hardware.

The Rackspace DevOps Automation Service automates application environments using DevOps tools, and includes 24×7 DevOps Engineering support.


    • Actually it was 816 words, or 4,154 characters without spaces, or 4,955 characters with spaces, and 15 paragraphs. That count excludes the title, byline, social badges, tags, and about-the-author section. Peter, if you’re going to bash the post, get your facts right. Also, this post is great information. As developers, people are often passive and choose not to criticize others’ work, which makes knowing your faults impossible. Although this may seem like common sense, as a developer I often feel that saying things like ‘MongoDB is required’ is enough, and that the answers to those questions require nothing more than common sense. It all depends on which side of the wall you’re working from.

      • Hi Ian.. it is a ridiculous article… there are many people who earn their living writing ‘tech’ pieces that have zero value. Your contribution to this is evidence that your lot in IT is zero.

  1. Fast, cheap and good. Pick any two. Human nature/ the adminsphere do not seem to like more than two choices. This creates the “Nothing tests like production” scenario.

  2. Back in the day, I had a simple technique that helped with documentation. People were allowed to ask me questions directly but I had to respond in the documentation. The assumption is that if someone has to ask, the docs are insufficient to the task. I would either enhance the docs or re-engineer the user interface to make things clearer. Me answering directly might solve the problem for that user but there would be others.

    The other technique I used was what I called incremental development. I would meet with my customer and we would bang out some very basic requirements that we could all agree on. We would leave the more minor functions for later. I would deliver the basic application very quickly and that would give us all a frame of reference for filling in the other features. My customers loved it. They didn’t have to try to think of everything they wanted all at once and then wait until it was all done to see if it was what they really wanted.

    Interestingly, I was part of an acquisition and my new customers were used to a very regimented specification process. They expected to have to write a full specification and then wait a year for delivery. I come along and, after an initial discussion, I deliver a rough version in about 2 weeks. They freak out thinking that the rough version is what I expect them to use. Once they get into my way of doing things, they are thrilled. I can get them going in under a month and we can collaborate and prioritize the rest over the next month or so.

    Then my job was outsourced to a group of 20 programmers in Bangkok and they laid down the edict that all software changes would be subject to a strict specification process and would take a year to complete. I guess the company saved money. I moved on.

    • Good stuff! Sometimes process and red tape get in the way of true progress. Sad to hear the higher echelons were more focused on the budget, and not their users/customers…

    • Absolutely. My team developed an application with which they could begin to build the customer’s application immediately, during the customer interview process. In the vast majority of cases, when the interview finished the customer was able to log onto the basic application, which usually meant at least 80/85% of wishes had been covered. With the customer working hand in hand with the development team, it was quickly evident where deficiencies occurred, which allowed for quick pivots and response to needs. That last 15/20% was what made the business go and where we could then spend the majority of our time and resources. It also meant we responded with the tools needed to move the business forward quickly, resulting in growth and satisfaction for both the business and the team. In today’s development world that process is called Agile; in our “old-fashioned” world it was business as usual, that is, being customer-centric.

  3. DO throw it over the wall, just make sure it is well documented and tested. You’re supposed to throw functional, well-documented product over the wall, not hand-grenades. Lowering the wall creates a situation where ops does not take responsibility for execution and maintenance of systems. I’ve seen it first hand.

    • I believe having open doors along the wall helps. Some of the information above is pertinent when initially starting a project, or when major upgrades need to occur. It would be essential to sit down with Ops to go over the requirements. Most of the time, they have a better view of the services than Devs do. But I agree, the wall should still stay, as a way of defining responsibility.

    • In our case, our customers (internal) didn’t know exactly what they wanted. Incremental development was the best thing for them. They could very quickly describe the basic framework of what they wanted and I could give them that just as quickly. Then the application became the basis for our refinements.

      But these were fairly small applications for an internal group that was buried in work. When I started supporting them, they were struggling to tread water. It would have been too much for them to take time out to create a detailed specification. By giving them something that worked, I immediately relieved some of their pain.

      You do what works.

    • In a way, I agree. Design it right so that it’s very easy to use and often it doesn’t even need documentation. (“We’ve just released our ‘hello world’ program and the 300 page manual…”)

      • Very true. I could never understand the need to create 2/4 hours of documentation for something that took 15 minutes to develop. This is especially true when a fix is made. For example, I could never understand why an application stayed down, in some cases for days, until the next PMO meeting approved a fix that, again, took only 15 minutes to complete.

        In my 45 years in IT, I shudder to remember how many hours were wasted on that type of discussion and just how difficult it was as a COO to get staff thinking about the business rather than their empire’s silo.

  4. As a software engineer who graduated with a BS in computer science….I have no idea what an operations engineer is. Is that the monkey who hits next until a Windows program is installed?

    • you don’t capitalize Engineer…meaning you don’t value what you do.. or moreover, you are not a real Engineer…just the happy americanization of everyone is an Engineer… you are the monkey.

      • You only capitalize engineer when it’s used as part of a job title, not when used in general terms. You can say “I am an engineer”, and you are correct not to capitalize. If you say “My current job position is Lead Systems Engineer at Rackspace” then you had better capitalize.

      • You know what he’s saying. Don’t be snarky. He’s totally en pointe in his description. DevOps is just the newest title of this role.

      • Turns out Capital E Engineer actually means something which I have not achieved. I am guessing you have not even considered that, Mr “Guest”. Next time you criticize someone at least have the bravery to put your name next to it.

    • The operations engineer is the guy who stays up all night during a maintenance window trying to make sense of your crappy code and horrible deployment instructions. He’s also the guy that gets pages at 4am when your whole house of cards comes tumbling down because of a cron job that failed to run.

  5. Fundamentally, what is the functional space of Documentation, Engineering, Operations Engineering, and Software Engineering? After working for 7 years in IT on numerous projects, with no direct knowledge that this was the space (not my educational background), I have seen it all: waterfall, incremental, iterative, small team, managed service, custom local extreme programming. Can we get some definitions of what these teams are supposed to be doing? Managing every team/business model is a unique challenge, but realistically, what are the team descriptions and makeups ideally expected in your discipline?

  6. Actually ~8 hrs 45 min of downtime equates to 99.9% uptime (yearly), not 99.999%. And 87 hours 36 min of downtime certainly doesn’t equate to 99.99% uptime. If that were true my job would be A LOT easier!
    (edited because IE)

  7. Great article that still has merit 2+ years later.

    DevOps is a two way street. Would be great to see a follow-up like “5 Ways For Ops Engineers To Work With Devs”.

    The “Fully Tested Code” advice might be better received with more details on how to prioritize your testing investments.


