Hello again, cloud friends! Scott Mabe here with a real horror story that many of you can relate to. There’s a ghost haunting your technology stack, creating horrific havoc. How? I like analogies, so I'm going to start with one to explain.
If you put your food in a microwave, enter your cook time and hit the start button but don’t see lights or hear any noise, would you assume your food has been cooked? I’m guessing your answer is, “No.”
So why do we expect this kind of performance from our software delivery and operations teams?
How Can You Verify Code Was Written Correctly?
We have all seen it happen. A software developer writes code, runs it on their own machine and then puts that executable in a shared folder for Operations to install.
But... how do we know that it works? Where’s the feedback, the tests, a change log or a means to automatically roll back if there’s an issue? Why do we just assume everything is fine?
We shouldn’t. That’s for sure. There are plenty of times things aren’t fine. By the time you realize that all is not well, chances are you’ve already had a catastrophe.
Scared yet? You should be. Not having fast and frequent feedback, proper measurement of system metrics/telemetry, or crucial business information is like expecting your car to move without any indication that your ignition system is functional. Or, for that matter, like that microwave that doesn’t have lights or noise, even after you hit start.
That’s sheer madness, right?! Right. It’s a no-brainer. Thing is, for many organizations, it isn’t.
Day after day, we read the same horror stories about how some big company had an outage caused by a seemingly small coding error. It can happen. It does, every day.
One small glitch can take down an entire e-commerce application. One wrong bit of code can grind your sales to a dead halt and cause mass panic among your operations teams. Terrifying, isn't it?
Want to know what's even scarier? Not using an automated test that would have warned your developer that their code change was going to awaken an ancient hidden mummy’s curse in your app. One that would wreak havoc on your whole system and bring it to its knees.
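To make that a little less spooky, here's a minimal sketch of what such an automated test might look like. Everything in it is an assumption for illustration: `order_total` is a made-up checkout helper, not code from any real app, and the tests are plain functions of the kind a test runner such as pytest would pick up.

```python
def order_total(prices, discount_percent=0):
    """Hypothetical checkout helper: sum the prices, then apply a percentage discount."""
    subtotal = sum(prices)
    return round(subtotal * (1 - discount_percent / 100), 2)


def test_total_without_discount():
    # If a code change breaks the basic sum, this assertion fails right away.
    assert order_total([10.00, 5.50]) == 15.50


def test_total_with_discount():
    # A 25% discount on 100.00 should leave 75.00.
    assert order_total([100.00], discount_percent=25) == 75.00
```

If someone later changes `order_total` and gets the discount math wrong, these checks fail on the developer's machine or in the pipeline, long before a customer ever sees the wrong price.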
What Does "Breaking a Build" Mean?
An automated mechanism like this, known as a feedback loop, breaks the build when a test fails. That allows your dev team to swarm the problem, resolve it and create company-wide learning in the process. If breaking a build is not a familiar concept to you, here’s a quick explanation.
Breaking a build comes from the practice of building new code and automatically testing it to ensure it’s functional. If the tests fail, that build is "broken" and doesn’t go any further. In fact, the feedback loop keeps it from being installed.
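Here's a rough sketch of that gate in code, under some loud assumptions: `run_checks`, `build_and_gate` and the two sample checks are hypothetical names I made up for this post, and a real pipeline would usually hand this job to a CI server that stops on a failing exit status. The idea is the same, though: run the checks, and only install if every one of them passes.

```python
def run_checks(checks):
    """Run each check function and collect a message for every one that fails."""
    failures = []
    for check in checks:
        try:
            check()
        except AssertionError as exc:
            failures.append(f"{check.__name__}: {exc}")
    return failures


def build_and_gate(checks, install):
    """Call install() only when every check passes; otherwise report a broken build."""
    failures = run_checks(checks)
    if failures:
        # Broken build: the install step is never reached.
        return {"status": "broken", "failures": failures}
    install()
    return {"status": "installed", "failures": []}


# Hypothetical checks, for illustration only.
def check_cart_total():
    assert sum([10, 5]) == 15, "cart total is wrong"


def check_simulated_bug():
    # Deliberately failing check that stands in for a bad code change.
    assert 2 + 2 == 5, "simulated bad code change"
```

Calling `build_and_gate([check_cart_total, check_simulated_bug], install)` reports a broken build and never calls `install`, while passing only `check_cart_total` lets the install go ahead. The failure messages are the feedback your team gets in minutes.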
What’s more, your team will be notified in minutes vs. waiting for another department to tell you everything is terrible and possibly cursed. Peer reviews can also help catch problems that can break builds. FYI... You can learn more about peer reviews by checking out my past blog, “Maintaining Security and Sanity at Cloud Speed: Part Two.” <wink, wink.>
Simply put, to avoid big scares, not just during Halloween week, but all year long, feedback loops are a must when it comes to implementing code changes.
A core concept of the Agile/Lean/DevOps movements, fast and frequent feedback loops help reduce fear and stress. They also help support a culture of learning and teamwork, and they let you know if your metaphorical car (or microwave) is running or not.
You don’t need someone to lift that mummy’s curse to enjoy “all systems go.” It all comes down to a simple automated test. Savor that thought. Until next time, I hope all of your tricks are treats this week!