Agility in operations

It seems like Facebook is pretty agile in how it handles new features and rollouts. According to an article on the High Scalability site, they actually do major releases every week. One of the things that struck me was this:

Be Innovative, Not Safe. Fear of failure often shuts down the organizational brain and makes it hide behind excessive rules and regulations. A technology company should have a bias towards action and innovation. Release software. Don’t stifle genius. Rely on your tools and processes to recover from problems.

This isn’t a solution to problems, but it is a pretty accurate description of what I want to achieve myself. Making a release shouldn’t be difficult or scary. This means that we need tools and methods that:

  • Enable us to be relatively certain that we don’t introduce any errors
  • Enable us to recover from a failure, because we will eventually fail

Tools like JUnit, FitNesse and Selenium all allow us to verify the behaviour of our application. They help us verify that what we have done doesn’t introduce any errors. This should enable us to roll out quite easily, but I think that in many projects people don’t trust the quality of the tests, and they fear rolling out because they don’t have a good recovery plan.

I think we have a lot of tools available to us when it comes to writing tests; we just have to get better at using them, and eventually improve the tools. Where we seem to be lacking is in doing good rollbacks. Maybe we don’t even need tools for that? I’d like to hear how you do it, and what tools you use or are missing.

2 replies on “Agility in operations”

I think you’re exactly right that we need to find ways to release more often.

On the issue of rolling back: Could a period where the new version is run in parallel as a stability test make the problem of rolling back smaller? And when you have to roll back, of course it’s easier to roll back small changes. We try to make our database changes backwards compatible and our application releases small.

Running in parallel with the new version could be a way to both be more certain of not introducing errors and have a safe rollback.
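
To make the parallel-run idea a bit more concrete, here is a minimal sketch in Python. It assumes each version can be called as a plain function on a request; the names run_in_parallel, old_version, new_version and mismatches are made up for illustration. Users always get the old version’s answer, and any disagreement or crash in the new version is only recorded.

```python
def run_in_parallel(old_version, new_version, request, mismatches):
    """Serve the old version's answer and record where the new one differs."""
    expected = old_version(request)
    try:
        candidate = new_version(request)
    except Exception as error:
        # A crash in the new version must never reach the user.
        mismatches.append((request, expected, repr(error)))
        return expected
    if candidate != expected:
        mismatches.append((request, expected, candidate))
    return expected
```

An empty mismatches list after a period of real traffic would be one signal that the new version is safe to switch to. It is not a proof, and it only works this simply for requests without side effects.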

Quite intriguing actually, if you were able to always have the current and previous version running in parallel. If you discovered a critical defect in the current version you could “switch” to the previous version.

Sounds a bit like utopia though, since it brings up a lot of questions on how to handle integration etc. New features supported in the new version would also be a problem for processing in the old version, but the differences would be small if you were releasing often.
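
As a rough illustration of the “switch”, here is a small Python sketch. Everything in it is hypothetical (VersionSwitch, release_1, release_2), and it deliberately ignores the integration and data questions raised above; it only shows that both versions stay deployed and a single flag decides which one handles requests.

```python
class VersionSwitch:
    """Routes requests to one of two deployed versions (hypothetical names)."""

    def __init__(self, current, previous):
        self._versions = {"current": current, "previous": previous}
        self.active = "current"

    def handle(self, request):
        # Delegate to whichever version is active right now.
        return self._versions[self.active](request)

    def switch_to_previous(self):
        # The rollback: no redeploy, only a change of routing.
        self.active = "previous"


def release_1(request):
    return f"v1 handled {request}"


def release_2(request):
    return f"v2 handled {request}"
```

Usage would be something like switch = VersionSwitch(release_2, release_1), with switch.switch_to_previous() called when a critical defect shows up in the current version.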

I think a good start and the lowest cost and complexity would be to have decent roll-forward and roll-back scripts for the database. It won’t be fail safe, but it will work most of the time.
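
A minimal sketch of what paired roll-forward and roll-back scripts could look like, using Python’s built-in sqlite3 module. The tables and the MIGRATIONS list are invented for the example, and the schema version is kept in SQLite’s user_version pragma; a real project would more likely use a dedicated migration tool.

```python
import sqlite3

# Hypothetical migrations: (version, roll-forward script, roll-back script).
MIGRATIONS = [
    (1,
     "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
     "DROP TABLE customer"),
    (2,
     "CREATE TABLE customer_email (customer_id INTEGER PRIMARY KEY, email TEXT)",
     "DROP TABLE customer_email"),
]


def current_version(conn):
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn, target):
    """Move the schema forward or backward until it is at version `target`."""
    version = current_version(conn)
    while version < target:
        number, forward, _ = MIGRATIONS[version]
        conn.execute(forward)
        version = number
        conn.execute(f"PRAGMA user_version = {version}")
    while version > target:
        number, _, backward = MIGRATIONS[version - 1]
        conn.execute(backward)
        version = number - 1
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return version
```

Calling migrate(conn, 2) rolls forward to the latest schema and migrate(conn, 1) undoes the last step. Note that the roll-back scripts here drop tables, so any data written after the release is lost; that is part of why it won’t be fail safe.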
