Last week, we were fortunate to have Eric Ries come out and spend some time talking with our team while he was here for the Agile Vancouver event. We had the chance to talk about 5 whys, split testing and other topics. I would have liked to spend a bit more time discussing continuous deployment, but I did get some more insight into how they got started with CD at IMVU.

One thing that I was surprised to learn was that IMVU started out with continuous deployment. They were deploying to production with every commit before they had an automated build server or extensive automated test coverage in place. Intuitively this seemed completely backwards to me – surely it would be better to start with CI, build up the test coverage until it reached an acceptable level and then work on deploying continuously. In retrospect and with a better understanding of their context, their approach makes perfect sense. Moreover, approaching the problem from the direction that I had intuitively is a recipe for never reaching a point where continuous deployment is feasible.

Initially, IMVU sought to quickly build a product that would prove out the soundness of their ideas and test the validity of their business model. Their initial users were super early adopters who were willing to trade quality for access to new features. Getting features and fixes into the hands of users was the greatest priority – a test environment would just get in the way and slow down the validation coming from having code running in production. As the product matured, they were able to ratchet up the quality to prevent regressions on features that had been truly embraced by their customers.

Second, leveraging a dynamic scripting language (like PHP) for building web applications made it easy to quickly set up a simple, non-disruptive deployment process. There are no compilation or packaging steps that would normally be performed by an automated build server – just copy the files and change the symlink.
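
As a rough sketch of what that kind of symlink flip looks like (my own illustration, not IMVU’s actual script – the directory names are made up):

```python
import os
import shutil
import time

RELEASES_DIR = "/var/www/releases"   # each deploy lands in its own timestamped directory
CURRENT_LINK = "/var/www/current"    # the web server serves whatever this symlink points at

def flip_symlink(target):
    """Point CURRENT_LINK at target atomically: build the new link under a
    temporary name, then rename it over the old one. Requests always see
    either the old release or the new one, never a half-copied tree."""
    tmp_link = CURRENT_LINK + ".tmp"
    if os.path.lexists(tmp_link):
        os.unlink(tmp_link)
    os.symlink(target, tmp_link)
    os.replace(tmp_link, CURRENT_LINK)

def deploy(build_dir):
    release = os.path.join(RELEASES_DIR, time.strftime("%Y%m%d%H%M%S"))
    shutil.copytree(build_dir, release)
    flip_symlink(release)
    return release

def rollback(previous_release):
    # Rolling back is the same flip, pointed at an older release directory.
    flip_symlink(previous_release)
```

With no build artifacts to produce, deployment reduces to a copy and a rename – cheap enough to hang off every commit.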

Third, they evolved ways to selectively expose functionality to sets of users. As Eric said, “at IMVU, ‘release’ is a marketing term”. New functionality could be living in production for days or weeks before being released to the majority of users. They could test, get feedback and refine a new feature with a subset of users until it was ready for wider consumption. Users were not just an extension of the testing team – they were an extension of the product design team.
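
To make the idea concrete, here is a very rough sketch of that kind of gating – entirely my own illustration with made-up gate names, not IMVU’s implementation. Code for a feature ships to production dark, and a per-user check decides who actually sees it:

```python
import hashlib

# Hypothetical gates: a feature can be opened to specific user groups and/or
# to a percentage of all users, independently of when the code was deployed.
GATES = {
    "new_avatar_editor": {"groups": {"staff", "beta_testers"}, "percent": 5},
}

def bucket(feature, user_id):
    """Deterministically map a user to a number in [0, 100) for this feature,
    so the same user keeps getting the same experience between visits."""
    digest = hashlib.md5(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100

def is_enabled(feature, user_id, user_groups):
    gate = GATES.get(feature)
    if gate is None:
        return False  # deployed but dark: no one sees it until a gate is defined
    if user_groups & gate["groups"]:
        return True   # e.g. internal staff and beta testers see it first
    return bucket(feature, user_id) < gate["percent"]

# In a request handler:
# if is_enabled("new_avatar_editor", user.id, user.groups):
#     render_new_editor()
```

“Release” then really is a marketing decision – widening the gate to everyone, long after the code first reached production.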

Understanding these three factors makes it clear why continuous deployment was a starting point for IMVU. In contrast, at most organizations – especially those with mature products – high quality is the starting point. It is assumed that users will not tolerate any decrease in quality. Users should only see new functionality once it is ready, fully implemented and thoroughly tested, lest they get a bad impression of the product that could adversely affect the company’s brand. They would rather build the wrong product well than risk this kind of exposure. In this context, the automated test coverage required before the first continuous deployment would have to be so comprehensive that continuous deployment becomes infeasible for most systems. Starting instead from a position where feedback cycle time is the priority and allowing quality to ratchet up as the product matures provides a more natural lead-in to continuous deployment.

For my company, even though we do weekly deployments, we’re still a fair way off from being able to deploy continuously. As we are operating in a new and rapidly evolving market, we focus on building and releasing a simple initial version of each new feature to demonstrate the potential of the software. We can then gather feedback and invest more effort in expanding the features that resonate with our clients. While we do routinely expose new functionality selectively to a subset of users (generally internal users) to solicit feedback, we still need to create more sophisticated ways to do user segmentation. Aside from the obvious bugbear of automated test coverage (we use JUnit and Selenium, but our coverage isn’t nearly good enough), our main blocking issue from a technology perspective is the deployment process itself.

To deploy continuously, the deployment has to be quick and it has to be transparent to end users (i.e. there should be no visible downtime). Performing a rollback should have the same characteristics. Our deployment process is automated, but in the world of Java application servers (even lightweight ones like Glassfish) deployment is anything but fast. Deployment entails all kinds of work that the app server needs to do (parsing configuration files, generating WSDLs, starting thread pools, etc.) during which the application is unresponsive. Also, because of memory leak issues in the container, we always restart the application server with each deployment anyway. All in all, the only way to avoid downtime is to pull the application server out of the load balancer pool until the deployment completes. Rollback is the same process in reverse.
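
In the meantime, the closest we can get looks roughly like the sketch below. The load balancer and app server calls are placeholders for whatever your infrastructure actually exposes (ours are internal scripts), and the hostnames are made up:

```python
import time
import urllib.request

APP_SERVERS = ["app1.example.com", "app2.example.com"]  # hypothetical hostnames

def remove_from_pool(host):
    print(f"draining {host} from the load balancer pool")      # placeholder

def add_to_pool(host):
    print(f"putting {host} back into the load balancer pool")  # placeholder

def redeploy(host, war_path):
    # Placeholder: copy the WAR across and restart the app server – we restart
    # anyway because of the container memory leaks mentioned above.
    print(f"deploying {war_path} to {host} and restarting the app server")

def healthy(host):
    """The node only goes back in once a status page answers; until then the app
    server is busy parsing config, generating WSDLs, starting thread pools, etc."""
    try:
        with urllib.request.urlopen(f"http://{host}/health", timeout=5) as resp:
            return resp.status == 200
    except OSError:
        return False

def rolling_deploy(war_path):
    for host in APP_SERVERS:
        remove_from_pool(host)    # no traffic while the node is unresponsive
        redeploy(host, war_path)
        while not healthy(host):
            time.sleep(5)
        add_to_pool(host)

# Rollback is the same loop, pointed at the previous known-good WAR.
```

None of this makes the deployment itself any faster; it just keeps the slowness out of sight of users.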

A bit of an aside, but I know of some teams that package Glassfish with their app, inverting the container metaphor and simply treating it as another library/dependency. This makes it easier to just flip the symlink on deployment and rollback. It’s an interesting idea, as long as you don’t mind copying a massive WAR to production with each deploy (which for us would just shift the deployment bottleneck to the network).

We have made a fair bit of headway on streamlining our deployment process, and while we’re not ready to do continuous deployments into production, I am trying to get us into a position where we can do continuous deployment to test. I used to be of the opinion that deployment to test was something that should be controlled by testers (via a deploy button on the automated build server). Most testers want to work against a stable baseline, limiting the number of variables that they are dealing with when testing the app. But this is a fallacy because a batch of changes simply piles up behind whatever version is deployed into test. It’s classic batch-and-queue thinking.

What if deployments happened without downtime, in a way that was invisible to the tester or the end user? What if test coverage was sufficient to ensure that there would be no regression in major areas of functionality? I think the fears about continuous deployment into test and the need for a stable baseline would evaporate. Moreover, this is something we would want to test, because it would mirror what users of the site experience when a new version goes into production. In our office, every time we do a deployment to test, someone needs to call out “deploying to test”. This too would go away.

That’s the plan anyway. Over the next couple of weeks, I’ll see if we can move closer to achieving it. I’ll let you know how it goes.