John is an experienced consultant specialising in Enterprise Java, Web Development, and Open Source technologies, currently based in Sydney, Australia. Well known in the Java community for his many published articles, and as author of Java Power Tools and Jenkins: The Definitive Guide, and founder of the open source Thucydides Automated Acceptance Test Library project, John helps organisations to optimize their Java development processes and infrastructures and provides training and mentoring in agile development, automated testing practices, continuous integration and delivery, and open source technologies in general. John is the CEO of Wakaleo Consulting, and runs several Training Courses on open source Java development tools and best practices. John is a DZone MVB and is not an employee of DZone and has posted 125 posts at DZone.

BDD, Automated Acceptance Tests and Continuous Delivery: Dealing with Scenarios that are "Work-in-Progress"

One of the principal rules of Continuous Integration (and Continuous Delivery) is that you should never knowingly commit code that will break the build. When you practice test-driven development this is easy: you write a failing test (or, more precisely, a failing "executable specification"), make it pass, and then refactor as required. You only commit your code once you have refactored and run all of your unit tests, to ensure that you haven't inadvertently broken anything elsewhere in the code.

But acceptance tests typically require a lot more code than unit tests, and take a lot longer to implement. If you start with a failing automated acceptance test, you may have a failing test for hours or even days.

The general principle of CI still applies for automated acceptance tests - you should never knowingly commit code that breaks one on the build server. When people do this, it inevitably results in a continual stream of broken builds, which people ignore because it is considered the normal state of affairs. There is no easy way to know if a build is broken because of a regression, or because of an "in-progress" acceptance test. In these circumstances, CI has very little value. The status reporting becomes flawed. If "real" regression issues occur, they are detected and fixed more slowly. And any attempt at Continuous Delivery becomes impossible, since you can never reliably know when a build is ready to be released into production.

Here are a few techniques that teams use to get around this problem:

Tagging the acceptance tests

One common approach used with tools like JBehave, Cucumber and SpecFlow is to tag the acceptance tests that are work in progress, and to configure the Continuous Integration build to only run the stories without the work-in-progress tag. For example, the following JBehave scenario uses the @wip tag to mark a scenario that is work-in-progress:

User Authentication

In order to prevent unauthorized use of member points
As the system admin
I want users to authenticate before they can access their account


Scenario: Successful authentication
Meta:
@wip
Given Jane is a registered Frequent Flyer
When Jane authenticates with a valid email address and password
Then Jane should be given access to her account
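
The filtering that the CI build performs can be pictured with a small, library-free sketch. It is an illustration only and does not use the JBehave, Cucumber or SpecFlow APIs: the `Scenario` class, the `runnableOnCi` method and the second scenario title are hypothetical names introduced here. The idea is simply that the build server runs every scenario except those carrying the work-in-progress tag.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class WipFilterSketch {

    // Hypothetical model of a scenario: a title plus the tags attached to it.
    static class Scenario {
        final String title;
        final Set<String> tags;

        Scenario(String title, Set<String> tags) {
            this.title = title;
            this.tags = tags;
        }
    }

    // Returns the scenarios the CI build should run:
    // everything that does NOT carry the excluded tag.
    static List<Scenario> runnableOnCi(List<Scenario> all, String excludedTag) {
        return all.stream()
                  .filter(scenario -> !scenario.tags.contains(excludedTag))
                  .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Scenario> scenarios = Arrays.asList(
            // Still being implemented, so it is tagged as work in progress.
            new Scenario("Successful authentication", Collections.singleton("wip")),
            // Assumed example of a finished scenario with no tags.
            new Scenario("Viewing the points balance", Collections.<String>emptySet()));

        for (Scenario scenario : runnableOnCi(scenarios, "wip")) {
            System.out.println("CI will run: " + scenario.title);
        }
    }
}
```

A developer working on the tagged scenario would run it locally without the exclusion, while the build server keeps the filter in place until the tag is removed.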

This approach manages the living documentation well enough, but some other aspects need to be considered when it comes to actually implementing the features.

Feature Branches

Many teams use short-lived, preferably local, branches to develop new features. This is fast, easy and common practice for teams using Git. The Linux code base, for example, relies extensively on feature branching to develop and integrate new features.

For teams still on centralized version control systems, feature branches are less attractive: branching and merging with tools like Subversion can be a painful process, and the concept of a local branch generally does not exist. But they can still be a viable option. The trick is not to let the branch live for too long (for example, more than a couple of days), because long-lived branches create a risk of integration issues down the track.

At the risk of stating the obvious, feature branches should also include the corresponding automated tests, whether they be unit, integration, acceptance, or any other automated tests that will be run on the build server. These are written and run locally, alongside the application code, and merged back into the master branch along with the application code when the feature is finished.

Incremental implementation

Another, preferable, approach is to break down the new feature into small pieces that can be built and delivered incrementally. Even if short-lived feature branches are often used for this sort of work (simply because they are convenient, and make it easier to experiment safely), the increments are completed quickly, often within a few hours, before being merged back into the master.

For bigger changes, you can use a slightly different approach. This usually involves building the new feature in isolation, maintaining the existing solution until you are ready to replace it completely. For example, suppose you need to replace a payment processing module in your application. This is a large chunk of work that you won't be able to do in one sitting. The first thing you do is to isolate the payment processing module, for example behind an interface (if you are using a dependency injection framework such as Spring or Guice, this may already be done as part of your normal development work). You then build an alternative implementation of the module, according to the new or modified requirements, using TDD to drive the design and implementation. Your new acceptance tests use the new module; once these all pass, you are ready to replace the old implementation with the new.
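
The payment module example can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions rather than a recommended design: the names `PaymentProcessor`, `LegacyPaymentProcessor`, `NewPaymentProcessor` and `CheckoutService`, and the string receipts they return, are all hypothetical. Constructor injection is done by hand here, where a framework such as Spring or Guice would normally do the wiring.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class PaymentModuleSketch {

    // The seam: application code depends only on this interface.
    interface PaymentProcessor {
        String charge(String accountId, BigDecimal amount);
    }

    // Existing implementation; it stays wired into the application
    // until the replacement is complete.
    static class LegacyPaymentProcessor implements PaymentProcessor {
        @Override
        public String charge(String accountId, BigDecimal amount) {
            return "legacy:" + accountId + ":" + amount;
        }
    }

    // New implementation, built in isolation and driven by its own tests.
    // The two-decimal formatting is an assumed "new requirement".
    static class NewPaymentProcessor implements PaymentProcessor {
        @Override
        public String charge(String accountId, BigDecimal amount) {
            return "new:" + accountId + ":" + amount.setScale(2, RoundingMode.HALF_UP);
        }
    }

    // Application code receives whichever implementation it is given.
    static class CheckoutService {
        private final PaymentProcessor processor;

        CheckoutService(PaymentProcessor processor) {
            this.processor = processor;
        }

        String checkout(String accountId, BigDecimal amount) {
            return processor.charge(accountId, amount);
        }
    }

    public static void main(String[] args) {
        // The build that ships keeps using the legacy module...
        CheckoutService production = new CheckoutService(new LegacyPaymentProcessor());
        // ...while the new acceptance tests exercise the new module.
        CheckoutService underTest = new CheckoutService(new NewPaymentProcessor());

        System.out.println(production.checkout("jane", new BigDecimal("10")));
        System.out.println(underTest.checkout("jane", new BigDecimal("10")));
    }
}
```

Because both implementations compile and pass their own tests at every commit, the build stays green throughout; the final switch is a one-line change to the wiring.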

This approach is similar to the idea of "Feature Toggles" promoted by Martin Fowler, but much simpler to implement. It makes it feasible to work directly against the master branch, though it will not reduce the risk of integration issues if the development takes too long.


In both of these cases, the aim of the game is to never commit code that breaks a build, but at the same time to keep your code up to date with the latest changes in the code base.

John Ferguson Smart is a specialist in BDD, automated testing, and software life cycle development optimization, and author of BDD in Action and other books. John runs regular courses in Australia, London and Europe on related topics such as Agile Requirements Gathering, Behaviour Driven Development, Test Driven Development, and Automated Acceptance Testing.

Published at DZone with permission of John Ferguson Smart, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)


Serge Bureau replied on Thu, 2014/08/21 - 4:24am

You can never warrant that your code won't break the build.

TDD or not.


John Ferguson Smart replied on Thu, 2014/08/21 - 6:35am

The key word here is "knowingly": the habit I am referring to here is when developers knowingly commit code that will break the build, because some of the acceptance tests are still "work-in-progress".

Serge Bureau replied on Fri, 2014/08/22 - 1:13am in response to: John Ferguson Smart


Of course knowingly is bad !

but linking TDD as a condition before committing does not make much sense.

What is the quality of the tests, how do you measure it ?

Tests are important, but I think that too much is expected from them, they can be badly written as any other code.

Also, from experience: on complex applications the bugs are coming mostly from timing issues and from complex multi-processor or distribution issues.

It is incredibly rare that unit tests brought an error, all the work for not much.

Integration testing is where the work should go.

Unit tests and TDD do not help me sleep better


John Ferguson Smart replied on Fri, 2014/08/22 - 1:46am

I've seen teams over-invest in unit tests, but usually when they think of them as tests, which in my opinion is not an effective way of working. And I absolutely agree, tests can be badly written, in which case they lose a lot of their value.

For these reasons, I prefer to think in terms of executable specifications, starting from high level, often end-to-end automated acceptance criteria (which play the role of integration tests), and working down to more detailed specifications for low level components. Whether it is implemented as a unit or an integration test is not that important to me, unless it adversely affects the time it takes to run (and hence the feedback cycle). The quality of the tests, and just as importantly, their relevance, is built in: if I don't know what a unit test does, I consider that I might as well delete it because if it fails, I won't know what to do. So the way the tests are named, how they are organized, and how they are written, is critical. Tests written this way are good at finding regressions, but they also document your thought process when you wrote them.

Written this way, these tests most definitely do help me sleep better ;-).
