BDD is much more than Gherkin or Cucumber

Written by Thomas Pryce | 05 August 2020 14:07:33 Z

Behaviour Driven Development is, at its heart, about communication. It’s not about using Gherkin to formulate specifications, or Cucumber to run tests. It’s about stakeholders across the delivery lifecycle working in parallel from a shared vision, delivering systems quickly enough that they reflect fast-changing user needs.

BDD is much more than Gherkin or Cucumber: three approaches that are not BDD

Done right, BDD means no more miscommunication, minimal rework, and no more thinking “If only someone had told me that!”[i] It introduces a “ubiquitous language”: one that is understood by stakeholders regardless of their background.

Using this bridge language, everyone can work in parallel from an up-to-date understanding of the desired user behaviour. System designers, developers and testers can provide rapid feedback, communicating constantly and clearly to stay aligned. This in turn avoids the frustrating delays and costly bugs caused by building and testing the ‘wrong’ system.

Sounds great, right? This two-part article considers one approach for achieving this vision in practice. First, Part one (below) considers the core goals of BDD, evaluating some approaches that are not BDD in the full sense described. Part two then sets out a model-based approach, using visual modelling to maximise communication, collaboration and automation across the delivery lifecycle.

The driving argument will be that this alternative approach, placing flowcharts centrally, offers a practical approach to realising the benefits of Behaviour-Driven Development.

Criteria for evaluating behaviour-driven approaches

BDD aims to maintain a near-constant understanding of the desired system behaviour, with stakeholders in different roles collaborating seamlessly in parallel.

To this end, BDD aims to communicate clearly as soon as any one stakeholder’s understanding of the desired system changes. That understanding can then be updated if it has become misaligned, or it should be synthesized with a new understanding of the desired system. This makes sure that all stakeholders are pulling right directions.[ii]

Like other forms of iterative development, BDD must also be able to move this agreed vision rapidly from idea to release. The parallel design, development and testing must be at least fast enough to keep pace with changing user needs, incrementally improving the latest system to better fulfil them.

Maintaining an up-to-date understanding of changing user needs is therefore the first criterion for successful BDD. The second is the ability to deliver those fast-changing user needs accurately in production.

Three approaches that are not behaviour driven development

With these two goals, Behaviour Driven Development will rarely constitute the following three approaches:

1. Introducing Gherkin to existing processes and doing nothing else.

Requirements gathering formats like Feature Files or user stories offers far greater flexibility to change than trying to maintain monolithic requirements documents. That’s good for BDD and Gherkin is a particularly common syntax for this.

A further, significant advantage of Gherkin in Behaviour-Driven contexts is that it leverages natural language, easily understandable by a range of different stakeholders.

So, how well does introducing Gherkin by itself fulfil our first criterion? How well does it retain an up-to-date, accurate and complete understanding of fast-changing desired system behaviour?

In short, not particularly well.

Firing off scenarios in Gherkin Feature Files does allow you to make change requests quickly; however, it rarely explains how the new Feature Files relate to previously defined scenarios. Fragmentary Feature Files in turn often simply pile up, providing little guidance on how the vast system logic should relate, or what needs to be changed across the existing system.

This increases the scope for incompleteness, as it is hard to spot logically gaps in a myriad of repetitive, unconnected text-based Files. Meanwhile, the mass of existing Feature Files can grow increasingly out-of-date as new ones are added. Without updating existing files, technical debt and uncertainty will grow.

The natural language used in Gherkin further increases the scope for ambiguity. Its lack of logical precision risks mistranslation as written language is converted into logically precise (and unforgiving) code. This increases the potential for miscommunication, bugs and time-consuming rework.

That’s not to say that Gherkin should not feature in Behaviour-Driven approaches. It can and, in many cases, should; however, it must be combined with a broader strategy for maintaining a logically accurate, up-to-date and complete understanding of the desired system behaviour.

This understanding must moreover be one that can be used directly in testing and development, without the risk of misunderstanding or mistranslation. For this, the Feature Files must feed a logically precise picture of the desired system behaviour, complete with dependency mapping between components. This enables accurate impact analysis whenever a new change request arrives.

2. Slowly converting behaviour-driven requirements to out-of-date tests

In addition to maintaining an up-to-date, shared understanding of the desired system behaviour, BDD must be capable of translating this desired behaviour rapidly into code. The updated system must be released quickly enough to keep pace with fast-changing user needs.

Today, this demands rapid iterative delivery, developing and testing in parallel from Behaviour-Driven specifications. QA must be capable of building up-to-date tests to ensure the quality of each release, often building test suites while or before code is being developed.

However, rigorously testing fast-changing systems in short iterations is frequently made impossible by time-consuming dependencies and cumbersome manual processes. These swallow time and undermine test coverage, forcing test execution behind releases. This in turn reduces the likelihood that production systems will accurately reflect the desired system behaviour.

I’ve written a fair amount previously on some of these challenges, each of which could constitute their own article. For now, I’ve summarised five of the most common QA barriers to BDD, linking to additional resources on each. Each of these bottlenecks is capable of failing the second criterion of BDD set out above:

Manual test design is simply too slow and unsystematic to test complex systems rigorously in short iterations. It’s hard to know what logic needs testing after systems change, and manually created test cases instead typically focus on the ‘low hanging fruit’, happy path scenarios. This leaves off the majority of negative tests, leaving much system logic exposed to damaging bugs.
Unavailable or incomplete test data throws up some of the biggest delays. QA teams often wait for data to be provisioned by an upstream team, but then there is never enough data to test in parallel. Slowly copied production data furthermore lacks the variety needed for comprehensive functional testing, while out-of-date or invalid data combinations creates frustrating test failures. The challenges posed by test data are today so great that the average QA team spends 44% of their time waiting for or making data, or hunting for combinations among large production data copies.[iii]
Repetitive test scripting can undermine any time saved from automated testing. Automating test execution is necessary when testing fast-changing, complex systems, as there are today more tests than could ever be run manually in-sprint. However, copying, pasting and editing repetitive boilerplate code frequently replaces one bottleneck with another: manual test creation is replaced by slow and manual test scripting.
Test environment management is another complex process that can undermine test coverage. In production, users might use a vast array of configurations, made up of devices, operating systems and related software. However, there is no time spin up thousands of possible configurations manually for test execution, while the hardware and software would simply be too costly. Cloud-based execution offers a fast alternative to executing on physical devices, but it does not overcome the challenge of unfinished or unavailable software dependencies in test environments.
Maintenance of existing test artefacts is typically the greatest barrier to rigorous testing in short iterations. After a system change, a lack of dependency mapping between requirements, code and tests forces QA teams to check their existing assets. They must then update a mass of test cases, scripts and data manually, making sure all the artefacts remain in alignment.

Today, regression suites can contain thousands of tests, making it impossible to manually update all existing artefacts after every change. In turn, test teams are forced to focus either on testing newly introduced logic, or running the regression needed to identify integration errors and unexpected impact of system change. Either way, system logic goes untested before a release, exposing systems to damaging bugs.

Again, the format of system requirements are frequently a root cause of these delays. The lack of logical precision and ‘flat’ format of text-heavy Gherkin Feature Files prevents automated or systematic test design. Meanwhile, a lack of dependency mapping between Gherkin and a lack of traceability to test artefacts prevents automated maintenance of test assets after a system changes.

QA teams instead manually translate the Gherkin into test artefacts, often waiting for code to be developed to get a better understanding of the system under test. This prevents parallelisation of testing and development, pushing testing ever-later. These ‘mini-Waterfalls’ – in which systems are designed, then developed, and only then tested – prevents the rapid, iterative delivery necessary for effective BDD.

A requirements-driven approach to test design would instead be a truly behaviour-driven approach, by virtue of the requirements themselves being Behaviour-Driven. If this same approach to test design and maintenance is automated, it can furthermore remove the bottlenecks set out in this section. Rigorous QA can then occur in-sprint, detecting damaging defects before each release. Such an approach will be set out in Part Two.

3. Releasing untested code at the end of an ‘iteration’.

To conclude Part One of this article, let’s consider an unfortunately common alternative to tackling the challenges to in-sprint testing. This alternative allows testing to roll continuously over to the next sprint, growing further behind development. Sometimes, ‘newly’ released functionality might be tested sprints (and months) after its release.

As emphasised above, this approach is not compatible with truly Behaviour-Driven Development. Releasing untested functionality to production leaves fast-changing systems exposed to damaging bugs, failing the second criterion defined above. These bugs not only cost up to 10 times more to fix than those found during QA,[iv] but they are almost certainly not compatible with the desired user behaviour.

So, what does a truly behaviour-driven approach to software delivery look like? Part two of this article will set out a model-based approach to capturing fast-changing user requirements accurately and comprehensively, before using those same models to drive accurate development and rigorous in-sprint testing.

Endnotes:

[i] Dan North (2006), Introducing BDD, retrieved from https://dannorth.net/introducing-bdd/ on August 5^th 2020

[ii] Of course, not everyone needs to be pulling in one, homogenous direction. There will be exploration and experimentation, as well as divergent understands to be explored. Yet, all of these understandings should be valid and backed by a chance of successfully implementing the desired user behaviour; working from a misunderstanding does not achieve this.

[iii] Sogeti, Capgemini (2019), World Quality Report 2019-20, retrieved from https://www.capgemini.com/gb-en/research/world-quality-report-2019/ on August 5^th 2020.

[iv] National Institute of Standards and Technology, cited in Deep Source (2019), Exponential Cost of Fixing Bugs, retrieved from https://deepsource.io/blog/exponential-cost-of-fixing-bugs/ on August 5^th 2020.

View full post