Agile Automated Testing Strategies: Flipping the Testing Pyramid Right-Side-Up

Transition to Agile Testing:

Flipping the Automated Testing Pyramid Right-Side Up

Strategies for adopting healthy automated testing practices, tools, and skills.

Abstract

A partly-experiential “survey course” of the three main initiatives that make up an overall attempt to transition to healthy automated testing practices — what I have been recently calling an overall Automated Test Design (ATD). Forgive the acronym, but it has been a useful “orienting generalization” for my work with clients, especially when I am pulling together software professionals with different traditional roles and responsibilities.

Goal

To summarize the technical and learning challenges inherent in each of the three kinds of automated tests in a healthy “ATD,” to give a taste of the three kinds of testing in unhealthy and healthy form, to sketch strategies for launching and nurturing all three initiatives, and to inspire and encourage attendees that in fact, Yes, such a sweeping transition is possible.

The Ideal Automated Testing Pyramid

Mike Cohn introduced a very useful notion a few years back: a pyramid of three kinds of automated tests that are useful to an agile software development team.

Tier One

The lowest tier of Mike’s pyramid is a fat foundation of robust, high-speed unit tests, written by programmers. We want these tests to carry as much of the load of verifying that we are “building the thing right” as possible. We want them to represent the bulk of the automated testing work in general. We would like to have more of this kind of test (xUnit test methods) than any other kind of test. Furthermore, we want as many of these test methods as possible to fit Michael Feathers’ definition of unit tests, which are essentially isolation tests that:

• test and specify a single tiny sliver of behavior,

• run entirely in memory,

• don’t rely on any external resources like files, networks, or persistence systems (which likely involves more than a few test doubles, especially fakes), and

• comply with the healthy patterns best expressed in Gerard Meszaros’ xUnit Test Patterns book.
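As a minimal sketch of what such an isolation test can look like, the example below tests one sliver of behavior entirely in memory, with a hand-written fake standing in for a persistence system. Every name here (AccountStore, InMemoryAccountStore, DepositService) is hypothetical and invented for illustration, and the test uses a plain check method rather than a real xUnit framework so that it runs with nothing but the JDK.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical collaborator: in production this might talk to a database.
interface AccountStore {
    int balanceOf(String accountId);
    void save(String accountId, int balance);
}

// A fake: a working but simplified implementation that lives in memory.
class InMemoryAccountStore implements AccountStore {
    private final Map<String, Integer> balances = new HashMap<>();

    @Override
    public int balanceOf(String accountId) {
        return balances.getOrDefault(accountId, 0);
    }

    @Override
    public void save(String accountId, int balance) {
        balances.put(accountId, balance);
    }
}

// The unit under test. It is handed its store instead of creating one.
class DepositService {
    private final AccountStore store;

    DepositService(AccountStore store) {
        this.store = store;
    }

    void deposit(String accountId, int amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        store.save(accountId, store.balanceOf(accountId) + amount);
    }
}

class DepositServiceIsolationTest {
    // One tiny sliver of behavior: a deposit adds to the existing balance.
    static void depositAddsToExistingBalance() {
        AccountStore store = new InMemoryAccountStore();
        store.save("acct-1", 100);

        new DepositService(store).deposit("acct-1", 25);

        check(store.balanceOf("acct-1") == 125, "expected balance of 125");
    }

    // Stand-in for an xUnit assertion, to keep the sketch dependency-free.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        depositAddsToExistingBalance();
        System.out.println("1 test passed");
    }
}
```

In a real suite the same test body would sit inside a JUnit test method. The point of the sketch is only that no file, network, or database is touched, so the test stays fast and deterministic.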

Tier Two

Next up from that tier is a middle tier of what Cohn labels “Acceptance Tests.” We want these tests verifying that we are “building the right thing,” but not carrying any of the load of verifying that we are “building the thing right.” We want these to be a requisite mix of end-to-end tests that indeed verify complete round-trips for features through all system resources and touch points. We want these to be visible to lots of stakeholders, using something like FitNesse, ZiBreve, or a similar Fit framework. We might easily end up with significantly fewer Fit test tables than xUnit test methods, but what really matters is that the Fit tests are describing feature-level functionality, not checking for defects.

Tier Three

The top, thinnest level of our pyramid is made up of “through the GUI” tests, meaning automated tests that actually operate the GUI of the application. We want to have fewer of this kind of test than any other, because they are inherently brittle and slow. In presentations of the pyramid, Mike has sometimes drawn a little dot above the pyramid, indicating the number of manual tests we would like to rely upon. We want that number to be tiny compared to the number of automated tests of all kinds. This is the pyramid “right-side up,” our desired end-state.

Wait! We Don’t Get to Start There! We Start with an Inverted Pyramid.

We want to end up with the right-side-up pyramid, and the good news is that we can. The bad news, for many teams adopting any agile blend, is that we cannot get there today. It’s a path, a journey, with many little steps.

Indeed, our initial pyramid is likely to be more or less inverted, with way too many costly manual and GUI tests at the top, and way too few cheap unit tests at the bottom, and perhaps no automated acceptance tests to speak of. And you know what? That’s OK. The brittle tests are typically easier to learn and write. The healthy unit tests, as wonderful as they are, are not cheap or easy for most pre-agile teams to learn; they have many prerequisites and preconditions.

Transition Patterns: Flipping the Inverted Pyramid Right-Side-Up

My tutorial dives into the TCO (Total Cost of Ownership) of the different kinds of tests, the costs of learning the different kinds of tests, and transition patterns between these two kinds of pyramids: the inverted, high-TCO pyramid, and the right-side-up, low-TCO pyramid. My goal is to present summary automated testing strategies in a way that is at once forthright, pragmatic, inspiring, and encouraging.

I’ll also touch on several miscellaneous factors that affect automated testing adoption strategies and patterns, including:

• Continuous Integration as foundation for automated testing,

• control/generation/ownership of canonical test data for end-to-end & Fit tests,

• cultural issues, especially relationships between QA groups and other groups,

• the entire team owns the ATD, even though different traditional roles gravitate toward different test types among our three types,

• database-related development and schema evolution practices, like CI-only db instances for testing, and individual developer db sandboxes.

Process/Mechanics

Tutorial Format, Flow, Schedule, Logistics

  1. First I’ll introduce the topics above quickly, in broad terms.

Duration: less than 20 minutes.

  2. GUI-testing exercise: If time seems to permit, I’ll ask for volunteers to help me create a small smoke-test suite of Selenium regression tests for an existing web app (or public website navigation path). We’ll project this exercise for everyone to see.

Alternatively, if time is already short, I’ll display some manual test scripts, HTMLUnit tests, and Selenium tests that all accomplish the same work, and ask for comparisons from the crowd. We’ll at least have time to tweak a Selenium test suite, and demonstrate both brittleness and adaptability.

Duration: 25 minutes for the exercise/demo and questions and answers.

  3. I’ll then summarize theory and practice of Fit-based automated acceptance testing, its TCO patterns and learning patterns, as described by Mugridge, Cunningham, and others.

Duration: less than 10 minutes.

  4. Fit-testing exercise: if time permits, then as a mob, the entire class and I will create some Fit tests for requirements for an existing toy codebase. We’ll do the work and handle questions and answers as we go.

Alternatively, if time is short, I’ll show samples of an end-to-end test run from JUnit (whose results are really only visible to programmers), then the same test run from FitNesse, then perhaps the same test run from ZiBreve. I’d also like to show examples of unnecessarily thick Fit fixture code and badly factored FitNesse test suites, compared to well-factored test suites and nicely thin Fit fixtures. I’ll solicit comments about FitNesse refactoring headaches, Fit fixture patterns and anti-patterns/headaches, etc.

Duration: 25 minutes.

  5. I’ll then introduce the theories, practices, patterns, and prerequisites of xUnit testing, drawing on everyone from Rainsberger, Meszaros, Beck, Uncle Bob, Hunt/Thomas, and Astels to Feathers. I’ll boil it down to the handful of concepts that challenge unit testers on greenfield projects: keeping methods small, coding to interfaces, dependency injection, using test doubles (mostly fakes), and keeping unit test suites well-factored.

Duration: 30 minutes.
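To make a few of those concepts concrete (small methods, coding to an interface, constructor-based dependency injection, and a hand-written test double), here is a rough sketch. All of the names (TimeSource, FixedTimeSource, SessionTimeout) are hypothetical and not taken from the tutorial’s toy codebase, and the checks are plain Java rather than JUnit so the sketch runs on the JDK alone.

```java
// Hypothetical abstraction over the system clock, so time can be injected.
interface TimeSource {
    long nowMillis();
}

// Production implementation: reads the real clock.
class SystemTimeSource implements TimeSource {
    @Override
    public long nowMillis() {
        return System.currentTimeMillis();
    }
}

// Test double: time only moves when the test says so.
class FixedTimeSource implements TimeSource {
    private long now;

    FixedTimeSource(long now) {
        this.now = now;
    }

    @Override
    public long nowMillis() {
        return now;
    }

    void advanceMillis(long delta) {
        now += delta;
    }
}

// The class under test depends on the interface, not on the system clock.
class SessionTimeout {
    private final TimeSource time;
    private final long timeoutMillis;
    private long lastActivity;

    SessionTimeout(TimeSource time, long timeoutMillis) {
        this.time = time;
        this.timeoutMillis = timeoutMillis;
        this.lastActivity = time.nowMillis();
    }

    void touch() {
        lastActivity = time.nowMillis();
    }

    boolean isExpired() {
        return time.nowMillis() - lastActivity >= timeoutMillis;
    }
}

class SessionTimeoutSketch {
    public static void main(String[] args) {
        FixedTimeSource clock = new FixedTimeSource(0);
        SessionTimeout session = new SessionTimeout(clock, 1000);

        clock.advanceMillis(999);
        check(!session.isExpired(), "should still be active just before the timeout");

        clock.advanceMillis(1);
        check(session.isExpired(), "should expire once the timeout has elapsed");

        session.touch();
        check(!session.isExpired(), "activity should reset the timeout");

        System.out.println("all checks passed");
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

Because SessionTimeout receives its TimeSource through the constructor, the test never sleeps or reads the real clock. Production code would pass in SystemTimeSource instead.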

  6. xUnit isolation-testing exercise: if we really get lucky and time permits, I’ll ask for pairing volunteer(s) again, and we’ll do a bit of mob Java/JUnit programming on test-driving some functionality for the same toy codebase above. We’ll also take an unhealthy unit test or two, and refactor them into healthy condition.

Alternatively, I’ll show three versions of a JUnit TestCase object tree. Version A will be in poor shape, Version B in slightly better shape, and Version C will be factored the way we would prefer. I’ll ask the crowd to choose which of the three they would prefer to maintain and extend, and why.

Duration: 35 minutes.

  7. Finally, we’ll have a closing discussion of pyramid-flipping transition strategy hints, hacks, and tips, plus Q&A. Here I’ll talk about finding and leveraging local natural leaders/change agents and continuous learners for each of these three kinds of automated testing initiatives (healthy GUI testing, healthy acceptance testing, and healthy unit testing). I’ll describe these natural leaders as people who model continuous learning, courageous experimentation, encouragement of other team members, mentoring, and overall leadership.

Duration: 25 minutes.

Total duration: 170 minutes (10 spare minutes!).

I’ll cover how, without natural leaders on your team of developers, it’s very hard to get a continuously improving unit testing practice going. Without natural leaders among BAs, SMEs, or other customer/user proxies, it’s very hard to get a continuously improving automated acceptance testing practice going. Finally, I’ll cover how I would like natural leaders in traditional QA departments to own issues like overall test data control, testing the APIs at integration points between applications, the evaluation of evolving open-source automated testing tools, and focusing on the kinds of testing that tend to fall between the cracks of unit/isolation testing and Fit/example/story testing.

I’ll reiterate how key a healthy Continuous Integration practice is to all three kinds of initiatives.

Then we’ll have a last bit of retrospective.

Logistics: All of the code and test tools will be installed on my machine, ready to go.

At the end, I shall hand out a list of recommended books and websites that provide resources for all three tiers of the “ATD pyramid.” I shall also have links on my website that point to these resources.