Exploratory Testing - An Introduction

A Comprehensive History and Future Direction

The 1980s and 90s were a time of “professionalizing” testing, that is, making testing stable, predictable, and repeatable. At the same time, a small band of rebels observed that the successful Silicon Valley companies were not doing things that way. Instead of trying to make testers do the same thing every time, companies like Apple, HP, and Borland treated testers as technical investigators, seeing every build as a new and different collection of risks to be managed.

In 1984, Dr. Cem Kaner, then a test manager in Silicon Valley, later a professor of Software Engineering at Florida Tech, coined the term Exploratory Testing. James Bach, another tester in the valley around that time, took the term and ran with it. When James became a consultant, he began publishing under the term, first defining Exploratory Testing as “test design and test execution at the same time,” then later as “simultaneous learning, test design, and test execution”.

The term "simultaneous” isn’t quite accurate - it’s more like a rapid-fire toggle between test design, execution, and learning. First the tester has a test idea and performs it, generating more test ideas from the results of the test. The Workshop on Heuristic and Exploratory Techniques defined exploratory testing this way:

“Exploratory software testing is a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the value of her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project.”

It was in the dotcom era, the late 1990s, that Adam Yuret received his first technology job. With too many open positions and too few candidates to fill them, Yuret was hired as a software tester with no training because he knew a few UNIX commands. When Adam asked how to test, he was told to “play with” the software.

While that is hardly training, the idea contains the nugget, the first step to getting started. In a very small program, with one screen or workflow, it might be possible to cover a reasonable combination of the workflows through play. Once the application is complex enough to require more than a frantic burst of work for an hour or so, testers need some structure and guidance to help them decide what to test next and when to stop.

The book Lessons Learned in Software Testing shows the next level of complexity, suggesting “Dive in and Quit” as a method. That is, dive into one piece of the application and explore it until the energy is exhausted. Take a break, look at the bugs found and the areas of the software left untested, find a new area, and dive into that. Focus and Defocus is a similar approach, where the tester periodically “zooms in” to test in detail, then “zooms out” to look at what the findings mean and make decisions about where to go next. These are exploratory approaches by definition, because what the tester learns helps them understand the software and informs where to go next.

“Coverage,” the idea of looking at as many of the important things as possible (and assessing risk by what is left “not covered”), comes next. For now, we are talking about ways to explore. The sections below outline four more methods of exploratory testing.

Most test design approaches assume the tester has some background on the project. They talk about taking requirements documents and converting them into “test cases.” Sometimes, testers don’t get documents. They might not even get a briefing.

Quick attacks are a series of techniques that create tests from nearly any user interface when the tester has little to no background. The tester generally starts with valid data (a first name, for example, should be short and contain regular characters), then attempts to discover problems in the software by sending in invalid data: a blank field, an incredibly long name, a name with special characters, HTML codes, or nothing but spaces.
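
To make the idea concrete, a quick-attack pass over a single field might look like the parameterized sketch below. The create_contact function here is a hypothetical stand-in for whatever entry point the real application exposes; the attack strings are the part that carries over.

    import pytest

    def create_contact(first_name: str):
        """Stand-in for the real application's entry point; replace this with
        the actual call being quick-attacked."""
        if not first_name.strip():
            raise ValueError("first name is required")
        return {"first_name": first_name}

    QUICK_ATTACK_NAMES = [
        "",                                      # blank field
        "   ",                                   # nothing but spaces
        "A" * 10_000,                            # an incredibly long name
        "Anne-Marie O'Brien Ñoño",               # special and accented characters
        "<b>bold</b><script>alert(1)</script>",  # HTML codes
    ]

    @pytest.mark.parametrize("first_name", QUICK_ATTACK_NAMES)
    def test_first_name_quick_attack(first_name):
        # Deliberately weak oracle: the application may accept or reject the
        # input, but it must fail gracefully rather than crash or hang.
        try:
            result = create_contact(first_name=first_name)
        except ValueError:
            return  # an explicit, handled rejection is acceptable
        assert result is not None

A manual quick-attack session works the same way; the list of inputs simply lives in the tester’s head rather than in a file.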

Quick attacks for responsive design include using tools like Browsershots to see thumbnails of screen renderings in dozens of browsers in seconds, or manually resizing the browser quickly to find rendering errors at specific resolutions. Classic hardware quick attacks include yanking out the mouse during a mouse operation, closing and opening the screen of a laptop without going into hibernate mode, or trying to print to a printer with no paper.

Quick attacks are born from common platform problems. These are issues that tend to crop up in textboxes, or with search screens or checkout screens. These sorts of user interfaces have patterns of error. Quick attacks are just collections of these common platform errors.

The power of quick attacks is that they tend to provide a large number of bugs quickly. They can also be performed almost instantly after the tester comes up with an idea. The common complaint about quick attacks is that the bugs they find may not be important. The counter-argument is that their real power lies in providing information about the status of the software. Quick attacks, after all, represent extreme situations. If the programmer handled those situations well, the programmer probably handled the happy path well. If not, testers know they will have a great deal of testing to do.

Best of all, once quick attack testing is complete, the testers likely know enough about the software to recognize the happy path, with or without advice.

“Walking the happy path” means trying to accomplish the basic goals of the system. For an email system, this is composing, sending, and reading email, along with search and related functions. It’s a simple thing to try to use the system, as a system, to see if that is possible - without any special characters, HTML, embedded code, or massive attachments. Plain, simple, middle-of-the-road use is the happy path, which, often enough, can yield bugs.

When happy path testing finds bugs, they will be important bugs: huge barriers in the path that block progress. Shipping will be delayed until the bugs are fixed and the product re-tested. Meanwhile, testers can go beyond the happy path (at least where the bug is not blocking) and try other approaches like Soap Opera tests or Tours.

Television soap operas are known for extraordinary events: missing people who turn up with amnesia, heroes who barely survive a helicopter crash, and other extravagant twists. Hans Buwalda of LogiGear Corporation adapted that idea for software testing and created the soap opera test, which will be explained in the example below.

In health insurance companies there are criteria that must be met for a claim to be considered valid; for example, a person who has coverage must submit the claim within 30 days of the event. Depending on the plan, dependents of the primary member could be covered until the day they turned eighteen, twenty-one, or twenty-six. A soap opera test might cover a dependent who turned eighteen the day after an accident, with a claim submitted 29 days after the event, for an amount that pushes the account one dollar over the deductible for a plan year that expired the day after the accident: one extreme scenario that exercises a dozen conditions in a single test.
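
Expressed as data, that scenario might look something like the sketch below. The field names, dollar amounts, and the adjudicate() stub are hypothetical stand-ins for whatever the real claims system exposes; the point is how many boundary conditions are stacked into one claim.

    from datetime import date, timedelta

    accident = date(2024, 5, 31)

    claim = {
        "covered_person": "dependent",
        "turns_eighteen_on": accident + timedelta(days=1),  # ages out the day after the accident
        "plan_year_ends_on": accident + timedelta(days=1),  # plan year expires the day after
        "date_of_service": accident,
        "date_submitted": accident + timedelta(days=29),    # just inside the 30-day window
        "amount_cents": 100_000 + 100,                      # one dollar over an assumed $1,000 deductible
    }

    def adjudicate(claim):
        # Stand-in for the real claims engine.
        raise NotImplementedError

    # Every field sits on a boundary, so a single run exercises many rules at once.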

A good soap opera test for a spreadsheet might be a very long, complex mathematical formula that exercised every single function in the math library. Something like sin(cos(abs(10*5*avg(A1:F5)))) would be a start, as long as the correct expected value was known. Make that formula long enough, and it could provide substantial coverage of the code libraries with a single test.
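
Knowing the correct expected value means computing it independently of the spreadsheet. A minimal oracle for the formula above might look like the following, with stand-in values assumed for whatever the A1:F5 range actually holds:

    import math

    # Stand-in values for the A1:F5 range; substitute the real cell contents.
    cells = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    average = sum(cells) / len(cells)

    # Independent oracle for sin(cos(abs(10*5*avg(A1:F5)))).
    expected = math.sin(math.cos(abs(10 * 5 * average)))
    print(f"expected spreadsheet result: {expected:.10f}")

If the spreadsheet and the oracle disagree, the debugging approach described next applies.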

Debugging soap opera tests that fail is a little more work, as testers need to construct progressively simpler tests until they isolate the problem. Sometimes when a soap opera test fails there are many problems to find, and that is okay, too.

Popularized by James Whittaker in his book Exploratory Software Testing: Tips, Tricks, Tours, and Techniques to Guide Test Design, tours are a metaphor used as a guide for looking at a product. You might think of them as the lens through which to examine the product, or the “charter” for a “session” of exploratory testing. In an article for Software Test and Performance Magazine, Michael Kelly described a few tours he learned from James Bach this way:

The first tour he suggested was the feature tour. In the feature tour, you move through the application, getting familiar with all the controls and features you come across. You ask simple questions like “What’s this and what does it do?” and “How would I know if this feature is working?” You look for interactions, calculations, transformations, multimedia and error handling. When taking a feature tour, it can be helpful to look at one factor at a time.

The second tour was the variability tour. In the variability tour, you look for things you can change in the product - and then you try to change them. Click buttons, select values, change settings and so forth. The goal is to try to get a feel for how things work and what possible values might be...

The third tour was the complexity tour. In this tour, one attempts to find the five most complex things about the product. Complexity can exist around features or data. Complex features can be the most common features in the application (the algorithm behind Google search), or they might be rarely used - hidden away, waiting to cause problems (end-of-year processing for an accounting system).

Tours are heuristics; a heuristic is a bit like a guideline, an imperfect method of solving a problem. Whittaker’s book goes on to provide examples and case studies of tours with real products.

Tours, Soap Operas, Happy Paths and Quick Attacks all differ from traditional testing in that they put the tester in the driver’s seat. In all these methods, the tester comes up with an initial, lightweight plan, then changes it as soon as something does not go according to plan. Most of the time, the complete plan doesn’t really exist when the tester starts. Instead, they have a jumping-off point.

The latest thinking in Exploratory Testing is that these approaches should not be treated as different at all: that all testing is, to some extent, exploratory, and that people who plan to “just follow the script” without diverging may be doing a form of testing, but one that lacks the substance.

Even with extremely detailed directions, testers need to investigate problems and go off-script. Dr. Kaner has pointed out similar things: when testers find bugs and go off-script to investigate, they almost never return to the same line. Even when testers do return to the same line, the environment and data in the Software Under Test (SUT) have changed in a way that makes the testing non-repeatable. This contrasts testing with checking, which, per Bach and his colleague Michael Bolton, is “the process of making evaluations by applying algorithmic decision rules to specific observations of a product.”

Checking, then, can be automated by a computer or done by an unthinking human following directions. It is one part of testing, which includes the thinking and the exploring we’ve been talking about here. The skills involved - investigation, learning, in-the-moment test design, evaluation, note-taking and reporting - can be valuable to any technical team member, regardless of role.
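
As a tiny illustration, a check in this sense is just an algorithmic decision rule applied to a specific observation. In the sketch below, add() is a stand-in for any observable product behavior:

    # A check: an algorithmic decision rule applied to a specific observation.
    # add() stands in for whatever product behavior is being observed.
    def add(a, b):
        return a + b

    def check_addition():
        observation = add(2, 2)   # a specific observation of the product
        assert observation == 4   # the algorithmic decision rule

    check_addition()

Deciding which checks are worth writing, and what a failure actually means, remains part of testing.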

This introduction to Exploratory Testing barely scratches the surface. It does not cover how to perform deliberate practice, how to know when the testing is done, how to track coverage, or how to plan and report coverage in a way that stands up to scrutiny.

The exploratory skill set can take a moment to learn, but a lifetime to master.