What is Manual Testing

History, Principles and Future of the Quintessential Testing Technique

James Whittaker, the author of “How to Break Software,” went on to work at Microsoft, then Google, where he became a director of testing, then organized Google’s Test Automation Conference. A few months later, when asked what he was excited about in software testing, he replied, “Manual Testing.”

Manual Testing, really? What possibly could he mean?

At another conference, the Great Lakes Software Excellence Conference (GLSEC), another well-known testing expert, Michael Bolton, told me that all testing is automated to some extent. He went on to press all the number keys, followed by Control-A (select all), Control-C (copy), and Control-V (paste) in rapid succession, creating a large blob of test data more quickly than a human could by typing it all out. Michael explained to me that even the most junior of testers will employ a large number of these techniques every day - and if they don’t, it is still computer code that makes pressing the letter ‘A’ put a letter ‘A’ on the screen.

Somehow, that doesn’t seem quite right either.

At least, when people say “automation,” they probably are not thinking of copy & paste.

One of the core problems in early testing (and still today) is the idea of coverage. We want to know what elements of the system have been tested, and perhaps how thoroughly. The most obvious way to do this is to break requirements down into bullet points, and make sure there is something to cover each bullet point.

Each bullet point becomes a possible test case, an idea that appears on ten different pages of “Program Test Methods,” Bill Hetzel’s 1973 collection of conference papers that became the first book on software testing.

The test case idea is relatively simple, consisting of four parts: preconditions, actions, expected result, and actual result. Before the actual testing begins, someone creates all the test cases, then hands them down to the testers to do the actual work. Sometimes these are called “test scripts,” as in scripts to follow, like a play.
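As a rough illustration, the four elements listed above could be captured in a small record like the one below. This is only a sketch: the class name `ScriptedTestCase`, its field names, and the login example are all hypothetical and do not come from Hetzel’s book or any particular tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScriptedTestCase:
    """One scripted test case (hypothetical structure for illustration)."""
    preconditions: List[str]
    actions: List[str]
    expected_result: str
    actual_result: Optional[str] = None  # filled in by the tester at run time

    def status(self) -> str:
        """Return 'not run', 'pass', or 'fail' by comparing expected and actual."""
        if self.actual_result is None:
            return "not run"
        return "pass" if self.actual_result == self.expected_result else "fail"


# Illustrative example only; the feature and wording are made up.
login_case = ScriptedTestCase(
    preconditions=["A user account exists", "The login page is open"],
    actions=["Enter the user name and password", "Click the login button"],
    expected_result="The home page is displayed",
)
print(login_case.status())  # not run

# The tester records what actually happened after following the actions.
login_case.actual_result = "The home page is displayed"
print(login_case.status())  # pass
```

Note that the comparison only looks at the expected result, which is exactly the narrow focus the rest of this article warns about.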

When Hetzel’s work came out in 1973, most organizations were either living in chaos, or trying to follow some form of a waterfall model. Under a waterfall model, the program was built, the system was tested until bugs were uncovered and fixed, and finally the product was shipped. Testers were often end-users, people from customer support, or other roles. The scripts told them what to do. When the software needed an upgrade, the scripts could be re-run. In theory, they allowed the company to swap out anyone to do testing, lowering the hourly rate and making it easier to find testers.

In practice, this leads to boring work often done poorly.

Dr. Cem Kaner is a retired professor of Software Engineering at Florida Tech. He is also a co-author of Testing Computer Software, the best-selling testing book of all time. Dr. Kaner has pointed out problems with test cases. For example, some people claim their applications are too complex and that scripts provide navigational guidance to the new tester. Dr. Kaner would ask you to remember a time you drove to a new place with a navigator who gave you specific directions on when to turn, but not why each of those directions mattered. Now, if it was time to drive back, would you remember the route?

Script-following encourages people to follow the same steps every time. It might be consistent, but it can decrease coverage over time as features are added to the application.

Worst of all, test cases can cause the person viewing them to focus only on the expected result, missing other bugs in the software.

As a result, extremely detailed manual test cases have a reputation for being boring, no-fun, low-value, and even brittle, because when the user interface changes, the test case directions “break.”

The opposite of scripted testing is exploration, which has its own risks.

When Adam Yuret started his technology career, he was doing testing. His first assignment was to take a new piece of software and “play with it.”

At the sound of the words “play with it,” some professionals will wince. Yet play, fundamentally, is an open-ended exploration. Exploratory testing is the process of learning about software while testing it, with each new test idea derived from the results of the previous one. Like chess, it takes skill and discipline to do well, while appearing unpredictable, even confusing, to the outside observer.

The problem with exploration is the pesky problem of test coverage. How do you know when you are done? How can you be confident that you touched all the pieces of the application that matter to a customer? And how can you describe the work you did to other people?

One way around this is to create a map, or a testing dashboard. The testing dashboard allows the testers to evaluate the amount of coverage they have for any feature, and the quality, on a scale of one-to-ten.

[Image: Manual-Testing-Image-1.png]

The presentation “How to Talk About Coverage” is full of examples of how to combine exploring with visualizations to show how much of your requirements were covered by testing.
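A testing dashboard does not need special tooling. The sketch below prints one as plain text. It assumes that both coverage and quality are scored from 1 to 10 for each feature; the names `FeatureStatus` and `render_dashboard` and the sample features are hypothetical, not taken from the presentation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureStatus:
    feature: str
    coverage: int  # assumed scale: 1 (barely touched) to 10 (thoroughly explored)
    quality: int   # assumed scale: 1 (badly broken) to 10 (no known problems)


def render_dashboard(rows: List[FeatureStatus]) -> str:
    """Return a plain-text table with one line per feature."""
    lines = [f"{'Feature':<12}{'Coverage':>10}{'Quality':>10}"]
    for row in rows:
        # Reject scores outside the assumed one-to-ten scale.
        for score in (row.coverage, row.quality):
            if not 1 <= score <= 10:
                raise ValueError(f"score out of range for {row.feature}: {score}")
        lines.append(f"{row.feature:<12}{row.coverage:>10}{row.quality:>10}")
    return "\n".join(lines)


# Made-up features and scores, for illustration only.
print(render_dashboard([
    FeatureStatus("Login", coverage=8, quality=6),
    FeatureStatus("Search", coverage=3, quality=9),
    FeatureStatus("Checkout", coverage=5, quality=4),
]))
```

A table like this gives an exploratory tester a quick answer to “what have you covered?” without requiring scripted test cases.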

A third kind of manual testing is using tools to aggregate data. A tester who populates a grid, cuts and pastes the data into a spreadsheet, and applies a sum function to see if the totals add up is certainly doing manual testing. What if the data came out of a database, and the tester wrote a SELECT statement, then ran it through a diffing tool to compare with the previous release? Is that manual testing?

The key distinction here is between tools that run against software unattended, producing results, and tools that are crafted by a tester for one-time use. The first time the tester created the tool and ran it, the work was manual. They are actively working with the product, learning, and changing what they do based on that information. If the test becomes something that runs automatically, only creating an email alert in case of failure, then it is no longer manual testing.
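A one-time tool of the kind described here might look like the sketch below: run the same SELECT against the previous release and the current one, then diff the two result sets. Everything specific is an assumption made for illustration, including the `invoices` table, its columns, and the in-memory SQLite databases that stand in for the two releases.

```python
import difflib
import sqlite3


def dump_query(conn: sqlite3.Connection, query: str) -> list:
    """Run a query and return each row as one line of text."""
    return [" | ".join(str(value) for value in row) for row in conn.execute(query)]


def make_release_db(rows) -> sqlite3.Connection:
    """Build a throwaway in-memory database standing in for one release."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO invoices (id, total) VALUES (?, ?)", rows)
    return conn


# Hypothetical data: the second invoice total changed between releases.
previous = make_release_db([(1, 10.0), (2, 25.5)])
current = make_release_db([(1, 10.0), (2, 26.5)])

query = "SELECT id, total FROM invoices ORDER BY id"
diff = list(difflib.unified_diff(
    dump_query(previous, query),
    dump_query(current, query),
    fromfile="previous release",
    tofile="current release",
    lineterm="",
))
print("\n".join(diff) if diff else "No differences")
```

A tester who writes this, reads the diff, and decides what to try next is still working manually. Scheduling the same script to run unattended and send an alert on any difference is what would turn it into automation.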

Few people would call that manual testing; they are more likely to call it “technical” testing.

Most modern testing groups have at least two kinds of functional testing - testing the individual features by themselves, and testing to see if a change made since the feature was tested has caused it to regress, or go backward. In a two-week sprint, regression testing can’t be more than a day or two, which means the teams want to have both fewer regressions in the process (cleaner code, fewer mistakes) and more tooling that runs continuously, to find bugs as soon as they are created.

Feature testing is usually a combination of creating tooling to run every time there is a new build, and exploration for the purposes of finding new, unexpected bugs. There are a million questions a tester might want to ask once. But a tester needs to create a smaller list of questions about the software that need to be asked for every release; these questions are created in code and run as automated checks.

Where feature testing is mostly manual with some automation, most teams attempt to make regression testing more automated and less manual - so they can run regression tests more often, with less effort, and thereby release more often, with less effort.

The problem is finding the right balance. How do you get the right test coverage in a set amount of time without sacrificing any quality?

Despite years of hearing the contrary, manual testing is still very much alive. In fact, as long as people have a few tests they don’t want to automate, we’ll have manual testing. If some person asks the question “I wonder what happens if …” we’ll have exploration.

Here’s the challenge.

For any given sprint, the technical staff might have something like 200 person-hours to devote to testing above the unit (code) level. Recording a piece of automation might take 10 person-hours; doing it by hand might take one. The technical staff needs to decide if they will create twenty automated checks, or 200 human feature-checks.
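The trade-off is simple arithmetic, sketched below using the figures from the paragraph above (200 person-hours, 10 hours per automated check, 1 hour per manual check). The function names are hypothetical, and real costs will vary from check to check.

```python
def checks_within_budget(budget_hours: int, hours_per_check: int) -> int:
    """How many whole checks of a given cost fit in the budget."""
    if hours_per_check <= 0:
        raise ValueError("hours_per_check must be positive")
    return budget_hours // hours_per_check


def manual_checks_left(budget_hours: int, automated_checks: int,
                       automated_cost: int = 10, manual_cost: int = 1) -> int:
    """Manual checks that still fit after paying for some automated ones."""
    remaining = budget_hours - automated_checks * automated_cost
    if remaining < 0:
        raise ValueError("automated checks alone exceed the budget")
    return remaining // manual_cost


print(checks_within_budget(200, 10))  # 20 automated checks
print(checks_within_budget(200, 1))   # 200 manual checks
print(manual_checks_left(200, automated_checks=5))  # 150 manual checks remain
```

The calculation leaves out the main payoff of automation, which is that an automated check can be re-run in later sprints at little extra cost, so the numbers only describe a single sprint.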

The “right” answer is probably a combination of the two. The manual side of that might include following some directions, some exploration and some tool-work. Dr. Kaner suggests checklists, allowing the staff member to do the testing first, then refer to the checklist to see if they missed anything.

While manual testing is undeniably still an important practice, it’s hard to ignore the benefits and pressure of moving to automated testing. The industry is picking up pace and development cycles are getting aggressively shorter. To stay competitive and deliver quality software at the rate the market demands, many organizations and teams have started to adopt automation, or already have.

Moving from manual to automated testing isn’t easy, and doing it right the first time is even harder. You need clearly outlined goals, a structured roadmap, and a well-built testing framework that can support you at every step of the way, not to mention the right tools for your needs. Check out this Test Automation Starter Kit for tips, resources, and tools to help you make the transformation seamless and, most importantly, for more information on what should stay manual.

A few years ago, a Fortune 1000 firm in the Boston area outsourced a large number of test scripts to a team in a developing nation. The scripts were high-level, and the overnight test results came out all passing, even when the databases and web servers were down. The company did not really have insight into what the testers were doing, so they re-wrote the test cases with more detail, at a lower grade level, then re-wrote them again. Eventually they found that at a fifth-grade level, the results that came back made sense.

This is the worst kind of manual testing, the kind that makes luminaries say things like “You’re a manual tester? You’re a dinosaur, you’re going to die!”, a remark overheard at a conference in Sweden in 2012.

It is sad that the only testing so many people have seen is that kind. When people hear the term “Manual Testing,” that is often what they think of.

Instead of a single answer to the question, I suggest asking the person what they mean by manual testing: Is it exploration? Following directions? Building small, one-time-use tools? When does that make sense - for special corner-cases that might only need to be tested once?

Unless everybody involved understands what the objectives are, the definition of manual testing is left up for debate, and the quality of your application is at risk.