Five Ways to Think about Black Box Testing

Summary:

Have you ever seen a software testing discussion erupt into a debate over the definition of black box testing, or the difference between black box and white box testing? It seems lots of people have lots of ideas about what the terms really mean. Columnist Bret Pettichord uses the five dimensions of testing to examine black box and white box testing. And he leaves you with a few puzzles to consider.

An easy way to start up a debate in a software testing forum is to ask the difference between black box and white box testing. These terms are commonly used, yet everyone seems to have a different idea of what they mean.

Black box testing begins with a metaphor. Imagine you're testing an electronics system. It's housed in a black box with lights, switches, and dials on the outside. You must test it without opening it up, and you can't see beyond its surface. You have to see if it works just by flipping switches (inputs) and seeing what happens to the lights and dials (outputs). This is black box testing. Black box software testing is doing the same thing, but with software. The actual meaning of the metaphor, however, depends on how you define the boundary of the box and what kind of access the "blackness" is blocking.

An opposite test approach would be to open up the electronics system, see how the circuits are wired, apply probes internally and maybe even disassemble parts of it. By analogy, this is called white box testing, but already the metaphor is faulty. It is just as hard to see inside a white box as a black one. In the interests of logic, some people prefer the term "clear box" testing, but even a clear box keeps you from probing the internals of the system. It might be even more logical to call it "no box" testing. But the metaphor operates beyond logic. Since "white box" testing is the most common term, I'll use it here.

To help understand the different ways that software testing can be divided between black box and white box techniques, I'll use the Five-Fold Testing System. It lays out five dimensions that can be used for examining testing: 

  1. People (who does the testing)
  2. Coverage (what gets tested)
  3. Risks (why you are testing)
  4. Activities (how you are testing)
  5. Evaluation (how you know you've found a bug)

Let's use this system to understand and clarify the characteristics of black box and white box testing.

People: Who does the testing?: Some people know how software works (developers) and others just use it (users). Accordingly, any testing by users or other nondevelopers is sometimes called "black box" testing. Developer testing is called "white box" testing. The distinction here is based on what the person knows or can understand.

Coverage: What is tested?: If we draw the box around the system as a whole, "black box" testing becomes another name for system testing. And testing the units inside the box becomes white box testing. This is one way to think about coverage. Another is to contrast testing that aims to cover all the requirements with testing that aims to cover all the code. These are the two most commonly used coverage criteria. Both are supported by extensive literature and commercial tools. Requirements-based testing could be called "black box" because it makes sure that all the customer requirements have been verified. Code-based testing is often called "white box" because it makes sure that all the code (the statements, paths, or decisions) is exercised.
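As a rough illustration of how the two coverage criteria can disagree, here is a minimal Python sketch. The `shipping_fee` function, its two requirements, and the `executed` set (a hand-rolled stand-in for a real code-coverage tool) are all hypothetical and invented for this example.

```python
# Hypothetical requirements:
#   R1: orders under 1 kg ship for 5.
#   R2: all other orders ship for 10.

executed = set()  # records which branches ran; stands in for a coverage tool


def shipping_fee(weight_kg):
    if weight_kg < 0:
        # Defensive branch that no requirement mentions.
        executed.add("reject-negative")
        raise ValueError("weight must be non-negative")
    if weight_kg < 1:
        executed.add("light")
        return 5
    executed.add("heavy")
    return 10


# Requirements-based tests: one per requirement, so every requirement is covered.
assert shipping_fee(0.5) == 5   # R1
assert shipping_fee(2) == 10    # R2
requirements_view = set(executed)  # branches reached so far

# Code-based test: added only because the coverage record shows an unexecuted branch.
try:
    shipping_fee(-1)
except ValueError:
    pass
```

In this sketch the requirements-based tests verify both requirements yet never execute the defensive branch. Only the test added while looking at the code reaches it.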

Risks: Why are you testing?: Sometimes testing is targeted at particular risks. Boundary testing and other attack-based techniques are targeted at common coding errors. Effective security testing also requires a detailed understanding of the code and the system architecture. Thus, these techniques might be classified as "white box." Another set of risks concerns whether the software will actually provide value to users. Usability testing focuses on this risk, and could be termed "black box."

Activities: How do you test?: A common distinction is made between behavioral test design, which defines tests based on functional requirements, and structural test design, which defines tests based on the code itself. These are two design approaches. Since behavioral testing is based on external functional definition, it is often called "black box," while structural testing, based on the code internals, is called "white box." Indeed, this is probably the most commonly cited definition for black box and white box testing. Another activity-based distinction contrasts dynamic test execution with formal code inspection. In this case, the metaphor maps test execution (dynamic testing) to black box testing, and maps code inspection (static testing) to white box testing. We could also focus on the tools used. Some tool vendors refer to code-coverage tools as white box tools, and tools that facilitate applying inputs and capturing outputs (most notably GUI capture replay tools) as black box tools. Testing is then categorized based on the types of tools used.
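To make the behavioral versus structural distinction concrete, here is a small Python sketch of two test sets designed for the same function. The `parse_port` function and its one-line requirement are hypothetical, chosen only for illustration. The behavioral cases are picked from the requirement alone, while the structural cases are picked by reading the code and choosing one input per branch.

```python
def parse_port(text):
    """Hypothetical requirement: return a port in 1..65535, else raise ValueError."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("empty input")
    if not cleaned.isdigit():
        raise ValueError("not a number")
    port = int(cleaned)
    if not 1 <= port <= 65535:
        raise ValueError("out of range")
    return port


def outcome(text):
    """Run parse_port and report either the port or the string 'ValueError'."""
    try:
        return parse_port(text)
    except ValueError:
        return "ValueError"


# Behavioral design: inputs derived from the stated requirement (typical value
# plus the edges of the 1..65535 range), without looking at the code.
behavioral = {"80": 80, "1": 1, "65535": 65535, "0": "ValueError", "65536": "ValueError"}

# Structural design: one input per branch found by reading the code.
structural = {
    "": "ValueError",        # empty-input branch
    "abc": "ValueError",     # non-digit branch
    "70000": "ValueError",   # out-of-range branch
    " 80 ": 80,              # success path, including the strip() step
}

behavioral_ok = all(outcome(text) == want for text, want in behavioral.items())
structural_ok = all(outcome(text) == want for text, want in structural.items())
```

Both sets exercise the same function dynamically. They differ only in where the test ideas came from, which is the sense of "black box" and "white box" used in this definition.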

Evaluation: How do you know if you've found a bug?: There are certain kinds of software faults that don't always lead to obvious failures. They may be masked by fault tolerance or simply luck. Memory leaks and wild pointers are examples. Certain test techniques seek to make these kinds of problems more visible. Related techniques capture code history and stack information when faults occur, helping with diagnosis. Assertions are another technique for helping to make problems more visible. All of these techniques could be considered white box test techniques, since they use code instrumentation to make the internal workings of the software more visible. These contrast with black box techniques that simply look at the official outputs of a program.
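The following Python sketch illustrates the evaluation point under invented assumptions: a hypothetical `Tally` class with a deliberately planted fault that its official output does not reveal. Checking only the return value of `count` passes, while an assertion-style check on the internal state exposes the problem.

```python
class Tally:
    """Hypothetical counter that keeps a cached total alongside per-key counts."""

    def __init__(self):
        self.counts = {}
        self.total = 0  # intended to equal sum(self.counts.values())

    def add(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self.total += 1

    def remove(self, key):
        # Deliberate fault for the example: the cached total is never decremented.
        if self.counts.get(key, 0) > 0:
            self.counts[key] -= 1

    def count(self, key):
        """Official output: how many of this key are currently held."""
        return self.counts.get(key, 0)

    def check_invariant(self):
        """Instrumentation: raise if the internal state has become inconsistent."""
        if self.total != sum(self.counts.values()):
            raise AssertionError("cached total drifted from per-key counts")


tally = Tally()
tally.add("a")
tally.add("a")
tally.remove("a")

# Black box evaluation: only the official output is examined, and it looks right.
official_output_ok = tally.count("a") == 1

# White box evaluation: the internal check makes the masked fault visible.
try:
    tally.check_invariant()
    fault_detected = False
except AssertionError:
    fault_detected = True
```

The same test steps are run in both cases. What changes is how the result is judged.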

To summarize, black box testing can sometimes describe user-based testing (people); system or requirements-based testing (coverage); usability testing (risks); behavioral testing or capture replay automation (activities); or testing that looks only at a program's official outputs (evaluation). White box testing, on the other hand, can sometimes describe developer-based testing (people); unit or code-coverage testing (coverage); boundary or security testing (risks); structural testing, inspection, or code-coverage automation (activities); or testing based on probes, assertions, and logs (evaluation).

So now that we've examined some ways to think about the differences between black box and white box testing, let me leave you with a few puzzles. Let's hear what you think. 

A. A programmer tests a class to ensure that it meets its functional requirements. Is this black box or white box testing?
B. Your company develops software under a contract that stipulates that both white box and black box test techniques will be used. What tests are you obliged to execute?
C. A nonprogrammer uses a test tool that automatically instruments the code and then generates tests to ensure that a maximal number of lines of code are executed. The tests are considered to pass as long as the software doesn't crash or hang. Is this black box or white box testing?
D. What could it mean to perform "gray box" testing?

CMCrossroads is a TechWell community.
