Chapter 8: Test Driven Development

What is Test-driven development? (TDD)

- advocates the use of tests to drive the development of code - tests come into existence before any of the code being tested - resulting code is more modular and easier to read than code developed separately from tests

What is operational acceptance testing?

Additional scenarios may be tried manually to ensure you "built the thing right."

What is Bottom-up integration?

starts at the bottom of the dependency tree and works up. There is no need for stubs, as you can integrate all the pieces you need for a module. Alas, you don't get an idea how the app will look until you get all the code written and integrated.

Difference in stubbing "far away" vs stubbing "close by"

"far away": using WebMock is more realistic and appropriate for functional or integration tests. "close by": stubbing in a gem or library that communicates with the remote service is often adequate for low-level unit tests.

What is the testing strategy for methods that have side effects?

- In the Arrange phase, observe the relevant state before executing test code - in the Assert phase, observe it again and check for the side effect
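
The before/after observation described above can be sketched in plain Ruby. This is only an illustration: the `Cart` class and its `add` method are hypothetical names invented for the example, not something from the chapter.

```ruby
# Hypothetical class whose method has a side effect: it mutates @items.
class Cart
  attr_reader :items

  def initialize
    @items = []
  end

  # Side effect: appends to the cart's item list. Returns nil on purpose,
  # so the only way to test it is to inspect state.
  def add(item)
    @items << item
    nil
  end
end

# Arrange: build the object and observe the relevant state beforehand.
cart = Cart.new
count_before = cart.items.size

# Act: exercise the method under test.
cart.add("book")

# Assert: observe the state again and check for the side effect.
raise "expected exactly one new item" unless cart.items.size == count_before + 1
raise "expected the new item to be stored" unless cart.items.last == "book"
puts "side effect verified"
```

Because `add` returns nothing useful, checking the return value would prove nothing; the test has to look at state outside the method.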

What are common assertions?

- checks on values (equality, between-ness, and so on) - checks on behavior (is an exception raised or not).

What are the responsibilities of the Quality Assurance (QA) team?

- focus on improving the testing tools infrastructure - helping developers make their code more testable - verifying that customer-reported bugs are reproducible

What are benefits of approaching software from a test-centric perspective?

- improves the software's readability and maintainability. - testable code tends to be clear code, and vice versa

What is compatibility testing?

- less prominent in SaaS since the app developers control the server environment, but may still be important for testing the app's UI in different browsers. - some tools run SaaS integration tests on a variety of browsers and operating systems to check correct client behavior, and even capture a screencast of each run so you can visually check behaviors such as whether the same fonts look good in different browsers.

Where should we stub external methods when testing using an external service?

- stub "closer to the remote service." - we could create fixtures and arrange to intercept calls to the remote service and return the contents of those fixture files instead.
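
A minimal sketch of intercepting calls to a remote service and returning fixture contents, written in plain Ruby without WebMock. Everything named here is an assumption for illustration: `MovieSearch`, `FixtureClient`, the `/search?query=` path, and the JSON shape are hypothetical and do not describe any real service's API.

```ruby
require "json"

# Stands in for a fixture file holding JSON captured from the real service.
FIXTURE_JSON = '{"results":[{"title":"Inception"},{"title":"Interstellar"}]}'

# Hypothetical SUT: asks an HTTP client for search results and extracts titles.
class MovieSearch
  def initialize(http_client)
    @http = http_client
  end

  def titles(query)
    body = @http.get("/search?query=#{query}")
    JSON.parse(body).fetch("results").map { |movie| movie["title"] }
  end
end

# Double for the HTTP layer: intercepts the call and returns the fixture contents
# instead of touching the network. It also records what was requested.
class FixtureClient
  attr_reader :requested_paths

  def initialize(body)
    @body = body
    @requested_paths = []
  end

  def get(path)
    @requested_paths << path
    @body
  end
end

client = FixtureClient.new(FIXTURE_JSON)
found = MovieSearch.new(client).titles("nolan")
raise "unexpected titles" unless found == ["Inception", "Interstellar"]
```

Stubbing at the HTTP layer keeps the JSON parsing inside the test, which is the point of stubbing closer to the remote service.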

When are formal methods typically used?

- when the cost to repair errors is very high, the features are very hard to test, and the item being verified is not too large. example: vital parts of hardware like network protocols or safety critical software systems like medical equipment

From the Plan-and-Document perspective, what are the three options for integrating the units and performing integration tests?

1. Top-down integration 2. Bottom-up integration 3. Sandwich integration

What are two characteristics of a method that can complicate the task of creating unit tests for it?

1. it has side effects 2. it has dependencies: it calls other methods as part of doing its job.

Which two cases do factories shine?

1. when the object to be created has many attributes that must be initialized at creation time, even though any particular test case may only care about the specific values of a few of them. Example: you can ask the factory to create an object in which certain attribute values are specified but others are filled in with valid defaults. 2. when objects you need to create have has-many or belongs-to relationships with other objects.

What is a test suite?

A collection of test cases

The Microsoft Zune music player had an infamous bug that caused all Zunes to "lock up" on December 31, 2008. Later analysis showed that the bug would be triggered on the last day of any leap year. What kinds of tests—black-box vs. glass-box (see Section 6.8), mutation, or fuzz—would have been likely to catch this bug?

A glass-box test for the special code paths used for leap years would have been effective. Fuzz testing might have been effective: since the bug occurs roughly once in every 1460 days, a few thousand fuzz tests would likely have found it.

expect(...).to receive combines _____ and _____, whereas a stub is only _____. - A seam and an expectation, an expectation - A mock and an expectation, a mock - A mock and an expectation, an expectation - A seam and an expectation, a seam

A seam and an expectation, a seam Explanation: Recall that seams help you isolate the behavior of an application and change it without having to change the code, while an expectation is similar to the idea of an assertion that indicates what about the nature of an application should be true.

What is an integration test?

Any test that covers more than one method but is not a full-stack test example: RSpec test of a controller action would probably stub out calls to the database and bypass the routing mechanism, neither of which is central to testing the controller action itself, but would probably include interactions with mechanisms such as parsing form input, which clearly are outside the controller action.

What is the structure for every test case (leaf or not)?

Arrange, Act, Assert
Arrange: create any necessary preconditions for the test case, such as setting values of variables that affect the behavior of the SUT.
Act: exercise the SUT.
Assert: verify that the result or behavior matches what was expected. This makes each test Self-checking, eliminating the need for a human programmer to inspect test results.
Similar to Cucumber's Given, When, Then.
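
The three phases can be illustrated with a plain Ruby sketch. The `discounted_price` function and both test methods are hypothetical, made up only to show where each phase sits; a real suite would use a framework such as RSpec rather than hand-written checks.

```ruby
# Hypothetical SUT: price after a percentage discount, rounded to cents.
def discounted_price(price, percent)
  raise ArgumentError, "percent must be between 0 and 100" unless (0..100).cover?(percent)
  (price * (100 - percent) / 100.0).round(2)
end

# Each test returns true or false on its own (Self-checking),
# so nobody has to read printed output to decide whether it passed.
def test_applies_discount
  price = 80.0                                # Arrange
  percent = 25
  result = discounted_price(price, percent)   # Act
  result == 60.0                              # Assert
end

def test_rejects_invalid_percent
  discounted_price(80.0, 150)   # Act (nothing special to arrange here)
  false                         # Assert: reaching this line means no error was raised
rescue ArgumentError
  true                          # Assert: the expected exception occurred
end

raise "discount test failed" unless test_applies_discount
raise "validation test failed" unless test_rejects_invalid_percent
```

The second test checks behavior (an exception is raised) rather than a value, which is the other common kind of assertion.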

What is the testing strategy for a pure leaf function ? Pure leaf: no side effects, no collaborator methods or classes, same inputs yield same outputs

Assert correct output results for critical values and for arbitrary values in noncritical regions

Which kinds of code can be tested Repeatably and Independently? (i) Code that relies on randomness (e.g. shuffling a deck of cards); (ii) Code that relies on time of day (e.g. run backups every Sunday at midnight).

Both Explanation: With randomness, we can test repeatably by using a random number seed that fixes the order of random numbers from a generator. For the time of day, we can stub the method that reads the clock so that it returns a prearranged time, which lets the time-dependent code run under known conditions.
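
For the randomness case, a small Ruby sketch shows how a fixed seed makes a shuffle repeatable. The `shuffle_deck` helper is a hypothetical name; `Array#shuffle` with the `random:` option and `Random.new(seed)` are standard Ruby.

```ruby
# Hypothetical SUT: shuffles a deck using whatever random number generator it is given.
def shuffle_deck(deck, rng)
  deck.shuffle(random: rng)
end

deck = (1..52).to_a

# Two generators created with the same seed produce the same sequence,
# so the "random" shuffle becomes repeatable in a test.
first  = shuffle_deck(deck, Random.new(42))
second = shuffle_deck(deck, Random.new(42))

raise "same seed should give the same order" unless first == second
raise "a shuffle must keep the same cards" unless first.sort == deck
```

Passing the generator in as an argument is itself a seam: production code can pass an unseeded generator while tests pass a seeded one.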

What is the difference between C0 code coverage and code-to-test ratio?

C0 coverage is a dynamic measurement of what fraction of all statements are executed by a test suite. Code-to-test ratio is a static measurement comparing the total number of lines of code to the total number of lines of tests.

Why does high test coverage not necessarily imply a well-tested application?

Coverage says nothing about the quality of the tests. However, low coverage certainly implies a poorly-tested application.

What is the difference between Cucumber and RSpec?

Cucumber describes behavior via features and scenarios. RSpec tests individual modules that contribute to those behaviors (test-driven development).

What are depended-on components and how do test cases handle them?

DOCs: the other methods the SUT calls to help do its work. Handle: Test cases should isolate the SUT from those dependencies.

What is a happy path?

Defined by what works correctly and how we assume the user to behave

Which statement regarding testing is FALSE? - P&D developers code before they write tests, while it's vice versa for Agile developers - P&D sandwich integration aims to reduce wasted work making stubs while trying to get general functionality early - Agile developers perform module, integration, system, and acceptance tests. P&D developers don't - Formal methods are expensive but worthwhile to verify important applications

Formal methods are expensive but worthwhile to verify important applications Recall that formal methods use mathematical proofs to verify whether applications are performing correctly. These methods are indeed expensive, and while helpful, they are not worthwhile for verifying application correctness because of - 1. How time and labor intensive developing formal methods are and 2. They must be updated as the application changes, which is not feasible in an Agile environment.

Is "stubbing the Internet" in conflict with the advice of Chapter 7 that one should avoid mocks or stubs in full-system Cucumber scenarios?

Full-system testing should avoid "faking" certain parts of it as we have done using seams in most of this chapter. However, if the "full system" includes interacting with outside services we don't control, such as the interaction with TMDb in this example, we do need a way to "fake" their behavior for testing.

What is code coverage?

Has everything been used at least once?
S0 (method coverage): Is every method executed at least once by the test suite?
S1 (call coverage): Has each method been called from every place it could be called from?
C0 (statement coverage): Is every statement of the source code executed at least once by the test suite, counting both branches of a conditional as a single statement?
C1 (branch coverage): Has each branch been taken in each direction at least once (for example, both outcomes of an if statement)?
C2 (path coverage): Has every possible route through the code been executed?
Modified Condition/Decision Coverage (MCDC): combines a subset of the above levels.
- want high C0
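
The gap between C0 and C1 is easiest to see on a tiny example. The sketch below uses a hypothetical `shipping_cost` function; the coverage figures in the comments follow from reading the code, not from running a coverage tool.

```ruby
# Hypothetical SUT with one conditional and no else branch.
def shipping_cost(total)
  cost = 5
  if total >= 50
    cost = 0
  end
  cost
end

# This single case executes every statement above, so statement coverage (C0) is complete...
raise "orders of 50 or more should ship free" unless shipping_cost(60) == 0

# ...but the condition has only ever been true. Branch coverage (C1) also needs
# a case where the condition is false and the body of the if is skipped.
raise "small orders should pay for shipping" unless shipping_cost(10) == 5
```

If the first line of the function were accidentally deleted, only the second test case would notice, which is why full C0 alone can be misleading.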

Suppose a test suite contains a test that adds a model object to a table and then expects to find a certain number of model objects in the table as a result. Explain how the use of fixtures may affect the Independence of the tests in this suite, and how the use of Factories can remedy this problem.

If the fixtures file is ever changed so that the number of items initially populating that table changes, this test may suddenly start failing because its assumptions about the initial state of the table no longer hold. In contrast, a factory can be used to quickly create only those objects needed for each test or example group on demand, so no test needs to depend on any global "initial state" of the database.

What is the testing strategy for methods that rely on results from depended-on components (DOCs)?

In the Arrange phase, create doubles that "force" the desired behavior by returning prearranged values, raising an exception, and so on.
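
A hand-rolled sketch of doubles that force each behavior of a depended-on component. All names (`Greeter`, `DirectoryReturning`, `DirectoryRaising`, `name_for`) are hypothetical; with RSpec one would normally get the same effect from `allow(...).to receive(...)` with `and_return` or `and_raise`.

```ruby
# Hypothetical SUT: formats a greeting using a collaborator that looks up a user's name.
class Greeter
  def initialize(directory)
    @directory = directory   # the depended-on component (DOC)
  end

  def greet(user_id)
    name = @directory.name_for(user_id)
    "Hello, #{name}!"
  rescue KeyError
    "Hello, guest!"
  end
end

# Double that forces the "lookup succeeded" path by returning a prearranged value.
class DirectoryReturning
  def initialize(name)
    @name = name
  end

  def name_for(_user_id)
    @name
  end
end

# Double that forces the "lookup failed" path by raising an exception.
class DirectoryRaising
  def name_for(_user_id)
    raise KeyError, "no such user"
  end
end

# Arrange with a double, Act on the SUT, Assert on the result.
raise "known user" unless Greeter.new(DirectoryReturning.new("Ada")).greet(1) == "Hello, Ada!"
raise "unknown user" unless Greeter.new(DirectoryRaising.new).greet(1) == "Hello, guest!"
```

Neither test needs a real directory, so both code paths in `greet` can be exercised on demand.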

What is the testing strategy for Nondeterministic or time-dependent behavior?

In code under test, isolate the nondeterminism in a method call that can be stubbed using a double in the Arrange phase
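
One way to isolate time dependence, sketched in plain Ruby: the code under test reads the clock through a single replaceable collaborator. `backup_due?` and `FixedClock` are hypothetical names; the default argument `Time` relies only on the standard `Time.now`.

```ruby
# Hypothetical SUT: should the weekly backup run? (Sundays, during the midnight hour.)
# The only nondeterministic step, reading the clock, is isolated in the `clock` argument.
def backup_due?(clock = Time)
  now = clock.now
  now.sunday? && now.hour == 0
end

# Hand-rolled double: responds to `now` with a prearranged time.
class FixedClock
  def initialize(time)
    @time = time
  end

  def now
    @time
  end
end

sunday_midnight = Time.new(2024, 1, 7, 0, 30, 0)   # 7 January 2024 was a Sunday
monday_noon     = Time.new(2024, 1, 8, 12, 0, 0)

raise "backup should be due" unless backup_due?(FixedClock.new(sunday_midnight))
raise "backup should not be due" if backup_due?(FixedClock.new(monday_noon))
```

In production the method is called with no argument and uses the real clock; in tests the double makes the outcome independent of when the suite runs.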

What are the five principles for creating good tests?

Fast, Independent, Repeatable, Self-checking, and Timely (FIRST)

Which part of FIRST do fixtures potentially interfere with?

Independent. - as every test now depends implicitly on the fixture state, so changing the fixtures might change the behavior of tests.

Which is FALSE about expect(...).to receive? - It provides a stand-in for a real method that doesn't exist yet - It can be issued either before or after the code that should make the call - It would override the real method, even if it did exist - It exploits Ruby's open classes and metaprogramming to "intercept" a method call at testing time

It can be issued either before or after the code that should make the call Explanation: The expect(...).to receive clause must be issued before the code that makes the call, because it sets up the seam that intercepts the call; the expectation is then checked when the example finishes.

Which of the following kinds of data, if any, should not be set up as fixtures? - Fixtures would be fine for all of these - The application's time zone - Movies and their ratings - The TMDb API key

Movies and their ratings Explanation: Fixtures are best reserved for truly fixed data that must be present for the app to work but is not expected to change while the app is running, such as configuration data. The application's time zone and the TMDb API key fit that description. Movies and their ratings are the kind of data that normally changes while the app is running, so they are better created on demand with factories.

What is Independent in FIRST?

No test should rely on preconditions created by other tests, so that we can prioritize running only a subset of tests that cover recent code changes.

What are the project managers' and developers' roles in the Plan-and-Document perspective on testing?

PM: takes the Software Requirements Specification from the requirements planning phase and divides it into the individual program units Dev: write the code for each unit, and then perform unit tests to make sure they work. In many organizations, quality assurance staff performs the rest of the higher-level tests, such as module, integration, system, and acceptance tests.

Structure of test cases

Page 266 in textbook

What do you do when the SUT has side effects when executed?

Potential side effect: causes a change in application state visible outside the test code itself. Handle: Test cases should verify that the correct side effect occurred, which involves inspecting app state outside the SUT.

What is Red-Green-Refactor?

Pre-step: Think about one thing the code should do, and write a test for it.
Red step: Run the test, and verify that it fails because you haven't yet implemented the code necessary to make it pass (that is, the code you wish you had, also known as the subject code).
Green step: Write the simplest possible code that causes this test to pass without breaking any existing tests.
Refactor step: Look for opportunities to refactor either your code or your tests, changing the code's structure to eliminate redundancy or repetition that may have arisen as a result of adding the new code. The tests ensure that your refactoring doesn't introduce bugs.

What is non-functional testing?

Performance, stress, security testing - ensure the software meets these operational criteria, which are particularly important for SaaS

What is the next step after integration tests?

QA does a system test, as the full app should work. This is the last step before showing it to customers for them to try out. Note that system tests cover both non-functional requirements, such as performance, and functional requirements of features found in the SRS.

What is Regression testing?

Regression testing involves repeating previously run tests to ensure that known failures of prior versions do not appear in new versions of the software. - ensures that previously-fixed bugs do not reappear.

Which non-obvious statement about testing is FALSE? - Testing eliminates the need to use a debugger - Even 100% test coverage is not a guarantee of being bug-free - If you can stimulate a bug-causing condition in a debugger, you can capture it in a test - When you change your code, you need to change your tests as well

Testing eliminates the need to use a debugger Explanation: Recall that even 100% test coverage doesn't mean code is bug free, whether it is with regards to the technical implementation or overall system correctness according to the customer behavior. Therefore, using a debugger is still needed to trace errant behavior that may not be covered by an existing test, or perhaps cannot be written as a test (non-deterministic errors).

What is a unit test?

The finest-grained test cases, for which the SUT is a single method.

What is the goal of a single test case?

The goal of a single test case for some SUT is to check that some specific behavior happens (for example, the return value from a function matches an expected result) or doesn't happen (for example, passing an empty string to a string comparison function doesn't result in an error or exception)

Name two likely violations of FIRST that arise when unit tests actually call an external service as part of testing.

The test may no longer be Fast, since it takes much longer to call an external service than to compute locally. The test may no longer be Repeatable, since circumstances beyond our control could affect its outcome, such as the temporary unavailability of the external service.

Why are integration tests insufficient?

Their resolution is poor: if an integration test fails, it is harder to pinpoint the cause since the test touches many parts of the code.

Compare and contrast integration strategies including top-down, bottom-up, and sandwich integration.

Top-down needs stubs to perform the tests, but it lets stakeholders get a feeling for how the app works. Bottom-up does not need stubs, but needs potentially everything written before stakeholders see it work. Sandwich integration works from both ends to try to get both benefits.

Which of these is POOR advice for TDD? - Mock and stub early and often in unit tests - Sometimes it's OK to use stubs and mocks in integration tests - Unit tests give you higher confidence of system correctness than integration tests - Aim for high unit test coverage

Unit tests give you higher confidence of system correctness than integration tests Explanation: Aiming for more unit tests and higher test coverage is good advice, but it doesn't necessarily translate to more system correctness. Recall that unit tests target functionality at very technical levels (does this method work as intended). Integration tests are much more comprehensive and test several software modules together as a group. Therefore, they reflect system correctness more accurately.

Main difference between BDD/TDD and plan-and-document processes?

Unlike BDD/TDD, the plan-and-document process starts with writing code before you write the tests

How do you keep tests Fast and Independent from the behavior of other classes?

Use mock objects and stubs—"stunt doubles" that stand in for real objects in tests, but whose behavior you can closely control. Both of these are referred to as test doubles and are examples of seams

How does TDD handle legacy code?

When TDD is used to extend or modify legacy code, new tests may be created for code that already exists

Are agile developers expected to write their own tests?

Yes!

What is mutation testing?

a test-automation technique in which small but syntactically legal changes are automatically made to the program's source code, such as replacing a+b with a-b or replacing if (c) with if (!c) - most changes should cause at least one test to fail; if a change does not, that indicates a lack of test coverage
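
The idea can be shown by hand on the a+b versus a-b example. This sketch does not automate the mutation step the way a real mutation-testing tool would; the mutant is written manually, and `WEAK_SUITE`, `STRONG_SUITE`, `passes?` and `kills?` are hypothetical names.

```ruby
# The original method and a hand-made "mutant" in which + has been replaced with -.
ORIGINAL = ->(a, b) { a + b }
MUTANT   = ->(a, b) { a - b }

# Two candidate test suites, each a list of [inputs, expected_output] pairs.
WEAK_SUITE   = [[[0, 0], 0], [[5, 0], 5]]
STRONG_SUITE = WEAK_SUITE + [[[2, 3], 5]]

# A suite passes against an implementation if every case gives the expected output.
def passes?(impl, suite)
  suite.all? { |args, expected| impl.call(*args) == expected }
end

# A suite "kills" a mutant if at least one of its cases fails against the mutant.
def kills?(mutant, suite)
  !passes?(mutant, suite)
end

raise "both suites should pass on the original" unless passes?(ORIGINAL, WEAK_SUITE) && passes?(ORIGINAL, STRONG_SUITE)
raise "the weak suite should let the mutant survive" if kills?(MUTANT, WEAK_SUITE)
raise "the strong suite should kill the mutant" unless kills?(MUTANT, STRONG_SUITE)
```

The weak suite only ever adds zero, where addition and subtraction agree, so the surviving mutant exposes the missing test case.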

From the plan-and-document perspective, how do you decide when testing is complete?

an organization will enforce a standard level of testing coverage before a product is ready for the customer. Example: statement coverage (all statements executed at least once), or all user input opportunities are tested with both good input and problematic input.

What is a basic block and how are they used?

A basic block is a sequence of code that executes from beginning to end with no possibility of branching. Basic blocks are joined into a graph in which conditionals in the code result in graph nodes with multiple out-edges.

Why do we think of Cucumber as an acceptance test?

because a properly-written scenario reflects and verifies the behavior the user said they wanted

Why do we think of Cucumber as a system test?

because it exercises code in many different parts of the application in the same ways a user would

The expect assertion needs to be stated _____ the action is taken?

before!

What does module or Functional testing test?

behavior across methods/classes - ex) controller flow from GET/POST all the way to template rendering - more focused than a full scenario

What is a factory?

a framework of bits of code (or declarative descriptions of objects) designed to allow quick creation of full-featured objects (rather than mocks) at testing time. The goal of a factory is to quickly create valid instances of a class using some default attributes that you can selectively override for testing. - create only what you need per test with default attributes
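
The "valid defaults that you can selectively override" idea can be sketched without any gem. This is not FactoryBot's API: `Movie`, `MovieFactory`, and the default values are hypothetical and exist only to show the pattern.

```ruby
# Hypothetical model with several attributes that must all be set at creation time.
Movie = Struct.new(:title, :rating, :release_year, keyword_init: true)

# Minimal hand-rolled factory: valid defaults, selectively overridden per test.
module MovieFactory
  DEFAULTS = { title: "A Fake Title", rating: "PG", release_year: 2000 }.freeze

  def self.build(overrides = {})
    Movie.new(**DEFAULTS.merge(overrides))
  end
end

# A test that only cares about the rating specifies just that attribute.
movie = MovieFactory.build(rating: "R")
raise "override should win" unless movie.rating == "R"
raise "other attributes should fall back to defaults" unless movie.title == "A Fake Title"
```

Each call builds a fresh object, so no test depends on objects that some other test or a shared fixture file happened to create.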

What is black box fuzzing vs smart fuzzing vs white-box fuzzing?

black box fuzzing: generates completely random data or randomly mutates valid input data, such as changing certain bytes of metadata in a JPEG image to test the robustness of the image decoder.
smart fuzzing: incorporates knowledge about the app's structure and possibly a way to specify how to construct "realistic but fake" fuzz data.
white box fuzzing: uses symbolic execution, which simulates execution of a program observing the conditions under which each branch is taken or not, then generates fuzzed inputs to exercise the branch paths not taken during the simulated execution. It requires NO explicit knowledge of the app's structure and can provide C2 (all paths) coverage.

What is black box testing vs white box testing?

black box: design is based solely on the software's external specifications white box: design reflects knowledge about the software's implementation that is not implied by external specifications

What is a smoke test?

consists of a minimal attempt to operate the software, to see whether anything is obviously wrong before running the rest of the test suite. example: if a low-level coding error prevents a SaaS app from displaying its home page or accepting logins, there is no point in running further tests.

Why are system tests insufficient?

Coverage tends to be poor because even though a single scenario touches many classes, it executes only a few code paths in each class.

In the plan and document process, what is the final test? How does this compare to Agile?

customers try the product in their environment to decide whether they will accept the product or not. The aim is validation, not just verification. In Agile development, the customer is involved in trying prototypes of the app early in the process, so there is no separate system test before running the acceptance tests.

What does the ruby gem FactoryBot do?

define a factory for any kind of model in your app and create just the objects you need quickly for each test, selectively overriding only certain attributes - everything it creates is incinerated after the test, even when it uses (Table.create())

What is Self-Checking in FIRST?

each test should be able to determine on its own whether it passed or failed, rather than relying on humans to check its output.

What is Accessibility testing?

ensures that the software is usable by persons with disabilities. In SaaS, accessibility testing focuses primarily on the client-side user experience.

Which of these, if any, is NOT a valid expectation? - All of these are valid expectations - expect(5).to be <=> result - expect(result).not_to be_empty - expect(result).to match /^D'oh!$/

expect(5).to be <=> result Explanation: An expectation is like an assertion, in that it checks whether something is strictly true or false. "expect(5).to be <=> result" is a comparison expression that evaluates to -1, 0, or 1, instead of true or false.

What is the difference between expect and allow?

expect: this test should fail if it doesn't happen allow: this may or may not happen, and whether it happens is not part of the test, but if it happens here's what to do

What's a fixture?

When stubbing a remote service: files containing the JSON content returned by actual calls to the service. For the test database: a fixture file defines a set of objects that is automatically loaded into the test database before tests are run, so you can use those objects in your tests without first setting them up. - fixture: a set of objects whose existence is guaranteed and fixed, and can be assumed by all test cases

What are coverage reports used for?

identify under-tested parts of your app so you can enhance the test suite accordingly.

What is define-use coverage?

if we consider every place that x is assigned a value and every place that the value of x is used, DU-coverage asks what fraction of all pairs of define and use sites are exercised by a test suite. - typically weaker

What is Fast in FIRST?

it should be easy and quick to run the subset of test cases relevant to your current coding task, to avoid interfering with your train of thought. We will use a Ruby tool called Guard to help with this.

What does the WebMock Gem do?

it stubs out the entire Web except for particular URIs that return a canned response when accessed from a Ruby program. - similar to "allow(...).to receive(...).and_return" for the whole Web -less compelling for unit testing

What do you do when the SUT is not a pure function?

not a pure function: because its output depends not only on its input but other implicit factors, such as the time of day or a random event. Handle: Test cases should control the values of these factors to force the SUT to traverse predictable code paths.

What is a pure function?

one that has no side effects and whose return value is always the same for the same arguments.

What data should you use fixtures for?

primarily for truly fixed data that, in production, would not be expected to change while the app is running but need to be present in order for it to work. - ex) configuration data

What are formal methods?

rely on formal specifications and automated proofs or exhaustive state search to verify more than what testing can do, but they are so expensive to perform that today they are only applicable to small, stable, critical portions of hardware or software.

What is a "seam"?

seam: a place where you can alter behavior in your program without editing in that place, that is, a place where you can change the app's behavior without changing the source code. It allows isolating the behavior of some code from that of other code it depends on.
example: calling a fake/dummy method instead of the real one (also known as a method stub)
example: expect(...).to receive (RSpec)
- RSpec resets all mocks and stubs after each example (keeps tests Independent)
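
A rough sketch of how Ruby's open classes make such a seam possible, and of why restoring the original method afterward matters for Independence. This is not how RSpec is implemented; `PaymentGateway`, `checkout`, and `with_stub` are hypothetical names, while `define_singleton_method` and `method` are standard Ruby.

```ruby
# Hypothetical collaborator whose real method we do not want to run in a test.
class PaymentGateway
  def self.charge(_amount)
    raise "would contact a real payment service"
  end
end

# Hypothetical SUT. Note that its source is never edited to enable testing.
def checkout(amount)
  PaymentGateway.charge(amount) ? "paid" : "declined"
end

# Minimal stand-in for what a stubbing library does: replace a method for the
# duration of a block, then restore the original so later tests are unaffected.
def with_stub(target, method_name, return_value)
  original = target.method(method_name)
  target.define_singleton_method(method_name) { |*_args| return_value }
  yield
ensure
  if original
    target.define_singleton_method(method_name) { |*args, &blk| original.call(*args, &blk) }
  end
end

raise "stubbed success" unless with_stub(PaymentGateway, :charge, true) { checkout(10) } == "paid"
raise "stubbed failure" unless with_stub(PaymentGateway, :charge, false) { checkout(10) } == "declined"
```

The behavior of `checkout` changes at test time even though neither `checkout` nor `PaymentGateway` was modified in the source, which is what makes this a seam.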

What is a test double?

set up in the Arrange phase of a test case, can isolate the SUT from its collaborators by controlling the return values from collaborator methods or providing a "stunt double" object with predetermined behaviors. - appropriate when you need a stand-in with a small amount of functionality to isolate the code under test from its dependencies

In SaaS, system test and acceptance test are often called full-stack tests, why?

since a typical scenario exercises every part of the app from the browser-based UI to the database. Unlike unit tests, system tests rarely rely on test doubles to isolate behavior; on the contrary, the goal is to simulate real users as closely as possible.

What is Top-down integration?

starts with the top of the tree structure showing the dependencies among all the units. The advantage of top-down is that you quickly get some of the high level functions working, such as the user interface, which allows stakeholders to offer feedback for the app in time to make changes. The downside is that you have to create many stubs to get the app to limp along in this nascent form.

What is Repeatable in FIRST?

test behavior should not depend on external factors such as today's date or on "magic constants" that will break the tests if their values change, as occurred with many 1960s programs when the year 2000 arrived.

How do you ensure your tests are independent when using factories and fixtures?

test teardown: - restoring the state of the world to look "pristine" before the next test case runs - the database is completely erased, and any fixtures are then reloaded

What is Timely in FIRST?

tests should be created or updated at the same time as the code being tested. As we'll see, with test-driven development the tests are written immediately before the code.

What is a leaf method? (subclass of unit test)

A method that does not call any other methods to help do its job. Since even a leaf method may have multiple testable behaviors, a single method may be the subject of multiple test cases. - a pure leaf function is deterministic, has no side effects, and calls no collaborators or helper methods. - worth structuring your code to expose as much functionality as possible in pure leaf functions.

What is the code-to-test ratio?

the number of non-comment lines of code divided by the number of lines of tests of all types - in production systems, this ratio is usually less than 1, meaning more lines of test than lines of code

What is the "System under Test" (SUT)?

the object being tested, whether that "object" is a single method, a group of methods, or even the entire application. That is, SUT is defined from the point of view of the test.

What is fuzz testing?

throwing random data at your application and seeing what breaks. Fuzz testing has been particularly useful for finding security vulnerabilities that are missed by both manual code inspection and formal analysis, including stack and buffer overflows and unchecked null pointers.
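
A toy black-box fuzzer in plain Ruby illustrates the idea. The `parse_pair` function and its bug are invented for the example, and `fuzz` is a hypothetical helper; real fuzzing tools are far more sophisticated.

```ruby
# Hypothetical SUT with a deliberate bug: it assumes the input always contains "=".
def parse_pair(text)
  key, value = text.split("=", 2)
  [key.strip, value.strip]   # raises NoMethodError when key or value is nil
end

# Black-box fuzzing: throw random strings at the SUT and record the inputs that crash it.
def fuzz(iterations, seed)
  rng = Random.new(seed)              # a fixed seed keeps the fuzz run Repeatable
  alphabet = ["a", "b", "=", " "]
  crashes = []
  iterations.times do
    input = Array.new(rng.rand(0..4)) { alphabet[rng.rand(alphabet.size)] }.join
    begin
      parse_pair(input)
    rescue StandardError => e
      crashes << [input, e.class]
    end
  end
  crashes
end

crashes = fuzz(200, 1)
puts "#{crashes.size} crashing inputs found, for example #{crashes.first.inspect}" unless crashes.empty?
```

Each crashing input can then be turned into a regular regression test once the bug is fixed.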

What is Sandwich Integration?

tries to get the best of both worlds by integrating from both ends simultaneously. - try to reduce the number of stubs by selectively integrating some units bottom-up and try to get the user interface operational sooner by selectively integrating some units top-down.

When do you use factories compared to when you use fixtures?

use factories: for kinds of data that normally change while the app is running consider fixtures: for data that doesn't change but must be present for the app to work at all.

What do you do to create Fast and Repeatable test cases for code that communicates with an external service?

use stubs to mimic the service's behavior. context blocks can group specs that test different behaviors of the remote service, using before blocks to set up necessary stubs or other preconditions to simulate each behavior.

What is user acceptance testing and operational acceptance testing?

user: observes actual users (or QA engineers acting as "typical" users) using the product to determine whether you "built the right thing."
operational: may manually try additional scenarios to ensure you "built the thing right."
both: uncover bugs that were previously undetected, some of which can then have automated tests created for them.

What are the two aspects of software assurance for the Agile lifecycle?

validation ("Did you build the right thing?") and verification ("Did you build the thing right?")

Why are unit tests insufficient?

while unit tests run quickly and can isolate the subject code with great precision (improving both coverage resolution and error localization), because they rely on fake objects to isolate the subject code, they may mask problems that would only arise in integration tests.

