Testing, debugging and verification
Divide and conquer
1. Cut away one half of the test input. 2. Check whether one of the halves still exhibits the failure. 3. Continue until a minimal failing input is obtained. (Same principle as binary search.)
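A minimal sketch of the halving steps above, assuming the input is a list and that a hypothetical predicate `fails` reports whether the program under test still fails on a given input. The names `minimize_by_halving` and `contains_seven` are illustrative only. The loop stops early if neither half fails on its own, so the result is only guaranteed to be minimal when the failure survives in a single half.

```python
def minimize_by_halving(failing_input, fails):
    """Repeatedly keep a half that still fails, until no half fails alone."""
    current = list(failing_input)
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if fails(first):
            current = first
        elif fails(second):
            current = second
        else:
            break  # the failure needs elements from both halves; stop here
    return current


# Hypothetical failure: the program under test fails whenever the input contains 7.
def contains_seven(xs):
    return 7 in xs


result = minimize_by_halving([3, 1, 7, 4, 9, 2], contains_seven)
```

With this hypothetical failure, the input shrinks from six elements to the single element 7.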
Statement coverage
All nodes in a control flow graph are visited.
Regression testing
(Automatically) run all tests again after changing the code, to make sure the change has not introduced new failures.
How do we pick tests?
1. Look at the specification. 2. Divide the input space into regions for which the program acts "similarly". 3. Take some inputs from each region, especially at the borders. Note: this is a guideline, not a formal procedure.
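As an illustration of the guideline above, consider a hypothetical function `grade` whose (assumed) specification says: scores 0 to 49 give "fail", scores 50 to 100 give "pass", and anything else is rejected. The regions are then "below 0", "0 to 49", "50 to 100" and "above 100", and the test inputs are taken from the middle and the borders of each region.

```python
def grade(score):
    """Hypothetical unit under test: 'fail' for 0..49, 'pass' for 50..100."""
    if score < 0 or score > 100:
        raise ValueError("score out of range")
    return "pass" if score >= 50 else "fail"


# One interior value and the border values from each valid region.
valid_cases = [(0, "fail"), (25, "fail"), (49, "fail"),
               (50, "pass"), (75, "pass"), (100, "pass")]
for score, expected in valid_cases:
    assert grade(score) == expected

# Border values just outside the valid range must be rejected.
for bad_score in (-1, 101):
    try:
        grade(bad_score)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for %d" % bad_score)
```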
Debugging steps
1. Verify the bug and determine the correct behavior. 2. Isolate and minimize (shrink). 3. Eyeball the code: where could the bug be? (Reason backwards.) 4. Devise and run an experiment to test your hypothesis. 5. Repeat steps 3 and 4 until you understand what is wrong. 6. Fix the bug and verify the fix. 7. Create a regression test.
Test driven development
1. Write tests before the implementation. 2. Run them and make sure they fail. 3. Implement. 4. Run the tests again.
Predicate
A .... is a function giving a boolean, a .... method is a function method giving a boolean
Propositional logic
A formula F is satisfiable if at least one assignment of truth values makes F true, and valid if F is true under every assignment.
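For small formulas the two notions can be checked by brute force over the truth table. The sketch below assumes a formula is represented as a Python function of boolean arguments. The helper names are illustrative and not taken from any library.

```python
from itertools import product


def truth_values(formula, num_vars):
    """Evaluate the formula under every assignment of its variables."""
    return [formula(*assignment)
            for assignment in product([False, True], repeat=num_vars)]


def is_satisfiable(formula, num_vars):
    return any(truth_values(formula, num_vars))


def is_valid(formula, num_vars):
    return all(truth_values(formula, num_vars))


def excluded_middle(p):
    return p or not p      # true under every assignment


def contradiction(p):
    return p and not p     # true under no assignment


def conjunction(p, q):
    return p and q         # true under exactly one assignment
```

Here `excluded_middle` is valid (and therefore satisfiable), `conjunction` is satisfiable but not valid, and `contradiction` is neither.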
Loop invariant
A property of a program that is true after any number of iterations (including 0)
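A small sketch of this idea, using run-time assertions in Python rather than a verifier. The function `sum_up_to` is a made-up example. Its invariant says that `total` always equals the sum of the integers below `i`, which holds before the first iteration and after every later one.

```python
def sum_up_to(n):
    """Return 0 + 1 + ... + n for n >= 0, checking the invariant at run time."""
    assert n >= 0
    total, i = 0, 0
    assert total == i * (i - 1) // 2      # invariant holds after 0 iterations
    while i <= n:
        total += i
        i += 1
        assert total == i * (i - 1) // 2  # invariant holds after each iteration
    # On exit i == n + 1, so the invariant gives total == n * (n + 1) // 2.
    return total
```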
Continuous integration
A server checks out the current version of the code periodically, builds code and runs tests.
Test set
A set of tests. Several test cases together.
Types of bugs
Abnormal execution bugs (Nontermination, crash) and Incorrect outcome/behavior (1+1=3).
Levels of testing
Acceptance Testing: assess software with respect to user requirements. System Testing: assess software with respect to system-level requirements. Integration Testing: test interactions between modules. Unit testing: test a single method/function
Advantages of Bottom-Up testing
Advantageous if major flaws occur toward bottom level. Judgement of test results easier.
Advantages of Top-Down testing
Advantageous if major flaws occur toward top level. Early skeletal program allows demonstrations and boosts morale.
Path coverage
All different paths are taken in a control flow graph. Usually it's not realistic.
Loop variant
An expression that decreases with each iteration of the loop, and is bounded from below by 0. Dafny can often guess .... .
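A sketch of the definition above with the check done by run-time assertions. The function `count_halvings` is a made-up example in which the expression `n` itself serves as the decreasing expression: it strictly decreases in every iteration and never drops below 0, so the loop must terminate.

```python
def count_halvings(n):
    """Count how many times n (n >= 0) can be halved before reaching 0."""
    assert n >= 0
    steps = 0
    while n > 0:
        before = n                 # value of the decreasing expression before the iteration
        n //= 2
        steps += 1
        assert 0 <= n < before     # strictly decreased, still bounded below by 0
    return steps
```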
Integration testing
Assess software with respect to high-level design. Testing interaction between modules.
Unit testing
Assess software with respect to low-level unit design. Testing individual units of a system.
System testing
Assess software with respect to system-level requirements. Testing system against specification of externally observable behavior.
Acceptance testing
Assess software with respect to user requirements.
Shrinking
Automatically go from failing test to minimal failing test. ..... the failing input somehow. See if the test still fails, if so ..... more, otherwise backtrack
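One possible reading of the loop described above, assuming list inputs and a hypothetical predicate `fails`. This sketch drops one element at a time, keeps the smaller input if the test still fails, and otherwise backtracks by putting the element back. Real property-based testing tools use more elaborate strategies.

```python
def shrink(failing_input, fails):
    """Greedily remove elements while the test keeps failing."""
    current = list(failing_input)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]  # try dropping element i
        if fails(candidate):
            current = candidate    # still fails: continue with the smaller input
        else:
            i += 1                 # no longer fails: backtrack and keep element i
    return current


# Hypothetical failure: the test fails when the input contains both 2 and 5.
def needs_two_and_five(xs):
    return 2 in xs and 5 in xs


smallest = shrink([1, 2, 3, 4, 5, 6], needs_two_and_five)
```

For this hypothetical failure the result keeps exactly the two elements that are needed to trigger it.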
Logic coverage
Based on boolean (sub-)expressions in the program. Definitions: Decision: Boolean expression (example: a || b && (3>c)). Condition: Atomic boolean subexpression (examples: a, b, 3>c).
Testing
Check for bugs. Try out inputs, see if outputs are correct.
Techniques for assurance
Code review, Pair programming, KISS, Testing, Types, Formal proof of correctness (verification), "Proven technology".
Modified Condition Decision Coverage (MCDC)
Condition/decision coverage + show that each condition influences its decision independently.
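To make "influences its decision independently" concrete, the sketch below searches for pairs of inputs that differ in exactly one condition and give different decision outcomes. It uses the decision `a or (b and d)`, where `d` is an assumed stand-in for a condition such as `3>c`. The helper `independence_pairs` is illustrative, not a standard API.

```python
from itertools import product


def decision(a, b, d):
    return a or (b and d)


def independence_pairs(decision, num_conditions):
    """For each condition index, list the input pairs that differ only in
    that condition and flip the outcome of the decision."""
    pairs = {i: [] for i in range(num_conditions)}
    for inputs in product([False, True], repeat=num_conditions):
        for i in range(num_conditions):
            if inputs[i]:
                continue  # visit each pair once, starting from its False side
            flipped = inputs[:i] + (True,) + inputs[i + 1:]
            if decision(*inputs) != decision(*flipped):
                pairs[i].append((inputs, flipped))
    return pairs


pairs = independence_pairs(decision, 3)
```

A test set that contains one such pair per condition shows each condition's independent influence. For this decision, the four inputs (F,F,T), (T,F,T), (F,T,T) and (F,T,F) are enough.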
White-box testing
Create tests based on both externals (specification) and internals (source code).
Black-box testing
Create tests only based on externals (specification) without knowing internals (source code).
Framing
Dafny requires you to state which variables are read (for functions) and which are modified (for methods).
What may a defect cause?
Defect may cause infection of a program state during execution (not all defects cause infection).
Specification
Describe expected behavior. An unambiguous description of what a function should do. Bug = failure to meet .... .
Extreme testing (ET) [BeckGamma]
Developers create tests first, developers re-run tests on all incremental changes. JUnit is designed for this.
Disadvantages of Bottom-Up testing
Driver units must be produced. The program as an entity does not exist until the last unit is added.
Branch coverage
Every edge in the control flow graph is visited.
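The difference from statement coverage shows up with an `if` that has no `else`. In the sketch below, the made-up function `clamp_to_zero` records which outgoing edge of the `if` was taken. The `else` branch exists only as instrumentation and is an assumption of this example.

```python
def clamp_to_zero(x, edges):
    """Return max(x, 0); `edges` records which edge of the `if` was taken."""
    if x < 0:
        edges.add("true-edge")
        x = 0
    else:
        edges.add("false-edge")  # instrumentation only: the plain code has no else
    return x


edges = set()
clamp_to_zero(-5, edges)
after_one_test = set(edges)   # every original statement has run, one edge is missing
clamp_to_zero(3, edges)
after_two_tests = set(edges)  # both edges taken
```

The single test with input -5 executes every statement of the uninstrumented function, yet the false edge is only covered once the second test is added.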
Fresh
Lets the verifier know that a given object has been freshly allocated in a given method.
Property based testing
Generate random inputs and check that a property of the output holds. Different properties to test: the postcondition holds, no abnormal execution, pointwise equivalence of functions, and algebraic properties.
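A hand-rolled sketch of the idea using only the standard library, with a made-up `insertion_sort` as the unit under test. Dedicated property-based testing libraries add smarter generators and shrinking. The fixed seed is an assumption made here so that runs are reproducible.

```python
import random
from collections import Counter


def insertion_sort(xs):
    """Unit under test (illustrative): return a sorted copy of xs."""
    result = []
    for x in xs:
        i = len(result)
        while i > 0 and result[i - 1] > x:
            i -= 1
        result.insert(i, x)
    return result


def check_properties(sort, trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        out = sort(xs)  # an exception here would be an abnormal execution
        assert all(a <= b for a, b in zip(out, out[1:]))  # postcondition: ordered
        assert Counter(out) == Counter(xs)                # postcondition: same elements
        assert out == sorted(xs)   # pointwise equivalence with a reference function
        assert sort(out) == out    # algebraic property: sorting is idempotent
    return True
```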
Hoare-Triple
If execution of a program S starts in a state satisfying pre-condition Q, then S is guaranteed to terminate in a state satisfying the post-condition R.
Client (caller)
Implementer of calling method or user.
Supplier (callee)
Implementer of method.
What may an infection propagate into?
Infected state propagates during execution (infected parts of the state may be overwritten or corrected).
What can an infection cause?
Infection may cause a failure: an externally observable error (including, for example, non-termination).
Bug - Defect
Introduced into the code by a programmer (not always the programmer's fault if, for example, the requirements have changed).
Test case
Is an input to the function and a way to check that on that input, the function conforms to the specification.
Test suite
Is the set of tests for a piece of software.
Failure
Method m() .... if the precondition held before calling m(), but the postcondition does not hold after C.m() (or if C.m() does not finish).
Correct
Method m() cannot fail. In other words, whenever m() is called and the precondition holds, m() finishes and the postcondition holds.
Driver
Placeholder for a calling function, sets up the context.
Stub
Placeholder implementation of a leaf function (to make sure the test is runnable and fails).
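A small sketch of how a driver and a stub fit around a unit under test. All names here (`convert`, `fetch_exchange_rate_stub`, `driver`) and the fixed rate are hypothetical. The unit receives the leaf function as a parameter so that the stub can be swapped in.

```python
def fetch_exchange_rate_stub(currency):
    """Stub: fixed placeholder for a leaf function that is not implemented yet."""
    return 2.0


def convert(amount, currency, fetch_rate):
    """Unit under test: depends on a leaf function passed in as fetch_rate."""
    return amount * fetch_rate(currency)


def driver():
    """Driver: stands in for the real caller, sets up the context and calls the unit."""
    return convert(10.0, "SEK", fetch_exchange_rate_stub)
```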
Ensures
Postcondition
Requires
Precondition
(Formal) Verification
Prove that program conforms to specification (prove that there are no bugs) mathematically.
Debugging
Remove bugs. Understand why a program does not do what it's supposed to.
Contract
Requires: What the client must ensure. Ensures: What the supplier must ensure.
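The two sides of a contract can be approximated in Python with run-time assertions. This is only a sketch: a verifier such as Dafny checks the contract statically instead. The function `integer_sqrt` is a made-up example.

```python
def integer_sqrt(n):
    """Return the largest r with r * r <= n."""
    # Requires (the client's obligation): n is non-negative.
    assert n >= 0, "precondition violated: n must be >= 0"
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    # Ensures (the supplier's obligation): r is the integer square root of n.
    assert r * r <= n < (r + 1) * (r + 1), "postcondition violated"
    return r
```

If the first assertion fails, the client broke the contract. If the second one fails, the supplier did.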
Data-dependent
Statement B is .... on A if A writes a variable that B reads.
Control-dependent
Statement B is ..... on A if A influences whether B is executed.
(Directly) Backwards dependent
Statement B is ..... on A if either or both: B is control-dependent on A. B is data-dependent on A.
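The three dependence notions above can be read off a small made-up function. The labels S1 to S5 in the comments are only for this illustration.

```python
def example(x):
    a = x + 1        # S1: writes a
    if a > 0:        # S2: reads a, so S2 is data-dependent on S1
        y = a * 2    # S3: control-dependent on S2, data-dependent on S1
    else:
        y = 0        # S4: control-dependent on S2
    return y         # S5: data-dependent on S3 and on S4 (both write y)
```

Reasoning backwards from a wrong return value, S5 depends directly on S3 and S4, which in turn depend on S2 and S1.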
Disadvantages of Top-Down testing
Stubs must be produced (often more complicated than anticipated). Judgement of test results more difficult. Tempting to defer completion of testing of certain modules.
Bottom-up testing
Test the leaves in the call hierarchy and move up to the root. A procedure is not tested until all of its 'children' have been tested. Requires drivers, but no stubs.
Top-Down testing
Test the main procedure, then go down the call hierarchy. Requires stubs, but no drivers.
Weakest precondition
The .... of a program S and post-condition R represents the set of all states such that execution of S started in any of these is guaranteed to terminate in a state satisfying R.
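A worked example, assuming integer-valued x and the standard assignment rule wp(x := e, R) = R[e/x] (substitute e for x in R). The rule is not stated in these notes and is used here only for illustration.

```latex
\begin{align*}
wp(x := x + 1,\; x > 0) &\equiv (x + 1 > 0) \\
                        &\equiv (x \geq 0) \quad \text{(for integer } x\text{)}
\end{align*}
```

So the statement x := x + 1 is guaranteed to end in a state with x > 0 exactly when it starts in a state with x >= 0. Any stronger pre-condition, such as x >= 5, also works.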
Test principles in [Mayers]
Programmers should not test their own programs, and a programming organisation should not test its own programs.
Benefits of extreme testing
You gain confidence that code will meet specification, better understand specification and requirements, express end result before you start coding and may implement simple designs and optimize later.