Chapter 13: Processing Integrity and Availability Controls
Data Entry Controls Source documents should be scanned for reasonableness and propriety before being entered into the system. However, this manual control must be supplemented with automated data entry controls, such as the following:
-A field check determines whether the characters in a field are of the proper type. For example, a check on a field that is supposed to contain only numeric values, such as a U.S. zip code, would indicate an error if it contained alphabetic characters.
-A sign check determines whether the data in a field have the appropriate arithmetic sign. For example, the quantity-ordered field should never be negative.
-A limit check tests a numerical amount against a fixed value. For example, the regular hours-worked field in weekly payroll input must be less than or equal to 40 hours. Similarly, the hourly wage field should be greater than or equal to the minimum wage.
-A range check tests whether a numerical amount falls between predetermined lower and upper limits. For example, a marketing promotion might be directed only to prospects with incomes between $50,000 and $99,999.
-A size check ensures that the input data will fit into the assigned field. For example, the value 458,976,253 will not fit in an eight-digit field. As discussed in Chapter 11, size checks are especially important for applications that directly accept end-user input, providing a way to prevent buffer overflow vulnerabilities.
-A completeness check (or test) verifies that all required data items have been entered. For example, sales transaction records should not be accepted for processing unless they include the customer's shipping and billing addresses.
-A validity check compares the ID code or account number in transaction data with similar data in the master file to verify that the account exists. For example, if product number 65432 is entered on a sales order, the computer must verify that there is indeed a product 65432 in the inventory database.
-A reasonableness test determines the correctness of the logical relationship between two data items. For example, overtime hours should be zero for someone who has not worked the maximum number of regular hours in a pay period.
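A minimal sketch of how several of these automated edit checks might be applied to a single input record. The field names and record layout are illustrative assumptions; the limits (40 regular hours, the $50,000–$99,999 income range, the eight-digit field) follow the examples above:

```python
def edit_checks(rec: dict) -> list[str]:
    """Apply several automated data entry checks; return a list of failures."""
    errors = []
    # Field check: a U.S. zip code field should contain only numeric characters.
    if not str(rec.get("zip", "")).isdigit():
        errors.append("field check failed: zip")
    # Sign check: quantity ordered should never be negative.
    if rec.get("qty_ordered", 0) < 0:
        errors.append("sign check failed: qty_ordered")
    # Limit check: regular hours worked must be less than or equal to 40.
    if rec.get("regular_hours", 0) > 40:
        errors.append("limit check failed: regular_hours")
    # Range check: promotion directed only to incomes between $50,000 and $99,999.
    if not 50_000 <= rec.get("income", 0) <= 99_999:
        errors.append("range check failed: income")
    # Size check: the value must fit into an eight-digit field.
    if len(str(rec.get("amount", 0))) > 8:
        errors.append("size check failed: amount")
    # Completeness check: shipping and billing addresses are required.
    for field in ("ship_to", "bill_to"):
        if not rec.get(field):
            errors.append(f"completeness check failed: {field}")
    return errors
```

A record that passes every check returns an empty list; each violated check contributes one entry, so a data entry clerk (or an error log) can see every problem at once.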
-ID codes (such as part numbers) can contain a check digit that is computed from the other digits. For example, the system could assign each new inventory item a nine-digit number, then calculate a tenth digit from the original nine and append that calculated digit to the original nine to form a 10-digit part number. Data entry devices can then be programmed to perform check digit verification, which involves recalculating the check digit to identify data entry errors. Continuing our example, check digit verification could be used to verify the accuracy of an inventory item number by using the first nine digits to calculate what the tenth digit should be. If an error is made in entering any of the ten digits, the calculation made on the first nine digits will not match the tenth (check) digit. Note that check digit verification only tests whether an ID code in a transaction record could exist and is designed to catch transposition errors (e.g., entering 35689 instead of 35869 as a part number). A validity check is the only way to verify that the ID code really does exist.
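The chapter does not prescribe a particular check digit algorithm. The sketch below uses the Luhn (mod-10) scheme, a common choice that catches all single-digit entry errors and most adjacent transpositions:

```python
def luhn_check_digit(body: str) -> str:
    """Compute a Luhn (mod-10) check digit for a numeric ID code."""
    total = 0
    # Walk right to left; every second digit (starting with the one
    # immediately left of the check digit position) is doubled.
    for i, ch in enumerate(reversed(body)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # equivalent to summing the two digits of the product
        total += d
    return str((10 - total % 10) % 10)


def verify(part_number: str) -> bool:
    """Check digit verification: recalculate the check digit from the
    leading digits and compare it to the last digit as entered."""
    return luhn_check_digit(part_number[:-1]) == part_number[-1]
```

For the nine-digit body 123456789 the scheme yields check digit 7, so 1234567897 verifies; transposing two adjacent digits (e.g., 1234567987) makes verification fail, which is exactly the kind of keying error check digit verification is meant to catch.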
Additional Online Data Entry Controls
-Prompting, in which the system requests each input data item and waits for an acceptable response, ensures that all necessary data are entered (i.e., prompting is an online completeness check).
-Closed-loop verification checks the accuracy of input data by using it to retrieve and display other related information. For example, if a clerk enters an account number, the system could retrieve and display the account name so that the clerk could verify that the correct account number had been entered.
-A transaction log includes a detailed record of all transactions, including a unique transaction identifier, the date and time of entry, and who entered the transaction. If an online file is damaged, the transaction log can be used to reconstruct the file. If a malfunction temporarily shuts down the system, the transaction log can be used to ensure that transactions are not lost or entered twice.
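A transaction log of the kind just described might be sketched as follows. The field names and the replay logic are illustrative assumptions; the key points are the unique identifier, the timestamp, the user, and the ability to reconstruct a file without losing or double-applying transactions:

```python
import uuid
from datetime import datetime, timezone


def log_transaction(log: list, user: str, data: dict) -> dict:
    """Append a transaction record carrying a unique identifier,
    the date and time of entry, and who entered it."""
    entry = {
        "txn_id": str(uuid.uuid4()),
        "entered_at": datetime.now(timezone.utc).isoformat(),
        "entered_by": user,
        "data": data,
    }
    log.append(entry)
    return entry


def replay(log: list) -> dict:
    """Reconstruct file state from the log. Duplicate entries are
    skipped by txn_id, so no transaction is applied twice."""
    state, seen = {}, set()
    for entry in log:
        if entry["txn_id"] in seen:
            continue
        seen.add(entry["txn_id"])
        state.update(entry["data"])
    return state
```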
COBIT 2019 management practices DSS01.04 and DSS01.05 address the importance of locating and designing the data centers housing mission-critical servers and databases so as to minimize the risks associated with natural and human-caused disasters. Common design features include the following:
-Raised floors provide protection from damage caused by flooding.
-Fire detection and suppression devices reduce the likelihood of fire damage.
-Adequate air-conditioning systems reduce the likelihood of damage to computer equipment due to overheating or humidity.
-Cables with special plugs that cannot be easily removed reduce the risk of system damage due to accidental unplugging of the device.
-Surge-protection devices provide protection against temporary power fluctuations that might otherwise cause computers and other network equipment to crash.
-An uninterruptible power supply (UPS) system provides protection in the event of a prolonged power outage, using battery power to enable the system to operate long enough to back up critical data and safely shut down. (However, it is important to regularly inspect and test the batteries in a UPS to ensure that it will function when needed.)
-Physical access controls reduce the risk of theft or damage.
Forms Design Source documents and other forms should be designed to minimize the chances for errors and omissions. Two particularly important forms design controls involve sequentially prenumbering source documents and using turnaround documents.
1. All source documents should be sequentially prenumbered. Prenumbering improves control by making it possible to verify that no documents are missing. (To understand this, consider the difficulty you would have in balancing your checking account if none of your checks were numbered.) When sequentially prenumbered source data documents are used, the system should be programmed to identify and report missing or duplicate source documents.
2. As explained in Chapter 2, companies use turnaround documents to eliminate the need for an external party to submit information that the organization already possesses, such as the customer's account number. Instead, that data is preprinted in machine-readable format on the turnaround document. An example is a utility bill that a special scanning device reads when the bill is returned with a payment. Turnaround documents improve accuracy by eliminating the potential for input errors when entering data manually.
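The programmed test for missing or duplicate prenumbered source documents could be sketched like this (the function name and return layout are assumptions made for illustration):

```python
def audit_document_numbers(numbers: list[int]) -> dict:
    """Report missing and duplicate document numbers in a batch of
    sequentially prenumbered source documents."""
    seen = sorted(numbers)
    # Every number between the lowest and highest observed should appear.
    expected = set(range(seen[0], seen[-1] + 1))
    missing = sorted(expected - set(seen))
    duplicates = sorted({n for n in numbers if numbers.count(n) > 1})
    return {"missing": missing, "duplicates": duplicates}
```

For example, a batch numbered 101, 102, 104, 104, 105 would be reported as missing document 103 and containing a duplicate of 104, prompting investigation before processing continues.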
There are two types of daily partial backups:
1. An incremental backup involves copying only the data items that have changed since the last partial backup. This produces a set of incremental backup files, each containing the results of one day's transactions. Restoration involves first loading the last full backup and then installing each subsequent incremental backup in the proper sequence.
2. A differential backup copies all changes made since the last full backup. Thus, each new differential backup file contains the cumulative effects of all activity since the last full backup. Consequently, except for the first day following a full backup, daily differential backups take longer than incremental backups. Restoration is simpler, however, because the last full backup needs to be supplemented with only the most recent differential backup, instead of a set of daily incremental backup files.
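The difference in restoration procedures can be illustrated with a small sketch that treats each backup as a dictionary of changed data items (an assumption made purely for illustration):

```python
def restore(full: dict, partials: list[dict], strategy: str) -> dict:
    """Rebuild state from a full backup plus daily partial backups.

    "incremental":  apply every partial backup in the proper sequence.
    "differential": apply only the most recent partial backup, because
                    each differential is cumulative since the full backup.
    """
    state = dict(full)
    to_apply = partials if strategy == "incremental" else partials[-1:]
    for partial in to_apply:
        state.update(partial)
    return state
```

Both strategies reconstruct the same final state; incremental restoration simply has to walk through more files to get there.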
Processing Controls Controls are also needed to ensure that data are processed correctly. Important processing controls include the following:
1. Data matching. In certain cases, two or more items of data must be matched before an action can take place. For example, before paying a vendor, the system should verify that information on the vendor invoice matches information on both the purchase order and the receiving report.
2. File labels. File labels need to be checked to ensure that the correct and most current files are being updated. Both external labels that are readable by humans and internal labels that are written in machine-readable form on the data recording media should be used. Two important types of internal labels are header and trailer records. The header record is located at the beginning of each file and contains the file name, expiration date, and other identification data. The trailer record is located at the end of the file; in transaction files it contains the batch totals calculated during input. Programs should be designed to read the header record prior to processing, to ensure that the correct file is being updated. Programs should also be designed to read the information in the trailer record after processing, to verify that all input records have been correctly processed.
3. Recalculation of batch totals. Batch totals should be recomputed as each transaction record is processed, by comparing a running total calculated during processing to the corresponding batch total calculated during input and stored in the trailer record. Any discrepancies indicate a processing error. Often, the nature of the discrepancy provides a clue about the type of error that occurred. For example, if the recomputed record count is smaller than the original, one or more transaction records were not processed. Conversely, if the recomputed record count is larger than the original, either additional unauthorized transactions were processed, or some transaction records were processed twice. If a financial or hash total discrepancy is evenly divisible by 9, the likely cause is a transposition error, in which two adjacent digits were inadvertently reversed (e.g., 46 instead of 64). Transposition errors may appear to be trivial but can have enormous financial consequences. For example, consider the effect of misrecording the interest rate on a loan as 6.4% instead of 4.6%.
4. Cross-footing and zero-balance tests. Often totals can be calculated in multiple ways. For example, in spreadsheets a grand total can be computed either by summing a column of row totals or by summing a row of column totals. These two methods should produce the same result. A cross-footing balance test compares the results produced by each method to verify accuracy. A zero-balance test applies this same logic to verify the accuracy of processing that involves control accounts. For example, the payroll clearing account is debited for the total gross pay of all employees in a particular time period. It is then credited for the amount of all labor costs allocated to various expense categories. The payroll clearing account should have a zero balance after both sets of entries have been made; a nonzero balance indicates a processing error.
5. Write-protection mechanisms. These protect against overwriting or erasing of data files stored on magnetic media. Write-protection mechanisms have long been used to protect master files from accidentally being damaged. Technological innovations also necessitate the use of write-protection mechanisms to protect the integrity of transaction data. For example, radio frequency identification (RFID) tags used to track inventory need to be write-protected so that unscrupulous customers cannot change the price of merchandise.
6. Concurrent update controls. Errors can occur when two or more users attempt to update the same record simultaneously. Concurrent update controls prevent such errors by locking out one user until the system has finished processing the transaction entered by the other.
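The divisible-by-9 heuristic for batch-total discrepancies described above can be demonstrated in a short sketch. Swapping adjacent digits a and b changes a number by (a − b) × 9 × 10^k, so the resulting discrepancy is always a multiple of 9 (the function name and messages are illustrative):

```python
def diagnose_discrepancy(input_total: int, recomputed_total: int) -> str:
    """Suggest a likely cause for a batch-total discrepancy (heuristic only)."""
    diff = abs(input_total - recomputed_total)
    if diff == 0:
        return "totals agree"
    if diff % 9 == 0:
        # Swapping adjacent digits a and b changes the value by
        # (a - b) * 9 * 10**k, so the difference is divisible by 9.
        return "possible transposition error"
    return "other error"
```

The classic example from the text: recording 64 as 46 leaves a discrepancy of 18, which is divisible by 9 and therefore flags a possible transposition.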
The preventive controls discussed in the preceding section can minimize, but not entirely eliminate, the risk of system downtime. Hardware malfunctions, software problems, or human error can cause data to become inaccessible. For example, RAID devices can experience catastrophic failures, rendering all the drives useless. That's why senior management needs to answer two fundamental questions:
1. How much data are we willing to recreate from source documents (if they exist) or potentially lose (if no source documents exist)? 2. How long can we function without our information system?
Output Controls Careful checking of system output provides additional control over processing integrity. Important output controls include the following:
1. User review of output. Users should carefully examine system output to verify that it is reasonable and complete, and that they are the intended recipients.
2. Reconciliation procedures. Periodically, all transactions and other system updates should be reconciled to control reports, file status/update reports, or other control mechanisms. In addition, general ledger accounts should be reconciled to subsidiary account totals on a regular basis. For example, the balance of the inventory control account in the general ledger should equal the sum of the item balances in the inventory database. The same is true for the accounts receivable, capital assets, and accounts payable control accounts.
3. External data reconciliation. Database totals should periodically be reconciled with data maintained outside the system. For example, the number of employee records in the payroll file can be compared with the total number of employees in the human resources database to detect attempts to add fictitious employees to the payroll database. Similarly, inventory on hand should be physically counted and compared to the quantity on hand recorded in the database. The results of the physical count should be used to update the recorded amounts, and significant discrepancies should be investigated.
4. Data transmission controls. Organizations also need to implement controls designed to minimize the risk of data transmission errors. Whenever the receiving device detects a data transmission error, it requests the sending device to retransmit that data. Generally, this happens automatically, and the user is unaware that it has occurred. For example, the Transmission Control Protocol (TCP) discussed in Chapter 11 assigns a sequence number to each packet and uses that information to verify that all packets have been received and to reassemble them in the correct order. Two other common data transmission controls are checksums and parity bits.
5. Checksums. When data are transmitted, the sending device can calculate a hash of the file, called a checksum. The receiving device performs the same calculation and sends the result to the sending device. If the two hashes agree, the transmission is presumed to be accurate. Otherwise, the file is resent.
6. Parity bits. Computers represent characters as a set of binary digits called bits. Each bit has two possible values: 0 or 1. Many computers use a seven-bit coding scheme, which is more than enough to represent the 26 letters in the English alphabet (both upper- and lowercase), the numbers 0 through 9, and a variety of special symbols ($, %, &, etc.). A parity bit is an extra digit added to the beginning of every character that can be used to check transmission accuracy. Two basic schemes are referred to as even parity and odd parity. In even parity, the parity bit is set so that each character has an even number of bits with the value 1; in odd parity, the parity bit is set so that an odd number of bits in the character have the value 1. For example, the digits 5 and 7 can be represented by the seven-bit patterns 0000101 and 0000111, respectively. An even parity system would set the parity bit for 5 to 0, so that it would be transmitted as 00000101 (because the binary code for 5 already has two bits with the value 1). The parity bit for 7 would be set to 1 so that it would be transmitted as 10000111 (because the binary code for 7 has three bits with the value 1). The receiving device performs parity checking, which entails verifying that the proper number of bits are set to the value 1 in each character received.
7. Blockchain. As explained in Chapter 12, blockchains provide a way to ensure that validated transactions and documents are not altered. Integrity is assured by hashing the contents of each block and then storing multiple copies of the entire chain on different devices.
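Checksums and even parity, as described above, can be sketched briefly. SHA-256 stands in here for whatever hash function the sender and receiver agree on; that particular choice is an assumption, since the text does not name an algorithm:

```python
import hashlib


def checksum(data: bytes) -> str:
    """Both sender and receiver hash the file; if the two hashes agree,
    the transmission is presumed accurate, otherwise the file is resent."""
    return hashlib.sha256(data).hexdigest()


def add_even_parity(seven_bits: str) -> str:
    """Prepend an even-parity bit so that each transmitted character
    has an even number of bits with the value 1."""
    parity = str(seven_bits.count("1") % 2)
    return parity + seven_bits
```

Using the text's examples, the digit 5 (0000101, two 1 bits) is transmitted as 00000101, while 7 (0000111, three 1 bits) is transmitted as 10000111.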
archive
A copy of a database, master file, or software retained indefinitely as a historical record, usually to satisfy legal and regulatory requirements.
checksum
A data transmission control that uses a hash of a file to verify accuracy.
hot site
A disaster recovery option that relies on access to a completely operational alternative data center that is not only prewired but also contains all necessary hardware and software.
cold site
A disaster recovery option that relies on access to an alternative facility that is prewired for necessary telephone and Internet access but does not contain any computing equipment.
redundant arrays of independent drives (RAID)
A fault tolerance technique that records data on multiple disk drives instead of just one to reduce the risk of data loss.
business continuity plan (BCP)
A plan that specifies how to resume all business processes in the event of a major calamity.
disaster recovery plan (DRP)
A plan to restore an organization's IT capability in the event its data center is destroyed.
deduplication
A process that uses hashing to identify and back up only those portions of a file or database that have been updated since the last backup.
cross-footing balance test
A processing control that verifies accuracy by comparing two alternative ways of calculating the same total.
zero-balance test
A processing control that verifies that the balance of a control account equals zero after all entries to it have been made.
uninterruptible power supply (UPS)
An alternative power supply device that protects against the loss of power and fluctuations in the power level by using battery power to enable the system to operate long enough to back up critical data and safely shut down.
transposition error
An error that results when numbers in two adjacent columns are inadvertently exchanged (for example, 64 is written as 46).
concurrent update controls
Controls that lock out users to protect individual records from errors that could occur if multiple users attempted to update the same record simultaneously.
full backup
Exact copy of an entire database.
real-time mirroring
Maintaining complete copies of a database at two separate data centers and updating both copies in real time as each transaction occurs.
Backups are designed to mitigate problems when one or more files or databases become corrupted because of hardware, software, or human error. Disaster recovery plans and business continuity plans are designed to mitigate more serious problems.
Note
Cancellation and Storage of Source Documents Source documents that have been entered into the system should be canceled so they cannot be inadvertently or fraudulently reentered into the system.
Note
Data backup procedures are designed to deal with situations where information is not accessible because the relevant files or databases have become corrupted as a result of hardware failure, software problems, or human error, but the information system itself is still functioning.
Note
If the customer purchases more than one product, there will be multiple inventory item numbers, quantities sold, and prices associated with each sales transaction. Processing these transactions includes the following steps: (1) entering and editing the transaction data; (2) updating the customer and inventory records (the amount of the credit purchase is added to the customer's balance; for each inventory item, the quantity sold is subtracted from the quantity on hand); and (3) preparing and distributing shipping and/or billing documents.
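Step 2 of those processing steps (updating the customer and inventory records) might be sketched as follows; the record structures are illustrative assumptions:

```python
def process_sale(customers: dict, inventory: dict, txn: dict) -> None:
    """Update master records for one credit sale: add the purchase amount
    to the customer's balance, and subtract each quantity sold from the
    corresponding item's quantity on hand."""
    customers[txn["customer"]] += txn["amount"]
    for item_no, qty_sold in txn["items"]:
        inventory[item_no] -= qty_sold
```

A sale with multiple line items updates one customer balance and several inventory records in a single pass, which is why concurrent update controls and batch totals matter for this step.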
Note
In addition, forms design, cancellation and storage of source documents, and automated data entry controls are needed to verify the validity of input data.
Note
The phrase "garbage in, garbage out" highlights the importance of input controls. If the data entered into a system are inaccurate, incomplete, or invalid, the output will be, too.
Note
The purpose of backups is to enable restoration of data in the event that the original copy becomes inaccessible.
Note
What media should be used for backups and archives: tape or disk? Disk backup is faster, as is retrieval of the backed-up data. Tape, however, is cheaper, easier to transport, and more durable.
Note
recovery point objective (RPO)
The amount of data the organization is willing to reenter or potentially lose.
fault tolerance
The capability of a system to continue performing when there is a hardware failure.
recovery time objective (RTO)
The maximum tolerable time to restore an organization's information system following a disaster, representing the length of time that the organization is willing to attempt to function without its information system.
header record
Type of internal label that appears at the beginning of each file and contains the file name, expiration date, and other file identification information.
trailer record
Type of internal label that appears at the end of a file; in transaction files, the trailer record contains the batch totals calculated during input.
Additional Batch Processing Data Entry Controls
• Batch processing works more efficiently if the transactions are sorted so that the accounts affected are in the same sequence as records are stored in the master file. For example, accurate batch processing of sales transactions to update customer account balances requires that the sales transactions file first be sorted by customer account number. A sequence check tests whether a transaction file is in the proper numerical or alphabetical sequence.
• An error log that identifies data input errors (date, cause, problem) facilitates timely review and resubmission of transactions that cannot be processed.
• Batch totals calculate numeric values for a batch of input records. Batch totals are used to ensure that all records in a batch are processed correctly. The following are three commonly used batch totals:
1. A financial total sums a field that contains monetary values, such as the total dollar amount of all sales for a batch of sales transactions.
2. A hash total sums a nonfinancial numeric field, such as the total of the quantity-ordered field in a batch of sales transactions. Unlike financial totals, hash totals have no inherent meaning. For example, it is possible to sum the invoice numbers in a batch of sales transactions, but the result is meaningless; its only purpose is to serve as an input control.
3. A record count is the number of records in a batch.
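The three batch totals, plus a simple sequence check, can be sketched over a small assumed batch of sales transactions (the field names and values are illustrative):

```python
# An assumed batch of sales transaction records.
batch = [
    {"invoice": 1001, "account": 501, "qty": 5, "amount": 120.00},
    {"invoice": 1002, "account": 417, "qty": 2, "amount": 80.50},
    {"invoice": 1003, "account": 233, "qty": 7, "amount": 45.25},
]

# Financial total: sums a monetary field (total dollar amount of sales).
financial_total = sum(t["amount"] for t in batch)

# Hash total: sums a nonfinancial numeric field; the sum itself is
# meaningless and exists only to serve as an input control.
hash_total = sum(t["invoice"] for t in batch)

# Record count: the number of records in the batch.
record_count = len(batch)

# Sequence check: is the file sorted by customer account number, matching
# the order of records in the master file?
in_sequence = [t["account"] for t in batch] == sorted(t["account"] for t in batch)
```

These totals are computed once during input, stored in the trailer record, and then recomputed during processing; any discrepancy between the two sets signals a processing error.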