Chapter 8 Data Structures and CAATTs for Data Extraction

Common GAS Uses

- Footing and balancing entire files or selected data items (e.g., extending inventory)
- Selecting and reporting detail data
- Selecting stratified statistical samples from data files
- Printing confirmations
- Screening / filtering data
- Comparing multiple files for differences
- Recalculating values in data
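A few of these uses can be sketched in plain Python. This is only an illustration of footing, recalculating extended values, and stratified sampling; the records, field names, and stratum boundary are hypothetical, and real GAS packages such as ACL provide these as built-in commands.

```python
import random
from decimal import Decimal

# Hypothetical inventory records; field names are illustrative only.
inventory = [
    {"item": "A100", "qty": 10, "unit_cost": Decimal("2.50"), "extended": Decimal("25.00")},
    {"item": "B200", "qty": 4, "unit_cost": Decimal("12.00"), "extended": Decimal("48.00")},
    # Extended value is deliberately wrong (7 x 3.00 should be 21.00).
    {"item": "C300", "qty": 7, "unit_cost": Decimal("3.00"), "extended": Decimal("20.00")},
]

def foot(records, field):
    """Footing: total one field over the whole file."""
    return sum((r[field] for r in records), Decimal("0"))

def recalc_exceptions(records):
    """Recalculate qty x unit cost and report items whose stored value differs."""
    return [r["item"] for r in records if r["qty"] * r["unit_cost"] != r["extended"]]

def stratified_sample(records, boundary, per_stratum, seed=0):
    """Split records into two value strata and sample each one separately."""
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    low = [r for r in records if r["extended"] < boundary]
    high = [r for r in records if r["extended"] >= boundary]
    return [rng.sample(s, min(per_stratum, len(s))) for s in (low, high)]
```

Calling `recalc_exceptions(inventory)` flags item C300, whose stored extension does not agree with quantity times unit cost.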

Virtual Storage Access Method (VSAM)

- Used for large files that require routine batch processing and a moderate degree of individual record processing. Example: the customer file of a public utility company, where billing is done in batch and customer queries are handled one record at a time.
- Used for files that span multiple cylinders.
- Uses a number of indexes with summarized content. The index "gets you in the neighborhood" of a record rather than giving its exact location.
- Can be read sequentially for quick batch processing.
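The idea of a summarized index can be sketched as follows. This is a simplified model, not actual VSAM: the blocks stand in for tracks, the index stores only the highest key in each block, and all names and data are hypothetical.

```python
from bisect import bisect_left

# Hypothetical data: each block (think "track") holds records in key order.
blocks = [
    [(1001, "Adams"), (1004, "Baker"), (1010, "Chen")],
    [(1015, "Diaz"), (1020, "Evans"), (1031, "Ford")],
    [(1040, "Gupta"), (1052, "Hall"), (1060, "Ito")],
]

# Summarized index: only the highest key stored in each block.
index = [block[-1][0] for block in blocks]

def find(key):
    """Use the index to pick the block, then scan that block sequentially."""
    pos = bisect_left(index, key)  # first block whose highest key >= key
    if pos == len(index):
        return None  # key is larger than anything in the file
    for record_key, value in blocks[pos]:
        if record_key == key:
            return value
    return None

def read_all():
    """Batch processing: read every record in key order, ignoring the index."""
    for block in blocks:
        yield from block
```

The index narrows the search to one block (the neighborhood), and the short sequential scan inside that block finds the record itself.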

Anomalies

1. Insertion - a new item cannot be added to the table until at least one entity uses a particular attribute item. Example: no class can be recorded without a professor. Can result in unrecorded transactions and incomplete audit trails.
2. Deletion - when an attribute item used by only one entity is deleted, all information about the attribute is lost. Example: if you delete a professor, you lose the class information associated with that professor. Can destroy the audit trail.
3. Update - a modification to an attribute must be made in each of the rows in which the attribute appears. Example: if O'Donnell becomes the Weeknd, you would have to update 3 records. Can generate conflicting and obsolete database values.

Relational databases utilize an indexed sequential file structure.
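The update and deletion anomalies can be shown with a small sketch. The table layout, class codes, and the second professor are hypothetical; the point is only that an unnormalized table repeats the professor's name, while a table split by key stores it once.

```python
# Hypothetical unnormalized table: the professor's name is repeated on every class row.
flat = [
    {"class": "ACC101", "professor": "O'Donnell"},
    {"class": "ACC202", "professor": "O'Donnell"},
    {"class": "ACC303", "professor": "O'Donnell"},
    {"class": "FIN110", "professor": "Reyes"},
]

def rename_flat(rows, old, new):
    """Update anomaly: every row that carries the name must be changed."""
    changed = 0
    for row in rows:
        if row["professor"] == old:
            row["professor"] = new
            changed += 1
    return changed

def delete_professor_flat(rows, name):
    """Deletion anomaly: removing the professor's rows also removes the classes."""
    return [row for row in rows if row["professor"] != name]

# Normalized alternative: the name is stored once and classes refer to it by key.
professors = {1: "O'Donnell", 2: "Reyes"}
classes = {"ACC101": 1, "ACC202": 1, "ACC303": 1, "FIN110": 2}

def rename_normalized(professor_id, new):
    """One assignment updates the name for every class that references it."""
    professors[professor_id] = new
```

In the flat table the rename touches three rows and a missed row would leave conflicting values; in the split tables it is a single change.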

Types of Pointers

1. Physical address - contains the actual disk storage location needed by the disk controller.
2. Relative address - contains the relative position of a record in the file.
3. Logical key pointer - contains the primary key of the related record. The key value is then converted into the record's physical address by a hashing algorithm.
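As a rough sketch of how a relative address is turned into a location, assume fixed-length records held in a byte buffer, with the byte offset standing in for the physical address. The record layout and data here are hypothetical; a real disk address would involve cylinder, surface, and record numbers.

```python
import struct

# Hypothetical fixed-length record layout: 4-byte unsigned key + 12-byte name.
RECORD = struct.Struct("<I12s")

def byte_offset(relative_position, file_start=0):
    """Convert a 1-based relative record position into a byte offset."""
    return file_start + (relative_position - 1) * RECORD.size

def read_relative(data, relative_position):
    """Fetch the record stored at the given relative position."""
    key, raw_name = RECORD.unpack_from(data, byte_offset(relative_position))
    return key, raw_name.rstrip(b"\x00").decode("ascii")

# Three records packed back to back, as they would sit in the file.
data = b"".join(
    RECORD.pack(key, name)
    for key, name in [(101, b"Adams"), (102, b"Baker"), (103, b"Chen")]
)
```

With 16-byte records, the third record starts at byte 32, so `read_relative(data, 3)` goes straight to it without reading the first two.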

Embedded Audit Module

AKA continuous auditing. Identifies important transactions while they are being processed and extracts copies of them in real time.
Advantage:
- Continuous auditing
Disadvantages:
- Could slow down the system and can be costly
- If there are frequent changes to the system, auditors must verify that no changes have been made to the EAM
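A minimal sketch of the idea, assuming "important" means an amount at or above a materiality threshold set by the auditor. The threshold, field names, and function are hypothetical; a real EAM is embedded in the host application's own code.

```python
# Hypothetical materiality threshold chosen by the auditor.
AUDIT_THRESHOLD = 10_000

def post_transactions(transactions, ledger, audit_file, threshold=AUDIT_THRESHOLD):
    """Post each transaction; the embedded module copies material ones as they pass."""
    for txn in transactions:
        # Normal application processing: update the account balance.
        ledger[txn["account"]] = ledger.get(txn["account"], 0) + txn["amount"]
        # Embedded audit module: runs inside the same processing loop.
        if abs(txn["amount"]) >= threshold:
            # Store a copy so later changes to the transaction do not alter the evidence.
            audit_file.append(dict(txn))
```

Because the check runs on every transaction, it adds work to normal processing, which is the performance cost noted above.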

Hashing Structure

Employs an algorithm that converts the primary key of a record directly into a storage address, which eliminates the need for a separate index. Because the address is calculated rather than read from an index, records can be retrieved more quickly.
Two disadvantages:
- The technique does not use storage space efficiently. The algorithm will never select some locations because they do not correspond to legitimate key values.
- Different record keys may generate the same residual, and therefore the same address. This is a collision, because two records cannot be stored at the same location.
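The residual mentioned above can be illustrated with a division-remainder hash, one common choice. The divisor and keys are hypothetical, and this sketch only detects a collision; it does not show how a real system would relocate the second record.

```python
DIVISOR = 97  # hypothetical prime divisor; the residual becomes the address

def hash_address(key, divisor=DIVISOR):
    """Convert a primary key directly into a storage address."""
    return key % divisor

def store(slots, key, record):
    """Place the record at its hashed address; return False on a collision."""
    address = hash_address(key)
    occupant = slots.get(address)
    if occupant is not None and occupant[0] != key:
        return False  # a different key already occupies this address
    slots[address] = (key, record)
    return True

def fetch(slots, key):
    """Recompute the address from the key; no index is consulted."""
    occupant = slots.get(hash_address(key))
    if occupant is not None and occupant[0] == key:
        return occupant[1]
    return None
```

Keys 1998 and 2095 both leave a residual of 58 when divided by 97, so the second `store` call reports a collision.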

Flat File Structure: Sequential

Flat files lack integration. Sequential: all records are stored in contiguous storage space in a specified sequence, e.g., the record with key value 1999 comes right after the record with key value 1998. Simple and easy to process: the application reads from the beginning of the file in sequence (0-1999).

Data Structures

Have two fundamental components:
- Organization - the way records are physically arranged on the secondary storage device. Can be sequential or random.
- Access - the technique used to locate records and to navigate through the database or file. Can be direct or sequential.

Flat File Structure: Indexed

In addition to the actual data file, there is a separate index that is itself a file of record addresses. The index contains the numeric storage location for each record. Physical organization may be sequential or random. Indexed random file - records are dispersed throughout a disk without regard to their physical proximity to other related records. Advantage - more rapid searches than sequential (fewer records to search through).
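An indexed random file can be sketched with two dictionaries, one for the scattered data and one for the index. The storage locations and records are hypothetical.

```python
# Hypothetical "disk": storage location -> (key, record), in no particular order.
disk = {
    57: (1003, "Chen"),
    12: (1001, "Adams"),
    88: (1002, "Baker"),
}

# Separate index file: primary key -> storage location of that record.
index = {key: location for location, (key, _) in disk.items()}

def lookup(key):
    """Search the index for the key, then go directly to the stored location."""
    location = index.get(key)
    if location is None:
        return None
    return disk[location][1]
```

Only the index is searched; the data records themselves are read once, at the location the index supplies.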

Comparing VSAM, Sequential, and Random

Random is the most efficient for identifying single records, followed by VSAM and then sequential. VSAM is the most efficient for accessing entire files, followed by sequential, while random is inefficient for accessing entire files.

Conceptual Database Models

Refer to the particular method used to organize records in a database. The objective is to develop the database efficiently so data can be accessed quickly and easily.
- Hierarchical (tree)
- Network
- Relational (most common)
Some legacy systems use the hierarchical or network model.

GAS (continued)

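- Simple structures - GAS can access the files directly.
- Complex structures - the files must be converted first (e.g., ACL with Excel, CSV, or Access files).
Conversion carries a risk that data integrity is compromised. Auditors confirm integrity by:
- Comparing totals and number of records
- Reviewing the computer log of file creation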
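The comparison of totals and record counts might look like the sketch below, assuming the converted file is a CSV with an amount column. The field names and sample data are hypothetical.

```python
import csv
import io
from decimal import Decimal

def control_totals(rows, amount_field):
    """Return (record count, total of the amount field) for a set of records."""
    count, total = 0, Decimal("0")
    for row in rows:
        count += 1
        total += Decimal(row[amount_field])
    return count, total

def conversion_matches(source_rows, converted_csv_text, amount_field):
    """Compare control totals of the source records with the converted CSV."""
    converted_rows = csv.DictReader(io.StringIO(converted_csv_text))
    return control_totals(source_rows, amount_field) == control_totals(converted_rows, amount_field)

# Hypothetical source records and two converted files, one complete and one truncated.
source_rows = [
    {"invoice": "1", "amount": "100.00"},
    {"invoice": "2", "amount": "250.50"},
]
good_csv = "invoice,amount\n1,100.00\n2,250.50\n"
bad_csv = "invoice,amount\n1,100.00\n"
```

Here `conversion_matches(source_rows, bad_csv, "amount")` is false, because the truncated file has one record and a lower total.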

Pointer structure

Stores in a field of one record the address (pointer) of a related record. Records are spread out over the entire disk without concern for physical proximity to other related records.
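A linked chain of records illustrates the idea. The addresses and keys are hypothetical, and a dictionary stands in for the disk.

```python
# Hypothetical "disk": address -> record. Each record stores the address of the next one.
disk = {
    704: {"key": 124, "next": 212},
    212: {"key": 125, "next": 938},
    938: {"key": 126, "next": None},  # None marks the end of the chain
}

def follow_chain(disk, first_address):
    """Visit related records by following the pointer stored in each one."""
    keys = []
    address = first_address
    while address is not None:
        record = disk[address]
        keys.append(record["key"])
        address = record["next"]
    return keys
```

Starting from address 704, the chain yields keys 124, 125, and 126 in order, even though the records sit at unrelated addresses.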

