ch8 410
Benefits of Denormalization
-Can improve performance (speed) by reducing number of table lookups (i.e. reduce number of necessary join queries)
•Recovery and Restart Procedures
-Disk Mirroring - switch between identical copies of databases -Restore/Rerun - reprocess transactions against the backup (only done as a last resort) -Backward Recovery (Rollback) - apply before images -Forward Recovery (Roll Forward) - apply after images (preferable to restore/rerun)
•Advantages of Partitioning
-Efficiency: records used together are grouped together -Local optimization: each partition can be optimized for performance -Security: data not relevant to users are segregated -Recovery and uptime: smaller files take less back up time -Load balancing: partitions stored on different disks, reduces contention
•Disadvantages of Partitioning
-Inconsistent access speed: slow retrievals across partitions -Complexity: non-transparent partitioning -Extra space or update time: duplicate data; access from multiple partitions
Integrity Controls
-Protect data from unauthorized use -Domains - set allowable values -Assertions - enforce database conditions -Triggers - prevent inappropriate actions, invoke special handling procedures, write to log files
•Tablespace components
-Segment - a table, index, or partition -Extent - contiguous section of disk space -Data block - smallest unit of storage
views
-Subset of the database that is presented to one or more users -User can be given access privilege to view without allowing access privilege to underlying tables
Checkpoint Facilities
A DBMS may perform checkpoints automatically (which is preferred) or in response to commands in user application programs. Checkpoints should be taken frequently (say, several times an hour).
information repository
A component that stores metadata that describe an organization's data and data processing resources, manages the total information processing environment, and combines information about an organization's business information and its application portfolio
after image
A copy of a record (or page of memory) after it has been modified
before image
A copy of a record (or page of memory) before it has been modified
data type
A detailed coding scheme recognized by system software, such as a DBMS, for representing organizational data
transaction
A discrete unit of work that must be completely processed or not processed at all within a computer system
tablespace
A named logical storage unit in which data from one or more database tables, views, or other database objects may be stored
hashed file organization
A storage system in which the address for each record is determined using a hashing algorithm
base table
A table in the relational data model containing the inserted raw data and correspond to the relations that are identified in the database's conceptual schema
file organization
A technique for physically arranging the records of a file on secondary storage devices
forward recovery
A technique that starts with an earlier copy of a database, applies images, then quickly moves forward to a later state
Authentication versus authorization.
Authorization rules specify who has what access rights to what data elements. Authentication schemes are means of ensuring that a user is who he or she claims to be.
BLOB:
Binary large object, can store up to 4 gigabytes, Used for photos, sound clips, etc.
C L O B
Character large object, capable of storing up to 4 gigabytes of one variable length character data field
Critical Decisions for Physical Design
Choosing the storage format (called data type) for each attribute from the logical data model; Giving the D B M S guidance regarding how to group attributes from the logical data model into physical records Giving the D B M S guidance on how to arrange similarly structured records in secondary memory (file organization) Selecting structures (including indexes and the overall database architecture) for storing and connecting files to make retrieving related data more efficient Preparing strategies for handling queries against the database that will optimize performance (query optimization)
authorization
Controls incorporated in a data management system that restrict access to data and actions people may take when they access data
Backup Facilities
DBMS copy utility that produces backup copy of the entire database or subset Periodic backup (e.g. nightly, weekly) Cold backup-database is shut down during backup Hot backup-selected portion is shut down and backed up at a given time Backups stored in secure, off-site location
recovery manager
DBMS module that restores a database after a failure
vertical partitioning
Distribution of the columns of a logical relation into several separate physical tables -Useful for situations where different users need access to different columns -The primary key must be repeated in each file
horizontal partitioning
Distribution of the rows of a logical relation into several separate tables -Useful for situations where different users need access to different rows -Three types: Key Range Partitioning, Hash Partitioning, or Composite Partitioning
CHAR(length) max 200 characters
Fixed length character data. N CHAR is Unicode.
•Substitute an estimate of the missing value (e.g., using a formula) •Construct a report listing missing values •In programs, ignore missing data unless the value is significant (sensitivity testing)
How to Handle Missing Data:
disk mirroring, backups, and the maintenance of transaxtion and changelogs
How to make a system fault tolerant
The key focus of SOX audits is around three areas of control
IT change management; Logical access to data; IT operations
Information Needed for Physical Design
Normalized relations, including estimates for the range of the number of rows in each table; Definitions of each attribute, along with physical specifications such as maximum possible length; Descriptions of where and when data are used in various ways (entered, retrieved, deleted, and updated), including typical frequencies of these events; Expectations or requirements for response time and data security, backup, recovery, retention, and integrity; Descriptions of the technologies (database management systems) used for implementing the database
secondary key
One field or a combination of fields for which more than one record may have the same combination of values. Also called a nonunique key
I T operations
Policies and procedures for day-to-day management of infrastructure, applications, and databases
NUMBER
Positive or negative number. NUMBER(5) means a 5 digit integer. NUMBER(5,2): 5 digits, two to the right of the decimal point.
I T change management
Processes by which changes to operational systems and databases are authorized
Logical access to data
Security procedures to prevent unauthorized access; Personnel controls and physical access controls
system catalog
System-created database that describes all database objects
backward recovery
The backout, or undo, of unwanted changes to a database; rollback
encryption
The coding or scrambling of data so that humans cannot read them
database administrator (DBA)
The organizational role that typically has the primary responsibility for physical database design
denormalization
The process of transforming normalized relations into non-normalized physical record specifications
indexed file organization
The storage of records either sequentially or non-sequentially with an index that allows software to locate individual records
sequential file organization
The storage of records in a file in sequence according to a primary key value
VARCHAR2(length) max 400 characters
Variable-length character data. A string that is shorter than the maximum length will consume only the required space. N VARCHAR2 is Unicode.
extent
a contiguous section of disk storage space
index
a data structure used to determine in a file the location of records
A data type
a detailed coding scheme recognized by a system software, such as a DBMS, for representing organizational data. Also specifies the bit pattern of data stored in that coding scheme.
pointer
a field not containing business data
physical file
a named area of secondary memory for the purpose of storing physical records
Segment
a table, index, or partition
Range control
allowable value limitations (constraints or validation rules)
Null value control
allowing or prohibiting empty fields
Forward Recovery (Roll Forward)
apply after images (preferable to restore/rerun)
Backward Recovery (Rollback)
apply before images
Default value
assumed value if no explicit value
backup facility
automatically produces a saved copy of an entire database
Extent
contiguous section of disk space
hashing algorithm
converts a key value into an address
Goal of physical database design
create a design for storing data that will provide adequate performance and ensure database integrity, security, and recoverability
-Security
data not relevant to users are segregated
The primary goal of physical database design is:
data processing efficiency.
data dictionary
documents data elements of a database
-Local optimization
each partition can be optimized for performance
Assertions
enforce database conditions
Input to the physical design comes from:
outputs from requirements analysis and logical design
-Load balancing
partitions stored on different disks, reduces contention
Triggers
prevent inappropriate actions, invoke special handling procedures, write to log files
Sarbanes- Oxley Act (SOX)
protect investors by improving accuracy and reliability legislates importance of financial data integrity.
Referential integrity
range control (and null value allowances) for foreign-key to primary-key match-ups
-Efficiency
records used together are grouped together
Four objectives of a data type:
represent all possible values; improve data integrity; support all data manipulations, minimize storage space
Restore/Rerun
reprocess transactions against the backup (only done as a last resort)
database recovery
restoring a database after a loss
Domains
set allowable values
-Recovery and uptime
smaller files take less back up time
Field
smallest unit of application data recognized by system software
Data block
smallest unit of storage
Disk Mirroring
switch between identical copies of databases
Without careful physical design, an organization cannot demonstrate:
that its data are accurate and well protected. These listed standards, guidelines, and rules focus on corporate governance, risk assessment, and security and controls of data.
Purpose of physical database design
translate the logical description of data into the technical specification for storing and retrieving data
Disadvantages of Denormalization
•Costs (due to data duplication) -Wasted storage space -Data integrity/consistency threats