Chapter 4

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

Primary Key

¤ An attribute or combination of attributes that can be used to uniquely identify a specific row (record) in a table. ¤ In some tables, two or more attributes may be joined to form the primary key. Look at the Student x Course • Primary key is the Student # and Course ID #. ¤ The primary key in inventory is the item # ¤ The primary key in sales is the invoice # ¤ The primary key in customer is the customer #

Entity

¤ An entity is anything about which the organization wishes to store data. ¤ At your college or university, one entity would be the student.

Data Hierarchy

¤ Field stores the attributes about an entity ¤ Record are a related group of fields ¤ File is a related group of records ¤ Database is a related group of files

Relational Database Model

¤ Relational database model represents the conceptual and external level schemas as if data are stored in tables. ¤ Table • Each row, a tuple, contains data about one instance of an entity. This is equivalent to a record • Each column contains data about one attribute of an entity. This is equivalent to a field

Schemas

¤ Schemas describe the logical structure of a database ¤ External Level is a set of individual user views of portions of the database, i.e., how each user sees the portion of the system with which he interacts. • These individual views are referred to as subschema. ¤ Conceptual Level is the organization-wide view of the entire database—i.e., the big picture. • Lists all data elements and the relationships between them. ¤ Internal Level is a low-level view of the database. • It describes how the data are actually stored and accessed including: 1. Record layouts 2. Definitions 3. Addresses 4. Indexes

DBMS Languages Function: 2. Changing a database (DML)

¤ The set of commands used to change the database is known as data manipulation language (DML). ¤ DML is used for maintaining the data including: 1. Updating data 2. Inserting data 3. Deleting portions of the database

Advantages of Database Systems

1. Data Integration: Files are logically combined and made accessible to various systems. 2. Data Sharing: With data in one place it is more easily accessed by authorized users. 3. Minimizing Data Redundancy and Data Inconsistency: Eliminates the same data being stored in multiple files, thus reducing inconsistency in multiple versions of the same data. 4. Data Independence: Data is separate from the programs that access it. Changes can be made to the data without necessitating a change in the programs and vice versa. 5. Cross-Functional Analysis: Relationships between data from various organizational departments can be more easily combined.

Design Requirements for Relational Database

1. Every column must be single valued. 2. Primary keys must contain data (not null). 3. Foreign keys must contain the same data as the primary key in another table. 4. All non-key attributes must identify a characteristic of the table identified by the primary key.

There are two basic ways to design well-structured relational databases

1. Normalization • Starts with the assumption that everything is initially stored in one large table. • A set of rules is followed to decompose that initial table into a set of normalized tables. • Objective is to produce a set of tables in third-normal form (3NF) because such tables are free of update, insert, and delete anomalies. 2. Semantic data modeling • Database designer uses knowledge about how business processes typically work and the information needs associated with transaction processing to draw a graphical picture of what should be included in the database. • The resulting graphic is used to create a set of relational tables that are in 3NF.

Proliferation of master files problems

1. Often the same information was stored in multiple master files. • Example: Fact A, B, and C are in multiple different programs. 2. Made it more difficult to effectively integrate data and obtain an organization-wide view of the data. 3. Also, the same information may not have been consistent between files. • If a student changed his phone number, it may have been updated in one master file but not another. • A magazine customer has changed addresses and called the company's shipping department to inform them of the change. The shipping department has updated the address of the customer but the billing department may still have the old address.

Logical vs. Physical

Physical View: Depends on explicitly knowing: ¤ How is the data actually arranged in a file and ¤ Where is the data stored on the computer Logical View: How people conceptually organize and understand the relationship between data items. ¤ A Schema separates storage of data from use of the data ¤ Unnecessary to explicitly know how and where data is stored.

Database Overview

¤ A database is a set of inter-related, centrally coordinated files. ¤ The database approach treats data as an organizational resource that should be used by and managed for the entire organization, not just a particular department. ¤ A database management system (DBMS) serves as the interface between the database and the various application programs. ¤ The combination of the database, the DBMS, and the application programs that access the database is referred to as the database system. ¤ The person responsible for the database is the database administrator. ¤ Data warehouses is a large database with detailed and summary data that contains data related to many different years. Used for analysis rather than transaction processing. • Complements transaction processing, it doesn't replace it. • It is periodically updated. • Expensive to maintain but worth it because it speed up queries.

Design Requirements for Relational Database: Foreign keys must contain the same data as the primary key in another table.

¤ A foreign key must either be null or correspond to the value of a primary key in another table. ¤ This rule is referred to as the referential integrity rule. • Ensures there is no update anomaly ¤ The rule is necessary because foreign keys are used to link rows in one table to rows in another table.

Data Dictionary

¤ A key component of a DBMS is the data dictionary. ¤ Contains information about the structure of the database. ¤ For each data element, there is a corresponding record in the data dictionary describing that element. ¤ Information provided for each element includes: 1. A description or explanation of the element. 2. The records in which it is contained. 3. Its source. 4. The length and type of the field in which it is stored. 5. The programs in which it is used. 6. The outputs in which it is contained. 7. The authorized users of the element. 8. Other names for the element.

File

¤ A set of all related records forms a file (e.g., the student file). ¤ If this university only had three students and five fields for each student, then the entire file would be depicted in the image.

Database

¤ A set of interrelated, centrally coordinated files forms a database. ¤ A database consolidates records previously stored in separate files into a common pool and serves a variety of users and data processing applications.

Design Requirements for Relational Database: All non-key attributes must identify a characteristic of the table identified by the primary key.

¤ All non-key attributes in a table should describe a characteristic of the object identified by the primary key. ¤ Could nationality be a non-key attribute in the student table? • Yes, it describes the student ¤ Could advisor's nationality be a non-key attribute in the student table? • No, it describes the advisor, not the student

Record

¤ All the fields containing data about one entity (e.g., one student) form a record. ¤ The example shows the record for Artie Moore.

Foreign Key

¤ An attribute in one table that is a primary key in another table. ¤ Used to link the two tables ¤ The customer # is the primary key in the customer table but is a foreign key in the sale table. ¤ The advisor # is the primary key in the advisor table but is the foreign key for the student table.

Bad Alternatives to Relational Database Model: 2. One row with multiple columns

¤ Another possible approach would be to store each student in one row of the table and create multiple columns to accommodate each class that he is taking. ¤ Why this is bad: 1. Lots of repeating rows vs. Lots of Repeating Columns 2. Data is may be different for each column so you must query each column thus time-consuming and inefficient. 3. Lots of Empty Data 4. You don't know how many columns you may be need

Database Key Terms & Terminology

¤ Business Intelligence: Analyzing large amounts of data for strategic decision making • Also called Data Analytics • Two main techniques in business intelligence are OLAP and Dating mining. ¤ Online Analytical Processing (OLAP): Using Queries to investigate hypothesized relationships among data • If a store wants to know how many sales of a particular item have been sold within a period of time, like a gaming console during December. ¤ Data Mining: Using sophisticated statistical analysis to "discover" hypothesized relationships in the data. • Credit card companies use data mining to find fraudulent usage patterns. • Stores use data mining to analyze goods for sale that may not be intuitive. Like during a crisis, Pop-Tarts intuitively would not be thought of as a crisis food item but data mining showed historically they sold in volume. ¤ Record Layout: Document shows which items are stored in a file including the order and length of the data fields.

Database's Purpose

¤ Database systems were developed to address the problems associated with the proliferation of master files. ¤ For years, each time a new information need arose, companies created new files and programs. • The result: a significant increase in the number of master files.

DBMS Languages

¤ Every DBMS must provide a means of performing the three basic functions of: 1. Creating a database (DDL) • Should be restricted to authorized administrators and programmers. 2. Changing a database (DML) • Should be restricted to authorized administrators and programmers. 3. Querying a database (DQL) and Report Writer • Availaible to all users

Database Design Errors

¤ If database is not designed properly data errors can occur. 1. Update Anomaly: Changes to existing data are not correctly recorded. • Due to multiple records with the same data attributes • Updating one may not update all instances of the data. 2. Insert Anomaly: Unable to add a record to the database. • (Two things must be added at once, impossible to add only one of them until the other happens). • Students can't register for class unless they are assigned an advisor 3. Delete Anomaly: Removing a record also removes unintended data from the database. • Deleting the only bill for a customer ends up deleting all information about the customer.

Logical and physical views of data in database-oriented systems

¤ In database-oriented systems, the logical and physical views of data are separated. ¤ Separating these views facilitates application development, because programmers can focus on coding the logic and not be concerned with storage details. ¤ The DBMS handles the link between the physical and logical views of the data. • Allows the user to access, query, and update data without reference to how or where it is physically stored. • The user only needs to define the logical data requirements. ¤ Separating the logical and physical views of data also means users can change their conceptualizations of the data relationships without making changes in the physical storage. ¤ The database administrator can also change the physical storage of the data without affecting users or application programs.

Logical and physical views of data in file-oriented systems

¤ In file-oriented systems, programmers must know the physical location and layout of records used by a program. • They must reference the location, length, and format of every field they utilize. • When data is used from several files, this process becomes more complex.

Attributes and Fields

¤ Information about the attributes of an entity (e.g., the student's ID number and birth date) are stored in fields.

Bad Alternatives to Relational Database Model: 1. One uniform table

¤ One possible alternate approach would be to store all data in one uniform table. • Many flat file application based systems use this method. ¤ For example, instead of separate tables for students and classes, we could store all data in one table and have a separate line for each student x class combination. ¤ Why this is bad: 1. This can be very large and you don't know how many rows you need. 2. Data is repeated several times 3. You have to update all data that it pertains to individually. Like if Simpson change their phone #, it must be updated for all instances and you could forget one or many. 4. Insert data can be pain, it would not be simple and you will have to reorganize data for it to be presentable or know where to go specifically or having to wait until a required action is done to insert. 5. If you delete data, it is gone permanently if there are not alternative copies or multiple instances of the data to reference.

Advantages of Semantic data over normalization rules

¤ Semantic data modeling uses the designer's knowledge about business processes and practices • Therefore it facilitates the efficient design of transaction processing databases. ¤ The resulting graphical model explicitly represents information about the organization's business processes and policies and facilitates communication with intended users.

Why is Design Requirements for Relational Database Important

¤ The preceding constraints produce a well-structured (normalized) database in which: • Data are consistent. • Redundancy is minimized and controlled. ¤ In a normalized database, attributes appear multiple times only when they function as foreign keys. ¤ The referential integrity rule ensures there will be no update anomaly problem with foreign keys.

Design Requirements for Relational Database: Primary keys must contain data (not null).

¤ The primary key uniquely identifies a specific row in the table, so it cannot be null, and it must be unique for every record. • No duplicate primary keys so they are unique for each table. ¤ Primary keys reference a table, you can't reference nothing. ¤ This rule is referred to as the entity integrity rule.

DBMS Languages Function: 1. Creating a database (DDL)

¤ The set of commands used to create the database is known as data definition language (DDL). ¤ DDL is used to: 1. Build the data dictionary 2. Initialize or create the database 3. Describe the logical views for each individual user or programmer 4. Specify any limitations or constraints on security imposed on database records or fields

DBMS Languages Function: 3. Querying a database (DQL) and Report Writer

¤ The set of commands used to query the database is known as data query language (DQL). ¤ DQL is used to interrogate the database, including: 1. Retrieving records 2. Sorting records 3. Ordering records 4. Presenting subsets of the database ¤ The DQL usually contains easy-to-use, powerful commands that enable users to satisfy their own information needs. ¤ Report Writer: Language that simplifies report creation. • Used in conjunction with Data Query Language • Users specify the data elements they want printed, and the report writer searches the database, extracts the data elements, and prints them in the user-specified format.

Relational Database and Why It Works

¤ The solution to the bad database problems is to use a set of tables in a relational database. ¤ Each entity is stored in a separate table, and separate tables or foreign keys can be used to link the entities together. ¤ There are no update, insert, and delete anomalies since tables are separated.


Ensembles d'études connexes

Chp. 35: Assessment of immune Function

View Set

Chapter 5 - The Executive Branch - The Presidency and Vice Presidency - Civics E Dawidczyk

View Set

Chapter 18 - Performance and Discharge

View Set

Primal or Local Religions Study Guide

View Set

EMT - Chapter 13 - Patient Assessment

View Set