Database Quiz 2
Relationship level integrity
The connection between the two tables or key fields in a relationship is sound Can insert new records into each table in a meaningful manner Can delete an existing record without producing any adverse effects A meaningful limit to the number of records that can be interrelated within the relationship
Validation table: stores data that you specifically use to implement data integrity
Usual comprised of two fields Primary key and is what you'll use to help you enforce data integrity Non-key field that stores a set of values required by some other field in the database
equijoin
a join based on the testing of equality between two tables also called an inner join
join
a mechanism used to associate tables within a SELECT statement and thus join them returns a single set of output does not exist in the actual database tables must specify all the tables to be included and how they are related to each other
entity-relationship diagram
a specialized graphic that illustrates the relationships between entities (tables) in a database aka entity relationship model
business rule
a statement that imposes some form of constraint on a specific aspect of the database, such as the elements within a field specification for a particular field or the characteristics of a given relationships
scale
able to handle an increasing load without failing well-designed database or application is said to scale well
Business rules
are the final component of overall data integrity o Definition: statement that imposes some form of constraint on a specific aspect of a database based on an organizations expectation of data utilization determined by the way it functions and conducts business. o Each organization will have unique needs that drive specific business rules i.e. Order date can't be prior to product received date • Agent can't have more than 5 clients o Some business rules can be established as constraints within the logical design of the database, and others outside of the logical design. o 2 Types of Business Rules: Data Base Oriented - within logical DB design • Done via field specifications and/or relationship characteristics Application Oriented - can't be done in logical DB design, and therefore established during physical design or within the application design
views
are virtual tables composed of fields from one or more base tables and possibly fields from other views
table diagramming
boxed corners
relational level integrity
connection between 2 tables in a relationship is sound -done via PK and FK for 1-1 and 1-M -done via linking tables for M-M you can insert new records into each table in meaningful way -done via Type of Participation you can delete an existing record without adverse effects -done via deletion rules there is a meaningful limit to the number of records that can be interrelated in a relationship -done via degree of participation
set default
delete the record in the parent table and will then update the foreign key values of related records in the child table to the current default value logical element setting in the foreign key's field specifications; must have a setting for the default value
Five types of deletion rules
deny restrict cascade nullify set default
relational tables (SQL lesson 12)
designed so that information is split into multiple tables, one for each data type scale far better than non-relational databases allows for efficient storage, easier manipulation, and greater scalability
degree of participation
determines the minimum number of records a given table must have and the maximum number of records allowed to have in a relationship reflected on both sides of the table unlimited max: N
deletion rules
determines what the RDBMS should do with a child record when a parent record is deleted helps protect against "orphaned records"
Natural join
does everything an inner join does but eliminates multiple occurrences so that only one of each column is returned you select only columns that are unique -usually using a wildcard for one table and explicit subsets of the columns for all other tables
oracle and alias
don't use AS example: instead of Customers AS C you use Customers C
refining all FK
elements of a FK: -same name as PK from which it was copied --occasionally will break rule and use Alias when necessary -uses a replica of the field specs for the PK from which it was copied --general elements may need to be updated ---FK specification type = replica ---parent table = name of FK parent table ---source specification = name of the parent PK field ---description = indicates the FK purpose --logical elements may need to be updated ---key type = FK ---uniqueness = non unique for 1-M; unique for 1-1 ---values entered by = user ---range of values = uses range from parent pk ---edit rule = enter now, edits allowed -draws values from the pk to which it refers
connection
establish a connection between a pair of tables that are logically related to each other
Relationships
exist when you want to associate records from one table to another allows you to draw data from multiple tables simultaneously a proper relationship ensures relationship level integrity (referential integrity) which is part of overall data integrity
self referencing relationship
exists between the records within a table when a given record in the table is related to other records within the table can be 1-1, 1-M, M-M 1-1: exists when a given record in the table can be related to only one other record within the table 1-M: a given record in the table can be related to one or more other records within the table M-M: when a given record in the table can be related to one or more other records within the table AND one or more records can themselves be related to the given record
type of participation
in a relationship between 2 tables, each table participates in a particular manner determines whether a record must exist in that table before you enter a record into the related table 2 types: mandatory and optional ball or extra vertical line for diagramming
problems with many to many relationships
it will be tedious and somewhat difficult to retrieve information from one of the tables one of the tables will contain a large amount of redundant data duplicate data will exist within both tables it will be difficult for you to insert, update, and delete data
cross join
join that returns a Cartesian Product
joining tables problem
joins at run-time meaning it is very resource intensive meaning more performance degrade
logical elements category for field specification for a FK
key type: foreign uniqueness: non-unique values entered: user range of values: set this element in such a way that you can enter only existing values from the parent PK edit rule: enter now edits allowed or enter later edits allowed
relationship types
means cardinality 3 types: --One to one (1-1) --One to many (1-M) --Many to many (M-M) two tables participate in ONLY ONE type of relationship at any given time with each other
how do you decide which deletion rule to use
need to understand business requirements and application intent use Restrict as a matter of course and others as needed pose the questions using the tables in the relationship
Data Base Oriented Business Rules
o 2 Categories: Field Specifications - imposes constraints on the elements of a field • Ex: Date of Birth must be show as MM/DD/YYYY o Use the field specification "Display Format" element Relationship Specific - imposes constraints on the characteristics of a relationship between 2 tables • Ex: A manager must have at least 5 employees, but not more than 30 o Use degree of participation (5,30) o Defining and Establishing Business Rules Suggested approach - define field specifications first; relationships second Interview key users and management - group meeting recommended
view types
o 3 Logical - Data, Aggregate, and Validation o 2 Physical - materialized and partitioned (RDBMS specific and not within scope of class)
why use a view?
o Allows you to work with data from multiple tables o Reflects the most current information o Customizable to meet various organizational/business needs o Helps enforce data integrity via a Validation Views (similar to Validation Table) o Provides another method for protecting sensitive/private information where only certain users can access information
validation tables
o As we know, constraints can be imposed on a field to use a specific set of values via the fields Range of Values element in the field specifications o If there are many values and/or they change frequently, it may be difficult to define via field specifications o Validation tables allow you to store data specifically used for data integrity validation - aka "Look up tables" o Usually have 2 fields - PK, Non-key o Diagramming Technique: See State Table below box with curved edges.
data view
o Can pull data from single or multiple tables: Single Table View • Uses some fields from the base table • RDBMS rebuilds view every time accessed • As DB table changes, so does the view Multi Table View • Requires tables used in the view to be in a relationship • Uses some fields from the various base tables • RDBMS rebuilds view every time accessed • As DB table changes, so does the view
Calculated Fields
o Decide carefully if you need a calculated field o Determine if a function can be used for your calculated field Check RDBMs documentation - i.e. Count, Max o Review the list of calculated fields created during the design process - Analyze Current Database Phase o If you create a new calculated field that is not on the list previously created, add it to the list to keep track of all calculated fields
Defining Views Process
o Decide what type of views (Data, Aggregate, Validation) o Decide what tables and fields needed for each view o Determine what if any calculated fields needed (aggregate view) o Record it in a View Diagram o Create View Specification Sheet
Important points on RDB Views:
o Pulls data from base tables o Don't store data in its own table o RDBMS stores the structure of the view o RDBMS rebuilds and populates the view every time it is accessed o Some RDBMs refer to Views as "Saved Queries"
View Specification Sheet
o Record the characteristics of the view o Name - follow similar guidelines for creating table names, but you may use more than one subject for the table name identification o Type - Data, Aggregate, Validation o Base Tables - name the base tables used in the view o Calculated Field & Expressions - Follow guidelines for creating field names, but you may refer to more than one characteristic in the field name o Filter - record the criteria (condition) that the view will use to filter the records it will display
Relationship Specific Business Rules Process
o Select a relationship Ask - What kind of information do these tables provide? o Review the relationship and determine if requires a constraint o Define business rule for the relationship Example: An agent can't have more than 5 clients. o Establish rule by modifying the appropriate relationship characteristics - Update the diagram accordingly: • - degree of participation, type of participation, deletion rules o Determine what actions will test the rule - Check what happens when you try to insert, delete, or update a record in the table or a value into a field o Record the rule on the Business Rules Specification Sheet
Field Specifications Business Rules Process
o Select a table (will do for all tables) Ask - How does the organization use info in this table? o Review each field and determine if requires a constraint o Define business rule for the field Example: A client must be associated with an agent. o Establish rule by modifying the appropriate field specifications - need to first identify which element - Example: A client must be associated with an agent • - Logical Element - Required Value = Yes • - Null Support = No Nulls • - Edit Rule = Enter Now, Edits Allowed o Determine what actions test the rule Check what happens when you try to insert a record in the table or a value into a field; delete a record or a value in a field; or update a field value o Record the rule on the Business Specification Sheet - do it for all rules - DB and Application Oriented - you will have them in one place and in a standard format
Advanced Joins
o Self Joins - SQL refers to the same table more than once in a single select o Natural Joins - is similar to the standard inner join, but eliminates multiple occurrences of the same column since joined tables usually have at least 1 common column name o Outer Joins - Where most joins relate rows in Table A with rows in Table B, an outer join allows you to include rows that have no related rows. - Example - you want to list agents (table 1) and number of clients (table 2), but you want the output list to also list the agents who have no assigned clients.
validation view
o Similar to a validation table; helps with data integrity o Draws data from base tables o Unlike a Validation Tables that stores its own data o Restricts Fields that the view will have from the base tables (limits access)
aggregate view
o Special view that aggregates data in a particular manner o Like the Data View, it can draw data from one or more base tables o Can include calculated fields; list them in the view o Uses aggregate functions - most common are: Sum, Avg, Min, Max, Count o So, what does the RDBMS do in our Class Registration example: Rebuilds the Class Registration view when accessed Brings forward the most current data from the two base table (Classes and Student Classes) Since Total Student Reg will be defined as a Count function, it will return a single number for each Class Name by aggregating the data Eliminated the need to keep track of redundant data (class name) since the instances of class data are grouped into a single instance • Class Name becomes a "grouping field" and its values can't be modified
Filter the Data
o You can impose criteria against one or more fields in the view to filter records it displays / makes available o Use the minimum number of criteria that will cause the view to display the records you want o Field you impose criteria on must exist in the view o Note - you will not be able to indicate a filter criterion on the View Diagram; it is recorded on the View Specification Sheet
special notes
o You can update a view and data will flow through to the base tables Updates will be constrained by field specification and business rules Can't modify values of a PK that belongs to one of base tables o View can have redundant data since merging 1 record from 1 table to multiple records from another table (this is ok since don't actually store the data) o View doesn't have a PK since not an actual table
How do you know if you need a view and which to use?
o You start by defining a basic set at this phase of DB design o You look back at information gathered from Mission Statement, Report Samples, Screen Designs, Tables/Subject Notes, Table Relationships, Business Rules...and any other special notes to create initial set of view o Some will surface when you implement the DB design o Others will surface as you create the application programs
alias names
often used for column names and calculated fields can be used for tables too
self join
often used to replace statements using subqueries that retrieve data from the same table as the other statement DBMS usually processes joins faster than subqueries subqueries have multiple selects
linking table
original M-M relationship is dissolved beacuse there isn't a direct relationship between table A and table B contains 2 FKs since copy of PK from 2 tables in relationship has a composite primary key composed of PK from the 2 tables in the relationship keeps redundant data to an absolute minimum name reflects the purpose of the established relationship sometimes you will have to move/add fields to the linking table to reduce data redundancy and to refine the structure of tables in a relationship
data
pair of tables are logically related via the data each contains
orphaned records
records in child table that have no relationship to any records in the parent table
self referencing relationship (recursive relationship)
relationships that exist between records in the same table can be 1-1, 1-M, M-M
multi-valued fields
remove the field from the table and use it as the basis for a new table use a field from the original table to relate the original table to the new table; try to select fields that represent the subject of the table as closely as possible assign an appropriate name, type, and description to the new table and add it to the final table list
elements of the ideal table
represents a single subject, which can be an object or event primary key does not contain multivalued fields not contain calculated fields not contain unnecessary duplicate fields contains only an absolute minimum amount of redundant data
Cartesian Product
results returned by a table relationship without a join condition number of rows retrieved will be the number of rows in the first table multiplied by the number of rows in the second table
important items to note about joins
select has columns from 2 tables where clauses serves as the join condition where clause column names are fully qualified if a where clause was not used, every records in 1st table would be joined with every record in 2nd table (cartesian product) -likely not what you want Do NOT skip where clause
two reasons to alias tables
shorten the SQL syntax to enable multiple uses of the same table within a single SELECT statement
M-M
single record in table A can be related to one or more records in table B AND a single record in table B can be related to one or more records in table A 2nd most common relationship M-M requires a linking table to establish relationship --linking table is a new table established by: ---taking PK from each table in relationship ----2 keys together serve as the table's composite primary key (CPK) ----each is a unique FK that establishes relationship between the parent table and the linking table ---name linking table using name that represent relationship ---add linking table to the final table list and update its table type and table description --diagram for linking table: | | student classes | |
1-M
single record in table A is related to one or more records in table B BUT a single record in table B is related to only one record in table A most common relationship helps eliminate duplicate/redundant data as style and when possible, "one" side of relationship is placed on left side of diagram similar techniques as 1-1 PK from the "one" side of relationship is added to the "many" side of the relationship as a FK
1-1
single record in table A is related to only one record in table B single record in table B is related to only one record in table A -usually involves a subset table (rounded corners) one table behaves as a parent table and other the child record must exist in parent table before it can exist in child table basic logic/common sense helps determine which is parent and child example: STAFF (data table as parent) and COMP (subset table as child)
review table structure of each table
single subject, has PK, no multi part/value fields, no calculated field, no duplicate fields, absolute minimum of redundancy refine as needed
general elements category for field specification for a FK
specification type - replica parent table - name of the parent table source specification - indicate the name of the parent PK description - indicates the FK purpose
cascade
take two specific actions - it will delete the record in the parent table and will also automatically delete all related records in the child table
Degree of participation:
the minimum number of records that a given table must have associated with a single record in the related table and the maximum number of records that the table is allowed to have associated with a single record in the related table Example: (1, 1)
self referencing M-M
use linking table same as in dual table relationships: -fields come from the same parent table -sometimes you have to create a field
Outer joins
when you might want to include rows that have no related rows RIGHT or LEFT is used to specify which table you want all the rows from -RIGHT for the one on the right of OUTER JOIN -LEFT for the one on the left of OUTER JOIN FULL OUTER JOIN gets all rows from both tables
inner join
where you test for equality between 2 tables
nullify
will delete the record in the parent table and will then update the foreign key values of related records in the child table to null; must modify the foreign key's specifications and set the null support logical element to nulls allowed
deny
will not delete the record in the parent table but will instead keep the record and designate it as inactive
restrict
will not delete the record in the parent table if related records exist in the child table; must have it delete all of the related records in the child table before you can have it delete the record in the parent table
example for business rule
• Example: Business rule that constrains using vendors from certain states. o - Single record in state can be associated with one or more records in Vendors (Cardinality) o - Vendor must be associated to one record in state (Type of Participation: Optional & Mandatory) o - There must be at least 1 rec in state table to assign rec in Vendors (Degree of Participation: Min, Max) o - Don't delete records from state table as long as you have vendors doing business there (Deletion Rule)
relationships characteristics determine
-what will occur when you delete a record -type of relationship each table bears in the relationship -degree of participation each table bears in the relationship
establishing each relationship
1-1 and 1-M relationships use PK and FK M-M relationships use linking table
process of identifying table relationships
1. create a matrix of all tables in your database --list all tables across the top and left of the matrix 2. ask interview questions about records in each table --associative ----can a single record in classes be associated with one or more records in buildings? --contextual ----ownership - can a single order contain one or more products? ----action oriented - can a staff member supervise one or more staff members (self referencing)? 3. indicate relationship type in appropriate matrix cell --1-1, 1-M, M-M --some cells will be blank because no relationship exists --start at top and complete for all tables 4. once done going across top, repeat same going down left hand side of matrix 5. using the results of the matrix, must make a determination of the official relationship between 2 tables --rule set: ----1:1 + 1:1 = 1:1 ----1:M + 1:1 = 1:M ----1:M + 1:M = M:M
two common methods to beat the problems with M-M
1. taking a field from one table and incorporating it a given number of times within the other table 2. take one or more fields from one table and incorporate a single instance of each field within the other table (unnecessary duplicate fields, large amount of redundant data, difficult to insert a new record, difficult to delete a record)
self referencing
1:1 and 1:M will use PK and FK in dual table realtionships FK will reside in the same table as PK FK is often already in table structure; if not, create one
Types of business rules
Database-oriented: impose constraints that you can establish within the logical design of the database; modifying various field specification elements, relationship characteristics, or a combo of the two Application-oriented: impose constraints that you cannot establish within the logical design of the database; established with the physical design or within the design of a database application
Why relationships are important
Establishes a connection between a pair of tables that are logically related to each other Helps further refine table structures and minimize redundant data Enables you to draw data from multiple tables simultaneously
Categories of business rules
Field-specific business rules: impose constraints on the elements of a field specification for a particular field Relationship-specific: impose constraints that affect the characteristics of a relationship
Defining the participation for each table
Mandatory: must be at least one record in this table before you can enter any records into the related table Optional: no requirement for any records to exist in this table before you can enter records into the related table
restrict deletion
RDBMS does not delete parent table record if it has related records in the child table
deny deletion
RDBMS does not delete parent table record, but makes it inactive
cascade deletion
RDBMS will delete parent table record and all related records in the child table (can be dangerous)
set default deletion
RDBMS will delete parent table record and updates the FK value of related child table record with "default value" from logical specification setting
nullify deletion
RDBMS will delete parent table record and will then update the FK value in the child table to NULL (requires the FK specification rule to set the NULL support logical elements to NULLS Allowed)
natural join code example
SELECT C.*, O.order_num, O.order_date, OI.prod_id, OI.quantity, OI.item_price FROM Customers AS C, Orders AS O, OrderItems AS OI WHERE C.cust_id = O.cust_id AND OI.order_num = O.order_num AND prod_id = 'RGAN01';
aggregate functions and joins
SELECT Customers.cust_id, COUNT(Orders.order_num) AS num_ord FROM etc ON etc GROUP BY Customers.cust_id; group by - counts the number of orders for each customer and returns it as num_ord
outer join code example
SELECT Customers.cust_id, Orders.order_num FROM Customers LEFT OUTER JOIN Orders ON Customers.cust_id = Orders.cust_id;
table alias code
SELECT cust_name, cust_contact FROM Customers AS C, Orders AS O, OderItems AS OI WHERE C.cust_id = O.cust_id AND OI.order_num = O.order_num AND prod_id = 'RGAN01';
self join code example
SELECT etc FROM Customers AS c1, Customers AS c2 WHERE c1.cust_name = c2.cust_name AND c2.cust_contact = 'Jim Jones';
equijoin code
SELECT vend_name, prod_name, prod_price FROM Vendors INNER JOIN Products ON Vendors.vend_id = Products.vend_id; uses ON instead of WHERE when you specify the INNER JOIN you pass the same info that a WHERE would use to the ON
join code example
SELECT vend_name, prod_name, prod_price FROM Vendors, Products WHERE Vendors.vend_id = Products.vend_id; you must include a where because you have to instruct the DBMS how to join the tables
table joins
SQL can join tables real time within SQL Select statement retrieving data across the related tables is achieved using a join multiple tables are joined to return one output set by associating the applicable rows from each table participating in the join the join itself is not a physical nor a logical data structure -it is created by the RDBMS as needed when instructed via SQL -db definitions created do not instruct the RDBMS how to do a join -however the right logical DB design enables a join to work well
Defining and establishing relationship-specific business rules
Select a relationship -What kind of information do these tables provide? -Why is the relationship between these two tables important? Review the relationship and determine whether it requires any constraints -Is there a need to impose some type of limitation on this relationship based on the way the organization functions or conduct its business? -Is there a need to impose some type of limitation on this relationship based on the way the dance studio functions or conducts its business? --If yes, go to the next step Define the necessary business rules for the relationship Establish the rule by modifying the appropriate relationship characteristics Determine what actions will test the rule -Are there circumstances under which this rule will be violated if I enter a new record into this table? -Will this rule be violated if I do not enter a new record into this table? -Will this rule be violated if I delete a record from this table? Record the rule on business rule specifications sheet
Defining and establishing field-specific business rules
Select a table Review each field and determine whether it requires any constraints -Based on how the table is used within the database, is a constraint necessary for any element within this specifications? Define the necessary business rules for the field Establish the rules by modifying the appropriate field specification elements Determine what actions test the rule -Will this rule be violated if --I enter a new record into this table? --I do not enter a new record into this table? --I delete a record from this table? --I enter a value into this field? --I do not enter a value into this field? --I update the value of this field? --I delete the value of this field? Record the rule on a business rule specifications sheet -It allows you to document every database-oriented business rule -It allows you to document every application-oriented business rule -It provides a standard method for recording all business rules -Business rule specifications sheet
Using validation tables to support business rules
Set the relationship's characteristics in this manner -Deletion rule: define a restrict deletion rule for this relationship -Type of participation -Degree of participation
Business rule specifications sheet
Statement: this is the text of the business rule itself. It should be clear and succinct and should convey the required constraints without any confusion or ambiguity. Constraint: this is a brief explanation of how the constraint applies to the tables and fields. Type: state whether the rule is database oriented or application oriented Category: where you indicate whether the rule is field specific or relationship specific Test on: where you indicate which actions (insert, delete, update) will test the constraint the business rule imposes Structures affected: depending on the type of the business rule, the constraint will affect either a field or a relationship. This is where you designate the name of the field(s) the rule will affect or the name of the table(s) involved in the relationship that the rule affects Field elements affected: a business rule that pertains to a field can affect one or more elements of that field's specifications. This is where you indicate the elements the rule affects Relationship characteristics affected: a business rule that pertains to a relationship will affect one or more of the relationship's characteristics. Here is where you indicate the characteristics that the rule affects Action taken: here you indicate the modifications you've made to the elements of a field specification or to a relationship diagram; statement needs to be clear and unambiguous as possible; serves as accurate documentation of the steps you have taken to establish the rule