Database Planning and Management
Three types of database design
- From existing data: analyze spreadsheets and other data tables, extract data from other databases, and design using normalization principles.
- New systems development: create a data model from application requirements, then transform the data model into a database design.
- Database redesign: migrate databases to newer databases, integrate two or more databases, and reverse-engineer and design new databases using normalization principles and data model transformation.
Metadata Tables
Tables of user data, metadata, indexes, user-defined functions, stored procedures, triggers, security data, and backup and recovery data.
First Normal Form (1NF)
Meets the set of conditions for a relation and has a defined primary key.
SQL Categories
1. Data definition language (DDL): used for creating tables, relationships, and other structures.
2. Data manipulation language (DML): used for queries (SQL SELECT statement), inserting data (SQL INSERT statement), modifying data (SQL UPDATE statement), and deleting data (SQL DELETE statement).
3. Data control language (DCL).
4. Transaction control language (TCL).
5. SQL/Persistent Stored Modules (SQL/PSM).
Homework 1
1. How many records does the file contain? How many fields are there per record? - There are 7 records with 5 fields for each record.
2. What problem would you encounter if you wanted to produce a listing by city? How would you solve this problem by altering the file structure? - The address field is not currently separated; we would need to split the address into four fields for street address, city, state, and zip.
3. If you wanted to produce a listing of the file contents by last name, area code, city, state, or zip code, how would you alter the file structure? - We would need to split the existing fields so that last name, area code, city, state, and zip code are each stored in a separate field.
4. What data redundancies do you detect? How could those redundancies lead to different types of anomalies? - There are redundancies in the managers' phone numbers and addresses because they share offices. This could lead to insertion, deletion, or update anomalies.
Eliminating anomalies from functional dependencies with BCNF
1. Identify every functional dependency.
2. Identify every candidate key.
3. If there is a functional dependency that has a determinant that is not a candidate key:
a. Move the columns of that functional dependency into a new relation.
b. Make the determinant of that functional dependency the primary key of the new relation.
c. Leave a copy of the determinant as a foreign key in the original relation.
4. Repeat step 3 until every determinant of every relation is a candidate key.
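The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation SKU_DATA and its two functional dependencies are hypothetical sample input, and the helper names `closure` and `bcnf_violations` are made up for this sketch. The check treats a determinant as acceptable when its attribute closure covers the whole relation (a superkey test), which is a simplification of "is a candidate key".

```python
def closure(attrs, fds):
    """Return every attribute that is functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the dependencies whose determinant does not determine the whole relation."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(relation) <= closure(lhs, fds)]


# Hypothetical input: SKU determines everything, Buyer determines Department.
relation = ("SKU", "SKU_Description", "Department", "Buyer")
fds = [
    (("SKU",), ("SKU_Description", "Department", "Buyer")),
    (("Buyer",), ("Department",)),
]

# Step 3: find a dependency whose determinant is not a key.
violations = bcnf_violations(relation, fds)
lhs, rhs = violations[0]

# Steps 3a and 3b: new relation with the determinant as its primary key.
new_relation = lhs + rhs
# Step 3c: the determinant (Buyer) stays behind as a foreign key.
remaining = tuple(a for a in relation if a not in rhs)

print(new_relation)  # ('Buyer', 'Department')
print(remaining)     # ('SKU', 'SKU_Description', 'Buyer')
```

In this sample, Buyer -> Department is the only violation, so one pass of step 3 is enough.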
Limitations of SQL
1. You cannot combine table column names with an SQL built-in function (unless the column is used for grouping).
2. You cannot use aggregate functions in an SQL WHERE clause, because the SQL WHERE clause operates on rows and aggregate functions operate on columns.
Comparison Operators
= is equal to
<> is not equal to
< is less than
> is greater than
<= is less than or equal to
>= is greater than or equal to
IN is equal to one of a set of values
NOT IN is not equal to one of a set of values
BETWEEN is within a range of numbers (includes the end points)
NOT BETWEEN is not within a range of numbers (includes the end points)
LIKE matches a sequence of characters
NOT LIKE does not match a sequence of characters
IS NULL is equal to NULL
IS NOT NULL is not equal to NULL
Second Normal Form (2NF)
A relation in first normal form in which every non-key attribute is fully functionally dependent on the primary key.
BCNF
A relation is in BCNF if every determinant is a candidate key
Third Normal Form (3NF)
A table is in 3NF when it is in 2NF and no non-key attribute is functionally dependent on another non-key attribute; that is, it cannot include transitive dependencies.
SQL logical operators
AND: both conditions are true. OR: one or the other or both of the conditions are true. NOT: negates the associated condition.
SQL built-in aggregate functions
COUNT(*): counts the number of rows in the table.
COUNT({Name}): counts the number of rows in the table where column {Name} IS NOT NULL.
SUM: calculates the sum of all values (numeric columns only).
AVG: calculates the average of all values (numeric columns only).
MIN: calculates the minimum value of all values.
MAX: calculates the maximum value of all values.
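A small sketch of the aggregate functions, run through Python's built-in sqlite3 module. The ORDER_ITEM rows are made-up sample data, and one Price is deliberately left NULL to show the difference between COUNT(*) and COUNT({Name}).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ORDER_ITEM (OrderNumber INTEGER, SKU INTEGER, Quantity INTEGER, Price REAL)"
)
# Hypothetical sample rows; the last row has an unknown (NULL) price.
conn.executemany(
    "INSERT INTO ORDER_ITEM VALUES (?, ?, ?, ?)",
    [
        (1000, 201000, 1, 300.0),
        (1000, 202000, 1, 130.0),
        (2000, 101100, 4, 50.0),
        (3000, 100200, 1, None),
    ],
)

row = conn.execute(
    """
    SELECT COUNT(*), COUNT(Price), SUM(Price), AVG(Price), MIN(Price), MAX(Price)
    FROM ORDER_ITEM;
    """
).fetchone()

print(row)  # (4, 3, 480.0, 160.0, 50.0, 300.0)
```

COUNT(*) counts all four rows, while COUNT(Price), SUM, AVG, MIN, and MAX skip the row whose Price is NULL.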
Application programs - basic functions
Create and process forms, process user queries, create and process reports, execute application logic, control the application itself.
DBMS - basic functions
Create database, create tables, create supporting structures, modify database data, read database data, maintain database structures, enforce rules, control concurrency, perform backup and recovery.
Types of modification anomalies
Deletion, insertion, update
The SQL JOIN operator is used to combine parts or all of two or more tables.
Explicit join: the SQL JOIN operator is used as part of the SQL statement. Implicit join: the SQL JOIN operator is not used as part of the SQL statement. CROSS JOIN: combines each row in one table with every row in another table.
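The three join styles can be compared side by side with Python's built-in sqlite3 module. The two tables below are cut-down, hypothetical versions of SKU_DATA and ORDER_ITEM with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE SKU_DATA (SKU INTEGER PRIMARY KEY, Department TEXT);
    CREATE TABLE ORDER_ITEM (OrderNumber INTEGER, SKU INTEGER, Quantity INTEGER);
    INSERT INTO SKU_DATA VALUES (100, 'Water Sports'), (200, 'Camping');
    INSERT INTO ORDER_ITEM VALUES (1, 100, 2), (2, 200, 1), (3, 100, 5);
    """
)

# Explicit join: the JOIN keyword appears in the statement.
explicit = conn.execute(
    """
    SELECT OrderNumber, Department
    FROM ORDER_ITEM JOIN SKU_DATA ON ORDER_ITEM.SKU = SKU_DATA.SKU
    ORDER BY OrderNumber;
    """
).fetchall()

# Implicit join: both tables are listed in FROM and matched in WHERE.
implicit = conn.execute(
    """
    SELECT OrderNumber, Department
    FROM ORDER_ITEM, SKU_DATA
    WHERE ORDER_ITEM.SKU = SKU_DATA.SKU
    ORDER BY OrderNumber;
    """
).fetchall()

# CROSS JOIN: every ORDER_ITEM row paired with every SKU_DATA row (3 x 2 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM ORDER_ITEM CROSS JOIN SKU_DATA;"
).fetchone()[0]

print(explicit)     # [(1, 'Water Sports'), (2, 'Camping'), (3, 'Water Sports')]
print(implicit == explicit)  # True
print(cross_count)  # 6
```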
Databases in the internet and mobile device world
Facebook: posts, likes. Twitter: tweets.
CHAR(L)
Fixed-length character strings
Integer data type
INTEGER
Union Rule
If A -> B and A -> C, then A -> (B, C). However, if (A, B) -> C, then in general neither A nor B determines C by itself.
Decomposition rule
If A -> (B, C), then A -> B and A -> C.
Primary Key
Is a candidate key selected as the primary means of identifying rows in a relation. There is only one primary key per relation. The primary key may be a single-column key or a composite key.
Foreign Key
Is a column or composite of columns that is the primary key of a table other than the one in which it appears. The term arises because it is a key of a table foreign to the one in which it appears as a primary key.
Entity
Is some identifiable thing that users want to track: customers, computers, sales.
Data
Recorded facts and figures
Relation characteristics
Rows contain data about an entity.
Columns contain data about attributes of the entity.
All entries in a column are of the same kind.
Each column has a unique name (within the table).
Cells of the table hold a single value.
The order of the columns is unimportant.
The order of the rows is unimportant.
No two rows may be identical.
Write an SQL statement to display the WarehouseID and the sum of QuantityOnHand grouped by WarehouseID. Omit all SKU items that have three or more items on hand from the sum, name the sum TotalItemsOnHandLT3, and display the results in descending order of TotalItemsOnHandLT3.
SELECT WarehouseID, SUM(QuantityOnHand) AS TotalItemsOnHandLT3 FROM INVENTORY WHERE QuantityOnHand < 3 GROUP BY WarehouseID ORDER BY TotalItemsOnHandLT3 DESC;
Write an SQL statement to display the WarehouseID and the sum of QuantityOnHand grouped by WarehouseID. Omit all SKU items that have three or more items on hand from the sum, and name the sum TotalItemsOnHandLT3. Show Warehouse ID only for warehouses having fewer than two SKUs in their TotalItemsOnHandLT3. Display the results in descending order of TotalItemsOnHandLT3.
SELECT WarehouseID, SUM(QuantityOnHand) AS TotalItemsOnHandLT3 FROM INVENTORY WHERE QuantityOnHand < 3 GROUP BY WarehouseID HAVING COUNT(*) < 2 ORDER BY TotalItemsOnHandLT3 DESC;
Write an SQL statement to display the WarehouseID and the sum of QuantityOnHand grouped by WarehouseID. Name the sum TotalItemsOnHand and display the results in descending order of TotalItemsOnHand
SELECT WarehouseID, SUM(QuantityOnHand) AS TotalItemsOnHand FROM INVENTORY GROUP BY WarehouseID ORDER BY TotalItemsOnHand DESC;
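To check the three warehouse queries against data, here is a sketch using Python's built-in sqlite3 module. The INVENTORY rows are invented for the example, and the table is assumed to have only the columns WarehouseID, SKU, and QuantityOnHand. Note the clause order in each statement: WHERE, then GROUP BY, then HAVING, then ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE INVENTORY (WarehouseID INTEGER, SKU INTEGER, QuantityOnHand INTEGER)"
)
# Hypothetical sample data: three warehouses with two or three SKUs each.
conn.executemany(
    "INSERT INTO INVENTORY VALUES (?, ?, ?)",
    [
        (100, 1, 2), (100, 2, 1), (100, 3, 10),
        (200, 1, 0), (200, 2, 5),
        (300, 1, 2), (300, 2, 7),
    ],
)

# Sum only the SKU rows with fewer than three items on hand.
lt3 = conn.execute(
    """
    SELECT WarehouseID, SUM(QuantityOnHand) AS TotalItemsOnHandLT3
    FROM INVENTORY
    WHERE QuantityOnHand < 3
    GROUP BY WarehouseID
    ORDER BY TotalItemsOnHandLT3 DESC;
    """
).fetchall()

# Same, but keep only warehouses with fewer than two such SKU rows.
lt3_few_skus = conn.execute(
    """
    SELECT WarehouseID, SUM(QuantityOnHand) AS TotalItemsOnHandLT3
    FROM INVENTORY
    WHERE QuantityOnHand < 3
    GROUP BY WarehouseID
    HAVING COUNT(*) < 2
    ORDER BY TotalItemsOnHandLT3 DESC;
    """
).fetchall()

# Total of all items per warehouse, no row filter.
total = conn.execute(
    """
    SELECT WarehouseID, SUM(QuantityOnHand) AS TotalItemsOnHand
    FROM INVENTORY
    GROUP BY WarehouseID
    ORDER BY TotalItemsOnHand DESC;
    """
).fetchall()

print(lt3)           # [(100, 3), (300, 2), (200, 0)]
print(lt3_few_skus)  # [(300, 2), (200, 0)]
print(total)         # [(100, 13), (300, 9), (200, 5)]
```

Warehouse 100 drops out of the second result because two of its SKU rows pass the WHERE filter, so COUNT(*) for that group is 2.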
The SQL SELECT statement
SELECT {ColumnNames} FROM {TableNames} WHERE {Conditions}; Note: all SQL statements end with a semicolon (;).
Date/Time:
The SQL standard provides three data types, but most DBMS products support only one of them, and the data type names are not standard across DBMS products.
Metadata
Self-describing data; that is, data about data.
Foreign Key with referential integrity constraints
SKU_DATA (SKU, SKU_Description, Department, Buyer)
ORDER_ITEM (OrderNumber, SKU, Quantity, Price, ExtendedPrice)
WHERE ORDER_ITEM.SKU must exist in SKU_DATA.SKU
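A sketch of how a DBMS enforces this constraint, using Python's built-in sqlite3 module. The column types, the composite primary key on ORDER_ITEM, and the sample rows are assumptions made for the example. SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`, so that line matters here; other DBMS products enforce the constraint by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript(
    """
    CREATE TABLE SKU_DATA (
        SKU             INTEGER PRIMARY KEY,
        SKU_Description TEXT,
        Department      TEXT,
        Buyer           TEXT
    );
    CREATE TABLE ORDER_ITEM (
        OrderNumber   INTEGER,
        SKU           INTEGER REFERENCES SKU_DATA (SKU),
        Quantity      INTEGER,
        Price         REAL,
        ExtendedPrice REAL,
        PRIMARY KEY (OrderNumber, SKU)
    );
    INSERT INTO SKU_DATA VALUES (100100, 'Sample item', 'Water Sports', 'Sample Buyer');
    """
)

# SKU 100100 exists in SKU_DATA, so this row is accepted.
conn.execute("INSERT INTO ORDER_ITEM VALUES (1000, 100100, 1, 50.0, 50.0)")

# SKU 999999 does not exist in SKU_DATA, so the DBMS rejects the row.
try:
    conn.execute("INSERT INTO ORDER_ITEM VALUES (1000, 999999, 1, 10.0, 10.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM ORDER_ITEM").fetchone()[0]
print(rejected, row_count)  # True 1
```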
Grouping rows in SQL queries
The SQL WHERE clause specifies which rows will be used to determine the groups. The SQL HAVING clause specifies which groups will be used in the final result. In general, place WHERE before GROUP BY. Some DBMS products do not require that placement, but to be safe always put WHERE before GROUP BY. A statement that includes both WHERE and HAVING clauses could be ambiguous, because the results would vary with the order in which the two clauses are applied. To eliminate this ambiguity, SQL always applies WHERE before HAVING.
Cartesian product
The combination of all rows in the first table and all rows in the second table
The relational database model
The dominant database model; all current major DBMS products are based on it. It was created by IBM engineer E. F. Codd and is based on a branch of mathematics called relational algebra.
SQL Set Operators
The operators UNION (the result is all the row values in one or both tables), INTERSECT (the result is all the row values common to both tables), and EXCEPT (the result is all the row values in the first table but not the second). Example: SELECT SKU, SKU_Description, Department FROM CATALOG_SKU_2017 UNION SELECT SKU, SKU_Description, Department FROM CATALOG_SKU_2018; (INTERSECT, EXCEPT, or UNION ALL can be used in place of UNION.)
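The set operators can be tried out with Python's built-in sqlite3 module. The sketch below uses hypothetical two-column versions of the two yearly catalog tables, with made-up rows, and a helper `run` that exists only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CATALOG_SKU_2017 (SKU INTEGER, Department TEXT);
    CREATE TABLE CATALOG_SKU_2018 (SKU INTEGER, Department TEXT);
    INSERT INTO CATALOG_SKU_2017 VALUES (1, 'Camping'), (2, 'Climbing');
    INSERT INTO CATALOG_SKU_2018 VALUES (2, 'Climbing'), (3, 'Water Sports');
    """
)


def run(operator):
    """Combine the SKU columns of both catalogs with the given set operator."""
    sql = (
        "SELECT SKU FROM CATALOG_SKU_2017 "
        f"{operator} "
        "SELECT SKU FROM CATALOG_SKU_2018 "
        "ORDER BY SKU;"
    )
    return [row[0] for row in conn.execute(sql)]


print(run("UNION"))      # [1, 2, 3]     rows in one or both tables, duplicates removed
print(run("UNION ALL"))  # [1, 2, 2, 3]  duplicates kept
print(run("INTERSECT"))  # [2]           rows common to both tables
print(run("EXCEPT"))     # [1]           rows in the first table but not the second
```

Both SELECT statements must return the same number of columns in the same order for a set operator to apply.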
Domain integrity
The requirement that all the values in a column are of the same kind
Composite determinant
A determinant of a functional dependency that consists of more than one attribute. Example: (StudentNumber, ClassNumber) -> (Grade)
SQL is not
a full-featured programming language (such as C or C#). SQL is a data sublanguage for creating and processing database data and metadata.
applications
are the computer programs users work with
Database management systems
create, process, and administer databases
Characteristics of databases
Data is stored in tables with rows and columns. A database may have multiple tables, where each table stores data about a different thing. Each row in a table stores data about an occurrence or instance of the thing of interest. A database stores data and relationships among the data.
Types of constraints
Domain integrity, entity integrity, referential integrity.
Database management systems provide
efficient, reliable, convenient, and safe multi-user storage of and access to massive amounts of persistent data. Key words: massive, persistent, safe, multi-user, convenient, efficient, reliable.
Entities vs relationships
Entities: students, faculty, courses, offerings, enrollments. Relationships: faculty teach offerings, students enroll in offerings, offerings are made of courses.
Unique Data Values
implies that this column is NOT NULL, and does not allow a NULL value in any row
Key
is a combination of one or more columns that is used to identify particular rows in a relation.
Composite key
is a key that consists of two or more columns.
Candidate Key
is a key that determines all of the other columns in a relation.
Microsoft Access
is a low-end product intended for individual users and small workgroups.
A database
is a self-describing collection of integrated tables. The tables are called integrated because they store data about the relationships between rows of data.
Referential integrity constraint
is a statement that limits the values of the foreign key to those already existing as primary key values in the corresponding relation.
Surrogate Keys
A surrogate key is an artificial column added to a relation to serve as the primary key. It is supplied by the DBMS. A short numeric value that never changes is an ideal primary key. It has artificial values that are meaningless to users and is normally hidden in forms and reports. Example: Rental Property without a key.
Determinant
A determinant is unique in a relation if and only if it determines every other column in the relation. You cannot find the determinants of all functional dependencies simply by looking for unique values in one column: the data set may be limited, and a determinant must be one logically, not just in the sample data.
information
knowledge derived from data
Domain
means a group of data that meets a specific type of definition. For example, FirstName could have a domain of names such as Albert, Ashley, or David.
A functional dependency
occurs when the value of one set of attributes determines the value of a second set of attributes. Functional dependencies may be based on equations, but they are not equations. Example: CookieCost = NumberOfBoxes x $5, so NumberOfBoxes -> CookieCost.
Relational DBMS products
store data about entities in relations, which are a special type of table. A relation is a two-dimensional table that has specific characteristics (see Relation characteristics).
Normalization
The process of converting complex data structures into simple, stable data structures. The goal is to have the same data free of update anomalies. Two concepts are fundamental to the normalization process: functional dependencies and keys.
Entity integrity constraint
The requirement that, in order to function properly, the primary key must have a unique data value in every row of the table.
In a relation as defined by Codd
the rows of a relation must be unique but there is no requirement for a designated primary key.
SQL
Structured Query Language: an internationally recognized standard database language that is used by all commercial DBMS products.
Tables are not relations if
they have multiple entries per cell or require a particular row order.
VARCHAR(L)
Variable-length character strings
SQL underscore (_) wildcard character
which represents a single unspecified character in a specific position in the character string. Used in combination with the SQL LIKE and SQL NOT LIKE operators.
SQL percent sign (%) wildcard character
which represents any sequence of contiguous unspecified characters (including spaces) in a specific position in the character string. EX: SELECT * FROM SKU_DATA WHERE SKU LIKE '%2__';
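Both wildcard characters can be checked against sample data with Python's built-in sqlite3 module. The SKU values and descriptions below are made up for the example. SQLite converts the integer SKU to text before applying LIKE; other DBMS products may need an explicit cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SKU_DATA (SKU INTEGER, SKU_Description TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO SKU_DATA VALUES (?, ?)",
    [
        (100100, "Item A"),
        (100200, "Item B"),
        (201000, "Item C"),
        (301200, "Item D"),
    ],
)

# '%2__': any leading characters, then a 2, then exactly two more characters,
# so the 2 must be the third character from the end.
third_from_end = [
    row[0]
    for row in conn.execute(
        "SELECT SKU FROM SKU_DATA WHERE SKU LIKE '%2__' ORDER BY SKU;"
    )
]

# '20____': starts with 20 followed by exactly four more characters.
starts_with_20 = [
    row[0]
    for row in conn.execute(
        "SELECT SKU FROM SKU_DATA WHERE SKU LIKE '20____' ORDER BY SKU;"
    )
]

print(third_from_end)  # [100200, 301200]
print(starts_with_20)  # [201000]
```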