DSO 428 EXAM 1

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

Views Benefits

Benefits include: Reduce complexity Provide a level of security Provide a mechanism to customize the appearance of the database. Present a consistent, unchanging picture of the structure of the database, even with underlying database changes --> think Amazon

DBMS functions

Data definition language (DDL) - Permits specification of data types, structures and any data constraints. - Store specifications in the database. --> specification of how the data is stored (data types, structures, constraints, relationships, security)makes up the system catalog Data manipulation language (DML) - General enquiry facility (query language) of the data --> query language for data. Retrieves/manipulates data. Controlled access to database may include A security system An integrity system A concurrency control system A recovery control system A user-accessible catalog

Required Data for specific attribute

VARCHAR(10) NOT NULL

SIMPLE JOIN

does not include unmatched rows SELECT c.clientNo, c.fName, c.lName, v.propertyNo, v.comment FROM Client c JOIN Viewing v ON c.clientNo = v.clientNo ;

Self Targeting (recursive): Foreign Key:

relationship within a table set of attributes within a relation that matches the candidate key of some (possibly same) relation Super_ssn is a foreign key that corresponds to Essn

All column names in SELECT list must appear in GROUP BY clause, unless name is used only in an aggregate function.

the reverse is not always true

ALTER TABLE

§ Add a new column to a table § Drop a column from a table § Add a new table constraint § Drop a table constraint § Set a default for a column § Drop a default for a column

nested subqueries

• SQL statements can have a "SELECT statement" embedded within them . • A subselect (inner SELECT) can be used in WHERE and HAVING clauses of an outer SELECT, where it is called a subquery or nested query.

CREATE DOMAIN

•Create Domain (a set of legal value for the attribute) (some attributes may share the same domain) CREATE DOMAIN AddressType as NCHAR(50) Address1 AddressType Address2 AddressType -CREATE DOMAIN ... CHECK CREATE DOMAIN DomainName [AS] dataType [DEFAULT defaultOption] [CHECK (searchCondition)] CREATE DOMAIN State AS CHAR(2) DEFAULT 'CA' CHECK (VALUE IN ('CA', 'WA'.....))

drop table restrict

•With RESTRICT, if any other objects depend for their existence on continued existence of this table, SQL does not allow request. •With CASCADE, drops all dependent objects recursively. •DROP TABLE TableName [RESTRICT | CASCADE]

reversed words

•are a fixed part of SQL and must be spelt exactly as required and cannot be split across lines.

literals

•constants used in SQL statements. - non numeric enclosed in quotes - numeric not enclosed in quotes

Character Types

For birthday you can just use date as data type Redemption time → use timestamp LastName → char (character) Any of those are fine under character Address → character (contains both number and the character) UID → integer Revenue → decimalVarchar usually works for all occasions we would see in this course. Nchar is used more when we need to put in non-traditional characters like international language.

General Constraints Example

Additional rules specified by users or database administrators that define or constrain some aspect of the enterprise. (e.g. staff in a branch<20) Can you insert B0009 (new branchNo) into Staff table? → NO - Violates referential integrity b/c not in home table - If you add B0009 into Staff table and not the Branch home table you will Null value in your Staff table (b/c not existing in home branch table) - Would have to insert B0009 into Branch home table before inserting into Staff table - However, you CAN insert new staff into Staff table that does have a null branchNo (staff isn't assigned to branch yet but will be assigned to one of the branchNo already existing in the Branch table)

insert

INSERT INTO *TableName* VALUES( ' ', ' ');

Referential Integrity

If foreign key exists in a relation, either foreign key must 1. Match a candidate key value of some tuple in its home relation, OR... - i.e. can't enter a foreign key that doesn't exist in primary key tuple of home table 2. Foreign key value must be wholly null - i.e. can enter null value into foreign key if doesn't exist yet in primary key of home table or unsure of value

SQL DML Query

(will return results in a new table. Query will not change the tables in DB: e.g PropertyForRent) Syntax (including all elements): Only SELECT and FROM are mandatory. Order of the clauses cannot be changed.

ON DELETE CASCADE (Fire)

) - When a tuple p in a table (with the PK) is deleted then all tuples f in other tables (with the FKs) that refer to p are also deleted.

group by

** IF SELECT LIST has 3 items and one is an aggregate function - the 2 items that are not aggregate functions legit have to be in the group by clause Aggregate Function will return only one row of results (statistics at most aggregate level - e.g. how many staff in total) • Often we need intermediate rollup for subtotal of each group (how many staff in each branch, total salary paid by each branch, etc.) Use GROUP BY clause to create groups, - e.g., can use multiple column to create group at granular level SELECT and GROUP BY closely integrated: - Each item in SELECT list must be single-valued per group (group-level consistency), and - SELECT clause may only contain: • Column names - must be the ones from GROUP BY clause • Aggregate functions - any attribute an aggregate function is applicable • Constants • Expression involving combinations of the above.

ON DELETE NO ACTION (default) -- (Cannot delete)

- Prevents a tuple p in Table (with the PK) from being deleted when a tuple f in some other table (with the FK) points to tuple p.

For each foreign key in each relation, identify

- The primary key / parent relation that is being referenced; - The referential constraint; and - The associated business rule

IMPORTANT NOTE

- The tuple f (with the FK) can be deleted at any time without checking referential integrity of tuple p (with the PK)!

Practical Analysis - 2 Step Check

1. Can't insert b/c violates entity integrity (primary key can't take null value) 2. Can't insert b/c CR74 already exists in table

What is a Database Management System? (DBMS)

A software system that enables users to define, create, and maintain the database and which provides controlled access to this database

Alter Table

Add a column to Client table. How about Drop? ALTER TABLE table_name ADD column_name datatype; Remove constraint from PropertyForRent that staff are not allowed to handle >100 properties. How about Add?

Perform simple SQL Queries Use aggregate functions (COUNT,SUM,MAX,MIN,AVG) How many properties cost more than £350 per month to rent?

Aggregate functions can be used only in SELECT list and in HAVING clause. If SELECT list includes an aggregate function and there is no GROUP BY clause, SELECT list cannot reference a column out with an aggregate function. SELECT staffNo, SUM(salary) FROM Staff; --> System will be Confused - Shrink the table or Not? Aggregate Function will return only one row of results (statistics at most aggregate level - e.g. how many staff in total) • Often we need intermediate rollup for subtotal of each group (how many staff in each branch, total salary paid by each branch, etc.) You have to be consistent in telling DBMS what you want across columns You're trying to tell DBMS here that you want staffNo and summed salary of all staff DBMS is confused if you want to collapse the table and return one row or do you want separate rows

Objectives of 3 Level Architecture

All users should be able to access the same data (Conceptual level) A user's view is immune to changes made in other user's views. Users should not need to know physical database storage details. * The above objectives are covered by next two objectives DBA should be able to change database storage structures (internal schema) without affecting conceptual schema. - Or ... conceptual schema should be immune to changes to physical schema (Physical Data Independence) DBA should be able to change conceptual schema without affecting all of the users and all their external schema. - Or ... external schema should be immune to changes to the conceptual schema.

Foreign Key

An attribute or set of attributes within one relation Matches the candidate key of some (possible same) relation I.e. foreign key references primary key - Relationship between focal table and another table UID student table (focal) and Course Registration table (contains UID, course id, semester, etc.) - These two UIDs are connected - UID in Course Registration table is FOREGIN KEY - UID in Student Table is PRIMARY KEY - Connect two tables by FK / PK! Construct relationships between tables through FK / PK!

Super Key How to ID a Super Key?

Attribute or set of attributes (column) in a relation Uniquely identifies a tuple (row) within the relation --> Uniquely identify ANY student in the table or ANY row in the table How to ID a Super Key? 1. Data Dictionary / System Catalog 2. Common Sense / Semantic Meaning Example: Is UID a superkey to uniquely identify any student in our table? Yes → everyone is assigned a unique one (different for everyone) Is SSN a superkey to uniquely identify any student in our table? NO! Not good to use for privacy purposes Some people might not have it (only U.S. citizens) Is UID + SSN a superkey? Yes because UID is already a superkey But don't need SSN Is Name + GPA + Degree + Grad_Yr a superkey? No → there could still be duplicates Not a guarantee

DreamHome Rental Database Schema (exercise)

CREATE TABLE Viewing( clientNo NCHAR(4), → 4 you find from looking at table propertyNo NCHAR(10), → 10 was previously defined in other PropertyForRent table viewDate NCHAR(15), comment NCHAR(40), → longer characters because commentary PRIMARY KEY (clientNo, propertyNo, viewDate), FOREIGN KEY (clientNo) REFERENCES Client(clientNo), FOREIN KEY (propertyNo) REFERENCES PropertyForRent(propertyNo) );

Three step procedure: to identify Super/Candidate/Primary Key/Alternative Key

Can you use the set of attributes to uniquely identify row in table? UID + FB_ID → yes Can you further reduce that superkey but still uniquely identify the row in table? Yes → can remove FB_ID and becomes Super Key If no...becomes Candidate Key Super Key --> Candidate Key (condensed form) --> Primary Key vs. Alternate Key UID + FB_ID → is a Super Key but not a Candidate Key - Once you condense it to UID it will be a Candidate Key!

Referential actions on DELETE and UPDATE:

Cascade Set null Set default No action - the default option (if no explicit statement) FOREIGN KEY (staffNo) REFERENCES Staff ON DELETE SET NULL FOREIGN KEY (staffNo) REFERENCES Owner ON UPDATE CASCADE

Alter Table #2

Change Staff table by removing default value for position column and setting default for sex column to female ('F').

File-based approach (2) Daily Deal Redemption - File Based Approach

Collection of application programs. Each program defines and manages its own data Imagine: you bought a restaurant deal from Groupon.com and go to redeem the voucher using Groupon mobile app: my voucher Under File based approach: (each party has program that defines/manage its own data) Customer: used or not? (forget to mark?) Merchant: put voucher # into local PC for a match Groupon: Can do nothing about it after selling voucher & never know when & which voucher is used & customer feedback Local arrangement Consumers don't have a way to check whether they have used a specific voucher or not Groupon can't send a survey b/c they don't know when you go to the restaurant or consume/use the voucher you purchased through them Don't' know the consumption pattern of users → Groupon doesn't know how many vouchers customers use or repeat vouchers No one is happy : ( --- Inconvenient / bad user experience No one gains any insights (Groupon lose track of valuable data

Perform simple SQL Queries Perform set operations (UNION, INTERSECT, EXCEPT)

Combine results of two or more queries into a single result table - SIMPLE Union of two tables, A and B, is table containing all rows in either A or B or both. Intersection is table containing all rows common to both A and B. Difference is table containing all rows in A but not in B. --> Order matters in this case! Two tables must be union compatible, means: 1. Two tables must contain same number of columns 2. Corresponding columns have the same data type & length

Disadvantages of DBMS - Challenges

Complexity Size Cost of DBMSs Additional hardware costs Cost of deployment- training, hiring specialist, ... Performance - general vs. specific ... - General - DMBS is more of a general solution - Specific - specific industry or company you might want a more customized solution Greater impact of a failure - centralization of resources

Conceptual Modeling

Conceptual schema → result of using relational data model - The core of a database system supporting all user views - Should be complete and accurate representation of an organization's data requirements. Conceptual modeling is process of developing a model of information use that is independent of implementation details (e.g. targeted DBMS). (our focus: E-R mode) Result is a conceptual data model**

Advantages of Database approach (7)

Control of data redundancy → not just multiple copies made over and over Data consistency More information from same amount of data (combination) Sharing of data Improved data integrity Improved security Enforcement of standards Economy of scale

database delete

DELETE FROM 'TableName' WHERE searchCondition

Drop Table

DROP TABLE PropertyForRent; •Removes named table and all rows within it.

Roles of People in the Database Environment

Data Administrator (DA) - manages/installs/configures DBMS/software application - Logical/conceptual database design. Database Administrator (DBA) - physical realization of the database - responsible for performance, integrity, security.more technical Database Designers Logical - conceptual schema to describe what data is stored (arrange data into logical structure) Physical - how the data is stored (once you have a conceptual schema) (specify configurations/parameters on storage media.) Application Developers - create/develop applications that use DBMS.. End-Users Naïve (e.g. credit card user) - ppl who are not aware that they are interacting with a database Sophisticated (e.g. business analyst/data scientist) - data scientists, analysts.

Database Languages #1

Data Definition Language (DDL) Allows the DBA or users to describe and name - Entities & attributes - Relationships required for the application. - Integrity and security constraints. (we'll come back to this in lecture 3) Results of compilation of DDL statements: - System catalog (data dictionary) that integrates metadata. --> PDF or Google Doc with information --> Tells you what are the tables, columns, data type for each column in system catalog!

Database Languages #2

Data Manipulation Language (DML) → how do you use this query language (a language for the specification of procedures for the retrieval (and sometimes also modification) of information from a database) to perform analysis? - Provides basic data manipulation operations on data held in the database Procedural DML (Network and Hierarchical DML): - Allows the user to tell the system exactly how to manipulate (query) data. Non-Procedural DML (SQL):** - Allows the user to state what data is needed rather than how it is to be retrieved.

Database approach Daily Deal Redemption - Database Approach

Data is independent of application programs (Data is a separate entity and it doesn't matter what programs you use) Imagine: you bought a restaurant deal from Groupon and go to redeem the voucher using groupon mobile app: my voucher Under database approach: all data (about voucher, customer, merchant) is managed on groupon centralized Databases in a logically related manner & all parties has shared access to data Customer: used or not? (Groupon will tell/Let merchants try!) Merchant: scan or put voucher # into "merchant center" (DB app) → customer scans voucher in store with merchant and merchant documents data Groupon: 1) Take # from app, check DB & Feedback merchant whether the voucher has been redeemed 2) Change status of voucher in customer account & send an email survey 3) Tell merchant no. of vouchers not redeemed & how much $ made & Survey results 4) Groupon accumulates valuable data on redemption, which can be combined with other data for analysis by data users in different departments: redemption pattern(BusinessUnit), social shopping(DS),campaign value (merchant). Opportunity for group data or group cell! Everyone is better off!!

Two ways of organizing data

Database approach File-based approach How to organize data (e.g. record and update data) for a new short-rental company? → each Sales, HR, and Marketing teams are collecting and managing data (different types)

Different Data Models

Discipline to define how many tables are in data models Intentional that BranchNo is in both tables

purpose of views

Enhanced security - Provides powerful and flexible security mechanism by hiding parts of database from certain users. • Customized data access - Permits users to access data in a customized way, so that same data can be seen by different users in different ways, at same time. • Simplified operations - Can simplify complex operations on base relations.

ISO standard supports definition of foreign keys with FOREIGN KEY clause in CREATE and ALTER TABLE:

FOREIGN KEY (branchNo) REFERENCES Branch ie.: - FOREIGN KEY (staffNo) REFERENCES Staff ON DELETE SET NULL - FOREIGN KEY (ownerNo) REFERENCES Owner ON UPDATE CASCADE

Data Model --> How to Organize Data!

Integrated collection of concepts for describing data, relationships between data, and constraints on the data in an organization. Purpose: to represent data in an understandable way. Determines in which manner data is stored, organized & manipulated Data Model comprises: A structural part (Relations/Relationship) → tables A manipulative part (allowed SQL operations) → how SQL can be used to analyze or define data Possibly a set of integrity rules (Entity / Referential Integrity) → what are the rules that regulates the data tables → we are mostly using the Relational Model!

Integrity Constraints

Integrity Constraints → when we update one table how does that impact other tables / update the other datatables? Null: Represents value for an attribute that is currently unknown or not applicable for tuple (e.g. survey → some people don't have answers to some questions and skip through them Deals with incomplete or exceptional data (unknown) Represent the absence of a value and is not the same as zero or spaces (zero or spaces are values) --> SSN won't work anymore as the primary key if null values are entered!

Examples of Database Applications: You are interacting with databases all the time!

Make Purchases at a local store using your credit card - You are storing these purchases in a database Make Transfer in Online Banking Play with any of the app on the mobile Complete Course Registration Redeem Daily Deal Vouchers at Restaurants Track Blood Donation Records

How to Create Table *** most important card *****

Name the Table (CREATE TABLE Branch ) Specify Attribute Name/ Domain ( branchno Nchar(10) ) Impose Entity Integrity - PK & Not null option 2 - if you don't write any statement for ON DELETE the system by default will assume on delete no action example: CREATE TABLE Branch ( branchno VARCHAR(10), street VARCHAR(20), city VARCHAR(20), postcode VARCHAR(20), PRIMARY KEY (branchno) );

Database Approach

Objective: Define data independent of application programs. Have control on manipulation and access to the data that is independent to all applications and software. Result: Database + Database Management System (DBMS)

Objectives of SQL

Objectives of SQL

Views over Shared Data

One disadvantage of a shared database - COMPLEX and DIFFICULT to understand or to use. --> It contains all data across multiple departments or business functions in a large organization. A view mechanism provides users with only the data they want or need to use. (Amazon Cart/My Course) - A view is essentially some subset of the database. - Allows each user to have his/her own view of the database - Support multiple applications for multiple user bases --> As an individual customer you don't need to know all the other consumers' data shows user his own subset of data. allows multiple applications for multiple users

Perform simple SQL Queries Add filter conditions using WHERE WHERE List staffno, full name, positions & salary for all staff with a salary greater than 20,000 and less than 30,000 BETWEEN...AND

Other option: SELECT staffNo, fName, lName, position, salary FROM Staff WHERE salary >= 20000 AND salary <= 30000; Can also use NOT BETWEEN, between, and, or

Perform simple SQL Queries Add filter conditions using WHERE WHERE IN

Other option: SELECT staffNo, fName, lName, position FROM Staff WHERE position = 'Manager' OR position = 'Supervisor'; SELECT staffNo, fName, lName, position FROM Staff WHERE position IN ('Manager', 'Supervisor'); Can also use NOT IN IN is more efficient when set contains many values.

Define primary keys with PRIMARY KEY clause:

PRIMARY KEY (staffNo) PRIMARY KEY (clientNo, propertyNo) -- composite pk

Primary Key Alternate Key

Primary Key (PK) The chosen / selected candidate key (UID) to identify tuples uniquely within a relation - Primary key can be one column or could be a combination of multiple columns Alternate Keys Candidate keys that are not selected to be primary key (SSN)

Primary/Candidate Key vs. Foreign Key

Primary/Candidate Key - Super key + Cannot be further reduced - We use PK to uniquely identify any record within a relation Foreign Key - An attribute or a set of attributes that matches PK/CK in some (possibly same) relation. - We use FK to get information from some (other) relation FK-PK is the only right way to join information from another table (not through any other set of attributes) We can find FK in a database by 1) first identifying PK within each table; 2) search FK in other relations FK can also match PK of the same relation (don't forget to search FK within the same relation)

Referential Integrity in Delete / Update action What would happen to staff in B003 if the branch is closed ? (delete a row in Branch)

Referential Action → Property to protect foreign key from changes in primary key in home relation Says how the foreign key is affected when you change B003 in home/upstream table If we were to remove B003 in the Branch table, how would we handle SG37, SG14, or SG5 in the Staff table? Referential Entity (action) will define what will happen to BranchNo in Staff table when you delete B0003 in home table Will say → "On Delete No Action" (default choice), "On Delete Set Null" or "On Delete Cascade" Referential Actions is about How FK is affected, when there is change in PK - go to see rules

Relations: Terminology Relation Attribute Tuple Degree Cardinality Domain What is the degree and cardinality of this table?

Relation = Table with columns and rows Attribute = named Column of a relation Tuple = row of a relation Degree = Number of Attributes or Columns in a relation Cardinality = number of records, rows, tuples in a relation Domain = set of allowable values for one or more attributes - Each column is defined on a domain - Set of all possible values for an attribute (column) - If domain name is Branch Number...the character should only have a size of four (B005) - This Branch Number must call into this character for the domain Degree of top table: 4 (4 columns) Cardinality: 5 (top table) → don't count header row

System catalog

Repository of information (metadata) describing the data in the database. (covered in definition of database in ch.1) Typically stores: - Names of authorized users. (User) - Names of data items in the database. (Entity & Attributes) - Constraints on each data item. (Integrity Constraints) - Data items accessible by a user and the type of access (Access) Used by modules such as Authorization Control and Integrity Checker..

Perform Simple SQL Queries

Retrieve data from tables using SELECT...FROM Can you create an address book for property with propertyno & address? Can you list all properties in London? Can you list all houses with four or more rooms? What is the average rent within postcode NW2? How many properties are managed by branch003? Select - specifies which columns are to appear in output (columns in new table - vertical) From - specify tables to be used (raw material) Where - filters rows (rows in new table - horizontal) Group by - forms group of rows with same column value Having - filters groups subject to some condition Order by - specifies the order of the output

Practice with Pattern Matching List clinetNo and full name for those clients whose first name starts with 'J' AND last name starts with 'K'

SELECT clientNo, fname, lname FROM Client WHERE fname LIKE 'J%' AND lname LIKE 'K%'

database update

UPDATE TableName SET salary = salary*1.05 WHERE position = 'Manager'

Perform simple SQL Queries Add filter conditions using WHERE WHERE Pattern Matchig If you're unsure if Glasgow is upper or lower case... 63 Well St, Glasgow G42 Forget one's last name but remember it starts with Y and contains 4 characters

WHERE lower(address) LIKE '%glasgow'; lower(attribute) → makes it all lowercase so you can search if unsure Can we identify the above record using WHERE address like '%Glasgow' → NO! No % sign at the end enforces that the address must end on Glasgow SELECT lname FROM STUDENT WHERE lname LIKE 'Y___'

Perform simple SQL Queries Add filter conditions using WHERE WHERE Pattern Matching --> LIKE Find all owners with the string 'Glasgow' in their address.

We don't know where the string Glasgow is in their address (is it a street or city?) % at the beginning is an indicator for a string starting with a 0 (or other number) % at the end is an indicator for a string ending in anything

What would happen to staff in B003 if the branch is renamed? Update B003 to B001 ** review ** confused *8 deck 2 slide 32

What will happen to all the records in Staff table with B0003? 1. You would specify the referential integrity CASCADE (update branch number for all staff - don't fire because branch still exists its just renamed) 2. SET NULL (set all staff to null with B003 until they're reassigned) 3. Cannot update branchno e.g. bcard (NO ACTION)

• ON DELETE SET NULL (Allow uncertainty)

When a tuple p in a table (with the PK) is deleted then all tuples f in other tables (with the FKs) that refer to p are set to null.

3 Level Architecture

1. External Level - user view (Amazon or HR) - Multiple independent users or applications - Users' view of the database - Focus on each user or application Don't need to understand entire view of all the data and tables Only need to understand a small subset of data! 2. Conceptual Level - community view (the "what"?) - Community view of the database (Shared) - Describes what data is stored in database and relationships among the data - Focus on the organization Conceptual Schema - internal conceptual map of all tables At the conceptual level the organization needs to know how many tables, what are these tables, and what columns of data are in each table Once you have a conceptual map you can code this into your physical layer!! Can code your conceptual schema in a different way depending on what type of physical infrastructure you use Physical infrastructure could be: Oracle, AWS After you code this creates the internal schema Describes what data is stored and relationships among data 3. Internal Level (the "how"?) - Physical representation of the database on the computer - Describes how the data is stored in the database - Focus on the DBMS - coding structure of database. Internal schema describes how the data is stored - How the data is actually implemented

Properties of Relations

1. Relation name is distinct from all other relation names in relational schema. - If you have multiple tables, there shouldn't be two tables with the same name - Can't talk to the database and call the right table if there are duplicate names 2. Each cell of relation contains exactly 1 atomic (single) value. 3. Each attribute has a distinct name. 4. Values of an attribute are all from the same domain. 5. Each tuple is distinct; no duplicate tuples - Impose these constraints to make sure we can retrieve the info in the future (no duplicate records to call on) - If a certain use calls one of two rows there's no way they can differentiate between the two rows (same content) - Even if you're entering names and two people have the same name → there will be different IDs attached to these rows 6. Order of attributes has no significance. 7. Order of tuples has no significance, theoretically

Relational data model consists of three parts

1. Relations - there are two tables 2. Relationship between two tables (presented using structure called "primary key" / "foreign key" paid) 3. Integrity Constraints (rules to govern dynamic updating process in database) BranchNo is the index column This index column is copied to the second table (BranchNo appears twice) BranchNo is called primary key in first table BranchNo is called foreign key in second table Can use BranchNo to combine two tables and access/refer to data across tables This is done through primary key / foreign key connection

Practical Analysis Part 2

1. We CAN insert b/c foreign key (branchNo) can take null value in home table AND because SG82 is not a repeat in home staffNo table 2. Can't insert b/c B001 doesn't exist in primary key home table --> violates referential integrity 3. We can't insert because entity integrity (can't insert null into primary key) 4. We can't insert because not unique

Examples for Properties of Relations

1. You cannot insert this record because it's an exact duplicate 2. You can insert this record because branchNo primary key is not a duplicate! 3. You can't insert because it's an exact duplicate 4. Atomic (not a single cell value) - Can't insert record with a cell that contains more than one value (contains both Manager and Accountant) - How can you do this? → create a new table! --> List everything you already have in two new rows (same info but two positions) --> You could also create two columns (position 1, position 2 **could have a lot of empty cells for those who don't have 2 positions)

Exercise: FK and PK Answer

First search Primary Key and then search for matching Foreign Key!! Employee: Super_ssn can't be unique b/c one supervisor could manage multiple employees --> one supervisor could have multiple employees (many to one, super_ssn is not unique in employee table b/c could reference many employees) Department: Mgr_ ssn is a foreign key but it's a subset of employee SSN (still satisfies definition of foreign key) --> one manager number could mg multiple departments - Foreign key: one attribute or a set of attributes that match the value of a primary key in one table - Manager ssn may take 10/100 values in employee ssn but it still matches so it's still a foreign key - Manager ssn foreign key that reference back to employee table Dept _ location: Dlocation can't alone work because the same building might host multiple departments! Dnumber alone won't work because one number could reference six different locations No foreign key for dept_location (no couple match of composite primary keys) Works_On: Could be multiple project numbers coordinating to one employee's SSN - An employee works on project → so you need a composite primary key (two primary keys) Dependent: Dependent_name alone is not enough - Some dependent for different employee SSN can have the same name

Functions of DBMS

Functions of DBMS (just pay attention to 3) → DBMS is management tool of database Data Storage, Retrieval, and Update - Allow external user to talk to database User-Accessible Catalog - Catalog is data dictionary Concurrency Control Services - In computer science, concurrency is the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the final outcome. Transaction Support Recovery Services Authorization Services Support for Data Communication Integrity Services Services to Promote Data Independence Utility Services

HAVING

HAVING clause is designed for use with GROUP BY to restrict groups that appear in final result table Similar to WHERE, but WHERE filters individual rows whereas HAVING filters groups. Column names in HAVING clause must also appear in the GROUP BY list or be contained within an aggregate function.

Components of DBMS Environment

Hardware Can range from a PC to a network of computers. Software DBMS operating system network software (if necessary) application programs. Data Used by the organization (data used in the system) Data description is called the schema. Procedures Instructions and rules that should be applied to the design and use of the database and DBMS. People Data administrator Database administrator Database designers (logical, physical) Application developers End users

Perform simple SQL Queries Use aggregate functions (COUNT,SUM,MAX,MIN,AVG)

ISO standard defines five aggregate functions: COUNT returns number of values in specified column. SUM returns sum of values in specified column. AVG returns average of values in specified column. MIN returns smallest value in specified column. MAX returns largest value in specified column. --> COUNT, MIN, and MAX apply to numeric and non-numeric fields --> SUM and AVG may be used on numeric fields only.

Entity Integrity

In a base relation, no attribute of a primary key can be null (don't forget another constraint) - Primary key can uniquely identify a tuple - You can enter null into GPA and Name tuples (because these aren't primary keys) but not SSN tuple

Perform simple SQL Queries Add filter conditions using WHERE Pattern Matching

SQL has two special pattern matching symbols: %: sequence of zero or more characters _: any single character LIKE '%Glasgow%' means a sequence of characters of any length containing 'Glasgow'. VERY USEFUL and PRACTICAL. E.g. forget one's last name but remember it starts with Y LIKE '?

Limitations of file-based approach (5)

Separation & isolation of data, no communication, inconsistency. - Each program maintains its own data set. - Users of one program may be unaware of potentially useful data held by other programs. (no communication, inconsistency, hard to analyze - e.g. show houses that have lower rent than client's max) Duplication of data - wasted space, different formats. - Same data is held by different programs. Data dependence - file structure changes mean program change (will mess it up if you switch programs from Oracle to AWS with file based) - File structure is defined in the program code. File structure changes » Program changes (merchant case) - Data-application(or program) independence in DB approach (benefit of DB approach) - Separate data applications that may not be the same or compatible with one another

Perform simple SQL Queries Add filter conditions using WHERE WHERE Filters rows (rows in new table -- horizontal)

Several approaches to filter rows Comparison Range Set Membership Pattern Match NULL Can be combined with "AND" or "OR"

What is a Database?

Shared collection of Logically related data A description of this data Designed to meet information needs of an organization System catalog (metadata) provides data description Achieve program-data independence --> Metadata → data of the original data (a set of data that describes and gives information about other data) Data modeling approach (how to logically relate data?) Identify logically related data which comprises entities, attributes, and relationships between entities Entity-Relationship model (Conceptual) Relational model (Logical)

Candidate Key

Super key (uniqueness) - A refinement of superkey - Should not contain any redundant attribute - Can't reduce it any more (simplest form of superkey) No proper subset is a super key for the relation (irreducibility) - Cannot use a proper subset to uniquely identify a record in the table May have more than one attribute - composite key - Composite Key = more than one primary key A relation may have multiple candidate keys Example: Is UID + Degree a candidate key? No → not concise enough yet (you can use the UID alone and make that the candidate key) If you cannot further reduce it then it's a candidate key In this case we can still remove the degree part and use UID alone If a superkey has only one attribute, is it automatically a candidate key? YES A super key can have multiple candidate keys UID and SSN can both be candidate keys but which one is better? → UID From a math perspective both columns can satisfy the property But from a privacy perspective all of us want UID

Physical Data Independence

Why do we need the three levels? - Too many complexities in this database approach - Want each type of external user to be separate - Each layer does not need to worry about specific details in the next layer (specific users don't worry about the data structure) - A data scientist only cares about the conceptual level Should not need to know changes in the lower levels - Any change in the lower level shouldn't affect the upper levels either! Why do we want the three levels to be independent? Should not need to know changes in the lower levels Physical data independence - Refers to immunity of conceptual schema to changes in the internal schema - Internal schema changes --> e.g. using different file organizations, storage structures/devices --> should not require change to conceptual - Change lower level (internal or physical schema) without affecting the upper level (conceptual schema) - Supported by the CONCEPTUAL/INTERNAL mapping Physical data dependence is achieved by conceptual / internal mapping Logical data independence - Refers to immunity of external schemas to changes in conceptual schema - Conceptual schema changes --> e.g. addition/removal of entities --> should not require changes to external schema or rewrites of application programs - Change the lower level (conceptual schema) without affecting the upper level (external schema) - Supported by the EXTERNAL/CONCEPTUAL mapping Logical data independence is achieved by external / conceptual mapping

What if you want to delete records in downstream table instead? (delete SG14 and remove B0003 from Staff table)

Will that affect the B0003 in the upstream table? (Branch table) No! Removal of record in downstream table won't affect upstream table Removal of branchNo in upstream table will have a much larger effect on the downstream table! → you must set BranchNo records in Staff downstream table to null or delete them before you can B0003 in upstream Start building upstream table before downstream table so that you can embed the primary key in the upstream table as a foreign key in the downstream table

Perform simple SQL Queries Add filter conditions using WHERE WHERE NULL Search Condition List details of all viewings on property PG4 where a comment has not been supplied

also try: SELECT clientNo, viewDate FROM Viewing WHERE propertyNo = 'PG4' AND comment = ' ';

user defined words

are made up by user and represent names of various database objects such as relations, columns, views

Count(*) How many unique properties have been viewed?

count(*) counts all rows of a table, regardless of whether NULLs or duplicate values occur SELECT COUNT(Salary) will miss null salary values SELECT COUNT(*) will capture everything! SELECT COUNT(StaffNo)will still capture everything because StaffNo as primary key CAN'T have any null values Can use DISTINCT before column name to eliminate duplicates DISTINCT has no effect with MIN/MAX but may have with SUM/AVG/COUNT How many unique properties have been viewed? SELECT COUNT(Distinct propertyNo) FROM Viewing

outer join

does include unmatched rows SELECT c.clientNo, v.clientNo, c.fName, c.lName, v.propertyNo, v.comment FROM Client c RIGHT JOIN Viewing v ON c.clientNo = v.clientNo ;

Answer

• [R1] When an employee is fired, then all information about the dependents of that employee are deleted from the database. • [R2] When an employee is fired, then all information about the employee working on a project can be deleted from the database. • [R3] When a project is deleted, then all information about employees working on that project can be deleted. Caution: This does not mean that we delete employee information altogether! • [R4] Employee information cannot be removed from the database if the employee is the manager of a department. In this case, the department needs to be assigned a new manager before the employee is deleted. • [R5] An employee who is supervising another employee should not be deleted. • [R6] A department cannot be deleted if there are current projects in the department. • [R7] When a department is deleted all employees in that department are deleted. Ask yourself: What is Changed? - PK; What is affected? --FK


Ensembles d'études connexes

Principles of Management - Chapter 14

View Set

brain teasers, BrainTeasers, Brain Teasers, Brain Teasers - Interview Questions

View Set

Chapter 12: Part 121, Subpart T - Flight Operations

View Set

American Imperialism and Spanish-American War

View Set

Combo with "Philosophy in context" and 27 others

View Set

Nursing Management: Patients With Renal Disorders

View Set

pharm week 5 textbook NCLEX questions

View Set