DSCI 351 midterm 1

¡Supera tus tareas y exámenes ahora con Quizwiz!

ER vs relational model

both used to model data ER has multiple concepts (entities, relationships, is-a, etc.), well-suited for capturing application requirements but not well suited for computer implementation relational model: single concept is relation, world is represented with a collection of tables, well-suited for efficient manipulations on computers

declaring multi attribute keys

can also be another element in the list of elements of a create table statement form is essential if the key consists of more than one attribute

Multirelation queories

can be addressed by listing all relations in the FROM clause, with <relation>.<attribute> syntax

char and varchar in mysql

char(n): - right-pad with spaces to store exactly n characters - trailing spaces are removed when retrieved varchar(n) - does not pad with spaces - store length as prefix (one byte if length < 255, otherwise 2)

E/R approach

create one relation for each subclass, with only the key attribute(s) and attributes attached to that entity set

use nulls

create one relation; entities have null in attributes that don't belong to them

defining a database schema

database schema comprises declarations for the relations of the database many other kinds of elements may also appear in database schema, including views, indices and triggers

schemas

describes structure of data Schema of a relation: relation name plus attribute names schema of a database: a set of relational schemas

explicit tuple-variables

distinguish copies by following the relation name by the name of a tuple-variable in the FROM clause

object-oriented approach

each entity belongs to exactly one class; create a relation for each possible subtree including the root, with all its attributes

attributes

each has an atomic domain: string, integer, reals etc.

keys

every entity set needs at least one, can be more than one key per set

referential integrity constraint

exactly one value exists in a given role attribute has a non-null single value explicitly requires a reference to exist

controlling duplicate elimination

force result to be a set by SELECT DISTINCT: may distinct multiple attributes Force the result to be a bag (i.e., don't eliminate duplicates) by ALL, as in ...UNION ALL

introduced by exists or not exists

form boolean expression exists (subquery) evaluates to true if subquery has at least one result not exists (subquery) evaluates to true if subquery has no results

dates and times

form of date: yyyy-mm-dd datetime: yyyy-mm-dd hh:mm:ss

relational algebra

formalism for creating new relations from existing ones using relational operators

importance of contraints

give semantics to data references to entities enable efficient storage and lookup

having clauses

having <condition> may follow a group by clause if this is the case, condition applies to each group, and groups not satisfying the condition are eliminated

benefits of sql

high-level language, programmer able to avoid specifying a lot of data manipulation queries are optimized quite well, making it efficient

mathematical relationship

if A, B are sets, then a relationship R is a subset of A x B

restrictions on SELECT lists with aggregation

if any aggregation is used, then each element of the select list must either be aggregated or an attribute on the group by list

insertion

insert single tuple: INSERT INTO <relation> VALUES (<list of values>)

subquery in where clause

introduced by = or !=: - x = subquery - x can be attribute or tuple of attributes - needs to return exactly one result introduced by in or not in: - x in (subquery), may return multiple results

common constraints

keys, single-value constraints, referential integrity constraints, domain constraints, general constraints

specifying attributes in INSERT

may add to relation name a list of attributes reasons to do so: - forget standard order of attributes for relation - don't have values for all attributes and want the system to fill in missing components with null or default value

adding attributes

may change a relation schema by adding a new attribute (column) by: ALTER TABLE <name> ADD <attribute declaration>;

requirements on Having conditions

may refer to any relation or tuple-variable in the from clause can refer to attributes of those relations if attribute is either a grouping attribute or aggregated

Null values

missing value or inapplicable when value is compared with null, truth value = UNKNOWN

ER relationships

modeled as mathematical set binary and multiway relationships converting a multiway one into many binary ones constraints on the degree/multiplicity of the relationship (many-one, one-one, many-many)

operational semantics

one tuple-variable for each relation in FROM clause if tuple-variables are pointing to tuples that satisfy the WHERE clause, send tuples to SELECT clause

order by

orders by desc or ascending, can set limit for number of returned values offset x - offset of first down to be returned (starts from 0)

entities

real-world objects distinguishable from other objects, described using a set of attributes

grouping

relation that results from select-from-where is group according to values of all those attributes, and any aggregation is applied only within each group

two mathematical definitions of relations

relations as subset of cartesian product (tuple = element of string x int x string x string, relation = subset of string x int x string x string), order is important relation as a set of functions: fix set of attributes (A tuple = function t: A-> attribute domains, relation = a set of tuples/functions), order is not important

deleting attributes

remove an attribute from a relation schema by: ALTER TABLE <name> DROP <attribute>;

right outer join

retain dangling tuples from right relation

schema and instances as programming

schema = type/class instance = value/instance database schema = stable over long periods of time database instance = changes constantly as data is changed

Instances

schema instance = data relational schema = instance = relation with a collection of tuples Database schema = instance = n relations, of types R1, R2, ..., Rn

Subquery in FROM Clause

select * from (select * from <relation> tupe variable needed to name the relation generated by the subquery

Bag Semantics

selection: preserve number of occurrences projection: preserve number of occurrences (no duplicate elimination) cartesian product, join: no duplicate elimination

declaring a relation

simplest form: CREATE TABLE <name> (<list of elements>); remove a relation from database with DROP TABLE <name>;

subclasses in ER diagrams

special cases, fewer entities, more properties

formal semantics

start with product of all relations in the from clause apply selection condition from the where clause project onto list of attributes and expressions in the SELECT clause

difference (except)

subtract number of occurrences {a,b,b,b,c,c} - {b,c,c,c,d} = {a,b,b}

inserting many tuples

use INSERT INTO <relation> (<subquery>);

SQL joins

use bag semantics can be used in place of relations in a FROM clause

weak entity sets

when some or all of their key attributes come from other entity sets

intersection

{a,b,b,b,c,c} (upside down U) {b,b,c,c,c,c,d} = {b,b,c,c} minimum of the two numbers of occurrences

SQL patterns

<attribute> LIKE/NOT LIKE <pattern>, used with "%<string>%'

comparison operators

=, !=, <, >, <=, >=, <>

entity set

A collection of like entities.

full outer join

A join in which all rows from both tables will be included regardless of whether they match rows from the other table

left outer join

A join that includes all of the rows from the first table in the query and only those records from the second table that match the join field in the first table

delete all tuples

DELETE FROM <relation> no where clause needed

deletion

DELETE FROM <relation> WHERE <condition>

eliminate duplicates in an aggregation

DISTINCT inside aggregation causes duplicates to be eliminated before the aggregation

Domains in a relational model

Each attribute has a type, must be atomic

other declarations for attributes

NOT NULL means that the value for this attribute may never be null DEFAULT <value> says that if there is no specific value known for this attribute's component in some tuple, use the stated <value>

nulls ignored in aggregation

NULL never contributes to a sum, average, or count of a specific column (e.g., count(price)), and can never be the minimum or maximum of a column (unless the column has only null values) if there are no non-Null values in a column, then the result of aggregation is NULL

declaring single-attribute keys

Place PRIMARY KEY or UNIQUE after the type in the declaration of the attribute

theta join

R JOIN S ON <condition> is a theta-join, using <condition> for selection gives us all (n, a, p, d, b) quadruples such that drinker n (same as d) lives at address a, has phone p, and frequents bar b.

aggregations

SUM, AVG, MIN, MAX, COUNT can be applied to a column in a SELECT statement to produce that aggregation on the column

primary key vs unique

There can be only one PRIMARY KEY for a relation, but several UNIQUE attributes No attribute of a PRIMARY KEY can ever be NULL in any tuple. But attributes declared UNIQUE may have NULL's, and there may be several tuples with NULL.

Three-valued logic

True = 1; false = 0; unknown = .5 AND = min, OR = max, NOT(x) = 1-x if x = Null then 4*(3-x)/7 is still NULL

update

UPDATE <relation> SET <list of attribute assignments> WHERE <condition on tuples>;

natural outer join

add natural before left/right

Union (all)

add number of occurrences {a,b,c,c} U {a,a,b,b,c} = {a,a,a,b,b,b,c,c,c}

constraint

an assertion about the data in the database that must be true at all times part of database schema very important to design to ensure integrity

declaring keys

an attribute or list of attributes may be declared with PRIMARY KEY or UNIQUE These each say the attribute(s) so declared functionally determine all the attributes of the relation schema

single value constraint

an entity has at most one value for a given attribute or relationship a many-one relationship also implies a single value constraint

additional MySQL operators

between x and y: [x,y] A in (x,y,z): A = x or A = y or A = z


Conjuntos de estudio relacionados

ATI physical assessment and Nursing process (Test 2)

View Set

INFO 360 Exam 1 Review - Ch. 1,2,4,5

View Set

Electromagnetic Spectrum and Speed of Light

View Set

International Business Law Chapter 5, Chapter 9 International Environment of Business Management, chapter 6 international business environment, BLAW 497 Exam 2 MC

View Set