INFO 330
Fan traps
Where a model represents a relationship between entity types, but pathway between certain entity occurrences is ambiguous.
Chasm Trap
Where a model suggests the existence of a relationship between entity types, but the pathway does not exist between certain entity occurrences.
Aggregates SQL
With the exception of the COUNT aggregate function, all other aggregate functions ignore NULL values. Aggregate functions are frequently used with the GROUP BY clause of the SELECT statement.
Schema
a concept or framework that organizes and interprets information system view
Native Drivers
a library provided by the database manufacturer to link it into the application
A functional dependency (A->B) occurs when the attribute A uniquely determines B
a means of expressing that the value of one particular attribute is associated with, or determines, a specific single value of another attribute. B can be linked to multiple A's but each unique A is ALWAYS linked to a certain B value
DBMS (Database Management System)
a product used for the storage and organization of data that typically has defined formats and structures Provides facility to define the database via a Data Definition Language --DDL Provides facility to store, retrieve, & manipulate data through Data Manipulation Language - SQL (Structured Query Language) is the standard - Note SQL & MySQL are 2 different things
subschema (user view)
a subset of the schema; the way the user defines the data and the data relationships
Application
UI Business Logic
Candidate Key
Uniquely identifies Irreducibility - can't make it any smaller & still work
Self Joins
Used to link a table to itself Requires the use of table aliases Requires the use of a column qualifier
CREATE TABLE
valid SQL command for constructing a table
cascade
when rows deleted in Master table, deletes rows in child tables. Rows updated in Master update foreign keys in child tables
LDAP Continued
•Standards based (RFC 4510) •TCP based•Built in replication •Access control •Tree of directory entries •Object Oriented •Entries are a set of attributes •Extendable schema •Multi-value attributes•Unique identifier for each entry called a distinguished name (DN) Lecture 1 slide 33
entity integrity rule
A rule that states that no primary key attribute (or component of a primary key attribute) may be null.
safe expressions
A safe expression yields a finite number of tuples as its result. Otherwise, it is called unsafe.
Database Definition
A shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization Databases are "Self Describing"
Ternary Relationship
A simultaneous relationship among the instances of three entity types.
Top-Down Design
A systematic approach in which a problem is broken into a series of high-level tasks.
Relational Calculus
- lets users describe what they want, rather than how to compute it (non-operational, declarative)
GUI
A Graphical User Interface, currently the dominant method for designing human-computer interaction.
ORDER BY
A SQL clause that is useful for ordering the output of a SELECT query (for example, in ascending or descending order).
DROP TABLE
A SQL command used to delete database objects such as tables, views, indexes, and users.
OLAP (online analytical processing)
A category of software tools that provide analysis of data stored in a database and is often used in data mining
having
A clause applied to the output of a GROUP BY operation to restrict selected rows. But what if we want to return results based upon a GROUP BY? Enter the HAVING clause.
LDAP (Lightweight Directory Access Protocol)
A communications protocol that defines how a client can access information, perform operations, and share directory data on a server.• Most popular way of accessing directories. Microsoft's Active Directory is an example .•Similar to SQL, in it defines a way to query data, but it also defines the protocol.
database server
A computer in a client/server environment that is responsible for running a DBMS to process SQL statements and perform database management tasks Business Logic Data Management Lecture 3
transitive dependency
A condition in which an attribute is dependent on another attribute that is not part of the primary key.
full functional dependence
A condition in which an attribute is functionally dependent on a composite key but not on any subset of the key.
Bottom-up Design
A design philosophy that begins by identifying individual design components and then aggregates them into larger units. In database design, the process begins by defining attributes and then groups them into entities. Compare to top-down design.
Domain
A domain is the set of allowable values for one or more attributes. the set of allowable values for one or more attributes. Domains may be distinct for each attribute. It allows the user to define in a central place the meaning of values that attributes can hold. Specifies the constraints placed upon an attribute or attributes.
Primary Key
A field (or group of fields) that uniquely identifies a given entity in a table The candidate key that you choose to use -Surrogate Key
ER Diagram
A graphical representation of an entity-relationship model. Multiple standards: Chen notation, Crow's foot, UML
Cross Join
A join in which each row from one table is combined with each row from another table.
outer join
A join in which rows that do not have matching values in common columns are nevertheless included in the result table. Outer joins are used to return all rows from one table regardless of a match in the other table. If no match is found in the other table, a NULL is returned. Three types: LEFT, RIGHT, FULL
XML
A language used to store files as simple text along with information on how the application software would interpret that text XML tries to solve a few problems with traditional file stores such as -Data dependence-Incompatibility •But XML still has problems -Duplication of data -Exchangeability comes at a price in terms of performance and disk space
identifier
A logical construct so we can reference an object
JSON (JavaScript Object Notation)
A markup language (such as html) used for transmitting documents. Contains little metadata and is preferred for transmitting volumes of data between servers and browsers. While the notation is the format of JavaScript objects, JSON documents can be processed by any language.
Serializability
A property in which the selected order of transaction operations creates the same final database state that would have been produced if the transactions had been executed in a serial fashion. Serializability is the classical concurrency scheme. It ensures that a schedule for executing concurrent transactions is equivalent to one that executes the transactions serially in some order. It assumes that all accesses to the database are done using read and write operations.
Scalar Subquery
A query that returns exactly one value: a single row, with a single column. A scalar subquery expression is a subquery that returns exactly one column value from one row. The value of the scalar subquery expression is the value of the select list item of the subquery. can be multiple rows
directories
Also referred to as folders; they are used for organizing and storing computer files. •A hierarchical database consisting of entities. •Focus high performance reading and searching •Designed for high availability •Entities have updatable schemas.
CRUD
An acronym used in database development that stands for the Create, Read, Update, and Delete database operations. How we manipulate the data. Called the Data Manipulation Language (DML).
Foreign Key
An attribute (or set) in a table that matches the candidate key (primary key of another table) of another table. (Can be the same table).
Superkey
Any attribute set that uniquely identifies
CODASYL DBMS
Conference on Data Systems Languages, worked to define COBOL and a data model First standard database definition published in 1971 Navigation database procedural querying
DBMS Advantages
Control (though not elimination) of redundancy Consistency - reduce points of update/modification, caused by 'same' data existing in multiple places Data Sharing Integrity - use of constraints Security - users, roles, privileges Standardization forced Concurrency Backup & Recovery
Vertical Mapping
Create a table for each subclass; reference back to "base" table via a foreign key lecture 11 slide 19
ODBC (Open Database Connectivity)
Interacts with a database via a database driver Requires an ODBC driver for the DBMS
Uniqueness
It exists, and it is this unique item
existence
It exists, but does not tell us anything else. Conceptual Construct
OLTP (online transaction processing)
Database design used for browsing and manipulating business transaction activity
Restrict
Don't allow actions if it will violate a referential constraint
grouping SQL
GROUP BY clause is added to the end of the SELECT statement to group results after aggregate functions
Referential Integrity
If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or the foreign key value must be wholly null If a foreign key exists in a relation, it must 'point' to something -- ie there must be a candidate value in the home relation
Set Null
If update occurs, set child foreign keys to null
EXISTS
In SQL, a comparison operator that checks whether a subquery returns any rows. The EXISTS operator is used to test for the existence of any record in a subquery. The EXISTS operator returns true if the subquery returns one or more records.
tuple
In the relational model, a table row.
Encapsulation
Keeping details (like data and procedures) together in one part of a program so that programmers working on other parts of the program don't need to know about them.
Conceptual Design
Look at the high level Picture. Not thinking about DBMS yet -Identify entities -Identify Relationships -Identify Attributes -Determine Candidate, primary, alternative keys -Check for redundancy -Validate
Data Modeling
Minimal design usually better Think about the data first, application second
Inner Join
Most common type of join; includes rows in the query only when the joined field matches records in both tables. There are actually two ways to express an Inner. An inner join is done by taking the Cartesian product of the two tables, then only returning the rows that match the conditional.
relational database
Normalized collection of distinctly named relations.
Cardinality
Number of tuples in a relation.
OOP
Object Oriented Programming
ORM
Object Relational Mapper Converting data between incompatible system types You set up a basic form (like a blue print) that you would only change the value that you need to change, and it will simplify the process of adding and adjusting data
IMS (Information Management System)
Possibly the oldest DBMS still in use. Hierarchical database. Built for the Apollo space program in the 1960s. Still widely used in 2014 by Fortune 500 companies with over 50B daily transactions integrates all of an organization's systems and processes in to one complete framework, enabling an organization to work as a single unit with unified objectives. Organizations often focus on management systems individually, often in silos and sometimes even in conflict.
Data Abstraction
Representing or storing information with methods that separate layers of concerns so that the programmer can work with information while ignoring lower-level details about how the information is represented. We can separate the data from the application. Similar to object-orientated classes. Change the internals as long as the external behavior doesn't change Lecture 1 slide 24
N-Tier Systems
Seeing the trend towards a larger number of tiers. •Service Oriented Architectures
Layering
Separating components of a design individually or into groups in order to make specific adjustments
Representing Strings
Strings are wrapped in single quotes example: WHERE lastName='Smith'
ALTER TABLE
The SQL command used to make changes to table structure. When the command is followed by a keyword (ADD or MODIFY), it adds a column or changes column characteristics.
Logical Data Independence
The capacity to change the conceptual schema without having to change the external schemas and their associated application programs. Textbook pg 39: Logical data independence refers to the immunity of the external schemas to changes in the conceptual schema. The ability to change the logical (conceptual) schema without changing the External schema (User View) is called logical data independence. For example, the addition or removal of new entities, attributes, or relationships to the conceptual schema sternal schemas or having to rewrite existing application programs.
Physical Data Independence
The capacity to change the internal schema without having to change the conceptual schema. Textbook pg 39: Physical data independence refers to the immunity of the conceptual schema to changes in the internal schema. The ability to change the physical schema without changing the logical schema is called physical data independence. For example, a change to the internal schema, such as using different file organization or storage structures, storage devices, or indexing strategy, should be possible without having to change the conceptual or external schemas.
Cartesian Product
The combination of all rows in the first table and all rows in the second table a set of pairs (x, y) of elements from two sets X and Y.
Caching
The local storage of frequently needed files that would otherwise be obtained from an external source.
Degree
The number of attributes in a relation.
File Systems
The overall structure of an operating system, in which files are named, organized, and stored. FAT and NTFS are types of file systems. Most common approach to storing data is in discrete files Limitations of Files: Separation and isolation of data •Duplication of data •Data Dependence •Incompatible file formats •Concurrency issues •Fixed queries / proliferation of apps •Two main problems emerge:-Data definition is embedded in application-No control over access and data manipulation. Lecture 1
normalization
The process of applying rules to a database design to ensure that information is divided into the appropriate tables.
Entity Integrity
The property of a relational table that guarantees each entity has a unique value in a primary key and that the key has no null values.
Three Tier Systems
There is a trend towards even further abstraction from the database by putting in middle-tier. Web is a prime example: UI -> Web Server -> Database Server
T-SQL
Transact-SQL- is a sophisticated query language with additional features beyond what is defined in the ANSI SQL. specific to microsoft sql server
attributes
columns
index
consist of one column, multiple columns, and come in different varieties (index, unique, etc) Just like a book index
Horizontal Mapping
create a table for each subclass, but each table contains all attributes
Filter mapping
create one table, add a type filed, and add all fields; if the field isn't applicable for the type, make it NULL
Requirements Analysis
developers identify the particular features and functions of the new system Understand the needs, meet with potential stakeholders, understand diff views of the problem
ANSI-SPARC Three-Level Architecture
external level conceptual level internal level page 87 of textbook
referential integrity rule
foreign keys which link rows in one table to rows in another table must have values that correspond to the value of a primary key in another table
set default
if update occurs, set child foreign keys to default
INSERT (SQL Command)
inserts new data into a database
Business logic
part of the program that encodes the real-world business rules that determine how data can be created, stored, and changed.
partition sql
partitioning is a way to divide a large table into smaller, more manageable parts without having to create separate tables for each part. Data in a partitioned table is physically stored in groups of rows called partitions and each partition can be accessed and maintained separately
Client/server systems
the server holds the data and runs the program, the client computer interfaces with the user
pi
projection
Lazy read SQL
references to the rows rather than the values in the rows
integrity constraints
rules that help ensure the quality of information
sigma
select
relational
tables
