Database Concepts
entity
Something of importance to a user that needs to be represented in a database. In the entity-relationship model, entities are restricted to things that can be represented by a single table
LEFT JOIN
The LEFT JOIN keyword returns all rows from the left table (table1), with the matching rows in the right table (table2). The result is NULL in the right side when there is no match.
RIGHT JOIN
The RIGHT JOIN keyword returns all records from the right table (table2), and the matched records from the left table (table1). The result is NULL from the left side, when there is no match.
data integrity
The result of implementing domain integrity, entity integrity, and referential integrity in a database.
transitive dependency
There is no non-key attribute determined by another non-key attribute. In a relation having at least three attributes, such as R (A, B, C), the situation in which A determines B and B determines C, but B does not determine A.
SQL UPDATE
UPDATE EMPLOYEE SET Department = 'Finance', OfficePhone = '360-285-8420' WHERE EmployeeNumber = 9;
SQL code to find any values that would violate referential integrity?
Update, inserting, and deleting anomalies. Null values, not setting a foreign key
Why would you not want to represent multivalued columns as separate columns?
You wouldn't want to make a separate column for an instance of an undefined amount of objects such as number of children
Primary Key
a candidate key selected to be the key of a relation
Partial Dependency
nonprime attribute is functionally dependent on part of a candidate key. (A nonprime attribute is an attribute that's not part of any candidate key.)
Attribute
(1) A value that represents a characteristic of an entity. (2) A column of a relation
Characteristic of relations
(1) For a relation to be considered well-formed, every determinant must be a candidate key. (2) Any relation that is not well-formed should be broken into two or more relations that ate well formed.
Data Model
(1) a model of the users' data requirements usually expressed in terms of the entity-relationship model. It is sometimes called a users' data model. (2) A language for describing the structure and processing of a database.
why do you want data for a read-only database to be one large table instead of a database in 3NF?
A 3NF database won't be needed if the data will be stored as a copy of previous data that won't be updated or kept to date
Multivalued Dependency
A condition in a relation with three or more attributes in which independent attributes appear to have relationships they do not have.
Concurrency
A condition in which two or more transactions are processed against the database at the same time. In a single CPU system, the changes are interleaved; in a multi-CPU system, the transactions may be processed simultaneously, and the changes on the database server are interleaved.
Relational Database
A database that consists of relations. In practice, relational database contain relations with duplicate rows. Most DBMS products include a feature that removes duplicate rows when necessary and appropriate. Such removal is not done as a matter of course because it can be time-consuming and expensive.
Composite Key
A key of a relation that consists of two or more columns.
Foreign Key
A key that uniquely identifies a row in another table or the same table
Structured Query Language (SQL)
A language for defining the structure and processing of a relational database. It can be used as a stand-alone query language, or it can be embedded in application programs. SQL was developed by IBM and is accepted as a national standard by the American National Standards Institute.
Database Administrator (DBA)
A person or group responsible for establishing policies and procedures to control and protect a database. They work within guidelines set by data administration to control the database structure, manage data changes, and maintain DBMS programs
Cascade Delete
A property of a relationship that indicates that when a parent row is deleted, related child rows should be deleted as well.
Cascade Update
A referential integrity action specifying that when a parent row is updated, the foreign keys of related child rows should be updated as well.
Second Normal Form (2NF)
A relation in first normal form in which all non-key attributes are fully dependent on all the keys.
Third Normal Form (3NF)
A relation in second normal form that has no transitive dependencies.
INNER JOIN, ON
A relational algebra operation on two relations, A and B, that produces a third relation, C. A row of A is concatenated with a row of B to form a new row in C if the rows in A and B meet restrictions concerning their values.
Functional Dependency
A relationship between attributes in which one attribute or group of attributes determines the value of another. The expressions X --> Y, "X determines Y", and "Y is functionally dependent on X" mean that given a value of X, we can determine the value of Y.
Referential Integrity Constraint
A relationship constraint on foreign key values. A referential integrity constraint specifies that the values of a foreign key must be a subset of the values of the primary key to which it refers.
Database Management System (DBMS)
A set of programs used to define, administer, and process a database and its applications.
Data Warehouse
A store of enterprise data that is designed to facilitate management decision making. A data warehouse includes not only data but also metadata, tools, procedures, training, personal information, and other resources that make access to the data easier and more relevant to decision makers.
Relation
A two-dimensional array that contains single-value entries and no duplicate rows. The meaning of the columns is the same in every row. The order of the rows and columns is immaterial.
Surrogate Key
A unique, system-supplied identifier used as the primary key of a relation. The values of a surrogate key have no meaning to the users and usually, are hidden on forms and reports.
Advantages and disadvantages of normalization?
Advantages of normalization: Updates run quickly due to no data being duplicated in multiple locations. Inserts run quickly since there is only a single insertion point for a piece of data and no duplication is required. Tables are typically smaller that the tables found in non-normalized databases. This usually allows the tables to fit into the buffer, thus offering faster performance. Data integrity and consistency is an absolute must if the database must be ACID compliant. A normalized database helps immensely with such an undertaking. Disadvantages of normalization: Since data is not duplicated, table joins are required. This makes queries more complicated, and thus read times are slower. Since joins are required, indexing does not work as efficiently. Again, this makes read times slower because the joins don't typically work well with indexing.
null value
An attribute value that has never been supplied. Such values are ambiguous and can mean the value is unknown, the value is not appropriate, or the value is known to be blank.
Update Anomaly
An update anomaly is a data inconsistency that results from data redundancy and a partial update
First Normal Form (1NF)
Any table that fits the definition of a relation.
Functions of a DBMS
Create databases Create tables Create supporting structures Read database data Modify database data (insert, update, and delete) Maintain database structures Enforce rules Control concurrency Provide security Perform backup and recovery
SQL DELETE
DELETE FROM Project WHERE Department = 'Sales and Marketing';
Metadata
Data concerning the structure of data in a database stored in the data dictionary. Metadata are used to describe tables, columns, constraints, indexes, and so forth.
what are read-only databases?
Database where its contents cannot be altered
Purpose of Normalization
Eliminate anomalies and organize data
How do you turn a 3NF table into one large table?
Form all data correctly then just create a new table.
Edgar Frank Codd
Former IBM Employee who created relational databases in 1970 made the process of Normaization
SQL INSERT
INSERT INTO PROJECT (Department, ProjectName, MaxHours, StartDate) VALUES ('Finance', '2018 Q4 Prorfolio Analysis', 140.00, '05-OCT-18');
Insertion Anomaly
In a relation, the condition that exists when, to add a complete row to a table, one must add facts about two or more logically different themes.
Deletion Anomaly
In a relation, the situation in which the removal of one row of a table deletes facts about two or more themes.
Purposes of indexes
Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records
How could you locate inconsistent data in a database?
Multiple copies of same data and changes are made only to those that are only single file and not somewhere else
Redundancy
the appearance of the same data factor in more than one field or table of data