Database Design Concepts
Full Functional Dependence
ALL of the determinant attributes are necessary to determine the dependent attributes. The removal of one of the determinants causes the dependency to no longer exist. (STU_NUM → STU_GPA)
First Normal Form (1NF)
Attributes have no more than one value for each single instance of that entity, and identify a PK (singular or composite)
Three common types of entity relationship diagrams
Chen, Crow's Foot, UML
Partial Functional Dependence
SOME determinant attributes are redundant and non necessary to determine the dependent attributes (STU_NUM, STU_LNAME) → STU_GPA
Field
a character or group of characters that has a specific meaning
Attribute
a characteristic of an entity
File
a collection of related records
Database vs database system vs database management system
a database system is an organization of components that defines and regulates the collection, storage, management, use of data (a concept), while a DBMS is a software that acts as an intermediary between the user and the database
Record
a logically connected set of one or more fields that describes a person, place, or thing (e.g. a customer record)
Candidate key
a super key without unnecessary attributes
Data anomalies (3 kinds)
abnormalities that develop when all changes in redundant data are not made correctly. 1) update/modification anomalies, 2) insertion anomalies, 3) deletion anomalies
Data dependence
access to a file is dependent on data storage characteristics. Even changes in the characteristics of data, such as changing a field from integer to decimal, require changes in all the programs that access the file. Because all data access programs are subject to change when any of the file's data storage characteristics change (that is, changing the data type), the file system is said to exhibit data dependence. Conversely, data independence exists when it is possible to make changes in the data storage characteristics without affecting the application program's ability to access the data.
Structural dependence
access to a file is dependent on its own structure. For example, adding a customer date-of-birth field to the CUSTOMER, Given this change, none of the previous programs will work with the new CUSTOMER file structure. Therefore, all of the file system programs must be modified to conform to the new file structure. In short, because the file system application programs are affected by change in the file structure, they exhibit structural dependence. Conversely, structural independence exists when it is possible to make changes in the file structure without affecting the application program's ability to access the data.
Foreign key
an attribute whose values match primary key values in the related table
UNF (unnormalized form)
an entity with attributes that contain multiple values for a single instance of that entity
Super key
any key that uniquely identifies each row
Subtype discriminator
attribute included in supertype entity to structure specialization hierarchy (determines to which subtype each supertype ocurrance is related). For disjoint/nonoverlapping subtypes, only one discriminator is required (eg PERSON_TYPE). For overlapping subtypes, several binary attributes are required (eg PERSON_IS_DIRECTOR, PERSON_IS_ACTOR)
Relationship
bi-directional association among entities (one to many, one to one, many to many)
Business rules
brief, precise, unambiguous descriptions of policies, procedures, or principles within a specific organization
Primary key
chosen key among all candidate keys (cannot contain nulls!)
Composite key
composed of more than one attribute
Entity integrity
condition in which each row (entity instance) in the table has its own unique identity (PK). 1) all values of PK must be unique, 2) no PK field can be null
Referential integrity
condition in which every reference to an entity instance by another entity instance is valid. 1) every FK entry must be a valid PK value of the related table, 2) OR FK is null
Dependency diagram
depicts all dependencies found within a given table structure (arrows above the attributes indicate fully functional dependencies, arrows below the attributes indicate partial and transitive dependencies)
Connectivity
describes the relationship classification (1:1, 1:M, M:N). Determined by the maximum cardinality (ie the right-hand numbers in UML)
Participation
determined by the minimum cardinality. It informs whether an entity occurrence is required or not in a relationship. (mandatory - 1..*, optional - 0..*)
Data inconsistency
different and conflicting versions of same data occur at different places
Disjoint/nonoverlapping subtypes
each entity instance of the supertype can appear in only one of the subtypes
Overlapping subtype
each entity instance of the supertype may appear in more than one of the subtypes
Second Normal Form (2NF)
eliminates partial dependencies (may include transitive dependencies)
Third Normal Form (3NF)
eliminates transitive dependencies
Strong (identifying) relationship
exists if PK of related entity (child) contains a PK component of parent entity. At least one part of the child's PK is inherited from one parent entity
Entity supertype
generic entity type related to one or more entity subtypes (contains attributes common to all subtypes
Entity relationship modeling
graphical model of the structure of the database
Physical design
hardware dependent
Associative entity
has a composite primary key constituted by the primary keys of tables to be linked
Conceptual design
independent of software and hardware, identification of entities and business rules
Entity subtype
inherits all attributes and relationships from supertype. Contains unique attributes specific only to this subtype.
Systems development life cycle
iterative pattern that traces the history (creation, evolution, and replacement) of information systems
Secondary key
key used strictly for data retrieval purposes (eg STU_LNAME + STU_FNAME)
Cardinality
minimum and maximum number of entity instances associated with one occurrence of related entity (established by business rules!), uses (min...max) notation in UML
Entity
object in the real world (person, place, thing about which data are collected and stored
Normalization
process of evaluating and correcting table structures to minimize data redundancies
Data
raw facts
Constraint
restriction placed on the data
Data redundancy
same data stored unnecessarily in different places (i.e. islands of information). Data stored in different, independent locations is unlikely to be updated consistently and coherently.
Composite vs single attributes
single attribute cannot be divided (eg age), composite attribute can be subdivided (eg address). Composite attributes should always be split into single attributes when possible
Logical design
software dependent but hardware independent; map conceptual design to specific data model (ie tables, attributes, relationships, and constraints)
Multi-value attributes
something that can have many different values (eg EMP_SKILL). Creates a many to many relationship which needs to be resolved by 1) creating several new attributes (eg CAR_COLOUR_INSIDE, CAR_COLOUR_OUTSIDE), or 2) creating a new entity composed of original multi-valued attributes' components
Weak entity
the child entity, existence is dependent on the existence of the parent entity
Information
the result of processing raw data to reveal meaning
Determination
the state in which knowing the value of one attribute in a table makes it possible to determine the value of another attribute in the same table (Determinant → Dependent)
Relational algebra
use of relational algebra operators on existing relations produces new relations (data manipulation)
Keys
used by RDMS to establish relationships among tables and ensure data integrity
Derived/calculated attributes
value may be calculated from other attributes (not physically stored in database, calculated on demand)
Systems development life cycle vs database life cycle
SDLC traces history (life cycle) of an application within an information system, DBLC describes history of database within the information system
Weak (non-identifying) relationship
Exists if PK of related entity (child) does not contain any PK component from parent entity. Participating entities have their own independent primary keys that are not shared.
Transactional entity
Has its own independent primary key. The primary keys of tables to be linked are foreign keys in this entity.
Transitive Dependency
Indicated by a functional dependency in which the determinant is a non-prime attribute (not PK or candidate key) and dependent is also a non-prime attribute.