Database Systems
File/Table Operations
1- create 2- populate (key in/import) 3- update (records)
Relation
A.k.a Table; is a matrix composed of intersecting rows and columns.
Natural Key
AKA natural identifier. A real-world, generally accepted identifier used to uniquely identify real-world objects. It is familiar to end users and forms part of their day-to-day business vocabulary. Most of these make acceptable PK identifiers. Example: SSN.
Logical Operators
AND/OR/NOT Used in conditional expressions
Database communication interfaces
Accept end-user requests via multiple, different network environments.
Manual File Systems
Accomplished through a system of file folders and filing cabinets
hardware
CPU, storage
Computerized File Systems
Data processing (DP) specialist: Created a computer based system that would track data and produce required reports
DEFAULT
Defines a default value for a column (when no value is given)
DELETE
Deletes one or more rows from a table
A simple attribute can be broken down into smaller pieces.
FALSE
What commands modify a SQL database
INSERT DELETE UPDATE
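A minimal sketch of the three data-modifying commands, run through Python's built-in sqlite3 module. The `product` table and its columns are hypothetical names chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (p_code TEXT PRIMARY KEY, p_descript TEXT, p_price REAL)"
)

# INSERT adds new rows to the table.
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("A1", "Hammer", 9.5), ("B2", "Saw", 14.0), ("C3", "Drill", 40.0)],
)
# UPDATE changes attribute values in the rows that match the condition.
conn.execute("UPDATE product SET p_price = 12.0 WHERE p_code = 'A1'")
# DELETE removes the rows that match the condition.
conn.execute("DELETE FROM product WHERE p_code = 'C3'")

dml_rows = conn.execute(
    "SELECT p_code, p_price FROM product ORDER BY p_code"
).fetchall()
print(dml_rows)
```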
data integrity
In a relational database, a condition in which the data in the database complies with all entity and referential integrity constraints
Unstructured data
Data that exist in their original (raw) state
Secondary key
One field or a combination of fields for which more than one record may have the same combination of values. Also called a nonunique key
Schema
The conceptual organization of the entire database as viewed by the DBA.
Types of Data Anomaly
Update Anomalies, Insertion Anomalies, Deletion Anomalies
Simple retrieval query in SQL
can consist of up to four clauses. Mandatory clauses are SELECT and FROM. Must be in the following order: SELECT <attribute list> FROM <table list> {WHERE <condition>} {ORDER BY <attribute list>};
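The four-clause form can be sketched as follows, assuming a hypothetical `employee` table; only SELECT and FROM are mandatory, and the optional clauses appear in the order the card gives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ng", "Research", 52000), ("Ortiz", "Sales", 48000), ("Adams", "Research", 61000)],
)

# SELECT <attribute list> FROM <table list> WHERE <condition> ORDER BY <attribute list>
retrieval_query = """
    SELECT emp_name, salary
    FROM employee
    WHERE dept = 'Research'
    ORDER BY salary DESC
"""
retrieval_rows = conn.execute(retrieval_query).fetchall()
print(retrieval_rows)
```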
Constraint name
must be unique.
Index
used to logically access rows in a table quickly
data inconsistency
Chinese: 資料不一致 (data inconsistency)
Steps for the normalization procedure:
1. Eliminate the repeating groups (present the data in a tabular format where each cell has a single value and there are no repeating groups. To eliminate repeating groups you eliminate the nulls). 2. Identify the Primary Key (make sure it is a PK that will uniquely identify ANY attribute value) 3. Identify all dependencies (know what is related and what determines what)
The Hierarchical Model
1960; represented by upside-down tree. Only 1:M relationships between records (segments).
Advantages of the DBMS
Better data integration and less data inconsistency, Increased end user productivity, Improved: Data sharing, Data security, Data access, Decision making
CREATE VIEW
Creates a dynamic subset of rows and columns from one or more tables
CREATE INDEX
Creates an index for a table
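A short sketch of CREATE VIEW and CREATE INDEX in SQLite, using a hypothetical `invoice` table. It illustrates why a view is called dynamic: the view re-runs its query, so rows added later show up without recreating it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (inv_num INTEGER PRIMARY KEY, cus_code TEXT, inv_total REAL)"
)
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?)",
    [(1, "C1", 120.0), (2, "C2", 30.0), (3, "C1", 75.0)],
)

# An index on cus_code lets the DBMS locate a customer's invoices quickly.
conn.execute("CREATE INDEX idx_invoice_cus ON invoice (cus_code)")

# A view is a stored query: a subset of rows and columns from the base table.
conn.execute(
    "CREATE VIEW big_invoice AS "
    "SELECT inv_num, inv_total FROM invoice WHERE inv_total > 50"
)

view_before = conn.execute("SELECT inv_num FROM big_invoice ORDER BY inv_num").fetchall()
conn.execute("INSERT INTO invoice VALUES (4, 'C3', 500.0)")
view_after = conn.execute("SELECT inv_num FROM big_invoice ORDER BY inv_num").fetchall()
print(view_before, view_after)
```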
Atomic Data
Data stored at their lowest level of granularity
structured data
Data that has been formatted to facilitate storage, use, and information generation
Optional Participation
One entity occurrence does not REQUIRE a corresponding entity occurrence in a particular relationship. Example: "COURSE generates CLASS" --> some courses do not generate a class. You put a small circle (O) on the side of the optional entity.
Ch. 1 - Data
Recorded facts and numbers
NOT
Requires the first query to be true but not the second.
HAVING
Restricts the selection of grouped rows based on a condition
WHERE
Restricts the selection of rows based on a conditional expression
SUM
Returns the sum of all values for a given column
Granularity
The level of detail represented by the values stored in a table's row.
Indexed file organization
The storage of records either sequentially or nonsequentially with an index that allows software to locate individual records
Sequential file organization
The storage of records in a file in sequence according to a primary key value
T/F? Relationships between entities always operate in both directions.
True. Example: a CUSTOMER may generate many INVOICEs, and each INVOICE is generated by one CUSTOMER (1:M relationship).
T/F? Normalization is usually used in conjunction with entity relationship modeling.
True. A DB designer will make a data model using Crow's Foot notation ERDs and then after the initial design is complete, the designer can use normalization to analyze the relationships among the attributes within each entity and determine if the structure can be improved through normalization.
Ch. 1 - Users
Use forms to read, enter, and query data, and they produce reports to convey information
Byte
Whole numbers from 0 to 255
XML database
XML資料庫
An entity that associates the instances of one or more entity types and contains attributes specific to the relationships is called a(n):
associative entity
default value
assumed value if no explicit value
A property or characteristic of an entity type that is of interest to the organization is called a(n):
attribute
composite attribute
attribute composed of several more basic/ atomic attributes (ex: Name = FirstName, MiddleInitial, Surname)
Composite Attribute
attribute that can be subdivided to yield additional attributes
atomic attribute
attribute which contains a single value of the appropriate type (relational model should have only atomic attributes) (ex: "address" of type varchar(100) should only have 1 address)
multivalued attribute
attribute which can hold several values for an individual entry, possibly with lower and upper bounds on the number of values (ex: a person with several phone numbers) (relational model should not store multivalued attributes)
Relationship Strength
based on how the PK of a related entity is defined.
knowledge
body of information and facts about a specific subject
Data Definition Language
commands that create, modify, and remove database objects such as tables, indexes, and users
recent trends
data driven websites distributed databases object-oriented DB data warehouses data marts big data
Data dictionary management
defines data elements and their relationships
relational schema, R
definition of a table in the database; can be denoted by listing table name & attributes R(A1, A2, ... , An) where R = table name and each An = an attribute
The number of entity types that participate in a relationship is called the:
degree
data anomaly
develops when not all of the required changes in the redundant data are made successfully
restrict deletes
do not allow deletion of a record in the parent table if related rows exist in the child table
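One way to see a restricted delete in action, sketched with SQLite. The `customer` and `invoice` tables are hypothetical, and note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_name TEXT)")
conn.execute(
    """
    CREATE TABLE invoice (
        inv_num  INTEGER PRIMARY KEY,
        cus_code INTEGER REFERENCES customer (cus_code) ON DELETE RESTRICT
    )
    """
)
conn.execute("INSERT INTO customer VALUES (10, 'Ramas')")
conn.execute("INSERT INTO invoice VALUES (1001, 10)")

# The parent row still has a related child row, so the delete is refused.
try:
    conn.execute("DELETE FROM customer WHERE cus_code = 10")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

parents_left = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(delete_blocked, parents_left)
```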
Attributes have a ___
domain
Strong or Regular Entity
exhibits existence independence
data redundancy
exists when the same data are stored unnecessarily at different places
Velocity
refers to real time analytics; analyze quickly
field
smallest unit of application data recognized by system software
structural independence
Chinese: 結構獨立性 (structural independence)
online transaction processing (OLTP)
Chinese: 線上交易處理(OLTP) (online transaction processing)
data
Chinese: 資料 (data)
Dependency Diagram
A diagram that depicts all dependencies found within a given table structure. Its use makes it less likely that you will overlook an important dependency.
Boolean Algebra
A field in math dedicated to the use of logical operators. Know how to write sql statements with and/or/not
Index
A table or other data structure used to determine in a file the location of records that satisfy some condition
IS NULL
Checks whether an attribute value is null
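A small sketch of IS NULL on a hypothetical `student` table. It also shows a common slip: comparing with `= NULL` never matches, because a comparison against null is not true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stu_name TEXT, stu_phone TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ana", "555-0100"), ("Ben", None)],  # None is stored as NULL
)

# IS NULL finds the rows where the attribute has no value.
null_rows = conn.execute(
    "SELECT stu_name FROM student WHERE stu_phone IS NULL"
).fetchall()

# '= NULL' does not find them.
equals_null_rows = conn.execute(
    "SELECT stu_name FROM student WHERE stu_phone = NULL"
).fetchall()
print(null_rows, equals_null_rows)
```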
Relationship Degree
The number of entities or participants associated with a relationship.
structural independence
exists when you can change the file structure without affecting the application's ability to access the data
Cardinality
expresses the minimum and maximum number of entity occurrences associated with one occurrence of the related entity
A student can attend five classes, each with a different professor. Each professor has 30 students. The relationship of students to professors is a ________ relationship.
many-to-many
structural dependence
Chinese: 結構依賴性 (structural dependence)
Structured Query Language (SQL)
Chinese: 結構化查詢語言(SQL) (Structured Query Language)
structured data
Chinese: 結構化資料 (structured data)
NoSQL
New generation of databases that address specific challenges of the Big Data era. Not based on relational model, distributed DB architectures, high fault tolerance, very large amounts of sparse data, performance over transaction consistency.
Number
Numeric data used in mathematical calculations
DROP VIEW
Permanently deletes a view
DROP INDEX
Permanently deletes an index
Data Redundancy Implications
Poor data security, Data inconsistency, Increased likelihood of data entry errors when complex entries are made in different files, Data anomaly
Semistructured data
Processed to some extent
Denormalization
Produces a lower normal form; for example, a 3NF table is converted to 2NF. BUT: the price you pay for increased performance through denormalization is greater data redundancy.
Data quality
Promoting accuracy, validity, and timeliness of data
LOOK AT PAGE 249 FOR DATA MODELING CHECKLIST
READ IT FROM TEXTBOOK PDF
End User Data
Raw facts of interest to end user
data
Raw facts, or facts that have not yet been processed to reveal their meaning to the end user
Validation Rules
Reduce the possibility of error when data is entered. The rule restricts what can be entered.
Big Data
Refers to a movement to find new & better ways to manage large amounts of web-generated data and derive business insight from it. Seeks high performance, scalability, and reasonable cost.
Logical Design
Refers to the task of creating a conceptual data model that could be implemented in any DBMS.
Which of the following is NOT a good characteristic of a data name?
Relates to a technical characteristic of the system
Ch. 1 - What is the most commonly used type of database?
Relational database
ROLLBACK
Restores data to their original values (the values as of the last COMMIT)
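A sketch of ROLLBACK using Python's sqlite3 connection methods on a hypothetical `account` table: the uncommitted UPDATE is undone and the last committed value comes back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()  # 100.0 is now the committed value

# This change is part of an open transaction and is not yet permanent.
conn.execute("UPDATE account SET balance = balance - 500 WHERE acc_id = 1")
balance_during = conn.execute("SELECT balance FROM account").fetchone()[0]

conn.rollback()  # undo everything since the last commit
balance_after = conn.execute("SELECT balance FROM account").fetchone()[0]
print(balance_during, balance_after)
```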
AVG
Returns the average of all values for a given column.
Desktop database
Runs on PC.
Order of precedence:
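Clause order in a SELECT statement: SELECT (optionally with DISTINCT and aggregate functions such as SUM/AVG/COUNT/MAX/MIN), FROM, WHERE (conditions built with AND/OR/NOT and LIKE), GROUP BY, HAVING, ORDER BY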
Ch. 1 - SQL
SQL stands for Structured Query Language and is internationally recognized and understood by all commercial DBMS products
multiset
An SQL table is not really a set of tuples, because a set cannot contain identical members while a table can hold duplicate rows; it is a multiset instead. The DISTINCT option in a SELECT statement removes the duplicates.
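A brief sketch of the multiset behavior, with a hypothetical `enrollment` table that has no key and therefore accepts identical rows. A plain SELECT returns the duplicates; SELECT DISTINCT collapses them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (stu_name TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [("Ana", "DB"), ("Ana", "DB"), ("Ben", "DB")],  # first two rows are identical
)

with_duplicates = conn.execute(
    "SELECT stu_name FROM enrollment ORDER BY stu_name"
).fetchall()
without_duplicates = conn.execute(
    "SELECT DISTINCT stu_name FROM enrollment ORDER BY stu_name"
).fetchall()
print(with_duplicates, without_duplicates)
```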
SELECT
Selects attributes from rows in one or more tables or views
A business rule is a statement that defines or constrains some aspect of the business.
TRUE
A multivalued attribute may take on more than one value for a particular entity instance
TRUE
A time stamp is a time value that is associated with a data value.
TRUE
An attribute whose values can be calculated from related attribute values is called a derived attribute.
TRUE
An entity is a person, place, object, event, or concept in the user environment about which the organization wishes to maintain data.
TRUE
An entity type name should always be a singular noun.
TRUE
In an E-R diagram, an associative entity is represented by a rounded rectangle.
TRUE
One of the roles of a database analyst is to identify and understand rules that govern data.
TRUE
While business rules are not redundant, a business rule can refer to another business rule.
TRUE
First normal form (1NF)
Table format, no repeating groups, and PK identified
Terms used by SQL for relational model terms
Table--relation Row--tuple Column--attribute
Ch. 1 - Integrated Tables
Tables that store both data and the relationships among the data
Validation Text
Text is displayed in a message box then the user is required to enter again.
Conversion to 2NF occurs only when ____
The 1NF has a composite PK.
Mapping (SELECT-FROM-WHERE block)
Simple SQL query formed by three clauses SELECT <attribute list> FROM <table list> WHERE <condition>
Data warehouse
Stores data in a format optimized for decision support
Current generation DBMS software
Stores data structures, relationships between structures, and access paths, Defines, stores, and manages all access paths and components
Data dictionary
Stores definitions of the data elements and their relationships
Analytical database
Stores historical data and business metrics used exclusively for tactical or strategic decision making, data warehouse
Data Mart
Subsets of data warehouses; used to analyze a business unit
Overlapping Subtypes
Subtypes that contain nonunique subsets of the supertype entity set; each entity instance of the supertype may appear in more than one subtype. Example: a person may be an employee, a student, or both. STUDENT AND EMPLOYEE OVERLAP.
Multiuser database
Supports multiple users at the same time, Workgroup databases, Enterprise database
1NF
-all key attributes are defined -there are no repeating groups in the table. Each row/column intersection contains one and only one value, not a set of values. -all attributes are dependent on the primary key.
performance tuning
Activities that make a database perform more efficiently in terms of storage and access speed
Special Operators
BETWEEN, IS NULL, LIKE, IN, EXISTS, DISTINCT
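A sketch of four of these operators (BETWEEN, LIKE, IN, EXISTS) against hypothetical `product` and `vendor` tables; the data are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendor (v_code INTEGER PRIMARY KEY, v_name TEXT)")
conn.execute("CREATE TABLE product (p_descript TEXT, p_price REAL, v_code INTEGER)")
conn.executemany("INSERT INTO vendor VALUES (?, ?)", [(1, "Acme"), (2, "Bolt"), (3, "Cog")])
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("Hammer", 9.5, 1), ("Saw", 14.0, 2), ("Sander", 35.0, 2), ("Drill", 40.0, None)],
)

def names(sql):
    """Run a one-column query and return the values as a plain list."""
    return [row[0] for row in conn.execute(sql)]

# BETWEEN: value falls within an inclusive range.
op_between = names("SELECT p_descript FROM product WHERE p_price BETWEEN 10 AND 20")
# LIKE: pattern match; % stands for any run of characters.
op_like = names("SELECT p_descript FROM product WHERE p_descript LIKE 'S%' ORDER BY p_descript")
# IN: value matches any member of a list.
op_in = names("SELECT p_descript FROM product WHERE v_code IN (1, 3)")
# EXISTS: true when the subquery returns at least one row.
op_exists = names(
    "SELECT v_name FROM vendor v WHERE EXISTS "
    "(SELECT 1 FROM product p WHERE p.v_code = v.v_code) ORDER BY v_name"
)
print(op_between, op_like, op_in, op_exists)
```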
Database Environments
Conceptual, Logical, and physical
Record Operations
Create, Read, Update, Delete
CREATE TABLE AS
Creates a new table based on a query in the user's db schema
Data dictionary management
Data dictionary
Distributed database
Data is distributed across different sites
Centralized database
Data is located at a single site
Cloud Computing Data Architect
Design and implement the infrastructure for next generation cloud database systems, internet technologies, cloud storage technologies, data security, performance tuning, large databases, etc.
Database Architect
Design and implementation of database environments, DBMS fundamentals, data modeling, SQL, hardware knowledge, etc.
Database Designer
Design and maintain databases, system design, database design, SQL
UNIQUE
Ensures that a column will not have duplicate values
Ch.1 - What is the most popular data modeling technique?
Entity-Relationship (ER) Data Modeling
Referential Integrity
A foreign key value must either match a primary key value in the related table or be null.
Wizard
Form & Report
GUI
Graphical user interface
Each vendor can supply many parts to any number of warehouses, but need not supply any parts.
In the figure shown below, which of the following business rules would apply?
A person can marry at most one person.
In the figure shown below, which of the following is true?
Structured data
It results from formatting, Structure is applied based on type of processing to be performed
DISTINCT
Limits values to unique values
Extensions
Optional modules in SQL standards help specify databases.
ORDER BY
Orders the selected rows based on one or more attributes
Data storage management
Performance tuning
Decimal
Stores numbers from -10^38 - 1 through to 10^38 - 1 (.mdb)
Integer
Stores numbers from -32,768 to 32,767 (no fractions)
SQL
Structured Query Language
SQL
Structured Query Language; a nonprocedural language in which you specify what must be done rather than how (in what order) to do it.
A business rule should be internally consistent.
TRUE
information
The result of processing raw data to reveal its meaning
multi-dimensional
created with input from relational databases
Entity
exists in itself
Data in columns
have same format
user requests info--> DBMS searches database--> Database sends info back to user
interaction btwn user, Database software and database
Composite or Compound Key
made up of more than one field
1:1
rare
Cardinality
the numeric relationship with the RD
metadata
Chinese: 元資料 (metadata)
record
Chinese: 記錄 (record)
Relationships depicted with specialization hierarchies are described in terms of ______
"is a" relationships. Example: a pilot IS AN employee. a mechanic IS AN employee. an accountant IS AN employee.
Third normal form (3NF)
2NF and no transitive dependencies
ad hoc query
A "spur-of-the-moment" question
File
A collection of related records
Tuple
A row in a relation.
Default value
Automatically place a value in a field when a new record is created.
Object oriented database
Contain complex data and complex data relationships
T/F? The highest level of normalization is always the most desirable.
False; the higher the normal form, the more relational join operations needed to produce a specified output. And more resources are needed by the DB system to respond to end user queries. So sometimes you need to denormalize.
Design
Table & Query
Functional dependence
The attribute B is fully functionally dependent on the attribute A if each value of A determines one and only one value of B. Example: PROJ_NUM --> PROJ_NAME. PROJ_NUM is the determinant attribute and PROJ_NAME is the dependent attribute. Review definition from previous exam.
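The "each value of A determines one and only one value of B" test can be sketched in plain Python. The helper name `determines` and the sample rows are hypothetical. Sample data can only show that a dependency is violated; passing the check does not prove the dependency holds as a business rule.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, every value of lhs maps to one value of rhs."""
    seen = {}
    for row in rows:
        key, value = row[lhs], row[rhs]
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True

project_rows = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 103},
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 101},
    {"PROJ_NUM": 18, "PROJ_NAME": "Amber Wave", "EMP_NUM": 114},
]

fd_name = determines(project_rows, "PROJ_NUM", "PROJ_NAME")  # same name for each project number
fd_emp = determines(project_rows, "PROJ_NUM", "EMP_NUM")     # project 15 has two employees
print(fd_name, fd_emp)
```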
Null
Where the entry is left empty
Date Function
a procedure that returns the current system date
self join
joins a table to itself, with a left join or an inner join (employee/supervisor example)
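The employee/supervisor case can be sketched like this, assuming a hypothetical `employee` table whose `mgr_num` column points back at `emp_num`. The table is given two aliases so it can be joined to itself; a LEFT JOIN keeps the employee who has no supervisor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_num INTEGER PRIMARY KEY, emp_name TEXT, mgr_num INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Lee", None), (2, "Kim", 1), (3, "Diaz", 1)],
)

# e is the employee's row, m is the same table read as the manager's row.
self_join_rows = conn.execute(
    """
    SELECT e.emp_name, m.emp_name
    FROM employee e
    LEFT JOIN employee m ON e.mgr_num = m.emp_num
    ORDER BY e.emp_num
    """
).fetchall()
print(self_join_rows)
```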
data
raw facts
Domain
set of possible values for a given attribute
distributed database
Chinese: 分佈式資料庫 (distributed database)
analytical database
Chinese: 分析資料庫 (analytical database)
Types of Databases by location
-Centralized (supports data located at a single site) -Distributed (supports data distributed across several sites, difficult to administer, can lessen network delays)
Types of Databases by Size
-Single-user (supports only one user at a time) -Workgroup (multi-user that supports small group of users) -Enterprise (multi-user database that supports an entire organization)
Database Views
-User (Business Oriented) -Logical/Conceptual (Programmers, Designers, DBA) -Physical (Computer & the DBMS)
Problems with conventional file systems
-each application program uses its own files -each file is owned by the department -managing them is time-consuming -not easily shared -data redundancy
Database system environment is composed of 5 components
-hardware -software -people -procedures -data
Types of Databases by use
-transaction (production) supports a company's day-to-day operations -data warehouse (for data mining) stores data used to generate information
1 to many
1 area to many readers one side (primary key) can have unique index but many side can't (duplicates okay or no)
Flat File
1 table Records Fields
Fields
1- Name 2- Length (#of bytes) 3- Type of data
A table is in 3NF when:
1. It is in 2NF 2. It contains no transitive dependencies
Disadvantages of stored derived attributes
1. Requires constant maintenance to make sure derived value is current
Second normal form (2NF)
1NF and no partial dependencies
Norms
1NF, 2NF, 3NF
If the 1NF has a single-attribute PK, then the table is automatically in _____.
2NF
How to write not equal to
<> or !=
Comparison Operators
=, <, >, <=, >=, <> Used in conditional expressions
Object/Relational Database Management System (O/R DBMS)
A DBMS based on the ERDM.
data dictionary
A DBMS component that stores metadata
Relational Database Management System (RDBMS)
A DBMS which hides complexity from users, who see the database as a collection of tables in which data is stored.
Problem domain
A clearly defined area within the real-world environment, with well defined scope and boundaries, that will be systematically addressed.
Pointer
A field of data indicating a target address that can be used to locate a related field or record of data
Record
A logically connected set of one or more fields that describes a person, place, or thing
Normalization
A process for evaluating and correcting table structures to minimize data redundancies, thereby reducing the likelihood of data anomalies. Works through a series of stages called normal forms.
data management
A process that focuses on data collection, storage, and retrieval
Inheritance
A property that enables an entity subtype to inherit the attributes and relationships of the supertype. All entity subtypes inherit their PK attribute from their supertype.
query
A question or task asked by an end user of a database in the form of SQL code
Data model
A relatively simple representation, usually graphical, of more complex real-world structures.
Entity Instance or Occurrence
A row in the relational table in the ER model.
Single-valued attribute
An attribute that can only have a single value. Example: a person can only have one social security number. Example: a manufactured part can only have one serial number. This is not necessarily a simple attribute! Example: a serial number is single-valued, but it is a composite attribute because it can be subdivided into the region where it was produced, the plant within the region, and so on, much like how the parts of an SSN carry meaning.
Derived Attribute
An attribute whose value is calculated (derived) from other attributes. Does not need to be physically stored within the database; but can be derived instead by using an algorithm. Sometimes called computed attributes. Example: Age can be derived by using the difference between the current date and the person's DOB. Example: Total cost of an order can be derived by multiplying the quantity ordered by the unit price.
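A sketch of the order-total example: the total is not stored but computed in the query from quantity and unit price. The `line` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE line (inv_num INTEGER, line_units INTEGER, line_price REAL)")
conn.executemany(
    "INSERT INTO line VALUES (?, ?, ?)",
    [(1, 2, 9.5), (1, 1, 40.0)],
)

# The total is derived on demand instead of being kept in a column.
derived_total = conn.execute(
    "SELECT SUM(line_units * line_price) FROM line WHERE inv_num = 1"
).fetchone()[0]
print(derived_total)
```

Deriving the value at query time avoids the maintenance burden of keeping a stored copy current, at the cost of recomputing it on each read.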
Existence-independent
An entity that can exist apart from all of its related entities. AKA Strong Entity or Regular Entity. Example: It is possible for a PART of a product to exist independently from a VENDOR in the relationship "PART is supplied by VENDOR"
Join index
An index on columns from two or more tables that come from the same domain of values
Determinant
Any attribute whose value determines other values within a row.
Multivalued Attributes
Attributes that can have many values. Example: a person may have several college degrees. Or a car's color can be divided into many colors for the roof, body, and trim. So CAR_VIN would be the PK but CAR_COLOR would be a multivalued attribute of the CAR entity.
DBMS
Collection of programs that creates and manages the database and controls access
Database management system (DBMS)
Collection of programs, Manages the database structure, Controls access to data stored in the database
Discipline specific databases
Contains data focused on specific subject areas
software
DBMS
database software
DBMS// creates database add, change, delete data
Ch. 1 - Metadata
Data about data
Metadata
Data about data, through which the end-user data are integrated and managed. Describes data characteristics and relationships.
metadata
Data about data; that is, data about data characteristics and relationships
Data dependence
Data access changes when data storage characteristics change
Sorting
Data is arranged in some kind of order
American National Standards Institute (ANSI)
Devised a framework for data modeling based on 3 primary levels of data abstraction: external, conceptual, and internal.
Required option
Dictates whether a null value is allowed
Data inconsistency
Different versions of the same data appear in different places
DBMS benefits
Eliminates most of the file system's problems. Stores data structures, relationships, and access paths. Defines all components.
Online analytical processing (OLAP)
Enable retrieving, processing, and modeling data from the data warehouse
Backup and recovery management
Enables recovery of the database after a failure
Data Definition Language (DDL)
Enables the DBA to define the schema components.
Entity Integrity
Every table must have its own primary key, and each primary key value must be unique and not null
Lack of Design and Data Modeling Skills
Evident despite the availability of multiple personal productivity tools. Design and data modeling skills are vital in the data design process; lacking them decreases communication between the designer, user, and developer.
Many-to-Many (M:N or *..*)
Ex. An employee may learn many job skills, and each skill can be learned by many employees.
T/F? You should implement M:N relationships and multivalued attributes in the RDBMS.
FALSE. Each column and row intersection represents a single data value. So if multivalued attributes exist, the designer has to decide on splitting the multivalued attribute into several new attributes (one for each component of the multivalued attribute) OR create a new entity composed of the multivalued attribute's components.
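For the M:N half of the question, one common approach is to break the relationship into two 1:M relationships through a bridge (associative) table. The sketch below assumes hypothetical `employee`, `skill`, and `emp_skill` tables; it is an illustration, not the only possible design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_num INTEGER PRIMARY KEY, emp_name TEXT)")
conn.execute("CREATE TABLE skill (skill_id INTEGER PRIMARY KEY, skill_name TEXT)")
# Bridge table: one row per (employee, skill) pairing, with a composite PK.
conn.execute(
    """
    CREATE TABLE emp_skill (
        emp_num  INTEGER REFERENCES employee (emp_num),
        skill_id INTEGER REFERENCES skill (skill_id),
        PRIMARY KEY (emp_num, skill_id)
    )
    """
)
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Lee"), (2, "Kim")])
conn.executemany("INSERT INTO skill VALUES (?, ?)", [(10, "SQL"), (20, "Python")])
conn.executemany("INSERT INTO emp_skill VALUES (?, ?)", [(1, 10), (1, 20), (2, 10)])

bridge_rows = conn.execute(
    """
    SELECT e.emp_name, s.skill_name
    FROM emp_skill es
    JOIN employee e ON es.emp_num = e.emp_num
    JOIN skill s ON es.skill_id = s.skill_id
    ORDER BY e.emp_name, s.skill_name
    """
).fetchall()
print(bridge_rows)
```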
Single
Floating point decimal numbers from -3.4 x 10^38 to 3.4 x 10^38 with 6 digits of precision.
Database Design
Focuses on the design of the database structure that will be used to store and manage end-user data. The goal is a well-designed database; a poorly designed database causes difficult-to-trace errors.
Data Management
Generation, storage, and retrieval of data
Replication ID
Globally unique identifier (GUID)
Entity-Relationship Model (ERM)
Graphically depicts entities and relationships in a database structure. Normally represented in an entity-relationship diagram (ERD).
Database Security Officer
Implement security policies for data administration, DBMS fundamentals, database administration, SQL, data security technologies, etc.
FALSE
In the figure below, one might want to create a single-attribute surrogate identifier to substitute for the composite identifier.
Each employee can supervise one to many employees
In the following diagram, which answer is true?
SQL environment
Installation of an SQL compliant RDBMS on a computer system
Long Integer
Integer values from -2,147,483,648 to 2,147,483,647
Role of the DBMS
Intermediary between the user and the database, Enables data to be shared, Presents the end user with an integrated view of the data, Receives and translates application requests into operations required to fulfill the requests, Hides database's internal complexity from the application programs and users
GO OVER PROJECT 1 AND ASSIGNMENT 3 ON HOW TO WRITE SQL STATEMENTS. PRACTICE A LITTLE WITH SSMS!
Know some data types and stuff that you did for project 1. Look at page 271 in textbook pdf and read the bottom for the differences in data types.
MAKE SURE TO REVIEW INSERT, UPDATE, DELETE syntax
LOOK OVER HIS PPT SLIDES. PRACTICE WRITING THESE STATEMENTS.
Query language
Lets the user specify what must be done without specifying how.
Data integrity management
Minimizes redundancy and maximizes consistency
ALTER TABLE
Modifies a table's definition (adds, modifies, or deletes attributes or constraints)
UPDATE
Modifies an attribute's values in one or more table's rows
DBMS traits
Multiuser access control, backup and recovery management, and integrity management.
Data transformation and presentation
Transforms entered data to conform to required data structures
Unstructured data
Data that exist in their original (raw) state.
Ch. 1 - What are the components of a database system?
Users, database applications, dbms, database
CHECK
Validates data in an attribute
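A sketch of three column constraints from this set (DEFAULT, UNIQUE, CHECK) on a hypothetical `part` table. The helper `rejected` is an illustrative name; in Python's sqlite3 module a constraint violation surfaces as `sqlite3.IntegrityError`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE part (
        part_code   TEXT PRIMARY KEY,
        part_serial TEXT UNIQUE,                           -- no duplicate values
        part_qty    INTEGER DEFAULT 0 CHECK (part_qty >= 0) -- default plus validation
    )
    """
)

# No quantity supplied, so the DEFAULT value is used.
conn.execute("INSERT INTO part (part_code, part_serial) VALUES ('P1', 'S-1')")
default_qty = conn.execute(
    "SELECT part_qty FROM part WHERE part_code = 'P1'"
).fetchone()[0]

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

unique_rejected = rejected("INSERT INTO part (part_code, part_serial) VALUES ('P2', 'S-1')")
check_rejected = rejected("INSERT INTO part VALUES ('P3', 'S-3', -5)")
print(default_qty, unique_rejected, check_rejected)
```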
3 V's of Big Data
Volume, Variety, Velocity
social media
Web and mobile technologies that enable "anywhere, anytime, always on" human interactions
Logical independence
When a change in the internal model does not affect the conceptual model.
Object-Oriented Data Model (OODM)
Where data and relationships are contained in a single structure known as an object. Is the basis for the object-oriented database management system. Said to be a semantic (meaningful) data model.
GROUP BY clause
Within the SELECT statement; it is used to group rows into smaller collections. Aggregate functions then summarize the data within each smaller collection. Generally used when you have attribute columns combined with aggregate functions in the SELECT statement.
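A sketch of GROUP BY with aggregate functions, plus HAVING to filter the groups, on a hypothetical `sale` table. WHERE filters individual rows before grouping; HAVING filters the grouped results afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [("East", 100), ("East", 50), ("West", 30), ("West", 20), ("North", 200)],
)

grouped_rows = conn.execute(
    """
    SELECT region, SUM(amount), AVG(amount)
    FROM sale
    WHERE amount > 10            -- row-level filter
    GROUP BY region              -- one result row per region
    HAVING SUM(amount) >= 100    -- group-level filter
    ORDER BY region
    """
).fetchall()
print(grouped_rows)
```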
Data warehouse
a collection of databases
data quality
a comprehensive approach to promoting the accuracy, validity, and timeliness of the data
Many-to-Many Relationship
a row in table A can have many matches in table B and vice versa
data warehouse
a specialized database that stores data in a format optimized for decision support; contains historical data obtained from the operational databases as well as data from external sources
AutoNumber
a unique sequential number (incremented by 1) or a random number assigned to a new record in a table
instance
actual content of database at some point in time
database processing approach
all data integrated; no duplicates; definition of data separate from app; entire org. owns database
range control
allowable value limitations (constraints or validation rules)
domain constraints
allowable values for an attribute
Module
an object in Relational Database Management Systems that you type code in, specifically Visual Basic
A ________ specifies the number of instances of one entity that can be associated with each instance of another entity.
cardinality constraint
Classification
categorizing; looks for patterns in shopping behavior to target marketing efforts
Attributes
characteristics (fields) of entities
attribute
column
join condition
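condition that combines tuples from two relations by matching attribute values. EX. WHERE Dname = 'Research' AND Dnumber = Dno; (here Dnumber = Dno is the join condition)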
metadata
data about data, through which the end-user data are integrated and managed
Data modeling may be the most important part of the systems development process because:
data characteristics are important in the design of programs and other systems components.
DML
data manipulation language
online analytical processing OLAP
decision support system (DSS) tools that use multidimensional data analysis techniques; creates an advanced data analysis environment that supports decision making, business modeling, and operations research.
object database model
define own data types
Data admin
determines and monitors the employees' access to the DBMS
data management
discipline that focuses on the proper generation, storage, and retrieval of data
Optional Attribute
does not require a value, may be left empty
file-based systems approach
each program defines and manages its own data; info is kept in separate files; data could duplicate. Solution: writing code so that 1 student can only access her info. Ex. NDSU with no database = pay tuition THEN register
Customers, cars, and parts are examples of
entities
Associations
establishes the relationship between two entities or events
Total completeness
every supertype occurrence must be a member of at least one subtype.
Weak Entity
existence dependent; has a primary key that is partially or wholly derived from the parent entity in the relationship
data inconsistency
exists when different versions of the same data appear in different places
database abstraction
hiding of details of data storage that are not needed by most users; aim to separate user's views of database from how it is physically represented
data
integrated/shared: diff. people access same
Practical significance of data dependence
is difference between logical and physical format
outer join
joins 2+ tables into 1 result that includes non-matching rows: all records from one table and only the records with matching fields from the other
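A sketch contrasting an inner join with a left outer join, assuming hypothetical `customer` and `invoice` tables. The outer join keeps the customer with no invoice and fills the missing side with null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_name TEXT)")
conn.execute("CREATE TABLE invoice (inv_num INTEGER PRIMARY KEY, cus_code INTEGER)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.execute("INSERT INTO invoice VALUES (100, 1)")

# Inner join: only customers that have a matching invoice.
inner_rows = conn.execute(
    "SELECT c.cus_name, i.inv_num FROM customer c "
    "JOIN invoice i ON c.cus_code = i.cus_code ORDER BY c.cus_code"
).fetchall()

# Left outer join: every customer, matched or not.
outer_rows = conn.execute(
    "SELECT c.cus_name, i.inv_num FROM customer c "
    "LEFT JOIN invoice i ON c.cus_code = i.cus_code ORDER BY c.cus_code"
).fetchall()
print(inner_rows, outer_rows)
```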
schema
logical structure of database
production database
main database designed to keep track of the day-to-day operations of a company; also known as transactional databases
NOT
negates the result of a conditional expression (gives you the opposite)
entity integrity constraints (primary key constraints)
no duplicates or null values allowed for primary keys; set automatically when primary key chosen
Data Integrity
overall completeness, accuracy and consistency of data, meaning data is intact and unchanged
Schema
planned design structure of a database
Composite Identifier
primary key composed of more than one attribute
Foreign Key
primary key of one table appearing in another table
Weak Relationship
primary key of related entity does not contain a primary key component of the parent entity
Normalization
process of organizing a database to reduce redundancy & improve data integrity
normalization
process that generally results in a well-structured relation
Data integrity
process to ensure data is not corrupted or lost during data entry or extraction and transformation
Query language
provides instructions and procedures to retrieve data from database
Domain
range of values for a field
end user data
raw facts of interest to the end user
Clickstream
records where a user clicks on a web page
Normalization
reduces data complexity and eliminates data redundancy
tuple
row
qualify
the same name can be used for two or more attributes as long as the attributes are in different relations; the attribute name is then qualified with the relation name to prevent confusion, e.g., EMPLOYEE.Name
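A minimal sketch of qualifying, assuming hypothetical EMPLOYEE and DEPARTMENT tables that both have a Name attribute (the table contents are made up), run through Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Name TEXT, Dno INTEGER);
    INSERT INTO DEPARTMENT VALUES (5, 'Research');
    INSERT INTO EMPLOYEE VALUES ('123456789', 'Smith', 5);
""")

# Both relations have an attribute called Name, so each use is qualified.
rows = conn.execute("""
    SELECT EMPLOYEE.Name, DEPARTMENT.Name
    FROM EMPLOYEE, DEPARTMENT
    WHERE EMPLOYEE.Dno = DEPARTMENT.Dnumber
""").fetchall()

# Leaving the qualifier off makes the attribute reference ambiguous.
try:
    conn.execute("SELECT Name FROM EMPLOYEE, DEPARTMENT")
    ambiguous = False
except sqlite3.OperationalError:
    ambiguous = True

print(rows, ambiguous)
```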
record
set of fields holding all the information about one item; every record has the same fields in the same sequence. Example: one person's information
database
shared, integrated computer structure that stores a collection of end-user data and metadata; a collection of self-describing data; resembles a very well-organized electronic filing cabinet in which powerful software (the dbms) helps manage the cabinet's contents
Data cleansing
should be performed after a data audit on tables that are merged or imported
referential integrity constraints
specified between two relations and require foreign key; ensures that the database doesn't contain any unmatched foreign keys; default constraints usually chosen but can be changed
selection condition
specified by the WHERE clause and is the Boolean condition that must be true for any retrieved tuple. Causes for the selection of a particular tuple of interest.
Primary key
specifies one or more attributes that make up the primary key of a relation. If the key has only one attribute, the clause can follow the attribute directly. EX: Dnumber INT PRIMARY KEY;
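A small sketch of both ways to write the PRIMARY KEY clause, using Python's sqlite3 module. DEPARTMENT follows the card's Dnumber example; the WORKS_ON table and its columns are assumptions added to show a multi-attribute key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-attribute key: the clause follows the attribute directly.
conn.execute("CREATE TABLE DEPARTMENT (Dnumber INT PRIMARY KEY, Dname TEXT)")

# Multi-attribute key: declared as a separate table-level clause.
conn.execute("""
    CREATE TABLE WORKS_ON (
        Essn TEXT,
        Pno INT,
        Hours REAL,
        PRIMARY KEY (Essn, Pno)
    )
""")

conn.execute("INSERT INTO DEPARTMENT VALUES (5, 'Research')")
try:
    # A second row with the same key value violates uniqueness.
    conn.execute("INSERT INTO DEPARTMENT VALUES (5, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```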
Completeness Constraint
specifies whether each entity supertype occurrence must also be a member of at least one subtype. Can be partial or total.
disadvantages of DBMS
strict schema and multiple tables/relations; complexity; size; cost of the DBMS, hardware, and conversion; higher impact of failure
The common types of entities are:
strong entities, weak entities, and associative entities (all of the above)
multiuser database
supports multiple users at the same time
online transaction processing database
supports online transaction processing for day-to-day operations of an organization
single user database
supports only one user at a time
A fact is an association between two or more:
terms
A simultaneous relationship among the instances of three entity types is called a(n) ________ relationship.
ternary
Short Text
text or combinations of text and numbers that have a max size of 255 characters
data types
text, numeric, currency, date...
Surrogate
the DBMS chooses the PK
determinants
the attributes on the left side of the arrow when converting to 2NF
projection attributes
the attributes whose values are to be retrieved.
One-to-Many Relationship
the most common type of relationship. A row from table A can have many matches in table B, but the row in table B can only have one match in table A
1NF to 2NF means to remove _____
the partial dependencies
One-to-One Relationship
the row in table A can only have one match in table B and vice versa
Domain
the set of possible values for a given attribute. Example: GPA has the domain 0 to 4, because the lowest possible GPA is 0 and the highest is 4. Example: Gender has only 2 possible values (M or F)
Hierarchical database
tree-like structure; parent child relationships
In an E-R diagram, there are ________ business rule(s) for every relationship.
two
A relationship between the instances of a single entity type is called a(n) ________ relationship
unary
Three types of relationship degrees
unary (one entity), binary (two entities), ternary (three entities)
data dictionary
used by the DBMS to look up the required data component structures and relationships
Data Control
used to control access to data stored in a database (Authorization)
OLAP
used to generate reports that have info or business intelligence
Foreign key
used to uniquely relate a record to another table where that field is the primary key
Disadvantages of not storing derived attributes
uses CPU processing cycles; increases data access time; adds coding complexity to queries
An entity type whose existence depends on another entity type is called a ________ entity.
weak
social media
web and mobile technologies that enable "anywhere, anytime, always on" human interaction
enterprise database
when the database is used by the entire organization and supports many users (more than 50, usually hundreds) across many departments
file access
where physical records are accessed
file organization
where records are physically located on a disk
A good data definition will describe all of the characteristics of a data object EXCEPT:
who can delete the data
reject
won't update due to the RESTRICT operation.
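A minimal sketch of a rejected update, assuming hypothetical DEPARTMENT and EMPLOYEE tables and SQLite as the DBMS (SQLite enforces foreign keys only after PRAGMA foreign_keys = ON):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (
        Ssn TEXT PRIMARY KEY,
        Dno INTEGER REFERENCES DEPARTMENT(Dnumber)
            ON UPDATE RESTRICT ON DELETE RESTRICT
    );
    INSERT INTO DEPARTMENT VALUES (5);
    INSERT INTO EMPLOYEE VALUES ('123456789', 5);
""")

try:
    # An employee still references department 5, so the change is rejected.
    conn.execute("UPDATE DEPARTMENT SET Dnumber = 6 WHERE Dnumber = 5")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```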
social media
Chinese: 社交媒體 ("social media")
T/F? Entity and object are interchangeable and represent real-world objects.
True
Structured data
Data that result from formatting unstructured data. The format applied is based on the type of processing to be performed.
Data Redundancy
Unnecessarily storing same data at different places, Islands of information
Captions
Used to abbreviate field names; this is the purpose of using captions for field names in the data dictionary
Ch. 1 - Primary Key
Used to create the relationships between the tables
Data Manipulation Language (DML)
Used to define the environment in which the data can be managed and is used to work with data in the database.
Connectivity
Used to describe the relationship classification. Example: 1:M
Input Mask
Used to input data if the values to be entered in a field have exactly the same format. E.g the beginning of a phone number or the end of an email (GMail)
ORDER BY
Useful when the listing order is important to you. You put ORDER BY columnlist ASC/DESC
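A short sketch of ORDER BY with ASC and DESC, using Python's sqlite3 module. The EMPLOYEE table and its three rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Lname TEXT, Salary INTEGER);
    INSERT INTO EMPLOYEE VALUES ('Smith', 30000), ('Wong', 40000), ('Borg', 55000);
""")

# ASC is the default direction; it is written out here for clarity.
ascending = [row[0] for row in conn.execute(
    "SELECT Lname FROM EMPLOYEE ORDER BY Lname ASC")]

# DESC lists the highest salary first.
descending = [row[0] for row in conn.execute(
    "SELECT Lname FROM EMPLOYEE ORDER BY Salary DESC")]

print(ascending)
print(descending)
```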
Physical independence
When a change in the physical model does not affect the internal model.
Can attributes share a domain?
YES. Example: a student address and professor address have the same domain of all possible addresses.
Yes/No
Yes and No values and fields that contain only one of two values (Yes/No, True/False, On/Off, 0/1, etc.)
How to put many to many relationships in tables?
You make 3 tables and the third one is a junction table that has the PKs of the other two ENTITY tables. Then you link that PK of ENTITY tables to the junction table.
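The three-table layout can be sketched as follows. STUDENT, COURSE, and the junction table ENROLLMENT are hypothetical names chosen for illustration, with SQLite (through Python's sqlite3 module) standing in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE STUDENT (StudentID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE COURSE (CourseID INTEGER PRIMARY KEY, Title TEXT);
    -- Junction table: holds the PKs of both entity tables.
    CREATE TABLE ENROLLMENT (
        StudentID INTEGER REFERENCES STUDENT(StudentID),
        CourseID INTEGER REFERENCES COURSE(CourseID),
        PRIMARY KEY (StudentID, CourseID)
    );
    INSERT INTO STUDENT VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO COURSE VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO ENROLLMENT VALUES (1, 10), (1, 20), (2, 10);
""")

# Each M:N pairing is reached through two 1:M links.
pairs = conn.execute("""
    SELECT s.Name, c.Title
    FROM STUDENT AS s
    JOIN ENROLLMENT AS e ON e.StudentID = s.StudentID
    JOIN COURSE AS c ON c.CourseID = e.CourseID
    ORDER BY s.Name, c.Title
""").fetchall()

print(pairs)
```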
Function
a Visual Basic procedure that performs a task or calculation and returns a result
candidate key
a collection of attributes that satisfies the requirements for a primary key
database management system
a collection of programs that manages the database structure and controls access to the data stored in the database; intermediary between the user and the database
Form
a database GUI
operational database
a database designed primarily to support a company's day-to-day operations; also known as a transactional database or production database
Table
a database object that is organized by rows and columns
XML database
a database system that stores and manages semistructured XML data
discipline specific database
a database that contains data focused on specific subject areas
single user database
a database that supports only one user at a time
Candidate Key
a field that could have served as key field but isn't
Attachment
a file that can be attached per record with a size limit of 2 GB
referential integrity
a foreign key value (in the relation on the many side) MUST match a primary key value in the relation on the one side or be null
third normal form
a form where no transitive dependencies exist
transitive dependency
a functional dependency between two or more non key attributes
Record
a group of fields
physical record
a group of fields stored in adjacent memory locations and retrieved together as a unit
non loss decomposition
a join operation should restore the original information
Common field when doing joins is
a key field
Extensible Markup Language
a metalanguage used to represent and manipulate data elements. Permits the manipulation of a document's data elements.
workgroup database
a multiuser database supporting a relatively small number of users (fewer than 50) or a department within an organization
file
a named collection of related records
query language
a nonprocedural language that is used by a DBMS to manipulate its data
second normal form
a normal form where there are no partial dependencies
Structured Query Language
a powerful and flexible relational database language composed of commands that enable users to create database and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information
Composite Identifier
a primary key composed of more than one attribute.
composite key
a primary key that consists of more than one attribute
IIF Function
a procedure that returns a specified value based on a true or false result. Abbreviated for "Immediate If"
Format Function
a procedure that returns a string containing an expression formatted according to instructions contained in a format expression
Time Function
a procedure that returns the current system time
DateDiff Function
a procedure that returns the number of time intervals between two specified dates
select-project-join
a query that only involves selection and join conditions plus projection attributes. EX: SELECT Pnumber, Dnum, Lname, Address, Bdate FROM PROJECT, DEPARTMENT, EMPLOYEE WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Plocation = 'Stafford';
first normal form
a relation is in this form if all multivalued attributes have been removed
Defining Relationships
a relationship works by matching data in key columns to other columns in a different table, usually with the same name. In most cases, the relationship matches the primary key from one table with a foreign key in the other table.
desktop database
a single-user database that runs on a personal computer
query
a specific request issued to the DBMS for data manipulation; simply put it is a question
Lookup Wizard
a wizard that helps you define either a simple or complex lookup field
null value control
allowing or prohibiting empty fields
pattern matching
allows for comparison conditions based on only parts of a character string.
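A minimal sketch of pattern matching with the LIKE operator, where % matches any run of characters and _ matches exactly one. The EMPLOYEE rows are made-up sample data, run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Lname TEXT, Address TEXT);
    INSERT INTO EMPLOYEE VALUES
        ('Smith', '731 Fondren, Houston, TX'),
        ('Wong', '638 Voss, Houston, TX'),
        ('Zelaya', '3321 Castle, Spring, TX');
""")

# % stands for zero or more arbitrary characters.
houston = [row[0] for row in conn.execute(
    "SELECT Lname FROM EMPLOYEE WHERE Address LIKE '%Houston%' ORDER BY Lname")]

# Each _ stands for exactly one character, so this matches 4-letter names.
four_letters = [row[0] for row in conn.execute(
    "SELECT Lname FROM EMPLOYEE WHERE Lname LIKE '____'")]

print(houston, four_letters)
```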
Data mining
also called analytics; analyzes data in databases to find new patterns or trends
referential triggered action
alternative action that can be attached to a foreign key if update operation is illegal.
field
an alphabetic or numeric character or group of characters that defines a characteristic of a person, place, or thing
Alias
an alternate name given to a column or table in any SQL statement. *know how to write an alias*
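One way to write aliases, sketched with Python's sqlite3 module. The EMPLOYEE table, the alias names Last_name and Annual_pay, and the sample row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Lname TEXT, Salary INTEGER);
    INSERT INTO EMPLOYEE VALUES ('Smith', 2500);
""")

# "AS e" is a table alias; "AS Last_name" and "AS Annual_pay" are column aliases.
cursor = conn.execute("""
    SELECT e.Lname AS Last_name, e.Salary * 12 AS Annual_pay
    FROM EMPLOYEE AS e
""")
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(column_names, rows)
```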
primary key
an attribute (or combo of attributes) that uniquely identifies each row in a relation
foreign keys
an attribute (or set of attributes) within one table that matches or "links to" the candidate key of some other table; allows data in tables to be linked and cross-referenced by matching the same data values in both tables; matching must take place to primary or candidate keys
Composite Attribute
an attribute that can be further subdivided to yield additional attributes. Example: Address can be subdivided into street, city, state, zip code. DON'T CONFUSE WITH COMPOSITE KEY. It's wise to change these into a series of simple attributes.
A mutually exclusive relationship is one in which:
an entity instance can participate in only one of several alternative relationships.
The ERM refers to a table row as ____
an entity instance or entity occurrence. At the ER modeling level, an entity refers to the entity set, not the single entity instance
Calculated
an expression that uses data from one or more fields. You can designate different result data types from the expression
Report
an object in a database designed for formatting, calculating, printing, and summarizing data
OLE Objects
an object that can store pictures, audio, video, or other BLOBs (Binary Large Objects)
Query
an object that performs create, read, update, and delete operations on the database
database system
an organization of components that defines and regulates the collection, storage, management, and use of data in a database environment
query result set
answer to a query sent back by the DBMS to the application
users
application programmers, end users
An entity type name should be all of the following EXCEPT
as short as possible
Relationships
associations between entities that always operate in both directions
derived attribute
attribute whose value can be determined from another attribute (ex: can derive age from birthdate)
Derived Attribute
attribute whose value is calculated from other attributes (derived using an algorithm)
A person's name, birthday, and social security number are all examples of
attributes
Multivalued Attribute
attributes that have many values
Single-valued Attribute
attributes that have only a single value
candidate keys
can be multiple; all the candidates for the primary key (though only one primary key is chosen); ensures that each tuple in a relation is unique; can be used to identify a tuple in the relation
CHECK clause
can restrict attribute or domain values. Example: department numbers are restricted between 1 and 20
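The department-number restriction could be written as below. This is a sketch that assumes a DEPARTMENT table with a Dnumber attribute and uses SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DEPARTMENT (
        Dnumber INT NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21),
        Dname TEXT
    )
""")

conn.execute("INSERT INTO DEPARTMENT VALUES (20, 'Research')")  # inside 1..20
try:
    conn.execute("INSERT INTO DEPARTMENT VALUES (21, 'Overflow')")  # outside 1..20
    out_of_range_rejected = False
except sqlite3.IntegrityError:
    out_of_range_rejected = True

print(out_of_range_rejected)
```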
M:N or M:M
can't be implemented directly in the relational model; can be changed into two 1:M relationships
Data independence
the ability to change the data storage characteristics without affecting the application programs
Cast (coerce)
converting a value from one data type to another; for example, DATE, TIME, and TIMESTAMP values can be cast into equivalent strings so they can be used in string comparisons.
DBMS (Database Management System)
collection of programs that facilitates the process of defining, constructing, and manipulating databases for various applications
file
collection of related records ex. includes group of people
relational data model
collection of relations (tables), each containing tuples (rows) and attributes (columns)
Database
collection of structured information that is designed to manage large amounts of information
data, hardware, software, users
components of database system
An attribute that can be broken down into smaller parts is called a(n) ________ attribute
composite
An attribute that uniquely identifies an entity and consists of a composite attribute is called a(n):
composite identifier
data quality
comprehensive approach to promoting the accuracy, validity, and timeliness of the data
Data governance
concerned with the management of data, including quality through effective utilization, availability, and protection
general purpose database
contain a wide variety of data used in multiple disciplines
discipline specific database
contain data focused on specific subject areas
Data warehouse
contains current and historical data from variety of sources; enables analytics of operational data
Database Admin
creation and management of the entire database management system
DDL
data definition language; standardized language to define the schema of a database; tasks include creating, modifying & removing database objects
unstructured data
data that exist in their original, raw state; that is, in the format in which they were collected
semistructured data
data that have already been processed to some extent
domain constraint
data types and formats must match those specified; set automatically when data type chosen
Data Manipulation Language
database language used for selecting, inserting, updating, and deleting data in a database
cloud database
database that is created and maintained using cloud data services
distributed database
database that supports data distributed across several different sites
centralized database
database that supports data located at a single site
DEFAULT <value>
defined as the value if no value is given. If not included the default becomes NULL as long as the attribute does not have a NOT NULL constraint.
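A brief sketch of both cases: a column with an explicit DEFAULT and a column whose default falls back to NULL. The EMPLOYEE columns and the default value 1 are assumptions for illustration, with SQLite accessed through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        Ssn TEXT PRIMARY KEY,
        Dno INT NOT NULL DEFAULT 1,  -- explicit default value
        Super_ssn TEXT               -- no DEFAULT clause, so the default is NULL
    )
""")

# Only Ssn is supplied; the other two columns take their defaults.
conn.execute("INSERT INTO EMPLOYEE (Ssn) VALUES ('123456789')")
row = conn.execute("SELECT Dno, Super_ssn FROM EMPLOYEE").fetchone()

print(row)
```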
A business rule:
defines or constrains some aspect of the business. asserts business structure. controls or influences the behavior of the business. (all of the above)
what is the DBMS capable of?
define the database (DDL); manipulate the database (SQL); control redundancy; restrict unauthorized access; enforce integrity constraints; provide multiple user interfaces/views; provide concurrent access; provide a mechanism for recovery; provide backup; allow representation of complex relationships between data
relational database schema, S
definition of a set of relations that are to be stored in the database: S = {R1, R2, ..., Rn}, where each Ri is a table
null
denotes absence of a value; is a member of every domain
An attribute that can be calculated from related attribute values is called a(n) ________ attribute.
derived
The total quiz points for a student for an entire semester is a(n) ________ attribute
derived
Repeating Group
derives its name from the fact that a group of multiple entries of the same type can exist for any single key attribute occurrence. Example: each time a new record is entered for a project, the number of entries in the group grows by one. This is a repeated group! A relational table must not contain any of these.
business intelligence
describes a comprehensive approach to capture and process business data with the purpose of generating information to support business decision making
Connectivity
describes the relationship classification
Optional Participation
does not require corresponding entity occurrence in a particular relationship
If speed is an issue
don't use indexed sequential
what are some potential problems with the file system approach?
duplicated data in different files and locations; difficulty in accessing data; programmer effort for any new task; data isolation, due to multiple files and formats; integrity problems, because the constraints are part of the program code; update inconsistencies, because failures may leave data in an inconsistent state; concurrent access may not be supported
file system approach
each user stores the information they require to do their job in their own files on their own machines/ servers & writes own programs to access data; files are separate and not cross-indexed; each program "owns" and accesses its own files
Participants
entities that participate in a relationship
Existence Independence
entity exists apart from all its related entities
Existence Dependence
entity exists in the database only when it is associated with another related entity occurrence
The logical representation of an organization's data is called a(n):
entity-relationship model.
foreign key
field in table that matches primary key of different table
Hierarchy of Data
field, record, file, data types
Columns
fields/attributes
analytical database
focuses primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making
data processing specialist
hired to create computer-based system.
use join
how to query multiple tables
A(n) ________ is the relationship between a weak entity type and its owner.
identifying relationship
cascade deletion
if a record in the parent table is deleted, automatically delete the related rows in the child table
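A minimal sketch of cascade deletion with ON DELETE CASCADE, assuming hypothetical DEPARTMENT (parent) and EMPLOYEE (child) tables in SQLite, which needs PRAGMA foreign_keys = ON to enforce foreign keys:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (
        Ssn TEXT PRIMARY KEY,
        Dno INTEGER REFERENCES DEPARTMENT(Dnumber) ON DELETE CASCADE
    );
    INSERT INTO DEPARTMENT VALUES (4), (5);
    INSERT INTO EMPLOYEE VALUES ('111', 5), ('222', 5), ('333', 4);
""")

# Deleting the parent row also deletes the child rows that reference it.
conn.execute("DELETE FROM DEPARTMENT WHERE Dnumber = 5")
remaining = conn.execute("SELECT Ssn, Dno FROM EMPLOYEE").fetchall()

print(remaining)
```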
Normalization helps ____
improve the existing data structure and create an appropriate database design. Whether designing a new database structure or modifying an existing one, the normalization process is the same.
Business intelligence
includes data mining, predictive analytics, and OLAP
Relationship Degree
indicates the number of entities or participants associated with relationship
inner join
joins two or more tables into one result; records are selected from the tables only when they share the common value that links them. Example: joining employees to phone numbers this way would not include employees without a phone number
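The employee and phone number case can be sketched like this. EMPLOYEE and PHONE, their columns, and the sample rows are hypothetical, run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EmpID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PHONE (EmpID INTEGER, Number TEXT);
    INSERT INTO EMPLOYEE VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO PHONE VALUES (1, '555-0100');
""")

# Ben has no PHONE row, so the inner join leaves him out.
matched = conn.execute("""
    SELECT e.Name, p.Number
    FROM EMPLOYEE AS e
    INNER JOIN PHONE AS p ON p.EmpID = e.EmpID
""").fetchall()

print(matched)
```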
Long Text
lengthy text or combinations of text and numbers that have a max size of 63,999 characters
Business policies and rules govern all of the following EXCEPT
managing employees
A relationship where the minimum and maximum cardinality are both one is a(n) ________ relationship.
mandatory one
many to many
many products to many consumers
structural dependence
means that access to a file is dependent on its structure
well-structured relations
minimizes redundancy (it does not eliminate it entirely). Also allows for insertions, deletions, and modifications without compromising the integrity of the data
relational database model
the most common model; stores data in tables with rows and columns; each row has a primary key and each column has a unique name
Required Attribute
must have a value, cannot be left empty
relation
named, two dimensional table of data, consisting of a set of named columns (attributes) and an arbitrary number of unnamed rows
1:M
normal
Partial completeness
not every supertype occurrence is a member of a subtype; some supertype occurrences may not be members of any subtype.
file processing
often used when writing a program that opens a text file to work with data
primary key
one attribute (or combination of attributes) per relation where 1. there can only be one such primary key per table 2. the primary key can never contain the null value 3. all values entered for the primary key must be unique (in writing we underline the primary key)
Identifier
one or more attributes that uniquely identify each entity instance
tuple
one record or data set in a row
1 to 1
one student to one ID; both the primary and foreign keys have a unique index (Indexed: Yes, No Duplicates)
Example of Composite Entity
prescription entity to connect doctor, patient, and drug entities
entity integrity
primary key components cannot be null
Rows
records
database design
refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data
rename
rename relation attributes within an SQL query by giving them aliases.
An attribute of an entity that must have a value for each entity instance is a(n):
required attribute
An attribute that must have a value for every entity (or relationship) instance is a(n):
required attribute
Disadvantages of storing derived attributes
requires constant maintenance to ensure derived value is current
Mandatory Participation
requires corresponding entity occurrence in a particular relationship
Advantages of storing derived attributes
saves CPU processing cycles; saves data access time; data value is readily available; can be used to keep track of historical data
Advantages of not storing derived attributes
saves space; computation always yields the current value
Database
shared, integrated computer structure that stores: -End user data -Metadata (data about data) -Triggers (event that causes procedure to occur) -Procedures (sorting) -Indexes
database approach
single repository of data (which may be distributed) is maintained that is defined once and then accessed by various users via a DBMS
flat file database
single table database
field
smallest meaningful data unit; a group of one or more characters with a specific meaning. Example: last name in a table
Data storage management
stores data and related data entry forms, report definitions
relation
table
data integrity
the condition in which all of the data in the database are consistent with the real-world events and conditions
online transaction processing
the systems that support a company's day-to-day operations. Databases that support this are known as OLTP databases, transactional databases, or operational databases
2NF to 3NF means to remove ____
the transitive dependencies
physical data format
the way a computer "sees" (stores) data
logical data format
the way a person views data
A value that indicates the date or time of a data value is called a(n):
time stamp.
Macro
tool that allows automation of tasks and add functionality to your forms, reports, and controls
denormalization
transforming normalized relations into non normalized physical records specifications
primary key
uniquely identify each record in table (user ID)
structured data
unstructured data that have been formatted to facilitate storage, use, and the generation of information.
Composite Entity
used to represent a many-to-many relationship between two or more entities; is in a one-to-many relationship with the parent entities; is composed of the primary key attributes of each parent entity; may contain additional attributes that play no role in the connective process
foreign key
used to represent relationships between two relations
null valued attributes
values of some attribute within a particular tuple may not apply to a particular tuple - null is used
nullify deletes
when a record is deleted in the parent table, find the related records in the child table, and set the foreign key in each of the related records to null
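A minimal sketch of nullify deletes with ON DELETE SET NULL, assuming hypothetical DEPARTMENT (parent) and EMPLOYEE (child) tables in SQLite, where foreign key enforcement must be switched on with a PRAGMA:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE DEPARTMENT (Dnumber INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (
        Ssn TEXT PRIMARY KEY,
        Dno INTEGER REFERENCES DEPARTMENT(Dnumber) ON DELETE SET NULL
    );
    INSERT INTO DEPARTMENT VALUES (5);
    INSERT INTO EMPLOYEE VALUES ('123456789', 5);
""")

# The child row survives; only its foreign key is set to null.
conn.execute("DELETE FROM DEPARTMENT WHERE Dnumber = 5")
child_rows = conn.execute("SELECT Ssn, Dno FROM EMPLOYEE").fetchall()

print(child_rows)
```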
Recursive Query
when a table must be joined to itself.
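A sketch of a table joined to itself, assuming a hypothetical EMPLOYEE table whose Super_ssn column points at the supervisor's Ssn in the same table. The two aliases e and s let one table play both roles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Ssn TEXT PRIMARY KEY, Lname TEXT, Super_ssn TEXT);
    INSERT INTO EMPLOYEE VALUES
        ('888665555', 'Borg', NULL),
        ('333445555', 'Wong', '888665555'),
        ('123456789', 'Smith', '333445555');
""")

# e is the employee copy of the table, s is the supervisor copy.
pairs = conn.execute("""
    SELECT e.Lname, s.Lname
    FROM EMPLOYEE AS e
    JOIN EMPLOYEE AS s ON e.Super_ssn = s.Ssn
    ORDER BY e.Lname
""").fetchall()

print(pairs)
```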
data dependence
when a change to any of the file's data storage characteristics requires changes to all data access programs
islands of information
when the structure supports scattering the data into several locations, each location is an island; often causes data to be in several different versions because the locations aren't updated consistently
referential integrity
when two tables share a relationship based on the data stored in them, the relationship must remain consistent; any foreign key value must agree with the primary key it references
data independence
when you can change the data storage characteristics without affecting the program's ability to access the data
Right Outer Join
yields all rows in Agent including those that don't have matching values in the Customer table
Left Outer Join
yields all rows in customer table, including those that don't have a matching value in agent
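A sketch of both outer joins on hypothetical CUSTOMER and AGENT tables (column names and rows are made up). RIGHT OUTER JOIN is only available in newer SQLite releases, so the right join is written here as its equivalent: a left join with the tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AGENT (AgentCode INTEGER PRIMARY KEY, AgentName TEXT);
    CREATE TABLE CUSTOMER (CusCode INTEGER PRIMARY KEY, CusName TEXT, AgentCode INTEGER);
    INSERT INTO AGENT VALUES (501, 'Alby'), (502, 'Hahn');
    INSERT INTO CUSTOMER VALUES (10010, 'Ramas', 501), (10011, 'Dunne', NULL);
""")

# Left outer join: every CUSTOMER row, matched to an agent or not.
left_rows = conn.execute("""
    SELECT c.CusName, a.AgentName
    FROM CUSTOMER AS c
    LEFT OUTER JOIN AGENT AS a ON a.AgentCode = c.AgentCode
    ORDER BY c.CusCode
""").fetchall()

# Same result as CUSTOMER RIGHT OUTER JOIN AGENT: every AGENT row is kept.
right_rows = conn.execute("""
    SELECT c.CusName, a.AgentName
    FROM AGENT AS a
    LEFT OUTER JOIN CUSTOMER AS c ON c.AgentCode = a.AgentCode
    ORDER BY a.AgentCode
""").fetchall()

print(left_rows)
print(right_rows)
```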
transactional database
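Chinese: 事務資料庫 ("transactional database")
enterprise database
Chinese: 企業資料庫 ("enterprise database")
semistructured data
Chinese: 半結構化資料 ("semistructured data")
ad hoc query
Chinese: 即席查詢 ("ad hoc query")
Extensible Markup Language (XML)
Chinese: 可擴展標記語言(XML) ("Extensible Markup Language (XML)")
business intelligence
Chinese: 商業智能 ("business intelligence")
single-user database
Chinese: 單用戶資料庫 ("single-user database")
multiuser database NoSQL
Chinese: 多用戶資料庫NoSQL ("multiuser database NoSQL")
discipline-specific database
Chinese: 學科特定資料庫 ("discipline-specific database")
workgroup database
Chinese: 工作組資料庫 ("workgroup database")
performance tuning
Chinese: 性能調校 ("performance tuning")
operational database
Chinese: 操作型資料庫 ("operational database")
file
Chinese: 文件 ("file")
query
Chinese: 查詢 ("query")
query result set
Chinese: 查詢結果集 ("query result set")
query language
Chinese: 查詢語言 ("query language")
desktop database
Chinese: 桌面資料庫 ("desktop database")
field
Chinese: 欄位 ("field")
physical data format
Chinese: 物理資料格式 ("physical data format")
production database
Chinese: 生產資料庫 ("production database")
knowledge
Chinese: 知識 ("knowledge")
online analytical processing (OLAP)
Chinese: 線上分析處理(OLAP) ("online analytical processing (OLAP)")
data dependence
Chinese: 資料依賴 ("data dependence")
data warehouse
Chinese: 資料倉儲 ("data warehouse")
data redundancy
Chinese: 資料冗餘 ("data redundancy")
data dictionary
Chinese: 資料字典 ("data dictionary")
database
Chinese: 資料庫 ("database")
database management system (DBMS)
Chinese: 資料庫管理系統(DBMS) ("database management system (DBMS)")
database system
Chinese: 資料庫系統 ("database system")
database design
Chinese: 資料庫設計 ("database design")
data independence
Chinese: 資料獨立性 ("data independence")
data anomaly
Chinese: 資料異常 ("data anomaly")
data integrity
Chinese: 資料的完整性 ("data integrity")
data management
Chinese: 資料管理 ("data management")
data processing (DP) specialist
Chinese: 資料處理(DP)專家 ("data processing (DP) specialist")
data quality
Chinese: 資料質量 ("data quality")
information
Chinese: 資訊 ("information")
islands of information
Chinese: 資訊島 ("islands of information")
general-purpose database
Chinese: 通用資料庫 ("general-purpose database")
logical data format
Chinese: 邏輯資料格式 ("logical data format")
centralized database
Chinese: 集中式資料庫 ("centralized database")
cloud database
Chinese: 雲資料庫 ("cloud database")
unstructured data
Chinese: 非結構化資料 ("unstructured data")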
grade/degree
# of attributes (constant)
cardinality
# of tuples (variable)
File Types re:data
-master -transaction -reference/look-up -temporary -print -archive
Keys
-one or more fields -record must have unique value
Kinds
-relational (data) -object-oriented -NoSQL -Data warehouses
INSERT Command
1) can be used to add a single tuple to a relation 2) allows the user to specify explicit attribute names that correspond to the values provided in the INSERT command. 3) INSERT INTO EMPLOYEE (Fname, Lname, Dno, Ssn) VALUES ('Richard', 'Marini', 4, '6543234455');
Attributes of SQL schema
1) has schema name 2) an authorization identifier--to ID owner of that schema 3) descriptors--for each element in schema 4) elements--like tables, constraints, views, domains, and other constructs
List of data types in SQL
1) numeric--like INT 2) character-string--can be fixed length like CHAR(n) or varying length like CHAR VARYING(n) 3) bit-string--can be fixed or varying too. BIT(n) or BIT VARYING(n) 4) Boolean--values of TRUE and FALSE 5) DATE--has ten positions in the form YYYY-MM-DD 6) TIMESTAMP--includes the DATE and TIME fields 7) INTERVAL--related to the DATE, TIME, and TIMESTAMP data types. It is a relative value that can be used to increment or decrement an absolute value.
DELETE Command
1)removes tuples in a relation 2) includes a WHERE clause similar to that in the SQL query to select tuples to be deleted 3) tuples are only deleted in one table at a time 4) Can propagate to tuples in other relations via a referential triggered action. 5) DELETE FROM EMPLOYEE WHERE Lname = 'Brown';
UPDATE Command
1) used to modify attribute values of one or more selected tuples 2) WHERE clause selects the tuples to be modified. 3) Updating a primary key value may propagate to foreign key values if a referential triggered action is specified. 4) UPDATE PROJECT SET Plocation = 'Bellaire', Dnum = 5 WHERE Pnumber = 10;
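The INSERT, DELETE, and UPDATE statements from these cards can be tried out together. A minimal sketch using Python's sqlite3 module follows; the table definitions and the two starting rows are assumptions added so the statements have something to act on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Dno INT, Ssn TEXT PRIMARY KEY);
    CREATE TABLE PROJECT (Pnumber INT PRIMARY KEY, Plocation TEXT, Dnum INT);
    INSERT INTO EMPLOYEE VALUES ('James', 'Brown', 5, '111223333');
    INSERT INTO PROJECT VALUES (10, 'Stafford', 4);
""")

# INSERT: add a single tuple, naming the attributes explicitly.
conn.execute(
    "INSERT INTO EMPLOYEE (Fname, Lname, Dno, Ssn) "
    "VALUES ('Richard', 'Marini', 4, '6543234455')")

# UPDATE: the WHERE clause selects the tuples to modify.
updated = conn.execute(
    "UPDATE PROJECT SET Plocation = 'Bellaire', Dnum = 5 WHERE Pnumber = 10").rowcount

# DELETE: the WHERE clause selects the tuples to remove.
deleted = conn.execute("DELETE FROM EMPLOYEE WHERE Lname = 'Brown'").rowcount

project = conn.execute("SELECT Plocation, Dnum FROM PROJECT").fetchone()
employees = [row[0] for row in conn.execute("SELECT Lname FROM EMPLOYEE")]

print(updated, deleted, project, employees)
```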
Two categories of SQL functions:
1. Data definition language (DDL)- includes commands to create db objects like tables, indexes, and views. 2. Data manipulation language (DML)- includes commands to insert, update, delete, and retrieve data within the db tables.
Weak Entity
1. Entity is existence-dependent; it cannot exist without the entity with which it has a relationship. 2. Entity has a PK that is partially or totally derived from the parent entity in the relationship.
What are the first three stages of Normalization?
1. First normal form (1NF) 2. Second normal form (2NF) 3. Third normal form (3NF) 2NF is better than 1NF, and 3NF is better than 2NF. 3NF is as high as you need to go for normalization but there is 4NF.
1NF to 2NF steps:
1. Make new tables to eliminate partial dependencies (for each component of the PK that acts as a determinant in a partial dependency, create a new table with a copy of that component as the PK. Keep the determinants in the original table too, because they will be the FKs that relate the new tables to the original table). 2. Reassign corresponding dependent attributes
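The two steps can be sketched on a small made-up example. ASSIGNMENT_1NF, EMPLOYEE, ASSIGNMENT, and their columns are hypothetical; the assumed partial dependency is EmpNum -> EmpName under the composite PK (EmpNum, ProjNum). The final join checks that the decomposition loses no information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 1NF table with composite PK (EmpNum, ProjNum).
    -- EmpName depends on EmpNum alone: a partial dependency.
    CREATE TABLE ASSIGNMENT_1NF (
        EmpNum INT, ProjNum INT, EmpName TEXT, Hours REAL,
        PRIMARY KEY (EmpNum, ProjNum)
    );
    INSERT INTO ASSIGNMENT_1NF VALUES
        (101, 1, 'News', 10.0), (101, 2, 'News', 5.5), (102, 1, 'Senior', 8.0);

    -- Step 1: new table with a copy of the determinant EmpNum as its PK.
    CREATE TABLE EMPLOYEE (EmpNum INT PRIMARY KEY, EmpName TEXT);
    -- Step 2: reassign the dependent attribute EmpName to the new table.
    INSERT INTO EMPLOYEE SELECT DISTINCT EmpNum, EmpName FROM ASSIGNMENT_1NF;

    -- EmpNum stays behind as part of the PK and as the FK to EMPLOYEE.
    CREATE TABLE ASSIGNMENT (
        EmpNum INT REFERENCES EMPLOYEE(EmpNum), ProjNum INT, Hours REAL,
        PRIMARY KEY (EmpNum, ProjNum)
    );
    INSERT INTO ASSIGNMENT SELECT EmpNum, ProjNum, Hours FROM ASSIGNMENT_1NF;
""")

original = conn.execute(
    "SELECT EmpNum, ProjNum, EmpName, Hours FROM ASSIGNMENT_1NF "
    "ORDER BY EmpNum, ProjNum").fetchall()
rejoined = conn.execute("""
    SELECT a.EmpNum, a.ProjNum, e.EmpName, a.Hours
    FROM ASSIGNMENT AS a
    JOIN EMPLOYEE AS e ON e.EmpNum = a.EmpNum
    ORDER BY a.EmpNum, a.ProjNum
""").fetchall()
employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]

print(original == rejoined, employee_count)
```

Each employee name is now stored once instead of once per project, and the join still reproduces the original rows.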
2NF to 3NF:
1. Make new tables to eliminate transitive dependencies (for every transitive dependency write a copy of its determinant as a PK for a new table. If you have 3 different transitive dependencies, you will have 3 different determinants. Keep the determinant in the original table to serve as FK) 2. Reassign corresponding dependent attributes READ PG 230 of TEXTBOOK PDF
Advantages of stored derived attributes
1. Saves CPU processing cycles 2. Saves data access time 3. Data value is readily available 4. Can be used to keep track of historical data
Advantages of not stored derived attributes
1. Saves storage space 2. Computation always yields current value
2 criteria to help determine when to use subtypes and supertypes:
1. There must be different, identifiable kinds or types of the entity in the user's environment 2. The different kinds or types of instances should each have one or more attributes that are unique to that kind or type of instance.
Disadvantages of not stored derived attributes
1. Uses CPU processing cycles 2. Increases data access time 3. Adds coding complexity to queries
A specialization hierarchy provides the means to:
1. support attribute inheritance 2. define a special supertype attribute (subtype discriminator) 3. define disjoint/overlapping constraints and complete/partial constraints.
The Network Model
1969; like the hierarchical model, but records can have M:N relationships.
The Relational Model
1970, June; E.F. Codd of IBM publishes "A Relational Model of Data for Large Shared Data Banks" in which he presents the model, based on the mathematical concept of a relation.
Entity Cluster
A "virtual" entity type used to represent multiple entities and relationships in the ERD. Formed by combining multiple interrelated entities into a single, abstract entity object. It's not actually an entity in the final ERD, just a temporary entity used to represent multiple entities/relationships with the purpose of simplifying the ERD and enhancing its readability.
Surrogate Key
A PK created by the DB designer to simplify the identification of entity instances. Has no meaning in the user's environment- it exists only to distinguish one entity from another (like PKs). It has no intrinsic meaning, values for it can be generated by the DBMS to ensure that unique values are always provided. Example: auto-incrementing ID numbers
Ch. 1 - What is a data model?
A blueprint that's used as a design aid on the way to a database design
Business Rule
A brief, precise, and unambiguous description of a policy, procedure, or principle within an organization. Used to define entities, relationships, attributes, and constraints.
On what premises are business rules based?
A business rules approach is based on the following: 1. Because business rules are an expression of business policy, they are a core concept in an enterprise. 2. Natural language for end-users and a data model for developers can be used to state business rules.
Field
A character or group of characters that have a specific meaning, used to define and store data
Attribute
A characteristic of an entity.
File
A collection of related records. For example, may contain data about students at a university.
Ch. 1 - Database
A collection of related tables and other structures
data quality
A comprehensive approach to ensuring the accuracy, validity, and timeliness of data
data independence
A condition in which data access is unaffected by changes in the physical data storage characteristics
data inconsistency
A condition in which different versions of the same data yield different results
Extent
A contiguous section of disk storage space
data anomaly
A data abnormality in which inconsistent changes have been made to a database
structural dependence
A data characteristic in which a change in the database schema affects data access, thus requiring changes in all access programs
structural independence
A data characteristic in which changes in the database schema do not affect data access
data dependence
A data condition in which data representation and manipulation are dependent upon the physical data storage characteristics
operational database
A database designed primarily to support a company's day-to-day operations
analytical database
A database focused primarily on storing historical data and business metrics used for tactical or strategic decision making
Ch. 1 - DBMS
A database management system - program used to create, process, and administer the database
XML database
A database system that manages semistructured XML data
general purpose database
A database that contains a wide variety of data used in multiple disciplines
discipline specific database
A database that contains data focused on specific subject areas
cloud database
A database that is created and maintained using cloud services, such as Microsoft Azure or Amazon AWS
multiuser database
A database that supports multiple concurrent users
single user database
A database that supports only one user at a time
Data type
A detailed coding scheme recognized by system software, such as DBMS, for representing organizational data
Secondary Key
A field that contains useful items of data often used in searches. Not always unique.
Single Key
A field where each item of data is unique. Care must be taken when choosing a single key as some fields (family names) are not always unique.
Hash index table
A file organization that uses hashing to map a key into a location in an index, where there is a pointer to the actual data record matching the hash key
Entity Supertype
A generic entity type that is related to one or more entity subtypes. Contains common characteristics/attributes for all of its subtypes. Example: EMPLOYEE (it's a super type that has common characteristics for its entities but it has subtypes for specific kinds of employees that branch out)
Field
A group of characters, alphabetic or numeric, that has a specific meaning. Defines and stores data.
Hyperlink
A link to another location or file, typically activated by clicking on a highlighted word or image on the screen
Schema
A logical group of database objects (like tables and indexes) that are related to each other. Usually belongs to a single user or application. Useful in that they group tables by owner (or function) and enforce a first level of security by allowing each user to see only the tables that belong to that user.
distributed database
A logically related database that is stored in two or more physically independent sites
extensible markup language
A metalanguage used to represent and manipulate data elements
Ch. 2 - Null Value
A missing data value
workgroup database
A multiuser database that usually supports fewer than 50 users or is used for a specific department in an organization
SQL catalog
A named collection of schemas in a named SQL environment
Tablespace
A named logical storage unit in which data from one or more database tables, views, or other database objects may be stored
Physical file
A named portion of secondary memory such as a hard disk allocated for the purpose of storing physical records
NoSQL
A new generation of database management systems that is not based on the traditional relational database model
query language
A nonprocedural language that is used by a DBMS to manipulate its data
Entity
A person, place, thing, or event about which data will be collected and stored.
structured query language
A powerful and flexible relational database language composed of commands that enable users to create database and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information
Relational Diagram
A representation of the relational database's entities, attributes within entities, and relationships between entities.
Constraint
A restriction placed on the data. Ex. A student's GPA must be between 0.00 and 4.00. A class may only have one teacher.
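The GPA rule above could be enforced with a CHECK constraint. This is a sketch using Python's sqlite3 module; the student table and column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        stu_id INTEGER PRIMARY KEY,
        gpa REAL CHECK (gpa BETWEEN 0.00 AND 4.00)
    )
    """
)
conn.execute("INSERT INTO student VALUES (1, 3.5)")  # satisfies the constraint

try:
    conn.execute("INSERT INTO student VALUES (2, 4.7)")  # violates the constraint
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)
```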
Hashing algorithm
A routine that converts a primary key value into a relative record number or relative file address
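One common hashing routine is division-remainder hashing. The sketch below assumes numeric keys and a hypothetical file with 7 buckets; real systems choose the bucket count and collision handling differently.

```python
def relative_record_number(key: int, bucket_count: int) -> int:
    """Division-remainder hashing: map a numeric key to a slot in 0..bucket_count-1."""
    return key % bucket_count


BUCKET_COUNT = 7  # illustrative size only
buckets = [[] for _ in range(BUCKET_COUNT)]

for key in (1001, 1008, 2346):
    # Keys that hash to the same slot (a collision) share a bucket here.
    buckets[relative_record_number(key, BUCKET_COUNT)].append(key)

print(buckets)
```

Keys 1001 and 1008 collide in this example, which is why a hashed file needs some collision strategy such as buckets or overflow areas.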
Ch. 1 - Database Application
A set of one or more computer programs that serves as an intermediary between the user and the DBMS. Can read and modify SQL statements. Can present data to users in the form of forms & reports.
business intelligence
A set of tools and processes used to capture, collect, integrate, store, and analyze data to support business decision making
online analytical processing
A set of tools that provide advanced data analysis for retrieving, processing, and modeling data from the data warehouse
database
A shared, integrated computer structure that houses a collection of related data
What is the difference between a simple attribute and a composite attribute?
A simple attribute cannot be broken down into smaller components whereas a composite attribute can be. An example of a simple attribute is last name. An example of a composite attribute is mailing_address, which would have street, city, state and zip code as components.
Primary Key
A single key that must have a value.
desktop database
A single-user database that runs on a personal computer
data warehouse
A specialized database that stores historical and aggregated data in a format optimized for decision support
External Schema
A specific representation of an external view.
Internal schema
A specific representation of an internal model.
Hashed file organization
A storage system in which the address for each record is determined using a hashing algorithm
Data Dictionary
A table providing a comprehensive description of each field in the database. This commonly includes: field name, data type, data format, field size, description and example.
File organization
A technique for physically arranging the records of a file on secondary storage devices
Raw facts
A telephone number, a birth date, a customer name, and a year to date (YTD) sales value
Weak Relationship
AKA non-identifying relationship. Exists if the primary key of the related entity does not contain a primary key component of the parent entity.
Extended Entity Relationship Model (EERM)
AKA enhanced entity relationship model. The result of adding more semantic constructs to the original ER model. Uses a EERD (EER Diagram). Has entity supertypes, subtypes, and clustering.
Strong Relationship
AKA identifying relationship. Exists when the primary key of the related entity contains a PK component of the parent entity.
tuple variable
AKA iterator--loops over each individual tuple in EMPLOYEE and evaluates it based on the WHERE clause--if it satisfies the conditions, it is selected.
Disjoint Subtypes
AKA nonoverlapping subtypes; Subtypes that contain a unique subset of the supertype entity set.; each entity instance of the supertype can appear in only one of the subtypes. Example: Employee (supertype) who is a pilot (subtype) can appear only in the PILOT subtype not in any other subtypes. Indicated by the letter d.
Structural dependence
Access to a file is dependent on its own structure. All file system programs are modified to conform to a new file structure.
Manual file systems
Accomplished through a system of file folders and filing cabinets.
Extended Relational Data Model (ERDM)
Adds many OO model features to the relational database structure.
Which of the following conditions should exist if an associative entity is to be created?
All the relationships for the participating entities are many-to-many. The new associative entity has independent meaning. The new associative entity participates in independent relationships. (all of the above)
Character
Alphabetic or numeric
Relationship definition
An association between entities
Atomic Attribute
An attribute that cannot be further subdivided. It displays ATOMICITY.
Simple Attribute
An attribute that cannot be subdivided. Example: age, sex, marital status.
Optional Attribute
An attribute that does not require a value; it can be left empty. Example: Not all students have a middle name or phone number (so these would not be required).
Required Attribute
An attribute that must have a value; it cannot be left empty. In the Crow's Foot Model, these are in bold to indicate that there must be data entry. Example: students are assumed to all have a last and first name (so these would be required).
Existence-dependent
An entity that can exist in the database only when it is associated with another related entity occurrence. It has a mandatory FK attribute that cannot be null. Example: PARENT has CHILD (the CHILD entity is existence-dependent on the PARENT entity because a child cannot exist without the parent).
HAVING clause
An extension of the GROUP BY feature; it operates like WHERE in the SELECT statement. BUT: WHERE applies to columns and expressions for individual rows. HAVING is applied to the output of a GROUP BY operation.
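A sketch of the WHERE versus HAVING distinction, run through Python's sqlite3 module. The employee table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ana", "IT", 50000), ("Ben", "IT", 70000), ("Cy", "HR", 40000), ("Di", "HR", 20000)],
)

# WHERE filters individual rows before grouping;
# HAVING filters the groups produced by GROUP BY.
high_avg_depts = conn.execute(
    """
    SELECT dept, AVG(salary)
    FROM employee
    WHERE salary > 25000
    GROUP BY dept
    HAVING AVG(salary) > 45000
    ORDER BY dept
    """
).fetchall()
print(high_avg_depts)
```

Di's row is dropped by WHERE, so HR averages 40000 and the whole HR group is then dropped by HAVING.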
database system
An organization of components that defines and relates the collection, storage, management, and use of data in a database environment
What is a derived attribute, and how is it different from a stored attribute?
Answer: A derived attribute is an attribute whose value can be calculated from other related attributes. A derived attribute is not stored in the physical table which is eventually created from the ERD. A stored attribute, as its name implies, is stored as a column in the physical table.
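A sketch of a derived attribute that is not stored, using Python's sqlite3 module. The line table and the line_total alias are hypothetical; the point is only that the value is computed on each read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE line (line_id INTEGER PRIMARY KEY, units INTEGER, price REAL)")
conn.execute("INSERT INTO line VALUES (1, 3, 2.50)")

# line_total is derived: computed at query time, never stored as a column.
line_total = conn.execute("SELECT units * price AS line_total FROM line").fetchone()[0]

# After an update, the next computation reflects the current values.
conn.execute("UPDATE line SET units = 4 WHERE line_id = 1")
new_total = conn.execute("SELECT units * price AS line_total FROM line").fetchone()[0]
print(line_total, new_total)
```

This saves storage and always yields the current value, at the cost of CPU work on every query.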
How is a strong entity different from a weak entity?
Answer: A strong entity type exists independently of any other entities. A weak entity type depends on another (strong) entity type. When an instance of the strong entity type no longer exists, any weak entity instances which depend upon the strong entity cease to exist.
What is an associative entity? What four conditions should exist in order to convert a relationship to an associative entity?
Answer: An associative entity is an entity type that associates the instances of one or more entity types and contains attributes that are peculiar to the relationship between those entity instances. Often, a many-to-many relationship is converted to an associative entity. The following four conditions should exist in order to do this: 1. All the relationships for the participating entities types are many relationships. 2. The resulting associative entity has independent meaning. 3. The associative entity has one or more attributes other than the identifier. 4. The associative entity participates in one or more relationships independent of the entities in the associative relationship.
What is the difference between an entity type and an entity instance?
Answer: An entity type is a collection of entities that share common properties. An entity instance is a single occurrence of an entity type. So, for example, STUDENT is an entity type and John Smith is an entity instance.
What are the three different degrees of relationship?
Answer: The three possible degrees are: Unary (an instance of one entity is related to an instance of the same entity type), Binary (an entity instance of one type is related to an entity instance of another type) and Ternary (instances of three different types participate in a relationship).
Base tables AKA
Base relations--are actually created and stored in the DBMS
Unified Modeling Language (UML)
Based on OO concepts, is a set of diagrams and symbols used to graphically model a system.
Key-Value model
Based on a structure composed of two elements: a key and a value in which every key has a corresponding value or set of values.
Iterative Process
Based on repetition of processes and procedures. This is when you make an ER diagram. Have to identify business rules, main entities and relationships, develop an initial ERD, identify attributes and PKs, and then revise/review the ERD.
Generalization
Bottom-up process of identifying a higher-level, more generic entity supertype from lower-level subtypes. Based on grouping the common characteristics and relationships of the subtypes. Example: might identify piano, violin and guitar. You could identify a "string instrument" entity supertype as a common characteristic for the subtypes.
Ch. 2 - BI Systems
Business intelligence systems are information systems used to support management decisions by producing information for assessment, analysis, planning, and control.
Aggregate Functions
COUNT, MIN, MAX, SUM, AVG Used with SELECT to return mathematical summaries on columns
ORDER BY
Can organize Query results using this clause. Ex. ORDER BY D.Dname, E.Lname, E.Fname;
Business intelligence
Captures and processes business data to generate information that supports decision making.
Attribute definition
Characteristics of entities.
EXISTS
Checks whether a subquery returns any rows
BETWEEN
Checks whether an attribute value is within a range
LIKE
Checks whether an attribute value matches a given string pattern
IN
Checks whether an attribute value matches any value within a value list
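The four special operators above (EXISTS, BETWEEN, LIKE, IN) can be tried side by side. This sketch uses Python's sqlite3 module with an invented product table; the helper function codes is hypothetical and exists only to shorten the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (code TEXT, descr TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("P1", "Claw hammer", 9.95), ("P2", "Hand saw", 14.99), ("P3", "Hacksaw", 17.49)],
)


def codes(where_clause):
    """Return the product codes that satisfy the given WHERE condition."""
    sql = "SELECT code FROM product WHERE " + where_clause + " ORDER BY code"
    return [row[0] for row in conn.execute(sql)]


between_hits = codes("price BETWEEN 10 AND 15")  # value within a range
like_hits = codes("descr LIKE '%saw'")           # string pattern match
in_hits = codes("code IN ('P1', 'P3')")          # value in a list
# EXISTS is true only if the subquery returns at least one row.
exists_hits = codes("EXISTS (SELECT 1 FROM product WHERE price > 100)")
print(between_hits, like_hits, in_hits, exists_hits)
```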
Which of the following criteria should be considered when selecting an identifier?
Choose an identifier that is stable. Choose an identifier that will not be null. Choose an identifier that doesn't have large composite attributes. (All of the above)
Database management system DBMS
Collection of programs; manages the database structure; controls access to data stored in the database.
Ch. 1 - What is it called when more than one column in a table is combined to form a primary key?
Composite key
Computerized file systems
Computer-based system that tracks data and produces required reports.
Entity Subtype
Contain their own unique characteristics but are directly related to a supertype and inherit its attributes too. Example: a PILOT is a subtype of EMPLOYEE. MECHANIC and ACCOUNTANT are also subtypes (these have unique attributes). CLERK is not acceptable because it only satisfies that it is an identifiable kind of employee but none of the attributes are unique to just clerks.
General-purpose database
Contains a wide variety of data used in multiple disciplines.
Ch. 1 - Row
Contains data about a particular instance
Database Developer
Create and maintain database-based applications, programming, database fundamentals, SQL
Cloud database
Created and maintained using cloud data services that provide defined performance measures for the database.
CREATE SCHEMA AUTHORIZATION
Creates a db schema
CREATE TABLE
Creates a new table in the user's db schema
Currency
Currency values and numeric data used in mathematical calculations
Data dependence
Data access changes when data storage characteristics change. Significant for difference between logical and physical format.
Distributed database
Data is distributed across different sites.
Centralized database
Data is located at a single site.
What are some of the guidelines for good data names of objects in general?
Data names always should: 1. Relate to the business not technical characteristics. Student would be a good name but not filest023. 2. Be meaningful so that the name tells what the object is about 3. Be unique 4. Be readable 5. Be composed of words taken from an approved list 6. Be repeatable 7. Follow a standard syntax
Data independence
Data storage characteristics are changed without affecting the program's ability to access the data.
unstructured data
Data that exists in its original, raw state; that is, in the format in which it was collected
semistructured data
Data that has already been processed to some extent
Ch. 2 - Data Warehouse
Database systems that have data, programs, and personnel that specialize in the preparation of data for BI processing.
Ch. 2 - Operational Database
Database that stores the company's current day-to-day transaction data
Date/Time
Date and Time values for the years 100 through 9999
Structured Query Language (SQL)
De facto query language and data access standard supported by the majority of DBMS vendors
FOREIGN KEY
Defines a FK for a table
PRIMARY KEY
Defines a PK for a table
DROP TABLE command does what?
Deletes a table from the database
Relationship
Describes an association among entities. Data models use 3 types: one to many, one to one, and many to many.
Operational database
Designed to support a company's day-to-day operations.
Database Analyst
Develop Databases for decision support reporting, SQL, query optimization, data warehouses
Data inconsistency
Different versions of the same data appear in different places.
Web mining
Discovers patterns in a less structured DBMS
Vertical partitioning
Distribution of the columns of a logical relation into several separate physical tables
Horizontal partitioning
Distribution of the rows of a logical relation into several separate tables
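A plain-Python sketch of the two partitioning styles, treating a relation as a list of dictionaries. The column names and the split by region are assumptions for illustration, not a DBMS feature.

```python
rows = [
    {"id": 1, "name": "Ana", "region": "East", "notes": "long text"},
    {"id": 2, "name": "Ben", "region": "West", "notes": "more text"},
]

# Horizontal partitioning: each partition keeps all columns but only some rows.
east = [r for r in rows if r["region"] == "East"]
west = [r for r in rows if r["region"] == "West"]

# Vertical partitioning: each partition keeps all rows but only some columns.
# The key (id) is repeated in both pieces so they can be rejoined.
core = [{"id": r["id"], "name": r["name"], "region": r["region"]} for r in rows]
bulky = [{"id": r["id"], "notes": r["notes"]} for r in rows]

print(east, west)
print(core, bulky)
```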
The ERM forms the basis of an ____, which represents _____.
ERD; the conceptual database as viewed by the end user.
Security management
Enforces user security and data privacy
Performance tuning
Ensures efficient performance of the database in terms of storage and access speed.
NOT NULL
Ensures that a column will not have null values
Database main components
Entities, attributes, and relationships
ERD
Entity Relationship Diagram; describes the data and maps processes to data requirements
Specialization Hierarchy
Entity supertypes and subtypes are organized into this; it depicts the arrangement of higher-level entity supertypes (parent entities) and lower-level entity subtypes (child entities). A subtype can only exist within the context of a supertype and every subtype can only have one supertype to which it is directly related. But it can have many levels of supertype/subtype relationships so a supertype can have many subtypes.
One-to-One (1:1 or 1..1)
Ex. Each store is managed by one employee, and each manager employee manages one store.
One-to-Many (1:M or 1..*)
Ex. Painter paints many paintings, but each painting is made by one painter.
Ch. 2 - ETL System
Extract, Transform, and Load System - cleans and prepares data for BI processing, then the data is stored in the data warehouse DBMS.
aliases or tuple variables
Example FROM Employee AS E, Employee AS S WHERE E.Super_ssn=S.ssn;
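A runnable version of the alias example, using Python's sqlite3 module. The three sample rows are invented; E ranges over employees and S over their supervisors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, lname TEXT, super_ssn TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("111", "Borg", None), ("222", "Wong", "111"), ("333", "Smith", "222")],
)

# The same table appears twice under two aliases (tuple variables).
pairs = conn.execute(
    """
    SELECT E.lname, S.lname
    FROM employee AS E, employee AS S
    WHERE E.super_ssn = S.ssn
    ORDER BY E.lname
    """
).fetchall()
print(pairs)
```

Borg has no supervisor (NULL super_ssn), so no row pairs Borg with anyone in the E role.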
Unary Relationship
Exists when an association is maintained with a single entity. AKA a recursive relationship because the entity has a relationship with itself. Example: EMPLOYEE manages EMPLOYEE (a manager is also an employee, one who manages other employees). (1:1 relationship with itself)
data redundancy
Exists when the same data is stored unnecessarily at different places
Transitive Dependency
Exists when there are functional dependencies such that X-->Y, Y-->Z, and X is the PK. So the dependency X -->Z is a transitive dependency because X determines the value of Z via Y. *more difficult to identify among a set of data. AKA the signaling dependency.
Partial Dependency
Exists when there is a functional dependence in which the determinant is only part of the PK. Example: (A,B) is the PK, then the functional dependence B --> C is a partial dependency because only part of the PK (B) is needed to determine the value of C. *these are straightforward and easy to identify
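A small Python sketch for testing whether a functional dependency holds in sample data. The function name holds and the sample rows are hypothetical; here (proj, emp) plays the role of the composite PK.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not contradicted by rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"proj": 1, "emp": 10, "emp_name": "Ana", "hours": 5},
    {"proj": 2, "emp": 10, "emp_name": "Ana", "hours": 8},
    {"proj": 1, "emp": 11, "emp_name": "Ben", "hours": 5},
]

# emp alone (part of the PK) determines emp_name: a partial dependency.
partial = holds(rows, ["emp"], ["emp_name"])
# emp alone does not determine hours; the full key is needed.
emp_gives_hours = holds(rows, ["emp"], ["hours"])
full_key_gives_hours = holds(rows, ["proj", "emp"], ["hours"])
print(partial, emp_gives_hours, full_key_gives_hours)
```

Sample data can only refute a dependency; whether it truly holds is decided by the business rules, not by the rows that happen to be present.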
Ternary Relationship
Exists when three entities are associated. (higher degrees exist like four-degree relationship, but they are rare). Example: DOCTOR, PATIENT, AND DRUG and the verb for the relationship is prescribes. DOCTOR writes many PRESCRIPTIONs PATIENT receives many PRESCRIPTIONs DRUG appears in many PRESCRIPTIONs
Binary Relationship
Exists when two entities are associated in a relationship. The most common type of relationship. Higher-order relationships, like ternary, are decomposed into equivalent binary relationships whenever possible to simplify the conceptual design. Example: PROFESSOR teaches one or more CLASSes.
Cardinality
Expresses the minimum and maximum number of entity occurrences associated with one occurrence of the related entity. Indicated by putting numbers by the entities like (x,y). X is the minimum number of associated entities, and y is max. Example: A PROFESSOR teaches a CLASS. Cardinality: PROFESSOR(1,1) and CLASS (1,4) This means that each professor teaches up to four classes, so the PK occurs at least once and no more than 4 times as FKs in the CLASS table.
A business rule is a statement of how a policy is enforced or conducted.
FALSE
A cardinality constraint tells what kinds of properties are associated with an entity.
FALSE
A relationship instance is an association between entity instances where each relationship instance includes exactly one entity from each participating entity type.
FALSE
A ternary relationship is equivalent to three binary relationships.
FALSE
An entity type on which a strong entity is dependent is called a covariant entity.
FALSE
An example of a term would be the following sentence: "A student registers for a course."
FALSE
Business rules are formulated from a collection of business ramblings.
FALSE
Data names do not have to be unique.
FALSE
In an E-R diagram, strong entities are represented by double-walled rectangles.
FALSE
It is not permissible to associate attributes with relationships.
FALSE
Most systems developers believe that data modeling is the least important part of the systems development process.
FALSE
Some examples of attributes are: eye_color, weight, student_id, student.
FALSE
The degree of a relationship is the number of attributes that are associated with it
FALSE
The intent of a business rule is to break down business structure
FALSE
The maximum criminality of a relationship is the maximum number of instances of entity B that may be associated with each instance of entity A.
FALSE
The name used for an entity type should never be the same in other E-R diagrams on which the entity appears.
FALSE
The purpose of data modeling is to document business rules about processes.
FALSE
The relationship among the instances of three entity types is called a unary relationship.
FALSE
When systems are automatically generated and maintained, quality is diminished.
FALSE
Well designed databases:
Facilitate data management, generate accurate and valuable information.
Key
Fields that are used to sort and retrieve information. It holds a data item that is unique for each record. The key is used so that not all the data is read.
Structural independence
File structure is changed without affecting the application's ability to access the data.
Double
Floating-point decimal numbers from -1.797 x 10^308 to 1.797 x 10^308, with about 15 significant digits of precision.
For the relationship represented in the figure below, which of the following is true?
A department can have more than one employee.
GROUP BY
Groups the selected rows based on one or more attributes
Database Consultant
Help companies leverage database technologies to improve business processes and achieve specific goals, database fundamentals, data modeling, database design, SQL, DBMS, hardware vendor specific technologies, etc.
Core
Implemented by RDBMS vendors that comply with SQL--must have specifications in SQL
Sparse data
In NoSQL DBs, the case in which the number of attributes is very large but the number of actual data instances is low.
Eventual consistency
In NoSQL, means updates to the DB will propagate through the system and eventually all data copies will be consistent.
In the following diagram, what type of relationship is depicted?
Ternary
In the following diagram, which is true?
It depicts a unary relationship. It depicts a many-to-many relationship. There is an associative entity. (All of the above)
In the following diagram, which of the answers below is true?
Each patient has one or more patient histories. Each patient history belongs to one and only one patient. (both A and C)
islands of information
In the old file system environment, pools of independent, often duplicated, and inconsistent data created and managed by different departments
File System Redux: Modern End User Productivity Tools
Includes spreadsheet programs such as Microsoft Excel
Which of the following is NOT a characteristic of a good business rule?
Inconsistent
Disadvantages of Database Systems
Increased costs, Management complexity, Maintaining currency, Vendor dependence, Frequent upgrade/replacement cycles
Advantages of DBMS
Increased end-user productivity; data sharing, security, access, and decision making.
Default join
Inner join
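A sketch showing that a bare JOIN behaves as an inner join, checked in SQLite through Python's sqlite3 module. Table names and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("CREATE TABLE employee (name TEXT, dept_id INTEGER)")
conn.executemany("INSERT INTO dept VALUES (?, ?)", [(1, "IT"), (2, "HR")])
conn.executemany("INSERT INTO employee VALUES (?, ?)", [("Ana", 1), ("Ben", None)])

query = "SELECT e.name, d.dname FROM employee e {join} dept d ON e.dept_id = d.dept_id"
plain_join = conn.execute(query.format(join="JOIN")).fetchall()
inner_join = conn.execute(query.format(join="INNER JOIN")).fetchall()
print(plain_join, inner_join)
```

Ben (no department) and HR (no employees) are both absent from the result, because an inner join returns only matching rows.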
INSERT
Inserts row(s) into a table
Role of the DBMS
Intermediary between the user and the database; enables data sharing; presents the end user with an integrated view of the data; receives and translates application requests into operations; hides the database's internal complexity from programs and users.
What does * mean?
It means "all the attributes"
Boolean
It only requires one byte of data. This is often used for fields that require answers such as Yes/No - Y or N, True/False - T or F, Gender - M or F.
Entity naming conventions
It's a noun; usually written in all CAPS.
Problems with File System Data Processing
Lengthy development times, Difficulty of getting quick answers, Complex system administration, Lack of security and limited data sharing, Extensive programming
Query language
Lets the user specify what must be done without having to specify how
Record
Logically connected set of one or more fields that describe a person, place, or thing. E.g. Customer name, address, phone number, date of birth.
Database Systems
Logically related data stored in a single logical data repository, DBMS eliminates most of file system's problems, Current generation DBMS software
Physical model
Lowest level of abstraction. Describes the way data are saved on storage media.
Composite Key
Made by joining two or more fields together. Used when no item in any field can be guaranteed to be unique.
Database Administrator
Manage and maintain DBMS and databases, database fundamentals, SQL, vendor courses
Typical Data Migration Order
Manual -> Conventional Files/Spreadsheets -> Database
Evolution of File System Data Processing
Manual File Systems -> Computerized File Systems -> File System Redux: Modern End User Productivity Tools
Internal model
Maps the conceptual model to the DBMS. It's what the DBMS "sees" and in relational DB is composed of SQL commands.
Virtual relations
May or may not correspond to actual file--made through "create view" statement
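A sketch contrasting a base table with a view, using Python's sqlite3 module. The employee table and the it_staff view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary REAL)")  # base table
conn.execute("INSERT INTO employee VALUES ('Ana', 'IT', 50000)")

# The view stores only its defining query, not a copy of the rows.
conn.execute("CREATE VIEW it_staff AS SELECT name FROM employee WHERE dept = 'IT'")

# A row added to the base table later is visible through the view.
conn.execute("INSERT INTO employee VALUES ('Ben', 'IT', 70000)")
view_rows = [r[0] for r in conn.execute("SELECT name FROM it_staff ORDER BY name")]
print(view_rows)
```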
Software Independence
Means that the model in use does not depend on the DBMS software used to implement the model.
Hardware Independence
Means that the model in use is not dependent on the hardware used to implement the model.
Optional Entity
Minimum cardinality is 0.
Raw data
Not yet been processed to reveal the meaning.
Mandatory Participation
One entity occurrence requires a corresponding entity occurrence in a particular relationship. If no optionality symbol is depicted with the entity in crow's foot notation, then the entity is assumed to exist in a mandatory relationship with the related entity.
Ch. 2 - OLTP
Online transaction processing system is used to record all sales transactions of the company.
Which of the following is an entity type on which a strong entity depends?
Owner Member Attribute D) None of the above is the answer
DROP TABLE
Permanently deletes a table (and its data)
COMMIT
Permanently saves data changes
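A sketch of COMMIT next to its counterpart ROLLBACK, using Python's sqlite3 module with its default transaction handling. The account table is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()  # the insert is now permanent

conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.rollback()  # the uncommitted update is discarded

balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)
```

Only changes made before the commit survive; the later update never becomes permanent.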
Logically related data stored in a single logical data repository
Physically distributed among multiple storage facilities
Strong Relationship
Primary key of related entity contains a primary key component of the parent entity
Information
Produced by processing data, Reveals the meaning of data, Enables knowledge creation, Should be accurate, relevant, and timely to enable good decision making
Information
Produced by processing data. Reveals meaning of data; should be accurate, relevant, and timely.
Ch. 1 - What does a foreign key do?
Provides the link between two tables, creating a relationship
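A sketch of a foreign key linking two tables, using Python's sqlite3 module. Table and column names are assumptions; note that SQLite enforces foreign keys only after the PRAGMA shown is turned on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch for FK enforcement
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        dept_id INTEGER NOT NULL REFERENCES department (dept_id)
    )
    """
)
conn.execute("INSERT INTO department VALUES (1, 'IT')")
conn.execute("INSERT INTO employee VALUES (10, 1)")  # parent row exists

try:
    conn.execute("INSERT INTO employee VALUES (11, 99)")  # no department 99
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```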
Ch. 2 - What is Microsoft Access's GUI style called?
QBE - query by example
Database access languages and application programming interfaces
Query language, Structured query language (SQL)
Data
Raw facts, Have little meaning unless they have been organized in some logical manner, Building block of information
End-user data
Raw facts of interest to end user.
Data
Raw facts, such as a telephone number, birth date, customer name, and year-to-date sales value. Little meaning unless organized in some logical manner.
Normalizing the table structure will do what?
Reduce the data redundancies. If repeating groups exist, they must be eliminated by making sure that each row defines a single entity. Dependencies must be identified to diagnose the normal form.
Time-variant data
Refer to data whose values change over time and for which you must keep a history of the data changes. For example: GPA or bank account balance change over time. Whereas, DOB or SSN are not time variant.
Relational Operator
Refers to the characters or symbols indicating the relationship between two expressions. They are used for simple queries.
Database model
Refers to the implementation of a data model in a specific database system.
Relational Database Management System
Relational Database Management System is a program that enables users to create and manage data
Wildcard Operators
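Represent one or more unknown characters. In Microsoft Access, the question mark (?) substitutes for one character and the asterisk (*) substitutes for any number of characters; standard SQL uses the underscore (_) and the percent sign (%) for the same purposes.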
Conceptual Model
Represents a global view of the entire database as seen by the entire organization. Also known as the conceptual schema. The most widely used conceptual model is the entity relationship (ER) model, illustrated with ER diagrams (ERDs).
Extensible Markup Language (XML)
Represents data elements in textual format
AND
Requires both the first and the second condition to be true.
OR
Requires either the first or the second condition (or both) to be true.
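A small sketch contrasting AND and OR in a WHERE clause, run through Python's sqlite3 module. The PRODUCT table and its rows are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL, in_stock INTEGER)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("hammer", 12.0, 1), ("drill", 80.0, 1), ("saw", 25.0, 0)],
)

# AND: a row qualifies only if both conditions hold.
both = [r[0] for r in conn.execute(
    "SELECT name FROM product WHERE price < 50 AND in_stock = 1 ORDER BY name")]

# OR: a row qualifies if at least one condition holds.
either = [r[0] for r in conn.execute(
    "SELECT name FROM product WHERE price < 50 OR in_stock = 1 ORDER BY name")]

print(both)    # ['hammer']
print(either)  # ['drill', 'hammer', 'saw']
```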
MAX
Returns the maximum attribute value found in a given column
MIN
Returns the minimum attribute value found in a given column
COUNT
Returns the number of rows with non-null values for a given column
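The three aggregate functions can be tried together. The sketch below uses an invented EMPLOYEE table in which one salary is NULL, to show that COUNT(column) skips nulls while COUNT(*) counts every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ana", 50000), ("Ben", 42000), ("Cy", None)],  # Cy's salary is NULL
)

max_sal, min_sal, n_salaries, n_rows = conn.execute(
    "SELECT MAX(salary), MIN(salary), COUNT(salary), COUNT(*) FROM employee"
).fetchone()

# COUNT(salary) ignores the NULL; COUNT(*) does not.
print(max_sal, min_sal, n_salaries, n_rows)
```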
Rules of precedence
Rules that establish the order in which computations are completed: parentheses, exponents, multiplication and division, then addition and subtraction (PEMDAS).
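A quick check of how precedence changes a result, evaluated by SQLite through Python's sqlite3 module. The numbers are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Multiplication binds tighter than addition unless parentheses say otherwise.
no_parens, with_parens = conn.execute(
    "SELECT 8 + 2 * 5, (8 + 2) * 5"
).fetchone()

print(no_parens)    # 18
print(with_parens)  # 50
```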
Retrieve the b-date and address of employee(s) whose name is John B. Smith
SELECT Bdate, Address FROM EMPLOYEE WHERE Fname='John' AND Minit='B' AND Lname='Smith';
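To try the query above, the sketch below builds a throwaway EMPLOYEE table in an in-memory SQLite database. The column types and the two sample rows are assumptions; only the column names and the query text come from the card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (Fname TEXT, Minit TEXT, Lname TEXT, Bdate TEXT, Address TEXT)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    [
        ("John", "B", "Smith", "1970-01-15", "12 Example St"),  # made-up row
        ("John", "Q", "Smith", "1980-05-02", "34 Sample Ave"),  # made-up row
    ],
)

# All three name parts must match, so only the first row is returned.
result = conn.execute(
    "SELECT Bdate, Address FROM EMPLOYEE "
    "WHERE Fname='John' AND Minit='B' AND Lname='Smith'"
).fetchall()
print(result)
```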
SELECT Statement
SQL's one basic statement for retrieving information from a database.
Islands of information
Scattered data locations. Increases the probability of having different versions of the same data.
Database
A shared, integrated computer structure that stores a collection of end-user data and metadata.
Multiuser access control
Sophisticated algorithms ensure that multiple users can access the database concurrently without compromising its integrity
Ch. 1 - Column
Stores a characteristic common to all rows
Analytical database
Stores historical data and business metrics used exclusively for tactical or strategic decision making.
Workgroup databases
Supports a small number of users or a specific department
Enterprise database
Supports many users across many departments
Multi-user database
Supports multiple users at the same time.
Single-user database
Supports one user at a time. When it runs on a personal computer, it is called a desktop database.
A fact is an association between two or more terms.
TRUE
A good data definition is always accompanied by diagrams, such as the entity-relationship diagram.
TRUE
A single occurrence of an entity is called an entity instance.
TRUE
Data modeling is about documenting rules and policies of an organization that govern data.
TRUE
Data names should always relate to business characteristics.
TRUE
Data, rather than processes, are the most complex aspects of many modern information systems
TRUE
Enforcement of business rules can be automated through the use of software tools that can interpret the rules and enforce them.
TRUE
It is desirable that no two attributes across all entity types have the same name.
TRUE
One reason to use an associative entity is if the associative entity has one or more attributes in addition to the identifier.
TRUE
Participation in a relationship may be optional or mandatory.
TRUE
Relationships represent action being taken using a verb phrase.
TRUE
The E-R model is used to construct a conceptual model.
TRUE
The relationship between a weak entity type and its owner is an identifying relationship.
TRUE
The relationship between the instances of two entity types is called a binary relationship.
TRUE
When choosing an identifier, choose one that will not change its value often.
TRUE
T/F? The relationship classification is difficult to establish if you only know one side of the relationship.
TRUE.
Identifiers
The ERM uses these; they are one or more attributes that uniquely identify each entity instance. Can be the PK of a table. These are UNDERLINED in the ERD.
a composite attribute
The address attribute is
Subtype Discriminator
The attribute in the supertype entity that determines to which subtype the supertype occurrence is related. Example: EMPLOYEE_TYPE. If EMPLOYEE_TYPE has a value of "P", the supertype occurrence is related to the PILOT subtype.
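A sketch of the EMPLOYEE_TYPE example in SQLite, via Python's sqlite3 module. The table layouts, the pilot_license column, and the "M" type code are hypothetical; only EMPLOYEE_TYPE, the value "P", and the PILOT subtype come from the card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Supertype: employee_type is the subtype discriminator.
    CREATE TABLE employee (
        emp_id        INTEGER PRIMARY KEY,
        emp_name      TEXT NOT NULL,
        employee_type TEXT
    );
    -- Subtype: shares the supertype's primary key.
    CREATE TABLE pilot (
        emp_id        INTEGER PRIMARY KEY REFERENCES employee(emp_id),
        pilot_license TEXT
    );
    INSERT INTO employee VALUES (1, 'Ana', 'P'), (2, 'Ben', 'M');
    INSERT INTO pilot VALUES (1, 'ATP');
""")

# A discriminator value of 'P' tells us to look in the PILOT subtype.
pilots = conn.execute("""
    SELECT e.emp_name, p.pilot_license
    FROM employee e JOIN pilot p ON p.emp_id = e.emp_id
    WHERE e.employee_type = 'P'
""").fetchall()
print(pilots)
```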
knowledge
The body of information and facts about a specific subject
query result set
The collection of data rows returned by a query
database management system
The collection of programs that manages the database structure and controls access to the data stored in the database
Ch. 1 - What is database design?
The creation of the proper structure of database tables, the proper relationships between tables, appropriate data constraints, and other structural components of the database.
Participants
The entities that participate in a relationship. Each relationship is identified by a name that describes it, usually an active or passive verb. For example, a student TAKES a class; a professor TEACHES a class.
enterprise database
The overall company data representation, which provides support for present and expected future needs
data processing specialist
The person responsible for developing and managing a computerized file processing system
Subschema
The portion of the database "seen" by the application programs that actually produce the desired info from the data within the database.
Searching
The process of retrieving data from the database by using something relevant to the data such as a string of characters or a word
Denormalization
The process of transforming normalized relations into non-normalized physical record specifications
database design
The process that yields the description of the database structure and determines the database components. The second phase of the Database Life Cycle
multivalued attribute
The skill attribute is
Field
The smallest unit of application data recognized by system software
physical data format
The way a computer "sees" (stores) data
logical data format
The way a person views data within the context of a problem domain
derived
The years employed attribute is
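A derived attribute is computed from stored values rather than stored itself. The sketch below derives years employed from a hire date, assuming a hypothetical EMPLOYEE table and a fixed "as of" date so the result is repeatable. Dividing by 365.25 is an approximation, not an exact anniversary calculation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_name TEXT, hire_date TEXT)")
conn.execute("INSERT INTO employee VALUES ('Ana', '2015-03-01')")

AS_OF = "2024-06-01"  # fixed reference date (assumption for the example)

# Only hire_date is stored; years employed is calculated when queried.
years_employed = conn.execute(
    "SELECT CAST((julianday(?) - julianday(hire_date)) / 365.25 AS INTEGER) "
    "FROM employee WHERE emp_name = 'Ana'",
    (AS_OF,),
).fetchone()[0]
print(years_employed)  # 9
```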
Relational Schema
A shorthand notation for the table structure, in which the primary key attributes are underlined.
Query language (QL)
A specialised language designed to let users access information in the database. It is the most complex method because the user must learn the language, but it provides the most power and flexibility for storing and retrieving data. Different DBMSs support different query languages. SQL (Structured Query Language) is a standard query language, but different versions of it are in use.
Menu
This is often the easiest way to pose a query but is the least flexible. The DBMS presents the user with a list of options from which to choose.
Query by example (QBE)
This requires the user to enter the criteria against a field. For example, if you were looking for people who lived in Eastwood, you would type 'Eastwood' in the 'Suburb' field and leave the remaining fields blank. The DBMS would then search the database and select all records that have Eastwood in the 'Suburb' field.
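One way to picture what a QBE form does behind the scenes is to turn the filled-in fields into a WHERE clause and ignore the blank ones. The sketch below is only an illustration of that idea, not how any particular DBMS implements QBE; the PERSON table, its fields, and the qbe_search helper are hypothetical.

```python
import sqlite3

ALLOWED_FIELDS = ("name", "street", "suburb")  # hypothetical form fields

def qbe_search(conn, example):
    """Build a WHERE clause from the non-blank fields of a QBE-style form."""
    # Keep only known fields that the user actually filled in.
    criteria = {f: v for f, v in example.items() if f in ALLOWED_FIELDS and v}
    sql = "SELECT name, suburb FROM person"
    if criteria:
        sql += " WHERE " + " AND ".join(f"{f} = ?" for f in criteria)
    sql += " ORDER BY name"
    return conn.execute(sql, list(criteria.values())).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, street TEXT, suburb TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("Ana", "1 High St", "Eastwood"),
     ("Ben", "9 Low Rd", "Ryde"),
     ("Cy", "4 Mid Ave", "Eastwood")],
)

# 'Eastwood' typed in the Suburb field; the other fields left blank.
matches = qbe_search(conn, {"name": "", "street": "", "suburb": "Eastwood"})
print(matches)
```

Field names are checked against a fixed list and values are passed as parameters, so user input never becomes part of the SQL text.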
What is the function of the PK?
To guarantee entity integrity, not "describe" the entity.
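Entity integrity means every row has a unique, non-null primary key. The sketch below shows a primary key rejecting a duplicate value and a NULL, using an invented STUDENT table. The explicit NOT NULL is there because SQLite, unlike most DBMSs, tolerates NULL in a non-integer PRIMARY KEY column unless told otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id TEXT PRIMARY KEY NOT NULL, name TEXT)"
)
conn.execute("INSERT INTO student VALUES ('S1', 'Ana')")

def is_rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Same key twice: violates uniqueness.
dup_rejected = is_rejected("INSERT INTO student VALUES ('S1', 'Ben')")
# Missing key: violates the non-null requirement.
null_rejected = is_rejected("INSERT INTO student VALUES (NULL, 'Cy')")
print(dup_rejected, null_rejected)  # True True
```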
Role of applications programmers
To translate company policies and procedures from a variety of sources into appropriate interfaces, reports, and query screens.
Specialization
Top-down process of identifying lower-level, more specific entity subtypes from a higher-level entity supertype. Based on grouping the unique characteristics and relationships of the subtypes. Example: you identify many entity subtypes from the original employee supertype.