Database Management
Definition of data mining
Business intelligence systems that use sophisticated statistical and mathematical techniques to perform what-if analyses, to make predictions, and to facilitate decisions. Contrast with reporting systems.
Other names for table, column, row
Relation, attribute, tuple File, field, record
Definition of database:
collection of data stored in a standardized format designed to be shared by multiple users.
Database Life Cycle
Stages o Requirements Collection & Analysis o Conceptual Modeling o Logical Modeling o Physical Modeling o Application Design
Entities have "independent" meanings and attributes are meaningful only within the context of an entity
--
Basic function of (database) Application programs.
-Create and process forms - Process queries - Create and process reports .Execute application logic -Control the aplication itself
Functions of a DBMS
-Create database - Create tables -Create supporting structures -Modify data -Read database data -Maintain database structures -Enforce rules -Control concurrency -Perform backup and recovery
The purpose of a database is to...?
...help people track things of interest to them.
Data is stored in...?
...tables, which have rows and columns like a spreadsheet. A database may have multiple tables, where each table stores data about a different thing. Each row in a table stores data about an occurrence or instance of the thing of interest. A database stores data and represents relationships.
What does a dashed line between entities mean?
A relationship drawn with a dashed line is used between strong entities
What relationships can have relationship attributes?
A M:N relationship can have relationship attributes. A 1:N cannot.
A data model is an abstraction of a complex real-world environment.
A blueprint which is generalized and non-DBMS-specific. Easier to change model than database.
Descriptive/Assesment techniques
A business intelligence system that processes data by filtering, sorting, and making simple calculations. OLAP is a type of reporting system. Contrast with data mining systems.
What is a candidate key?
A candidate key is a column, or set of columns, in a table that can uniquely identify any database record without referring to any other data
Comcurrency
A condition in which two or more transactions are processed against the database at the same time. In a single CPU system, the changes are interleaved; in a multi-CPU system, the transactions may be processed simultaneously, and the changes on the database server are interleaved.
What is a database?
A database is a set of data that has a regular structure and is organized in such a way that a computer can easily retrieve the desired results
A database is a self-describing collection of integrated tables.
A database is called self-describing because it stores a description of itself. The self-describing data is called metadata, which is data about data.
What is a surrogate key?
A primary key field that is automatically assigned by the computer. An example is the StudentNumber field in GSU's database.
Foreign key=
A primary key in one table that is used as an identifier in another table.
Nonidentifying relationship
A relationship drawn with a dashed line (refer to Figure 5-7) is used between strong entities and is called a nonidentifying relationship because there are no ID-dependent entities in the relationship
Single-valued, multi-valued identifier
A single identifier is to prefer since it is more likely to be unique
Definition of relation
A two-dimensional array containing single-value entries and no duplicate rows. Values for a given entity are shown in rows; values of attributes of that entity are shown in columns. The meaning of the columns is the same in every row. The order of the rows and columns is immaterial.
What is a primary key?
A unique identifier field within a table.
Double lined rectangle
A weak entity
Partial key
A weak entity's key is called a partial key (dotted underline)
Relationship between tables and relations?
All relations are tables, but not all tables are relations
What is amn ID-Dependent Entity?
An ID-dependent entity is an entity (child) whose identifier includes the identifier of another entity (parent). The ID-dependent entity is a logical extension or subunit of the parent: BUILDING : APARTMENT (The minimum cardinality from the ID-dependent entity to the parent is always one.)
What is an attribute?
Attributes describe an entity's characteristics.
Why not write n:n or m:m instead of n:m?
Because even though there is a many-to-many relationship, the relationship might not be identical both ways
Database design occurs in the __ of the SDLC
Component design
Predictive techniques
Data mining
The three steps of the database overview?
Database foundations --> database queries --> database design
Referential integrity constraints
For example, in the Student-Class-Grade database tables in Figure 1-5, what would happen if a user mistakenly entered a value of 9 for StudentNumber in the GRADE table? No such student exists, so such a value would cause numerous errors. To prevent this situation, it is possible to tell the DBMS that any value of StudentNumber in the GRADE table must already be a value of StudentNumber in the STUDENT table. If no such value exists, the insert or update request should be disallowed. The DBMS then enforces these rules, which are called referential integrity constraints.
What are Identifiers?
Identifiers are attributes that name, or identify, entity instances.
What are Composite identifiers ?
Identifiers that consist of two or more attributes.
What are the three steps of an ER-model?
Identify entities, identify relationships with mappings,add attributes
What are Parent and Child Entities?
In a one-to-many relationship: The entity on the one side of the relationship is called the parent entity or just the parent. The entity on the many side of the relationship is called the child entity or just the child.
What is a strong and a weak entity?
In an entity-relationship model, any entity whose existence in the database does not depend on the existence of any other entity is a strong entity. A weak entity is dependent on the existence of another entity.
What is an alternate key?
In entity-relationship models, a synonym for candidate key
Supertype
In generalization hierarchies, an entity or object that logically contains subtypes. For example, EMPLOYEE is a supertype of ENGINEER, ACCOUNTANT, and MANAGER.
Subset
In generalization hierarchies, an entity or object that is a subspecies or subcategory of a higher-level type, called a supertype. For example, ENGINEER is a subtype of EMPLOYEE.
Where does data modeling occur in SDLC?
In the requirements analysis step.
Two disadvantages with a surrogate key?
Independently, the surrogate key doesn't mean anything. Also, it is specified for one table and thus becomes complicated when the entity appears in multiple tables since surrogate key is only unique in one table.
What is "IE" short for?
Information Engineering (Crow's foot model)
Definition of BI
Information systems that assist managers and other professionals in the analysis of current and past activities and in the prediction of future events. Two major categories of BI systems are reporting systems and data mining systems.
What are the two types of data?
Internal (private data) generated from the organization itsel. External (public data) is generated from an outside source, such as a public database.
Clustering
Intra-cluster distances are minimized. Inter-cluster distances are maximized
Arrow used to represent relationship indicates what?
It is used in inheritance
What is mandatory and optional participation in relation to minimum cardinality?
Minimum cardinality of zero [0] indicating optional participation is indicated by placing an oval next to the optional entity. Minimum cardinality of one [1] indicating mandatory (required) participation is indicated by placing a vertical hash mark next to the required entity.
What does NULL and NOT NULL mean in regard to NULL status?
Null allows NULL, while NOT NULL does not
What is underlined in an entity-relationship diagram?
Primary key attribute(s)
DBMS
Software package that facilitates the creation, organization, and management of databases. Often includes tools for querying, security, handling multiple users.
Inheritance
Subtype entity inherits all attributes of its supertype entity Class inheritance in Java Key of student = key of person Key of employee = key of person
Why are tables integrated?
Tables are called integrated because they store data about the relationships between the rows of data.
What is the E_R model?
The entitiy-relationship model?
Definition big data
The established term for the enormous datasets created by Web applications, such as search tools (e.g., Google and Bing), and by Web 2.0 social networks, such as Facebook, LinkedIn, and Twitter
The degree of the relationship is the number of entity classes in the relationship
Two entities have a binary relationship of degree two. Three entities have a ternary relationship of degree three.
Triggers
Used for accuracy and consistency
Indexes
Used to speed up process of querying
Client-server architecture
User - Computer (mobile device) - internet - webserver (app server) - database
What are the components of a database system?
User --> Database Application --> (SQL) ---> DBMS --> Database
Identifying relationship
We use a solid line to represent the relationship between the ID-dependent entity and its parent. This type of a relationship is called an identifying relationship.
An entity with rounded corners is?
We use an entity with rounded corners to represent the ID-dependent entity
A database design is...?
a set of database specifications that can actually be implemented as a database in a DBMS
Relationship classes=
associations among entity classes
Relationship instances:
associations among entity instances
Problem domain
clearly defined area within the real-world environment, with well-defined scope and boundaries
Database design =
how the database structure will be used to store and manage end-user data
Domain constraints
limit column values to a particular set of values. For example, EMPLOYEE.EmpCode could be limited to 'New Hire', 'Hourly', 'Salary', or 'Part Time'.
Range constraints
limit values to a particular interval of values. EMPLOYEE.HireDate, for example, could be limited to dates between January 1, 1990, and December 31, 2025.
intrarelation constraint
limits a column's values in comparison with other columns in the same table.
An interrelation constraint
limits a column's values in comparison with other columns in other tables.
Minimum cardinality
minimum number of entity instances that must participate in a relationship.
Data =
recorded facts and figures. INformation is derived from data.
What are the steps of SDLC?
requirements analysis, design, development, testing, maintenance
A default value is...?
s value supplied by the DBMS when a new row is created.
Maximum cardinality =
the maximum number of entity instances that can participate in a relationship.