Business Database Systems: CIS 355
NoSQL
"non SQL" or "non relational") database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases
Relation
(sometimes called a table is a matrix composed of intersecting rows and columns where each row in a relation is called a tuple and each column in the table represents an attribute
ACID
Atomic, Consistent, Isolated, Durable. Atomic & Consistent: transactions combining multiple changes are handled as "atoms"--meaning they are indivisible units, so they either succeed or fail as a whole (all or nothing). Isolated: multiple queries hitting the database at the same time are separated; won't interfere with each other. Durable: the data is persistent and will retain its state despite errors or crashes
Business Intellegence
BI, is an umbrella term that refers to a variety of software applications used to analyze an organization's raw data. BI as a discipline is made up of several related activities, including data mining, online analytical processing, querying and reporting.
Location is used to determine types of databases:
Centralized database: a database that supports data located at a single site Distributed Database: a database that supports data distributed across several different sites
Bill Inmon
Champion of the Corporate Information Factory architecture (CIF)
Referential Integrity
Each foreign key value must match a primary key value in another table (or be null if allowed).
Entity instance (occurrence)
Each row in the table in the ER model
Normilization
First, it prevents many types of anomalies that come from adding, removing, and updating data. Second, it allows very flexible queries, beginning from any starting point, without having to scan the entire database.
Relational Model Pro's
Flexible Queries: stars with an entity; combine as many tables as needed. Data integrity: 1) Eliminating redundancy prevents anomalies and inconsistency. 2) Data types and foreign keys are defined up front and enforced (metadata). 3)ACID compliance and transaction.
"The Key"
For each relation: Every non key attribute Depends on the key The whole key, And nothing but the key, So help me Codd.
OLAP Database
Online Analytical Processing. OLAP performs multidimensional analysis of business data and provides the capability for complex calculations, trend analysis, and sophisticated data modeling.
OLTP Database
Online transaction processing, or OLTP, is a class of information systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing.
Number of Users Determines the type of Database: list types of DB's
Single User Database - only supports 1 user at a time (if run on a PC called a desktop database Multi-User Database supports multiple people at a time (usually fewer than 50) Work Group Database When a multi user database supports a specific department within an organization Enterprise Database When the Database is sued by the entire organization and supports many users, usually hundreds, across many departments As you move up in DB size you also move up in: Cost & Value Complexity Security Requirements Reliability Requirements Scalability Requirements Maintainability Requirements Number of Concurrent Users
SQL
Structured Query Language- the de facto query language and data access standard supported by the majority of DBMS vendors
Attribute
a characteristic of an entity
DBMS
a collection of programs that manages the database and controls access to the data stored in the database
Foreign Key (FK)
a field (or collection of fields) in one table that uniquely identifies a row of another table. In simpler words, the foreign key is defined in a second table, but it refers to the primary key in the first table
Natural Key (identifier)
a key that is formed of attributes that already exist in the real world. For example, a US citizen's social security number could be used as a natural key
Secondary Key
a key that is used strictly for data retrieval purposes
Candidate Key
a minimal superkey; a superkey without any unnecessary attributes
Composite Kay
a multi-attribute key; used when more than a single attribute define functional dependence
Normalization
a process for evaluating and correcting table structures to minimize data redundancy thereby reducing the probability of data anomalies
Required vs Optional Attribute
a required attribute is an attribute that must have a value and cannot be left empty; an optional attribute it an attribute that does not require a value and can be left empty
Database
a shared, integrated computer structure that stores a collection of end user data (raw facts that are of interest to the end user), and meta data (data about data through which the end-user data are integrated and managed) "a collection of self-describing data"
Query
a specific request issued to the DBMS for data manipulation i.e. "read or update the data"
Ad-hoc Query
a spur-of-the-moment question when end users want answers to quick questions i.e. "What was the dollar amount of product sales in the last 6 mos.?"
Conceptual Modeling
aims at identifying concepts and relationships
Surrogate Key
an artificial primary key introduced by the designer with the purpose of simplifying the assignment of primary keys to tables, surrogate keys are usually numeric and they often have values auto-generated by the DBMS, they have no special meaning and are hidden from end users
Primary Key (PK)
an attribute or combination of attributes that uniquely identifies any given row
Derived Attribute
an attribute whose value is determined via calculation (derived) from manipulations of other attributes ie value in TOTAL is derived from the sum of SUBTOTAL and TAXES
Existence Dependant
an entity is said to existence-dependent if it can exist in the database only when it is associated with another related occurrence ie DEPENDENT in the EMPLOYEE entity
Determinant
any attribute whose vale determines other values within a row; for ever transitive dependency write its determinant as a PK for a new table
Superkey
any key that uniquely identifies each row, a super key determines all of a rows attributes; could me multi attributed
Entitiy
anything (noun) about which there are data to be collected and stored
Structured Data
are the result of taking unstructured data and formatting (structuring) such data to facilitate storage, use, generalization of information
Simple vs Composite Attribute
composite attribute- is an attribute that can be further "decomposed" to yield additional attributes ie ADDRESS may have STREET ADDRESS, HOUSE NUMBER, STATE simple- attribute is an attribute that is fully decomposed and cannot be further subdivided
Key
consists of one or more attributes that determine other attributes
key
consists of one or more attributes that determine other attributes
Associative entity
created to avoid problems associated with M:N relationships the associative entity structure includes as foreign keys
Metadata
data about that data; provides a description of the data characteristics and the set or relationships that link that data found within the database
Unstructured Data
data that exist in their original (raw) state in the format in which they were collected. this format does not lend itself to processing that yields information
Repeating Group
derives its name from the fact that a group of multiple entries of the same type can exist for any single key attribute occurrence
Realtionship
describes an association among entities
Data Anomaly
develops when all of the required changes in the redundant data are not made successfully; can include deletions, update and insertion anomalies
Data Redundancy
exists when the same data are stored unnecessarily at different places
Big Data
extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.
Physical database design
includes details of data types, indexes, and other implementation decisions that matter to DBAs.
Entity-Relationship model
is a data model for describing the data or information aspects of a business domain or its process requirements, in an abstract way that lends itself to ultimately being implemented in a database such as a relational database
Null
no data entry at all, not a permitted value for a primary key
Physical model (schema)
operates at the lowest level of abstraction describing the way data are saved on storage media such as disks or drives. the physical model requires the definition of both the physical storage devices and the (physical) access methods required to reach the data withint hose storage devices both hardware and software dependant
Denormalization
produces a lower normal form; that is a 3NF will be converted to a 2NF through denationalization; the price you pay for increased performance through denationalization is greater data redundancy
Data
raw facts that have not yet been processed to reveal their meaning
RDBMS
relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as invented by E. F. Codd, of IBM
Internal model (schema)
representation of the database as "seen" by the DBMS. the internal model requires the designer to match the conceptual model's characteristics and constraints to those of the selected implementation model An internal schema depicts a specific representation of an internal model using the database constructs supported by the chose database
Conceptual model (schema)
represents a global view of the entire database as viewed by the entire organization; that is, the conceptual model integrates all external views (entities, relationships, constraints, and processes) into a single global view of the entire data in the enterprise known as conceptual schema
Single- Valued vs Multi Valued Attribute
single- an attribute that can have only a single value ie a person can only have one SSN or DOB multi valued- attributes that can have many values ie a person may have many college they have attended, multiple phone numbers....
Flag
special codes to indicate the absence of some value
Logical Modeling
specifies integrity constraints, keys, and facilitates normalization
Functional Dependency (FD)
the attribute B is functionally dependent on the attribute A if each column A determines one and only one value in column B
Knowlege
the body of information and facts about a specific subject
External model (schema)
the end user's view of the data environment, a specific representation of an external view is known as external schema
Relational Model
the focus of this CIS 355 class, introduces in 1970 by E.F. Codd of IBM; represented a mpjr breakthrough for both users and designers and it;s conceptual simplicity set the stage for a database revolution
Information
the result of processing raw data to reveal it's meaning
Attribute Domain
the set of values allowed in an attribute ie rooms in a hotel: 1-300 age: 1-99 married: yes/no
Data Dictionary
the storage of all definitions of the data elements and their relationships by the DBMS
Entity Integrity
within a table each primary key value must be unique to ensure that each row is uniquely identified by the primary key. In that case, the table is said to exhibit entity integrity
Relational Operator: SELECT/RESTRICT
yields values for all rows found in a table that satisfy a given condition, SELECT can be used to list all of the row values or yield only those values that match a specified criterion, in other words SELECT yields a horizontal subset of a table