MIS 385 Final
Associative entities
(also called gerunds) are formed from many-to-many relationships between other entity types. These entities are represented by a rectangle with rounded corners.
Dynamic view
A "virtual table" created dynamically upon request by a user, no data actually stored; instead data from base table made available to user, based on SQL SELECT statement on base tables or other views
Service-Oriented Architecture (SOA)
A collection of services that communicate with each other, usually by passing data or coordinating a business activity
data administration
A high-level function that is responsible for the overall management of data resources in an organization, including maintaining corporate-wide definitions and standards
Extensible Stylesheet Language Transformation (XSLT)
A language used to transform complex XML documents and also used to create HTML pages from XML documents
Relation
A named two-dimensional table of data.
Data steward
A person responsible for ensuring that organizational applications properly support the organization's data quality goals
Data Warehouse
A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes
database administration
A technical function that is responsible for physical database design and for dealing with technical issues such as security enforcement, database performance, and backup and recovery
XQuery
An XML transformation language that allows applications to query both relational databases and XML data
Open source DBMS
An alternative to proprietary packages such as Oracle, Microsoft SQL Server, or Microsoft Access
Thin Client
An application where the client (PC) accessing the application primarily provides the user interfaces and some application processing, usually with no or limited local data storage.
Second Normal Form
Any partial functional dependencies have been removed (i.e., nonkey attributes are identified by the whole primary key). A relation in first normal form in which every nonkey attribute is fully functionally dependent on the primary key.
Third Normal Form
Any transitive dependencies have been removed (i.e., nonkey attributes are identified by only the primary key). A relation that is in second normal form that has no transitive dependencies.
Journalizing facility
Audit trail of transactions and database updates
Materialized view
Copy or replication of data, data actually stored, must be refreshed periodically to match corresponding base tables
Backup facility
DBMS copy utility that produces backup copy of the entire database or subset
Data Structure
Data are organized in the form of tables, with rows and columns.
Master Data Management (MDM)
Disciplines, technologies, and methods to ensure the currency, meaning, and quality of reference data within and across various subject areas
Deletion anomaly
Example: data for employee number 140 are deleted from the table. This will result in losing the information that this employee completed a course on a specific date. It results in losing the information that this course had an offering that completed on that date.
Modification anomaly
Example: suppose that employee number 100 gets a salary increase. We must record the increase in each of the rows for that employee; otherwise, the data will be inconsistent.
Insertion anomaly
Example: when a user is able to enter employee data without supplying course data.
Data governance
High-level organizational groups and processes overseeing data stewardship across the organization
Correlated Subquery
In SQL, a subquery in which processing the inner query depends on data from the outer query
Data mining
Knowledge discovery, using a blend of statistical, Artificial Intelligence, and computer graphics techniques
XML Schema Definition (XSD)
Language used for defining XML databases that has been recommended by the W3C
Database recovery
Mechanism for restoring a database quickly and accurately after loss or damage
Multidimensional OLAP (MOLAP)
OLAP tools that load data into an intermediate structure, usually a three- or higher-dimensional array
Relational OLAP (ROLAP)
OLAP tools that view the database as a traditional relational database in either a star schema or other normalized or denormalized set of tables
XPath
One of a set of XML technologies supporting XQuery development, locating data in XML documents
Conformed dimension
One or more dimension tables associated with two or more fact tables for which the dimension tables have the same business meaning and primary key with each fact table
Application Partitioning
Placing portions of the application code in different locations (client vs. server) after it is written with advantages such as; improved performance, improved interoperability, balanced workloads
Data Manipulation
Powerful operations are used to manipulate data stored in the relations.
Database security
Protection of the data against accidental or intentional loss, destruction, or misuse
Changed data capture (CDC)
Technique that indicates which data have changed since the last data integration activity
Data Integrity
The model includes mechanisms to specify business rules that maintain the integrity of data when they are manipulated.
Concurrency control
The process of managing simultaneous operations against a database so that data integrity is maintained and the operations do not interfere with each other in a multi-user environment
Online analytical processing (OLAP)
The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques
Simple Object Access Protocol (SOAP)
XML-based communication protocol for sending messages between applications over the Internet
Web Service Description Language (WSDL)
XML-based grammar for describing a Web Service and specifying its public interface
Java servlet
a Java program that is stored on the server and contains the business and database logic for a Java-based application
Enterprise data warehouse (EDW)
a centralized, integrated data warehouse that is the control point and single source of all data made available to end users for decision support applications
Fat client
a client PC that is responsible for processing presentation logic, extensive application and business rules logic, and many DBMS functions
Three-tier architecture
a client/server configuration that includes three layers: a client layer and two server layers. Although the nature of the server layers differs, a common configuration contains an application server and a database server
Procedure
a collection of procedural and SQL statements that are assigned a unique name within the schema and stored in the database
Information repository
a component that stores metadata that describes an organization's data and data processing resources, manages the total information processing environment, and combines information about an organization's business information and its application portfolio
Information Resource Dictionary System (IRDS)
a computer software tool that is used to manage and control access to the information repository
Database Server
a computer that is responsible for database storage, access, and processing in a client/server environment. Some people also use this term to describe a two-tier client/server application
Functional Dependency
a constraint between two attributes in which the value of on attribute is determined by the value of another attribute
Extent
a contiguous section of disk storage space
After image
a copy of a record (or page of memory) after it has been modified
Before image
a copy of a record (or page of memory) before it has been modified
Smart card
a credit card-sized plastic card with an embedded microprocessor chip that can store, process, and output electronic data in a secure manner
Logical data mart
a data mart created by a relational view of a data warehouse
Dependent data mart
a data mart filled exclusively from an enterprise data warehouse and its reconciled data
Independent data mart
a data mart filled with data extracted from the operational environment, without the benefit of a data warehouse
User-defined data type (UDT)
a data type that a user can define by making it a subclass of a standard type or creating a type that behaves as an object. UDT's may also have defined functions and methods
Data mart
a data warehouse that is limited in scope
Relational DBMS (RDBMS)
a database management system that manages data as a collection of tables in which all data relationships are represented by common values in related tables
failure transparency
a design goal for a distributed database, which guarantees that either all the actions of each transaction are committed or else none of them is committed
local autonomy
a design goal for a distributed database, which says that a site can independently administer and operate its database when connections to other nodes have failed
location transparency
a design goal for a distributed database, which says that a user (or user program) using data need not know the location of the data
replication transparency
a design goal for a distributed database, which says that although a given data item may be replicated at several nodes in a network, a programmer or user may treat the data item as if it were a single item in a single node
Data Type
a detailed coding scheme recognized by system software, such as a DBMS, for representing organizational data
Transaction
a discrete unit of work that must be completely processed or not processed at all within a computer system. Entering a customer order is an example.
Checkpoint facility
a facility by which a DBMS periodically refuses to accept any new transactions. The system is in a quiet state, and the database and transaction logs are synchronized.
Pointer
a field of data indicating a target address that can be used to locate a related field or record of data.
Hashed index table
a file organization that uses hashing to map a key into a location in an index, where there is a pointer to the actual data record matching the hash key.
Recursive Foreign Key
a foreign key in a relation that references the primary key values of the same relation.
Transitive dependency
a functional dependency between the primary key and one or more nonkey attributes that are dependent on the primary key via another nonkey attribute.
Partial Functional Dependency
a functional dependency in which one or more nonkey attributes are functionally dependent on part (but not all) of the primary key.
Outer join
a join in which rows that do not have matching values in common columns are nevertheless included in the result table
Equi-join
a join in which the joining condition is based on the equality between values in the common columns. Common columns appear in the result table
Natural join
a join that is the same as an equi-join except that one of the duplicate columns is eliminated in the result table
Semijoin
a joining operation used with distributed databases in which only the joining attribute from one site is transmitted to the other site
Database change log
a log that contains before and after images of records that have been modified by transactions
Deadlock prevention
a method for resolving deadlocks in which user programs must lock all records they require at the beginning of a transaction
Recovery manager
a module of a DBMS that restores the database to a correct condition when a failure occurs and then resumes processing user questions
Tablespace
a named logical storage unit in which data from one or more database tables, views, or other database objects may be stored
Physical File
a named portion of secondary memory (such as hard disk) allocated for the purpose of storing physical records
Trigger
a named set of SQL statements that are considered (Triggered) when a data modification (ie. INSERT, UPDATE, DELETE) occurs or if certain data definitions are encountered. If a condition stated within a trigger is met, then a prescribed action is taken
Client/Server System
a networked computing model that distributes processes between clients and servers, which supply the requested services. In a database system, the database generally resides on a server that processes the DBMS. The clients may process the application systems or request services from another server that holds the application programs
Composite key
a primary key that consists of more than one attribute.
Enterprise key
a primary key whose value is unique across all relations
Two-phase locking protocol
a procedure for acquiring the necessary locks for a transaction in which all necessary locks are acquired before any locks are released, resulting in a growing phase when locks are acquired and a shrinking phase when they are released.
Locking
a process in which any data that are retrieved by a user for updating must be locked, or denied to other users, until the update is completed or aborted
Transaction log
a record of the essential data for each transaction that is processes against the database
Well-structured relation
a relation that contains minimal redundancy and allows users to insert, modify, and delete the rows in a table without errors or inconsistencies.
Join
a relational operation that causes two tables with a common domain to be combined into a single table or view
Data dictionary
a repository of information about a database that documents data elements of a database
Hashing algorithm
a routine that converts a primary key value into a relative record number or relative file address.
Referential integrity constraint
a rule that states that either each foreign key value must match a primary key value in another relation or the foreign key value must be null.
Entity integrity rule
a rule that states that no primary key attribute (or component of a primary key attribute) may be null.
Surrogate Primary Key
a serial number or other system-assigned primary key for a relation
Web services
a set of emerging XML-based standards that define protocols for automatic communication between software programs over the Web
Catalog
a set of schemas that, when put together, constitute a description of a database
Star schema
a simple database design in which dimensional data are separated from fact or event data. A dimensional model is another name for star schema
Distributed database
a single logical database that is spread physically across computers in multiple locations that are connected by a data communication link
Scalar aggregate
a single value returned from an SQL query that includes an aggregate function
Normal form
a state of a relation that requires that certain rules regarding relationships between attributed (or functional dependencies) are satisfied.
Hashed File Organization
a storage system in which the address for each record is determined using a hashing algorithm
Function
a stored subroutine that returns one value and has only input parameters
Schema
a structure that contains descriptions of objects created by a user, such as base tables, views, and constraints, as part of a database
Informational System
a system designed to support decision making based on historical point-in-time and prediction data for complex queries or data-mining applications
Operational System
a system that is used to run a business in real time, based on current data. Also called a system of record
System Catalog
a system-created database that describes all database objects, including data dictionary information, and also includes user access information
Virtual Table
a table constructed automatically as needed by a DBMS.
Index
a table or other data structure used to determine in a file the location of records that satisfy some condition
data federation
a technique for data integration that provides a virtual view of integrated data without actually creating one centralized database
File organization
a technique for physically arranging the records of a file on secondary storage devices
Shared lock (S lock or read lock)
a technique that allows other transactions to read but not update a record or another resource
Restore/rerun
a technique that involves reprocessing the day's transactions (up to the point of failure) against the backup copy of the database
Exclusive lock (X lock or write lock)
a technique that prevents another transaction from reading and therefore updating a record until it is unlocked
Forward recovery (rollforward)
a technique that starts with an earlier copy of a database. After images (the results of good transactions) are applied to the database, and the database is quickly moved forward to a later state.
Extensible Markup Language (XML)
a text-based scripting language used to describe data structures hierarchically, using HTML-like tags
Aborted transaction
a transaction in progress that terminates abnormally
Null
a value that may be assigned to an attribute when no other value applies or when the applicable value is unknown.
two-phase commit
an algorithm for coordinating updates in a distributed database
commit protocol
an algorithm to ensure that a transaction is either successfully completed or aborted
Alias
an alternative name used for an attribute
Open database connectivity (ODBC)
an application programming interface that provides a common language for application programs to access and process SQL databases independent of the particular DBMS that is accessed
Versioning
an approach to concurrency control in which each transaction is restricted to a view of the database as of the time that transaction started, and when a transaction modifies a record, the DBMS creates a new record version instead of overwriting the old record. Hence, no form of locking required
Deadlock resolution
an approach to dealing with deadlocks that allows deadlocks to occur but builds mechanisms into the DBMS for detecting and breaking the deadlocks
Foreign key
an attribute in a relation that serves as the primary key of another relation in the same database.
Primary Key
an attribute or a combination of attributes that uniquely identifies each row in a relation.
Homonym
an attribute that may have more than one meaning
Candidate key
an attribute, or combination of attributes, that uniquely identifies a row in a relation
Real-time data warehousing
an enterprise data warehouse that accepts near-real-time feeds of transactional data from the systems of record, analyzes warehouse data, and in near-real-time relays business rules to the data warehouse and systems of record so that immediate action can be taken in response to business events.
Anomaly
an error or inconsistency that may result when a user attempts to update a table that contains redundant data. The three types of anomalies are insertion, deletion, and modification anomalies.
Snowflake schema
an expanded version of a star schema in which dimension tables are normalized into several related tables
Big data
an ill-defined term applied to databases whose sixe strains the ability of commonly used relational DBMSs to capture, manage, and process the data within a tolerable elapsed time
Deadlock
an impasse that results when two or more transactions have locked a common resource, and each waits for the other to unlock that resource
Join Index
an index on columns from two or more tables that come from the same domain of values
Operational data store (ODS)
an integrated, subject-oriented, continuously updateable current-valued (with recent history), enterprise-wide, detailed database designed to serve operational users as they do decision support processing
Inconsistent read problem
an unrepeatable read, one that occurs when one user reads data that have been partially updated by another user
First Normal Form
any multivalued attributes (also called repeating groups) have been removed, so there is a single value (possibly null) at the intersection of each row and column of the table. A relation tat has a primary key and in which there are no repeating groups.
Fourth Normal Form
any multivalued dependencies have been removed
Fifth Normal Form
any remaining anomalies have been removed
Boyce-Codd Normal Form
any remaining anomalies that result from functional dependencies have bee removed (because there was more than one possible primary key for the same nonkeys).
refresh mode
bulk rewriting of target data at periodic intervals
Static extract
capturing a snapshot of the source data at a point in time
Incremental extract
capturing changes that have occurred since the last static extract
Data control language (DCL)
commands used to control a database, including those for administering privileges and committing (saving) data
Data Definition Language (DDL)
commands used to define a database, including those for creating, altering, and dropping tables and establishing constraints
Data manipulation language (DML)
commands used to maintain and query a database, including those for updating, inserting, modifying, and querying data
Authorization rules
controls incorporated in a data management system that restricts access to data and also restrict the actions that people may take when they access data
data transformation
convert data from format of operational system to format of data warehouse
Transient data
data in which changes to existing records are written over previous records, thus destroying the previous data content
Periodic data
data that are never physically altered or deleted once they have been added to the store
Derived data
data that have been selected, formatted, and aggregated for end-user decision support applications
Reconciled data
detailed, current data intended to be the single, authoritative source for all decision support applications
Vertical Partitioning
distribution of the columns of a logical relation into several separate physical tables
Horizontal Partitioning
distribution of the rows of a logical relation into several separate tables
Weak entities
entities that cannot exist except with an identifying relationship with an owner (regular) entity type. These entities are represented by a rectangle with a double line.
Regular entities
entities that have an independent existence and generally represent real-world objects, such as persons and products. These entities are represented by rectangles with a single line.
Persistent Stored Modules (SQL/PSM)
extensions defined in SQL:1999 that include the capability to create and drop modules of code stored in the database schema across user sessions
Embedded SQL
hard coded SQL statements included in a program written in another language, such as C or Java
Range Control
limits the set of permissible values a field may assume
Vector aggregate
multiple values returned from an SQL query that includes an aggregate function
Secondary Key
one field or a combination of fields for which more than one record may have the same combination of values. Also called a nonunique key
update mode
only changes in source data are written to data warehouse
Application program interface (API)
routines that an application uses to direct the performance of procedures by the computer's operating system
NoSQL
short for "not only SQL". It is a class of database technology used to store and access textual and other unstructured data, using more flexible structures than the rows and columns format of relational databases.
Middleware
software that allows an application to interoperate with other software without requiring user to understand and code low-level operations
Dynamic SQL
specific SQL code generated on the fly while an application is processing
Universal Description, Discovery, and Integration (UDDI)
standard for creating and distributing Web services
Base table
table containing the raw data
Determinant
the attribute on the left side of the arrow in a functional dependency
Backward recovery (rollback)
the backout, or undo, of unwanted changes to a database. Before the images of the records that have been changed are applied to the database, and the database is returned to an earlier state. Rollback is used to reverse the changes made by transactions that have been aborted, or terminated abnormally.
Encryption
the coding or scrambling of data so that humans cannot read them
Database destruction
the database itself is lost, destroyed, or cannot be read.
Locking level (lock granularity)
the extent of a database resource that is included with each lock
Grain
the level of detail in a fact table, determined by the intersection of all the components of the primary key, including all foreign keys and any other primary key elements
Transaction boundaries
the logical beginning and end of a transaction
joining
the process of combining data from various sources into a single table or view
Normalization
the process of decomposing relations with anomalies to produce smaller, well-structured relations.
Data archiving
the process of moving inactive data to another storage location where it can be accessed when needed
selection
the process of partitioning data according to predefined criteria
aggregation
the process of transforming data from detailed to summary level
Denormalization
the process of transforming normalized relations into non-normalized physical record specifications
Data visualization
the representation of data in graphical and multimedia formats for human analysis
Field
the smallest unit of application data recognized by system software
Indexed file organization
the storage of records either sequentially or nonsequentially with an index that allows software to locate individual records.
Sequential file organization
the storage of records in a file in sequence according to a primary key value
Default Value
the value a field will assume unless a user enters an explicit value for an instance of that field.
Synonyms
two (or more) attributes that have different names but the same meaning
User-defined procedures
user exits (or interfaces) that allow system designers to define their own security procedures in addition to the authorization rules
Data scrubbing
uses pattern recognition and Artificial Intelligence techniques to upgrade data quality