MIS Chp. 6
Data Cleansing
also known as data scrubbing; consists of activities for detecting & correcting data in a database that are incorrect, incomplete, improperly formatted or redundant
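A minimal sketch of the idea, assuming a hypothetical list of customer records with `name` and `email` fields (neither comes from the chapter). It fixes formatting, sets aside incomplete records, and drops redundant ones:

```python
def cleanse(records):
    """Return (cleaned, rejected) lists for a batch of raw records."""
    seen_emails = set()
    cleaned, rejected = [], []
    for rec in records:
        # Improperly formatted: trim whitespace and standardize letter case.
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()
        # Incomplete: a required field is missing, so set the record aside.
        if not name or not email:
            rejected.append(rec)
            continue
        # Redundant: this email was already kept, so skip the duplicate.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned, rejected


# Hypothetical sample data for illustration only.
raw = [
    {"name": "  ada lovelace ", "email": "ADA@example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com "},  # duplicate
    {"name": "Alan Turing", "email": None},                 # incomplete
]
cleaned, rejected = cleanse(raw)
```

Real data cleansing tools apply many more rules; the point here is only the detect-and-correct pattern.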
Key Field
A field in a record that uniquely identifies instances of that record so that it can be retrieved, updated or sorted
Byte
A group of bits, representing a single character, which can be a letter, a number, or another symbol
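A short illustration, assuming ASCII encoding, where one character fits in a single 8-bit byte (other encodings may use more than one byte per character):

```python
char = "A"
# Encode the character and take its one byte as an integer value.
byte_value = char.encode("ascii")[0]
# Write that same byte out as its 8 individual bits.
bits = format(byte_value, "08b")
```

Here `byte_value` is 65 and `bits` is the string `01000001`, so the letter A is stored as eight bits, each a 0 or a 1.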
File
A group of records of the same type
Database
A group of related files
Field
A grouping of characters into a word, a group of words, or a complete number, such as a person's name or age.
Entity-Relationship Diagram
A methodology for documenting databases that illustrates the relationships between the various entities in the database
Data Quality Audit
A structured survey of the accuracy & level of completeness of the data in an information system; can be performed by surveying entire data files, surveying samples from data files, or surveying end users for their perceptions of data quality
Data Dictionary
An automated or manual tool for storing & organizing information about the data maintained in a database
Data Administration
Responsible for the specific policies and procedures through which data can be managed as an organizational resource
Structured Query Language
SQL; the standard data manipulation language for relational database management systems
Web Mining
discovery and analysis of useful patterns and information from the World Wide Web.
Text Mining
discovery of patterns and relationships from large sets of unstructured data
Information Policy
formal rules governing the maintenance, distribution & use of information in an organization
Record
groups of related fields
In-memory Computing
technology for very rapid analysis & processing of large quantities of data by storing the data in the computer's main memory rather than in secondary storage
Normalization
the process of creating small, stable data structures from complex groups of data when designing a relational database
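A rough sketch of what normalization does, using a hypothetical flat order list (all field names and values are made up for illustration). The repeated customer details are pulled out into their own structure so each fact is stored once:

```python
# One complex group of data: customer details repeat on every order row.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Acme", "city": "Boston", "item": "Bolt"},
    {"order_id": 2, "customer_id": 10, "customer_name": "Acme", "city": "Boston", "item": "Nut"},
    {"order_id": 3, "customer_id": 11, "customer_name": "Zenith", "city": "Denver", "item": "Gear"},
]


def normalize(rows):
    """Split flat rows into a customers table and an orders table."""
    customers = {}
    orders = []
    for row in rows:
        # Each customer is stored once, keyed by customer_id.
        customers[row["customer_id"]] = {
            "customer_name": row["customer_name"],
            "city": row["city"],
        }
        # Orders keep only the customer_id as a link back to the customer.
        orders.append({
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "item": row["item"],
        })
    return customers, orders


customers, orders = normalize(flat_orders)
```

After the split, changing a customer's city means updating one entry instead of every order row.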
Primary Key
unique identifier for all the information in any row of a database table
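A small sketch using Python's built-in `sqlite3` module and a hypothetical `supplier` table (table, column names, and values are assumptions). It shows that the DBMS rejects a second row with the same primary key, and that the key retrieves exactly one row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO supplier VALUES (1, 'Acme Supply')")


def try_insert(supplier_id, name):
    """Return True if the row was stored, False if the key already exists."""
    try:
        conn.execute("INSERT INTO supplier VALUES (?, ?)", (supplier_id, name))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert(1, "Another Co.")  # same key, so rejected
new_accepted = try_insert(2, "Zenith Parts")       # new key, so stored
row = conn.execute(
    "SELECT name FROM supplier WHERE supplier_id = ?", (1,)
).fetchone()
```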
Data Manipulation Language
used to add, change, delete and retrieve data in a database
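The four operations can be sketched with SQL statements run through Python's built-in `sqlite3` module. The `part` table and its values are hypothetical, chosen only for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part (part_no INTEGER PRIMARY KEY, name TEXT, unit_price REAL)"
)

# Add: INSERT new rows.
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(137, "Door latch", 22.0), (145, "Side mirror", 12.0), (150, "Door molding", 6.0)],
)
# Change: UPDATE an existing row.
conn.execute("UPDATE part SET unit_price = 25.0 WHERE part_no = 137")
# Delete: remove a row.
conn.execute("DELETE FROM part WHERE part_no = 150")
# Retrieve: SELECT what remains.
rows = conn.execute(
    "SELECT part_no, name, unit_price FROM part ORDER BY part_no"
).fetchall()
```

After these statements, `rows` holds the two remaining parts, with the updated price on part 137.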
Report Generator
used to create and display data in a structured/polished format
Nonrelational Database Management Systems
Database management system for working with large quantities of structured or unstructured data that would be difficult to analyze with a relational model
Data Warehouse
Database that stores current & historical data of potential interest to decision makers throughout the company
Online Analytical Processing
OLAP, capability for manipulating & analyzing large volumes of data from multiple perspectives
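"Multiple perspectives" can be pictured as summarizing the same facts along different dimensions. A toy sketch in plain Python, assuming hypothetical sales records with `product`, `region`, and `month` dimensions (a real OLAP tool would work on far larger, pre-organized data):

```python
from collections import defaultdict

# Hypothetical sales facts for illustration only.
sales = [
    {"product": "Bolt", "region": "East", "month": "Jan", "units": 100},
    {"product": "Bolt", "region": "West", "month": "Jan", "units": 50},
    {"product": "Nut", "region": "East", "month": "Jan", "units": 70},
    {"product": "Bolt", "region": "East", "month": "Feb", "units": 30},
]


def summarize(facts, *dimensions):
    """Total units sold, grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["units"]
    return dict(totals)


by_product = summarize(sales, "product")
by_region = summarize(sales, "region")
by_product_and_region = summarize(sales, "product", "region")
```

The same four records answer three different questions depending on which dimensions are chosen.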
Data Mart
Subset of a data warehouse, in which a summarized or highly focused portion of the organization's data is placed in a separate database for a specific population of users
Big Data
data sets with volumes so huge that they are beyond the ability of typical relational DBMS to capture, store, and analyze. The data is often unstructured or semi-structured
Database Server
A computer in a client/server environment that is responsible for running a DBMS to process SQL statements and perform database management tasks.
Query
A _____________ is used to extract data from the database in a readable format according to the user's request.
Database Management System
DBMS; software for creating, storing, organizing, and accessing the data in a database
Data Definition
DBMS has a __________ ____________ capability to specify the structure of the content of the database
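As an illustration, a `CREATE TABLE` statement is one way a DBMS lets you specify structure. This sketch uses Python's built-in `sqlite3` module and a hypothetical `employee` table (names and column types are assumptions), then asks SQLite to report the structure it recorded:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Specify the structure: field names, data types, and the key.
conn.execute(
    """
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        last_name   TEXT NOT NULL,
        hire_date   TEXT
    )
    """
)
# PRAGMA table_info returns one row per column: (cid, name, type, ...).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")
]
```

No data has been stored yet; only the shape of the table has been defined.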
Attributes
Specific characteristics of an entity; each entity has its own set of attributes
Foreign Key
Field in a database table that enables users to find related information in another database table
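A sketch of how a foreign key is used to look up related information, with Python's built-in `sqlite3` module. The `supplier` and `part` tables and their contents are hypothetical; `part.supplier_id` is the foreign key pointing at `supplier.supplier_id`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE part (
        part_no     INTEGER PRIMARY KEY,
        name        TEXT,
        supplier_id INTEGER REFERENCES supplier(supplier_id)
    );
    INSERT INTO supplier VALUES (1, 'Acme Supply'), (2, 'Zenith Parts');
    INSERT INTO part VALUES (100, 'Bolt', 1), (101, 'Gear', 2), (102, 'Nut', 1);
    """
)
# Join the two tables on the shared supplier_id field.
rows = conn.execute(
    """
    SELECT part.name, supplier.name
    FROM part
    JOIN supplier ON part.supplier_id = supplier.supplier_id
    ORDER BY part.part_no
    """
).fetchall()
```

Each part row carries only a supplier number; the join uses it to pull the supplier's name from the other table.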
Database Administration
Refers to the more technical & operational aspects of managing data, including physical database design & maintenance
Entity
a person, place, thing, or event about which information must be kept
Relational Database
a type of logical database model that treats data as if they were stored in two-dimensional tables. It can relate data stored in one table to data in another as long as the two tables share a common data element
Sentiment Analysis
mining text comments in an email message, blog, or other social media to detect favorable and unfavorable opinions about specific subjects
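A deliberately simple sketch of the idea: score each comment by counting words from small positive and negative word lists. The word lists and labels are assumptions made up for illustration; real sentiment analysis software uses much richer language models:

```python
# Hypothetical word lists, for illustration only.
POSITIVE = {"great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "hate", "terrible", "slow"}


def sentiment(comment):
    """Label a comment favorable, unfavorable, or neutral by word counts."""
    words = [w.strip(".,!?").lower() for w in comment.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"


labels = [
    sentiment("I love this phone, great battery!"),
    sentiment("Terrible support and slow shipping."),
    sentiment("It arrived on Tuesday."),
]
```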
Data Mining
more discovery-driven; ____________ _____________ provides insights into corporate data that cannot be obtained with OLAP by finding hidden patterns & relationships in large databases & inferring rules from them to predict future behavior
Hadoop
open-source software framework that enables distributed parallel processing of huge amounts of data across many inexpensive computers
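The split-process-combine idea behind this kind of framework can be sketched on a single machine. This is not Hadoop's API; it is a plain-Python stand-in that uses threads in place of separate computers, with hypothetical function names and sample text:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Each worker counts the words in its own chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, workers=2):
    # Split the data set into one chunk per worker.
    chunks = [lines[i::workers] for i in range(workers)]
    # Process the chunks side by side.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    # Combine the partial results into one total.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


lines = ["big data big clusters", "data on many computers", "many small computers"]
totals = parallel_word_count(lines)
```

In a real cluster the chunks would live on different inexpensive computers, and the framework would handle moving work to the data and recovering from failures.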
Analytic Platform
pre-configured hardware-software system that is specifically designed for high speed analysis of large data sets
Bit
a binary digit (0 or 1); represents the smallest unit of data a computer can handle
Tuples
rows/records in a relational database
Referential Integrity
rules to ensure that relationships between coupled database tables remain consistent
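A sketch of such rules being enforced, using Python's built-in `sqlite3` module with hypothetical `supplier` and `part` tables. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other DBMSs may behave differently:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript(
    """
    CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE part (
        part_no     INTEGER PRIMARY KEY,
        supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id)
    );
    INSERT INTO supplier VALUES (1, 'Acme Supply');
    INSERT INTO part VALUES (100, 1);
    """
)


def is_blocked(sql):
    """Return True if the DBMS rejects the statement as an integrity violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# A part pointing at a supplier that does not exist is rejected.
orphan_insert_blocked = is_blocked("INSERT INTO part VALUES (101, 99)")
# Deleting a supplier that parts still point to is rejected too.
parent_delete_blocked = is_blocked("DELETE FROM supplier WHERE supplier_id = 1")
```

Both statements fail, so the two tables never end up with a part whose supplier is missing.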