ISMN 3830
metadata
"data about data"
data processing specialist
...
field
A character or group of characters (alphabetic or numeric) that has a specific meaning. It is used to define and store data. Examples: a person's Social Security number, address, phone number, and bank balance.
data quality
A comprehensive approach to ensuring the accuracy, validity, and timeliness of data.
data redundancy
A condition that exists when a data environment contains redundant (unnecessarily duplicated) data.
structural dependence
A data characteristic that exists when a change in the database schema affects data access, thus requiring changes in all access programs.
end-user data (raw facts), metadata
A database contains two types of data: ___ and ___.
transactional database
A database designed to keep track of the day-to-day transactions of an organization.
XML database
A database system that stores and manages semistructured XML data.
operational database
A database that is designed primarily to support a company's day-to-day operations. Also known as an online transaction processing (OLTP) database, transactional database, or production database.
multiuser database
A database that supports multiple concurrent users.
single-user database
A database that supports only one user at a time.
data management
A discipline that focuses on data collection, storage, and retrieval. Common ___ functions include addition, deletion, modification, and listing.
workgroup database
A multiuser database that supports a relatively small number of users (usually fewer than 50) or that is used for a specific department in an organization.
file
A named collection of related records. example: a __ might contain data about the students currently enrolled at Auburn
query language
A nonprocedural language that is used by a DBMS to manipulate its data. An example of __ is SQL.
Structured Query Language (SQL)
A powerful and flexible relational database language composed of commands that enable users to create database and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information.
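The three roles named in this definition (creating structures, manipulating data, querying) can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `student` table, its columns, and the rows are hypothetical, and SQLite is one of many systems that accept SQL.

```python
import sqlite3

# In-memory database; table, columns, and rows are made up for illustration.
conn = sqlite3.connect(":memory:")

# Create a table structure.
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, credits INTEGER)"
)

# Data manipulation: add rows, then modify one of them.
conn.executemany(
    "INSERT INTO student (id, name, credits) VALUES (?, ?, ?)",
    [(1, "Ana", 12), (2, "Ben", 9), (3, "Cy", 15), (4, "Di", 6)],
)
conn.execute("UPDATE student SET credits = 12 WHERE id = 2")

# Query: state what is wanted; the DBMS decides how to find it.
rows = conn.execute(
    "SELECT name FROM student WHERE credits >= 12 ORDER BY name"
).fetchall()
print(rows)
```

The list returned by `fetchall()` is the query result set for the `SELECT` statement.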
database
A shared, integrated computer structure that houses a collection of related data
desktop database
A single-user database that runs on a personal computer.
island of information
A term used in the old-style file system environment to refer to independent, often duplicated, and inconsistent data pools created and managed by different organizational departments.
database system
An organization of components that defines and regulates the collection, storage, management, and use of data in a database environment.
data warehouse
Bill Inmon, the acknowledged "father of the data warehouse," defines the term as "an integrated, subject-oriented, time-variant, nonvolatile collection of data that provides support for decision making."
metadata
Data about data, that is, data concerning data characteristics and relationships.
unstructured data
Data that exist in their original (raw) state; that is, in the format in which they were collected.
semistructured data
Data that have already been processed to some extent.
data integrity
In a relational database, refers to a condition in which the data in the database is in compliance with all entity and referential integrity constraints.
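A minimal sketch of how a DBMS can reject rows that would break entity integrity (a duplicate primary key) or referential integrity (a foreign key with no matching parent row). It uses Python's built-in `sqlite3` module; the `dept` and `employee` tables are hypothetical, and note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical tables: every employee must belong to an existing department.
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " dept_id INTEGER NOT NULL REFERENCES dept(dept_id))"
)
conn.execute("INSERT INTO dept VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 10)")

violations = []
for label, stmt in [
    ("entity", "INSERT INTO employee VALUES (1, 10)"),       # duplicate primary key
    ("referential", "INSERT INTO employee VALUES (2, 99)"),  # department 99 does not exist
]:
    try:
        conn.execute(stmt)
    except sqlite3.IntegrityError:
        # The DBMS refuses the row, so the stored data stay compliant.
        violations.append(label)

print(violations)
```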
database management system
Refers to the collection of programs that manages the database structure and controls access to the data stored in the database.
production database
The main database designed to keep track of the day-to-day operations of a company.
information
The result of processing raw data to reveal its meaning. Information consists of transformed data and facilitates decision making.
physical data format
The way in which a computer "sees" (stores) data; how the computer must work with the data.
logical data format
The way in which a human being views data.
XML
Unlike other markup languages, ___ permits the manipulation of a document's data elements; ___ is designed to facilitate the exchange of structured documents such as orders and invoices over the Internet.
structured data
Unstructured data that have been formatted to facilitate storage, use, and the generation of information.
business intelligence
__ describes a comprehensive approach to capture and process business data with the purpose of generating info to support business decision making
data anomaly
__ develops when not all of the required changes in the redundant data are made successfully. example: an employee moves, but the address change is made in only one file and not across all files in the database
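The employee-address example can be sketched in a few lines of Python. The two dictionaries stand in for separate departmental files; the file names, employee ID, and addresses are invented for illustration.

```python
# Two departmental "files" that each keep their own copy of the same address.
payroll_file = {"E100": {"name": "Lee", "address": "12 Oak St"}}
benefits_file = {"E100": {"name": "Lee", "address": "12 Oak St"}}

# The employee moves, but only the payroll file is updated.
payroll_file["E100"]["address"] = "98 Elm Ave"

# The redundant copies now disagree about the same fact.
inconsistent = [
    emp_id
    for emp_id in payroll_file
    if payroll_file[emp_id]["address"] != benefits_file[emp_id]["address"]
]
print(inconsistent)
```

Storing the address in exactly one place would remove the possibility of this mismatch.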
attribute
__ is a characteristic of an entity
problem domain
__ is a clearly defined area within the real-world environment, with well-defined scope and boundaries, that will be systematically addressed
data model
__ is a relatively simple representation, usually graphical, of more complex real-world data structures
online analytical processing (OLAP)
__ is a set of tools that work together to provide an advanced data analysis environment for retrieving, processing, and modeling data from the data warehouse
NoSQL (not only SQL)
__ is generally used to describe a new generation of database management systems that is not based on the traditional relational database model
data
__ is one of an org's most valuable assets
database system, file system
A ___ provides advantages over the ___ management approach: -eliminates inconsistency, data anomalies, data dependency, and structural dependency problems -stores data structures, relationships, and access paths
performance tuning
__ relates to the activities that make a database perform more efficiently in terms of storage and access speed.
query
__: a specific request issued to the DBMS for data manipulation (e.g., to read or update the data)
data, information
___ (raw facts) VS. ___ (the result of processing raw data to reveal its meaning)
general purpose databases
___ contain a wide variety of data used in multiple disciplines -ex. a census database that contains general demographic data, the ProQuest databases that contain newspaper, magazine, and journal articles for a variety of topics
discipline specific database
___ contain data focused on specific subject areas. the data in this type of database are used mainly for academic or research purposes within a small set of disciplines -examples: financial data stored in databases such as CompuStat, geographic information system databases that store geospatial and other related data, medical databases that store confidential medical history data
unstructured
___ data exist in a format that does not lend itself to the processing that yields info
analytical
___ databases allow the end user to perform advanced data analysis of business data using sophisticated tools
database system
___ disadvantages: -increased costs -management complexity -maintaining currency -vendor dependence -frequent upgrade/replacement cycles
data independence
___ exists when you can change the data storage characteristics without affecting the program's ability to access the data
structural independence
___ exists when you can change the file structure without affecting the application's ability to access the structure
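A rough contrast between structural dependence and structural independence, assuming a file-system program that reads fields by position and a DBMS program that reads them by name. The `customer` data, the `phone_by_position` helper, and the column names are hypothetical.

```python
import csv
import io
import sqlite3

# File-system style: the program hard-codes the position of each field.
old_file = "Lee,555-0100\n"
new_file = "Lee,12 Oak St,555-0100\n"  # an address field was added to the file

def phone_by_position(text):
    row = next(csv.reader(io.StringIO(text)))
    return row[1]  # assumes the phone number is always the second field

print(phone_by_position(old_file))  # the phone number
print(phone_by_position(new_file))  # now the address: the program must be rewritten

# DBMS style: the program asks for a column by name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, phone TEXT)")
conn.execute("INSERT INTO customer VALUES ('Lee', '555-0100')")
query = "SELECT phone FROM customer WHERE name = 'Lee'"

before = conn.execute(query).fetchone()[0]
conn.execute("ALTER TABLE customer ADD COLUMN address TEXT")  # structure changes
after = conn.execute(query).fetchone()[0]                     # same query still works
print(before, after)
```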
data inconsistency
___ exists when different versions of the same data appear in different places
analytical database
___ focuses primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making
data
___ have little meaning unless they have been organized in some logical manner
extensible markup language (XML)
___ is a metalanguage used to represent and manipulate data elements in a textual format.
query language
___ is a nonprocedural language that lets the user specify what must be done without having to specify how
database
___ is shared, integrated computer structure housing end user data and metadata
structural dependence
___ means that access to a file is dependent on its structure
database design
___ refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data
data modeling
___ refers to the process of creating a specific data model for a determined problem domain -first step in designing a database
data dependence
___: -change in file's data characteristics requires modification of data access programs -must tell program what to do and how -makes file systems cumbersome from programming and data management views
Database Management System (DBMS)
___: -manages database structure -controls access to data -contains query language
structural dependence
___: change in file structure requires modification of related programs
database design
___: the process that yields the description of the database structure. -process determines the database components.
query, ad hoc query
a __ is a question, and a __ is a spur-of-the-moment question
data processing (DP) specialist
a __ is hired to create a computer based system that would track data and produce required reports
data dependence
a data condition in which the data representation and manipulation are dependent on the physical data storage characteristics
distributed database
a database that supports data distributed across several different sites
centralized database
a database that supports data located at a single site
record
a logically connected set of one or more fields that describes a person, place, or thing. example: the fields that constitute a customer __ might consist of the name, address, phone number, date of birth, credit limit, and unpaid balance
data warehouse
a specialized database that stores data in a format optimized for decision support
ad hoc query
a spur of the moment question
query result set
after a query is issued to the DBMS, the DBMS sends back an answer called the ___
data dictionary
any changes made in a database structure are automatically recorded in the ___, thereby freeing you from having to modify all of the programs that access the changed structure
database model
collection of logical constructs used to represent data structure and relationships within the database
data dictionary
component that stores metadata—data about data. Thus, the data dictionary contains the data definition as well as its characteristics and relationships. A data dictionary may also include data that are external to the DBMS.
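As a loose illustration of metadata being stored alongside the data, SQLite keeps a schema catalog that can itself be queried. This is a sketch using Python's built-in `sqlite3` module; treating SQLite's catalog as a stand-in for a data dictionary is an assumption made here, and the `customer` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# SQLite records every table definition in its sqlite_master catalog table.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

print(tables)
print(columns)
```

The catalog was filled in by the `CREATE TABLE` statement itself; no separate bookkeeping step was needed.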
inconsistency
data ___ = lack of data integrity
update, insert, delete
data anomalies are commonly defined as the following 3 things
logically, repository
database consists of ___ related data stored in a single ___
DBMS
importance of __: -provides better access to more and better managed data -promotes integrated view of org's operations -reduces the probability of inconsistent data
DBMS
importance of ___: -makes data management more efficient and effective -query language allows quick answers to ad hoc queries
data redundancy, decisions
importance of good design: -poor design results in unwanted ___ -poor design generates errors leading to bad ___
database model
in the book, ___ is often used to refer to the implementation of a data model in a specific database system
hardware, software, people, procedures, data
list the 5 major parts that compose the database system
semistructured
most data you encounter are best classified as ___
end-user data
raw facts of interest to the end user
data
raw facts, that is, facts that have not yet been processed to reveal their meaning to the end user
unstructured, structured
some data might not be ready (__) for some types of processing, but they might be ready (__) for other types of processing
abstraction, dependence
the DBMS provides data ___, and it removes structural and data ___ from the system
data dictionary
the DBMS uses the __ to look up the required data component structures and relationships, thus relieving you from having to code such complex relationships into each program
data warehouse
the __ contains historical data obtained from the operational databases as well as data from other external sources
knowledge
the body of information and facts about a specific subject
query result set
the collection of data rows that are returned by a query.
poor data security, data inconsistency, data anomalies
uncontrolled data redundancy sets the stage for what 3 things?
XML databases
unstructured and semistructured data storage and management needs are being addressed through a new generation of databases known as ___
lengthy development times, difficulty of getting quick answers (ad hoc queries are impossible), complex system admin, lack of security and limited data sharing, extensive programming
what are the 5 problems with file system data processing? -problems associated with file systems that severely challenge the types of info that can be created from the data as well as the accuracy of the info
enterprise database
when the database is used by the entire organization and supports many users (more than 50) across many departments, the database is known as a ___
environment, data model
within the database ___, a ___ represents data structures and their characteristics, relations, constraints, transformations, and other constructs with the purpose of supporting a specific problem domain
structure
you apply ___ based on the type of processing that you intend to perform on the data