C175 Chapter 1
What common problems does a collection of spreadsheets created by end users share with the typical file system?
1. End users create their own, private, copies of the data, which creates issues of data ownership 2. Creates islands of information where changes to one set of data are not reflected in all of the copies of the data 3. Lack of data consistency. Since the data in various spreadsheets may be intended to represent a view of the business environment, a lack of consistency in the data may lead to faulty decision making
Physical data format
The way a computer "sees" (stores) data
Data anomaly
An abnormality that develops when not all of the required changes in the redundant data are made successfully
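A minimal Python sketch of how such an anomaly can arise. It assumes two hypothetical department files (`sales_file` and `billing_file`, with invented customer data) that each keep their own copy of the same phone number; it illustrates the idea and does not model any particular system.

```python
# Two hypothetical department "files", each holding a redundant copy
# of the same customer's phone number (all values are invented).
sales_file = {"C001": {"name": "Ana Ruiz", "phone": "555-0100"}}
billing_file = {"C001": {"name": "Ana Ruiz", "phone": "555-0100"}}

def update_phone(files, customer_id, new_phone):
    """Apply the change to every file passed in; a file left out keeps the old value."""
    for f in files:
        f[customer_id]["phone"] = new_phone

# The change reaches only one of the two redundant copies.
update_phone([sales_file], "C001", "555-0199")

# The copies now disagree, which is the anomaly described above.
anomaly = sales_file["C001"]["phone"] != billing_file["C001"]["phone"]
print(anomaly)
```

Storing the phone number in one place would leave a single value to change, so this kind of disagreement could not occur.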
A DBMS makes data management more efficient and effective through:
1. Improved data sharing 2. Improved data security 3. Better data integration 4. Minimized data inconsistency 5. Improved data access 6. Improved decision making 7. Increased end-user productivity
Cost of implementing a database system
1. Increased acquisition and operating costs 2. Management complexity 3. Maintaining currency 4. Vendor dependence
Summary of information
1. Information is produced by processing data 2. Information is used to reveal the meaning of data 3. Good, relevant, and timely information is the key to good decision making 4. Good decision making is the key to organizational survival in a global environment
Uncontrolled data redundancy causes:
1. Poor data security 2. Data inconsistency 3. Data-entry errors 4. Data integrity problems
Functions of a DBMS
1. The DBMS stores the definitions of data and their relationships (metadata) in a data dictionary; any changes made are automatically recorded in the data dictionary. 2. The DBMS creates the complex structures required for data storage. 3. The DBMS transforms entered data to conform to the data structures in item 2. 4. The DBMS creates a security system and enforces security within that system. 5. The DBMS creates complex structures that allow multiple-user access to the data. 6. The DBMS performs backup and data recovery procedures to ensure data safety. 7. The DBMS promotes and enforces integrity rules to eliminate data integrity problems. 8. The DBMS provides access to the data via utility programs and programming language interfaces. 9. The DBMS provides end-user access to data within a computer network environment.
Data dictionary
A DBMS component that stores metadata—data about data. The data dictionary contains data definitions as well as data characteristics and relationships. May also include data that is external to the DBMS
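As a rough illustration, the sketch below uses Python's built-in `sqlite3` module and treats SQLite's internal catalog as a stand-in for a data dictionary. The `customer` table and its column names are hypothetical, and other DBMS products expose their metadata differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)"
)

# sqlite_master lists the objects the DBMS knows about (metadata, not end-user data).
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

print(tables)
print(columns)
```

The table definition was recorded automatically when `CREATE TABLE` ran; no separate step was needed to document the structure.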
Field
A character or a group of characters (numeric or alphanumeric) that describes a specific characteristic. A field may define a telephone number, a date, or other specific characteristics that the end user wants to keep track of
Data dependence
A data condition in which data representation and manipulation are dependent on the physical data storage characteristics
Record
A logically connected set of one or more fields that describes a person, place, event, or thing
Why might the cost of ownership be lower with a cloud database than with a company database?
Cloud databases reside on the Internet instead of within the organization's own network infrastructure. This can reduce costs because the organization is not required to purchase and maintain the hardware and software necessary to house the database and support the necessary levels of system performance
Summary of data
Data constitute the building blocks of information
Data transformation and presentation
Exists to transform any data entered into required data structures. By using this function, the DBMS can determine the difference between logical and physical data formats
Hashed files
Files organized using hash functions that convert key data of various formats into numeric values, which determine where each record is stored. Hashing here is a storage and lookup technique, not encryption
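A small sketch of a hash function turning a key into a numeric value, which is then used to pick a storage bucket. The bucket count, the toy hash (sum of character codes), and the sample record are all assumptions made for illustration; real systems use more careful hash functions.

```python
NUM_BUCKETS = 8  # assumed bucket count, chosen only for illustration

def bucket_for(key: str) -> int:
    """Toy hash: sum the character codes, then reduce modulo the bucket count."""
    return sum(ord(ch) for ch in key) % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(key, record):
    """Store the record in the bucket chosen by the hash of its key."""
    buckets[bucket_for(key)].append((key, record))

def lookup(key):
    """Search only the one bucket the key hashes to."""
    for stored_key, record in buckets[bucket_for(key)]:
        if stored_key == key:
            return record
    return None

insert("C001", {"name": "Ana Ruiz"})
found = lookup("C001")
missing = lookup("C999")
print(found, missing)
```

A lookup inspects a single bucket instead of scanning every record, which is the reason for organizing a file this way.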
Heap files
Files containing an unsorted set of records, each uniquely identified by a record ID that allows it to be inserted or deleted using that ID
Flat files
Files having no internal hierarchy
Index files
Files that store a list of lookup field values from a data file
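One way to picture this, sketched in Python under the assumption that the data file is a simple list of records and the lookup field is a hypothetical `last_name`:

```python
# A hypothetical data file: records stored in arrival order (values invented).
data_file = [
    {"cus_id": 10, "last_name": "Ramas"},
    {"cus_id": 11, "last_name": "Dunne"},
    {"cus_id": 12, "last_name": "Ramas"},
]

# The index maps each lookup field value to the positions of matching records.
index = {}
for position, record in enumerate(data_file):
    index.setdefault(record["last_name"], []).append(position)

def find_by_last_name(name):
    """Use the index to jump to matching records without scanning the data file."""
    return [data_file[pos] for pos in index.get(name, [])]

matches = find_by_last_name("Ramas")
print([rec["cus_id"] for rec in matches])
```

The index holds only lookup values and positions, so it stays much smaller than the data file it points into.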
File
Historically, a collection of file folders, properly tagged and kept in a filing cabinet. Although such manual files still exist, we more commonly think of a (computer) file as a collection of related records that contain information of interest to the end user
Islands of information
In the old file system environment, pools of independent, often duplicated, and inconsistent data created and managed by different departments
Data
Raw facts from which the required information is derived. Data have little meaning unless they are grouped in a logical manner
Data dictionary management
Removes structural and data dependency and provides the user with data abstraction. The DBMS uses this function to look up the required data component structures and relationships
Security management
Sets rules that determine which users are allowed to access the database. This function also sets restrictions on what specific data any user can see or manage
Basic database functions that a spreadsheet cannot perform
Spreadsheets do not support self-documentation through metadata, enforcement of data types or domains to ensure consistency of data within a column, defined relationships among tables, or constraints to ensure consistency of data across related tables
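The sketch below shows what two of these functions look like in practice, using Python's built-in `sqlite3` module. The `customer` and `invoice` tables, their columns, and the sample rows are hypothetical; the point is only that the DBMS itself rejects rows that break a declared domain rule or relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE invoice (
        inv_id    INTEGER PRIMARY KEY,
        cus_id    INTEGER NOT NULL REFERENCES customer(cus_id),
        inv_total REAL NOT NULL CHECK (inv_total >= 0)
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Ana Ruiz')")
conn.execute("INSERT INTO invoice VALUES (100, 1, 25.0)")

def try_insert(sql):
    """Report whether the DBMS accepted or rejected the statement."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

# Violates the domain rule on inv_total (negative amount).
domain_violation = try_insert("INSERT INTO invoice VALUES (101, 1, -5.0)")
# Violates the relationship: customer 999 does not exist.
orphan_row = try_insert("INSERT INTO invoice VALUES (102, 999, 10.0)")
print(domain_violation, orphan_row)
```

A spreadsheet would store both bad rows unless someone built and maintained equivalent checks by hand.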
Logical data format
The way a person views data within the context of a problem domain
Business intelligence
a comprehensive approach to capture and process business data with the purpose of generating information to support business decision making
Structural dependence
access to a file is dependent on its structure, so any change to the file structure requires changes to all programs that use the file
Desktop database
a single-user database that runs on a personal computer
Data warehouse
a specialized database that stores data in a format optimized for decision support
Data integrity
the condition in which all of the data in the database is consistent with the real-world events and conditions
Knowledge
body of information and facts about a specific subject
Data independence
exists when the data storage characteristics can be changed without affecting the program's ability to access the data
Structural independence
exists when the file structure can be changed without affecting the application's ability to access the data
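A hedged Python sketch contrasting the two situations. The comma-separated line layout, the added birth-date field, and the `customer` table are all invented for illustration; SQLite (through the built-in `sqlite3` module) stands in for a DBMS.

```python
import sqlite3

# Structural dependence: the program hard-codes the field position in a flat file.
def phone_by_position(line):
    return line.split(",")[2]

old_line = "C001,Ana Ruiz,555-0100"
new_line = "C001,Ana Ruiz,1990-04-02,555-0100"  # a birth-date field was added

wrong_after_change = phone_by_position(new_line)  # now returns the birth date

# Structural independence: the program names the column and the DBMS locates it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_id TEXT, cus_name TEXT, cus_phone TEXT)")
conn.execute("INSERT INTO customer VALUES ('C001', 'Ana Ruiz', '555-0100')")

query = "SELECT cus_phone FROM customer WHERE cus_id = 'C001'"
before = conn.execute(query).fetchone()[0]
conn.execute("ALTER TABLE customer ADD COLUMN cus_dob TEXT")  # structure changes
after = conn.execute(query).fetchone()[0]                     # same query still works

print(wrong_after_change, before, after)
```

The position-based function has to be rewritten after the file changes, while the query text is untouched by the new column.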
General-purpose database
contains a wide variety of data used in multiple disciplines
Discipline-specific databases
contain data focused on specific subject areas
Cloud database
created and maintained using cloud data services, such as Microsoft Azure or Amazon AWS
Metadata
data about data, through which the end-user data is integrated and managed
Operational database, also known as an online transaction processing (OLTP) database, transactional database, or production database
designed primarily to support a company's day-to-day operations
Data inconsistency
exists when different versions of the same data appear in different places
Analytical database
focuses primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making
NoSQL (Not only SQL)
generally used to describe a new generation of database management systems that is not based on the traditional relational database model
Semistructured data
has already been processed to some extent
Database management system (DBMS)
is a collection of programs that manages the database structure and controls access to the data stored in the database
Data quality
is a comprehensive approach to promoting the accuracy, validity, and timeliness of the data
Summary of a database
is a computer structure for storing data in a shared, integrated fashion so that the data can be transformed into information as needed
Data management
is a discipline that focuses on the proper generation, storage, and retrieval of data
Query language
is a nonprocedural language—one that lets the user specify what must be done without having to specify how
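To make the "what, not how" distinction concrete, here is a minimal sketch that answers the same question twice: once with a hand-written loop and once with an SQL query issued through Python's built-in `sqlite3` module. The `customer` table and the sample balances are invented for illustration.

```python
import sqlite3

rows = [("Ana", 120.0), ("Ben", 80.0), ("Cy", 200.0)]  # invented sample data

# Procedural: the program spells out how to scan, filter, collect, and sort.
procedural = []
for name, balance in rows:
    if balance > 100:
        procedural.append(name)
procedural.sort()

# Nonprocedural: the query states what is wanted; the DBMS decides how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_name TEXT, cus_balance REAL)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
result_set = conn.execute(
    "SELECT cus_name FROM customer WHERE cus_balance > 100 ORDER BY cus_name"
).fetchall()
declarative = [row[0] for row in result_set]

print(procedural, declarative)
```

Both approaches give the same names, but only the first one commits the program to a particular access strategy.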
Online analytical processing (OLAP)
is a set of tools that work together to provide an advanced data analysis environment for retrieving, processing, and modeling data from the data warehouse
Database
is a shared, integrated computer structure that stores a collection of end-user data and metadata
Extensible Markup Language (XML)
is a special language used to represent and manipulate data elements in a textual format
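A brief sketch of representing and manipulating data elements as XML text, using Python's standard `xml.etree.ElementTree` module. The `customers` document, its tag names, and its values are hypothetical.

```python
import xml.etree.ElementTree as ET

# A hypothetical XML document: data elements represented as tagged text.
doc = """
<customers>
  <customer id="C001"><name>Ana Ruiz</name><phone>555-0100</phone></customer>
  <customer id="C002"><name>Ben Ito</name><phone>555-0101</phone></customer>
</customers>
"""

root = ET.fromstring(doc.strip())

# Read data elements by tag and attribute.
names = {c.get("id"): c.findtext("name") for c in root.findall("customer")}

# Manipulate one element, then serialize the document back to text.
root.find("customer[@id='C002']/phone").text = "555-0199"
updated_text = ET.tostring(root, encoding="unicode")

print(names)
print(updated_text)
```

The tags describe each value, so the document carries some of its own structure along with the data.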
Query
is a specific request issued to the DBMS for data manipulation
Ad hoc query
is a spur-of-the-moment question
Unstructured data
is data that exists in its original (raw) state—that is, in the format in which it was collected
Structured Query Language (SQL)
is the de facto query language and data access standard supported by the majority of DBMS vendors
Structured data
is the result of formatting unstructured data to facilitate storage, use, and the generation of information
Information
is the result of processing raw data to reveal its meaning
Structural independence is important because
it substantially decreases programming effort and program maintenance costs
Data processing (DP) specialist
person responsible for developing and managing a computerized file processing system
Database system
refers to an organization of components that define and regulate the collection, storage, management, and use of data within a database environment
Database design
refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data
Social media
refers to web and mobile technologies that enable "anywhere, anytime, always on" human interactions
Data redundancy
exists when the same data is stored unnecessarily in different places
Workgroup database
supports a relatively small number of users (usually fewer than 50) or a specific department within an organization
Distributed database
supports data distributed across several different sites
Centralized database
supports data located at a single site
Multiuser database
supports multiple users at the same time
Single-user database
supports only one user at a time
XML database
supports the storage and management of semistructured XML data
Performance tuning
the activities that make the database perform more efficiently in terms of storage and access speed
Query result set
the answer the DBMS sends back to the application
Enterprise database
used by the entire organization and supports many users (more than 50, usually hundreds) across many departments