MIT Chapter 7
data management (tools)
- simple data management software packages - easy to use and update, such as personal information managers (PIMs)
database models
-hierarchical, network, relational, object-oriented, object-relational
centralized database approach to data management
-multiple application programs share a pool of related data - stored in a central location (data center), such as for a grocery store chain - more efficient and less prone to errors -requires a DBMS -reduces data redundancy (the same data copied, stored, and used in different locations), which is difficult to manage due to constant changes
good data
-nonredundant, flexible, simple and adaptable to a number of different applications
relational model
-describes data using a standard tabular format -all data elements are placed in two-dimensional tables called relations, which are the logical equivalent of files -tables in relational models organize data in rows and columns, simplifying data access and modification
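As a minimal sketch of the tabular format, the following uses Python's built-in sqlite3 module with an in-memory database. The Employee table, its columns, and the sample rows are illustrative assumptions, not taken from the chapter.

```python
import sqlite3

# In-memory database; the table and its rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmployeeNumber INTEGER, Name TEXT, HireDate TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1001, "A. Rivera", "2019-03-04"), (1002, "B. Chen", "2021-07-19")],
)

# Each row is one record; each column is one field (attribute).
rows = conn.execute(
    "SELECT EmployeeNumber, Name, HireDate FROM Employee ORDER BY EmployeeNumber"
).fetchall()
for row in rows:
    print(row)
```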
one-to-many relationship
shown with 1 and infinity symbols on the relationship lines -1 - one record in the first table -infinity - each record on the 1 side can relate to any number of records in the next table -other relationship types: one-to-one and many-to-many
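A one-to-many relationship might be sketched like this with sqlite3, assuming a hypothetical Department table on the "1" side and an Employee table on the "infinity" side. The table and column names are assumptions for illustration. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# "1" side: each department appears once.
conn.execute("CREATE TABLE Department (DeptNumber INTEGER PRIMARY KEY, DeptName TEXT)")
# "Many" side: several employees may point at the same department.
conn.execute(
    "CREATE TABLE Employee ("
    "EmployeeNumber INTEGER PRIMARY KEY, Name TEXT, "
    "DeptNumber INTEGER REFERENCES Department(DeptNumber))"
)
conn.execute("INSERT INTO Department VALUES (10, 'Accounting')")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1001, "A. Rivera", 10), (1002, "B. Chen", 10)],
)

employees_in_dept = conn.execute(
    "SELECT Name FROM Employee WHERE DeptNumber = 10 ORDER BY EmployeeNumber"
).fetchall()

# An employee pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO Employee VALUES (1003, 'C. Okafor', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```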
key
a field in a record that is used to identify the record (EmployeeNumber)
relationship
ability to connect data in different tables through a common field
database types - multiuser - database development platform
allow multiple employees to access and edit simultaneously - businesses - allow you to create a database and DBMS for accessing and manipulating data in database
business intelligence
business use of data mining to increase efficiency, reduce costs, increase profits
field types - computed field
calculated field - calculated from other fields instead of being entered into the database
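One way to picture a computed field, as a sketch: the value is derived in the query rather than stored. The OrderLine table and the LineTotal name below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderLine (LineId INTEGER, Quantity INTEGER, UnitPrice REAL)")
conn.executemany("INSERT INTO OrderLine VALUES (?, ?, ?)", [(1, 3, 2.5), (2, 2, 10.0)])

# LineTotal is never stored; it is computed from Quantity and UnitPrice on each query.
totals = conn.execute(
    "SELECT LineId, Quantity * UnitPrice AS LineTotal FROM OrderLine ORDER BY LineId"
).fetchall()
print(totals)
```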
field types - alphanumeric
character data includes characters or numbers that cannot be manipulated or used in calculations
attribute
characteristic of an entity (employee number, name, hire date) -selected to capture relevant characteristics of entities
data center
climate-controlled building or set of buildings that house the servers that store and deliver mission-critical information and services (corporate databases and DBMS)
database
collection of data (integrated and related files or tables) organized to meet users' needs
data definition language (DDL)
collection of instructions and commands used to define and describe data and data relationships in a specific database (schemas entered into a DBMS using DDL)
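In SQL-based systems, DDL typically takes the form of statements such as CREATE TABLE. A small sketch using sqlite3 follows; the Employee schema is an assumed example, and `PRAGMA table_info` is SQLite's own way of describing a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statement: defines the structure (schema) of a table, not its contents.
conn.execute(
    """
    CREATE TABLE Employee (
        EmployeeNumber INTEGER PRIMARY KEY,
        Name           TEXT NOT NULL,
        HireDate       TEXT
    )
    """
)

# Ask SQLite to describe the table that the DDL created: (column name, declared type).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Employee)")]
print(columns)
```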
record
collection of related fields that describe some object or activity - together they create a more complete description (name, time, artist, price, release date, track #)
file (table)
collection of related records (iTunes music library)
unstructured databases
contains data that is difficult or impossible to place in a traditional database system - notes, drawings, fingerprints, medical abstracts, sound recordings
field types - logical field
contains items such as yes or no
field types - numeric field
contains numbers that can be used in making calculations
database backup
copy of all or part of database on second storage device
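As a rough illustration, Python's sqlite3 module (3.7 or later) offers `Connection.backup` for copying a whole database. Here a second in-memory connection stands in for the second storage device, which is an assumption made to keep the sketch self-contained; a real backup would target a file on separate storage.

```python
import sqlite3

# Source database with one made-up table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Track (Name TEXT, Price REAL)")
source.execute("INSERT INTO Track VALUES ('Sample Song', 0.99)")
source.commit()

# Copy the whole database into a second connection.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# The copy can be read independently of the source.
restored = backup_copy.execute("SELECT Name, Price FROM Track").fetchall()
print(restored)
```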
databases can be..
critical for coordinating responses across diverse functional areas of an organization -controversial - are privacy and even civil rights being violated? - databases of people's information
data integrity
data stored in database is accurate and up to date
characteristics of databases - amount
database size - depends on number of records or files - determines overall storage requirement
replicated database
database that holds a duplicate set of frequently used data - sent at beginning and end of the day
database system or database environment
database, DBMS, and application programs that utilize the data
lights out environment
automated databases that can run and manage themselves while being monitored remotely
database types - single user
databases for personal computers - only one person can use at a time -Microsoft Outlook, Quicken - store and manipulate personal data
back-end application
DBMS that interacts with other programs and databases; only indirectly interacts with users
front-end application
DBMS that directly interacts with users
schema
description that outlines the logical and physical structure of data and the relationships among the data in a database
database types - special-purpose database
designed for one purpose or limited number of applications
data dictionary
detailed description of all data used in the database - standard definitions of terms and data elements that can be referenced by programmers, data administrators, and users to maintain data integrity
end-user computing
development and use of application programs and computer systems by users who are not computer system professionals
database design - input and output interface design
effective interfaces - convenient and powerful database design feature
garbage in, garbage out (GIGO)
entering inaccurate data into a database results in inaccurate output; also applies to data that is too old and no longer valid
field name
every field has one and can have either fixed or variable length - no spaces in database syntax (EmployeeNumber)
comma-separated values (CSV)
flat file format - uses line breaks and commas to interpret data as a table -simple, standardized and understood by many different databases
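A short sketch of reading CSV text as a table with Python's standard csv module. The column names and sample rows are invented for illustration, and `io.StringIO` stands in for a file on disk.

```python
import csv
import io

# Line breaks separate records; commas separate fields. The first line holds field names.
raw = (
    "EmployeeNumber,Name,HireDate\n"
    "1001,A. Rivera,2019-03-04\n"
    "1002,B. Chen,2021-07-19\n"
)

# DictReader maps each line to a record keyed by field name; all values arrive as text.
records = list(csv.DictReader(io.StringIO(raw)))
for record in records:
    print(record["EmployeeNumber"], record["Name"], record["HireDate"])
```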
entity
generalized class of people, places, or things (objects) for which data is collected, stored, and maintained (CDs, employees, inventory) -data about entities are stored and represented as records in a database
entity relationship diagram
illustration of relationships between tables
data manipulation: selecting
involves choosing data based on certain criteria
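Selecting can be sketched as a SQL query with a WHERE clause. The Track table and its contents below are assumptions chosen to resemble a small music library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Track (Name TEXT, Artist TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO Track VALUES (?, ?, ?)",
    [("Song A", "Artist X", 0.99), ("Song B", "Artist Y", 1.29), ("Song C", "Artist X", 1.29)],
)

# The criterion (Artist = 'Artist X') decides which records are chosen.
selected = conn.execute(
    "SELECT Name FROM Track WHERE Artist = ? ORDER BY Name", ("Artist X",)
).fetchall()
print(selected)
```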
data manipulation: joining
involves combining two or more tables -key to flexibility and power of relational databases
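A minimal join sketch, assuming two hypothetical tables (Employee and Department) that share the common field DeptNumber:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Department (DeptNumber INTEGER PRIMARY KEY, DeptName TEXT)")
conn.execute("CREATE TABLE Employee (EmployeeNumber INTEGER PRIMARY KEY, Name TEXT, DeptNumber INTEGER)")
conn.executemany("INSERT INTO Department VALUES (?, ?)", [(10, "Accounting"), (20, "Shipping")])
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1001, "A. Rivera", 10), (1002, "B. Chen", 20), (1003, "C. Okafor", 10)],
)

# Combine the two tables wherever the common field DeptNumber matches.
joined = conn.execute(
    "SELECT Employee.Name, Department.DeptName "
    "FROM Employee JOIN Department ON Employee.DeptNumber = Department.DeptNumber "
    "ORDER BY Employee.EmployeeNumber"
).fetchall()
print(joined)
```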
continuity planning
involves disaster recovery - important to government agencies that maintain information on which national security depends
data warehouse
large database that holds important information from a variety of sources
characteristics of databases - volatility
measure of changes - additions, deletions, modifications - typically required in a given time period
characteristics of databases - immediacy
measure of how rapidly changes must be made to data
database design - record and table design
must identify exact fields that are contained in each record and types of records that may be included in each table
master files
permanent files that are updated over time
anomalies
problems and irregularities in data, such as two records for S. Thomas without a unique identification code - fix: add a primary key
normalization
process of correcting data problems or anomalies - to ensure the database contains "good data"
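One normalization step might look like the sketch below, in plain Python: a flat list that repeats the department name on every row is split into a department lookup and an employee list that refers to it by DeptNumber. The field names and sample data are assumptions, and this shows only a single step, not a full normalization procedure.

```python
# Flat, redundant data: DeptName is repeated for every employee in the department.
flat = [
    {"EmployeeNumber": 1001, "Name": "S. Thomas", "DeptNumber": 10, "DeptName": "Accounting"},
    {"EmployeeNumber": 1002, "Name": "S. Thomas", "DeptNumber": 20, "DeptName": "Shipping"},
    {"EmployeeNumber": 1003, "Name": "B. Chen", "DeptNumber": 10, "DeptName": "Accounting"},
]

departments = {}  # DeptNumber -> DeptName, stored once per department
employees = []    # employee records keep only the DeptNumber reference

for row in flat:
    departments[row["DeptNumber"]] = row["DeptName"]
    employees.append(
        {
            "EmployeeNumber": row["EmployeeNumber"],
            "Name": row["Name"],
            "DeptNumber": row["DeptNumber"],
        }
    )

print(departments)
print(employees)
```

After the split, renaming a department means changing one entry in `departments` instead of every matching employee row.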
data mining
process of extracting information from data warehouse or data mart
database recovery
process of returning database to original, correct condition if database has crashed or been corrupted
data analysis
process that involves evaluating data to identify problems with content of database
disaster recovery
providing a plan for how to bring systems back online after an emergency
characters
represented by bytes (basic building block of information). can be upper/lowercase, numbers, special symbols
database systems in organizations
routine processing, information and decision support, data warehouse and data marts and data mining
query by example (QBE)
visual approach to making queries or getting answers to questions by entering names, values, and other items into a window -easier and faster than learning formal DMLs and SQL
database administrators (DBAs)
skilled and trained computer professionals who direct all activities related to an organization's database, including providing security from intruders
data mart
small data warehouse, often developed for specific person or purpose - generated from data warehouse using DBMS
field
smallest practical unit in most databases -typically a name, number, or combination of characters that in some way describes an aspect of an object or activity
Database Management System (DBMS)
software that accesses databases - group of programs that manipulate the database and provide an interface between the database and the user/application programs -store important data for various people groups -perform routine tasks -providing information to help make better decisions -ensure data is protected and safe from attackers
data item
specific attribute - can be found in fields of the record describing the entity (DNA test results)
data manipulation language (DML)
specific language provided with a DBMS that allows people and other database users to access, modify, and make queries about data contained in the database and to generate reports
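In SQL, the DML side is commonly represented by INSERT, UPDATE, DELETE, and SELECT. A brief sketch with sqlite3 follows; the Employee table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# (CREATE TABLE is DDL; it is here only so the DML statements have something to act on.)
conn.execute("CREATE TABLE Employee (EmployeeNumber INTEGER PRIMARY KEY, Name TEXT)")

# Add records.
conn.execute("INSERT INTO Employee VALUES (1001, 'A. Rivera')")
conn.execute("INSERT INTO Employee VALUES (1002, 'B. Chen')")
# Modify a record.
conn.execute("UPDATE Employee SET Name = 'A. Rivera-Lopez' WHERE EmployeeNumber = 1001")
# Remove a record.
conn.execute("DELETE FROM Employee WHERE EmployeeNumber = 1002")
# Query what is left.
remaining = conn.execute("SELECT EmployeeNumber, Name FROM Employee").fetchall()
print(remaining)
```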
database design - field design
specify type, size, format and other aspects of each field
Structured Query Language (SQL)
standardized data manipulation and query language for relational databases
database types - flat file
stores database records in a plain text file -stores each record as a line of text, with fields separated by commas or another delimiter
transaction files
temporary files that contain data representing transactions or actions that must be taken (number of hours employee worked last week, sales order from yesterday) -cause changes to master files
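The way transaction files change master files can be sketched in plain Python. The data here is invented: a master file of year-to-date hours per employee and a transaction file of hours worked last week.

```python
# Master file: permanent data, keyed by EmployeeNumber (year-to-date hours).
master = {1001: 120.0, 1002: 95.5}

# Transaction file: temporary records of (EmployeeNumber, hours worked).
transactions = [(1001, 40.0), (1002, 38.5), (1001, 4.0)]

# Apply each transaction to the master file.
for employee_number, hours in transactions:
    master[employee_number] = master.get(employee_number, 0.0) + hours

# Once applied, the temporary transactions are no longer needed.
transactions.clear()
print(master)
```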
data hierarchy
the manner in which data in a database is organized into sequential levels of detail -characters are combined to make fields -fields are combined to make a record -records are combined to make a file -files are combined to make a database
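The levels of the hierarchy can be mirrored with nested Python structures. This is only an analogy under assumed names and values, not how a DBMS stores data internally.

```python
# Characters combine into a field.
characters = ["R", "i", "v", "e", "r", "a"]
last_name_field = "".join(characters)

# Fields combine into a record.
record = {"EmployeeNumber": 1001, "LastName": last_name_field}

# Records combine into a file (table).
employee_file = [record, {"EmployeeNumber": 1002, "LastName": "Chen"}]

# Files combine into a database.
database = {
    "Employee": employee_file,
    "Department": [{"DeptNumber": 10, "DeptName": "Accounting"}],
}

print(database["Employee"][0]["LastName"])
```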
storage area network (SAN)
to connect multiple storage devices on high-speed network to make recovering from failure faster and more efficient
redundant array of independent disks (RAID)
to store data on multiple disks
primary key
uniquely identifies a record - no other record can have the same primary key -essential for DBMS functioning -can be a unique ID number or a combination of multiple fields
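A small sqlite3 sketch of primary key uniqueness: two employees may share a name, but not an EmployeeNumber. The table and sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmployeeNumber INTEGER PRIMARY KEY, Name TEXT)")

# Same name, different primary keys: both records are accepted and stay distinguishable.
conn.execute("INSERT INTO Employee VALUES (1001, 'S. Thomas')")
conn.execute("INSERT INTO Employee VALUES (1002, 'S. Thomas')")

# Reusing an existing primary key is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Employee VALUES (1001, 'J. Park')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
print(duplicate_rejected, count)
```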
database types - open-source database
used by travel agencies, manufacturing companies, etc -strong community support, customizable, free
database types - general-purpose database
used for a large number of applications -Oracle, Sybase, IBM
distributed databases
virtualized database - actual data may be spread across several databases at different locations - appears to be a single, unified database, but connects data at different locations via telecommunications -controlling access and changes to data is sometimes difficult