CIT 365 Database Management - Test 1 Chapters 1-3
- number of users - database location - expected type and extent of use
Databases can be classified according to these three things...
- not based on the relational model - support distributed database architectures - provide high scalability, high availability, and fault tolerance - support large amounts of sparse data - geared toward performance rather than transaction consistency - store data in key-value stores
NoSQL Databases
- Hadoop - Hadoop Distributed File System (HDFS) - MapReduce - NoSQL
What are some Big Data technologies?
- name changes - misspellings - deletion of linked records or only partial deletions
What are some types of data anomaly?
- high scalability, availability, and fault tolerance are provided - uses low-cost commodity hardware - supports big data - key-value model improves storage efficiency
What are the advantages of NoSQL?
- visual modeling yields conceptual simplicity - visual representation makes it effective communication tool - is integrated with the dominant relational model
What are the advantages of entity relationship model?
- better data integration and less data inconsistency - increased end-user productivity - improved data sharing, security, access, and decision making - data quality
What are the advantages of the DBMS?
- structural independence is promoted using independent tables - tabular view improves conceptual simplicity - ad hoc query capability is based on SQL - isolates the end user from physical-level details - improves implementation and management simplicity
What are the advantages of the relational model?
- volume does not allow the usage of conventional structures - expensive - OLAP tools proved inconsistent dealing with unstructured data
What are the challenges of Big Data?
- complex programming is required - there is no relationship support - there is no transaction integrity support - in terms of data consistency, it provides an eventually consistent model
What are the disadvantages of NoSQL?
- increased costs - management complexity - maintaining currency - vendor dependence - frequent upgrade / replacement cycles
What are the disadvantages of database systems?
- limited constraint representation - limited relationship representation - no data manipulation language - loss of information content occurs when attributes are removed from entities to avoid crowded displays
What are the disadvantages of the entity relationship model?
- requires substantial hardware and system software overhead - conceptual simplicity gives untrained people tools to use a good system poorly - may promote information problems
What are the disadvantages of the relational model?
- lengthy development times - difficulty of getting quick answers - complex system administration - lack of security - extensive programming
What are the problems with file system data processing?
- help standardize company's view of data - communications tool between users and designers - allow designers to understand the nature, role, scope of data, and business processes; develop appropriate relationship participation rules and constraints; create an accurate data model
What are the reasons for identifying and documenting business rules?
- volume - velocity - variety
What are the three characteristics of big data?
produced an "automatic transmission" database that replaced the "standard transmission" databases that preceded it (an analogy for its conceptual simplicity)
What did the relational model developed by Codd in 1970 do?
describe the main and distinguishing characteristics of the data
What do business rules do?
difficult-to-trace errors
What does a poorly designed database cause?
- facilitates data management - generates accurate and valuable information
What does a well designed database do?
the design of the database structure that will be used to store and manage end-user data
What does database design focus on?
represents a global view of the entire database as seen by the entire organization
What does the conceptual model do?
- performs basic functions provided by the hierarchical and network DBMS systems - makes the relational data model easier to understand and implement - hides the complexities of the relational model from the user
What does the relational database management system do?
- graphical representation of entities and their relationships in a database structure
What is the entity relationship model?
- communication tool - give an overall view of the database - organize data for various users - are an abstraction for the creation of a good database
What is the importance of Data Models?
- intermediary between the user and the database - enables data to be shared - presents the end user with an integrated view of the data - receives and translates application requests into operations required to fulfill the requests - hides database's internal complexity from the application programs and users
What is the role of the DBMS?
DBMS
____ eliminates most of the file system's problems
field
a character or group of characters (alphabetic or numeric) that has a specific meaning. used to define and store data
file
a collection of related records
record
a logically connected set of one or more fields that describes a person, place, or thing.
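A minimal Python sketch of how the field, record, and file cards above nest inside one another. The customer names, phone numbers, and field names are made up for illustration.

```python
# Hypothetical customer data; every name and value here is illustrative only.

# A field is a single named value with a specific meaning (e.g. "name").
# A record is a logically connected set of fields describing one thing.
record_1 = {"cust_id": 1, "name": "Ana Ruiz", "phone": "555-0101"}
record_2 = {"cust_id": 2, "name": "Ben Cho", "phone": "555-0102"}

# A file is a collection of related records.
customer_file = [record_1, record_2]

# Read one field from every record in the file.
names = [record["name"] for record in customer_file]
print(names)
```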
model
abstraction of real-world object or event
database communication interfaces
accept end-user requests via multiple, different network environments
structural dependence
access to a file is dependent on its own structure
conceptual schema
basis for the identification and high-level description of the main data objects
- find new and better ways to manage large amounts of web and sensor-generated data - provide high performance and scalability at a reasonable cost
big data aims to ...
business rule
brief, precise, and unambiguous description of a policy, procedure, or principle
business intelligence
captures and processes business data to generate information that supports decision making
attribute
characteristic of an entity
attribute
column
schema
conceptual organization of the entire database as viewed by the database administrator
discipline-specific databases
contain data focused on specific subject areas
general-purpose databases
contain a wide variety of data used in multiple disciplines
cloud database
created and maintained using cloud data services that provide defined performance measures for the database
- stores data structures, relationships between structures, and access paths - defines, stores, and manages all access paths and components
current generation DBMS software:
metadata
data about data
data dependence
data access changes when data storage characteristics change
unstructured data
data that exist in their original (raw) state
distributed database
data is distributed across different sites
centralized database
data is located at a single site
semistructured data
data that have already been processed to some extent
- poor data security - data inconsistency - increased likelihood of data-entry errors when complex entries are made in different files
data redundancy implications
data independence
data storage characteristics are changed without affecting the program's ability to access the data
structured data
data that results from formatting
structured query language (SQL)
de facto query language and data access standard supported by the majority of DBMS vendors
relationship
describes an association among entities
operational database
designed to support a company's day-to-day operations
data anomaly
develops when not all of the required changes in the redundant data are made successfully
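A small sketch of the data anomaly card above, using Python's built-in `sqlite3` module with an in-memory database. The `customer` table, the agent named Lee, and the phone numbers are hypothetical; the point is only that updating one of two redundant copies leaves the data contradicting itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The agent's phone number is stored redundantly, once per customer row.
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER, agent_name TEXT, agent_phone TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Lee", "555-0100"), (2, "Lee", "555-0100")],
)

# Only one of the two redundant copies gets updated.
conn.execute("UPDATE customer SET agent_phone = '555-0199' WHERE cust_id = 1")

# The same agent now has two different phone numbers on record.
phones = conn.execute(
    "SELECT DISTINCT agent_phone FROM customer "
    "WHERE agent_name = 'Lee' ORDER BY agent_phone"
).fetchall()
print(phones)
```

Storing the agent's phone once, in a separate agent table, would remove the redundant copy and with it the chance of this mismatch.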
backup and recovery management
enables recovery of the database after a failure
schema data definition language (DDL)
enables the database administrator to define the schema components
security management
enforces user security and data privacy
performance tuning
ensures efficient performance of the database in terms of storage and access speed
- be descriptive of the objects in the business environment - use terminology that is familiar to the users
entity names are required to:
data manipulation language (DML)
defines the environment in which data can be managed and is used to work with the data in the database
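To make the difference between the schema DDL card and the DML card concrete, here is a short sketch using Python's built-in `sqlite3` module. The `product` table and its columns are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines a schema component (a table and its columns).
conn.execute(
    "CREATE TABLE product (prod_id INTEGER PRIMARY KEY, descr TEXT, price REAL)"
)

# DML: works with the data stored inside that structure.
conn.execute("INSERT INTO product (prod_id, descr, price) VALUES (1, 'hammer', 9.50)")
conn.execute("UPDATE product SET price = 10.00 WHERE prod_id = 1")
rows = conn.execute("SELECT prod_id, descr, price FROM product").fetchall()
print(rows)
```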
structural independence
file structure is changed without affecting the application's ability to access the data
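One way to picture structural independence, sketched with Python's `sqlite3` module: the table gains a column, yet the application query that names only the columns it needs keeps working. The `employee` table and the `employee_names` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Dana')")


def employee_names(connection):
    # The application names only the columns it needs.
    query = "SELECT name FROM employee ORDER BY emp_id"
    return [row[0] for row in connection.execute(query)]


before = employee_names(conn)

# Change the table structure by adding a column.
conn.execute("ALTER TABLE employee ADD COLUMN hire_date TEXT")

# The same application code still works after the structural change.
after = employee_names(conn)
print(before, after)
```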
data modeling
good ______ ____________ facilitates communication between the designer, user, and the developer
physical data format
how the computer must work with the data
logical data format
how humans view the data
data modeling
iterative and progressive process of creating a specific data model for a determined problem domain
query language
lets the user specify what must be done without having to specify how
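The "what, not how" idea in the query language card can be sketched by answering the same question twice: once with a SQL query and once with a hand-written loop. The `orders` table and the threshold of 50 are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 20.0), (2, 75.0), (3, 120.0)]
)

# Declarative: state WHAT is wanted; the DBMS decides how to find it.
declarative = conn.execute(
    "SELECT order_id FROM orders WHERE total > 50 ORDER BY order_id"
).fetchall()

# Procedural equivalent: the program spells out HOW, row by row.
procedural = []
for order_id, total in conn.execute("SELECT order_id, total FROM orders"):
    if total > 50:
        procedural.append((order_id,))
procedural.sort()

print(declarative == procedural)
```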
relation or table
matrix composed of intersecting tuples (rows) and attributes (columns)
data integrity management
minimizes redundancy and maximizes consistency
subschema
portion of the database seen by the application programs that produce the desired information from the data within the database
- facilitates communication between parties - promotes self-documentation
proper naming:
data
raw facts
end-user data
raw facts of interest to the end user
extensible markup language (XML)
represents data elements in textual format
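A brief illustration of the XML card above: data elements written as text, then read back with Python's standard `xml.etree.ElementTree` module. The `customers` document is a made-up example, not a format taken from the course material.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: each data element is plain, tagged text.
doc = """
<customers>
  <customer id="1"><name>Ana Ruiz</name></customer>
  <customer id="2"><name>Ben Cho</name></customer>
</customers>
""".strip()

root = ET.fromstring(doc)

# Pull the id attribute and the name element out of each customer.
customers = [(c.get("id"), c.findtext("name")) for c in root.findall("customer")]
print(customers)
```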
attribute name
required to be descriptive of the data represented by the attribute
tuple
rows
entity instance or entity occurrence
rows in the relational table
desktop database
runs on a PC
islands of information
scattered data locations
constraint
set of rules to ensure data integrity
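A sketch of a constraint at work, using Python's `sqlite3` module. The `student` table and the rule that a GPA must fall between 0 and 4 are assumptions for the example; the DBMS rejects the row that breaks the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " stu_id INTEGER PRIMARY KEY,"
    " gpa REAL NOT NULL CHECK (gpa BETWEEN 0 AND 4)"
    ")"
)
conn.execute("INSERT INTO student VALUES (1, 3.5)")

# A GPA of 5.2 violates the CHECK constraint, so the insert is refused.
try:
    conn.execute("INSERT INTO student VALUES (2, 5.2)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)
```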
database
shared, integrated computer structure that stores a collection of end-user data and metadata.
data models
simple representations of complex real-world data structures
multiuser access control
sophisticated algorithms ensure that multiple users can access the database concurrently without compromising its integrity
external schema
specific representation of an external view
analytical database
store historical data and business metrics used exclusively for tactical or strategic decision making
data warehouse
stores data in a format optimized for decision support
data dictionary
stores definitions of the data elements and their relationships
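As a rough illustration of a data dictionary, SQLite keeps its own metadata that can be queried like any other data. The sketch below uses Python's `sqlite3` module; the `course` table is hypothetical, and SQLite's catalog is only one example of how a DBMS may store such definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
)

# sqlite_master holds the definition of each schema object.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]
print(tables, columns)
```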
workgroup database
supports a small number of users or a specific department
enterprise database
supports many users across many departments
multiuser database
supports multiple users at the same time
single-user database
supports one user at a time
logical design
task of creating a conceptual data model
connectivity
term used to label the relationship types
online analytical processing (OLAP)
tools for retrieving, processing, and modeling data from the data warehouse
data transformation and presentation
transforms entered data to conform to required data structures
entity
unique and distinct object used to collect and store data
data redundancy
unnecessarily storing the same data at different places
entity relationship diagram (ERD)
uses graphic representations to model database components