Ch. 3: Database Systems, Data Warehouses, and Data Marts

Validation rule

A rule determining whether a value is valid; for example, a student's age cannot be a negative number.
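
As a minimal sketch of a validation rule, the block below uses Python's built-in `sqlite3` module and a `CHECK` constraint. The `student` table and its columns are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint plays the role of the validation rule:
# age may not be negative.
conn.execute("CREATE TABLE student (name TEXT, age INTEGER CHECK (age >= 0))")
conn.execute("INSERT INTO student VALUES ('Ana', 19)")  # valid value

try:
    conn.execute("INSERT INTO student VALUES ('Ben', -3)")  # invalid value
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the DBMS refused the row

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)
```

Only the valid row is stored; the row with the negative age is refused by the database itself rather than by application code.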

Subject oriented

Data in a data warehouse is focused on a specific area, such as the home-improvement business or a university, whereas data in a database is transaction/function oriented.

Purpose

Data in a data warehouse is used for analytical purposes, whereas data in a database is used for capturing and managing transactions.

fragmentation

The fragmentation approach to a distributed DBMS addresses how tables are divided among multiple locations. There are three variations: horizontal, vertical, and mixed.
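
The three variations can be sketched in plain Python. This is an illustration only: the `customers` rows and both helper functions are hypothetical, and a real distributed DBMS would place each fragment at a different site.

```python
# Hypothetical customer table held as a list of row dictionaries.
customers = [
    {"id": 1, "name": "Ana", "region": "East", "balance": 120.0},
    {"id": 2, "name": "Ben", "region": "West", "balance": 75.5},
    {"id": 3, "name": "Cy", "region": "East", "balance": 0.0},
]

def horizontal_fragment(rows, region):
    """Keep whole rows, but only those that belong to one site."""
    return [row for row in rows if row["region"] == region]

def vertical_fragment(rows, columns):
    """Keep every row, but only the selected columns plus the key."""
    return [{col: row[col] for col in ("id", *columns)} for row in rows]

east_rows = horizontal_fragment(customers, "East")
billing_cols = vertical_fragment(customers, ["balance"])
# Mixed fragmentation applies both splits.
east_billing = vertical_fragment(east_rows, ["balance"])
```

The key column `id` is kept in every vertical fragment so that the pieces can be joined back together.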

allocation

The allocation approach to a distributed DBMS combines fragmentation and replication, with each site storing the data it uses most often.

database

A database is a collection of related data that is stored in a central location or in multiple locations.

data hierarchy

A data hierarchy is the structure and organization of data, which involves fields, records, and files.

database management system (DBMS)

A database management system (DBMS) is software for creating, storing, maintaining, and accessing database files. A DBMS makes using databases more efficient.

Type of data

A data warehouse captures aggregated data, whereas a database captures raw transaction data.

Time variant

Data in a data warehouse is categorized based on time, such as historical information, whereas data in a database only keeps recent activity in memory.

Field data type

Character (text), date, and number

Integrated

Data in a data warehouse comes from a variety of sources, whereas data in a database usually does not.

Integrity rules

Define the boundaries of a database, such as maximum and minimum values allowed for a field, constraints (limits on what type of data can be stored in a field), and access methods.

Data structure

Describes how data is organized and the relationship among records

Operations

Describes methods, calculations, and so forth that can be performed on data, such as updating and querying data

Field name

Student name, admission date, age, and major

data dictionary

The data dictionary stores definitions, such as data types for fields, default values, and validation rules for data in each field.

physical view

The physical view involves how data is stored on and retrieved from storage media, such as hard disks, magnetic tapes, or CDs.

replication

The replication approach to a distributed DBMS has each site store a copy of the data in the organization's database.

Default value

The value entered if none is available; for example, if no major is declared, the value is "undecided."
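
A default value can be sketched with a `DEFAULT` clause, here through Python's built-in `sqlite3` module. The `student` table and the sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The DEFAULT clause supplies "undecided" when no major is given.
conn.execute("CREATE TABLE student (name TEXT, major TEXT DEFAULT 'undecided')")
conn.execute("INSERT INTO student (name) VALUES ('Ana')")  # no major declared
conn.execute("INSERT INTO student (name, major) VALUES ('Ben', 'MIS')")

# Map each student name to the major actually stored.
majors = dict(conn.execute("SELECT name, major FROM student"))
print(majors)
```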

Variety

This refers to the combination of structured data (e.g., customers' product ratings between 1 and 5) and unstructured data (e.g., call center conversations or customer complaints about a service or product).

Volume

This refers to the sheer quantity of transactions, measured in petabytes (1,024 terabytes) or exabytes (1,024 petabytes).
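
The unit relationships quoted above can be checked with a few lines of arithmetic. The sketch assumes binary units (multiples of 1,024), as the definition does.

```python
# Binary storage units, each 1,024 times the previous one.
TERABYTE = 1024 ** 4           # bytes in one terabyte
PETABYTE = 1024 * TERABYTE     # 1,024 terabytes
EXABYTE = 1024 * PETABYTE      # 1,024 petabytes

# How many terabytes fit in one exabyte: 1,024 * 1,024.
terabytes_per_exabyte = EXABYTE // TERABYTE
print(terabytes_per_exabyte)
```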

Velocity

This refers to the speed with which the data has to be gathered and processed.

data-driven Web site

acts as an interface to a database, retrieving data for users and allowing users to enter data in the database.

data administration component

The data administration component, used by IT professionals and database administrators, handles tasks such as backup and recovery, security, and change management.

Data-mining

analysis is used to discover patterns and relationships.

object-oriented databases

both data and their relationships are contained in a single object. An object consists of attributes and methods that can be performed on the object's data.

data model

determines how data is created, represented, organized, and maintained. It usually contains data structure, operations, and integrity rules.

database administrator (DBA)

Database administrators (DBAs), found in large organizations, design and set up databases, establish security measures, develop recovery procedures, evaluate database performance, and add and fine-tune database functions.

Prescriptive analytics

goes beyond descriptive and predictive analytics by recommending a course of action that a decision maker should follow and showing the likely outcome of each decision.

normalization

improves database efficiency by eliminating redundant data and ensuring that only related data is stored in a table.
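
A small sketch of the idea, using hypothetical data: an advisor's office is repeated on every student row, so it is moved into its own table and stored once.

```python
# Unnormalized: the advisor's office is repeated on every student row.
enrollments = [
    {"student": "Ana", "advisor": "Dr. Lee", "advisor_office": "B12"},
    {"student": "Ben", "advisor": "Dr. Lee", "advisor_office": "B12"},
    {"student": "Cy", "advisor": "Dr. Kim", "advisor_office": "C07"},
]

# Advisor facts move to their own table, stored once per advisor.
advisors = {row["advisor"]: row["advisor_office"] for row in enrollments}

# The student table keeps only a reference to the advisor.
students = [
    {"student": row["student"], "advisor": row["advisor"]}
    for row in enrollments
]
```

After the split, changing an advisor's office means updating one entry in `advisors` instead of every matching student row.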

logical view

involves how information appears to users and how it can be organized and retrieved.

data warehouse

is a collection of data from a variety of sources used to support decision-making applications and generate business intelligence.

foreign key

is a field in a relational table that matches the primary key column of another table. It can be used to cross-reference tables.
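
The cross-referencing role of a foreign key can be sketched with Python's built-in `sqlite3` module. The `major` and `student` tables are hypothetical, and SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE major (major_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    """CREATE TABLE student (
           student_id INTEGER PRIMARY KEY,
           name TEXT,
           major_id INTEGER REFERENCES major(major_id)  -- foreign key
       )"""
)
conn.execute("INSERT INTO major VALUES (1, 'MIS')")
conn.execute("INSERT INTO student VALUES (100, 'Ana', 1)")

# Cross-reference the two tables through the matching key values.
rows = conn.execute(
    "SELECT student.name, major.title "
    "FROM student JOIN major ON student.major_id = major.major_id"
).fetchall()

# A student pointing at a major that does not exist is refused.
try:
    conn.execute("INSERT INTO student VALUES (101, 'Ben', 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
```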

Structured Query Language (SQL)

is a standard fourth-generation query language used by many DBMS packages, such as Oracle 12c and Microsoft SQL Server. SQL consists of several keywords specifying actions to take.
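
To illustrate a few SQL keywords (`SELECT`, `FROM`, `WHERE`, `ORDER BY`), the sketch below runs a query through Python's built-in `sqlite3` module. SQLite stands in for the DBMS packages named above purely as an assumption for the example, and the `student` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, age INTEGER, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Ana", 19, "MIS"), ("Ben", 22, "ACC"), ("Cy", 21, "MIS")],
)

# SELECT chooses columns, FROM names the table, WHERE filters rows,
# and ORDER BY sorts the result.
query = "SELECT name FROM student WHERE major = ? ORDER BY age DESC"
names = [row[0] for row in conn.execute(query, ("MIS",))]
print(names)
```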

big data

is data so voluminous that conventional computing methods are not able to efficiently process and manage it.

network model

is similar to the hierarchical model, but records are organized differently. Unlike the hierarchical model, each record in the network model can have multiple parent and child records.

data manipulation component

is used to add, delete, modify, and retrieve records from a database.

data definition component

is used to create and maintain the data dictionary and define the structure of files in a database.

application generation component

is used to design elements of an application using a database, such as data entry screens, interactive menus, and interfaces with other programming languages.

online transaction processing (OLTP)

is used to facilitate and manage transaction-oriented applications, such as point-of-sale, data entry, and retrieval transaction processing. It generally uses internal data and responds in real time.

random access file structure

records can be accessed in any order, regardless of their physical locations in storage media. This method of access is fast and very effective when a small number of records need to be processed daily or weekly.
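
One common way to get random access is to use fixed-length records, so the position of record n can be computed directly. The sketch below assumes that layout; the record format and helper names are hypothetical, and an in-memory `io.BytesIO` stream stands in for a file on disk.

```python
import io
import struct

# Fixed-length record: 4-byte unsigned id + 10-byte name = 14 bytes.
RECORD = struct.Struct("<I10s")

def write_records(stream, records):
    """Append each (id, name) pair as one fixed-length record."""
    for rec_id, name in records:
        stream.write(RECORD.pack(rec_id, name.encode("ascii")))

def read_record(stream, index):
    """Jump straight to record number `index` without reading earlier ones."""
    stream.seek(index * RECORD.size)
    rec_id, raw_name = RECORD.unpack(stream.read(RECORD.size))
    return rec_id, raw_name.rstrip(b"\x00").decode("ascii")

storage = io.BytesIO()
write_records(storage, [(1, "Ana"), (2, "Ben"), (3, "Cy")])

third = read_record(storage, 2)  # read the last record first
first = read_record(storage, 0)  # then jump back to the first
```

Records are read in an arbitrary order here, which is the property the definition describes.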

indexed sequential access method (ISAM)

records can be accessed sequentially or randomly, depending on the number being accessed. For a small number, random access is used, and for a large number, sequential access is used.

sequential access file structure

records in files are organized and processed in numerical or sequential order, typically the order in which they were entered.

inheritance

refers to new objects being created faster and more easily by entering new data in attributes.

encapsulation

refers to the grouping into a class of various objects along with their attributes and methods—meaning, grouping related items into a single unit. This helps handle more complex types of data, such as images and graphs.

Extraction, transformation, and loading (ETL)

refers to the processes used in a data warehouse. It includes extracting data from outside sources, transforming it to fit operational needs, and loading it into the end target (database or data warehouse).
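
The three ETL steps can be sketched with the Python standard library. Everything specific here is hypothetical: the CSV text stands in for an outside source, the cleaning rules are examples, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Stand-in for an outside source with inconsistent formatting.
RAW_CSV = "order_id,amount,region\n1, 19.99 ,east\n2,5.00,WEST\n3,,east\n"

def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with no amount, fix types and casing."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete rows
        region = row["region"].strip().lower()
        clean.append((int(row["order_id"]), float(amount), region))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(RAW_CSV)))
totals = dict(
    warehouse.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
```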

create, read, update, and delete (CRUD)

refers to the range of functions that can be performed on data. Data administrators determine who has permission to perform each function.
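
A minimal sketch of CRUD permissions, assuming a simple role-to-operations mapping. The role names and the `is_allowed` helper are hypothetical; a real DBMS would manage this through its own security features.

```python
# Hypothetical roles; a data administrator would maintain this mapping.
PERMISSIONS = {
    "clerk": {"create", "read"},
    "manager": {"create", "read", "update"},
    "dba": {"create", "read", "update", "delete"},
}

def is_allowed(role, operation):
    """Return True if the role may perform the given CRUD operation."""
    return operation in PERMISSIONS.get(role, set())

print(is_allowed("clerk", "read"))
print(is_allowed("clerk", "delete"))
```

Unknown roles get an empty permission set, so they are denied every operation by default.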

Descriptive analytics

reviews past events, analyzes the data, and provides a report indicating what happened in a given period and how to prepare for the future. Predictive analytics, as the name indicates, is a proactive strategy; it prepares a decision maker for future events.

distributed database management system (DDBMS)

stores data on multiple servers throughout an organization.

A database engine

the heart of DBMS software, is responsible for data storage, manipulation, and retrieval.

hierarchical model

the relationships between records form a treelike structure (hierarchy). Records are called nodes, and relationships between records are called branches. The node at the top is called the root, and every other node (called a child) has a parent. Nodes with the same parents are called twins or siblings.
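
The vocabulary above (root, parent, child, siblings) maps onto a small tree structure. The `Node` class and the sample records are hypothetical and only illustrate the terms.

```python
class Node:
    """One record in a hierarchy; every node except the root has one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)  # the branch from parent to child

    def siblings(self):
        """Nodes that share this node's parent (also called twins)."""
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]

root = Node("University")          # the root has no parent
business = Node("Business", root)
science = Node("Science", root)
mis = Node("MIS", business)        # a child of "Business"
```

Because each node stores a single `parent`, the sketch cannot express the multiple parents that the network model allows.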

primary key

uniquely identifies every record in a relational database. Examples include student ID numbers, account numbers, Social Security numbers, and invoice numbers.

relational model

uses a two-dimensional table of rows and columns of data. Rows are records (also called tuples), and columns are fields (also referred to as attributes).

database marketing

uses an organization's database of customers and potential customers to promote products or services.

Business analytics (BA)

uses data and statistical methods to gain insight into the data and provide decision makers with information they can act on.

data mart

usually a smaller version of a data warehouse, used by a single department or function.

query by example (QBE)

With query by example (QBE), you request data from a database by constructing a statement made up of query forms. With current graphical databases, you simply click to select query forms instead of having to remember keywords, as you do with SQL. You can add AND, OR, and NOT operators to the QBE form to fine-tune the query.

In summary, a database has the following advantages over a flat file system:

• More information can be generated from the same data.
• Complex requests can be handled more easily.
• Data redundancy is eliminated or minimized.
• Programs and data are independent, so more than one program can use the same data.
• Data management is improved.
• A variety of relationships among data can be easily maintained.
• More sophisticated security measures can be used.
• Storage space is reduced.

