Exam 2 combined set Study Guide(Chapters 4,5,6,7,9)

Ace your homework & exams now with Quizwiz!

Union Join

Includes all columns from each table in the join, and an instance for each row of each table

Tree search

Indexed File Organization uses ____

data warehouse

Informational systems are also known as the____

Decisions

Inputs in the Physical Design process lead to ____

Entity integrity

No primary key attribute may be null. All primary key fields MUST have data.

Inputs

Normalized relations, volume estimates, data security/backup needs, attribute definitions/physical specs, response time expectations, technology used, and integrity expectations are all _____ in the Physical Design Process?

common columns

The ______ in joined tables are usually the primary key of the dominant table and the foreign key of the dependent table in 1:M relationships

Time-variant

The ability to use a data warehouse to study trends and changes over time is ____

Data Mining

The goal of ___ is to explain observed events or conditions, confirm hypothesis, and exploring data for new or unexpected relationships.

irrelevant

The order of columns and rows in a relation must be ___

Schema`

The organization of data to create a blueprint of how a database will be constructed (divided into tables)

transitive dependencies

functional dependecies on non-primary-key attributes

Drill-down

going from summary to more detailed views in OLAP operation is ___

Range Control

limits allowable values for field (e.g., drivers license)

Data Type

limits type of data and length

data integrity

maintaining and assuring the accuracy and consistency of data

Multivalued Attributes

non-key attributes or groups of non-key attributes the values of which are not uniquely identified by (directly or indirectly) (not functionally dependent on) the value of the Primary Key (or its part).

Integer

positive/negative whole number up to 38 digits long

Subject oriented

products,customers, patients, students, etc are all part of ____ in data warehousing

Referential Integrity

range control for foreign-key to primary-key match-ups

Procedures

routines that do not return values and can take input or output parameters

Triggers

routines that execute in response to a database event

Base Tables

tables containing the raw data

ORDER BY

the ___ statement Sorts the result according to specified criteria

WHERE

the ___ statement will indicate the conditions under which a row will be included in the result

GROUP BY

the ____ statement will Indicate categorization of results

HAVING

the ____ statement will Indicate the conditions under which a category (group) will be included

FROM

the ____ statement will Indicate the table(s) or view(s) from which data will be obtained

Data Dictionary

the schema is stored in the ____

COUNT

use the ___ function to find totals

Default value

value-assumed value if no explicit value entered

VARCHAR@

variable-length character, must enter max length, only consumes space needed

transitive dependency

when a non-key attribute determines another non-key attribute.

Partial Dependency

when an non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key.

Identify columns that are unique(candidate keys). Select a Primary key

3rd and 4th steps of Table Creation

Identify primary key-foreign key mates. Determine default values

5th and 6th steps of Table Creation

Dynamic View

A "virtual table" created upon request by user. No data is stored; data from base tables is available to user. Based on SQL SELECT statement on base tables or other views

Data Mart

A 'mini' data warehouse that is limited in scope, i.e., a slice of the data warehouse.

Relational Database Management System(RDBMS)

A DBMS that manages data as a collection of tables in which all relationships are represented by common values in related tables

Data Mart; Data Warehouse

A ___ would have few internal and external sources while ___ has many

Data mart; Warehouse

A ____ is highly denormalized while a ___ is lightly denormalized

Hash Algorithm

A routine that converts primary key to record address. Locates records based on algorithm.

Natural Join

An equi-join in which one of the duplicate columns is eliminated in the result table

Join

An operation that causes two or more tables with a common field to be combined into a single table or view

Denormalization

Can improve performance (speed) by reducing number of table lookups (i.e. reduce number of necessary join queries)

Operational

Clerks, salespersons, and administrators are all primary users of the ____ System

Data Control Language(DCL)

Commands that control a database, including administering privileges and committing data

Data Definition Language(DDL)

Commands that define a database, including creating, altering, and dropping tables and establishing constraints

Data Manipulation Language(DML)

Commands that maintain and query a database

referential integrity

Constraint that states that foreign key values of a table must match primary key values of a related table

Materialized View

Copy of data that is actually stored on disk. Must be refreshed periodically to maintain sync to the base tables.

Goal of Physical Database Design

Create a design for storing data that will provide adequate performance and insure database integrity, security, and recoverability

Informational System

The primary purpose of ___ is to support managerial decision making?

Operational System

The primary purpose of ____ is to run the business on a current basis?

Indexed File Organization

The storage of records either sequentially or nonsequentially with an index that allows software to locate individual records

Schema

The structure that contains descriptions of objects created by a user (base tables, views, constraints)

Online Analytical Processing(OLAP)

The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple techniques

Functional Dependency

The value of one attribute determines the value of another attribute

Operational

The volume of ___ System includes many constant updates and queries on one or a few table rows

File Organization

Fast data retrieval, efficient storage space utilization, protection from failure/ data loss, low need for reorganization, growth accomodation, are all factors for selecting _____.

logical views of the data warehouse.

In logical Data Mart and Real Time Warehouse Architecture data marts are NOT seperate databases but _______. making it easier to create new data marts

Informational

The volume of ____System includes periodic batch updates and queries requiring many or all rows

unnormalized

There are multivalued attributes or repeating groups is ____

a seperate relation with a foreign key taken from the original entity

When transforming entities to relations, a multivalued attribute becomes___

their simple, component attributes

When transforming entities to relations, if you have composite attributes only use ___

Referential Integrity

You cannot have an order for a nonexistent customer. You cannot delete a customer that has an order. These are examples of keeping___

Vertical Partioning

You would use ____ if different users need access to different rows

Vertical Partioning

You would use _____ if different users need access to different columns

Operational

___ Systems usage is narrow, planned, and have simple updates and queries

Relation

___ is a named, two demensional table of data

Data Mart

___ is specific DSS application and is decentralized by user area. Also, is organic and possibly not planned

Normalization

___ is the process of decomposing relations with anomalies to produce smaller, well-structured relations

Procedures

____ are called explicitly

Trigger

____ are event-driven.

Company-Wide View

____ can lead to inconsistent key structures/ data values ,synonyms, missing data, and uses free-form instead of structured fields

Fact Tables

____ contain factual or quantitative data. provides statistics for sales broken down by product, period and store dimensions

Operational/Informational

____ design goal is performance(throughput, availability) while ____ design goal is ease of flexible access and use

Data Warehouse

____ is application independent, centralized(possibly enterprise wide) and planned

DELETE

____ statement allows you to remove rows from a table

DROP TABLE

____ statement allows you to remove tables

SELECT

____ statement is used to List the columns (and expressions) that should be returned from the query

Fractals

____ technique in Data mining is used to compress large databases without losing information

Case-based reasoning

____ technique in Data mining is used to derive rules from real-world case examples

Neural nets

____ technique in Data mining is used to develop predictive models based on principles modele after the human brain

Sequence association

____ technique in Data mining is used to discover cycles of events and behaviors

Affinity

____ technique in Data mining is used to discover strong mutual relationships

Clustering and signal processing

____ technique in Data mining is used to discover subgroups or segments

Rule Discovery

____ technique in Data mining is used to search for patterns and correlations in large data sets

Decision Tree Induction

____ technique in Data mining is used to test or discover if ... then rules for decision propensity

Regression

____ technique in Data mining is used to test or discover relationships from historical data

Candidate Keys

____ uniquely identify each entity instance (row) and one will become the primary key

Informational

_____ Systems usage is broad, ad hoc, and have complex queries/ analysis

Embedded SQL

_____ includes hard coded SQL statements in a program written in another languance such as Jave and creates a more flexible, accessible interface for the user; sometimes creates performance improvement; always provides security improvements

Dynamic SQL

______ has the ability for an application program to generate SQL code on the fly, as the application is running, is more flexible for ad hoc(random queries)

Insertion Anomaly

____adding new rows forces user to create duplicate data

Deletion Anomaly

____deleting rows may cause a loss of data that would be needed for other future rows

ALIAS

____is an alternative column or table name. It is often used for brevity with long table names.

Modification Anomaly

___changing data in a row forces changes to other rows because of duplication

simple

a key with a single field or attribute are called____

composite

a key with more than one field is called ___

Informational System

a system designed to support decision making based on a historical point-in-time and prediction data for complex queries or data mining applications

Operational System

a system that is used to run a business in real time, based on current data

DATE

actual date

1st Normal form

all relations are in ____

Null value control

allowing or prohibiting empty fields

BLOB

binary large object capable of storing 4 gigabytes of binary data, good for graphics, sound clips, etc

Cube slicing

coming up with 2-D view of data is ___ in OLAP operation

Dimension Tables

contain descriptions about the objects (e.g., products and stores) of the business

CREATE VIEW

defines a logical table from one or more tables or views

CREATE TABLE

defines a new table and its columns

CREATE SCHEMA

defines a portion of the database owned by a particular user

Physical Record

A group of fields stored in adjacent memory locations and retrieved together as a unit

Equi-join

A join in which the joining condition is based on matching the values of the common columns.. Common columns appear redundantly in the result table

Physical File

A named portion of memory allocated for the purpose of storing physical records

Structured Query Langauge(SQL)

A special-purpose programming language designed for managing data RDBMS.

Data Warehouse

A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes

Index

A table or other data structure used to determine in a file the location of records that satisfy some condition

extract, transform, and load(ETL)

All data warehouse architectures involve some form of ____

Informational

An ____ System provides data that shows a historical point in time snapshot and predictions

Operational

An _____ System provides data that represents the current state of the business

repeating groups

Another name for multivalued attributes is ____

Decisions

Attribute data types, Physical record descriptions, file organizations, indexes/database architectures, and query optimization are all types of ____ in the Physical Design process?

depth of the tree

Average time to find desired record=

Integrated

Being ___ includes consistent naming conventions, formats, and encoding structures; from multiple data sources

Multidimensional OLAP (MOLAP)

Cube structure of online analytical processing

Maintenance&Implementation

DCL is part of the ___&____ database development processes

Physical Design&Maintenance

DDL is part of the ___&____ database development processes

implementation

DML is part of the ___ database development process

Project; Data

Data Marts are ___ oriented while Data warehouses are ___ oriented

Horizontal Partitioning

Distributing the rows of a table into several seperate files.

Non-updatable

Data warehouses are read only, and are periodically refreshed.This makes them ___

EDW

Dependent Data marts are loaded from ____

denormalized

Dimension tables are ___ to maximize performance

Vertical Partioning

Distributing the columns of a table into several seperate relations

functionally dependent

Each non-key attribute (column, field) is ____ on every primary and candidate key

Second Normal Form(2NF)

Every non-key attribute must be defined by the entire primary key (not just part of the key) is in ____

unique

Every row in a Relation must be ___

Identify data types for attributes. Identify columns that can/cannot be null

First two steps of Table Creation

CHAR

Fixed-length character, max 2000 characters, default length is 1 character

atomic(not multivalued, not composite)

For a table to qualify as a Relation it must have a unique name and every attribute value must be ____

primary or candidate key

For referential integrity to hold, any field in a table that is declared a foreign key can contain only values from a parent table's ______

Partitioning

Form of denormaliztion that creates more tables. Split tables in multiple sub tables. Often corresponds with user schemas or views.

foreign to primary key references

How are referential integrity constraints implemented in a DBMS?

Triggers

INSERT,UPDATE, or DELETE are all examples of ____

Sequential File Storage

If using ____ you must start at the beginning and scan until you find the desired record(not used as much because inflexible)

Logical Data Mart and Real Time Warehouse Architecture

In ___ ODS and data warehouse are one and the same

Dependent Data Mart with Operational Data Store

In ___ architecture ODS provides option for obtaining current data.

Independent Data Mart

In ____ architecture there are Seperate ETL for each data mart. It also creates data access complexity due to multiple data marts

Enterprise data warehouse(EDW)

In a Dependent Data Mart with Operational Data Store a single ETL is for ____

Data Warehouse

In a Logical Data Mart and Real Time Warehouse Architecture Near real-time ETL is for ___

attributes or field(StudentName, StudentID)

In a relation table columns are ___

entity instances (records)

In a relation table rows are ___

Data Mining

Knowledge discovery using a blend of statistical, AI, and computer graphics techniques

Identify constraints on columns(domain specifications). Create the table(and indexes if appropriate)

Last two steps of Table Creation

Informational

Managers, business analysts, customers are all the primary users of the ____ System

Tablespace

Named set of disk storage elements in which physical files for database tables can be stored

2NF

No partial functional dependencies Would be _____ form

Database

Operational System is also known as the ___

secondary keys

Other fields or combinations of fields that can also be indexed are called____

Indexed

Primary keys are automatically ____

Routines

Program modules that execute on demand

entity types

Relations are ____(Student)

Functions

Routines that return values and take input parameters

Inner Join

Rows must have matching values in order to appear in the result

Outer Join

Rows that do not have matching values in common columns are still included in the result

Field

Smallest unit of data in a database

Index

Speed up random/sequential access to base table data. Similar to the back of a textbook

online transaction processing

Star Schema is bad for _____

ad-hoc

Star Schema is excellent for ____ queries.

TIMESTAMPTIMESTAMP WITH LOCAL TIME ZONE

Stores a moment an event occurs, using a definable fraction-of-a-second precision. Value adjusted to the users session time zone

BINARY LARGE OBJECT(BLOB)

Stores binary string values in hexadecimal format. defined to be a variable length.

NUMERIC

Stores exact numbers with a defined precision and scale.

Integer(INT)

Stores exact numbers with a predefined precision and scale of zero.

CHARACTER VARYING(VARCHAR or VARCHAR2)

Stores string values containing any characters in a character set but of definable variable length.

Character(CHAR)

Stores string values containing any characters in a character set. Is defined to be a fixed length

BOOLEAN

Stores truth values:TRUE, FALSE, or UNKNOWN

ALTER TABLE

The _____ statement allows you to change column specifications

INSERT

To add one or more rows to a table you would use the ____ statement

UPDATE

To modify data in an existing row/rows you would uses the ___ Statement

Denormalization

Transforming normalized relations into non-normalized physical record specifications

Purpose of Physical Database Design

Translate the logical design/description of data into the technical specifications for storing and retrieving data

wildcard*

Use the _____ after SELECT for everything in the table

AND, OR and NOT

Use the _______ operators for customizing conditions in the WHERE clause

1:N

What type of relationships between dimension tables and fact tables?

data integrity

When choosing data types you want to improve ____?

storage space

When choosing data types you want to minimize ____?

all possible values of the attribute

When choosing data types you want to represent ___?

data manipulations

When choosing data types you want to support ____?

Hashed

Which type of file organization is best for adding new records?

Hashed

Which type of file organization is best for updating records?

Sequential

Which type of file organization is the best for not wasting storage space?Worst?

Hashed

Which type of file organization would be best for Deleting records?

Indexed

Which type of file organization would be best for Multiple-key retrieval?

Sequential

Which type of file organization would not be used for random retrieval on primary key?

Hashed; impractical unless using a hash index

Which type of file organization would you not want to use for Sequential retrieval on primary key?

Data warehouse

Which would have a longer life a data warehouse or a Data Mart?

Warehouse; Mart has one central subject of concern to users

Which would have more subjects a Data warehouse or Mart?

extra work

With hashed file organization, when adding new records multiple keys with the same address would require?

Data warehouse

Would a Data mart or Data Warehouse be more flexible?


Related study sets

FSL421 certains aspects de grammaire

View Set

Cardiovascular Alterations CC Questions

View Set

QuickBooks ProAdvisor Certification Questions

View Set