245 Review

Pataasin ang iyong marka sa homework at exams ngayon gamit ang Quizwiz!

Integrated data warehouse

Meaning that hit is a centralized, consolidated database that integrates data derived from the entire organization and from multiple sources with diverse formats. Data integration implies that all data is described in the same way throughout the organization

MOLAP

Multidimensional online analytical processing: This extends OLAP functionality to multidimensional database management systems . A MDBMS uses techniques to store data in a matrix like structure, because they believe that multidimensional databases are best suited to manage, store, and analyze multidimensional data. MDBMS visualize the stored data as a data cube. They are much quicker than using ROLAP. Used for very large databases and are faster for larger data sets with predefined dimensions.

BI data

Non normalized, historic data. Generally stored in a few tables and represent transaction summaries. BI data give tactical and strategic business meaning to the operational data. Cover a long period of time, more general, and include many data dimensions. Periodic updates are required.

Outer join

Not only returns the rows matching the join condition, but also the rows with unmatched values. When you have a left outer join, the unmatched values appear on the left and vice versa.

Facts (star schema)

Numeric measurements that represent a specific business aspect or activity. Facts are stored in the center of the star schema and are linked through their dimensions. EX: sales figures

Subject oriented data warehouse

Organized and summarized by topic, such as sales, marketing, or finance. For each topic, the warehouse contains specific subjects of interest

UNION query

Perfect tool for generating a combined listing of customers, one that excludes duplicate records. This combines rows from two or more queries without including duplicate rows (combines the output of two SELECT queries)

Governance

Business intelligence is a method for controlling and monitoring business health for consistent decision making

create an index in SQL

CREATE INDEX <name> ON <attribute>

JOIN USING vs. JOIN ON

Can be used interchangeably most of the time. JOIN USING can only be used when the two tables have the same attribute name. JOING <table> USING ( <fieldname>); JOIN ON can be used when the two tables do not have the same field name but they are the same thing. JOIN <table> ON <table1.fieldname=table2.fieldname>

Updatable view

Can be used to update attributes in any base table used in the view. However, not all views are updatable. One easy way to determine whether a view can be used to update a base table is to examine the view's output. If the primary key columns of the base table you want to update still have unique values in the view, the base table is updatable.

PL/SQL

Procedural language sequel: this is a language that makes it possible to use and store procedural code and SQL statements within the database and to merge SQL with transitional programming constructs, such as variables, loops, and error trapping. Most commonly used with triggers and stored procedures PL/SQL functions are mainly invoked within PL/SQL programs such as triggers and stored procedures.

Dimensions (star schema)

Qualifying characteristics that provide additional perspectives to a given fact. Dimensions are an area of interest because decision support data are almost always viewed in relation to other data. Dimensions are the magnifying glass through which you study the facts.

Key performance indicators

Quantifiable numeric or scaled based measurements that assess the company's effectiveness and success in reaching its strategic and operational goals.

BI architecture

Ranges from highly integrated single-vendor systems to loosely integrated, multivendor environments. Includes ETL tools, data storage, query and reporting, data visualization, data altering and monitoring, and data analytics

Integrity of security

Concerned with keeping data consistent and free of errors. ex: shredding or using encryption to copy data

Availability of security

Refers to the accessibility of data whenever required by authorized users and for authorized purposes.

Concurrency Control

Coordinating the simultaneous execution of transactions in a multi-user database. The objective is to ensure the serializability of transactions in a multiuser database environment. Using a database between multiple users and environments can create data integrity problems

Very large databases

DBMS must be required to support advanced storage technologies and support multi-processor technologies. Drive the need for data warehouses

Delete a value in SQL

DELETE FROM <table> WHERE <condition>

To drop a table you have already completed

DROP TABLE <table> CASCADE CONSTRAINT ;

DA

Data administrator: reports directly to top management and is given a higher degree of responsibility and authority than the DBA. The DA is responsible for controlling the overall corporate data resources He sets long term goals and policies. The job is more broad in scope, and focuses on the long term goals.

Data as an asset

Data can translate into information, which if used accurately and timely, it can enhance a companies competitive advantage

DDL

Data definition language: used to create tables, indexes, commands

Table space/file group (architecture)

Data files are grouped in table space or a file group. This is a logical grouping of several data files that store data with similar characteristics. EX: specific table space where data dictionary information is

Star schema

Data modeling technique used to map multidimensional decision support data into a relational database. Creates the equivalent of a multidimensional database schema from the existing relational database.

DBA

Database administrator: The focal point for data and user interaction. the DBA defines and enforces the procedures and standards to be used by programmers and end users during their work with the DBMS. Responsible for controlling, monitoring, and allocating resources to people and data. Gathers user requirements, building user end confidence, resolves conflicts, finds solutions to information needs, ensuring quality of data

DML

Date manipulation langage: used to insert values, update, and delete values

ROLAP

Relational online analytical processing: PRovides OLAP functionality by using relational databases to analyze multidimensional data. Very easy to use and manipulate This approach builds on existing relational technology and represent a natural extension to companies that already use relational database management systems within their organizations ROLAP uses to star schema to enable relational database technology to support multidimensional data representations.

Time variant data warehouse

Represent the flow of data through time. Data is periodically updated

Atomicity

Requires that all operations of a transaction be completed and if not then the transaction is aborted. If a transaction has four database requests, all four requests must successfully be completed otherwise it is aborted.

Data mart

Small, single subject data warehouse subset that provides decision support to a small group of people. Can be created from the data extracted for a larger data warehouse. Powerful computers can provide a customized decision support system to small groups in ways that might not be possible with a centralized system

Operational data

Stored in a relational database in which the structures are highly normalized. Optimized to store transactions that represent daily operations. Cover a short period of time, atomic detailed data, focuses on representing individual transactions rather than the effects of transactions over time.

Master data management

Technology is not enough in BI. Master data management is a collection of concepts, techniques, and processes for the proper identification, definition, and management of data elements within an organization. The main goal is to provide a comprehensive and consistent definition of all data within an organization. (uniform and consistent data)

Business Intelligence

Describes a comprehensive, cohesive, integrated set of tools and processes used to capture, collect, integrate, store, and analyze data with the purpose of generating and presenting information to support business decision making. Allows a business to transform data into information, then into knowledge, then into wisdom. Helps identify opportunities to create a competitive advantage Has the potential to affect a company's culture by creating continuous business performance improvement through active decision support at all levels in an organization.

Attributes (star schema)

Each dimension table contains attributes. Attributes are often used to search, filter, and classify facts. Dimensions provide descriptive characteristics about the facts through their attributes.

Data analytics

Encompasses a wide range of mathematical, statistical, and modeling techniques with the purpose of extracting knowledge from data. Represent what business managers really want from BI- the ability to extract actionable business insight from current events and foresee future problems.

What type of integrity is enforced when a primary key is declared

Entity integrity is the condition in which each row (entity instance) in a table has its own unique identity. To ensure entity integrity, the primary key has two requirements: all the values in PK must be unique and cannot contain a null

NOT used in SQL

Finds the rows that do not match a criteria WHERE NOT( attribute=);

Database request

The equivalent of a single SQL statement in an application program or transaction. Ex: three SQL statements is equivalent to three database requests.

IN used in SQL

The in operator uses a value list. All of the values must be of the same data type WHERE <attribute> IN (1, 2, 3);

Fact table

The middle of the star schema and contains facts that are linked through their dimensions. Each fact table is designed to answer specific decision support questions.

OLAP

A BI style whose systems share three main characteristics: multidimensional data analysis, advanced database support, and easy to use interfaces Designed to use both operational and BI data. The objective is to increase the speed of data access and data visualization

Lock

A lock guarantees exclusive use of a data item to a current transaction. In other words, transaction 2 does not have access to a data item that is currently being used in transaction

Data cache/ buffer cache (architecture)

A shared, reserved memory area that stores the most recently accessed data blocks in RAM. This also caches system catalogue data and contents of the indexes

SQL/ Procedure Cache (architecture)

A shared, reserved memory area that stores the most recently executed SQL statements, including triggers and functions.

Consistent database state

A successful database transaction changes the database from one consistent state to another. A consistent database state is one in which all data integrity constraints are satisfied.

Data files (architecture )

All data in a database are stored in data files. A typical enterprise database is normally composed of several data files. A data file can contain rows from a single table or it can contain rows from many different tables.

Transaction

Any action that reads from or writes to a database. A logical unit of work that must be entirely completed or entirely aborted; no intermediate states are acceptable meaning that no transaction can be partially completed COMMIT is used to end a successful transaction

ACID TEST

Atomicity, consistency, isolation, durability. Each transaction must display these

Explanatory analytics

Focuses on discovering and explaining data characteristics and relationships based on existing data. This uses statistical tools to formulate hypotheses and answer questions about relationships

Predictive analytics

Focuses on predicting future data outcomes with a high degree of accuracy. Uses statistical tools to answer questions about the future.

Data cube

How end users visualize stored data. Data cubes are created by extracting data form the operational databases or from the data warehouse. You cannot change a data cube

Insert values into a table

INSERT INTO <table> VALUES (); - Character and date values must be entered between apostrophes, number values do not need to

Pessimistic locking

The use of locks based on the assumption that conflict between transactions

Slice and dice

In multidimensional terms, the ability to focus on slices of the cube to perform a more detailed analysis. To slice and dice, it must be possible to identify each slice of the cube. To do so, you use the values of each attribute in a given dimension. For example, to use the location dimension, you might need to define a STORE ID to a particular store.

Indexes

These are crucial in speeding up data access because they facilitate searching, sorting, and using aggregate functions An index is an ordered set of values that contains the index key and pointers. An index scan is more efficient than a full table scan because the index data are preordered and the amount of data is usually much smaller. Index when the field appears in a where or having clause, order or group by, high data sparsity, high index selectivity.

Durability

This ensure that once transaction changes are done and committed, they cannot be undone or lost even in the event of a system failure.

Serializability

This ensures that the schedule for the concurrent execution of the transactions yields consistent results. This is important in multiuser and distributed databases in which multiple transactions are likely to be executed at the same time.

Lock gradularity

This indicates the level of lock use. Database lock, table level locking, page level lock, row level low, or field level lock

Consistency

This indicates the permanence of the database's consistent state. A transaction takes the database from one consistent state to another

Inner query/ subquery

This is how you add rows to a table using another table as the source of the data. A subquery/inner query is a query that is embedded inside another query. The inner query is executed first

Isolation

This means that the data used during the execution of a transaction cannot be used by a second transaction until the first one is completed. This is used in multiuser environments because several users access the database at the same time.

Database security officer

This persons job is to ensure database security and integrity. Planning, organizing, and testing of database contingency plans and recovery procedures

Batch update routine (updatable view)

This pools multiple transactions into a single batch to update a master table field in a single operation. For example, used to update a product's quantity on hand based on summary sales transactions.

Data mining

This refers to analyzing massive amounts of data to uncover hidden trends, patterns, and relationships. To use computer models, explain the findings, and then make business decisions. Data warehouses are a good source of data mining operations because the data has already been cleaned.

Periodicity

Time is the most common dimension used in business analysis. Therefore it is very common to have one fact table for each level of aggregation defined within the time dimension.

I/O requests (architecture)

To work with data, the DBMS must retrieve the data from permanent storage and place them in RAM or the data cache. To move data from permanent storage, the DBMS issues I/O requests and waits for replies. Working with data in the cache is faster because you do not have to wait for retrieval.

Extraction, transformation, and loading (ETL)

Tools to collect, filter, integrate, and aggregate internal and external data to be saved into a data store optimized for decision support.

Snowflake schema

Type of star schema in which the dimension tables can have their own dimension tables. The snowflake schema is usually the result of normalizing dimension tables

Union compatible

UNION only works properly if relations are union compatible, which meant the number of attributes must be the same and their corresponding data types must be alike. EX: Char and Varchar are similar, as are number and small integer

Update a value in SQL

UPDATE <table> SET <attribute> WHERE <condition>

Update/delete cascade

Update: this ensures that if you make changes to the table you referenced, the change is updated throughout the entire system Delete: if you delete a value in the table you referenced the change is updated throughout the entire system

Transaction log

Used by a DBMS to keep track of all transactions that update the database. The DBMS uses this information for a recovery requirement triggered by a ROLLBACK statement or to recover the database forward to a consistent state. A transaction log is costly, but the ability to restore a corrupted database is worth it.

GROUP BY in SQL

Used when you have attribute columns combined with aggregate functions in the SELECT statement

Between used in SQL

WHERE <attribute> BETWEEN <> AND <> - can also be used with OR in the where line

Dashboards

Web based technologies to present key business performance indicators or information in a single integrated view, generally using graphics that are clear concise, and easy to understand

Systems administrator

When a business has several incompatible DBMS to support different operations, the company might have one DBA assigned to each DBMS. The coordinator of all of the DBAs is called the system administrator

Inconsistent retrievals

When a transaction accesses data before and after one or more other transactions finish working with the data. The transaction might read some data before they are changed, thus yielding inconsistent results

Lost update problem

When two concurrent transactions are updating the same data element and one of the updates is lost or overwritten by the other transaction

Uncommitted data

When two transactions are executed concurrently and the first transaction is rolled back after the second transaction has already accessed the uncommitted data

Recursive join

When you join a table to itself

IN subqueries

When you want to compare a single attribute to a list of values, you use the IN operator in the sub query.

Multidimensional data analysis

Where data is processed and viewed as part of a multidimensional structure. This type of data is important to businesses because they tend to view business data as being related to other business data.

Data warehouse

a repository that contains data in formats that facilitate data extraction, data analysis, and decision making. An integrated, subject oriented, time variant, consolidated database that integrates data derived from the entire organization and from multiple sources with diverse formats.

Security

activities that ensure confidentiality, integrity, and availability of an information system and its main asset. You cannot secure data if you do not secure all the processes and systems around it.

Dirty Data

data that suffers from inaccuracies and inconsistencies and is a huge threat to organizations. Some dirty data can come form the movement of data across systems, as in the creation of a data warehouse

Confidentiality of security

ensuring that data are protected against unauthorized access, and if the data are accessed by an authorized user, that the data is used for an authorized purpose

ROLLBACK transaction

if you lose power or do not finish a transaction, the database is in an inconsistent state and it is not usable for transactions. assuming the DBMS supports transaction management, the DBMS will roll back the database to the previous consistent state

Stored procedure

named collection of PL/SQL statements. They can be used to encapsulate and represent business transactions. For example, you can create a stored procedure to represent a product sale and that stored procedure can be executed in a single transaction.

Nonvolatile data warehouse

once entered in a data warehouse, they can never be removed

View

this is a virtual table based on a SELECT query. The query can contain columns, computed columns, and aggregate functions. The tables on which the view is based are called base tables. DDL language. Views are dynamically updated meaning that it is re-created on demand each time it is invoked. Views create a level of security because they can restrict users to only seeing certain columns and rows in a table.


Kaugnay na mga set ng pag-aaral

ART-031-S002: Introduction to Art

View Set

Chapter 7 Introduction to Databases

View Set

Human Motivation Final Cummulative

View Set