IS 300: 5.4 Data Warehouses and Data Marts


Unfortunately, many source systems that have been in use for years contain "______ _______"—for example, missing or incorrect data—and they are poorly documented. As a result, __________ should be used at the beginning of a warehousing project to better understand the data.

1.) "Bad data" 2.) data-profiling software
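As a rough illustration of what data profiling looks for, the sketch below counts missing values per column and flags duplicated keys. The rows, column names, and the `profile` helper are hypothetical; commercial data-profiling software does far more than this.

```python
from collections import Counter

# Hypothetical customer rows pulled from a source system (illustrative only).
rows = [
    {"customer_id": 1, "email": "ann@example.com", "zip": "62901"},
    {"customer_id": 2, "email": None, "zip": "6290"},
    {"customer_id": 2, "email": "bob@example.com", "zip": "62902"},
    {"customer_id": 3, "email": "", "zip": None},
]

def profile(rows, key):
    """Count missing values per column and list key values that repeat."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value == "":
                missing[column] += 1
    key_counts = Counter(row[key] for row in rows)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)
    return dict(missing), duplicates

missing, duplicates = profile(rows, "customer_id")
print(missing)     # missing-value counts per column
print(duplicates)  # key values that appear more than once
```

A report like this is only a starting point: it shows where the "bad data" is, not how to fix it.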

What are some other data transformations that might take place?

1.) Format changes to the data 2.) Aggregations performed (e.g., on sales figures) 3.) Data-cleansing software may be used to clean up the data (e.g., eliminating duplicate records for the same customer).
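The three transformations listed above can be sketched in a few lines. This is a minimal example under stated assumptions: the `sales` records are made up, dates arrive in US `MM/DD/YYYY` format, and two records count as duplicates only when every field matches.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sales records from a source system.
sales = [
    {"customer_id": 7, "date": "03/15/2024", "amount": 120.0},
    {"customer_id": 7, "date": "03/15/2024", "amount": 120.0},  # duplicate record
    {"customer_id": 9, "date": "03/16/2024", "amount": 80.0},
    {"customer_id": 7, "date": "04/02/2024", "amount": 50.0},
]

def to_iso(us_date):
    """Format change: MM/DD/YYYY -> YYYY-MM-DD."""
    return datetime.strptime(us_date, "%m/%d/%Y").date().isoformat()

def transform(records):
    # Cleansing: drop exact duplicates while keeping the original order.
    seen = set()
    cleaned = []
    for r in records:
        row = (r["customer_id"], to_iso(r["date"]), r["amount"])
        if row not in seen:
            seen.add(row)
            cleaned.append(row)
    # Aggregation: total sales per month (first 7 characters of the ISO date).
    monthly = defaultdict(float)
    for _, iso_date, amount in cleaned:
        monthly[iso_date[:7]] += amount
    return cleaned, dict(monthly)

cleaned, monthly = transform(sales)
print(cleaned)
print(monthly)
```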

Typically, organizational databases are oriented toward ______. That is, databases use _______, where business transactions are processed online as soon as they occur. What are the objectives? Data warehouses and data marts, which are designed to support decision makers but not _______, use _______, which involves the analysis of _____ by end users.

1.) Handling transactions; Online Transaction Processing (OLTP) 2.) The objectives are SPEED and EFFICIENCY 3.) OLTP; online analytical processing (OLAP); accumulated data

What's the trend when it comes to SOURCE DATA? What do these source systems often use?

1.) Include more types of data (e.g., sensing data from RFID tags). 2.) Different software packages (e.g., IBM, Oracle) 3.) They store data in different formats (e.g., relational, hierarchical)

What does it mean when we say that data warehouses and data marts are nonvolatile? Therefore, what does the warehouse or mart reflect? What is this critical for?

1.) It means that users cannot change or update the data. 2.) It reflects history 3.) This is critical for identifying + analyzing trends

What are some source systems that modern organizations can select from?

1.) Operational/transactional systems 2.) Enterprise resource planning (ERP) systems 3.) Website data 4.) Third-party data (e.g., customer demographic data)

What are the 6 basic characteristics of data warehouses and data marts?

1.) Organized by business dimension or subject 2.) Use online analytical processing 3.) Integrated: data are collected from multiple systems and are then integrated around subjects 4.) Time variant: maintain historical data 5.) Nonvolatile 6.) Use a multidimensional structure

Data warehouses and data marts are ______. What's eliminated?

1.) Read only 2.) Extra processing is eliminated b/c data already contained in the data warehouse are not updated

What does the environment for data warehouses and marts include?

1.) Source systems that provide data to the warehouse or mart 2.) Data-integration tech + processes that prepare the data for use 3.) Different architectures for storing data in an organization's data warehouse or data marts 4.) Different tools + applications for the variety of users. 5.) Metadata, data quality, and governance processes that ensure that the warehouse or mart meets its purposes.

Once the data are loaded in a data mart or warehouse, they can be _____. What does the organization begin to obtain?

1.) accessed 2.) Business value from BI

When it comes to DATA EXTRACTION, most companies employ ______. This software makes it relatively easy to ______, ______, ______, and _____

1.) commercial software 2.) (1) specify the tables and attributes in the source systems that are to be used; (2) map and schedule the movement of the data to the target, such as a data mart or warehouse; (3) make the required transformations; and, ultimately, (4) load the data.

The _____ of the data in the warehouse must meet users' needs. If it does not, then....

1.) data quality 2.) then users will not trust the data and ultimately will not use it.

Most organizations find that the quality of the data in source systems is poor and must be improved before the data can be used in the _________. Some of the data can be improved with _______. The better, long-term solution, however, is to _______. This approach requires the _____ of the data to assume responsibility for making any necessary changes to implement this solution.

1.) data warehouse 2.) Data-cleansing software 3.) improve the quality at the source system level 4.) business owners

In addition to storing data in their source systems, organizations need to ______ the data, ______ them, and then ________. What is this process often called?

1.) extract; transform; load them into a data mart or warehouse 2.) ETL (AKA data integration)
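A minimal sketch of the extract, transform, load sequence, using two in-memory SQLite databases to stand in for a source system and a warehouse. The table names, columns, and the choice to aggregate by region are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical source system with inconsistent region spellings.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 1250), (2, "EAST", 750), (3, "west", 2000)],
)

# Hypothetical warehouse table organized by subject (region).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total_dollars REAL)")

def extract(conn):
    """Extract: pull only the attributes the warehouse needs."""
    return conn.execute("SELECT region, amount_cents FROM orders").fetchall()

def transform(rows):
    """Transform: normalize region names, aggregate, convert cents to dollars."""
    totals = {}
    for region, cents in rows:
        key = region.strip().lower()
        totals[key] = totals.get(key, 0) + cents
    return [(region, cents / 100) for region, cents in sorted(totals.items())]

def load(conn, rows):
    """Load: write the prepared rows into the warehouse table."""
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", rows)
    conn.commit()

load(warehouse, transform(extract(source)))
result = warehouse.execute(
    "SELECT region, total_dollars FROM sales_by_region ORDER BY region"
).fetchall()
print(result)
```

Keeping the three steps as separate functions mirrors the way the process is named; real data-integration tools also handle scheduling and error recovery, which this sketch omits.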

Data extraction can be performed either by _____ such as ______ or by _______.

1.) handwritten code; SQL queries 2.) commercial data-integration software

When it comes to TIME VARIANT, data warehouses and data marts maintain ______ data; that is, data that include ______ as a variable. Unlike transactional systems, which maintain only ____ data (such as for the last day, week, or month), a warehouse or mart may store _____. Organizations use historical data to _____, _____, and ______

1.) historical; time; recent; years of data 2.) detect deviations, trends, and long-term relationships.

Finally, after the data are transformed, they are ________. This window is becoming ______ as companies seek to store ever-fresher data in their warehouses. What have companies moved to as a result?

1.) loaded into the warehouse or mart during a specified period known as the "load window." 2.) smaller 3.) They have moved to REAL-TIME DATA WAREHOUSING where data are moved using data-integration processes from source systems to the data warehouse or mart almost instantly.

There is typically some "__________ _______ _______"—that is, a _______—that motivates a firm to develop its BI capabilities. What does this pain lead to? What do these data requirements range from?

1.) organizational pain point; business need 2.) Information requirements, BI applications, and requirements for source system data 3.) From a single source system (as in the case of a data mart) to hundreds of source systems (as in the case of an enterprise-wide data warehouse).

The most successful companies are those that can ______ and _______. What is the key to this response? What's the challenge?

1.) respond quickly; flexibly to market changes and opportunities 2.) The effective + efficient use of data and information by analysts and managers. 3.) The challenge is to provide users with access to corporate data so that they can analyze the data to make better decisions.

Organizations can choose from a variety of architectures to _______. What's the most common architecture? Why?

1.) store decision-support data 2.) One central enterprise data warehouse, w/o data marts; b/c data stored in the warehouse are accessed by all users + represent the single version of the truth.

Definition: DATA MART

A low-cost, scaled-down version of a data warehouse that is designed for the end-user needs in a strategic business unit (SBU) or an individual department.

What do companies also establish when it comes to middle management?

A middle management-level committee that oversees the various projects in the BI portfolio to ensure that these projects are being completed in accordance with the company's objectives.

What do companies that are effective in BI governance often create?

A senior-level committee composed of vice presidents and directors who (1) ensure that the business strategies and BI strategies are in alignment (2) prioritize projects (3) allocate resources.

While data are organized by subject (e.g., by customer, vendor, product, price level, and region) in a data mart or data warehouse, how are data organized in TRANSACTIONAL SYSTEMS?

By business process such as order entry, inventory control, and accounts receivable.

What do all of the prior stages constitute?

Creating BI infrastructure.

Definition: METADATA

Data about data in a repository

To ensure that BI is meeting their needs, organizations must implement _______ to plan and control their BI activities.

Governance

This data warehouse architecture contains a central data warehouse that stores the data plus multiple dependent data marts that source their data from the central repository. Because the marts obtain their data from the central repository, the data in these marts still comprise the single version of the truth for decision-support purposes.

Hub and spoke

Warehouses and data marts are updated, but through _______ rather than by ______.

IT-controlled load processes; users

Another architecture is _______. These marts store data for a single application or a few applications, such as marketing and finance. Organizations that employ this architecture give only limited thought to _______ for other applications or by other functional areas in the organization. Clearly this is a very _____-centric approach to storing data.

Independent data marts; how the data might be used; application

There are many potential BI users, including IT developers; frontline workers; analysts; information workers; managers and executives; and suppliers, customers, and regulators. Some of these users are _______, whose primary role is to create information for other users. IT developers and analysts typically fall into this category. Other users—including managers and executives—are _________, because they use information created by others.

Information Producers; information consumers

Definition: QUERY-BY-EXAMPLE (QBE)

Method of creating database queries that allows the user to search for documents based on an example in the form of a selected string of text or in the form of a document name or a list of documents.

(Multidimensional data structure) Typically, the data warehouse or mart uses a ________ data structure.

Multidimensional

Definition: DATA WAREHOUSE

Repository of historical data that are organized by subject to support decision makers within the organization.

Definition: MULTIDIMENSIONAL STRUCTURE

Storage of data in more than two dimensions; a common representation is the data cube.
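One way to picture a data cube is as a mapping from (product, region, year) coordinates to a measure. The sketch below is an assumption-laden toy: the three dimensions, the `facts` rows, and the `slice_total` helper are invented for illustration and do not reflect how any particular OLAP product stores its data.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, year, units sold).
facts = [
    ("nuts", "east", 2023, 50),
    ("nuts", "west", 2023, 60),
    ("bolts", "east", 2023, 40),
    ("nuts", "east", 2024, 70),
]

# Each cell of the cube is addressed by one value per dimension.
cube = defaultdict(int)
for product, region, year, units in facts:
    cube[(product, region, year)] += units

def slice_total(cube, product=None, region=None, year=None):
    """Sum the cells matching the given dimension values; None means 'all'."""
    total = 0
    for (p, r, y), units in cube.items():
        if product in (None, p) and region in (None, r) and year in (None, y):
            total += units
    return total

print(slice_total(cube, product="nuts"))            # all regions, all years
print(slice_total(cube, region="east", year=2023))  # all products
```

Fixing one dimension and summing over the rest is the "slice" operation that end users perform when they analyze the cube.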

What do lower-level operational committees perform?

Tasks such as creating data definitions and identifying and solving data problems.

What does GOVERNANCE require?

That people, committees, and processes be in place.

What do DATA MARTS support?

They support local rather than central control by CONFERRING POWER on the user group.

Typically, groups that need _______ require only a data mart rather than a data warehouse.

a single or a few business analytics applications

Data in data warehouses and marts are organized by _______, which are _____ such as product, geographic area, and time period that represent the _____ of the data cube.

business dimensions; subjects; edges

Larger companies have increasingly moved to ______ ______.

data warehouses

Data marts can be implemented more quickly than ______, often in less than ___ days.

data warehouses; 90

In general, _____ and _____ support business analytics applications.

data warehouses; data marts

The independent data mart architecture is not particularly ______. Although it may meet a specific organizational need, it does not reflect an _______. Instead, the various organizational units create independent ______. Not only are these marts expensive to build and maintain, but they often contain ________ data.

effective; enterprise-wide approach to data management; data marts; inconsistent

Because data warehouses are so _____, they are used primarily by ______.

expensive; large companies

Business analytics encompasses a broad category of applications, technologies, and processes for _____, ____, ____, and _____ data to help business users make better decisions.

gathering, storing, accessing, and analyzing

It is important to maintain data about the data, known as ______, in the data warehouse.

metadata

When it comes to integration, data are collected from _______ and are then INTEGRATED around ________.

multiple systems; subjects

A common source for the data in data warehouses is the company's ______, which can be ______.

operational databases; relational databases

If the manager of a local bookstore wanted to know the profit margin on used books at her store, then she could obtain that information from her database using SQL or _______.

query-by-example (QBE)
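The SQL route for the bookstore question might look like the sketch below. The `sales` table, its columns, and the sample rows are hypothetical, and profit margin is assumed here to mean (revenue minus cost) divided by revenue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per book sold.
conn.execute("CREATE TABLE sales (title TEXT, book_condition TEXT, price REAL, cost REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Dune", "used", 8.0, 2.0),
        ("Emma", "used", 12.0, 3.0),
        ("Ulysses", "new", 20.0, 14.0),
    ],
)

# Profit margin on used books only: (revenue - cost) / revenue.
margin = conn.execute(
    "SELECT (SUM(price) - SUM(cost)) / SUM(price) FROM sales WHERE book_condition = ?",
    ("used",),
).fetchone()[0]
print(margin)
```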

Transactional databases are designed to access a ______ at a time. In contrast, data warehouses are designed to access _____.

single record; large groups of related records
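The contrast between the two access patterns can be sketched with two queries against the same hypothetical `orders` table: a keyed lookup of one record, as a transactional system would do, and an aggregate over groups of related records, as an analytical query would do. The schema and data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "east", 2023, 100.0),
        (2, "west", 2023, 150.0),
        (3, "east", 2024, 200.0),
        (4, "east", 2024, 50.0),
    ],
)

# Transactional style: fetch a single record by its key.
one_order = conn.execute(
    "SELECT region, amount FROM orders WHERE order_id = ?", (3,)
).fetchone()

# Analytical style: summarize large groups of related records.
by_region_year = conn.execute(
    "SELECT region, year, SUM(amount) FROM orders "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()

print(one_order)
print(by_region_year)
```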

After the data are extracted, they are _______ to make them more useful.

transformed

Example: data from different systems may be integrated around a common key, such as a customer identification number. Organizations adopt this approach to create a 360-degree view of all of their interactions with their customers.

As an example of this process, consider a bank. Customers can engage in a variety of interactions: visiting a branch, banking online, using an ATM, obtaining a car loan, and more. The systems for these touch points (defined as the numerous ways that organizations interact with customers, such as e-mail, the Web, direct contact, and the telephone) are typically independent of one another. To obtain a holistic picture of how customers are using the bank, the bank must integrate the data from the various source systems into a data mart or warehouse.
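The bank example can be sketched as a merge of independent touch-point records around a shared customer ID. The three source lists, their field names, and the `integrate` helper are hypothetical; the sketch also assumes at most one record per customer in each source, so a later record would overwrite an earlier one.

```python
from collections import defaultdict

# Hypothetical records from three independent source systems.
branch_visits = [{"customer_id": 101, "branch": "Main St"}]
online_logins = [{"customer_id": 101, "logins": 14}, {"customer_id": 202, "logins": 3}]
car_loans = [{"customer_id": 202, "balance": 9500.0}]

def integrate(**sources):
    """Group every source's records under the common key, customer_id."""
    view = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            details = {k: v for k, v in record.items() if k != "customer_id"}
            view[record["customer_id"]][source_name] = details
    return dict(view)

customer_view = integrate(branch=branch_visits, online=online_logins, loans=car_loans)
print(customer_view[101])
print(customer_view[202])
```

Each customer entry now gathers every touch point in one place, which is the "360-degree view" the card describes.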

Relational databases store data in _____-dimensional tables. In contrast, data warehouses and marts store data in ______.

two; more than two dimensions

