ISDS 705 ch 3

Pataasin ang iyong marka sa homework at exams ngayon gamit ang Quizwiz!

Terabytes (Tb) Gigabytes (Gb).

1 Petabyte (Pb) = 1,000 __ = 1,000,000 __

Variety Volume Veracity

4 V's of data analytics: ___, ___, Velocity, and ___.

decision model

A __ quantifies the relationship between variables, which reduces uncertainty.

client/server architecture clients data centers

A distributed database system allows apps on computers and mobiles to access data from both local and remote databases. Distributed databases use ___ to process information requests. Computers and mobile devices accessing the servers are called __. The databases are stored on servers that reside in the company's __, a private cloud or a public cloud.

software as a service (SaaS) slower response times, security risks

A popular option offered by a growing number of BI tool vendors is ___. This cloud approach appeals to small and midsize companies that have limited IT staff and want to carefully control costs. The potential downsides include ___, ___, and backup risks.

big data analytics operating margin

According to the McKinsey Global Institute (MGI), ___ have helped companies outperform their competitors. MGI estimates that retailers using this tactic increase their __ by more than 60 percent.

business-driven development approach

After completing these activities on pg 100, BI analysts can identify the data to use in BI. This is a ___ that starts with a business strategy and work backward to identify data sources and the data that need to be acquired and analyzed.

business records

All organizations create and retain ___- A record is documentation of a business event, action, decision, or transaction.

DBMS Data integrity and maintenance Data synchronization

An accurate and consistent view of data throughout the enterprise is needed so one can make informed, actionable decisions that support the business strategy. Functions performed by a __ to help create such a view are: Data filtering and profiling (process and store data efficiently), ___, ___, Data security, and Data access.

BI Information overload online

An unending challenge is how to determine which data to use for __ from what seems like unlimited options. ___ is a major problem for executives and for employees. Other challenges is data quality, particularly with regard to __ information bc of lack of accuracy.

master data management (MDM) complete (unified)

As data become more complex and their volumes explode, database performance degrades. One solution is the use of master data and ___. MDM processes integrate data from various sources or enterprise applications to create a more ___ view of a customer, product, or other entity.

real time or near real time

BI provides data at the moment of value to a decision maker—enabling it to extract crucial facts from enterprise data in ___.

Hadoop Map stage

Big data volumes exceed the processing capacity of conventional database infrastructures. A widely used processing platform is Apache __. It places no conditions on the structure of the data it can process, distributes computing problems across a number of servers, and implements MapReduce in two stages: 1. __ 2. Reduce stage.

electronic records management (ERM) system.

Business documents such as spreadsheets, e-mail messages, and word-processing documents are a type of record. Most records are kept in electronic format and maintained throughout their life cycle—from creation to final archiving or destruction by an ___.

keep informed in real time identify opportunities and risks

Business intelligence (BI) are tools and techniques process data and do statistical analysis for insight and discovery—that is, to discover meaningful relationships in the data, __, detect trends, and ___.

"garbage in, gospel out."

Data collection is a highly complex process that can create problems concerning the quality of the data being collected. Classic expressions that sum up the situation are "garbage in, garbage out" (GIGO) and the potentially riskier ___ -poor-quality data are trusted and used as the basis for planning.

Data warehouse data marts

Data management technologies that keep users informed and support the various business demands are the following: Databases, __, Business intelligence (BI) , and ___.

small-scale data warehouses

Data marts are __ that support a single function or one department. Enterprises that cannot afford to invest in data warehousing may start with one of these.

data mining, BI, and decision support.

Data warehouses and data marts are optimized for OLAP, __, __, and __.

databases and data silos nonvolatile

Data warehouses integrate data from multiple __, and organize them. For example, data are extracted from a database, processed to standardize their format, and then loaded into data warehouses at specific times, such as weekly. As such, data in data warehouses are __ and ready for analysis.

analytical queries primary source queries

Data warehouses store data from various source systems and databases across an enterprise in order to run __ against huge datasets collected over long time periods. Warehouses are the __ of cleansed data for analysis, reporting, and BI. Often the data are summarized in ways that enable quick responses to __.

enterprise data warehouses (EDW).

Data warehouses that integrate data from databases across an entire enterprise are called ___.

enterprise data warehouses (EDW)

Data warehouses that pull together data from disparate sources and databases across an entire enterprise are called ___.

analysis and quick response to queries OLAP (online analytics-processing systems) Subject-oriented

Data warehouses: • Designed and optimized for___ and __. • Nonvolatile. This stability is important to being able to analyze the data and make comparisons. Data stored may never be changed/deleted in order to do trend analysis or make comparisons with newer data. • ___ systems. • ___, which means that the data captured are organized to have similar data linked together.

transactions occur OLTP systems

Database management systems (DBMSs) are software used to manage the additions, updates, and deletions of data as __, and to support data queries and reporting. They are __.

data collection systems relational database

Database management systems (DBMSs) integrate with __ such as TPS and business applications; store the data in an organized way; and provide facilities for accessing and managing that data. Over the past 25 years, the __ has been the standard database model adopted by most enter-prises. These databases store data in tables consisting of columns and rows, similar to the format of a spreadsheet.

centralized database

Databases are either centralized or distributed. A __ stores all related files in a central location.

data mining data warehouse technology

Databases cannot be optimized for ___, complex online analytics-processing (OLAP) systems, and decision support. These limitations led to the introduction of __.

volatile OLTP (online transaction-processing system)

Databases: •Designed and optimized to ensure that every transaction gets recorded and stored immediately. •__ because data are constantly being updated, added, or edited. • ___ systems.

active data warehouse (ADW)

Demand for information to support real time customer interaction and operations leads to real time data warehousing and analytics known as an __. This is not designed to support executive strategic decision making, but rather to support operations.

electronic documents and image paper documents query and search

ERM systems consist of hardware and software that manage and archive __ and __; then index and store them according to company policy. ERM systems have __ capabilities so documents can be identified and accessed like data in a database.

Oracle Teradata

Enterprises need powerful DBMSs and data warehousing solutions, analytics, and reporting. The four vendors that dominate this market: __, IBM, Microsoft, and ___.

data latency Query response time Query predicability

Factors That Determine the Performance of a DBMS: __, Ability to handle the volatility of the data, ___, ___, and data consistency.

better IT security transmission delay

For decades the main database platform consisted of centralized database. Benefits of centralized database configurations include: Better control of data quality and __. A major disadvantage of centralized databases, like all centralized systems, is __when users are geodispersed.

master data management (MDM) customers, sales, and transactions,

For huge corporations like Coca-cola, all data are standardized through a series of __ processes. An enterprise data warehouse (EDW) generates a single view of all multichannel retail data. The EDW creates a trusted view of __, __, __ enabling Coca-Cola to respond quickly and accurately to changes in market conditions

centralized database data modeling, and social media

For most global companies, data is managed in a __. Data warehousing, big data, analytics, __, and __ are used to respond to competitors' activity, market changes, and consumer preferences.

online transaction-processing (OLTP) systems

Given their functions, DBMSs are referred to as __. OLTP is a database design that breaks down complex information into simpler data tables to strike a balance between transaction-processing efficiency and query efficiency. OLTP databases process millions of transactions per second.

forecasting sales financial

Many organizations built data warehouses because they were frustrated with inconsistent data that could not support decisions or actions. Viewed from this perspec-tive, data warehouses are infrastructure investments that companies make to support ongoing and future operations, such as: marketing and sales, pricing and contracts, __, __, and __.

total sales

Market share is the percent-age of __ in a market captured by a brand, product, or company.

Higher performance Simpler administration

NoSQL systems are such a heterogeneous group of database systems that attempts to classify them are not very helpful. However, their general advantages are these: __, Easy distribution of data, Greater flexibility, and __.

variable costs dollar of sales

Operating margin is a measure of the percent of a company's revenue left over after paying for its __. An increasing margin means the company is earning more per ___. The higher the operating margin, the better.

Association for Information and Image Management ARMA International

Organizations such as the ___, National Archives and Records Administration (NARA), and ___have created and published industry standards for document and records management.

master reference file Master data entities

Participants in the health-care supply chain essentially developed a ___ of its key data entities. A data entity is anything real or abstract about which a company wants to collect and store data. ___are the main entities of a company, such as customers, products, suppliers, employees, and assets.

NoSQL (short for "not only SQL") Scalability

RDBMSs are still the dominant database engines, but the trend toward __ systems is clear. This is due to their __- the system can increase in size to handle data growth or the load of an increasing number of concurrent users.

structured query language (SQL)

Relational management systems (RDBMSs) provide access to data using a declarative language ___.

Extracted Transformed Loaded read only

Select data are moved from databases to a warehouse. Specifically, data are: 1. __ from designated databases. 2. __ by standardizing formats, cleaning the data, integrating them. 3. __ into a data warehouse. In a warehouse, data are __ ; that is, they do not change until the next ETL.

improve business performance driving revenue growth

Senior managers who want to find and eliminate defects that harm performance are tasked with trying to ___ and retain customers. Key performance indicators (KPIs) such as improving profitability, ___, and improving the quality of customer service are monitored closely.

data life cycle internal or external database used in business apps Supply chain management (SCM)

The __ is a model that illustrates the way data travel through an organization. It begins with storage in an ___, next it is loaded into a data warehouse for analysis, then reported to knowledge workers or __. ___, customer relationship management (CRM), and e-commerce are enterprise applications that require up-to-date, readily accessible data to function properly.

cost to prevent errors cost to correct errors

The cost of poor-quality data may be expressed as a formula: Cost of poor-quality data = lost business + ___ + ___

Data marts

The high cost of data warehouses can make them too expensive for a company to implement. __ are lower-cost, scaled-down versions that can be implemented in a much shorter time (ex. less than 90 days) because they store a smaller amount of data. Serve a specific department or function, such as finance, marketing, or operations.

data ownership

The source of the problem is __—that is, who owns or is responsible for the data. These problems exist when there are no policies defining responsibility and accountability for managing data.

principle of 90/90 data use principle of data in context

Three general data principles relate to the data life cycle perspective and help to guide IT investment decisions: Principle of diminishing data value,___, and ___.

change data capture (CDC) and data deduplication ("deduping the data").

Three technologies involved in preparing raw data for analytics include ETL, __, and __.

forecasting planning

Throughout Coca-Cola's divisions and departments, huge volumes of data are analyzed to make more and better time-sensitive, critical decisions about products, shopper marketing, the supply chain, and production. Point of sale data (POS) are analyzed to support collaborative __, __, and replenishment processes within its supply chain.

DBMSs

When most business transactions occur, __ record and process transactions in the database, and support queries and reporting.

public cloud private clouds

With a ___, a service provider hosts the data and/or software that are accessed via an Internet connection. For ___, the company hosts its own data and software, but uses cloud-based technologies.

Exploration, Preprocessing

With text analytics, info is extracted from large quantities of various types of textual information. Text analytics can help identify the ratio of positive/negative posts and can show a need for other customer research. The basic steps involved in text analytics include: ___, ___, and Categorizing and Modeling.

SQL

_ is a standardized query language for accessing databases.

Queries

__ are ad hoc (unplanned) user requests for specific data.

databases Data warehouses

__ are optimized for extremely fast transaction processing and query processing. __ are optimized for analysis.

Fault tolerance

__ means that no single failure results in any loss of service.

Deduping processes

__ remove duplicates and standardize data formats, which helps to minimize storage and data synch.

Declarative languages

__ simplify data access by requiring that users only specify what data they want to access without defining how access will be achieved.

Databases

__ store data generated by business apps, sensors, operations, and transaction-processing systems (TPS). This data is extremely volatile (changes frequently) . Medium and large enterprises typically have many various types.

Centralized database

__ stores data at a single location that is accessible from anywhere. Searches can be fast because the search engine does not need to check multiple dis-tributed locations to find responsive data.

Data and text mining

___ are used to discover knowledge that you did not know existed in the databases.

CDC processes

___ capture the changes made at data sources and then apply those changes throughout enterprise data stores to keep data synchronized. Minimizes the resources required for ETL processes by only dealing with data changes.

Data mining software

___ enables users to analyze data from various dimensions or angles, categorize them, and find correlations or patterns among fields in the data warehouse.

Text mining sentiment analysis

___ is a broad category that involves interpreting words and concepts in context.It helps companies tap into the explosion of customer opinions expressed online. Social commentary and social media are being mined for ___ or to understand consumer intent.

Business analytics

__describes the entire function of applying technologies, algorithms, human expertise, and judgment. Data and text mining are specific analytic techniques.

BI architecture big data

__is undergoing technological advances in response to big data and the performance demands of end-users. One technology advance that can help handle __ is BI hosted on a public or private cloud.

Dirty data

__—that is, poor-quality data—lack integrity and cannot be trusted.

Online analytics-processing systems (OLAP)

is a term used to describe the analysis of complex data from the data warehouse.


Kaugnay na mga set ng pag-aaral

Med surg 2 Quiz 2 practice questions

View Set

Human Physiology Chapter 1: The Study of Body Function

View Set

ACCOUNT 250, Exam 4 (Final Study Guide), (Also study exam study guides 1 -3)

View Set

Abnormal Psych-Personality Disorders Quiz 9

View Set

Wellness Coaching COPY, Chapter 9: Design Thinking COPY, Client Assessment: Chapter 8 COPY, Coaching Review L12- L15 COPY

View Set

MRI SEQUENCE PARAMETERS AND OPTIONS REVIEW QUESTIONS

View Set