BCIS 461 Data Analytics Final

Which data mining process/methodology is thought to be the most comprehensive, according to kdnuggets.com rankings?

CRISP-DM

Define what data is and explain the different characteristics of it

Data is a collection of facts usually gathered as a result of transactions, experiments, observations, or experiences. It typically consists of numbers, words, and/or images. It is considered the lowest level of abstraction from which information and knowledge are derived.

Different types of players are identified and described in the analytics _________?

Ecosystem

_________ ____________ ______________ is a mechanism that integrates application functionality and shares functionality (rather than data) across systems, thereby enabling flexibility and reuse.

Enterprise application integration

BI represents a bold new paradigm in which the company's business strategy must be aligned to its business intelligence analysis initiatives.

False

Computerized support is only used for organizational decisions that are responses to external pressures, not for taking advantage of opportunities.

False

Data mining requires specialized data analysts to ask ad hoc questions and obtain answers quickly from the system.

False

Data warehouse administrators do not need strong business insight since they only handle the technical aspect of the infrastructure.

False

Data warehouses are subsets of data marts.

False

Due to industry consolidation, the analytics ecosystem consists of only a handful of players across several functional areas.

False

Information systems that support such transactions as ATM withdrawals, bank deposits, and cash register scans at the grocery store represent transaction processing, a critical branch of BI.

False

Lagging indicators measure activities that have a significant impact on outcome KPIs

False

Moving the data into a data warehouse is usually the easiest part of its creation.

False

Organizations seldom devote a lot of effort to creating metadata because it is not important for the effective use of data warehouses.

False

Properly integrating data from various databases and other disparate sources is a trivial process.

False

Subject oriented databases for data warehousing are organized by detailed subjects such as disk drives, computers, and networks.

False

Successful BI is a tool for the information systems department, but is not exposed to the larger organization.

False

The BPM development cycle is essentially a one-shot process where the requirement is to get it right the first time.

False

The data storage component of a business reporting system builds the various reports and hosts them for, or disseminates them to users. It also provides notification, annotation, collaboration, and other services.

False

Two-tier data warehouse/BI infrastructures offer organizations more flexibility but cost more than three-tier ones.

False

User-initiated navigation of data through disaggregation is referred to as "drill-up."

False

With the balanced scorecard approach, the entire focus is on measuring and managing specific financial goals based on the organization's strategy.

False

____________ _____________ (also called in-database analytics) refers to the integration of the algorithmic extent of data analytics into data warehouse.

In-database processing

_________ describe the structure and meaning of the data, contributing to their effective use.

Metadata

SEMMA

Sample, Explore, Modify, Model, and Assess

Computer applications have moved from transaction processing and monitoring activities to problem analysis and solution applications.

True

Converting continuous valued numerical variables to ranges and categories is referred to as discretization.

True
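
As a minimal sketch of discretization, the Python snippet below maps continuous ages to named ranges. The cut points, labels, and the helper name `discretize` are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right

def discretize(value, cut_points, labels):
    """Map a continuous value to the label of the range it falls into."""
    # bisect_right counts how many cut points are <= value,
    # which is the index of the matching label.
    return labels[bisect_right(cut_points, value)]

# Hypothetical cut points and labels (assumptions, not from the course material).
AGE_CUTS = [18, 35, 65]
AGE_LABELS = ["minor", "young adult", "adult", "senior"]

ages = [12, 18, 34.5, 64, 80]
age_groups = [discretize(age, AGE_CUTS, AGE_LABELS) for age in ages]
print(age_groups)
```

There must always be one more label than cut points, because n cut points split the number line into n + 1 ranges.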

Data accessibility means that data are easily and readily obtainable.

True

During the early days of analytics, data was often obtained from the domain experts using manual processes to build the mathematical or knowledge-based models.

True

In the 2000s, the DW-driven DSSs began to be called BI systems.

True

Interval data are variables that can be measured on interval scales.

True

Learning is one of the BSC Performance Management System organizational views.

True

Managing data warehouses requires special methods, including parallel computing and/or Hadoop/Spark.

True

Predictive algorithms generally require a flat file with a target variable, so making data analytics ready for prediction means that data sets must be transformed into a flat-file format and made ready for ingestion into those predictive algorithms.

True

The "islands of data" problem in the 1980s describes the phenomenon of unconnected data being stored in numerous locations within an organization.

True

The cost of data storage has plummeted recently, making data mining feasible for more firms.

True

With KPIs, driver KPIs have a significant effect on outcome KPIs but the reverse is not necessarily true.

True

What is six sigma?

a methodology aimed at reducing the number of defects in a business process

__________ is an evolving tool space that promises real-time integration from a variety of sources such as relational databases, Web services, and multidimensional databases. a. Enterprise information integration (EII) b. none of these c. Enterprise application integration (EAI) d. Extraction, transformation, and load (ETL)

a. Enterprise information integration (EII)

OLTP systems handle a company's routine ongoing business. In contrast, a data warehouse is typically a. a distinct system that provides storage for data that will be made use of in analysis b. a repository of actionable intelligence obtained from a data mart c. an integral subsystem of an online analytical processing system (OLAP) d. the end result of BI processes and operations

a. a distinct system that provides storage for data that will be made use of in analysis

In which stage of ETL are anomalies detected and corrected? a. cleanse b. transformation c. load d. extraction

a. cleanse

A large storage location that can hold vast quantities of data (mostly unstructured) in its native/raw format for future/potential analytics is referred to as a. data lake b. relational database c. extended ASP d. data cloud

a. data lake

All of the following are true about in-database processing technology EXCEPT a. it is the same as in-memory storage technology b. it is often used for apps like credit card fraud detection and investment risk management c. it pushes the algorithms to where the data is d. it makes the response to queries much faster than conventional databases

a. it is the same as in-memory storage technology

Oper marts are created when operational data needs to be analyzed a. multidimensionally b. unidimensionally c. linearly d. in a dashboard

a. multidimensionally

What is the management feature of a dashboard? a. operational data that identify what actions to take to resolve a problem b. summarized dimensional data to analyze the root cause of problems c. summarized dimensional data to monitor key performance metrics d. graphical abstracted data to monitor key performance metrics

a. operational data that identify what actions to take to resolve a problem

Which of the following BEST enables a data warehouse to handle complex queries and scale up to handle many more requests? a. parallel processing b. Microsoft Windows c. use of the web users as a front end d. a larger IT staff

a. parallel processing

This measure of dispersion is calculated by simply taking the square root of the variance. a. standard deviation b. range c. variance d. arithmetic mean

a. standard deviation
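
A small worked example may help here: the standard deviation is the square root of the variance, and the variance is the average squared distance from the arithmetic mean. The sketch below uses Python's standard `statistics` module on a made-up data set and treats it as a whole population (an assumption; a sample would use `statistics.variance` and `statistics.stdev` instead).

```python
import math
import statistics

# Hypothetical observations, used only for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)                 # sum of values / number of values
pop_variance = statistics.pvariance(values)    # mean squared deviation from the mean
pop_std = math.sqrt(pop_variance)              # standard deviation = sqrt(variance)

print(mean, pop_variance, pop_std)
```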

Fundamental reasons for investing in BI must be _____ with the company's business strategy.

aligned

Today, many vendors offer diversified tools, some of which are completely preprogrammed (called shells). How are these shells utilized? a. They are used for the customization of BI solutions. b. All a user needs to do is insert the numbers. c. The shell provides a secure environment for the organization's BI data. d. They host an enterprise data warehouse that can assist in decision making.

b. All a user needs to do is insert the numbers.

Which of the following statements about Big Data is true? a. Data chunks are stored in different locations on one computer. b. Hadoop is a type of processor used to process Big Data applications. c. MapReduce is a storage filing system. d. Pure Big Data systems do not involve fault tolerance.

b. Hadoop is a type of processor used to process Big Data applications.

The DMAIC performance improvement model includes all of the following steps EXCEPT a. Define b. Monitor c. Control d. Improve

b. Monitor

How are enterprise resource planning (ERP) systems related to supply chain management (SCM) systems? a. different terms for the same system b. complementary systems c. mutually exclusive systems d. None of these; these systems never interface

b. complementary systems

This technique makes no a priori assumption of whether one variable is dependent on the others and is not concerned with the relationship between variables; instead, it gives an estimate of the degree of association between variables. a. means test b. correlation c. multiple regression d. regression

b. correlation
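
To make the "no a priori assumption" point concrete, here is a minimal sketch of the Pearson correlation coefficient, one common measure of association. The function name `pearson_r` and the data are hypothetical. Swapping the two variables gives the same result, which is why correlation does not need one variable to be labeled as dependent.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(sq_dev_x * sq_dev_y)

# Hypothetical data: hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]

r_xy = pearson_r(hours, scores)
r_yx = pearson_r(scores, hours)  # same value: the measure is symmetric
print(r_xy, r_yx)
```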

Which approach to data warehouse integration focuses more on sharing process functionality than data across systems? a. enterprise information integration b. enterprise application integration c. enterprise function integration d. ETL

b. enterprise application integration

Which type of visualization tool can be very helpful when a data set contains location data? a. bar chart b. geographic map c. highlight table d. tree map

b. geographic map

All of the following are benefits of a hosted data warehouse EXCEPT a. smaller upfront investment b. greater control of data c. better quality hardware d. frees up in-house systems

b. greater control of data

What kind of data warehouse is created separately from the enterprise data warehouse by a department not reliant on it for updates? a. volatile data mart b. independent data mart c. sectional data mart d. public data mart

b. independent data mart

Which of the following is an umbrella term that combines architectures, tools, databases, analytical tools, applications, and methodologies? a. MIS b. DSS c. ERP d. BI

d. BI

KPIs have all of the following distinguishing criteria EXCEPT a. Targets b. Ranges c. Encodings d. Limits e. Benchmarks

d. Limits

This measure of central tendency is the sum of all the values/observations divided by the number of observations in the data set. a. dispersion b. mode c. median d. arithmetic mean

d. arithmetic mean

What is the main reason parallel processing is sometimes used for data mining? a. because any strategic application requires parallel processing b. because most of the algorithms used for data require it c. because the hardware exists in most organizations and it is available to use d. because of the massive data amounts and search efforts involved

d. because of the massive data amounts and search efforts involved

Which characteristic of data requires that the variables and data values be defined at the lowest (or as low as required) level of detail for the intended use of the data? a. data source reliability b. data accessibility c. data richness d. data granularity

d. data granularity

Six sigma has all of the following EXCEPT a. emphasizes learning and innovation b. includes all business processes c. identifies measures that drive performance d. emphasizes targets for each measurement

d. emphasizes targets for each measurement

The need for more versatile reporting than what was available in the 1980s era ERP systems led to the development of what type of system? a. relational databases b. management information systems c. data warehouses d. executive information systems

d. executive information systems

Which of the following developments is NOT contributing to facilitating the growth of decision support and analytics? a. collaboration technologies b. Big Data c. knowledge management systems d. locally concentrated workforces

d. locally concentrated workforces

Operational or transaction databases are product oriented, handling transactions that update the database. In contrast, data warehouses are a. product oriented and volatile b. product oriented and nonvolatile c. subject oriented and volatile d. subject oriented and nonvolatile

d. subject oriented and nonvolatile

In the opening Vignette on Sports Analytics, what was adjusted to drive one-time ticket sales? a. player selections b. stadium location c. fan tweets d. ticket prices

d. ticket prices

Data preparation, the third step in the CRISP-DM data mining process, is more commonly known as _______?

data preprocessing

In the terrorist funding case study, an observed price _______ may be related to income tax avoidance/evasion, money laundering, or terrorist financing.

deviation

There has been an increase in data mining to deal with global competition and customers' more sophisticated ______ and wants.

needs

What are the four processes that define a closed-loop BPM cycle? List and explain each of the processes.

1. Strategize: This is the process of identifying and stating the organization's mission, vision, and objectives, and developing plans (at different levels of granularity: strategic, tactical, and operational) to achieve these objectives.
2. Plan: When operational managers know and understand the what (i.e., the organizational objectives and goals), they will be able to come up with the how (i.e., detailed operational and financial plans). Operational and financial plans answer two questions: What tactics and initiatives will be pursued to meet the performance targets established by the strategic plan? What are the expected financial results of executing the tactics?
3. Monitor/Analyze: When the operational and financial plans are underway, it is imperative that the performance of the organization be monitored. A comprehensive framework for monitoring performance should address two key issues: what to monitor and how to monitor.
4. Act and Adjust: What do we need to do differently? Whether a company is interested in growing its business or simply improving its operations, virtually all strategies depend on new projects: creating new products, entering new markets, acquiring new customers or businesses, or streamlining some processes. The final part of this loop is taking action and adjusting current actions based on analysis of problems and opportunities.

Cross Industry Standard Process for Data Mining (CRISP-DM)

Business Understanding, Data Understanding, Data Preparation, Model Building, Testing & Evaluation, and Deployment

Explain the ETL process. Make sure to discuss its purpose, its importance within Business Intelligence, and the 4 (four) stages of this process.

The ETL process is used to change data (structured or unstructured) into organized information that can be sorted and analyzed to improve the decision-making process. Extraction is when the data is read from its source, which could be a data lake or an old data management system. After the data is extracted, it is transformed into usable data; this is the stage where irrelevant data is dismissed and the data becomes more organized. It is next cleansed, which is when anomalies are detected and corrected. The final stage is loading, which places the data in a data warehouse where it is easy to analyze and dissect. ETL is important to BI because it helps the organization see what information it has and determine what is happening (descriptive analytics). The ETL process turns large amounts of data in a system into usable information.
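
The four stages in the answer above can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: the CSV text, the column names, and the function names are hypothetical, and a plain Python list stands in for the data warehouse.

```python
import csv
import io

# Hypothetical source data with inconsistent formatting and two bad rows.
RAW_CSV = """customer,amount
alice, 100
BOB,250
carol,not_a_number
,75
"""

def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: put values into one consistent format."""
    return [
        {"customer": row["customer"].strip().title(), "amount": row["amount"].strip()}
        for row in rows
    ]

def cleanse(rows):
    """Cleanse: detect anomalies (missing names, non-numeric amounts) and drop them."""
    clean = []
    for row in rows:
        if not row["customer"]:
            continue
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        clean.append({"customer": row["customer"], "amount": amount})
    return clean

def load(rows, warehouse):
    """Load: append the prepared rows to the target store."""
    warehouse.extend(rows)
    return warehouse

warehouse = load(cleanse(transform(extract(RAW_CSV))), [])
print(warehouse)
```

Here the cleanse step simply discards bad rows; a real pipeline might correct or flag them instead.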

All of the following statements about data mining are true EXCEPT a. the potentially useful aspect means that results should lead to some business benefit b. the process aspect means that data mining should be a one-step process to results c. the novel aspect means that previously unknown patterns are discovered d. the valid aspect means that the discovered patterns should hold true to new data

b. the process aspect means that data mining should be a one-step process to results

Contextual metadata for a dashboard includes all the following EXCEPT a. whether any high-value transactions that would skew the overall trends were rejected as part of the loading process b. which operating system is running the dashboard server software c. whether the dashboard is presenting "fresh" or "stale" information d. when the data warehouse was last refreshed

b. which operating system is running the dashboard server software

Relational databases began to be used in the a. 1960s b. 1970s c. 1980s d. 1990s

c. 1980s

Understanding customers better has helped Amazon and others become more successful. The understanding comes primarily from a. developing a philosophy that is data analytics-centric b. asking customers what they want c. analyzing the vast data amounts routinely collected d. collecting data about customers and transactions

c. analyzing the vast data amounts routinely collected

This plot is a graphical illustration of several descriptive statistics about a given data set. a. pie chart b. bar graph c. box-and-whiskers plot d. kurtosis

c. box-and-whiskers plot
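
The descriptive statistics a box-and-whiskers plot usually shows are the five-number summary: minimum, first quartile, median, third quartile, and maximum. The sketch below computes them with `statistics.quantiles` (available in Python 3.8 and later). The data and the helper name `five_number_summary` are hypothetical, and the "inclusive" quartile method is one of several conventions, so other tools may report slightly different quartiles.

```python
import statistics

def five_number_summary(values):
    """Return the five values a box-and-whiskers plot is typically built from."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": median, "q3": q3, "max": ordered[-1]}

# Hypothetical exam scores.
scores = [52, 61, 64, 70, 73, 75, 78, 84, 99]

summary = five_number_summary(scores)
iqr = summary["q3"] - summary["q1"]  # interquartile range: the height of the box
print(summary, iqr)
```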

Business applications have moved from transaction processing and monitoring to other activities. Which of the following is NOT one of those activities? a. solution applications b. mobile access c. data monitoring d. problem analysis

c. data monitoring

Which characteristic of data means that all the required data elements are included in the data set? a. data source reliability b. data accessibility c. data richness d. data granularity

c. data richness

When querying a dimensional database, a user went from summarized data to its underlying details. The function that served this purpose is a. dice b. slice c. drill down d. roll-up

c. drill down
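
A minimal sketch of the difference between roll-up and drill down, using plain Python on made-up sales records. The region and city names, amounts, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, city, sales amount).
sales = [
    ("East", "Boston", 120),
    ("East", "New York", 300),
    ("West", "Seattle", 150),
    ("West", "Portland", 80),
]

def roll_up(rows):
    """Roll-up: aggregate city-level detail into region-level totals."""
    totals = defaultdict(int)
    for region, _city, amount in rows:
        totals[region] += amount
    return dict(totals)

def drill_down(rows, region):
    """Drill down: go from one region's total to its underlying city detail."""
    detail = defaultdict(int)
    for row_region, city, amount in rows:
        if row_region == region:
            detail[city] += amount
    return dict(detail)

by_region = roll_up(sales)
east_detail = drill_down(sales, "East")
print(by_region, east_detail)
```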

What has caused the growth of the demand for instant, on-demand access to dispersed information? a. the increasing divide between users who focus on the strategic level and those who are more oriented to the tactical level b. the need to create a database infrastructure that is always online and contains all the information from the OLTP systems c. the more pressing need to close the gap between the operational data and strategic objectives d. the fact that BI cannot simply be a technical exercise for the information systems department

c. the more pressing need to close the gap between the operational data and strategic objectives

When representing data in a data warehouse, using several dimension tables that are each connected only to a fact table means that you are using which warehouse structure?

star schema
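
As an illustrative sketch only, a star schema can be built in an in-memory SQLite database with Python's standard `sqlite3` module. All table names, column names, and rows below are hypothetical. Each dimension table joins only to the central fact table, never to another dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes, one row per product or store.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);

    -- Fact table: numeric measures plus one foreign key per dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Monitor');
    INSERT INTO dim_store   VALUES (1, 'Denton'), (2, 'Dallas');
    INSERT INTO fact_sales  VALUES (1, 1, 900.0), (2, 1, 200.0), (1, 2, 950.0);
""")

# A typical query joins the fact table to a dimension and aggregates a measure.
sales_by_city = conn.execute("""
    SELECT s.city, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON f.store_id = s.store_id
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()
print(sales_by_city)
```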

A web client that connects to a web server, which is in turn connected to a BI application server, is reflective of a

three tier architecture

A _________________ is a major component of a BI system that is often browser based and often represents a portal or dashboard.

user interface

The 3 levels or categories of analytics that are most often viewed as sequential and independent, but also occasionally seen as overlapping

• Descriptive or reporting analytics refers to knowing what is happening in the organization and understanding some underlying trends and causes of such occurrences.
• Predictive analytics aims to determine what is likely to happen in the future. This analysis is based on statistical techniques as well as other more recently developed techniques that fall under the general category of data mining.
• Prescriptive analytics recognizes what is going on as well as the likely forecast and makes decisions to achieve the best performance possible.

