IDC Exam 1
Given the following range of numbers, what is the value of the 1st Quartile? 10 11 11 13 14 15 15 16 17 19 20 22 a)11 b)12 c)13 d)15
12
Given the following range of numbers, what is the value of the 2nd Quartile? 10 11 11 13 14 15 15 16 17 19 20 22 a)12 b)13 c)14 d)15
15
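The quartile answers above (12 for the 1st quartile, 15 for the 2nd) can be checked with a short sketch. It assumes the "median of halves" convention, which is not stated on the cards; the function name `quartiles_median_of_halves` is made up for illustration. Interpolation-based conventions can give slightly different values (for example, Python's `statistics.quantiles` yields 11.5 or 12.5 for the 1st quartile of this data, depending on the method).

```python
from statistics import median

def quartiles_median_of_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # skip the middle value when n is odd
    return median(lower), median(data), median(upper)

scores = [10, 11, 11, 13, 14, 15, 15, 16, 17, 19, 20, 22]
q1, q2, q3 = quartiles_median_of_halves(scores)
print(q1, q2, q3)  # Q1, Q2 (the median), Q3
```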
What is the Mode of the following numbers: 1, 2, 2, 3, 4, 5, 5, 5 a)2 b)3 c)4 d)5
5
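A minimal sketch of how the mode can be found by counting occurrences; the variable names are illustrative only.

```python
from collections import Counter

numbers = [1, 2, 2, 3, 4, 5, 5, 5]

# Count how often each value occurs, then take the most frequent one.
counts = Counter(numbers)
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)
```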
You have a survey question that asks: "What do you think the likelihood is that the FSU football team will win the ACC championship?" You have survey results from 100 people, and the average response is 40% with a standard deviation of 5. Which of the following can you approximate from the results? a) 95% of the respondents think that there is a 30% - 50% chance that the FSU football team will win the ACC championship. b) 70% of the respondents think that there is a 30% - 50% chance that the FSU football team will win the ACC championship. c) 100% of the respondents think that there is a 30% - 50% chance that the FSU football team will win the ACC championship. d) 0% of the respondents think that there is a 30% - 50% chance that the FSU football team will win the ACC championship.
95% of the respondents think that there is a 30% - 50% chance that the FSU football team will win the ACC championship.
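The 95% answer follows from the empirical rule: about 95% of values lie within two standard deviations of the mean. The sketch below assumes the responses are roughly normally distributed, which the card does not state explicitly.

```python
from statistics import NormalDist

mean, sd = 40, 5

# Two standard deviations on either side of the mean.
low, high = mean - 2 * sd, mean + 2 * sd

# Share of a normal distribution that falls inside that interval
# (assumption: responses are approximately normal).
responses = NormalDist(mu=mean, sigma=sd)
share_within = responses.cdf(high) - responses.cdf(low)
print(low, high, round(share_within, 3))
```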
25) Business intelligence (BI) can be characterized as a transformation of A) data to information to decisions to actions. B) Big Data to data to information to decisions. C) actions to decisions to feedback to information. D) data to processing to information to actions.
A) data to information to decisions to actions.
32) The very design that makes an OLTP system efficient for transaction processing makes it inefficient for what? A) end-user ad hoc reports, queries, and analysis B) transaction processing systems that constantly update operational databases C) the collection of reputable sources of intelligence D) transactions such as ATM withdrawals, where we need to reduce a bank balance accordingly
A) end-user ad hoc reports, queries, and analysis
When representing data in a data warehouse, using several dimension tables that are each connected only to a fact table means you are using which warehouse structure? A) star schema B) snowflake schema C) relational schema D) dimensional schema
A) star schema
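A small sketch of a star schema, using Python's built-in `sqlite3` module. All table names, column names, and rows are hypothetical; the point is only that each dimension table connects to the fact table and to nothing else.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year INTEGER);
-- The fact table is the only table the dimensions connect to.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
INSERT INTO dim_store   VALUES (1, 'Springfield');
INSERT INTO dim_date    VALUES (1, 2023);
INSERT INTO fact_sales  VALUES (1, 1, 1, 20.0), (1, 1, 1, 5.0), (2, 1, 1, 60.0);
""")

# A typical query joins the fact table to one or more dimensions.
totals = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(totals)
```

In a snowflake schema, by contrast, a dimension such as `dim_product` would itself be split into further normalized tables.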
22) Operational or transaction databases are product oriented, handling transactions that update the database. In contrast, data warehouses are A) subject-oriented and nonvolatile. B) product-oriented and nonvolatile. C) product-oriented and volatile. D) subject-oriented and volatile.
A) subject-oriented and nonvolatile.
30) In which stage of extraction, transformation, and load (ETL) into a data warehouse are data aggregated? A) transformation B) extraction C) load D) cleanse
A) transformation
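As a rough illustration of aggregation happening in the transformation step, the sketch below rolls transaction-level rows up into monthly totals before loading them. The function names, the sample rows, and the dictionary standing in for a warehouse are all assumptions made for the example.

```python
from collections import defaultdict

def extract():
    # Stand-in for reading rows from an operational (OLTP) source.
    return [
        {"date": "2024-01-03", "amount": 10.0},
        {"date": "2024-01-17", "amount": 15.0},
        {"date": "2024-02-02", "amount": 7.5},
    ]

def transform(rows):
    # Aggregate transaction-level rows into monthly totals.
    monthly = defaultdict(float)
    for row in rows:
        month = row["date"][:7]  # "YYYY-MM"
        monthly[month] += row["amount"]
    return dict(monthly)

def load(summary, warehouse):
    # Stand-in for writing the summarized data to the warehouse.
    warehouse.update(summary)

warehouse = {}
load(transform(extract()), warehouse)
print(warehouse)
```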
36) Today, many vendors offer diversified tools, some of which are completely preprogrammed (called shells). How are these shells utilized? A) They are used for customization of BI solutions. B) All a user needs to do is insert the numbers. C) The shell provides a secure environment for the organization's BI data. D) They host an enterprise data warehouse that can assist in decision making.
B) All a user needs to do is insert the numbers.
How does the use of cloud computing affect the scalability of a data warehouse? A) Cloud computing vendors bring as much hardware as needed to users' offices. B) Hardware resources are dynamically allocated as use increases. C) Cloud vendors are mostly based overseas where the cost of labor is low. D) Cloud computing has little effect on a data warehouse's scalability.
B) Hardware resources are dynamically allocated as use increases.
36) Which of the following online analytical processing (OLAP) technologies does NOT require the precomputation and storage of information? A) MOLAP B) ROLAP C) HOLAP D) SQL
B) ROLAP
37) How are descriptive analytics methods different from the other two types? A) They answer "what-if?" queries, not "how many?" queries. B) They answer "what-is?" queries, not "what will be?" queries. C) They answer "what to do?" queries, not "what-if?" queries. D) They answer "what will be?" queries, not "what to do?" queries.
B) They answer "what-is?" queries, not "what will be?" queries.
Which approach to data warehouse integration focuses more on sharing process functionality than data across systems? A) extraction, transformation, and load B) enterprise application integration C) enterprise information integration D) enterprise function integration
B) enterprise application integration
33) What can the BI users in an organization help guide and direct? A) how to implement and deploy a BI initiative that can be lengthy, expensive, and failure prone B) how the DW is structured and the types of BI tools and other supporting software that are needed C) how to decompose the planning and execution into business, organization, functionality, and infrastructure components D) how the DW is structured and the costs and the appreciation for different classes of potential users
B) how the DW is structured and the types of BI tools and other supporting software that are needed
29) Once a data warehouse is in place, the general process of intelligence creation begins with A) end-user examinations of decision-making impacts. B) identifying and prioritizing specific BI projects. C) estimating the cost-benefit ratio of the ROI. D) establishing the critical partnerships required for BI governance.
B) identifying and prioritizing specific BI projects.
34) If a company's strategy is properly aligned with DW and BI initiatives, and if the company's IS organization can be made capable of playing its role in such a project, and if the requisite user community is in place and has the proper motivation, then A) it is no longer necessary to start BI within the company. B) it is wise to start BI and establish a BI Competency Center (BICC) within the company. C) the organization is ready for the introduction of new data-generating technologies, such as radio-frequency identification (RFID). D) business leaders are required to document their business processes and to sign off on the legitimacy of the information they rely on.
B) it is wise to start BI and establish a BI Competency Center (BICC) within the company.
Which of the following BEST enables a data warehouse to handle complex queries and scale up to handle many more requests? A) use of the web by users as a front-end B) parallel processing C) Microsoft Windows D) a larger IT staff
B) parallel processing
26) In answering the question "Which customers are most likely to click on my online ads and purchase my goods?" you are most likely to use which of the following analytic applications? A) customer profitability B) propensity to buy C) customer attrition D) channel optimization
B) propensity to buy
In the Magpie Sensing case study, the automated collection of temperature and humidity data on shipped goods helped with various types of analytics. Which of the following is an example of predictive analytics? A) real time reports of the shipment's temperature B) warning of an open shipment seal C) location of the shipment D) optimal temperature setting
B) warning of an open shipment seal
Statistical Data Variable Type: A variable that contains the values of either Yes or No would best be categorized as which of the following variable types? a) Nominal b) Binary c) Discrete d) Ratio
Binary
31) Online transaction processing (OLTP) systems handle a company's routine ongoing business. In contrast, a data warehouse is typically A) the end result of BI processes and operations. B) a repository of actionable intelligence obtained from a data mart. C) a distinct system that provides storage for data that will be made use of in analysis. D) an integral subsystem of an online analytical processing (OLAP) system.
C) a distinct system that provides storage for data that will be made use of in analysis.
Which of the following is NOT an example that falls within the four major categories of business environment factors for today's organizations? A) globalization B) increased pool of customers C) fewer government regulations D) increased competition
C) fewer government regulations
27) In answering the question "Which customers are likely to be using fake credit cards?" you are most likely to use which of the following analytic applications? A) channel optimization B) customer segmentation C) fraud detection D) customer profitability
C) fraud detection
All of the following are benefits of hosted data warehouses EXCEPT A) smaller upfront investment. B) better quality hardware. C) greater control of data. D) frees up in-house systems.
C) greater control of data.
Which data warehouse architecture uses a normalized relational warehouse that feeds multiple data marts? A) independent data marts architecture B) centralized data warehouse architecture C) hub-and-spoke data warehouse architecture D) federated architecture
C) hub-and-spoke data warehouse architecture
Which kind of data warehouse is created separately from the enterprise data warehouse by a department and not reliant on it for updates? A) sectional data mart B) public data mart C) independent data mart D) volatile data mart
C) independent data mart
38) Which of the following statements is more descriptive of active data warehouses in contrast with traditional data warehouses? A) strategic decisions whose impacts are hard to measure B) detailed data available for strategic use only C) large numbers of users, including operational staffs D) restrictive reporting with daily and weekly data currency
C) large numbers of users, including operational staffs
Prescriptive BI capabilities are viewed as more powerful than predictive ones for all the following reasons EXCEPT A) prescriptive BI gives actual guidance as to actions. B) understanding the likelihood of certain events often leaves unclear remedies. C) only prescriptive BI capabilities have monetary value to top-level managers. D) prescriptive models generally build on (with some overlap) predictive ones.
C) only prescriptive BI capabilities have monetary value to top-level managers.
37) Active data warehousing can be used to support the highest level of decision making sophistication and power. The major feature that enables this in relation to handling the data is A) country of (data) origin. B) nature of the data. C) speed of data transfer. D) source of the data.
C) speed of data transfer.
30) When middles look across an organization to ensure that project priorities reflect the needs of the entire business, what is their main concern? A) that their proprietary BI methods are protected from industrial espionage B) that additional information available through an enterprise data warehouse should assist in decision making C) that a project does not just serve to sub-optimize one area over others D) that return on investment (ROI) and total cost of ownership justify the cost-benefit ratio
C) that a project does not just serve to sub-optimize one area over others
35) What has caused the growth of the demand for instant, on-demand access to dispersed information? A) the increasing divide between users who focus on the strategic level and those who are more oriented to the tactical level B) the need to create a database infrastructure that is always online and contains all the information from the OLTP systems C) the more pressing need to close the gap between the operational data and strategic objectives D) the fact that BI cannot simply be a technical exercise for the information systems department
C) the more pressing need to close the gap between the operational data and strategic objectives
Big Data often involves a form of distributed storage and processing using Hadoop and MapReduce. One reason for this is A) centralized storage creates too many vulnerabilities. B) the "Big" in Big Data necessitates over 10,000 processing nodes. C) the processing power needed for the centralized model would overload a single computer. D) Big Data systems have to match the geographical spread of social media.
C) the processing power needed for the centralized model would overload a single computer.
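The MapReduce idea can be sketched as a toy word count in plain Python. This runs in a single process and does not use Hadoop; it only shows the map, shuffle, and reduce phases that a real cluster would spread across many machines. The function names and input chunks are illustrative.

```python
from collections import defaultdict

def map_phase(chunk):
    # Each chunk could be processed on a different node.
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    # Group intermediate values by key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Combine the values for each key into a single result.
    return {key: sum(values) for key, values in grouped.items()}

chunks = ["big data big", "data big"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```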
25) A Web client that connects to a Web server, which is in turn connected to a BI application server, is reflective of a A) one tier architecture. B) two tier architecture. C) three tier architecture. D) four tier architecture.
C) three tier architecture.
39) Which of the following statements about Big Data is true? A) Data chunks are stored in different locations on one computer. B) Hadoop is a type of processor used to process Big Data applications. C) MapReduce is a storage filing system. D) Pure Big Data systems do not involve fault tolerance.
D) Pure Big Data systems do not involve fault tolerance.
31) In which stage of extraction, transformation, and load (ETL) into a data warehouse are anomalies detected and corrected? A) transformation B) extraction C) load D) cleanse
D) cleanse
21) The "single version of the truth" embodied in a data warehouse such as Capri Casinos' means all of the following EXCEPT A) decision makers get to see the same results to queries. B) decision makers have the same data available to support their decisions. C) decision makers get to use more dependable data for their decisions. D) decision makers have unfettered access to all data in the warehouse.
D) decision makers have unfettered access to all data in the warehouse.
35) When querying a dimensional database, a user went from summarized data to its underlying details. The function that served this purpose is A) dice. B) slice. C) roll-up. D) drill down.
D) drill down.
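A minimal sketch of roll-up versus drill down on a tiny set of hypothetical sales rows: roll-up summarizes quarters into years, and drill down goes from a yearly total back to its quarterly detail.

```python
from collections import defaultdict

sales = [
    {"year": 2023, "quarter": "Q1", "amount": 100},
    {"year": 2023, "quarter": "Q2", "amount": 150},
    {"year": 2024, "quarter": "Q1", "amount": 120},
]

def roll_up(rows):
    # Summarize detail rows to the year level.
    totals = defaultdict(int)
    for row in rows:
        totals[row["year"]] += row["amount"]
    return dict(totals)

def drill_down(rows, year):
    # Return the quarterly detail behind one year's total.
    return {row["quarter"]: row["amount"] for row in rows if row["year"] == year}

by_year = roll_up(sales)
detail_2023 = drill_down(sales, 2023)
print(by_year, detail_2023)
```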
Which data warehouse architecture uses metadata from existing data warehouses to create a hybrid logical data warehouse comprised of data from the other warehouses? A) independent data marts architecture B) centralized data warehouse architecture C) hub-and-spoke data warehouse architecture D) federated architecture
D) federated architecture
All of the following statements about metadata are true EXCEPT A) metadata gives context to reported data. B) there may be ethical issues involved in the creation of metadata. C) metadata helps to describe the meaning and structure of data. D) for most organizations, data warehouse metadata are an unnecessary expense.
D) for most organizations, data warehouse metadata are an unnecessary expense.
32) Data warehouses provide direct and indirect benefits to using organizations. Which of the following is an indirect benefit of data warehouses? A) better and more timely information B) extensive new analyses performed by users C) simplified access to data D) improved customer service
D) improved customer service
40) All of the following are true about in-database processing technology EXCEPT A) it pushes the algorithms to where the data is. B) it makes the response to queries much faster than conventional databases. C) it is often used for apps like credit card fraud detection and investment risk management. D) it is the same as in-memory storage technology.
D) it is the same as in-memory storage technology.
In the Magpie Sensing case study, the automated collection of temperature and humidity data on shipped goods helped with various types of analytics. Which of the following is an example of prescriptive analytics? A) real time reports of the shipment's temperature B) warning of an open shipment seal C) location of the shipment D) optimal temperature setting
D) optimal temperature setting
24) Organizations counter the pressures they experience in their business environments in multiple ways. Which of the following is NOT an effective way to counter these pressures? A) reactive actions B) anticipative actions C) adaptive actions D) retroactive actions
D) retroactive actions
28) When Sabre developed their Enterprise Data Warehouse, they chose to use near-real time updating of their database. The main reason they did so was A) to provide a 360 degree view of the organization. B) to aggregate performance metrics in an understandable way. C) to be able to assess internal operations. D) to provide up-to-date executive insights.
D) to provide up-to-date executive insights.
Statistical Data Variable Type: A variable that contains a countable number of distinct values would best be categorized as which of the following variable types? a) Ordinal b) Discrete c) Interval d) Ratio
Discrete
(T/F) A well-designed data warehouse means that user requirements do not have to change as business needs change.
False
(T/F) Almost all BI applications are constructed with shells provided by an outsourcing provider who may themselves create a custom solution for a vendor or work with another client.
False
(T/F) BI represents a bold new paradigm in which the company's business strategy must be aligned to its business intelligence analysis initiatives.
False
(T/F) Because the recession has raised interest in low-cost open source software, it is now set to replace traditional enterprise software.
False
(T/F) Bill Inmon advocates the data mart bus architecture whereas Ralph Kimball promotes the hub-and-spoke architecture, a data mart bus architecture with conformed dimensions.
False
(T/F) Computerized support is only used for organizational decisions that are responses to external pressures, not for taking advantage of opportunities.
False
(T/F) Data warehouse administrators (DWAs) do not need strong business insight since they only handle the technical aspect of the infrastructure.
False
(T/F) In the Isle of Capri case, the only capability added by the new software was increased processing speed of processing reports.
False
(T/F) Information systems that support such transactions as ATM withdrawals, bank deposits, and cash register scans at the grocery store represent transaction processing, a critical branch of BI.
False
(T/F) Large companies, especially those with revenue upwards of $500 million, consistently reap substantial cost savings through the use of hosted data warehouses.
False
(T/F) Moving the data into a data warehouse is usually the easiest part of its creation.
False
(T/F) OLTP systems are designed to handle ad hoc analysis and complex queries that deal with many data items.
False
(T/F) One of the four components of BI systems, business performance management, is a collection of source data in the data warehouse.
False
(T/F) Organizations seldom devote a lot of effort to creating metadata because it is not important for the effective use of data warehouses.
False
(T/F) Pushing programming out to distributed data is achieved solely by using the Hadoop Distributed File System or HDFS.
False
(T/F) The ETL process in data warehousing usually takes up a small portion of the time in a data-centric project.
False
(T/F) The success of BI is assured not because of which personnel would be the most likely to use it, but as a result of pervasive adoption across the organization.
False
(T/F) The term intelligence in a BI context is used to describe clandestine operations dedicated to stealing corporate secrets, in the manner of the government's CIA and other covert agencies.
False
(T/F) The two critical partnerships required for BI governance are (a) a partnership between functional area users and/or product/service area employees, and (b) a partnership between representatives of the marketing and vendor sides.
False
(T/F) The use of dashboards and data visualizations is seldom effective in finding efficiencies in organizations, as demonstrated by the Seattle Children's Hospital Case Study.
False
(T/F) Two-tier data warehouse/BI infrastructures offer organizations more flexibility but cost more than three-tier ones.
False
(T/F) Subject-oriented databases for data warehousing are organized by detailed subjects such as disk drives, computers, and networks.
False
(T/F) The complexity of today's business environment creates many new challenges for organizations, such as global competition, but creates few new opportunities in return.
False
Which of the following is a common way of visualizing the frequency distribution of data points over a range of possible values? a) Quantitative Easing Chart b) Histogram c) Skewness chart d) Scatterplot
Histogram
Statistical Data Variable Type: There is a web-based survey that asks you, "On a rating of 1 (hated it) to 5 (loved it), how much did you like the movie?" This value is stored in your database and you need to categorize the statistical variable type. Which of the following variable types would be best? a) Ordinal b) Discrete c) Interval d) Ratio
Interval
Which of the following measures of central location would be best to use when the skewness is approximately zero? a) Mean b) Median c) Mode d) 2nd quartile
Mean
Which of the following measures of central location would be best to use when the skewness is highly positive or negative? a) Mean b) Median c) Mode d) 4th quartile
Median
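The two cards above can be illustrated with made-up numbers: when the data are symmetric the mean and median agree, but a single extreme value drags the mean far from the typical value while the median stays put.

```python
from statistics import mean, median

symmetric = [30, 32, 35, 38, 40]      # skewness is zero
right_skewed = [30, 32, 35, 38, 400]  # one extreme value on the high side

symmetric_mean, symmetric_median = mean(symmetric), median(symmetric)
skewed_mean, skewed_median = mean(right_skewed), median(right_skewed)

print(symmetric_mean, symmetric_median)  # both describe the center equally well
print(skewed_mean, skewed_median)        # the mean is pulled toward the outlier
```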
Statistical Data Variable Type: A variable that contains the values of United States Zip Codes (aka Postal Codes) would best be categorized as which of the following variable types? a) Binary b) Nominal c) Ordinal d) Discrete
Nominal
Statistical Data Variable Type: A variable that contains one of the three values "1st Place", "2nd Place", or "3rd Place" would best be categorized as which of the following variable types? a) Binary b) Nominal c) Discrete d) Ordinal
Ordinal
Which of the following visualization charts would you use to plot the relationship between TWO variables? a) Bubble chart b) Histogram c) Rubber Chicken chart d) Scatterplot
Scatterplot
Which of the following is NOT a qualitative data type? a) Conversations b) Surveys with numerical answers c) Magazine articles d) Media broadcasts
Surveys with numerical answers
You have a survey question that asks: "What is your likelihood that you will watch the next meteor shower?" You have survey results from 100 people (sample A), and the average response is 20% with a standard deviation of 5. You ask another 100 people (sample B) the same question; they have the same average of 20%, but their standard deviation is 10. What can you say about the two different survey results? a) Sample B respondents are much more likely than Sample A respondents to watch the next meteor shower. b) Sample A respondents are much more likely than Sample B respondents to watch the next meteor shower. c) It will probably be raining that day, so who cares. d) There was a wider range of responses in Sample B than in Sample A but the overall likelihood was the same.
There was a wider range of responses in Sample B than in Sample A but the overall likelihood was the same.
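A small sketch with hypothetical responses shows the same idea: two samples can share a mean while one is spread out twice as widely, which is what the larger standard deviation reports.

```python
from statistics import mean, pstdev

# Hypothetical responses in percent, not the actual survey data.
sample_a = [15, 20, 25]
sample_b = [10, 20, 30]

mean_a, mean_b = mean(sample_a), mean(sample_b)
spread_a, spread_b = pstdev(sample_a), pstdev(sample_b)

print(mean_a, mean_b)      # same central value
print(spread_a, spread_b)  # sample B is more spread out
```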
(T/F) One way an operational data store differs from a data warehouse is the recency of their data.
True
(T/F) The overwhelming majority of competitive actions taken by businesses today feature computerized information system support.
True
(T/F) Actionable intelligence is the primary goal of modern-day Business Intelligence (BI) systems vs. historical reporting that characterized Management Information Systems (MIS).
True
(T/F) Because of performance and data quality issues, most experts agree that the federated architecture should supplement data warehouses, not replace them.
True
(T/F) Data warehouse and BI initiatives typically follow a process similar to that used in military intelligence initiatives.
True
(T/F) In addition to deploying business intelligence (BI) systems, companies may also perform other actions to counter business pressures, such as improving customer service and entering business alliances.
True
(T/F) In the Starwood Hotels case, up-to-date data and faster reporting helped hotel managers better manage their occupancy rates.
True
(T/F) Many business users in the 1980s referred to their mainframes as "the black hole," because all the information went into it, but little ever came back and ad hoc real-time querying was virtually impossible.
True
(T/F) The "islands of data" problem in the 1980s describes the phenomenon of unconnected data being stored in numerous locations within an organization.
True
(T/F) The access to data and ability to manipulate data (frequently including real-time data) are key elements of business intelligence (BI) systems.
True
(T/F) The data warehousing maturity model consists of six stages: prenatal, infant, child, teenager, adult, and sage.
True
(T/F) The hub-and-spoke data warehouse model uses a centralized warehouse feeding dependent data marts.
True
(T/F) The use of statistics in baseball by the Oakland Athletics, as described in the Moneyball case study, is an example of the effectiveness of prescriptive analytics.
True
(T/F) Traditional BI systems use a large volume of static data that has been extracted, cleansed, and loaded into a data warehouse to produce reports and analyses.
True
(T/F) Volume, velocity, and variety of data characterize the Big Data paradigm.
True
(T/F) Without middleware, different BI programs cannot easily connect to the data warehouse.
True
(T/F) Data warehouses are subsets of data marts.
False