MISY 5360 Final Exam Review
The BI objects used to extract and stage data from the source systems are called:
DataSources
Which Big Data approach promotes efficiency, lower cost, and better performance by processing jobs in a shared, centrally managed pool of IT resources?
Grid computing
________ speeds time to insights and enables better data governance by performing data integration and analytic functions inside the database.
In-database analytics
A directory that stores all related InfoObjects within the same business context is called:
InfoObject Catalog
The physical representation of the Star Schema is called:
InfoCube
_____________ represents the structure that allows data to be stored in a Business Intelligence system.
InfoCube
Raw data is loaded into the Persistent Staging Area by using a(n) _________.
InfoPackage
General layers in a Data Warehouse Architecture include all the following except _______?
InfoObjects
SAP BW provides open interfaces to connect with 3rd party DataSources through the use of_________.
Staging BAPIs
Which of the following is not true with regards to an InfoObject?
The choice beginning with "Info Objects…"
Which of the following is not relevant to Characteristics?
Is contained in the fact table
With regard to Persistent Staging Areas, which of the following does not apply?
The choice beginning with "Summaries…"
For low latency, interactive reports, a data warehouse is preferable to Hadoop.
TRUE
Hadoop was designed to handle petabytes and exabytes of data distributed over multiple nodes in parallel.
TRUE
If you have many flexible programming languages running in parallel, Hadoop is preferable to a data warehouse.
TRUE
It is important that Big Data and self-service business intelligence go hand in hand to get maximum value from analytics.
TRUE
Many analytics tools are too complex for the average user, and this is one justification for Big Data.
TRUE
MapReduce can be easily understood by skilled programmers due to its procedural nature.
TRUE
The term "Big Data" is relative as it depends on the size of the using organization.
TRUE
Possible components of master data are:
Text, attributes, hierarchies, all of above
Allowing Big Data to be processed in memory and distributed across a dedicated set of nodes can solve complex problems in near real time with highly accurate insights. What is this process called?
in-memory analytics
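The idea behind in-memory analytics can be sketched in plain Python: the full dataset is held in RAM, partitioned across a set of "nodes" (here just lists, standing in for the dedicated nodes in the question), and each partition is aggregated locally before a final combine step. All names and the partitioning scheme are illustrative, not from any specific product.

```python
def partition(data, n_nodes):
    """Split the dataset round-robin across n_nodes in-memory partitions."""
    parts = [[] for _ in range(n_nodes)]
    for i, row in enumerate(data):
        parts[i % n_nodes].append(row)
    return parts

def local_sum(part):
    """Each node aggregates its own partition entirely in memory."""
    return sum(part)

def in_memory_total(data, n_nodes=4):
    """Combine the per-node partial aggregates into the final answer."""
    return sum(local_sum(p) for p in partition(data, n_nodes))

print(in_memory_total(range(1, 101)))  # → 5050
```

Because every partition stays in memory, each node's aggregation avoids disk I/O, which is where the near-real-time speedup in the question comes from.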
Characteristics are one type of:
InfoObject
In a Hadoop "stack," what node periodically replicates and stores data from the Name Node should it fail?
secondary node
Companies with the largest revenues from Big Data tend to be
the largest computer and IT services firms.
Traditional data warehouses have not been able to keep up with
the variety and complexity of data.
A job ________ is a node in a Hadoop cluster that initiates and coordinates MapReduce jobs, or the processing of the data.
tracker
Under which of the following requirements would it be more appropriate to use Hadoop over a data warehouse?
unrestricted, ungoverned sandbox explorations
What is the Hadoop Distributed File System (HDFS) designed to handle?
unstructured and semistructured non-relational data
The ________ of Big Data is its potential to contain more useful patterns and interesting anomalies than "small" data.
value proposition
Data flows can be highly inconsistent, with periodic peaks, making data loads hard to manage. What is this feature of Big Data called?
variability
The master data:
Represents dimensional data
________ refers to the conformity to facts: accuracy, quality, truthfulness, or trustworthiness of the data.
Veracity
Hadoop is primarily a(n) ________ file system and lacks capabilities we'd associate with a DBMS, such as indexing, random access to data, and support for SQL.
distributed
Data ________ or pulling of data from multiple subject areas and numerous applications into one repository is the raison d'être for data warehouses.
integration
Most Big Data is generated automatically by ________.
machines
Relevant information regarding InfoProviders includes all the following except ________.
Incorrect choices begin with "may be used to…", "MultiProviders…", and "InfoProviders with…"
Which of the following sources is likely to produce Big Data the fastest?
RFID tags
A newly popular unit of data in the Big Data era is the petabyte (PB), which is
10^15 bytes
The data types associated with Key Figures are:
All of the above
Which of the following is a type of InfoProvider (Physical or Virtual)
All of the above
The most important InfoProviders include:
All of the above
InfoCubes use the Star Database Schema; which fact below is not true in relation to the Star Schema?
The choice beginning with "Allows…"
What is Big Data's relationship to the cloud?
Amazon and Google have working Hadoop cloud offerings.
________ bring together hardware and software in a physical unit that is not only fast but also scalable on an as-needed basis.
Appliances
In Star Schema, the measures such as Quantities, Revenues, Costs, and Taxes are included in the:
Fact table
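The fact-table answer above can be illustrated with a minimal star schema sketch in plain Python: the fact table holds the measures (quantity, revenue) plus foreign keys into dimension tables, which hold the descriptive characteristics. Table contents and field names are invented for the example.

```python
# Dimension tables: descriptive characteristics keyed by surrogate ID.
product_dim = {1: {"name": "Widget"}, 2: {"name": "Gadget"}}
region_dim = {10: {"region": "East"}, 20: {"region": "West"}}

# Fact table: measures plus foreign keys into the dimensions.
fact_sales = [
    {"product_id": 1, "region_id": 10, "quantity": 3, "revenue": 30.0},
    {"product_id": 2, "region_id": 10, "quantity": 1, "revenue": 25.0},
    {"product_id": 1, "region_id": 20, "quantity": 2, "revenue": 20.0},
]

def revenue_by_region(facts, regions):
    """Join fact rows to the region dimension and sum the revenue measure."""
    totals = {}
    for row in facts:
        region = regions[row["region_id"]]["region"]
        totals[region] = totals.get(region, 0.0) + row["revenue"]
    return totals

print(revenue_by_region(fact_sales, region_dim))  # {'East': 55.0, 'West': 20.0}
```

Note that only the fact table carries numeric measures; the dimension tables carry the characteristics used for slicing, which is why "measures are in the fact table" is the correct answer.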
Big Data simplifies data governance issues, especially for global firms.
FALSE
Big Data uses commodity hardware, which is expensive, specialized hardware that is custom built for a client or application.
FALSE
Hadoop and MapReduce require each other to work.
FALSE
In most cases, Hadoop is used to replace data warehouses.
FALSE
In a Hadoop "stack," what is a slave node?
a node where data is stored and processed
Data targets can be:
All of the above
In-motion ________ is often overlooked today in the world of BI and Big Data.
analytics
As volumes of Big Data arrive from multiple sources such as sensors, machines, social media, and clickstream interactions, the first step is to ________ all the data reliably and cost effectively.
capture
How does Hadoop work?
It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on multiple computers.
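The split-then-process-in-parallel idea in the answer above is the MapReduce pattern. A toy word count shows the three phases: each input split is mapped independently (as parallel workers on separate nodes would do), intermediate pairs are shuffled by key, and a reduce step combines them. This is a conceptual sketch, not Hadoop's actual API.

```python
from collections import defaultdict

def map_phase(chunk):
    """Emit (word, 1) pairs for one input split."""
    return [(word, 1) for word in chunk.split()]

def shuffle(pairs):
    """Group intermediate values by key, as the shuffle phase does."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Combine each key's values into a final count."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(splits):
    pairs = []
    for chunk in splits:  # each chunk could run on a separate node in parallel
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))

print(word_count(["big data big", "data big"]))  # {'big': 3, 'data': 2}
```

Because the map calls share no state, they can run simultaneously on many machines, which is what lets Hadoop analyze multiple parts of Big Data at the same time.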
Any kind of numeric information used to measure a business process is called:
Key Figures
In the world of Big Data, ________ aids organizations in processing and analyzing large volumes of multi-structured data. Examples include indexing and search, graph analysis, etc.
MapReduce
All of the following statements about MapReduce are true EXCEPT
MapReduce runs without fault tolerance.
All of the following are true in regards to an InfoProvider except _________?
Choices beginning with "May store data…", "Storage…", and "May be used to access data"
The ________ Node in a Hadoop cluster provides client information on where in the cluster particular data is stored and if any nodes fail.
Name
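The Name Node's role above can be modeled with a toy metadata map: it records which Data Nodes hold replicas of each file's blocks, so it can tell clients where data lives and track which replicas survive a node failure. The paths, block IDs, node names, and replication factor here are illustrative only.

```python
# Toy Name Node metadata: file -> block -> list of Data Nodes holding replicas.
block_map = {
    "/logs/day1.txt": {
        "blk_1": ["node-a", "node-b"],
        "blk_2": ["node-b", "node-c"],
    },
}

def locate(path):
    """Return the block-to-node locations the Name Node would give a client."""
    return block_map[path]

def surviving_replicas(path, failed_node):
    """Drop replicas on a failed node, as the Name Node tracks after failure."""
    return {blk: [n for n in nodes if n != failed_node]
            for blk, nodes in block_map[path].items()}

print(surviving_replicas("/logs/day1.txt", "node-b"))
# {'blk_1': ['node-a'], 'blk_2': ['node-c']}
```

Since every block has replicas on more than one Data Node, losing one node leaves at least one surviving copy of each block for the Name Node to report.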