Big Data Final Exam
In order for data analysis and analytics to offer value, enterprises need to have data management and Big Data governance frameworks.
data management, Big Data governance frameworks
Big Data frameworks are not turn-key solutions.
True
Business architecture provides a means of blueprinting or concretely expressing the design of the business.
True
Contemporary data visualization tools are interactive and can provide both summarized and detailed views of data.
True
Data visualization tools for Big Data solutions generally use in-memory analytical technologies that reduce the latency normally attributed to traditional disk-based data visualization tools.
True
Each shard can independently service reads and writes for the specific subset of data that it is responsible for.
True
NoSQL databases generally do not provide robust built-in security mechanisms.
True
Noise is data that cannot be converted into information and thus has no value, whereas signals have value and lead to meaningful information.
True
Recognizing that external data brings additional context to their internal data allows a corporation to move up the analytic value chain from hindsight to insight with greater ease.
True
Sound processes and sufficient skillsets for those who will be responsible for implementing, customizing, populating and using Big Data solutions are also necessary.
True
The management and analysis of large datasets has been a long-standing problem.
True
The Information and Communications Technology (ICT) developments that have accelerated the pace of Big Data adoption in businesses include all but:
advancement in employee management
Computational approaches, statistical techniques and data warehousing have advanced to the point where they have merged, each bringing their specific techniques and tools that allow the performance of Big Data ___________.
analysis
To address the increasing size of the data warehouse, __________________ are used to handle reporting and data analysis tasks.
analytical databases
Provenance information helps determine the _____________ and _________ of the data and it can be used for auditing purposes.
authenticity, quality
An evaluation of a Big Data analytics _____________ helps decision-makers understand the business resources that will need to be utilized and which business challenges the analysis will tackle.
business case
Data warehouses are heavily used by _________ to run various analytical queries and they usually interface with an OLAP system to support multi-dimensional analytical queries.
business intelligence
Prescriptive analytics involve the use of __________________ and large amounts of internal and external data to simulate outcomes and prescribe the best course of action.
business rules
The Big Data analytics lifecycle begins with the establishment of a __________ for the Big Data project and ends with ensuring that the analytic results are deployed to the organization to generate maximal value.
business case
All the shards collectively represent the _______________ dataset.
complete
During Data Acquisition and Filtering, data is subjected to automated filtering for the removal of _______ data.
corrupt
The Business Case Evaluation stage presented in the text requires that a business case be __________, assessed and approved prior to proceeding with the actual hands-on analysis tasks.
created
______________ provide a holistic view of key business areas.
dashboards
A ________________ is a subset of the data stored in a data warehouse that typically belongs to a department, division or specific line of business.
data mart
Big Data solutions lack the robustness of traditional enterprise solution environments when it comes to ____________ and ________________.
data security, access control
A ________________ is a central, enterprise-wide repository consisting of historical and current data.
data warehouse
Extract Transform Load (ETL) represents the main operation through which _________________ are fed data.
data warehouses
Traditional BI uses ____________ and data marts for reporting and data analysis because they allow complex data analysis queries with multiple joins and aggregations to be issued.
data warehouses
Performance Indicators are often used to ________________________ and _________________________.
demonstrate regulatory compliance, identify business performance problems
Traditional Business Intelligence (BI) primarily utilizes _____________ and _____________ analytics to provide information on historical and current events.
descriptive, diagnostic
A ____________ file system is a file system that can store large files spread across the nodes of a cluster.
distributed
With increasing data volumes, the time to transfer a unit of data can _______ its actual data processing time.
exceed
Business process management applies process excellence techniques to improve corporate _____________.
execution
The need to address disparate types of data is more likely with data from ___________ sources.
external
The datasets and their sources for a Big Data solution could be _________ and/or __________ to the organization.
external/internal
Customer reviews, complaints, and praises from social media sites ______ Big Data analysis algorithms.
feed
Veracity refers to the quality or ______ of data.
fidelity
Redundancy can be exploited to explore interconnected datasets in order to assemble validation parameters and _______ missing valid data.
fill in
Prescriptive Analytics can be used to _______________ or __________________.
gain an advantage OR mitigate a risk
Big Data solutions _____________ and _______________, all of which become assets of the business.
generate data, access data
Each Big Data analytics lifecycle must begin with a well-defined business case that presents a clear understanding of the justification, _______ and ______ of carrying out the analysis.
goals, motivation
External data sources for Big Data processing include ______________ and _______________.
government data sources & commercial data markets
Sharding is the process of ________________ partitioning a large dataset into a collection of smaller, more manageable datasets called shards.
horizontally
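The sharding idea above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database does it: the helper names `shard_for` and `shard_rows`, the sample `customers` rows, and the choice of a CRC32 hash on the key are all assumptions made for the example.

```python
import zlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def shard_rows(rows, key_field, num_shards):
    """Horizontally partition rows: each whole row lands in exactly one shard."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(str(row[key_field]), num_shards)].append(row)
    return dict(shards)


# Hypothetical sample data.
customers = [
    {"id": 101, "name": "Ana"},
    {"id": 102, "name": "Ben"},
    {"id": 103, "name": "Chen"},
    {"id": 104, "name": "Dee"},
]

shards = shard_rows(customers, "id", num_shards=3)

# All the shards together make up the complete dataset.
recombined = [row for shard in shards.values() for row in shard]
```

Because every row is routed by its key alone, a read or write for a given `id` only needs to touch the one shard returned by `shard_for`.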
The data processed by big data solutions can be ______________ or ________________.
human-generated OR machine-generated
Business Intelligence (BI) enables an organization to gain insight into the performance of an enterprise by analyzing data generated by its _______________ and _____________________.
information systems, business processes
Data variety brings challenges for enterprises in terms of data ________, transformation, processing, and __________.
integration, storage
Companies need to expand their Business Intelligence activities beyond looking back on ________ extracted information.
internal
Datasets continue to become
larger, more diverse, more complex and streaming-centric
The Utilization of Analysis Results stage is dedicated to determining how and where processed analysis data can be further _________.
leveraged
Big Data adds newer techniques that leverage computational ___________ and approaches to execute analytic algorithms.
resources
____________ reporting is a process that involves manually processing data to produce custom-made reports.
Ad-hoc
________________ data analysis is a deductive approach in which the cause of the phenomenon being investigated is proposed beforehand.
Confirmatory
The presentation of data on ______________ is graphical in nature, using bar charts, pie charts and gauges.
Dashboards
________________ data analysis is an inductive approach that is closely associated with data mining in that data is explored through analysis to develop an understanding of the cause of the phenomenon.
Exploratory
____________________ is the process of loading data from a source system into a target system.
Extract Transform Load (ETL)
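As a rough illustration of the three ETL steps, the sketch below uses only the Python standard library. The CSV text, the `orders` table, and the function names `extract`, `transform` and `load` are assumptions made for the example; an in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical source data with inconsistent spacing, casing and a missing value.
SOURCE_CSV = """order_id,amount,country
1, 19.99 ,us
2,5.00,DE
3,,us
"""


def extract(text):
    """Extract: read raw rows from the source system (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount, fix types and casing."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), float(amount), row["country"].strip().upper())
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target system."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)

totals = conn.execute(
    "SELECT country, SUM(amount) FROM orders GROUP BY country ORDER BY country"
).fetchall()
```

After the load, the target table can answer an aggregate query such as `totals` without touching the source system again.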
Analyzing separate datasets that contain seemingly benign data cannot reveal private information when the datasets are analyzed jointly.
False
Big data solutions often leverage open-source software that executes on commodity hardware, which does not affect processing cost.
False
Each node in the cluster has shared resources.
False
Issues related to tracking the provenance of a dataset from its procurement to its utilization is often a routine requirement for organizations.
False
OLAP systems store historical data that is aggregated and normalized to support fast reporting capability.
False
Results from the Data Visualization stage could be represented in different ways, which can reinforce the interpretation of the results.
False
Big Data requires the meeting of several processes to include _________________________.
Internet of Everything
The convergence of advancements in information and communications technology, marketplace dynamics, business architecture and business process management all contribute to the opportunity of what is now known as the ________________________________.
Internet of Everything
_______________________________ can serve as a data source and a data sink that is capable of receiving data.
Online Analytical Processing (OLAP)
_______________________________ is a software system that is used for processing data analysis queries.
Online Analytical Processing (OLAP)
_______________________________ is a software system that processes transaction-oriented data.
Online Transaction Processing (OLTP)
Big Data addresses distinct requirements such as
Processing of large amounts of unstructured data & Combining of multiple unrelated datasets
________________ refers to information about the source of the data and how it has been processed.
Provenance
When the combination of Big Data analytic results and goal-driven behavior are used together, process execution can become _________ to the marketplace and responsive to environmental conditions.
adaptive
Business architecture includes ________ from abstract concepts like business mission, vision, strategy and goals to more concrete ones like business service, organizational structure, key performance indicators and application services.
linkage
A file system provides a ____________ view of the data stored on the storage device.
logical
Outdated, invalid, or poorly identified data will result in low-quality input which, regardless of how good the Big Data solution is, will continue to produce __________ results.
low-quality
Depending on the nature of the analysis problem being addressed, it is possible for the analysis results to produce __________ that encapsulate new insights and understandings about the nature of the patterns and relationships that exist within the data that was analyzed.
models
Models used for predictive analytics have implicit dependencies on the conditions under which the past events occurred. If the underlying conditions change, then the models used for the predictions
need to be updated
Advanced data visualization tools for Big Data solutions incorporate ____________ and ______________ data analytics and data transformation features which eliminate the need for data pre-processing methods.
predictive, prescriptive
Big Data BI comprises both _____________ and ______________ analytics to facilitate the development of an enterprise-wide understanding of business performance.
prescriptive, predictive
Managing the __________ of constituents whose data is being handled or whose identity is revealed by analytic processes must be planned for.
privacy
Given the nature of Big Data and its analytic power, there are many issues that need to be considered and planned for in the beginning. For example, with the adoption of any new technology, the means to ________ it in a way that conforms to existing corporate standards needs to be addressed.
secure
Big Data BI builds upon traditional BI by acting on the cleansed, consolidated enterprise-wide data in the data warehouse and combining it with _____________ and _____________ data sources.
semi-structured, unstructured
A cluster is a tightly coupled collection of _______ or _________.
servers, nodes
A cluster can execute a task by splitting it into small pieces and distributing their execution onto different computers that belong to the cluster.
small
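The split-and-distribute idea can be sketched on a single machine. In this minimal example, worker threads from Python's `concurrent.futures` stand in for the computers of a cluster, which is an assumption made purely for illustration; `split`, `count_words` and `run_on_cluster` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, num_pieces):
    """Split a list into at most num_pieces contiguous pieces."""
    if not items:
        return []
    size = -(-len(items) // num_pieces)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(lines):
    """The small piece of work that each node performs on its share."""
    return sum(len(line.split()) for line in lines)


def run_on_cluster(lines, num_nodes=4):
    """Distribute the pieces to the workers, then combine the partial results."""
    pieces = split(lines, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_counts = list(pool.map(count_words, pieces))
    return sum(partial_counts)


# Hypothetical input: ten lines of three words each.
lines = ["big data analytics"] * 10
total = run_on_cluster(lines, num_nodes=4)
```

The combined result is the same as counting the words in one pass; only the execution is divided.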
Key Performance Indicators (KPIs) can be aligned with Critical Success Factors (CSFs) at the _____________, which in turn help measure progress being made toward the achievement of strategic goals and objectives.
strategic layer
NoSQL databases also support query languages other than Structured Query Language (SQL) because SQL was designed to query _______________ data stored within a relational database.
structured
The accepted view that a business operates as a layered system defines the middle layer as the
tactical or managerial layer that seeks to steer the organization in alignment with the strategy
The Data Visualization stage is dedicated to using data visualization _____________ and _____________ to graphically communicate the analysis results for effective interpretation by business users.
tools, techniques
Approaches that achieve near-realtime results often process _______________ data as it arrives and combine it with previous summarized batch-processed data.
transactional
A file system presents the data as a _____ structure of directories and files as pictured.
tree
The Data Validation and Cleansing stage is dedicated to establishing often complex ________ rules and removing any known invalid data.
validation
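A small sketch of what rule-based validation and cleansing might look like. The three rules, the field names and the sample records are assumptions chosen for the example; real validation rules depend on the dataset and the analysis objectives.

```python
# Each rule is a function that returns True when a record passes the check.
VALIDATION_RULES = [
    lambda r: isinstance(r.get("id"), int),
    lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
    lambda r: "@" in str(r.get("email", "")),
]


def is_valid(record):
    """A record is valid only if it passes every rule."""
    return all(rule(record) for rule in VALIDATION_RULES)


def cleanse(records):
    """Separate valid records from known invalid ones."""
    valid = [r for r in records if is_valid(r)]
    invalid = [r for r in records if not is_valid(r)]
    return valid, invalid


# Hypothetical input records.
records = [
    {"id": 1, "age": 34, "email": "ana@example.com"},
    {"id": 2, "age": -5, "email": "ben@example.com"},  # fails the age rule
    {"id": 3, "age": 51, "email": "not-an-email"},     # fails the email rule
    {"id": "4", "age": 28, "email": "dee@example.com"},  # fails the id rule
]

valid, invalid = cleanse(records)
```

Keeping the invalid records in a separate list, rather than discarding them, leaves room to inspect them later.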
During Data Acquisition and Filtering, data is subjected to automated filtering for the removal of data that has been deemed to have no ________ to the analysis objectives.
value
Which is not part of the Big Data characteristics?
variable
Data _________ refers to the multiple formats and types of data that need to be supported by Big Data solutions.
variety
Identifying a wider __________ of data sources may increase the probability of finding hidden patterns and correlations.
variety
In order to qualify as a Big Data problem, a business problem needs to be directly related to one or more of the Big Data characteristics of volume, ___________ or ____________.
variety, velocity
The anticipated volume of data that is processed by Big Data solutions is substantial and ever-growing.
volume
The greater the _________ and _____________ of data that can be supplied, the higher the chances are of finding hidden insights from patterns.
volume & variety
IoE-specific companies can leverage Big Data to establish and optimize ___________ and offer them to third parties as outsourced business processes.
workflows