Exam 3 (Ch 7 & 8) BIV

What are the 5 critical success factors for Big Data analytics? a. A clear _______ _______ b. Strong, committed ______ c. ________ between the business and IT strategy d. ____-____ decision-making culture e. _______ people with the right skills

business need, sponsorship, alignment, fact-based, right

Fog computing addresses the IoT issue by: 1) Proposing fog nodes to process the data ____ to the IoT devices 2) _____ ______ - any device including routers or switches

close, fog nodes

What is "a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service-provider interaction"?

cloud computing

Hadoop's _____ run on inexpensive commodity hardware so projects can scale-out inexpensively

clusters

physical devices, sensors, and actuators where data is produced and recorded

hardware

Social networks vs. IoT are like ________ to ______ vs. _________ to _______

human to human, machine to machine

Which cloud moves between the private and public clouds and is more flexible?

hybrid

(T/F): Fog computing is critical when data needs to be analyzed in less than a second

true

(T/F): Rockwell Automation wanted to use technology to monitor equipment status ahead of time to prevent costly repairs

true

Stream analytics can be used in the ___________ industry to crunch the numbers behind the scenes, understand what customers are really interested in, and provide creative offerings

e-commerce

Where does big data come from?

everywhere

(T/F) In the Salesforce case study, streaming data is used to identify services that customers use most.

false

(T/F) Internet of Things (IoT) is the phenomenon of connecting the virtual world to the Internet

false

(T/F) Social networking Web sites like Facebook, Twitter, and LinkedIn, are not examples of cloud computing

false

___ _______ is the middleman between physical device and data center

fog computing

Allowing Big Data to be processed in memory and distributed across a dedicated set of nodes can solve complex problems in near-real time with highly accurate insights. What is this process called?

in-memory analytics

(T/F) Big data by itself is worthless

true

(T/F) FitBit and Ring are IoT startups

true

The analytics revolution → _________ _________

Cultural transformation

What are the 7 keys to succeed with Big Data? SCVEIGE a. Simplify b. Coexist c. Visualize d. Empower e. _____________ f. Govern g. Evangelize

Integrate

______ ______ ________ is connecting the physical world to the Internet in contrast to the Internet of the people that connects us humans to each other through technology

Internet of Things

The skills of a data scientist are to ___________ Big Data

investigate

_____________ ______ _________ are used to capture, store, analyze, and manage the data linked to a location

geographic information systems

What are 4 building blocks to IoT? a. ________: physical devices, sensors, and actuators where data is produced and recorded b. Connectivity: connected to a network to communicate with each other or applications c. Software backend: manages connected networks with devices and provides data integration d. Applications: data is turned into meaningful information

hardware

What are 4 building blocks to IoT?

hardware, connectivity, software backend, applications

IoT has grown because of a. ________ - smaller, affordable, more powerful b. Creativity - new ___________ and use cases uncovered (asking what if we put sensors here....) c. Availability of ____ tools - maybe?

hardware, innovations, BI

RFID is a generic technology that refers to the use of radio-frequency waves to _________ objects

identify

Main Hadoop Components: ________ ________ initiates and coordinates MapReduce jobs or the processing of the data

job tracker

Top 3 security threats in the cloud? a. Data loss and _________ b. Hardware _________ of equipment c. ________ interfaces

leakage, failure, insecure

What are the 7 keys to succeed with Big Data? SCVEIGE a. Simplify b. Coexist c. ______________ d. Empower e. Integrate f. Govern g. Evangelize

Visualize

Which of Big Data's Vs is "conformity to facts: accuracy, quality, truthfulness, or trustworthiness"?

veracity

Main Hadoop Components: ________ _______ backup to name node

secondary node

What are the Enablers (or high-performance computing) of Big Data Analytics? a. In-memory analytics: solve problems in near real-time b. In-database analytics: speed time to insights c. ______ _______ & _____: processing jobs in a shared, centrally managed pool of IT resources d. Appliances: bring together hardware and software

Grid computing & MPP

________ is an open-source framework for storing and analyzing massive amounts of distributed, unstructured data

Hadoop

Main Hadoop Components: ______ _______ _______ _______ (HDFS): default storage layer in any Hadoop cluster

Hadoop distributed file system

___ is an open-source data warehouse originally developed by Facebook that allows analytics modeling within Hadoop

Hive

What are the 3 main issues managers have to keep in mind when exploring IoT? a. __________ ________: everyone needs to be receptive to link up their systems b. __________ Challenges: connect all the applications seamlessly c. __________: entry points for malicious hackers

Organizational Alignment, Interoperability, security

What was the earliest sensor technology?

RFID

Why is separating the impact of analytics from that of other computerized systems a difficult task? A) Businesses do not typically track the sources of successful projects. B) The trend is toward integrating systems. C) Software tools are not sophisticated enough. D) It is not an organizational priority.

The trend is toward integrating systems

Which of the following is true of data-as-a-Service (DaaS) platforms? A) Knowing where the data resides is critical to the functioning of the platform. B) There are standardized processes for accessing data wherever it is located. C) Business processes can access local data only. D) Data quality happens on each individual platform.

There are standardized processes for accessing data wherever it is located

_____ tags are larger, more expensive, HAVE a power source

active

3 major cloud service providers? Amazon Elastic B________ Microsoft A________ Google A_____ E_______

Beanstalk, Azure, App Engine

Big data + "_____" analytics = value

big

____ ______ has become a popular term to describe the exponential growth, availability, and use of information, both structured and unstructured

big data

Cloud-computing and service-oriented thinking is a service centered around _______ _______ (data, information, and analytics) capabilities as _________

building agile, service

Facebook data generation in a day

500 terabytes

Key players in the IoT ecosystem? a. A__________ b. M______ A_______ c. IBM _______ d. T_________

Amazon AWS, Microsoft Azure, Watson, Teradata

What are 4 building blocks to IoT? a. Hardware: physical devices, sensors, and actuators where data is produced and recorded b. Connectivity: connected to a network to communicate with each other or applications c. Software backend: manages connected networks with devices and provides data integration d. _________: data is turned into meaningful information

Applications

What are 4 building blocks to IoT? a. Hardware: physical devices, sensors, and actuators where data is produced and recorded b. __________: connected to a network to communicate with each other or applications c. Software backend: manages connected networks with devices and provides data integration d. Applications: data is turned into meaningful information

Connectivity

What are the 7 keys to succeed with Big Data? SCVEIGE a. Simplify b. Coexist c. Visualize d. Empower e. Integrate f. Govern g. ____________

Evangelize

Pitney Bowes wanted to analyze data generated from its mailing machines in advance to prevent outages and fix machines before they break down, and used _____ _______

GE Predix

"NoSQL" = ____ _____ SQL

Not only

SilverHook powerboats used IBM Bluemix Platform as a Service _____ to employ IBM SPSS analytics solutions and deliver insights in an understandable way to users and fans

PaaS

Using this model, companies can deploy their software and applications in the cloud so that their customers can use them. A) SaaS B) PaaS C) IaaS D) DaaS

PaaS

________ analytics evaluates every incoming observation against all prior observations, where there is no window size

Perpetual

Which of the following sources is likely to produce Big Data the fastest? a. Order entry clerks b. Cashiers c. RFID tags d. Online customers

RFID tags

In the opening vignette, ________ believes Big Data/IoT can help forecast component faults weeks in advance

Siemens

What are 4 building blocks to IoT? a. Hardware: physical devices, sensors, and actuators where data is produced and recorded b. Connectivity: connected to a network to communicate with each other or applications c. ______ ________: manages connected networks with devices and provides data integration d. Applications: data is turned into meaningful information

Software backend

data is turned into meaningful information for IoT

applications

analytics allows for more ________ ________ to be done by humans and decrease costs for organization & quality of output

cognitive tasks

connected to a network to communicate with each other or applications

connectivity

Which of the Big Data challenges are specific to Big Data Analytics success? a. strong _____ _______ b. right ________ __________

data infrastructure, analytics tools

Which of these is NOT a part of the IoT technology infrastructure? A) hardware B) connectivity C) electrical access D) software

electrical access

_____ _____, like routers or switches, process data and analyze it

fog nodes

Main Hadoop Components: ______ ______ (primary facilitator) provides client info on where the cluster data is stored if a node fails

name node

"Big" depends on the _________'s size

organization

______ tags are small, inexpensive, NO power source

passive

Retail uses _______ (small, inexpensive, no power source) RFID tags with an ________ (electronic product code)

passive, EPC

Which cloud is more secure and operated for a single organization?

private

Which cloud has users subscribe to resources offered by a service provider over the Internet?

public

The in-memory analytics of Big Data analytics allows problems to be solved in near ______ _______

real time

2 enablers of IoT are: 1) _____ 2) _______ devices

sensors, sensing

Main Hadoop Components: ______ ________ are the grunts of any Hadoop cluster

slave nodes

manages connected networks with devices and provides data integration

software backend

3 companies that are AaaS? T_______ IBM ________ _________ S_________

tableau, watson analytics, snowflake

(T/F) AaaS is part of SaaS, PaaS, and IaaS, which means costs and compliance risks are reduced while the productivity of users increases

true

(T/F) Dartmouth-Hitchcock Medical Center wanted to proactively determine the health of people who are likely to fall sick and prevent them from falling ill

true

(T/F) Gulf Air developed a sentiment analysis tool called "Arabic Sentiment Analysis" that analyzed English and Arabic social media posts (based on Cloudera and Hadoop)

true

(T/F) Mankind Pharma used IBM to reduce application implementation time by 98% through the IBM Cloud platform called SoftLayer

true

(T/F) Pay-as-you-go and pay-per-use are cloud computing business models

true

(T/F) Public School in Tacoma, WA used Microsoft Azure Machine Learning to Predict School Dropouts and boost graduation rates

true

(T/F): Chime used Snowflake to connect FB, Google, JSON sources to learn more about customer engagement across mobile, web, and backend platforms

true

(T/F): The order of bytes from small to large: kilobyte, megabyte, gigabyte, terabyte, petabyte, exabyte, zettabyte, yottabyte, brontobyte, gegobyte, googol

true

(T/F) MD Anderson Cancer Center used all the collected clinical oncology data to provide better treatment to patients through IBM Watson

true

Big data is important for: 1. Finding _____ within and outside conventional data sources 2. Serves as basis of innovation, growth, and differentiation

value

Which of Big Data's Vs is "patterns can be detected in the data for insights and better decisions"?

value

Cloud computing is a style of computing which is dynamically scalable and often __________ resources are provided over the Internet

virtualized

______ _______ allows turning millions of data records into informational graphics in just seconds for Big Data analytics

visual analytics

The addition of location components based on lat/long to traditional analytical techniques enables organizations to add a new dimension of "_______" to their traditional business analysis (which before only answered who/what/when/how much)

where

How much information does a googol hold?

10^100

YouTube's data generation in a day

360 terabytes

What are the Enablers (or high-performance computing) of Big Data Analytics? a. In-memory analytics: solve problems in near real-time b. In-database analytics: speed time to insights c. Grid computing & MPP: processing jobs in a shared, centrally managed pool of IT resources d. _________: bring together hardware and software

Appliances

Who are the big players in the Big Data vendor landscape? C___ M____ H_______ O______ G_____ A_______

Cloudera, Microsoft, Hortonworks, Oracle, Google, Amazon

What are the 7 keys to succeed with Big Data? SCVEIGE a. Simplify b. _______________ c. Visualize d. Empower e. Integrate f. Govern g. Evangelize

Coexist

What are the 7 keys to succeed with Big Data? SCVEIGE a. Simplify b. Coexist c. Visualize d. Empower e. Integrate f. ____________ g. Evangelize

Govern

In this model, infrastructure resources like networks, storage, servers, and other computing resources are provided to client companies. A) SaaS B) PaaS C) IaaS D) DaaS

IaaS

What are the Enablers (or high-performance computing) of Big Data Analytics? a. In-memory analytics: solve problems in near real-time b. ___________: speed time to insights c. Grid computing & MPP: processing jobs in a shared, centrally managed pool of IT resources d. Appliances: bring together hardware and software

In-database analytics

What are the Enablers (or high-performance computing) of Big Data Analytics? a. ________: solve problems in near real-time b. In-database analytics: speed time to insights c. Grid computing & MPP: processing jobs in a shared, centrally managed pool of IT resources d. Appliances: bring together hardware and software

In-memory analytics

__________ is a technique popularized by Google that distributes the processing of very large multi-structured data files across a large cluster of machines

MapReduce

Demystifying Facts about Hadoop a. Hadoop consists of multiple products b. Hadoop is open source but available from vendors, too c. Hadoop is an ecosystem, not a single product d. HDFS is a file system, not a DBMS e. Hive resembles _____ but is not standard SQL f. Hadoop and MapReduce are related but not the same g. MapReduce provides control for analytics, not analytics h. Hadoop is about data diversity, not just data volume

SQL

What new geometric data type in Teradata's data warehouse captures geospatial features? A) NAVTEQ B) ST_GEOMETRY C) GIS D) SQL/MM

ST_GEOMETRY

This model allows consumers to use applications and software that run on distant computers in the cloud infrastructure: A) SaaS B) IaaS C) PaaS D) AaaS

SaaS

What are the 7 keys to succeed with Big Data? SCVEIGE a. ____________ b. Coexist c. Visualize d. Empower e. Integrate f. Govern g. Evangelize

Simplify

Which of the following is true about the furtherance of homeland security? A) There is a lessening of privacy issues. B) There is a greater need for oversight. C) The impetus was the need to harvest information related to financial fraud after 2001. D) Most people regard analytic tools as mostly ineffective in increasing security.

There is a greater need for oversight

The grid computing & MPP of Big Data analytics is processing jobs in a shared, _________ managed pool of IT resources

centrally

In the opening vignette, AT explored how the problem of customer _______ could be reduced based on an analysis of the customers' communication problem

churn

Streaming is the analytic process of extracting actionable information from __________ flowing data

continuously

Demystifying Facts about Hadoop a. Hadoop consists of multiple products b. Hadoop is open source but available from vendors, too c. Hadoop is an ecosystem, not a single product d. HDFS is a file system, not a DBMS e. Hive resembles SQL but is not standard SQL f. Hadoop and MapReduce are related but not the same g. MapReduce provides _____ for analytics, not analytics h. Hadoop is about data diversity, not just data volume

control

_____ ______ ________ is a method of capturing, tracking, and analyzing streams of data to detect events (out of normal happenings) of certain types that are worthy of the effort

critical event processing

This process is an enabling technology for stream analytics that extracts novel patterns/knowledge structures from continuous, rapid data records

data stream mining

In the Analyzing Disease Patterns from an Electronic Medical Records Data Warehouse case study, what was the analytic goal? a. determine differences in rates of disease in urban and rural populations b. determine if diseases are accurately diagnosed c. determine probabilities of diseases that are comorbid d. determine differences in rates of disease in males v. females

determine differences in rates of disease in urban and rural populations

Demystifying Facts about Hadoop a. Hadoop consists of multiple products b. Hadoop is open source but available from vendors, too c. Hadoop is an ecosystem, not a single product d. HDFS is a file system, not a DBMS e. Hive resembles SQL but is not standard SQL f. Hadoop and MapReduce are related but not the same g. MapReduce provides control for analytics, not analytics h. Hadoop is about data ________, not just data volume

diversity

Demystifying Facts about Hadoop a. Hadoop consists of multiple products b. Hadoop is open source but available from vendors, too c. Hadoop is an _______, not a single product d. HDFS is a file system, not a DBMS e. Hive resembles SQL but is not standard SQL f. Hadoop and MapReduce are related but not the same g. MapReduce provides control for analytics, not analytics h. Hadoop is about data diversity, not just data volume

ecosystem

Demystifying Facts about Hadoop a. Hadoop consists of multiple products b. Hadoop is open source but available from vendors, too c. Hadoop is an ecosystem, not a single product d. HDFS is a ____ ______, not a DBMS e. Hive resembles SQL but is not standard SQL f. Hadoop and MapReduce are related but not the same g. MapReduce provides control for analytics, not analytics h. Hadoop is about data diversity, not just data volume

file system

The appliances of Big Data analytics bring together ______ and software

hardware

Big data is important for: 1. Finding value within and outside conventional data sources 2. Serves as basis of ___________ , growth, and differentiation

innovation

MapReduce's _______ is colored squares & counting number of squares of each color

input
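
The colored-squares picture of MapReduce can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle, and reduce steps on a single machine, not how Hadoop itself is implemented; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `count_squares`) and the sample colors are assumptions made up for this card.

```python
from collections import defaultdict

def map_phase(squares):
    # Map: emit one (color, 1) pair for every input square.
    return [(color, 1) for color in squares]

def shuffle_phase(pairs):
    # Shuffle: group the emitted values by their key (the color).
    groups = defaultdict(list)
    for color, count in pairs:
        groups[color].append(count)
    return groups

def reduce_phase(groups):
    # Reduce: sum the values for each color to get the final counts.
    return {color: sum(counts) for color, counts in groups.items()}

def count_squares(squares):
    # Hypothetical driver that chains the three phases together.
    return reduce_phase(shuffle_phase(map_phase(squares)))

if __name__ == "__main__":
    print(count_squares(["red", "blue", "red", "green", "blue", "red"]))
```

In a real cluster the map and reduce steps would run in parallel on many nodes; here they run one after another so the data flow is easy to follow.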

The in-database analytics of Big Data analytics speeds up time to _____

insights

In the Alternative Data for Market Analysis or Forecasts case study, satellite data was NOT used for which of the following: a. tracking agricultural estimates b. monitoring activity at factories c. monitoring individual customer patterns d. evaluating retail traffic

monitoring individual customer patterns

Demystifying Facts about Hadoop a. Hadoop consists of _______ products b. Hadoop is open source but available from vendors, too c. Hadoop is an ecosystem, not a single product d. HDFS is a file system, not a DBMS e. Hive resembles SQL but is not standard SQL f. Hadoop and MapReduce are related but not the same g. MapReduce provides control for analytics, not analytics h. Hadoop is about data diversity, not just data volume

multiple

Demystifying Facts about Hadoop a. Hadoop consists of multiple products b. Hadoop is open source but available from vendors, too c. Hadoop is an ecosystem, not a single product d. HDFS is a file system, not a DBMS e. Hive resembles SQL but is not standard SQL f. Hadoop and MapReduce are related but are ____ the same g. MapReduce provides control for analytics, not analytics h. Hadoop is about data diversity, not just data volume

not

Demystifying Facts about Hadoop a. Hadoop consists of multiple products b. Hadoop is ____ _______ but available from vendors, too c. Hadoop is an ecosystem, not a single product d. HDFS is a file system, not a DBMS e. Hive resembles SQL but is not standard SQL f. Hadoop and MapReduce are related but not the same g. MapReduce provides control for analytics, not analytics h. Hadoop is about data diversity, not just data volume

open source

The 3 use cases for data warehousing and RDBMS are: a. Data warehouse ________ b. Integrating data that provides business _____ c. ___________ BI tools

performance, value, interactive

Streaming analytics applies transaction-level logic to ____-_____ observations (e.g., the last 5 seconds)

real-time

The 2 use cases for Big Data and Hadoop are: a. Hadoop as the _____ and refinery b. Hadoop as the ______ _________

repository, active archive

Grouping a string of events together involving a particular customer into a defined time period (5 days over all the channels of communication) is called __________

sessionizing
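
A minimal sketch of sessionizing, assuming each event is a `(customer, timestamp, channel)` tuple and that a session holds every event falling within a fixed period (5 days by default) of that session's first event. The function name `sessionize`, the tuple layout, and that session rule are assumptions for illustration; real tools may define session boundaries differently.

```python
from datetime import datetime, timedelta

def sessionize(events, window=timedelta(days=5)):
    """Group (customer, timestamp, channel) events into per-customer sessions."""
    sessions = {}
    # Process events per customer in chronological order, across all channels.
    for customer, ts, channel in sorted(events, key=lambda e: (e[0], e[1])):
        customer_sessions = sessions.setdefault(customer, [])
        # Timestamp of the first event in this customer's latest session.
        if customer_sessions and ts - customer_sessions[-1][0][0] <= window:
            customer_sessions[-1].append((ts, channel))
        else:
            # Too far from the session start (or no session yet): open a new one.
            customer_sessions.append([(ts, channel)])
    return sessions

if __name__ == "__main__":
    events = [
        ("c1", datetime(2024, 1, 1), "web"),
        ("c1", datetime(2024, 1, 3), "phone"),
        ("c1", datetime(2024, 1, 10), "email"),
    ]
    for session in sessionize(events)["c1"]:
        print([channel for _, channel in session])
```

With the sample events, the web visit and the phone call land in one session, while the email nine days after the first event starts a new one.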

What are the 7 keys to succeed with Big Data? SCVEIGE

simplify, coexist, visualize, empower, integrate, govern, evangelize

The continuous sequence of data elements relates to a ________

stream

(T/F) Current total storage capacity lags behind the digital information being generated in the world

true

(T/F) From massive amounts of high-dimensional location data, algorithms that reduce the dimensionality of the data can be used to uncover trends, meaning, and relationships to eventually produce human-understandable representations

true

(T/F) In Application Case 7.6, Analyzing Disease Patterns from an Electronic Medical Records Data Warehouse, it was found that urban individuals have a higher number of diagnosed disease conditions.

true

(T/F) In the Great Clips case study, the company uses geospatial data to analyze, among other things, the types of haircuts most popular in different geographic locations

true

(T/F) In the Quiznos case, the company employed location-based behavioral targeting to narrow the characteristics of users who were most likely to eat at a quick-service restaurant.

true

(T/F) In the opening vignette, Access Telecom (AT) built a system to better visualize customers who were unhappy before they canceled their service.

true

(T/F) Process efficiency and cost reduction is the top business problem addressed by Big Data Analytics

true

(T/F) Stream analytics is also called data-in-motion analytics and real-time data analytics

true

(T/F) The term "Big Data" is relative as it depends on the size of the using organization.

true

(T/F): Any industry that requires quickly staying on top of business events as they unfold and allowing organizations to address before they become a problem can benefit from stream analytics

true

Data elements in a stream are called ________

tuples

Data flows can be highly inconsistent, with periodic peaks, making data loads hard to manage. What is this feature of Big Data called?

variability

Which of Big Data's Vs is "with increasing velocities and varieties of data, data flows can be highly inconsistent with periodic peaks"?

variability

Which of Big Data's Vs is "data collected in all types of formats"?

variety

Stream is also referred to as ________, or the rapid and continuous streaming of data

velocity

What is the most overlooked characteristic of Big Data?

velocity

Which of Big Data's Vs is "how fast data is produced and how fast the data must be processed to meet the need or demand"?

velocity

What is the most common of Big Data's 3 Vs?

volume

_________ might be considered the most important because although size is relative to the organization, the growth of more and more data defines the need for Big Data

volume

What are the main challenges of Business Analytics? All the V's a. ___________ b. Data ___________ c. __________ capabilities d. Data ___________ e. _________ availability f. Solution _______

volume, integration, processing, governance, skill, costs

______, ______, and ______ are the 3 V's of Big Data

volume, variety, velocity

The difference between streaming and perpetual analytics is the ______ _______

window size
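
The window-size difference can be illustrated with two running averages: one that only looks at the most recent observations and one that considers every observation seen so far. This is a rough sketch; the class names `WindowedMean` and `PerpetualMean` are hypothetical, and the window here counts observations rather than seconds, which is a simplifying assumption.

```python
from collections import deque

class WindowedMean:
    """Streaming-style analytic: only the last `size` observations count."""
    def __init__(self, size):
        # deque(maxlen=...) drops the oldest value automatically when full.
        self.window = deque(maxlen=size)

    def update(self, value):
        self.window.append(value)
        return sum(self.window) / len(self.window)

class PerpetualMean:
    """Perpetual-style analytic: every prior observation counts (no window)."""
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, value):
        self.count += 1
        self.total += value
        return self.total / self.count

if __name__ == "__main__":
    windowed, perpetual = WindowedMean(size=2), PerpetualMean()
    for value in [1, 2, 3]:
        print(value, windowed.update(value), perpetual.update(value))
```

After the third value the windowed mean has already forgotten the first observation, while the perpetual mean still includes it.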

Why are companies like IBM shifting to provide more services and consulting? A) Customers see that significant value can be created with the application of analytics, and need help completing these tasks. B) They can no longer compete in the software market. C) New regulations forced them into this market. D) None of these.

Customers see that significant value can be created with the application of analytics, and need help completing these tasks

This model began with the notion that data quality could happen in a centralized place, cleansing and enriching data and offering it to different systems, applications, or users, irrespective of where they were in the organization, computers, or on the network. A) SaaS B) PaaS C) IaaS D) DaaS

DaaS

What are the 7 keys to succeed with Big Data? SCVEIGE a. Simplify b. Coexist c. Visualize d. _______________________ e. Integrate f. Govern g. Evangelize

Empower

