Chapter 7 & 8


Why are some portions of tape backup workloads being redirected to Hadoop clusters today?

Retrieval from tape is difficult: data stored offline takes a long time to retrieve, tape formats change over time, and tapes are prone to data loss. There is value in keeping historical data online and accessible.

In what ways can communications companies use geospatial analysis to harness their data effectively?

Communications companies often generate massive amounts of data every day. The ability to analyze that data quickly with a high level of location-specific granularity can help identify customer churn and formulate location-specific strategies for increasing operational efficiency, quality of service, and revenue.

Why are companies like IBM shifting to provide more services and consulting?

Customers see that significant value can be created with the application of analytics, and need help completing these tasks.

Server virtualization is the pooling of physical storage from multiple network storage devices into a single storage device.

False

________ is/are used to capture, store, analyze, and manage data linked to a location using integrated sensor technologies, global positioning systems installed in smartphones, or through RFID deployments in the retail and healthcare industries.

GIS

How does Hadoop work?

It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on multiple computers.
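The split-and-process-in-parallel idea in this answer can be sketched on a single machine. This is only an illustration, not how Hadoop is implemented: the function names (`split_into_chunks`, `process_chunk`, `parallel_total`) are hypothetical, threads stand in for cluster nodes, and summing numbers stands in for the real analysis.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, n_chunks):
    """Split a list into at most n_chunks contiguous, roughly equal parts."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def process_chunk(chunk):
    """Stand-in for the work one node would do: sum the values in its part."""
    return sum(chunk)


def parallel_total(records, n_workers=4):
    """Process every chunk concurrently, then combine the partial results."""
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)
```

Each worker sees only its own part of the data, and the final answer is assembled from the partial results, which is the same divide-and-combine pattern the answer describes.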

In the world of Big Data, ________ aids organizations in processing and analyzing large volumes of multistructured data. Examples include indexing and search, graph analysis, etc.

MapReduce

What new geometric data type in Teradata's data warehouse captures geospatial features?

ST_GEOMETRY

What are the differences between stream analytics and perpetual analytics? When would you use one or the other?

Stream analytics: Applies transaction-level logic to real-time observations. The rules applied to these observations take previous observations into account as long as they occurred within a prescribed window, and that window must have a defined size. Perpetual analytics: Evaluates every incoming observation against all prior observations; there is no window size. I would use stream analytics when the transactional volume is high and the time to decision is too short, favoring nonpersistence and small window sizes. I would use perpetual analytics when the mission is critical and transaction volumes can be managed in real time.
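The contrast between a fixed window and an unbounded history can be shown with a small sketch. The rule used here (flag an observation that exceeds a multiple of the mean of earlier observations) and the names `windowed_alerts` and `perpetual_alerts` are assumptions chosen for illustration, not something taken from the textbook.

```python
from collections import deque


def windowed_alerts(observations, window_size, threshold):
    """Stream style: judge each value against only the last window_size values."""
    window = deque(maxlen=window_size)  # older values fall out automatically
    alerts = []
    for x in observations:
        if window and x > threshold * (sum(window) / len(window)):
            alerts.append(x)
        window.append(x)
    return alerts


def perpetual_alerts(observations, threshold):
    """Perpetual style: judge each value against every prior value."""
    history = []  # grows without bound; nothing is ever discarded
    alerts = []
    for x in observations:
        if history and x > threshold * (sum(history) / len(history)):
            alerts.append(x)
        history.append(x)
    return alerts
```

The windowed version keeps a bounded amount of state, which suits high volumes and short decision times. The perpetual version must persist everything it has seen, so its cost grows with the history.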

Why is separating the impact of analytics from that of other computerized systems a difficult task?

The trend is toward integrating systems

Which of the following is true of data-as-a-Service (DaaS) platforms?

There are standardized processes for accessing data wherever it is located.

Describe your understanding of the emerging term people analytics. Are there any privacy issues associated with the application?

• Applications such as sensor-embedded badges that employees wear to track their movement and predict behavior have given rise to the term people analytics. This application area combines organizational IT impact, Big Data, and sensors, and it raises privacy concerns. One company, Sociometric Solutions, has reported several such applications of its sensor-embedded badges.
• People analytics creates major privacy issues. Should companies be able to monitor their employees this intrusively? Sociometric has reported that its analytics are reported to clients only on an aggregate basis; no individual user data is shared. The company has noted that some employers want individual employee data, but its contract explicitly prohibits this type of sharing. In any case, sensors are leading to another level of surveillance and analytics, which poses interesting privacy, legal, and ethical questions.

What is a data scientist and what does the job involve?

A data scientist is a role or job frequently associated with Big Data or data science. In a very short time it has become one of the most sought-after roles in the marketplace. Currently, data scientists' most basic skill is the ability to write code (in the latest Big Data languages and platforms). A more enduring skill will be the ability to communicate in a language that all their stakeholders understand, and to demonstrate the special skills involved in storytelling with data, whether verbally, visually, or ideally both. Data scientists use a combination of business and technical skills to investigate Big Data, looking for ways to improve current business analytics practices (from descriptive to predictive and prescriptive) and hence to improve decisions for new business opportunities.

This model began with the notion that data quality could happen in a centralized place, cleansing and enriching data and offering it to different systems, applications, or users, irrespective of where they were in the organization, computers, or on the network.

DaaS

Big Data simplifies data governance issues, especially for global firms.

False

Connectivity is not a part of the IoT infrastructure.

False

For cloud computing to be successful, users must have knowledge and experience in the control of the technology infrastructures.

False

IaaS helps provide faster information, but provides information only to managers in an organization.

False

In the Great Clips case study, the company uses geospatial data to analyze, among other things, the types of haircuts most popular in different geographic locations.

False

In the Salesforce case study, streaming data is used to identify services that customers use most.

False

In the classification of location-based analytic applications, examining geographic site locations falls in the consumer-oriented category.

False

SaaS combines aspects of cloud computing with Big Data analytics and empowers data scientists and analysts by allowing them to access centrally managed information data sets.

False

Siemens utilizes data sensors to track failure rates in household appliances.

False

Users definitely own their biometric data.

False

Web-based e-mail services such as Google's Gmail are not examples of cloud computing.

False

While cloud services are useful for small and midsize analytic applications, they are still limited in their ability to handle Big Data applications.

False

A critical emerging trend in analytics is the incorporation of location data. ________ data is the static location data used by these location-based analytic applications.

Geospatial

In this model, infrastructure resources like networks, storage, servers, and other computing resources are provided to client companies.

IaaS

________ provides resources like networks, storage, servers, and other computing resources to client companies.

IaaS

________ speeds time to insights and enables better data governance by performing data integration and analytic functions inside the database.

In-database analytics

________ of data provides business value; pulling of data from multiple subject areas and numerous applications into one repository is the raison d'être for data warehouses.

Integration

What is Internet of Things (IoT) and how is it used?

IoT is the phenomenon of connecting the physical world to the Internet. In IoT, physical devices are connected to sensors that collect data on the operation, location, and state of a device. This data is processed using various analytics techniques for monitoring the device remotely from a central office or for predicting any upcoming faults in the device.

Data and text mining is a promising application of AaaS. What additional capabilities can AaaS bring to the analytic world?

It can also be used for large-scale optimization, highly-complex multi-criteria decision problems, and distributed simulation models. These prescriptive analytics require highly capable systems that can only be realized using service-based collaborative systems that can utilize large-scale computational resources.

How do the traditional location-based analytic techniques using geocoding of organizational locations and consumers hamper the organizations in understanding "true location-based" impacts?

Locations based on postal codes offer an aggregate view of a large geographic area. This poor granularity may not be able to pinpoint the growth opportunities within a region. The location of the target customers can change rapidly. An organization's promotional campaigns might not target the right customers.

Define MapReduce.

MapReduce is a programming model that is used to process and generate big datasets with a parallel distributed algorithm.
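The definition above can be made concrete with a word count, the usual introductory MapReduce example. This is a single-process sketch of the programming model only: the function names are hypothetical, and a real framework would run the map and reduce steps on many machines in parallel and handle the grouping step itself.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map over every document, group by word, then reduce each group."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each call to `map_phase` depends on only one document and each call to `reduce_phase` on only one key, both steps can be distributed across nodes without coordination between them.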

All of the following statements about MapReduce are true EXCEPT

MapReduce runs without fault tolerance

________ is the splitting of available bandwidth into channels.

Network virtualization

What is NoSQL as used for Big Data? Describe its major downsides.

NoSQL is a newer class of databases that store and process unstructured data that is not in a tabular format. NoSQL databases offer high performance and are highly scalable. The downside is that they trade ACID compliance for that performance and scalability.

Using this model, companies can deploy their software and applications in the cloud so that their customers can use them.

PaaS

Which of the following allows companies to deploy their software and applications in the cloud so that their customers can use them?

PaaS

________ is a generic technology that refers to the use of radio-frequency waves to identify objects.

RFID

This model allows consumers to use applications and software that run on distant computers in the cloud infrastructure.

SaaS

________ is the masking of physical servers from server users.

Server virtualization

How does Siemens use sensor data to help monitor equipment on trains?

Siemens uses an IoT model and sensors attached to several key components of trains and other railway equipment to help evaluate its current working condition, and predict the need for future repair. By using a wide variety of different types of sensors, the company is able to evaluate a multitude of conditions. This evaluation can be on the train itself, or within the supporting infrastructure. By using analytics to monitor these sensors, the company is able to predict the need for repair prior to component failure.

In the opening vignette, why was the Telecom company so concerned about the loss of customers, if customer churn is common in that industry?

The loss was occurring at a very high rate. The company had been losing customers faster than it was gaining them. The loss of customers was traced back to customer service interactions.

Which of the following is true about the furtherance of homeland security?

There is a greater need for oversight

Current total storage capacity lags behind the digital information being generated in the world.

True

Data as a service began with the notion that data quality could happen in a centralized place, cleansing and enriching data and offering it to different systems, applications, or users, irrespective of where they were in the organization, computers, or on the network.

True

From massive amounts of high-dimensional location data, algorithms that reduce the dimensionality of the data can be used to uncover trends, meaning, and relationships to eventually produce human-understandable representations.

True

Hadoop was designed to handle petabytes and exabytes of data distributed over multiple nodes in parallel.

True

If you have many flexible programming languages running in parallel, Hadoop is preferable to a data warehouse.

True

In Application Case 7.6, Analyzing Disease Patterns from an Electronic Medical Records Data Warehouse, it was found that urban individuals have a higher number of diagnosed disease conditions.

True

In the Quiznos case, the company employed location-based behavioral targeting to narrow the characteristics of users who were most likely to eat at a quick-service restaurant.

True

Internet of Things (IoT) is the phenomenon of connecting the physical world to the Internet.

True

MapReduce can be easily understood by skilled programmers due to its procedural nature.

True

One reason the IoT is growing exponentially is because hardware is smaller and more affordable.

True

RFID can be used in supply chains to manage product quality.

True

Satellite data can be used to evaluate the activity at retail locations as a source of alternative data.

True

Service-oriented DSS solutions generally offer individual or bundled services to the user as a service.

True

Social media mentions can be used to chart and predict flu outbreaks.

True

Social networking Web sites like Facebook, Twitter, and LinkedIn, are also examples of cloud computing.

True

The term "Big Data" is relative as it depends on the size of the using organization.

True

The term cloud computing originates from a reference to the Internet as a "cloud" and represents an evolution of all of the previously shared/centralized computing trends.

True

There is a clear difference between the type of information support provided by influential users versus the others on Twitter.

True

With RFID tags, a(n) ________ tag has a battery on board to energize it.

active

In-motion ________ is often overlooked today in the world of BI and Big Data.

analytics

The portion of the IoT technology infrastructure that focuses on controlling what and how information is captured is

applications

Pokémon GO is an example of a location-sensing ________ reality-based game.

augmented

As volumes of Big Data arrive from multiple sources such as sensors, machines, social media, and clickstream interactions, the first step is to ________ all the data reliably and cost effectively.

capture

IaaS, AaaS and other ________-based offerings allow the rapid diffusion of advanced analysis tools among users, without significant investment in technology acquisition.

cloud

The portion of the IoT technology infrastructure that focuses on how to transmit data is

connectivity

GPS Navigation is an example of which kind of location-based analytics?

consumer-oriented geospatial static approach

HBase is a nonrelational ________ that allows for low-latency, quick lookups in Hadoop.

database

Analytics can change the way in which many ________ are made by managers and can consequently change their jobs.

decisions

Which of these is NOT a part of the IoT technology infrastructure?

electrical access

Smartbin has developed trash containers that include sensors to detect

fill levels

Which Big Data approach promotes efficiency, lower cost, and better performance by processing jobs in a shared, centrally managed pool of IT resources?

grid computing

Today, most smartphones are equipped with various instruments to measure jerk, orientation, and sense motion. One of these instruments is an accelerometer, and the other is a(n)

gyroscope

The portion of the IoT technology infrastructure that focuses on the sensors themselves is

hardware

Allowing Big Data to be processed in memory and distributed across a dedicated set of nodes can solve complex problems in near-real time with highly accurate insights. What is this process called?

in-memory analytics

By using ________, businesses can collect and analyze data to discern large-scale patterns of movement and identify distinct classes of behaviors in specific contexts.

location-enabled services

Location information from ________ phones can be used to create profiles of user behavior and movement.

mobile

What kind of location-based analytics is a real-time marketing promotion?

organization-oriented location-based dynamic approach

With RFID tags, a(n) ________ tag receives energy from the electromagnetic field created by the interrogator.

passive

For individual decision makers, ________ values constitute a major factor in the issue of ethical decision making.

personal

In general, ________ is the right to be left alone and the right to be free from unreasonable personal intrusion.

privacy

Predictive analytics is beginning to enable development of software that is directly used by a consumer. One key concern in employing these technologies is the loss of ________.

privacy

A(n) ________ is operated solely for a single organization having a mission critical workload and security concerns.

private cloud

In a(n) ________ the subscriber uses the resources offered by service providers over the Internet.

public cloud

Services that let consumers permanently enter a profile of information along with a password and use this information repeatedly to access services at multiple sites are called

single-sign-on facilities

In the energy industry, ________ grids are one of the most impactful applications of stream analytics.

smart

The portion of the IoT technology infrastructure that focuses on how to manage incoming data and analyze it is

software backend

Traditional data warehouses have not been able to keep up with

the variety and complexity of data

A job ________ is a node in a Hadoop cluster that initiates and coordinates MapReduce jobs, or the processing of the data.

tracker

A major structural change that can occur when analytics are introduced into an organization is the creation of new organizational ________.

units

Under which of the following requirements would it be more appropriate to use Hadoop over a data warehouse?

unrestricted, ungoverned sandbox explorations

What is the Hadoop Distributed File System (HDFS) designed to handle?

unstructured and semistructured non-relational data

Data flows can be highly inconsistent, with periodic peaks, making data loads hard to manage. What is this feature of Big Data called?

variability

AaaS in the cloud has economies of scale and scope by providing many ________ analytical applications with better scalability and higher cost savings.

virtual

What is cloud computing? What is Amazon's general approach to the cloud computing services it provides?

• Wikipedia defines cloud computing as "a style of computing in which dynamically scalable and often virtualized resources are provided over the Internet. Users need not have knowledge of, experience in, or control over the technology infrastructures in the cloud that supports them."
• Amazon.com has developed an impressive technology infrastructure for e-commerce as well as for business intelligence, customer relationship management, and supply chain management. It has built major data centers to manage its own operations. Through Amazon.com's cloud services, however, many other companies can use these same facilities to gain the advantages of these technologies without making a similar investment. As with other cloud-computing services, a user can subscribe to any of the facilities on a pay-as-you-go basis. This model of letting someone else own the hardware and software while using the facilities on a pay-per-use basis is the cornerstone of cloud computing.

