Chapter 6


Variety

Abby analyzes data from tweets and other social media sites, online reviews, Google searches, YouTube videos, and customer service call recordings to determine how her company's newly launched product line should be positioned for future upgrades.

Veracity

Andrey's company recently experienced extensive backlash for inaccurately predicting the outcome of a key election due to relying too heavily on data built from social media posts.

How can you make the best use of this data?

Anonymize the data and sell it to retailers and market researchers

Relational DB

As soon as Olivia completes the withdrawal from her bank account at the ATM, the reduced amount available in her bank account is immediately visible to a bank employee accessing Olivia's account information from the home office. Morgan builds tables to hold transaction records for analysis of customer purchase patterns.

The primary difference between business intelligence and analytics is that _______________.

BI is used to analyze historical data to tell what happened or is happening right now in your business while analytics employs algorithms to determine relationships among data to develop predictions of what will happen in the future.

NoSQL DB

Bailey distributes the company database across several servers located in various data centers around the globe to increase availability and resiliency. Jody builds a database consisting of data from a wide variety of sources for big data analysis.

The Marketing Dept. adds another layer of data analysis to the new project. They'd like to know what follow-up strategies tend to be most effective for customers in specific age and income brackets. They highlight two particular points along the _____________ where they most need research-based guidance.

Conversion Funnel
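One way to find the points along a conversion funnel that most need attention is to compute the conversion rate between each pair of adjacent stages. The sketch below assumes made-up stage names and counts; `stage_conversion` is a hypothetical helper, not part of any marketing tool.

```python
def stage_conversion(stages):
    """Return (transition label, conversion rate) for each adjacent stage pair."""
    rates = []
    for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
        rates.append((f"{name_a} -> {name_b}", count_b / count_a))
    return rates

# Hypothetical counts of customers reaching each stage (illustrative only).
funnel = [("visited", 1000), ("signed up", 200), ("purchased", 50)]

rates = stage_conversion(funnel)

# The transition with the lowest rate is a candidate for follow-up research.
weakest = min(rates, key=lambda r: r[1])
print(rates)
print(weakest)
```

In practice the counts would be split by customer segment (for example, age and income bracket) before comparing transitions.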

Board member 2: So, as I understand it, your recommendations are based on analyzing more data—more kinds of data—not just analyzing the usual data differently? You: Exactly. Instead of looking at a few disparate databases, we combine diverse sets of raw data essentially into one, giant _______, which different departments can pull from and process in ways that are specific to their needs. The raw data includes business transactions, clickstream data, sensor data, server logs, social media posts, videos, customer service call recordings, and many other sources.

Data Lake

To complete these tasks, Riad's team will need to pull more information and insights from their existing data sets. What analytics tool will help them explore this data to identify patterns they can use to make predictions about which strategies will work best for each identified population?

Data Mining

Encouragement of self-service analytics almost assuredly will eliminate the risk of erroneous analysis and reporting and the problem of different analyses yielding inconsistent conclusions. True or False?

False

Team member: I assume we'll be storing this data in the cloud? Using IoT devices like these fitness trackers means the data will be traveling over the Internet anyway. Might as well drop it into a cloud-based database. You: Right. We'll need to ensure the storage space is ________ compliant, but that shouldn't be hard to find these days.

HIPAA

Board member 3: That sounds pretty amazing. What kind of technology lets you access all that data in any sort of organized manner? You: We use the open-source _______ framework, which offers several modules to let us choose the tools we need to store and process massive amounts of data. It distributes the data across an entire server farm to create a highly redundant computing environment.

Hadoop

Team member: We're going to need some starter data to normalize our standards. Where do we get that information? You: Our company has a lot of its own data we can use, and we can expand those numbers using the ________ site. Some of these metrics, though, will have to be normalized as we collect data ourselves.

HealthCare.gov

Where should data from each country be stored?

In a cloud-based data center within that country's borders

Which of the following is not a disadvantage of self-service analytics?

It places valuable data in the hands of end users.

The _______________ component of the Hadoop environment is composed of a procedure that performs filtering and sorting and a method that performs a summary operation.

Map/Reduce program
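The two halves of a Map/Reduce program can be illustrated with a single-process word count. This is only a sketch of the idea in plain Python, not Hadoop's API: the function names `map_phase`, `shuffle`, and `reduce_phase` are assumptions made for the example, and a real cluster would run the map and reduce steps in parallel across many servers.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (word, 1) pair for every word; blank records emit nothing."""
    for record in records:
        for word in record.lower().split():
            yield word, 1

def shuffle(pairs):
    """Group the emitted values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_phase(groups):
    """Reduce: summarize each key's values (here, by summing them)."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(records):
    return reduce_phase(shuffle(map_phase(records)))

counts = word_count(["big data", "Big insights", ""])
print(counts)
```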

You work for an app development team currently in the planning phase for a new app that will track gas prices. The app will use this information to help consumers find the best gas prices near their locations. As the product manager, you are responsible for leading a team of developers, creating the strategy for development, deciding which tools and technologies to use, and managing the budget. Your team needs to make some critical decisions for the app and its accompanying website. Data for the fuel prices and station locations will be collected from user reports. However, your team needs to decide what kind of information to collect on users themselves.


You've decided to anonymize certain portions of the user data and sell this information to retailers and market researchers. The management team loves this idea and preliminary numbers indicate this arrangement will provide a significant income stream while also protecting users' privacy.


You've decided to use a NoSQL database, which will give your team a lot of flexibility to collect, manage, and analyze diverse data on users and fuel prices. The management team specifically requested that you design the app to work in multiple countries, including the United States, Canada, the United Kingdom, France, Germany, and a few other European Union countries. As a result, user data will be covered by a variety of privacy laws. Your team is trying to decide how to handle this layer of complexity.


Your team has decided to add an authentication mechanism to the app so users can create accounts. Users will enter personal identifying information, a security password, and information about their interests, travel plans, and daily commute. This proves to be a very good move, as it collects information that you can use to add more features later that will help customize user experience and increase engagement with the app. Now that you've decided to collect this information, you must decide how to use it, and what to do with it to keep it safe. You anticipate collecting massive amounts of data that would be harmful to users if it were hacked, so security is your first concern. However, your team also wants to explore how best to use the data to enhance user experience and potentially bring additional income to your company.


A _______________ database enables hundreds or even thousands of servers to operate on the data, providing faster response times for queries and updates.

NoSQL

What kind of database will your team use to store and organize this information?

NoSQL database

A _______________ differs from a _______________ in that it provides a means to store and retrieve data that is modelled using some means other than the simple two-dimensional tabular relations.

NoSQL database; relational database
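The contrast between tabular and non-tabular modelling can be sketched with Python's standard library. The example assumes SQLite as a stand-in for a relational database and plain JSON documents as a stand-in for a document-style NoSQL store; the table, field names, and values are invented for illustration.

```python
import json
import sqlite3

# Relational: every row must fit the same two-dimensional table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT, amount REAL)")
conn.execute("INSERT INTO purchases VALUES (?, ?, ?)", ("olivia", "fuel", 42.5))
rows = conn.execute("SELECT customer, item, amount FROM purchases").fetchall()

# Document-style: each record carries its own shape, including nested data.
documents = [
    {"customer": "olivia", "purchases": [{"item": "fuel", "amount": 42.5}]},
    {"customer": "morgan", "commute": {"km": 18, "days": ["mon", "wed"]}},
]
serialized = [json.dumps(doc) for doc in documents]

# Reading a nested field back out of the second document.
morgan_days = json.loads(serialized[1])["commute"]["days"]
print(rows, morgan_days)
```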

Board member 1: What's so different about the way you're working with the data now? You: Our older ________ systems stored and processed data quickly and efficiently, but were much more limited in what kinds of data they could use and the kinds of relationships they could discover. Today's systems can examine many more types of data from much larger and more diverse data collections to develop more sophisticated insights for decision-making.

OLTP

Value

Peyton uses big data projections to determine which customers are most likely to cancel their insurance policy within the next two months.

Velocity

Piper relies on live sales data at a major sports event to determine whether to buy additional advertising space on Google during the event.

Your team decided to place the data for each country in a cloud-based data center within its borders. This allows your team to still remotely access the data, make updates to the database, and enable the security to conform with each country's laws. The app's beta testers start posting rave reviews almost right away. The management team is thrilled and fast-tracks the app's release. Although it was an expensive app to develop, the management team is pleased with the cash flow and soon returns with another app idea for your team to develop.


Optimization

The Marketing Dept. needs to maximize consumer interest in the company's new financial services by sending marketing emails, targeted ads, and texts at prescribed intervals throughout the initial process of attracting and building a relationship with a customer. Riad's team has a large amount of data from previous marketing campaigns that they can analyze to develop a recommended schedule, taking into account a large variety of factors about different types of customers. What kind of technique will this task force need to employ in order to make these recommendations?

Variety

The fact that big data comes in many formats and may be structured or unstructured is an indicator of its ______.

Volume

The insurance database Kevin is working on migrating to the cloud consists of more than a trillion records and grows by about 7 million records each day.

Choosing what data to store and where and how to store the data are two key challenges associated with big data. True or False?

True

While there are three key components that must be in place for an organization to get real value from its BI and analytics efforts, the one that is first and foremost is the existence of a solid data management program. True or False?

True

Team member: It's such a huge variety of data types. How are we supposed to work with all of this? You: We have ways of handling _______ data that account for the variety and even benefit from it. It's challenging at the beginning, especially in the design phase, but there are rich insights available when it's done well.

Unstructured

What information about users will the app collect and track?

User accounts with usernames and passwords in addition to GPS locations

_____ is not a key challenge associated with big data.

Which format the data should be stored in

An individual who combines strong business acumen, a deep understanding of analytics, and a healthy appreciation of the limitations of their data, tools, and techniques to deliver real improvements in decision making is a(n) _______________.

data scientist

A _______________ is a large database that holds business information from many sources in the enterprise, covering all aspects of the company's processes, products, and customers.

data warehouse

The goal of the ______ step of the ETL process is to take the source data from all the various sources and convert it into a single format suitable for processing.

extract
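The extract step described above, as this card frames it, can be sketched as pulling records from differently formatted sources into one shared shape. The source layouts and the field names `customer` and `amount` are assumptions made for the example, not a standard.

```python
import csv
import io
import json

def extract_csv(text):
    """Read CSV rows and rename their columns to the shared field names."""
    return [
        {"customer": row["name"], "amount": float(row["total"])}
        for row in csv.DictReader(io.StringIO(text))
    ]

def extract_json(text):
    """Read JSON records and coerce amounts to the same numeric type."""
    return [
        {"customer": rec["customer"], "amount": float(rec["amount"])}
        for rec in json.loads(text)
    ]

# Two hypothetical sources holding the same kind of data in different formats.
csv_source = "name,total\nAbby,19.99\n"
json_source = '[{"customer": "Andrey", "amount": "5"}]'

# After extraction, every record has the same fields and types.
staged = extract_csv(csv_source) + extract_json(json_source)
print(staged)
```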

Veracity

is a measure of the quality of big data

The primary advantage associated with the use of an in-memory database to process big data is that _______________.

it provides access to data at rates much faster than storing data on some form of secondary storage

Genetic algorithm and linear programming belong in the _______________ general category of BI/analytics.

optimization

The five broad categories of BI/analytics techniques include _______________.

descriptive analytics, predictive analytics, optimization, simulation, and text and video analysis

Data mining and time series belong in the general category of _______________ of BI/analytics.

predictive analytics

Two specific BI/analytics techniques that are in the general category of descriptive analytics are _______________.

regression analysis and visual analytics
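As a small illustration of regression analysis, the sketch below fits a straight line to paired observations using ordinary least squares. The data points and the variable names `ad_spend` and `sales` are invented for the example, and `fit_line` is a hypothetical helper rather than a library function.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations: advertising spend versus resulting sales.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

The fitted slope estimates how much the dependent variable changes per unit of the independent variable across the observed data.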

