ISDS final- chapter 6

T/F Hadoop and MapReduce require each other to work

False

what produces big data the fastest?

RFID

Using data to understand customers/clients and business operations to sustain and foster growth and profitability is:

both: a) it is an increasingly challenging task for today's enterprises. b) it is not a new technological fad; rather, it's a business priority.

Variability

data flows can be highly inconsistent with periodic peaks, making data loads hard to manage

Where does big data come from?

everywhere, but most big data is generated by machines

is there a strong case for large cities to use big data and related information technologies?

for Dublin, big data was used to ease traffic problems and better understand the traffic network

What does big data mean to Luxottica?

for Luxottica, big data includes everything they can find about their customer interactions in the form of transactions, click streams, product reviews and social media patterns

critical success factors for big data analytics

1. clear business need (alignment with the vision and the strategy) 2. strong, committed sponsorship (executive champion) 3. alignment between the business and IT strategy 4. a fact-based decision-making culture 5. a strong data infrastructure 6. the right analytics tools 7. the right people with the right skills

Skills that define a data scientist

1. domain expertise, problem definition and decision modeling 2. data access and management (traditional and new data systems) 3. programming, scripting and hacking 4. internet and social media/social networking technologies 5. curiosity and creativity 6. communication and interpersonal

challenges of big data analytics

1. effectively and efficiently capturing, storing and analyzing big data 2. new breed of technologies needed (developed or purchased or hired or outsourced) 3. data volume 4. data integration 5. processing capabilities 6. data governance 7. skill availability (data scientists are in short supply) 8. solution cost (ROI)

know the inputs to the analytics system

1. market research 2. social media 3. census data 4. election databases

What should companies do to succeed with big data?

1. simplify 2. coexist 3. visualize 4. empower 5. integrate 6. govern 7. evangelize

know the analytic system outputs or goals

1. voter mobilization 2. organize movements 3. increase # of volunteers 4. raise money contributions

Variety

80-85% of all organizations' data is in some sort of unstructured or semi-structured format. data formats range from traditional databases to hierarchical data stores.

What is CERN and why is it important to the world of science?

CERN is the European Organization for Nuclear Research. it plays a leading role in fundamental studies of physics and has been instrumental in many key global innovations and breakthrough studies in theoretical physics. it operates the world's largest particle physics laboratory, located near Geneva, Switzerland

why did eBay need a big data solution?

eBay is the world's largest online marketplace and requires the ability to turn the enormous volumes of data it generates into useful insight for customers.

T/F Big data simplifies data governance issues, especially for global firms

False

What were the results obtained for Luxottica?

Luxottica did not outsource their data storage and promotional campaign development and management, nor did they merge with companies in Asia.

In the world of Big Data, ________ aids organizations in processing and analyzing large volumes of multi-structured data. Examples include indexing and search, graph analysis, etc.

MapReduce

What are the big data technologies?

MapReduce, Hadoop, NoSQL

HBase, Cassandra, MongoDB, and Accumulo are examples of ________ databases.

NoSQL

Turning Machine-Generated Streaming Data into Valuable Business Insights

The company uses stream analytics to boost customer satisfaction and competitive advantage. The company chose to work with Splunk, one of the leading analytics service providers in the area of turning machine-generated streaming data into valuable insights, and saw beneficial results in the areas of application troubleshooting, operations, compliance, and security.

A case in the energy industry for stream analytics?

a classic smart grid application for the electric power supply chain

Business investments ought to be made for the good of the business, not for the sake of mere technology advancements. Therefore the main driver for Big Data analytics should be an alignment with the vision and the strategy and at any level-strategic, tactical, and operations. Which of the critical success factors for Big Data analytics is being described?

a clear business need

In ____?______, the numbers rather than intuition, gut feeling, or supposition drive decision making. There is also a culture of experimentation to see what works and doesn't. To create ____?____, senior management needs to do the following: recognize that some people can't or won't adjust; be a vocal supporter; stress that outdated methods must be discontinued; ask to see what analytics went into decisions; link incentives and compensation to desired behaviors

a fact based decision making culture

It is a well-known fact that if you don't have committed executive backing, it is difficult (if not impossible) to succeed. If the scope is a single or a few analytical applications, the support can be at the departmental level. However, if the target is enterprise-wide organizational transformation, which is often the case for Big Data initiatives, _____________________ needs to be at the highest levels and organization-wide. Which Critical Success Factor for Big Data Analytics best fills the blank in the previous sentence?

strong, committed sponsorship

data volume

ability to capture, store and process the huge volume of data in a timely manner

data integration

ability to combine data quickly and at a reasonable cost

what is the goal of MapReduce?

achieving high performance with "simple" computers

Stream analytics

also called data-in-motion analytics and real-time analytics. the analytic process of extracting actionable information from continuously flowing/streaming data. relates to one of the V's of big data: velocity
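
A minimal sketch of the idea, assuming a plain Python iterable stands in for the data stream: each reading is analyzed as it arrives, and only a small window of recent readings is ever held in memory. The function name `sliding_average` and the sample readings are hypothetical, chosen only for illustration.

```python
from collections import deque
from typing import Iterable, Iterator


def sliding_average(stream: Iterable[float], window: int) -> Iterator[float]:
    """Yield the mean of the most recent `window` readings as each one arrives."""
    recent = deque(maxlen=window)  # oldest reading is dropped automatically
    for reading in stream:
        recent.append(reading)
        yield sum(recent) / len(recent)


# Hypothetical sensor readings arriving one at a time.
readings = [10.0, 12.0, 11.0, 50.0, 13.0]
averages = list(sliding_average(readings, window=3))
```

Nothing is stored beyond the window, which is the point when the full stream is too large or too short-lived to keep.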

In-motion ________ is often overlooked today in the world of BI and Big Data.

analytics

In the eBay use case study, load ________ helped the company meet its Big Data needs with the extremely fast data handling and application availability requirements.

balancing

How can big data benefit a large-scale trading bank?

big data can handle the high volume, high variability and continuously streaming data that trading banks need to deal with

MapReduce + Hadoop=

big data core technology

Data scientist

big data guru, one with skills to investigate big data. very high salaries, very high expectations

How can big data help ease traffic in large cities?

by integrating geospatial data from buses into a central geographic information system, you can create a digital map of the city. then, using the dashboard screen, operators can drill down to see the number of buses that are on time or delayed. users can produce detailed reports on areas that are frequently delayed and take prompt action to ease congestion

What were the challenges, solutions and results for the investment bank?

the challenge was that the bank was not fast enough to respond to growing business needs and requirements. big data offered the scalability to address the problem. the major benefit was real-time access to trading data. the bank also achieved a single version of the truth.

what were the challenges, solutions and results for eBay?

eBay needed a solution to perform rapid analysis on a broad assortment of structured and unstructured data. the solution did NOT integrate into a single big data center infrastructure. eBay can now more cost effectively process massive amounts of data at very high speeds.

stream analytics applications

e-commerce; telecommunications; law enforcement and cyber security; power industry; financial services; health services (the biggest potential source of big data comes from patient monitoring); government

perpetual analytics

evaluates every incoming observation against all prior observations in the context of intelligent systems. recognizing how the new observation relates to all prior observations enables the discovery of real-time insights.
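
One way to picture this, as a rough sketch rather than a definitive implementation: keep running statistics that summarize every observation seen so far, and compare each new observation against that summary. The class name `PerpetualStats` is hypothetical, and the running mean and variance (Welford's method) are an assumed stand-in for whatever model a real intelligent system would maintain.

```python
class PerpetualStats:
    """Running mean and variance over every observation seen so far (Welford's method)."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def observe(self, x: float) -> float:
        """Fold x into the history; return how far x sits from the mean of all prior observations."""
        deviation = x - self.mean if self.count else 0.0
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)
        return deviation

    @property
    def variance(self) -> float:
        """Population variance of everything observed so far."""
        return self._m2 / self.count if self.count else 0.0


stats = PerpetualStats()
deviations = [stats.observe(x) for x in [2.0, 4.0, 6.0]]
```

No window size is involved: the summary reflects the entire history, yet only three numbers are stored.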

open source

hundreds of contributors continuously improve the core technology

________ speeds time to insights and enables better data governance by performing data integration and analytic functions inside the database.

in-database analytics
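
A small sketch of the principle, assuming an in-memory SQLite database stands in for an enterprise data warehouse and that the `sales` table and its rows are made up for illustration: the aggregation runs inside the database engine, so only the summary leaves it.

```python
import sqlite3

# In-memory SQLite is an assumed stand-in for a real data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 300.0), ("west", 50.0)],  # hypothetical rows
)

# The analytic function (SUM ... GROUP BY) executes inside the database;
# the application receives one summary row per region, not the raw data.
totals = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
)
conn.close()
```

Keeping the raw rows inside the database is also what helps governance: access rules are enforced in one place.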

Allowing big data to be processed in memory and distributed across a dedicated set of nodes can solve complex problems in near real time. this process is called

in-memory analytics

What are some example tasks of MapReduce?

indexing the web for search, graph analysis, text analysis, machine learning

Why stream analytics?

it may not be feasible to store the data, or the data may lose its value

challenges for Dublin city council?

the major problem was the difficulty of getting a good picture of traffic in the city from a high-level perspective. the big data solution gave operators the ability to see the system as a whole instead of just individual corridors.

What does big data mean traditionally?

massive amounts of data

Do you think big data analytics could change the outcome of an election?

it may well have in 2008 and 2012. many agree that Democrats clearly had the advantage in utilizing big data

critical event processing

method of capturing, tracking and analyzing streams of data to detect events (out-of-normal happenings) of certain types that are worthy of the effort
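
A minimal sketch, assuming "out of normal" simply means a reading outside a fixed range: the stream is scanned as it arrives and only the unusual readings are reported. The function name `detect_events`, the thresholds, and the voltage readings are all hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def detect_events(
    stream: Iterable[float], low: float, high: float
) -> Iterator[Tuple[int, float]]:
    """Yield (position, value) for each reading outside the normal range [low, high]."""
    for position, value in enumerate(stream):
        if value < low or value > high:
            yield position, value


# Hypothetical voltage readings; the normal range is an assumption.
voltages = [229.8, 230.1, 251.4, 230.0, 198.7]
events = list(detect_events(voltages, low=210.0, high=250.0))
```

Real systems would use richer rules than a fixed range, but the shape is the same: most readings pass through silently and only the events are surfaced.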

volume

most common trait of big data. factors of the exponential increase in data volume are: transaction based data stored through the years, text data from social media and increasing amounts of sensor data being collected.

petabyte (PB)

newly popular unit of data in the big data era which is 10^15 bytes
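
To get a feel for the scale, a short arithmetic sketch using decimal (SI) units. The 2 TB drive size is an assumption made up for the example.

```python
# Decimal (SI) units, matching the 10^15 definition above.
GIGABYTE = 10**9
TERABYTE = 10**12
PETABYTE = 10**15

terabytes_per_petabyte = PETABYTE // TERABYTE
gigabytes_per_petabyte = PETABYTE // GIGABYTE

# Hypothetical 2 TB drives; -(-a // b) is ceiling division.
drive_size = 2 * TERABYTE
drives_needed = -(-PETABYTE // drive_size)
```

One petabyte is a thousand terabytes or a million gigabytes, so it would fill 500 of the assumed 2 TB drives.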

NoSQL

not only SQL, a new style of database to store and process large volumes of unstructured, semi-structured and multi-structured data. can handle big data better than traditional relational database technology.

Hadoop

open source framework for storing and analyzing massive amounts of distributed, unstructured data. originally created by Doug Cutting at Yahoo. breaks up big data into multiple parts so each part can be processed and analyzed at the same time on multiple computers.
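
The "break it into parts and process the parts at the same time" idea can be sketched as follows. This is not Hadoop's API: threads in a single Python process are an assumed stand-in for the computers in a cluster, and the function names and data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_into_parts(records: List[int], part_size: int) -> List[List[int]]:
    """Break the data set into fixed-size parts."""
    return [records[i:i + part_size] for i in range(0, len(records), part_size)]


def process_part(part: List[int]) -> int:
    """Work done independently on one part (here, a simple sum)."""
    return sum(part)


records = list(range(1, 101))  # hypothetical data set: the numbers 1..100
parts = split_into_parts(records, part_size=25)

# Each worker thread plays the role of one computer handling one part.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(process_part, parts))

total = sum(partial_sums)
```

Because each part is processed independently, adding more workers (or, in a real cluster, more machines) lets larger data sets be handled in about the same time.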

What is Big Data?

popular term for exponential growth, availability and use of information, both structured and unstructured. - relative term, "big" depends on organization size. - big data by itself, regardless of its size, type, or speed, is worthless

grid computing

promotes efficiency, lower cost and better performance by processing jobs in a shared, centrally managed pool of IT resources

Velocity

refers to both how fast data is being produced and how fast the data must be processed (captured, stored and analyzed) to meet need/demand.

Veracity

refers to the conformity to facts: accuracy, quality, truthfulness, or trustworthiness of big data

data governance

security, privacy, access

MapReduce

technique popularized by Google that distributes the processing of very large multi-structured data files across a large cluster of ordinary machines/computer processors - good at processing and analyzing large volumes of multi-structured data in a timely manner
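
A word count is the usual toy illustration of the technique. The sketch below runs in a single process and only mimics the three stages (map, shuffle, reduce); in a real cluster each stage would be spread across many machines. The function names and sample documents are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


documents = ["big data big value", "data in motion"]  # hypothetical input
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Each call to `map_phase` and `reduce_phase` depends only on its own input, which is what allows the work to be handed out to ordinary machines in parallel.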

processing capabilities

the ability to process the data quickly as it is captured (i.e. stream analytics)

What were the main challenges for Luxottica?

there was a disconnect between data analytics and marketing execution. the technique the company uses to gain visibility into its customers is data integration.

value proposition

this characteristic of big data is its potential to contain more useful patterns and interesting anomalies than small data. - with the value proposition, big data also brought big challenges

Why is there a need for big data?

traditional data warehouses have not been able to keep up with the variety and complexity of data, so a new breed of technologies is needed to take on big data (developed or purchased or hired or outsourced)

T/F Hadoop was designed to handle petabytes and exabytes of data distributed over multiple nodes in parallel

true

T/F: many analytics tools are too complex for the average user and this is one justification for big data

true

Big data + "big" analytics=

value

The ________ of Big Data is its potential to contain more useful patterns and interesting anomalies than "small" data.

value proposition

Data flows can be highly inconsistent with periodic peaks making loads hard to manage. which V is this?

variability

refers to the conformity to facts: accuracy, quality, truthfulness, or trustworthiness of the data.

veracity

What are the 3 main V's?

volume, variety, velocity

6 v's that characterize/define big data

volume, variety, velocity, veracity, variability, value proposition

What is the role of analytics and big data in modern day politics?

volume, variety and velocity readily apply to the kind of data used for political campaigns. big data analytics can help predict election outcomes as well as targeting potential voters and donors and have become a critical part of political campaigns.

