ISDS final

¡Supera tus tareas y exámenes ahora con Quizwiz!

6 V's that define Big data

- volume - variety - velocity - veracity - value proposition

Variability

data flows can be highly inconsistent, with periodic peaks, making data loads hard to manage

Variety

data today comes in all types of formats-ranging from traditional databases to hierarchical data stores created by the end users and OLAP systems, to text documents, email, XML, meter collected, sensor captured data, to video, audio, and stock ticker data. by some essentials, 80 to 85% of all organizations' data is in some sort of instructed or semi strutted format

Critical success factors for big data analytics

- a clear business need (alignment with the vision and the strategy) - strong, committed sponsorship (executive champion) - a fact based decision making culture

Hadoop

- an open source framework for storing and analyzing massive amounts of distributed, unstructured data. -originally created by doug cutting at yahoo.

big data technologies

- mapreduce - hadoop - noSQL

NoSQL

- not only SQL - a new style of database - to store and process large volumes of unstructured, semi structured, and multi structured data - can handle big data better than traditional relational database technology

What should companies do to succeed with big data?

- simplify - coexist - visualize - empower - integrate - govern - evangelize

data scientists

- use a combo of their business, communication, and technical skills to investigate Big data looking for whats to improve current business analytics practices (from descriptive to predictive and prescriptive) and hence to improve decisions for new business opportunities - considered a big data guru - positions are in high demand and offered with very high salaries and very high experiences

simplify

it's hard to keep track of all of the new database vendors, open source projects, and big data service providers. it will be even more crowded and complicated in the years ahead

Volume

most common trait of big data. Many factors contributed to the exponential increase in data volume, such as transaction based stored through the years, text data constantly streaming in from social media, increasing amounts of sensor data being collected, automatically generated RFID and GPS data and so forth

Velocity

refers to both how fast data is being produced and how fast the data must be processed (i.e., captured, stored, and analyzed) to meet the need or demand. RFID tags, automated sensors, GPS devices, and smart meters are driving an increasing need to deal with torrents of data in near-real time

Veracity

refers to the conformity to facts; accuracy, quality, truthfulness, or trustworthiness go big data

challenges of big data analytics

skill availability (data scientists are in short supply)

MapReduce

technique popularized by google that distributes the processing of very large multiple structured data files across a large cluster of ordinary machines/computer processor

Value proposition

this characteristic of big data is its potential to contain more useful patterns and interesting anomalies than "small" data


Conjuntos de estudio relacionados

Psychiatric-Mental Health Practice Exam HESI

View Set

اجتماعيات الوحدة الاولى القضية اللبنانية ودعم مصر ودعم استقلال الجزائر

View Set

Fundamentals Chapter 5 مهمممم

View Set

Pharmacology Prep U Chapter 30: Adrenergic Agonists

View Set

EXPRESIONES PARA PEDIR Y DAR OPINIONES,

View Set