Lecture 8: Big Data


Data Warehouse vs. Data Lake

Data lake:
-Native format of data
-No transformation
Data warehouse:
-Complete ETL infrastructure in place
-Data transformed and integrated for analysis
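
A minimal sketch of the contrast, under stated assumptions: the sales rows, field names, and the `extract`/`transform`/`load` helpers are hypothetical and only illustrate the idea. The lake keeps the rows as they arrived, while the warehouse receives them only after an ETL pass.

```python
from datetime import date

# Hypothetical source rows in their native (string-heavy, inconsistent) format.
raw_sales = [
    {"id": "1", "amount": "19.90", "day": "2024-01-05", "store": " north "},
    {"id": "2", "amount": "5.10", "day": "2024-01-05", "store": "North"},
]

def extract():
    """Extract: read the rows from the source system."""
    return list(raw_sales)

def transform(rows):
    """Transform: convert types and standardize values for analysis."""
    return [
        {
            "id": int(r["id"]),
            "amount": float(r["amount"]),
            "day": date.fromisoformat(r["day"]),
            "store": r["store"].strip().title(),
        }
        for r in rows
    ]

def load(rows, target):
    """Load: append the cleaned rows to the warehouse table."""
    target.extend(rows)

data_lake = list(raw_sales)   # data lake: stored untouched, in native format
warehouse = []                # data warehouse: filled only through ETL
load(transform(extract()), warehouse)
```

After this runs, both warehouse rows carry the same store name and numeric amounts, while `data_lake` still holds the original strings.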

Big Data

-Heterogeneous
-From various sources such as smart devices, social media, sensors, etc.
-Variety of formats such as web logs, e-mails, tweets, text, video, audio, etc.

Data Lake

-Large data pool in which the schema and data requirements are not defined until the data is queried
-Serves as a corporate big data repository
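
Defining the schema only when the data is queried can be sketched as follows. This is an illustration under assumptions, not a description of any particular data lake product: the records, field names, and the `query_temperatures` helper are all hypothetical.

```python
import json

# Hypothetical raw records, stored exactly as they arrived (no schema enforced).
raw_lake = [
    '{"user": "ann", "temp_c": 21.5, "device": "sensor-7"}',
    '{"user": "bob", "text": "hello", "lang": "en"}',
    '{"user": "cy", "temp_c": 19.0}',
]

def query_temperatures(records):
    """Apply a schema (user, temp_c) only at query time."""
    rows = []
    for line in records:
        doc = json.loads(line)
        if "temp_c" in doc:  # records that do not fit this query are skipped
            rows.append((doc["user"], float(doc["temp_c"])))
    return rows

readings = query_temperatures(raw_lake)
```

A different query could read the same stored records with a different schema, for example picking out only the `text` and `lang` fields.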

Big Data

-Not modeled up front for pre-determined operational and/or analytical queries (retrievals)
-Can encompass 80%-90% (or even more) of an organization's stored data
-Some of it may be of use and much of it may not
-Companies and organizations tend to store a lot of it, knowing that some of it may be of use later

Three Types of Data stored in Corporations and Organizations

-Transactional Structured Data
-Analytical Structured Data
-Unstructured/Semi-structured, Un-modeled Data

MapReduce

-A common big data technique
-Uses parallel computing: a complex task is divided into smaller tasks that are performed in parallel on multiple computers
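
A small word-count sketch of the idea, under stated assumptions: the input chunks and function names are hypothetical, and worker threads on one machine stand in for the multiple computers of a real cluster. Each chunk is counted independently (map), and the partial counts are then merged (reduce).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_chunk(chunk: str) -> Counter:
    """Map step: count the words in one chunk, independently of the others."""
    return Counter(chunk.lower().split())

def merge_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: combine two partial results into one."""
    return left + right

def word_count(chunks: list[str]) -> Counter:
    # Worker threads stand in for the separate computers of a real cluster.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(merge_counts, partials, Counter())

# Hypothetical input, already split into chunks.
chunks = ["big data is big", "data lakes store raw data"]
totals = word_count(chunks)
```

Because no map call depends on another, adding more chunks and more workers does not change the logic, only how the work is spread out.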

Analytical Structured Data

Data Warehouses and Data Marts (data modeled/structured and stored for anticipated pre-determined analytical use)

Big Data

Massive volumes of diverse and rapidly growing data that are not formally modeled

Transactional Structured Data

Operational Databases (data modeled/structured and stored for anticipated pre-determined operational use)

True or false: Big data methods replace the database and data warehousing approaches developed for managing and utilizing formally modeled data assets.

False

