Big Data and data lakes (Ch.10)
Choose the list that CORRECTLY orders the data repositories from smallest to largest.
Database, data warehouse, data lake
Which of the following data examples has the most structure?
a row in a relational table
Examples of sources generating large amounts of data
ALL OF THESE ANSWERS
A large fact table in a data warehouse is an example of big data.
False
SQL provides all the necessary functionality for managing and analyzing big data.
False
CONSTANT DOCUMENTARIAN is an organization that receives each day a one-hour log video from 50 contributors around the world, recording mundane daily-life scenes. It also receives one daily 200-word email from its contributors. Consider the following CONSTANT DOCUMENTARIAN data sets: - Set A: collection of daily one-hour videos from the 50 contributors - Set B: collection of daily 200-word emails from its contributors - Set C: video footage of its 24/7 CCTV camera constantly recording scenes outside its headquarters - Set D: relational table containing first name, last name, phone number, and email address of each contributor. Which data set exhibits the highest volume?
Set A
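The reasoning behind this answer can be sketched as simple arithmetic. This is only an illustration: it assumes (the question does not say so) that contributor videos and CCTV footage have a comparable bitrate, so hours of footage per day stand in for volume. The emails and the contact table are tiny by comparison and are left out. All names below are hypothetical.

```python
# Hours of new video per day for the two video data sets.
# Assumption: both kinds of footage use a comparable bitrate.
CONTRIBUTORS = 50

set_a_hours_per_day = CONTRIBUTORS * 1  # one one-hour video per contributor
set_c_hours_per_day = 24                # a single camera recording 24/7


def highest_volume_video_set():
    """Return the video set that adds more hours of footage per day."""
    if set_a_hours_per_day > set_c_hours_per_day:
        return "Set A"
    return "Set C"


print(set_a_hours_per_day, set_c_hours_per_day, highest_volume_video_set())
```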
T/F: Big data is mostly unstructured
True
T/F: Big data is not formally modeled for querying and retrieval and is not accompanied by detailed metadata
True
CONSTANT DOCUMENTARIAN is an organization that receives each day a one-hour log video from 50 contributors around the world, recording mundane daily-life scenes. It also receives one daily 200-word email from its contributors. Consider the following CONSTANT DOCUMENTARIAN data sets: - Set A: collection of daily one-hour videos from the 50 contributors - Set B: collection of daily 200-word emails from its contributors - Set C: video footage of its 24/7 CCTV camera constantly recording scenes outside its headquarters - Set D: relational table containing first name, last name, phone number, and email address of each contributor. Which data set exhibits the lowest velocity?
Set D
CONSTANT DOCUMENTARIAN is an organization that receives each day a one-hour log video from 50 contributors around the world, recording mundane daily-life scenes. It also receives one daily 200-word email from its contributors. Consider the following CONSTANT DOCUMENTARIAN data sets: - Set A: collection of daily one-hour videos from the 50 contributors - Set B: collection of daily 200-word emails from its contributors - Set C: video footage of its 24/7 CCTV camera constantly recording scenes outside its headquarters - Set D: relational table containing first name, last name, phone number, and email address of each contributor. Which data set exhibits the lowest volume?
Set D
T/F: Big data does not easily fit into tables with rows and columns
True
Big data sets, when compared with most databases and data warehouses:
have a higher possibility of different interpretations
Comparing a data warehouse with a data lake, a common analogy is that data warehouses are like agriculture, while data lakes are like _____________________ .
hunting/gathering
Which of the following is true when comparing a data warehouse to a data lake?
in ETL, a data lake has no transform part
Where on the Spectrum of Solutions for Large Analytical Data Repositories would the following example best fit? Data from 10 data sources is extracted. Two of those sources have the exact same structure, so the data from those two sources is pasted together before loading. There is a small overlap of data in those two sources, so the duplicates are eliminated before loading. The rest of the sources are loaded as they were.
inside the spectrum, closer to the left edge of the spectrum (pure data lake)
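The light transformation described in this example might look like the sketch below. Source names, record contents, and the in-memory dictionary standing in for the repository are all hypothetical. The point is only that one pair of sources is appended and de-duplicated, while everything else is loaded untouched.

```python
def append_and_dedupe(first, second):
    """Paste two same-structure sources together, keeping one copy of each record."""
    seen = set()
    combined = []
    for record in first + second:
        if record not in seen:  # records are tuples, so they are hashable
            seen.add(record)
            combined.append(record)
    return combined


def load_into_repository(sources, same_structure_pair):
    """Load every source as it is, except the one pair that is merged first."""
    a, b = same_structure_pair
    repository = {f"{a}+{b}": append_and_dedupe(sources[a], sources[b])}
    for name, records in sources.items():
        if name not in (a, b):
            repository[name] = records  # no transformation: loaded as it was
    return repository


# Hypothetical sample data: s1 and s2 share a structure and overlap on one record.
sources = {
    "s1": [(1, "x"), (2, "y")],
    "s2": [(2, "y"), (3, "z")],
    "s3": ["raw log line"],
}
repository = load_into_repository(sources, ("s1", "s2"))
```

Because only a small, structural clean-up happens before loading, the result is not a pure data lake, but it is far closer to one than to a fully modeled data warehouse.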
Data lake repositories of big data are not created through the process of formal database __________________.
modeling
A data lake is a large data pool in which the schema and data requirements are not defined until the data is ______________ .
queried
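Defining the schema only at query time (often called schema-on-read) can be illustrated with a minimal sketch. The raw records and the field names `contributor` and `words` are made up for illustration. The lake stores the strings exactly as received, and the query decides which structure it expects.

```python
import json

# Raw records kept in their native format; nothing is validated on load.
raw_lake = [
    '{"contributor": "a01", "words": 200, "note": "rainy morning"}',
    '{"contributor": "b17", "words": 198}',
    "free text with no structure at all",
]


def query_word_counts(raw_records):
    """Apply a (contributor, words) schema only when the data is queried."""
    rows = []
    for raw in raw_records:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # record does not fit this query's schema; skip it
        if isinstance(doc, dict) and "contributor" in doc and "words" in doc:
            rows.append((doc["contributor"], doc["words"]))
    return rows


print(query_word_counts(raw_lake))
```

A different query could read the same raw records with a different schema, which is why interpretations of the data can vary more than in a modeled database.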
Data lakes store vast amounts of __________________ data in its native format until it is needed
raw
CONSTANT DOCUMENTARIAN is an organization that receives each day a one-hour log video from 50 contributors around the world, recording mundane daily-life scenes. It also receives one daily 200-word email from its contributors. Consider the following CONSTANT DOCUMENTARIAN data sets: - Set A: collection of daily one-hour videos from the 50 contributors - Set B: collection of daily 200-word emails from its contributors - Set C: video footage of its 24/7 CCTV camera constantly recording scenes outside its headquarters - Set D: relational table containing first name, last name, phone number, and email address of each contributor. Which data set exhibits the highest velocity?
Set C
CONSTANT DOCUMENTARIAN is an organization that receives each day a one-hour log video from 50 contributors around the world, recording mundane daily-life scenes. It also receives one daily 200-word email from its contributors. Consider the following CONSTANT DOCUMENTARIAN data sets: - Set A: collection of daily one-hour videos from the 50 contributors - Set B: collection of daily 200-word emails from its contributors - Set C: video footage of its 24/7 CCTV camera constantly recording scenes outside its headquarters - Set D: relational table containing first name, last name, phone number, and email address of each contributor. Which data set has the most structure?
Set D