Azure Data Fundamentals: 12 Explore data storage and processing in Azure
In Azure Databricks, how do you change the language a cell uses? ( 1. The first line in the cell is %language. For example, %scala. 2. Change the notebook language before writing the commands. 3. Wrap the command in the cell with ##language##. )
1. The first line in the cell is %language. For example, %scala.
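As an illustration of the answer above, here is a minimal sketch of a two-cell notebook in the layout Databricks uses when a Python notebook is exported as a source file. The variable name and strings are made up for the example, and the notebook's default language is assumed to be Python. Run outside Databricks, only the first cell executes, because the second cell is entirely comments.

```python
# Databricks notebook source
# Cell 1: runs in the notebook's default language (assumed to be Python here).
greeting = "hello from python"
print(greeting)

# COMMAND ----------

# MAGIC %scala
# MAGIC // Cell 2: the %scala magic on the first line switches this cell to Scala.
# MAGIC val greeting = "hello from scala"
# MAGIC println(greeting)
```

In the notebook editor itself, the second cell would simply start with the line %scala followed by the Scala code.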
You have a large amount of data held in files in Azure Data Lake storage. You want to retrieve the data in these files and use it to populate tables held in Azure Synapse Analytics. Which processing option is most appropriate? ( 1. Use Azure Synapse Link to connect to Azure Data Lake storage and download the data 2. Synapse SQL pool 3. Synapse Spark pool )
2. Synapse SQL pool
Which of the components of Azure Synapse Analytics allows you to train AI models using AzureML? ( 1. Synapse Studio 2. Synapse Pipelines 3. Synapse Spark )
3. Synapse Spark
________________ is a service that can ingest large amounts of raw, unorganized data from relational and non-relational systems, and convert this data into meaningful information. ____________ provides a scalable and programmable ingestion engine that you can use to implement complex hybrid extract-transform-load (ETL), extract-load-transform (ELT), and data integration projects.
Azure Data Factory ; Data Factory
___________ is a collection of analytics and storage services that you can combine to implement a big data solution. It comprises three main elements:
Azure Data Lake; 1. Data Lake Store 2. Data Lake Analytics 3. HDInsight
________________ is an analytics platform optimized for the Microsoft Azure cloud services platform. Designed with the founders of Apache Spark, Databricks is integrated with Azure to provide one-click setup, streamlined workflows, and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.
Azure Databricks
The most common options for processing data in Azure include:
Azure Databricks, Azure Data Factory, Azure Synapse Analytics, and Azure Data Lake
________________ is a managed analytics service in the cloud. It's based on Apache Hadoop, a collection of open-source tools and utilities that enable you to run processing tasks over large amounts of data.
Azure HDInsight
______________ provides a suite of tools to analyze and process an organization's data. It incorporates SQL technologies, Transact-SQL query capabilities, and open-source Spark tools to enable you to quickly process very large amounts of data.
Azure Synapse Analytics
_________________ is a generalized analytics service. You can use it to read data from many sources, process this data, generate various analyses and models, and save the results.
Azure Synapse Analytics
_____________ can process data held in many different types of storage, including Azure Blob storage, Azure Data Lake Store, Hadoop storage, flat files, databases, and data warehouses. ____________ can also process streaming data.
Databricks; Databricks
A common approach that you can use with Azure Synapse Analytics is to extract the data from where it's currently stored, load this data into an analytical data store, and then transform the data, shaping it for analysis. This approach is known as ________________
ELT, for extract, load, and transform.
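The ELT order described above can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the analytical data store (it is not Azure Synapse Analytics), and the sample rows, table names, and the run_elt helper are all hypothetical.

```python
import sqlite3

# Extract: rows as they might arrive from a source system (made-up sample data).
raw_rows = [
    ("2024-01-01", "widget", "3", "2.50"),
    ("2024-01-01", "gadget", "1", "10.00"),
    ("2024-01-02", "widget", "2", "2.50"),
]


def run_elt(rows):
    """Load raw rows into a store first, then transform them inside the store."""
    conn = sqlite3.connect(":memory:")  # stand-in for the analytical data store

    # Load: copy the raw data unchanged into a staging table (all text columns).
    conn.execute(
        "CREATE TABLE staging_sales "
        "(sale_date TEXT, product TEXT, qty TEXT, unit_price TEXT)"
    )
    conn.executemany("INSERT INTO staging_sales VALUES (?, ?, ?, ?)", rows)

    # Transform: shape the data for analysis with SQL, inside the store.
    conn.execute(
        """
        CREATE TABLE sales_by_product AS
        SELECT product,
               SUM(CAST(qty AS INTEGER)) AS total_qty,
               SUM(CAST(qty AS INTEGER) * CAST(unit_price AS REAL)) AS revenue
        FROM staging_sales
        GROUP BY product
        """
    )
    result = conn.execute(
        "SELECT product, total_qty, revenue FROM sales_by_product ORDER BY product"
    ).fetchall()
    conn.close()
    return result


print(run_elt(raw_rows))
```

The point of the sketch is the ordering: the data lands in the store untyped and unshaped, and the type casts and aggregation happen afterwards using the store's own query engine. In an ETL flow, that shaping would happen before the load step.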
____________ is a parallel-processing engine that supports large-scale analytics.
Spark
In Azure Synapse Analytics, you can choose between two technologies to process data:
Transact-SQL (T-SQL); Spark
The term ______________ refers to data that is too large or complex for traditional database systems.
big data
Azure Synapse Analytics uses a ______________ architecture.
clustered