Azure DP-200
You develop data engineering solutions for a company. You must migrate data from Microsoft Azure Blob storage to an Azure SQL Data Warehouse for further transformation. You need to implement the solution. Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place:
1. Provision an Azure SQL Data Warehouse instance.
2. Connect to the Azure SQL Data Warehouse by using SQL Server Management Studio.
3. Build external tables by using SQL Server Management Studio.
4. Run T-SQL statements to load the data.
You are responsible for providing access to an Azure Data Lake Storage Gen2 account. Your user account has contributor access to the storage account, and you have the application ID and access key. You plan to use PolyBase to load data into Azure SQL Data Warehouse. You need to configure PolyBase to connect the data warehouse to the storage account. Which three components should you create in sequence? To answer, move the appropriate components from the list of components to the answer area and arrange them in the correct order. Select and Place:
- A database encryption key
- An asymmetric key
- An external data source
- An external file format
- A database scoped credential
1. A database scoped credential
2. An external data source
3. An external file format
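As a hedged sketch only: assuming a hypothetical storage account (`myadls`), container (`data`), and service principal credentials (none of these names come from the question), the three components might be created with T-SQL along these lines:

```sql
-- Assumes a database master key already exists to protect the credential.
-- 1. Database scoped credential holding the application ID and key (placeholders).
CREATE DATABASE SCOPED CREDENTIAL AdlsCredential
WITH IDENTITY = '<application-id>@https://login.microsoftonline.com/<tenant-id>/oauth2/token',
     SECRET = '<application-key>';

-- 2. External data source pointing at the ADLS Gen2 account.
CREATE EXTERNAL DATA SOURCE AdlsGen2Source
WITH (TYPE = HADOOP,
      LOCATION = 'abfss://data@myadls.dfs.core.windows.net',
      CREDENTIAL = AdlsCredential);

-- 3. External file format describing the source files.
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);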
You develop data engineering solutions for a company. A project requires the deployment of data to Azure Data Lake Storage. You need to implement role-based access control (RBAC) so that project members can manage the Azure Data Lake Storage resources. Which three actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point. A. Assign Azure AD security groups to Azure Data Lake Storage. B. Configure end-user authentication for the Azure Data Lake Storage account. C. Configure service-to-service authentication for the Azure Data Lake Storage account. D. Create security groups in Azure Active Directory (Azure AD) and add project members. E. Configure access control lists (ACL) for the Azure Data Lake Storage account.
A. Assign Azure AD security groups to Azure Data Lake Storage. D. Create security groups in Azure Active Directory (Azure AD) and add project members. E. Configure access control lists (ACL) for the Azure Data Lake Storage account.
You have an Azure Storage account and an Azure SQL data warehouse in the UK South region. You plan to copy data from the storage account to the data warehouse by using Azure Data Factory. The solution must meet the following requirements: ✑ Ensure that the data remains in the UK South region at all times. ✑ Minimize administrative effort. Which type of integration runtime should you use? A. Azure integration runtime B. Self-hosted integration runtime C. Azure-SSIS integration runtime
A. Azure integration runtime
A company plans to use Azure Storage for file storage purposes. Compliance rules require: ✑ A single storage account to store all operations including reads, writes and deletes ✑ Retention of an on-premises copy of historical operations You need to configure the storage account. Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point. A. Configure the storage account to log read, write and delete operations for service type Blob B. Use the AzCopy tool to download log data from $logs/blob C. Configure the storage account to log read, write and delete operations for service-type table D. Use the storage client to download log data from $logs/table E. Configure the storage account to log read, write and delete operations for service type queue
A. Configure the storage account to log read, write and delete operations for service type Blob B. Use the AzCopy tool to download log data from $logs/blob
Contoso, Ltd. plans to configure existing applications to use Azure SQL Database. When security-related operations occur, the security team must be informed. You need to configure Azure Monitor while minimizing administrative effort. Which three actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point. A. Create a new action group to email [email protected]. B. Use [email protected] as an alert email address. C. Use all security operations as a condition. D. Use all Azure SQL Database servers as a resource. E. Query audit log entries as a condition
A. Create a new action group to email [email protected]. C. Use all security operations as a condition. D. Use all Azure SQL Database servers as a resource.
You need to develop a pipeline for processing data. The pipeline must meet the following requirements: ✑ Scale up and down resources for cost reduction ✑ Use an in-memory data processing engine to speed up ETL and machine learning operations ✑ Use streaming capabilities ✑ Provide the ability to code in SQL, Python, Scala, and R ✑ Integrate workspace collaboration with Git. What should you use? A. HDInsight Spark Cluster B. Azure Stream Analytics C. HDInsight Hadoop Cluster D. Azure SQL Data Warehouse E. HDInsight Kafka Cluster F. HDInsight Storm Cluster
A. HDInsight Spark Cluster. Apache Spark is an open-source, parallel-processing framework that supports in-memory processing to boost the performance of big-data analysis applications. HDInsight is a managed Hadoop service; use it to deploy and manage Hadoop clusters in Azure. For batch processing, you can use Spark, Hive, Hive LLAP, or MapReduce. Languages: R, Python, Java, Scala, SQL.
A company plans to use Azure SQL Database to support a mission-critical application. The application must be highly available without performance degradation during maintenance windows. You need to implement the solution. Which three technologies should you implement? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point. A. Premium service tier B. Virtual machine Scale Sets C. Basic service tier D. SQL Data Sync E. Always On availability groups F. Zone-redundant configuration
A. Premium service tier E. Always On availability groups F. Zone-redundant configuration
A company uses Azure Data Lake Gen 1 Storage to store big data related to consumer behavior. You need to implement logging. Solution: Configure Azure Data Lake Storage diagnostics to store logs and metrics in a storage account. Does the solution meet the goal? A. Yes B. No
A. Yes
You develop data engineering solutions for a company. A project requires the deployment of resources to Microsoft Azure for batch data processing on Azure HDInsight. Batch processing will run daily and must: ✑ Scale to minimize costs ✑ Be monitored for cluster performance You need to recommend a tool that will monitor clusters and provide information to suggest how to scale. Solution: Monitor clusters by using Azure Log Analytics and HDInsight cluster management solutions. Does the solution meet the goal? A. Yes B. No
A. Yes HDInsight provides cluster-specific management solutions that you can add for Azure Monitor logs. Management solutions add functionality to Azure Monitor logs, providing additional data and analysis tools. These solutions collect important performance metrics from your HDInsight clusters and provide the tools to search the metrics. These solutions also provide visualizations and dashboards for most cluster types supported in HDInsight. By using the metrics that you collect with the solution, you can create custom monitoring rules and alerts.
You develop a data ingestion process that will import data to a Microsoft Azure SQL Data Warehouse. The data to be ingested resides in parquet files stored in an Azure Data Lake Gen 2 storage account. You need to load the data from the Azure Data Lake Gen 2 storage account into the Azure SQL Data Warehouse. Solution: 1. Create an external data source pointing to the Azure Data Lake Gen 2 storage account 2. Create an external file format and external table using the external data source 3. Load the data using the CREATE TABLE AS SELECT statement Does the solution meet the goal? A. Yes B. No
A. Yes. The solution creates an external file format and an external table from the external data source, and then loads the data by using the CREATE TABLE AS SELECT statement.
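As a hedged illustration, assuming the external data source and file format from the solution steps already exist (hypothetical names `AdlsGen2Source` and `ParquetFormat`, with an illustrative column list), the external table and the final load step might look like:

```sql
-- External table over the parquet files (table name and columns are illustrative).
CREATE EXTERNAL TABLE ext.StageSales
( SaleId INT, Amount DECIMAL(18,2), SaleDate DATE )
WITH (LOCATION = '/sales/',
      DATA_SOURCE = AdlsGen2Source,
      FILE_FORMAT = ParquetFormat);

-- Load into a distributed internal table with CREATE TABLE AS SELECT.
CREATE TABLE dbo.Sales
WITH (DISTRIBUTION = HASH(SaleId), CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM ext.StageSales;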
A company has a SaaS solution that uses Azure SQL Database with elastic pools. The solution will have a dedicated database for each customer organization. Customer organizations have peak usage at different periods during the year. Which two factors affect your costs when sizing the Azure SQL Database elastic pools? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point. A. maximum data size B. number of databases C. eDTUs consumption D. number of read operations E. number of transactions
A. maximum data size C. eDTUs consumption
A company has a Microsoft Azure HDInsight solution that uses different cluster types to process and analyze data. Operations are continuous. Reports indicate slowdowns during a specific time window. You need to determine a monitoring solution to track down the issue in the least amount of time. What should you use? A. Azure Log Analytics log search query B. Ambari REST API C. Azure Monitor Metrics D. HDInsight .NET SDK E. Azure Log Analytics alert rule query
B. Ambari REST API Ambari is the recommended tool for monitoring the health for any given HDInsight cluster.Note: Azure HDInsight is a high-availability service that has redundant gateway nodes, head nodes, and ZooKeeper nodes to keep your HDInsight clusters running smoothly. While this ensures that a single failure will not affect the functionality of a cluster, you may still want to monitor cluster health so you are alerted when an issue does arise. Monitoring cluster health refers to monitoring whether all nodes in your cluster and the components that run on them are available and functioning correctly.Ambari is the recommended tool for monitoring utilization across the whole cluster. The Ambari dashboard shows easily glanceable widgets that display metrics such as CPU, network, YARN memory, and HDFS disk usage. The specific metrics shown depend on cluster type. The "Hosts" tab shows metrics for individual nodes so you can ensure the load on your cluster is evenly distributed.
You develop data engineering solutions for a company. You need to ingest and visualize real-time Twitter data by using Microsoft Azure. Which three technologies should you use? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point. A. Event Grid topic B. Azure Stream Analytics Job that queries Twitter data from an Event Hub C. Azure Stream Analytics Job that queries Twitter data from an Event Grid D. Logic App that sends Twitter posts which have target keywords to Azure E. Event Grid subscription F. Event Hub instance
B. Azure Stream Analytics Job that queries Twitter data from an Event Hub D. Logic App that sends Twitter posts which have target keywords to Azure F. Event Hub instance
A company uses Azure SQL Database to store sales transaction data. Field sales employees need an offline copy of the database that includes last year's sales on their laptops when there is no internet connection available. You need to create the offline export copy. Which three options can you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point. A. Export to a BACPAC file by using Azure Cloud Shell, and save the file to an Azure storage account B. Export to a BACPAC file by using SQL Server Management Studio. Save the file to an Azure storage account C. Export to a BACPAC file by using the Azure portal D. Export to a BACPAC file by using Azure PowerShell and save the file locally E. Export to a BACPAC file by using the SqlPackage utility
B. Export to a BACPAC file by using SQL Server Management Studio. Save the file to an Azure storage account C. Export to a BACPAC file by using the Azure portal E. Export to a BACPAC file by using the SqlPackage utility
A company plans to store hundreds of files in Azure Blob Storage and Azure Data Lake Storage each day. The company uses the parquet format. You must develop a pipeline that meets the following requirements: ✑ Process data every six hours ✑ Offer interactive data analysis capabilities ✑ Offer the ability to process data using solid-state drive (SSD) caching ✑ Use Directed Acyclic Graph (DAG) processing mechanisms ✑ Provide support for REST API calls to monitor processes ✑ Provide native support for Python ✑ Integrate with Microsoft Power BI You need to select the appropriate data technology to implement the pipeline. Which data technology should you implement? A. Azure SQL Data Warehouse B. HDInsight Apache Storm cluster C. Azure Stream Analytics D. HDInsight Apache Hadoop cluster using MapReduce E. HDInsight Spark cluster
E. HDInsight Spark cluster. Spark supports interactive data analysis, DAG-based execution, SSD caching, REST APIs for job monitoring, native Python, and integration with Power BI; Storm is a stream-processing engine and does not meet the interactive analysis requirement.
Your company uses several Azure HDInsight clusters. The data engineering team reports several errors with some applications using these clusters. You need to recommend a solution to review the health of the clusters. What should you include in your recommendation? A. Azure Automation B. Log Analytics C. Application Insights
B. Log Analytics
You are the data engineer for your company. An application uses a NoSQL database to store data. The database uses the key-value and wide-column NoSQL database type. Developers need to access data in the database using an API. You need to determine which API to use for the database model and type. Which two APIs should you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point. A. Table API B. MongoDB API C. Gremlin API D. SQL API E. Cassandra API
B. MongoDB API E. Cassandra API
You have an Azure SQL database named DB1 that contains a table named Table1. Table1 has a field named Customer_ID that is varchar(22). You need to implement masking for the Customer_ID field to meet the following requirements: ✑ The first two prefix characters must be exposed. ✑ The last four characters must be exposed. ✑ All other characters must be masked. Solution: You implement data masking and use a credit card function mask. Does this meet the goal? A. Yes B. No
B. No
You have an Azure SQL database named DB1 that contains a table named Table1. Table1 has a field named Customer_ID that is varchar(22). You need to implement masking for the Customer_ID field to meet the following requirements: ✑ The first two prefix characters must be exposed. ✑ The last four characters must be exposed. ✑ All other characters must be masked. Solution: You implement data masking and use a random number function mask. Does this meet the goal? A. Yes B. No
B. No
You have an Azure SQL database named DB1 that contains a table named Table1. Table1 has a field named Customer_ID that is varchar(22). You need to implement masking for the Customer_ID field to meet the following requirements: ✑ The first two prefix characters must be exposed. ✑ The last four characters must be exposed. ✑ All other characters must be masked. Solution: You implement data masking and use an email function mask. Does this meet the goal? A. Yes B. No
B. No
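None of the built-in function masks in the three questions above satisfies the requirement. A hedged sketch of one that would is the custom string (partial) mask, which exposes the first two and last four characters; the table and column names come from the question, and the "XXXX" padding string is an assumption:

```sql
-- Expose the first 2 and last 4 characters of Customer_ID; mask the middle.
ALTER TABLE Table1
ALTER COLUMN Customer_ID ADD MASKED WITH (FUNCTION = 'partial(2,"XXXX",4)');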
A company uses Azure Data Lake Gen 1 Storage to store big data related to consumer behavior. You need to implement logging. Solution: Use information stored in Azure Active Directory reports. Does the solution meet the goal? A. Yes B. No
B. No. Instead, configure Azure Data Lake Storage diagnostics to store logs and metrics in a storage account.
You develop a data ingestion process that will import data to a Microsoft Azure SQL Data Warehouse. The data to be ingested resides in parquet files stored in an Azure Data Lake Gen 2 storage account. You need to load the data from the Azure Data Lake Gen 2 storage account into the Azure SQL Data Warehouse. Solution: 1. Create a remote service binding pointing to the Azure Data Lake Gen 2 storage account 2. Create an external file format and external table using the external data source 3. Load the data using the CREATE TABLE AS SELECT statement Does the solution meet the goal? A. Yes B. No
B. No. You need to create the external file format and external table from an external data source, not from a remote service binding.
You develop a data ingestion process that will import data to a Microsoft Azure SQL Data Warehouse. The data to be ingested resides in parquet files stored in an Azure Data Lake Gen 2 storage account. You need to load the data from the Azure Data Lake Gen 2 storage account into the Azure SQL Data Warehouse. Solution: 1. Create an external data source pointing to the Azure storage account 2. Create a workload group using the Azure storage account name as the pool name 3. Load the data using the CREATE TABLE AS SELECT statement Does the solution meet the goal? A. Yes B. No
B. No. Creating a workload group is not part of the load process. Instead, create an external file format and an external table that use an external data source pointing to the Azure Data Lake Gen 2 storage account.
An application will use Microsoft Azure Cosmos DB as its data solution. The application will use the Cassandra API to support a column-based database type that uses containers to store items. You need to provision Azure Cosmos DB. Which container name and item name should you use? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point. A. collection B. rows C. graph D. entities E. table
B. rows E. table. In the Cassandra API, the Cosmos DB container maps to a table and items map to rows (entities belong to the Table API).
You plan to use Microsoft Azure SQL Database instances with strict user access control. A user object must: ✑ Move with the database if it is run elsewhere ✑ Be able to create additional users You need to create the user object with correct permissions. Which two Transact-SQL commands should you run? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point. A. ALTER LOGIN Mary WITH PASSWORD = 'strong_password'; B. CREATE LOGIN Mary WITH PASSWORD = 'strong_password'; C. ALTER ROLE db_owner ADD MEMBER Mary; D. CREATE USER Mary WITH PASSWORD = 'strong_password'; E. GRANT ALTER ANY USER TO Mary;
C. ALTER ROLE db_owner ADD MEMBER Mary; D. CREATE USER Mary WITH PASSWORD = 'strong_password';
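A sketch of the two commands run in order, using the user name and placeholder password from the answer choices: CREATE USER ... WITH PASSWORD creates a contained database user (no server login), so the user moves with the database; db_owner membership allows creating additional users.

```sql
-- Contained database user: authenticated by the database itself,
-- so it travels with the database if the database is moved.
CREATE USER Mary WITH PASSWORD = 'strong_password';

-- db_owner membership lets Mary create additional users in the database.
ALTER ROLE db_owner ADD MEMBER Mary;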
You manage a solution that uses Azure HDInsight clusters. You need to implement a solution to monitor cluster performance and status. Which technology should you use? A. Azure HDInsight .NET SDK B. Azure HDInsight REST API C. Ambari REST API D. Azure Log Analytics E. Ambari Web UI
C. Ambari REST API
You are developing a solution that will stream to Azure Stream Analytics. The solution will have both streaming data and reference data. Which input type should you use for the reference data? A. Azure Cosmos DB B. Azure Event Hubs C. Azure Blob storage D. Azure IoT Hub
C. Azure Blob storage
The data engineering team manages Azure HDInsight clusters. The team spends a large amount of time creating and destroying clusters daily because most of the data pipeline process runs in minutes. You need to implement a solution that deploys multiple HDInsight clusters with minimal effort. What should you implement? A. Azure Databricks B. Azure Traffic Manager C. Azure Resource Manager templates D. Ambari web user interface
C. Azure Resource Manager templates
A company runs Microsoft SQL Server in an on-premises virtual machine (VM). You must migrate the database to Azure SQL Database. You synchronize users from Active Directory to Azure Active Directory (Azure AD). You need to configure Azure SQL Database to use an Azure AD user as administrator. What should you configure? A. For each Azure SQL Database, set the Access Control to administrator. B. For each Azure SQL Database server, set the Active Directory to administrator. C. For each Azure SQL Database, set the Active Directory administrator role. D. For each Azure SQL Database server, set the Access Control to administrator.
C. For each Azure SQL Database, set the Active Directory administrator role.
You develop data engineering solutions for a company. You must integrate the company's on-premises Microsoft SQL Server data with Microsoft Azure SQL Database. Data must be transformed incrementally. You need to implement the data integration solution. Which tool should you use to configure a pipeline to copy data? A. Use the Copy Data tool with Blob storage linked service as the source B. Use Azure PowerShell with SQL Server linked service as a source C. Use Azure Data Factory UI with Blob storage linked service as a source D. Use the .NET Data Factory API with Blob storage linked service as the source
C. Use Azure Data Factory UI with Blob storage linked service as a source
You plan to create a dimension table in Azure SQL Data Warehouse that will be less than 1 GB. You need to create the table to meet the following requirements: ✑ Provide the fastest query time. ✑ Minimize data movement. Which type of table should you use? A. hash distributed B. heap C. replicated D. round-robin
C. replicated
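A hedged example of the DDL, with a hypothetical dimension table name and columns (only the DISTRIBUTION = REPLICATE option follows from the answer):

```sql
-- Replicated: a full copy of the table is cached on each compute node,
-- so joins against it require no data movement.
CREATE TABLE dbo.DimProduct
( ProductKey INT NOT NULL, ProductName NVARCHAR(100) )
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);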
You implement an event processing solution using Microsoft Azure Stream Analytics. The solution must meet the following requirements: ✑ Ingest data from Blob storage ✑ Analyze data in real time ✑ Store processed data in Azure Cosmos DB Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place:
- Create a query statement with the ORDER BY clause
- Create a query statement with the SELECT INTO statement
- Configure Blob storage for a reference data JOIN clause
- Configure Azure Event Hub as input, select items with the TIMESTAMP BY clause
- Set up Cosmos DB as the output
- Configure Blob storage as input, select items with the TIMESTAMP BY clause
1. Configure Blob storage as input, select items with the TIMESTAMP BY clause.
2. Set up Cosmos DB as the output.
3. Create a query statement with the SELECT INTO statement.
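A hedged Stream Analytics query sketch tying the steps together, assuming input and output aliases named BlobInput and CosmosOutput and a timestamp column named EventTime (all three names are hypothetical):

```sql
-- Route events from the blob input to the Cosmos DB output,
-- using the event's own timestamp for temporal processing.
SELECT *
INTO CosmosOutput
FROM BlobInput TIMESTAMP BY EventTime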
You are creating a managed data warehouse solution on Microsoft Azure. You must use PolyBase to retrieve data from Azure Blob storage that resides in parquet format and load the data into a large table called FactSalesOrderDetails. You need to configure Azure SQL Data Warehouse to receive the data. Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place:
- Create an external file format to map the parquet files
- Load the data to a staging table
- Create the external table FactSalesOrderDetails
- Enable transparent data encryption
- Create an external data source for Azure Blob storage
- Create a master key on the database
- Configure PolyBase to use Azure Blob storage
1. Create a master key on the database.
2. Create an external data source for Azure Blob storage.
3. Create an external file format to map the parquet files.
4. Create the external table FactSalesOrderDetails.
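The four steps can be sketched in T-SQL as follows, hedged: the storage account, container, key, and column list are all placeholders, and the database scoped credential is an extra assumption used to hold the storage key:

```sql
-- 1. Master key protects the database scoped credential.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong-password>';

CREATE DATABASE SCOPED CREDENTIAL BlobCredential
WITH IDENTITY = 'user', SECRET = '<storage-account-key>';

-- 2. External data source for the blob container.
CREATE EXTERNAL DATA SOURCE BlobSource
WITH (TYPE = HADOOP,
      LOCATION = 'wasbs://sales@mystorageaccount.blob.core.windows.net',
      CREDENTIAL = BlobCredential);

-- 3. File format mapping the parquet files.
CREATE EXTERNAL FILE FORMAT ParquetFormat WITH (FORMAT_TYPE = PARQUET);

-- 4. External table over the files (columns are illustrative).
CREATE EXTERNAL TABLE ext.FactSalesOrderDetails
( OrderId INT, ProductKey INT, Quantity INT )
WITH (LOCATION = '/orders/',
      DATA_SOURCE = BlobSource,
      FILE_FORMAT = ParquetFormat);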
You are developing a solution to visualize multiple terabytes of geospatial data. The solution has the following requirements: ✑ Data must be encrypted. ✑ Data must be accessible by multiple resources on Microsoft Azure. You need to provision storage for the solution. Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place:
- Enable encryption on the Azure data lake using the Azure portal.
- Add an access policy for the new Azure data lake account to the key storage container.
- Create a new Azure data lake storage account with Azure data lake managed encryption keys.
- Select and configure an encryption key storage container.
- Create a new Azure data lake storage account with Azure key vault managed encryption keys.
- Create a new Azure data lake storage account with encryption disabled.
1. Create a new Azure data lake storage account with Azure key vault managed encryption keys.
2. Select and configure an encryption key storage container.
3. Add an access policy for the new Azure data lake account to the key storage container.
4. Enable encryption on the Azure data lake using the Azure portal.
Your company manages on-premises Microsoft SQL Server pipelines by using a custom solution. The data engineering team must implement a process to pull data from SQL Server and migrate it to Azure Blob storage. The process must orchestrate and manage the data lifecycle. You need to configure Azure Data Factory to connect to the on-premises SQL Server database. Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place:
- Create an Azure Data Factory resource.
- Configure a self-hosted integration runtime.
- Create a virtual private network (VPN) connection from on-premises to Microsoft Azure.
- Create a database master key on the SQL Server.
- Back up the database and send it to Azure Blob storage.
- Configure the on-premises SQL Server instance with an integration runtime.
1. Create an Azure Data Factory resource.
2. Configure a self-hosted integration runtime.
3. Configure the on-premises SQL Server instance with an integration runtime.
You develop data engineering solutions for a company. You need to deploy a Microsoft Azure Stream Analytics job for an IoT solution. The solution must: ✑ Minimize latency. ✑ Minimize bandwidth usage between the job and IoT device. Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place:
- Configure routes
- Create an Azure Data Lake Storage container
- Create an IoT hub and add the Azure Stream Analytics module to the IoT hub namespace
- Create an Azure Stream Analytics edge job and configure job definition save location
- Create an Azure Stream Analytics cloud job and configure job definition save location
- Create an Azure Blob storage container
- Configure streaming units
1. Create an IoT hub and add the Azure Stream Analytics module to the IoT hub namespace.
2. Create an Azure Blob storage container.
3. Create an Azure Stream Analytics edge job and configure job definition save location.
4. Configure routes.
You have a table named SalesFact in an Azure SQL data warehouse. SalesFact contains sales data from the past 36 months and has the following characteristics: ✑ Is partitioned by month ✑ Contains one billion rows ✑ Has clustered columnstore indexes At the beginning of each month, you need to remove data from SalesFact that is older than 36 months as quickly as possible. Which three actions should you perform in sequence in a stored procedure? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place:
- Create an empty table named SalesFact_Work that has the same schema as SalesFact.
- Drop the SalesFact_Work table.
- Copy the data to a new table by using CREATE TABLE AS SELECT.
- Truncate the partition containing the stale data.
- Switch the partition containing the stale data from SalesFact to SalesFact_Work.
- Execute a DELETE statement where the value in the date column is more than 36 months ago.
1. Create an empty table named SalesFact_Work that has the same schema as SalesFact.
2. Switch the partition containing the stale data from SalesFact to SalesFact_Work.
3. Drop the SalesFact_Work table.
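A hedged T-SQL sketch of the partition-switch pattern, assuming partition 1 holds the oldest month; the column list, distribution, and boundary value are illustrative, not from the question:

```sql
-- 1. Work table with the same schema and compatible partitioning as SalesFact.
CREATE TABLE dbo.SalesFact_Work
( SaleId INT, Amount DECIMAL(18,2), SaleDate DATE )
WITH (DISTRIBUTION = HASH(SaleId), CLUSTERED COLUMNSTORE INDEX,
      PARTITION (SaleDate RANGE RIGHT FOR VALUES ('20210101')));

-- 2. Metadata-only operation: the stale partition moves instantly, no row copy.
ALTER TABLE dbo.SalesFact SWITCH PARTITION 1 TO dbo.SalesFact_Work PARTITION 1;

-- 3. Drop the work table, discarding the stale rows.
DROP TABLE dbo.SalesFact_Work;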
You have data stored in thousands of CSV files in Azure Data Lake Storage Gen2. Each file has a header row followed by a properly formatted carriage return (\r) and line feed (\n). You are implementing a pattern that batch loads the files daily into an Azure SQL data warehouse by using PolyBase. You need to skip the header row when you import the files into the data warehouse. Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place:
- Create an external data source that uses the abfs location
- Create an external file format and set the First_Row option
- Create an external data source that uses the hadoop location
- Create a database scoped credential that uses an OAuth2 token and a key
- Use CREATE EXTERNAL TABLE AS SELECT (CETAS) and create a view that removes the empty row
1. Create an external data source that uses the abfs location.
2. Create an external file format and set the First_Row option.
3. Use CREATE EXTERNAL TABLE AS SELECT (CETAS) and create a view that removes the empty row.
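A hedged sketch of the middle step, showing the FIRST_ROW option that makes PolyBase skip the header row (the format name and field terminator are assumptions):

```sql
-- FIRST_ROW = 2 tells PolyBase to start reading at the second row of each
-- file, skipping the header row.
CREATE EXTERNAL FILE FORMAT CsvSkipHeader
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2));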
A company is designing a hybrid solution to synchronize data from an on-premises Microsoft SQL Server database to Azure SQL Database. You must perform an assessment of the databases to determine whether data will move without compatibility issues. You need to perform the assessment. Which tool should you use? A. SQL Server Migration Assistant (SSMA) B. Microsoft Assessment and Planning Toolkit C. SQL Vulnerability Assessment (VA) D. Azure SQL Data Sync E. Data Migration Assistant (DMA)
E. Data Migration Assistant (DMA)
You are a data engineer implementing a lambda architecture on Microsoft Azure. You use an open-source big data solution to collect, process, and maintain data. The analytical data store performs poorly. You must implement a solution that meets the following requirements: ✑ Provide data warehousing ✑ Reduce ongoing management activities ✑ Deliver SQL query responses in less than one second You need to create an HDInsight cluster to meet the requirements. Which type of cluster should you create? A. Interactive Query B. Apache Hadoop C. Apache HBase D. Apache Spark
A. Interactive Query. Interactive Query (Hive LLAP) is a managed HDInsight cluster type that uses in-memory caching to deliver low-latency, interactive SQL queries over a Hive data warehouse.
A company manages several on-premises Microsoft SQL Server databases. You need to migrate the databases to Microsoft Azure by using a backup process of Microsoft SQL Server. Which data technology should you use? A. Azure SQL Database single database B. Azure SQL Data Warehouse C. Azure Cosmos DB D. Azure SQL Database Managed Instance
D. Azure SQL Database Managed Instance
You develop data engineering solutions for a company. The company has on-premises Microsoft SQL Server databases at multiple locations. The company must integrate data with Microsoft Power BI and Microsoft Azure Logic Apps. The solution must avoid single points of failure during connection and transfer to the cloud. The solution must also minimize latency. You need to secure the transfer of data between on-premises databases and Microsoft Azure. What should you do? A. Install a standalone on-premises Azure data gateway at each location B. Install an on-premises data gateway in personal mode at each location C. Install an Azure on-premises data gateway at the primary location D. Install an Azure on-premises data gateway as a cluster at each location
D. Install an Azure on-premises data gateway as a cluster at each location
You plan to implement an Azure Cosmos DB database that will write 100,000 JSON documents every 24 hours. The database will be replicated to three regions. Only one region will be writable. You need to select a consistency level for the database to meet the following requirements: ✑ Guarantee monotonic reads and writes within a session. ✑ Provide the fastest throughput. ✑ Provide the lowest latency. Which consistency level should you select? A. Strong B. Bounded Staleness C. Eventual D. Session E. Consistent Prefix
D. Session
You are developing a solution using a Lambda architecture on Microsoft Azure. The data at rest layer must meet the following requirements: Data storage: ✑ Serve as a repository for high volumes of large files in various formats. ✑ Implement optimized storage for big data analytics workloads. ✑ Ensure that data can be organized using a hierarchical structure. Batch processing: ✑ Use a managed solution for in-memory computation processing. ✑ Natively support Scala, Python, and R programming languages. ✑ Provide the ability to resize and terminate the cluster automatically. Analytical data store: ✑ Support parallel processing. ✑ Use columnar storage. ✑ Support SQL-based languages. You need to identify the correct technologies to build the Lambda architecture. Which technologies should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
Data storage:
- Azure SQL DB
- Azure Blob storage
- Azure Cosmos DB
- Azure Data Lake Store
Batch processing:
- HDInsight Spark
- HDInsight Hadoop
- Azure Databricks
- HDInsight Interactive Query
Analytical data store:
- HDInsight HBase
- Azure SQL Data Warehouse
- Azure Analysis Services
- Azure Cosmos DB
Data storage = Azure Data Lake Store (unlike Blob storage, it natively organizes data in a hierarchical structure) Batch processing = Azure Databricks Analytical data store = Azure SQL Data Warehouse
Your company has an on-premises Microsoft SQL Server instance. The data engineering team plans to implement a process that copies data from the SQL Server instance to Azure Blob storage. The process must orchestrate and manage the data lifecycle. You need to configure Azure Data Factory to connect to the SQL Server instance. Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place: Configure a linked service to connect to the SQL Server instance. From the on-premises network, install and configure a self-hosted integration runtime. From the SQL Server, back up the database and then copy the backup to Azure Blob storage. Deploy an Azure Data Factory. From the SQL Server, create a database master key.
1. Deploy an Azure Data Factory. 2. From the on-premises network, install and configure a self-hosted integration runtime. 3. Configure a linked service to connect to the SQL Server instance.
Your company plans to create an event processing engine to handle streaming data from Twitter. The data engineering team uses Azure Event Hubs to ingest the streaming data. You need to implement a solution that uses Azure Databricks to receive the streaming data from the Azure Event Hubs. Which three actions should you recommend be performed in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. Select and Place: Create and configure a notebook that consumes the streaming data. Import data from Blob storage. Use environment variables to define the Apache Spark connection. Deploy the Azure Databricks service. Configure an ODBC or JDBC connector. Deploy a Spark cluster and then attach the required libraries to the cluster.
1. Deploy the Azure Databricks service. 2. Deploy a Spark cluster and then attach the required libraries to the cluster. 3. Create and configure a notebook that consumes the streaming data.
You have an Azure SQL data warehouse. Using PolyBase, you create a table named [Ext].[Items] to query Parquet files stored in Azure Data Lake Storage Gen2 without importing the data to the data warehouse. The external table has three columns. You discover that the Parquet files have a fourth column named ItemID. Which command should you run to add the ItemID column to the external table?
DROP EXTERNAL TABLE [Ext].[Items];
CREATE EXTERNAL TABLE [Ext].[Items]
(
    [ItemID] [int] NULL,
    [ItemName] nvarchar(50) NULL,
    [ItemType] nvarchar(20) NULL,
    [ItemDescription] nvarchar(250) NULL
)
WITH
(
    LOCATION = '/Items/',
    DATA_SOURCE = AzureDataLakeStore,
    FILE_FORMAT = Parquet,
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 0
);
(External table definitions cannot be altered in place; the table must be dropped with DROP EXTERNAL TABLE and recreated with the new column.)
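The DROP/CREATE answer above assumes the supporting PolyBase objects already exist. As the earlier ordering question notes, these are a database scoped credential, an external data source, and an external file format, created in that order. A minimal sketch of how they might be defined — the master key password, credential identity, secret, and storage URI are placeholders, not values given in the question:

```sql
-- Sketch only: all names and secrets are placeholders.
-- A master key must exist before a database scoped credential can be created.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong-password>';

-- Credential holding the application ID and access key for the storage account.
CREATE DATABASE SCOPED CREDENTIAL AdlsCredential
WITH IDENTITY = '<application-id>',
     SECRET = '<application-access-key>';

-- External data source pointing at the Data Lake Storage Gen2 account;
-- the name matches the DATA_SOURCE used by [Ext].[Items] above.
CREATE EXTERNAL DATA SOURCE AzureDataLakeStore
WITH (
    TYPE = HADOOP,
    LOCATION = 'abfss://<container>@<account>.dfs.core.windows.net',
    CREDENTIAL = AdlsCredential
);

-- File format matching the Parquet files being queried.
CREATE EXTERNAL FILE FORMAT Parquet
WITH (FORMAT_TYPE = PARQUET);
```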
You are a data architect. The data engineering team needs to configure synchronization of data between an on-premises Microsoft SQL Server database and Azure SQL Database. Ad-hoc and reporting queries are overutilizing the on-premises production instance. The synchronization process must: ✑ Perform an initial data synchronization to Azure SQL Database with minimal downtime ✑ Perform bi-directional data synchronization after initial synchronization You need to implement this synchronization solution. Which synchronization method should you use? A. transactional replication B. Data Migration Assistant (DMA) C. backup and restore D. SQL Server Agent job E. Azure SQL Data Sync
E. Azure SQL Data Sync
You are developing a data engineering solution for a company. The solution will store a large set of key-value pair data by using Microsoft Azure Cosmos DB. The solution has the following requirements: ✑ Data must be partitioned into multiple containers. ✑ Data containers must be configured separately. ✑ Data must be accessible from applications hosted around the world. ✑ The solution must minimize latency. You need to provision Azure Cosmos DB. What should you do? A. Configure account-level throughput. B. Provision an Azure Cosmos DB account with the Azure Table API. Enable geo-redundancy. C. Configure table-level throughput. D. Replicate the data globally by manually adding regions to the Azure Cosmos DB account. E. Provision an Azure Cosmos DB account with the Azure Table API. Enable multi-region writes.
E. Provision an Azure Cosmos DB account with the Azure Table API. Enable multi-region writes.
A company has a SaaS solution that uses Azure SQL Database with elastic pools. The solution contains a dedicated database for each customer organization. Customer organizations have peak usage at different periods during the year. You need to implement the Azure SQL Database elastic pool to minimize cost. Which option or options should you configure? A. Number of transactions only B. eDTUs per database only C. Number of databases only D. CPU usage only E. eDTUs and max data size
E. eDTUs and max data size
A company plans to use Platform-as-a-Service (PaaS) to create the new data pipeline process. The process must meet the following requirements:
Ingest: ✑ Access multiple data sources. ✑ Provide the ability to orchestrate workflow. ✑ Provide the capability to run SQL Server Integration Services packages.
Store: ✑ Optimize storage for big data workloads. ✑ Provide encryption of data at rest. ✑ Operate with no size limits.
Prepare and Train: ✑ Provide a fully-managed and interactive workspace for exploration and visualization. ✑ Provide the ability to program in R, SQL, Python, Scala, and Java. ✑ Provide seamless user authentication with Azure Active Directory.
Model & Serve: ✑ Implement native columnar storage. ✑ Support for the SQL language. ✑ Provide support for structured streaming.
You need to build the data integration pipeline. Which technologies should you use? To answer, select the appropriate options in the answer area.
Ingest = Azure Data Factory Store = Azure Data Lake Storage Prepare and Train = Azure Databricks Model & Serve = Azure SQL Data Warehouse
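Once PolyBase objects over Data Lake Storage are in place, the "run T-SQL statements to load data" step from the first ordering question is typically a CREATE TABLE AS SELECT (CTAS) from an external table into a distributed warehouse table. A rough sketch — the table names, distribution choice, and the [Ext].[Items] external table reused here are illustrative, not part of the question:

```sql
-- Illustrative only: load external (Data Lake) data into the warehouse
-- with CTAS, choosing a distribution and a columnstore index,
-- then create statistics on a commonly filtered column.
CREATE TABLE dbo.Items
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    CLUSTERED COLUMNSTORE INDEX
)
AS SELECT * FROM [Ext].[Items];

CREATE STATISTICS ItemID_stats ON dbo.Items (ItemID);
```

CTAS runs the load in parallel across the compute nodes, which is why the PolyBase path is preferred over row-by-row inserts for bulk loads into Azure SQL Data Warehouse.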