Databricks Lakehouse Accreditation Badge Exam questions


Which of the following is a security feature made available in the Databricks Lakehouse Platform by Unity Catalog? Select two responses. - Single-source-of-truth identity management - Workspace-specific data metastores - Fine-grained access control on data objects - Workspace-specific identity management - Databricks SQL warehouse access control

Single-source-of-truth identity management. Fine-grained access control on data objects.

Which of the following describes what challenges a data organization would likely face when migrating from a data warehouse to a data lake? Select two responses. - There are increased data reliability issues in a data lake. - There are increased security and privacy concerns in a data lake. - There are increased data quality guarantees in a data lake. - There are increased cloud storage costs in a data lake. - There are increased performance speeds in a data lake.

There are increased data reliability issues in a data lake. There are increased security and privacy concerns in a data lake.

In which of the following ways do serverless compute resources differ from classic compute resources within the Databricks Lakehouse Platform? Select two responses. - They exist within the customer cloud account - They are located within the cloud - They are always running and reserved for a single, specific customer when needed - They exist within the Databricks cloud account - They result in lower costs by not overprovisioning

They exist within the Databricks cloud account. They result in lower costs by not overprovisioning

A data architect is evaluating data warehousing solutions for their organization to use. As a part of this, the architect is considering the Databricks Lakehouse Platform. Which of the following is a benefit of using the Databricks Lakehouse Platform for warehousing? Select four responses. - Engineering capabilities supporting warehouse source data - Built-in governance for single-source-of-truth data - A rich ecosystem of business intelligence (BI) integrations - Local development software to integrate with other capabilities - Best available price/performance

Engineering capabilities supporting warehouse source data. Built-in governance for single-source-of-truth data. A rich ecosystem of business intelligence (BI) integrations. Best available price/performance.

Which of the following is a benefit of the Databricks Lakehouse Platform being designed to support all data and artificial intelligence (AI) workloads? Select four responses. 1 Analysts can easily integrate their favorite business intelligence (BI) tools for further analysis. 2 Data analysts, data engineers, and data scientists can easily collaborate within a single platform. 3 Data workloads can be automatically scaled when needed. 4 There is increased need for multiple, specialist platform administrators to maintain each component of the unified platform. 5 Data teams can all utilize secure data from a single source to deliver reliable, consistent results across workloads at scale.

1, 2, 3, 5

Data organizations need specialized environments designed specifically for machine learning workloads. Which of the following is made available by Databricks as part of Databricks Machine Learning to support machine learning workloads? Select four responses. - Lakehouse-specific deep learning frameworks - Built-in automated machine learning development - Built-in real-time model serving - Support for distributed model training on big data - Optimized and preconfigured machine learning frameworks

Built-in automated machine learning development. Built-in real-time model serving. Support for distributed model training on big data. Optimized and preconfigured machine learning frameworks.

Which of the following do Databricks SQL users experience when using serverless Databricks SQL warehouses rather than classic Databricks SQL warehouses? Select one response. - Availability of Photon - Increased total cost of use - Availability of automatic scaling - Performance degradation on long-running queries - Expedited environment startup

Expedited environment startup

The Databricks Lakehouse Platform architecture consists of a control plane and a data plane. Which of the following resources exists within the Databricks control plane? Select two responses. - Cluster configurations - Serverless compute resources - Cloud object storage - Notebooks - Classic compute resources

Cluster configurations, Notebooks

Unity Catalog offers improved Lakehouse data object governance and organization capabilities for data segregation. Which of the following is a consequence of using Unity Catalog to manage, organize and segregate data objects? Select one response. - Catalogs exist within schemas (databases) - Complete data object referencing requires three levels - Table metadata is required - Data in tables and views must be stored in external storage locations - Views are made available outside of their schemas (databases)

Complete data object referencing requires three levels
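
To make the three-level reference concrete, here is a minimal sketch in plain Python that splits a fully qualified name of the form catalog.schema.table into its parts. The function name `parse_table_name` and the example name `main.sales.orders` are hypothetical and chosen only for illustration; this is not a Databricks API, and it does not handle backtick-quoted identifiers that contain dots.

```python
from typing import NamedTuple


class TableName(NamedTuple):
    """The three levels needed to reference a table completely."""
    catalog: str
    schema: str
    table: str


def parse_table_name(full_name: str) -> TableName:
    """Split 'catalog.schema.table' into its three parts.

    Hypothetical helper for illustration only. Raises ValueError when the
    name does not have exactly three non-empty, dot-separated parts.
    """
    parts = full_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected catalog.schema.table, got {full_name!r}"
        )
    return TableName(*parts)


# Example with made-up names: a two-level name such as 'sales.orders'
# would be rejected because the catalog level is missing.
name = parse_table_name("main.sales.orders")
print(name.catalog, name.schema, name.table)
```

A two-level name is incomplete under this scheme, which is why the sketch rejects it rather than guessing a default catalog.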

Which of the following Databricks Lakehouse Platform services or capabilities provides a data warehousing experience to its users? Select one response. - Databricks Machine Learning - Unity Catalog - Delta Lake - Data Science and Engineering Workspace - Databricks SQL

Databricks SQL

Many organizations use a variety of open-source and proprietary tools for data orchestration, but these tools often have their own limitations. To address the orchestration needs of these organizations, Databricks developed Databricks Workflows. Which of the following is a benefit of using Databricks Workflows for orchestration purposes? Select two responses. - Databricks Workflows provides multiple-task workflow functionality only for Delta Live Tables workloads - Databricks Workflows supports workloads across multiple cloud service providers and tools - Databricks Workflows supports automating workloads as long as they are not in notebooks - Databricks Workflows provides Git-backed version control capabilities to notebooks - Databricks Workflows supports tasks for data ingestion, data engineering, machine learning, and business intelligence (BI)

Databricks Workflows supports workloads across multiple cloud service providers and tools. Databricks Workflows supports tasks for data ingestion, data engineering, machine learning, and business intelligence (BI)

While the Databricks Lakehouse Platform provides support for many types of data, analytics, and machine learning workloads, some organizations prefer to continue using other preferred vendors for use cases like data ingestion, data transformation, business intelligence, and machine learning. - Databricks can be integrated directly with a large number of Databricks partners. - Databricks can use cloud service provider capabilities to efficiently share data with other data tools and platforms. - Databricks can be used on-premises to allow for secure, in-house integrations. - Databricks can be used locally to allow developers to manually integrate with other systems. - Databricks cannot be used alongside other big data tools and platforms.

Databricks can be integrated directly with a large number of Databricks partners.

One of the foundational technologies provided by the Databricks Lakehouse Platform is an open-source, file-based storage format that provides a number of benefits. These benefits include ACID transaction guarantees, scalable data and metadata handling, audit history and time travel, table schema enforcement and schema evolution, support for deletes/updates/merges, and unified streaming and batch data processing. Which of the following technologies is being described in the above statement? Select one response. - Photon - Apache Spark - Unity Catalog - Delta Lake - MLflow

Delta Lake

In the past, a lot of data engineering resources needed to be contributed to the development of tooling and other mechanisms for creating and managing data workloads. In response, Databricks developed and released a declarative ETL framework so data engineers can focus on helping their organizations get value from their data. Which of the following technologies is being described above? Select one response. - Delta Lake - Databricks SQL Queries - Databricks Jobs - Delta Live Tables - Autologging

Delta Live Tables

Data sharing has traditionally been performed by proprietary vendor solutions, SSH File Transfer Protocol (SFTP), or cloud-specific solutions. However, each of these sharing tools and solutions comes with its own set of limitations. As a result, Databricks helped to develop Delta Sharing. Which of the following describes Delta Sharing as a solution for data sharing?

Delta Sharing is a multicloud, open-source solution for securely and efficiently sharing live data from the lakehouse with any external system.

Which of the following data engineering capabilities simplifies the work of data engineers on the Databricks Lakehouse Platform? Select three responses. - End-to-end data pipeline visibility - Flexible machine learning development solutions - SQL and Python development compatibility - Automatic deployment and data operations - Serverless cluster startup times

End-to-end data pipeline visibility, SQL and Python development compatibility, Automatic deployment and data operations

Which of the following correctly describes how a specific capability of the Databricks Lakehouse Platform supports a data streaming pattern? Select three responses. - MLflow ingests its automatic experiment tracking data into a stream for continuous monitoring. - Structured Streaming enables stream-based machine learning inference. - Databricks Workflows automatically passes data from task to task in regular microbatches. - Auto Loader continuously and incrementally ingests streaming data. - Delta Live Tables processes ETL pipelines on streaming data with advanced monitoring mechanisms.

Structured Streaming enables stream-based machine learning inference. Auto Loader continuously and incrementally ingests streaming data. Delta Live Tables processes ETL pipelines on streaming data with advanced monitoring mechanisms.

Which of the following lists the relational entities in order from largest (most coarse) to smallest (most granular) within their hierarchy? Select one response. - Schema (Database) → Metastore → Catalog → Table - Catalog → Metastore → Schema (Database) → Table - Schema (Database) → Catalog → Table → Metastore - Metastore → Catalog → Schema (Database) → Table - Metastore → Catalog → Table → Schema (Database)

Metastore → Catalog → Schema (Database) → Table

Which of the following describes the motivation for the creation of the data lakehouse? Select one response. - Organizations needed to reduce the costs of storing their open-format data files in the cloud. - Organizations needed a reliable data management system with transactional guarantees for their structured data. - Organizations needed a way to scale their data lake workloads without investing in additional on-premises hardware. - Organizations needed to be able to develop increasingly complex machine learning workloads using a simple, SQL-based solution. - Organizations needed a single, flexible, high-performance system to support data, analytics, and machine learning workloads.

Organizations needed a single, flexible, high-performance system to support data, analytics, and machine learning workloads.

It can be challenging for a data lakehouse to provide both performance and scalability for all of its query-based workloads to the standards of a data warehouse and a data lake. As a result, Databricks has introduced a technology built atop Apache Spark to further speed up and scale these varied workloads. Which of the following technologies is being described in the above statement? Select one response. - Delta Lake - Photon - AutoML - Unity Catalog

Photon

Which of the following architecture benefits is provided directly by the Databricks Lakehouse Platform? Select three responses. - Scalable, redundant cloud-based data storage - Available on and across multiple clouds - Built on open source and open standards - Unified security and governance approach for all data assets - Efficient on-premises optimized hardware

Available on and across multiple clouds. Built on open source and open standards. Unified security and governance approach for all data assets.

Which of the following compute resources is available in the Databricks Lakehouse Platform? Select two responses. - On-premises clusters - Serverless clusters - Classic clusters - Local Databricks SQL warehouses - Serverless Databricks SQL warehouses

Classic clusters, Serverless Databricks SQL warehouses

Maintaining and improving data quality is a major goal of modern data engineering. Which of the following contributes directly to high levels of data quality within the Databricks Lakehouse Platform? Select two responses. - Data expectations enforcement - Business intelligence tool integrations - Simplified machine learning model serving - Table schema evolution - Apache Spark's data format flexibility

Data expectations enforcement, Table schema evolution
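
As an illustration of what enforcing data expectations means, the sketch below applies named quality rules to a batch of rows, drops rows that violate any rule, and counts violations per rule. It is plain Python written for this card and is not the Delta Live Tables API; the function `apply_expectations`, the rule names, and the sample rows are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, object]
Expectation = Callable[[Row], bool]


def apply_expectations(
    rows: Iterable[Row],
    expectations: Dict[str, Expectation],
) -> Tuple[List[Row], Dict[str, int]]:
    """Keep rows that satisfy every expectation and count violations.

    Hypothetical helper: a toy version of a "drop on violation" policy.
    Returns (kept_rows, violation_counts_by_expectation_name).
    """
    kept: List[Row] = []
    violations = {name: 0 for name in expectations}
    for row in rows:
        row_ok = True
        for name, check in expectations.items():
            if not check(row):
                violations[name] += 1
                row_ok = False
        if row_ok:
            kept.append(row)
    return kept, violations


# Made-up sample data: one clean row, one with a missing id,
# one with a negative amount.
sample_rows = [
    {"id": 1, "amount": 10},
    {"id": None, "amount": 5},
    {"id": 3, "amount": -2},
]
rules = {
    "valid_id": lambda r: r["id"] is not None,
    "non_negative_amount": lambda r: r["amount"] >= 0,
}
clean_rows, violation_counts = apply_expectations(sample_rows, rules)
print(clean_rows, violation_counts)
```

Counting violations per rule, rather than only dropping rows, keeps a record of which quality checks are failing and how often.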

Which of the following is a common problem within a data lake architecture that can be easily solved by using the Databricks Lakehouse Platform? Select three responses. - Ineffective partitioning - Too many small files - Lack of cloud service integrations - Inability to use open-source data formats - Lack of ACID transaction support

Ineffective partitioning, Too many small files, Lack of ACID transaction support. Also: Lack of schema enforcement, lack of integration with a data catalog

Which of the following describes how the Databricks Lakehouse Platform makes data governance simpler? Select one response. - Unity Catalog provides a single governance solution across workload types and clouds. - Unity Catalog provides a different governance solution for each cloud. - Unity Catalog provides a different governance solution for each workload. - Unity Catalog provides a different governance solution for each major Databricks Lakehouse Platform Service. - Unity Catalog provides a single governance solution fully managed by the Databricks team.

Unity Catalog provides a single governance solution across workload types and clouds.

