My Snowflake Study Set

Pataasin ang iyong marka sa homework at exams ngayon gamit ang Quizwiz!

What is order of precedence for FILE FORMAT

1. COPY INTO TABLE statement. 2. Stage definition. 3. Table definition.

How many child tasks can a single task have?

100

How many predecessor tasks can a task have?

100

When unloading to an external stage, what is the MAXIMUM file size supported?

5 GB

What is a clustering key

A clustering key is a subset of columns in a table (or expressions on a table) that are explicitly designated to co-locate the data in the table in the same micro-partitions. This is useful for very large tables where the ordering was not ideal (at the time the data was inserted/loaded) or extensive DML has caused the table's natural clustering to degrade.

What is a directory table

A directory table is an implicit object layered on a stage (not a separate database object) and is conceptually similar to an external table because it stores file-level metadata about the data files in the stage. A directory table has no grantable privileges of its own.

What is PUBLIC schema

A schema is a logical grouping of database objects (tables, views, etc.), and each schema belongs to a single database. The PUBLIC schema is the default schema for a database, and all objects are, by default, created inside it if no other schema is specified.

Which ALTER commands will impact a column's availability in Time Travel?

ALTER TABLE ... SET DATA TYPE. Decreasing the precision of a number column can impact Time Travel, for example, converting from NUMBER(20,2) to NUMBER(10,2). SET DATA TYPE is the command that can make that.

Which command should be used to implement a masking policy that was already created in Snowflake?

ALTER TABLE <name> { ALTER | MODIFY } COLUMN <col1_name> SET MASKING POLICY <policy_name> [ USING ( <col1_name> , cond_col_1 , ... ) ]

Network policies can be set at which Snowflake levels?

Account and User

What type of query benefits from query acceleration service?

Ad hoc analytics, Workloads with unpredictable data volume per query. Queries with large scans and selective filters.

Standard (default) scaling policy. Warehouse shuts down..

After 2 to 3 consecutive successful checks (performed at 1 minute intervals)

Economy policy. Warehouse shuts down..

After 5 to 6 consecutive successful checks (performed at 1 minute intervals)

When clustering a table, which data types can be used as clustering keys?

All data types except GEOGRAPHY, VARIANT, OBJECT, ARRAY

How are unresolved objects resolved for Views and UDFs

All unqualifed objects in a view or UDF definition will be resolved in the view's or UDF's schema only.

What is cached during a query on a virtual warehouse?

Any columns accessed during the query

Which types of charts does Snowsight support?

Bar charts. Line charts. Scatterplots. Heat grids. Scorecards.

At which point is data encrypted when using a PUT command?

Before it is sent from the user's machine. It always uses end-to-end encryption.

Which is the MINIMUM required Snowflake edition that a user must have if they want to use AWS/Azure Privatelink or Google Cloud Private Service Connect?

Business Critical

Which of the following commands cannot be used within a reader account?

CREATE SHARE

What can a Snowflake user do in the Admin area of Snowsight?

Connect to Snowflake partner to explore extended functionality.

What is tool for identifying sensitive data?

Data classification

Users are responsible for data storage costs until what occurs?

Data expires from Fail-safe. Storage is calculated and charged for data regardless of whether it is in the Active, Time Travel, or Fail-safe state. Because these life-cycle states are sequential, updated/deleted data protected by CDP will continue to incur storage costs until the data leaves the Fail-safe state.

What is used for cross-region data sharing

Data replication

Which of the following Snowflake objects can be shared using a secure share?

Database, Tables, Dynamic Tables, External Tables, Iceberg tables, Secure views, Secure materialized views, Secure user-defined functions (UDFs)

When unloading to a stage, which of the following is a recommended practice or approach?

Defining a File Format: File format defines the type of data to be unloaded into the stage or S3. It is best practice to define an individual file format when regularly used to unload a certain type of data based on the characteristics of the file needed.

What column type does a Kafka connector store formatted information in a single column?

Each Kafka message is passed to Snowflake in JSON format or Avro format. The Kafka connector stores that formatted information in a single column of type VARIANT.

What is Minimum Snowflake edition required to use the periodic rekeying of micro-partitions?

Enterprise

A size 3X-Large multi-cluster warehouse runs one cluster for one full hour and then runs two clusters for the next full hour.What would be the total number of credits billed?

Equals 192. First hour, 1 cluster: 64 Second hour, 2 clusters: 64x2

Which features are available with Snowflake Enterprise edition?

Extended time travel. Native support for geospatial data

What is FLATTEN function

FLATTEN is a table function that takes a VARIANT, OBJECT, or ARRAY column and produces a lateral view (i.e. an inline view that contains correlation referring to other tables that precede it in the FROM clause, and returns a row for each element or attribute within the column.

Which REST API can be used with unstructured data?

GET /api/files/

Which SQL command can be used to see the CREATE definition of a masking policy?

GET_DDL and DESCRIBE

Which functions can be used to share unstructured data through a secure view?

GET_PRESIGNED_URL and BUILD_SCOPED_FILE_URL

Which file function retrieves the URL for an external or internal stage using the stage name as input?

GET_STAGE_LOCATION

Which privilege is required to use the search optimization service in Snowflake?

GRANT ADD SEARCH OPTIMIZATION ON SCHEMA TO ROLE and must have OWNERSHIP privilege on the table

What is HyperLogLog (HLL)

HyperLogLog can be used in place of COUNT(DISTINCT ...) in situations where estimating cardinality is acceptable. HyperLogLog is a state-of-the-art cardinality estimation algorithm, capable of estimating distinct cardinalities of trillions of rows with an average relative error of a few percent.

What information is found within the Statistic output in the Query Profile Overview?

IO, DML, Pruning, Spilling, Network, External Functions, Extension Functions

Economy policy. Warehouse starts..

If there is enough query load to keep the cluster busy for at least 6 minutes.

What is difference between regular schema and managed access schema

In regular (i.e. non-managed) schemas, object owners (i.e. a role with the OWNERSHIP privilege on an object) can grant access on their objects to other roles, with the option to further grant those roles the ability to manage object grants. With managed access schemas, object owners lose the ability to make grant decisions. Only the schema owner (i.e. the role with the OWNERSHIP privilege on the schema) or a role with the MANAGE GRANTS privilege can grant privileges on objects in the schema, including future grants, centralizing privilege management.

Snowpark provides libraries for which programming languages?

Java, Python, Scala

When unloading data to an external stage, which compression format can be used for Parquet files with the COPY INTO command?

LZO or SNAPPY

What is the recommended file sizing for data loading using Snowpipe?

Loading data files roughly 100-250 MB in size or larger reduces the overhead charge relative to the amount of total data loaded to the point where the overhead cost is immaterial.

Which of the following is a valid source for an external stage when the Snowflake account is located on Microsoft Azure?

Loading data from any of the following cloud storage services is supported regardless of the cloud platform that hosts your Snowflake account: - Amazon S3 - Google Cloud Storage - Microsoft Azure

Which parameter prevents streams on tables from becoming stale?

MAX_DATA_EXTENSION_TIME_IN_DAYS

What are the metadata columns for staged files?

METADATA$FILENAME Name of the staged data file the current row belongs to. Includes the path to the data file in the stage. METADATA$FILE_ROW_NUMBER Row number for each record in the staged data file. METADATA$FILE_CONTENT_KEY Checksum of the staged data file the current row belongs to. METADATA$FILE_LAST_MODIFIED Last modified timestamp of the staged data file the current row belongs to. Returned as TIMESTAMP_NTZ. METADATA$START_SCAN_TIME Start timestamp of operation for each record in the staged data file. Returned as TIMESTAMP_LTZ.

What service is provided as an integrated Snowflake feature to enhance Multi-Factor Authentication (MFA) support?

MFA support is provided as an integrated Snowflake feature, powered by the Duo Security service, which is managed completely by Snowflake.

What are Resource Monitor privileges

MODIFY Enables altering any properties of a resource monitor, such as changing the monthly credit quota. MONITOR Enables viewing a resource monitor. ALL [ PRIVILEGES ] Grants all privileges, except OWNERSHIP, on the resource monitor.

What are least privileges required to view and modify resource monitors?

MODIFY and MONITOR

What privilege is required to view the load monitoring chart

MONITOR

What is the maximum number of clusters in multi-cluster warehouse

Maximum number of clusters, greater than 1 (up to 10). Minimum number of clusters, equal to or less than the maximum (up to 10).

STATEMENT_TIMEOUT_IN_SECONDS

Maximum time in seconds a query is run before it times out. Can be set at Account, User, and Session level. It can be set for a particular warehouse. Default is two days.

What is the impact on queries that are being executed when a resource monitor set to the "Notify & Suspend" threshold level is exceeded?

Notify & Suspend Send a notification (to all account administrators with notifications enabled) and suspend all assigned warehouses after all statements being executed by the warehouse(s) have completed. Notify & Suspend Immediately Send a notification (to all account administrators with notifications enabled) and suspend all assigned warehouses immediately, which cancels any statements being executed by the warehouses at the time.

What actions are supported by Snowflake resource monitors?

Notify & Suspend Send a notification (to all account administrators with notifications enabled) and suspend all assigned warehouses after all statements being executed by the warehouse(s) have completed. Notify & Suspend Immediately Send a notification (to all account administrators with notifications enabled) and suspend all assigned warehouses immediately, which cancels any statements being executed by the warehouses at the time. Notify Perform no action, but send an alert notification (to all account administrators with notifications enabled).

STATEMENT_QUEUED_TIMEOUT_IN_SECONDS

Number of seconds a query is waiting in queue before it is cancelled. Default is zero (no timeout),

How is replication and failover services enabled?

ORGADMIN role calls SYSTEM$GLOBAL_ACCOUNT_SET_PARAMETER function

How many network policies can be assigned to an account or specific user at a time?

One

Which function should be used to insert JSON formatted string data into a VARIANT field?

PARSE_JSON: interprets an input string as a json document, producing a VARIANT value.

Which columns are part of the result set of the Snowflake LATERAL FLATTEN command?

PATH and INDEX

What authentication options are available for Snowflake

Password-based authentication. To use this, set the password option when establishing the connection. Single sign-on (SSO) through a web browser. Native SSO through Okta. Key pair authentication. OAuth.

Which pipes are cloned when cloning a database or schema

Pipes that reference external stages. Internal named stages are NEVER cloned, so pipes that reference internal stages are not cloned.

A Snowflake user wants to share unstructured data through the use of secure views. Which URL types can be used?

Pre-signed URL and Scoped URL

Which privilege must be granted to a share to allow secure views the ability to reference data in multiple databases?

REFERENCE_USAGE on databases

How can an administrator check for updates (for example, SCIM API requests) sent to Snowflake by the identity provider?

REST_EVENT_HISTORY

What will happen if a Snowflake user increases the size of a suspended virtual warehouse?

Resizing a suspended warehouse does not provision any new compute resources for the warehouse. It simply instructs Snowflake to provision the additional compute resources when the warehouse is next resumed, at which time all the usage and credit rules associated with starting a warehouse apply.

Which property helps us control the credits consumed by a multi-cluster warehouse?

SCALING_POLICY

How can a Snowflake user sample 10 rows from a table named SNOWPRO?

SELECT * FROM SNOWPRO SAMPLE (10 ROWS) SELECT * FROM SNOWPRO SAMPLE BERNOULLI (10)

What SQL statement returns empty sample

SELECT * FROM testtable SAMPLE ROW (0);

What is used to diagnose and troubleshoot network connections to Snowflake?

SNOWCD - Snowflake Connectivity Diagnostic Tool

Which encryption type will enable client-side encryption for a directory table?

SNOWFLAKE_FULL: Client-side and Server-side encryption. The files are encrypted by a client when it uploads them to the internal stage using PUT. All files are also automatically encrypted using AES-256 strong encryption on the server side.

Which function will provide the proxy information needed to protect Snowsight?

SYSTEM$ALLOWLIST

In a managed access schema, who can grant privileges on objects in the schema to other roles?

Schema owner role and the role with MANAGE GRANTS privilege

What types of URLs are used to access unstructured data

Scoped URL Encoded URL that permits temporary access to a staged file without granting privileges to the stage. The URL expires when the persisted query result period ends (i.e. the results cache expires), which is currently 24 hours. File URL URL that identifies the database, schema, stage, and file path to a set of files. A role that has sufficient privileges on the stage can access the files. Pre-signed URL Simple HTTPS URL used to access a file via a web browser. A file is temporarily accessible to users via this URL using a pre-signed access token. The expiration time for the access token is configurable.

Which snowflake objects will incur both storage and cloud compute charges?

Search Optimization service, Clustered table, Materialized view

What general guideline does Snowflake recommend when setting the auto-suspension time limit?

Set tasks for suspension after 5 minutes. If you enable auto-suspend, we recommend setting it to a low value (e.g. 5 or 10 minutes or less) because Snowflake utilizes per-second billing. This will help keep your warehouses from running (and consuming credits) when not in use.

In which use case does Snowflake apply egress charges?

Snowflake charges a per-byte fee for data egress when users transfer data from a Snowflake account into a different region on the same cloud platform or into a completely different cloud platform. Data transfers within the same region are free.

Which ACCOUNT_USAGE views are used to evaluate the details of dynamic data masking?

Snowflake provides two Account Usage views to obtain information about masking policies: 1. The MASKING POLICIES view provides a list of all masking policies in your Snowflake account. 2. The POLICY_REFERENCES view provides a list of all objects in which a masking policy is set.

A task is still being executed before the next scheduled task. What is going to happen with the new scheduled task?

Snowflake will skip it. Snowflake ensures that only one instance of a task with a schedule is executed at a given time. If a task is still running when the next scheduled execution time occurs, then that scheduled time is skipped.

What the types of Streams

Standard, Append-Only, Insert-only

What is VALIDATION_MODE = RETURN_ROWS

String (constant) that instructs the COPY command to return the results of the query in the SQL statement instead of unloading the results to the specified cloud storage location.

What is SCIM

System for Cross-domain Identity management. It's used for managing user identities across different systems, including provisioning and deprovisioning users in a centralized manner.

What is ALLOWED_VALUES property for tags

The ALLOWED_VALUES tag property enables specifying the possible string values that can be assigned to the tag when the tag is set on an object. The limit on string values is 300.

While using a COPY command with a Validation_mode parameter, which of the following statements will return an error?

The VALIDATION_MODE parameter does not support COPY statements that transform data during a load.

Which of the following are handled by the cloud services layer of the Snowflake architecture?

The cloud services layer is a collection of services that coordinate activities across Snowflake. These services tie together all of the different components of Snowflake in order to process user requests, from login to query dispatch. The cloud services layer also runs on compute instances provisioned by Snowflake from the cloud provider. Services managed in this layer include: - Authentication - Infrastructure management - Metadata management - Query parsing and optimization - Access control

After how many hours does Snowflake cancel our running SQL statement by default?

The default value for the command "STATEMENT_TIMEOUT_IN_SECONDS" is 172800 seconds, which is 48 hours. A query constantly running can be dangerous, so Snowflake kills it in 48 hours.

What is the maximum time that Snowflake can run a query

The default value for the command "STATEMENT_TIMEOUT_IN_SECONDS" is 172800 seconds, which is two days, but the maximum time we can configure is seven days.

Standard (default) scaling policy. Warehouse starts..

The first cluster starts when a query is queued, or there is one more query than the currently running clusters can execute. Successive clusters run 20 seconds after the prior one has started.

What privileges are required to create a task?

The role must have acess to the target schema and the CREATE TASK privilege on the schema itself

What is INFORMATION_SCHEMA

The schema contains the following objects: Views for all the objects contained in the database, as well as views for account-level objects (i.e. non-database objects such as roles, warehouses, and databases) Table functions for historical and usage data across your account.

What is SEARCH_PATH

The search path is stored in the session-level parameter SEARCH_PATH. he default value of the search path is $current, $public. $current Specifies the current schema. $public Specifies the public schema of the current database

Which statements reflect key functionalities of a Snowflake Data Exchange?

The sharing of data in a Data Exchange is bidirectional. An account can be a provider for some datasets and a consumer for others. A Data Exchange allows groups of accounts to share data privately among the accounts.

What are recommended steps to address poor SQL query performance due to data spilling?

The spilling can't always be avoided, especially for large batches of data, but it can be decreased by: Reviewing the query for query optimization especially if it is a new query Reducing the amount of data processed. For example, by trying to improve partition pruning, or projecting only the columns that are needed in the output. Decreasing the number of parallel queries running in the warehouse. Trying to split the processing into several steps (for example by replacing the CTEs with temporary tables). Using a larger warehouse - this effectively means more memory and more local disk space.

What is the effect of configuring a virtual warehouse auto-suspend value to '0'?

The warehouse never suspends.

The property MINS_TO_BYPASS_NETWORK_POLICY is set at which level?

User

Which of the following can be used when unloading data from Snowflake?

Using the SINGLE=TRUE parameter, a single file up to 5 GB in size can be exported to the storage layer. The OBJECT_CONSTRUCT function can be used to convert relational data to semi-structured data

What happens when the values for both an ALLOWED_IP_LIST and a BLOCKED_IP_LIST are used in a network policy?

When a network policy includes values in both the allowed and blocked IP address lists, Snowflake applies the blocked IP address list first.

Which of the following statements describe features of Snowflake data caching

When data cache is full, the least-recently used data will be cleared to make room. The RESULT_SCAN table function can access and filter the contents of the query result cache.

What is maximum warehouse size for serverless tasks?

XXLARGE

What is TABLE_STORAGE_METRICS view in Information schema

displays table-level storage utilization information, which is used to calculate the storage billing for each table in the account, including tables that have been dropped, but are still incurring storage costs. In addition to table metadata, the view displays the number of storage bytes billed for each table.

Which commands should be used to grant the privilege allowing a role to select data from all current tables and any tables that will be created later in a schema?

use role accountadmin; grant usage on database MY_DB to role TEST_ROLE; grant usage on schema MY_DB.MY_SCHEMA to role TEST_ROLE; grant select on future tables in schema MY_DB.MY_SCHEMA to role TEST_ROLE;


Kaugnay na mga set ng pag-aaral

Chptr 25 The Milky Way Galaxy - Starry Night

View Set

Live Virtual Machine Lab 3.2: Module 03 Install and Configure DHCP and DNS Servers

View Set