SnowPro Certification

Which of the following are options when creating a Virtual Warehouse? A. Auto-suspend B. Storage size C. Auto-resume D. Server Count

A & C

Which of the following are true about the variant data type in Snowflake? A. Optimized storage based on repeated elements B. Stored in a separate file format from structured data C. Can be queried using JSON path notation D. Requires a customer mapping for each record type

A & C

Which are the supported data types for a JSON name/value pair? ​ A number (integer or floating point) ​ A string (in double quotes) ​ A Boolean (true or false) ​ An array (in square brackets) ​ An object (in curly braces) ​ Null ​ complex datatype

A number (integer or floating point) ​ A string (in double quotes) ​ A Boolean (true or false) ​ An array (in square brackets) ​ An object (in curly braces) ​ Null

Currently in snowflake, replication is supported for which type of objects ​ A) Databases ​ B) Users ​ C) Roles ​ D) Warehouses

A) Databases

What is the technique Snowflake uses to limit the number of micro-partitions retrieved as part of a query? ​ A) Pruning ​ B) Selective Filter ​ C) Indexing ​ D) Clustering

A) Pruning

In Snowflake, what are the two major cost categories? ​ A) Storage ​ B) Compute ​ C) Cloud services ​ D) Network Traffic

A) Storage ​ B) Compute

While loading data through the COPY command, you can transform the data. Which of the below transformations are allowed? ​ A) TRUNCATECOLUMNS ​ B) REORDER COLUMNS ​ C) OMIT COLUMNS ​ D) CAST ​ E) AGGREGATE

A) TRUNCATECOLUMNS ​ B) REORDER COLUMNS ​ C) OMIT COLUMNS ​ D) CAST

Which database objects can be shared using snowflake's secure data sharing ​ A) Tables ​ B) External tables ​ C) Secure views ​ D) Secure materialized views ​ E) Secure UDFs ​ F) Roles

A) Tables ​ B) External tables ​ C) Secure views ​ D) Secure materialized views ​ E) Secure UDFs

Select the objects which do not have a fail safe period ​ A) Transient table ​ B) Temporary table ​ C) Permanent table

A) Transient table ​ B) Temporary table

This object in snowflake returns a set of rows instead of a single, scalar value, and can be accessed in the FROM clause of a query ​ A) UDTF ​ B) UDF ​ C) Stored procedure

A) UDTF

Imagine you have a permanent table named EMPLOYEE and you have dropped it. Then you created another table with the same name. If you execute the UNDROP command now, what will happen? ​ A) UNDROP command will fail ​ B) It will work similar to insert overwrite ​ C) The new table will be renamed with a suffix of v1.

A) UNDROP command will fail

Which of the below privileges are required to add clustering on a table ​ A) USAGE or Ownership on the schema and database ​ B) MONITOR USAGE on the table ​ C) ALTER privilege on the table

A) USAGE or Ownership on the schema and database

What are the valid data loading options in Snowflake? ​ A) Using SnowSQL and SQL ​ B) Using Snowpipe ​ C) Using a 3rd party ETL tool ​ D) Using the cloud provider's data upload tools

A) Using SnowSQL and SQL ​ B) Using Snowpipe ​ C) Using a 3rd party ETL tool

You want to query the history of task usage within a specified date range. What will you do? ​ A) You will query the TASK_HISTORY table function ​ B) You will run 'SHOW TASK HISTORY' ​ C) You will run 'DESCRIBE TASKS'

A) You will query the TASK_HISTORY table function
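For example, a minimal sketch of querying the table function (the 24-hour window and result limit are illustrative):
select *
from table(information_schema.task_history(
    scheduled_time_range_start => dateadd('hour', -24, current_timestamp()),
    result_limit => 100));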

You have several CSV files loaded into your named Snowflake internal stage. You want to load files from the stage into a table using pattern matching, to load only uncompressed CSV files whose names include the string sales. Which command will you use? ​ A) copy into mytable from @my_int_stage file_format = (format_name = myformat) pattern='.*sales.*[.]csv'; ​ B) copy into mytable from @my_int_stage regex='.*sales.*[.]csv'; ​ C) copy into mytable from @my_int_stage match_pattern='.*sales.*[.]csv';

A) copy into mytable from @my_int_stage file_format = (format_name = myformat) pattern='.*sales.*[.]csv';

What is the default option for ON_ERROR clause of COPY command for bulk loading? ​ CONTINUE ​ SKIP_FILE ​ SKIP_FILE_10 ​ ABORT_STATEMENT

ABORT_STATEMENT

Which role in Snowflake allows a user to administer users and manage all database objects? ​ SYSADMIN ​ SECURITYADMIN ​ ACCOUNTADMIN ​ ROOT

ACCOUNTADMIN

Select the most accurate statement regarding Snowflake's transaction support ​ ACID Compliant ​ Requires special user configuration ​ BASE compliant ​ Works like a NoSQL database

ACID Compliant

If you would like to disable the query cache, what will you do? ​ ALTER SESSION SET USE_CACHED_RESULT = FALSE; ​ ALTER SESSION SET USE_CACHED_RESULT = OFF; ​ ALTER SESSION SET USE_QUERY_CACHE = TRUE;

ALTER SESSION SET USE_CACHED_RESULT = FALSE;

Choose the answer which is true about key rotation in Snowflake ​ Account and table master keys are automatically rotated by Snowflake when they are more than 30 days old ​ Only the account master key is rotated by Snowflake when it is required ​ Only the table master key is rotated when it is required ​ Key rotation is an expensive operation hence it is done manually when required

Account and table master keys are automatically rotated by Snowflake when they are more than 30 days old

When you sign up for a Snowflake trial account, which three Snowflake editions are offered? ​ A) Free-Tier ​ B) Standard ​ C) Enterprise Data Security ​ D) Enterprise ​ E) Business critical

B) Standard ​ D) Enterprise ​ E) Business critical

A task can execute multiple SQL statements. ​ TRUE ​ FALSE

FALSE

You need to contact Snowflake Support team if retrieval of data is required from fail safe (TRUE OR FALSE)

TRUE

What are the various stages available in Snowflake ​ A) USER ​ B) TABLE ​ C) NAMED INTERNAL ​ D) NAMED EXTERNAL ​ E) ACCOUNT STAGE ​ F) SCHEMA STAGE

​ A) USER ​ B) TABLE ​ C) NAMED INTERNAL ​ D) NAMED EXTERNAL Note: Snowflake supports three types of stages: 1. USER 2. TABLE 3. NAMED. Named stages are of two types: internal and external. USER and TABLE stages are automatically available in Snowflake and do not need to be created or configured. https://docs.snowflake.com/en/user-guide/data-load-local-file-system-create-stage.html

After how many days does the load metadata of a table expire? ​ A) 64 ​ B) 14 ​ C) 7 ​ D) 365

A) 64

Which of the below are binary formats? ​ AVRO ​ JSON ​ ORC ​ PARQUET

AVRO ​ ORC ​ PARQUET

Tri-Secret Secure option is available in which snowflake edition ​ Business critical or higher ​ Enterprise Edition ​ All editions

Business critical or higher

Select the term that is associated with the compute layer ​ A) Query optimization ​ B) Query planning ​ C) Query processing

C) Query processing

Load performance in snowflake is fastest for which file format ​ CSV ​ ORC ​ AVRO ​ PARQUET

CSV

True or False: Snowflake charges a premium for storing semi-structured data.

False

True or False: Snowflake only works with cloud based tools.

False

True or False: The query profiler view is only available for completed queries.

False

Which parameter controls the maximum number of days for which Snowflake can extend the data retention period for tables to prevent streams on the tables from becoming stale. ​ MAX_DATA_EXTENSION_TIME_IN_DAYS ​ MAX_DATA_RETENTION_TIME_IN_DAYS ​ MAX_DATA_EXTENSION_TIME_IN_MONTHS

MAX_DATA_EXTENSION_TIME_IN_DAYS

Which constraints are enforced in Snowflake? ​ Referential integrity constraints ​ NOT NULL constraint ​ UNIQUE Constraint

NOT NULL constraint

Which of the two statements are true about the variant data type in Snowflake? ​ Optimized storage based on repeated elements ​ Stored in a separate file format from structured data ​ Can be queried using JSON path notation ​ Requires a custom mapping for each record type

Optimized storage based on repeated elements ​ Can be queried using JSON path notation

Which are the two variant columns available in a Snowflake table loaded by the Kafka connector? ​ RECORD_CONTENT ​ RECORD_METADATA ​ RECORD_KEY

RECORD_CONTENT ​ RECORD_METADATA

When adding Consumers to Outbound Shares, what types of Consumer Accounts can be selected? ​ Reader ​ GCP ​ Azure ​ Full ​ AWS

Reader ​ Full Note: see https://docs.snowflake.com/en/user-guide/data-sharing-intro.html for details

Which are the key concepts that will need to be considered while loading data into snowflake ​ STAGE OBJECT ​ FILE FORMAT ​ TRANSFORMATION AND ERROR VALIDATION ​ FILE SIZE

STAGE OBJECT ​ FILE FORMAT ​ TRANSFORMATION AND ERROR VALIDATION

You want to load a JSON file but would like to remove the outer array structure and load the records into separate table rows. Which file format option would you use? ​ STRIP_OUTER_ARRAY ​ TRUNCATE_OUTER_ARRAY ​ UNSET_OUTER_ARRAY ​ DROP_OUTER_ARRAY

STRIP_OUTER_ARRAY
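A minimal sketch of the option in use (the table and stage names are hypothetical):
copy into mytable
from @my_json_stage
file_format = (type = 'JSON' strip_outer_array = true);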

What is the name of Snowflake's command line interface tool? ​ A) SnowCLI ​ B) SnowSQL ​ C) SnowCMD ​ D) SnowSpark

B) SnowSQL

Which of the following are Snowflake view types? ​ Standard ​ External ​ Transient ​ Materialized​ Permanent ​ Secure

Standard ​ Materialized ​ Secure https://docs.snowflake.com/en/user-guide/views-introduction.html

Insert-only streams are supported on external tables only and it is currently a preview feature (TRUE OR FALSE)

TRUE

JSON data is a hierarchical collection of name/value pairs grouped into objects and arrays ​ TRUE ​ FALSE

TRUE

Looker is a snowflake business intelligence partner (TRUE OR​ FALSE)

TRUE

True or False: Snowflake caches are automatically invalidated if the underlying data changes.

True

You have two virtual warehouses in your Snowflake account. You have updated the data in the storage layer using one of the warehouses. When will the other warehouse be able to see the data? ​ A) Immediately ​ B) After 30 minutes, once Snowflake completes data synchronization ​ C) You will need to trigger the data synchronization process for the other warehouse to see the data

A) Immediately

Let's say that you have scheduled a task. At time T0, the task has started executing. If it is still executing when the next scheduled execution time occurs, what happens to that task? ​ A) It will be skipped ​ B) It will wait for the previous task to complete ​ C) It will be aborted

A) It will be skipped

Which cloud providers are supported by snowflake ​ A) Google Cloud Platform ​ B) Azure ​ C) AWS

A) Google Cloud Platform ​ B) Azure ​ C) AWS

Snowflake is compliant with which certifications? ​ A) HIPAA, PCI DSS and FedRAMP ​ B) HIPAA, PCI DSS ​ C) Only HIPAA ​ D) Only FedRAMP

A) HIPAA, PCI DSS and FedRAMP

When you use this parameter with SHOW STREAMS, the output also includes an additional dropped_on column, which displays: 1. the date and timestamp (for dropped streams), or 2. NULL (for active streams). ​ A) HISTORY ​ B) TERSE ​ C) DROPPED=TRUE

A) HISTORY

When unloading data into multiple files, you will use this copy option to specify the maximum size of each file created ​ A) MAX_FILE_SIZE ​ B) MAX_SIZE ​ C) MAX_FILE_BYTES

A) MAX_FILE_SIZE

Select the Snowflake edition that allows only a maximum of 1 day of time travel ​ A) Standard ​ B) Enterprise ​ C) VPS ​ D) Business Critical

A) Standard

What should be your minimum Snowflake edition for data sharing capability? ​ A) Standard ​ B) Enterprise ​ C) Business critical

A) Standard

Load history for bulk data load is stored in the metadata of the target table for 64 days (TRUE OR FALSE)

TRUE Note: Bulk data load history is stored in the metadata of the target table for 64 days and is available upon completion of the COPY statement as the statement output. Snowpipe load history is stored in the metadata of the pipe for 14 days and must be requested from Snowflake via a REST endpoint, SQL table function, or ACCOUNT_USAGE view. https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro.html#load-history

A client has ODBC or JDBC available in their system but does not have the Snowflake drivers. Will the client still be able to connect to Snowflake? ​ TRUE ​ FALSE

FALSE

In a multi-cluster warehouse, you will need to manually scale the warehouse between the minimum and maximum number of clusters (TRUE OR FALSE)

FALSE

JSON and PARQUET files can be loaded to columns in the same table ​ TRUE ​ FALSE

FALSE

Loading data using snowpipe REST API is supported for external stage only ​ TRUE ​ FALSE

FALSE

Once you have selected a warehouse size, you cannot resize the warehouse (TRUE OR FALSE)

FALSE

Reclustering in Snowflake requires manual configuration (TRUE OR FALSE)

FALSE

Scaling down a virtual warehouse (e.g. from a large warehouse to a small one) is an automated process. ​ TRUE ​ FALSE

FALSE

Snowpipe can load a file with the same name if it has been modified later. (TRUE OR FALSE)

FALSE

Snowflake allows only loading of structured data (TRUE OR FALSE)

FALSE

Snowpipe guarantees that files are loaded in the same order they are staged (TRUE OR FALSE)

FALSE

Temporary tables cannot be created with a clustering key defined ​ TRUE ​ FALSE

FALSE

The insertReport Snowpipe API can retrieve and report file ingestion events. The events are retained indefinitely until deleted manually. (TRUE OR FALSE)

FALSE

When migrating from Teradata to snowflake, you will also need to migrate the indexes. ​ TRUE ​ FALSE

FALSE

When you are loading data through the COPY command, the table and the file being loaded must have the same order of columns ​ FALSE ​ TRUE

FALSE

When you load a table from a file through COPY command, the file should have the same number and order of columns as the table (TRUE OR ​FALSE)

FALSE

When you load data using Snowpipe, loads are always performed in a single transaction ​ FALSE ​ TRUE

FALSE

You can load data using the PUT command through worksheets in the Snowflake web UI (TRUE OR FALSE)

FALSE

You can map Snowflake to any S3 bucket and query the data directly, as long as the data is in Parquet or ORC format ​ TRUE ​ FALSE

FALSE

You can use the query profiler view only for completed queries (TRUE OR FALSE)

FALSE

You have a materialized view and you want to track changes in the view. You can use streams to track that change (TRUE OR FALSE)

FALSE

You have specified PURGE=TRUE in your COPY command. Due to a network failure, the data files could not be purged after a successful load. Snowflake will return an error due to this failure. ​ (TRUE OR FALSE)

FALSE

The data stored in temporary table does not contribute to storage cost (TRUE OR FALSE)

FALSE Note: For the duration of the existence of a temporary table, the data stored in the table contributes to the overall storage charges that Snowflake bills your account. To prevent any unexpected storage charges, particularly if you create large temporary tables in sessions that you maintain for periods longer than 24 hours, Snowflake recommends explicitly dropping these tables once they are no longer needed. You can also explicitly exit the session in which the table was created to ensure no additional charges are accrued.

Micropartitions can be modified after creation of the table ​ TRUE ​ FALSE

FALSE Note: In Snowflake, data is physically stored in micro-partitions in a proprietary columnar format. As a user, you do not have direct access to micro-partitions; Snowflake manages them. Micro-partitions are also immutable, which means that once a micro-partition is created, it cannot be changed. When a row is updated, the micro-partition containing that row is copied to a new micro-partition, the updated row is inserted into that partition, and the old micro-partition is marked for deletion. As a user, however, you have control to create clustering on your table, which helps re-arrange the micro-partitions based on the cluster keys for faster query retrieval.

By executing the 'SHOW TABLES' command, we can list all the tables in all the schemas even if we do not have access to all the tables ​ TRUE ​ FALSE

FALSE Note: SHOW TABLES Lists the tables for which you have access privileges, including dropped tables that are still within the Time Travel retention period and, therefore, can be undropped. The command can be used to list tables for the current/specified database or schema, or across your entire account.

Time travel can be disabled at the account level ​ TRUE ​ FALSE

FALSE Note: Time Travel cannot be disabled for an account; however, it can be disabled for individual databases, schemas, and tables by specifying DATA_RETENTION_TIME_IN_DAYS with a value of 0 for the object. https://docs.snowflake.com/en/user-guide/data-time-travel.html#enabling-and-disabling-time-travel
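A minimal sketch of disabling Time Travel for a single table (the table name is hypothetical):
alter table mydb.public.mytable set data_retention_time_in_days = 0;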

Virtual warehouses cannot be resized while they are running ​ TRUE ​ FALSE

FALSE Note: Warehouses can be started and stopped at any time. They can also be resized at any time, even while running, to accommodate the need for more or less compute resources, based on the type of operations being performed by the warehouse. https://docs.snowflake.com/en/user-guide/warehouses-overview.html

Providers can list data on the Marketplace without contacting Snowflake ​ TRUE ​ FALSE

FALSE Note: see https://docs.snowflake.com/en/user-guide/data-marketplace-intro.html. If you would like to request new data to be added to the Snowflake Data Marketplace, you can click the "Is there data you'd like to see on the Marketplace?" link in the product UI or use the web registration form to make your request. The Snowflake team needs to be consulted to get data listed on the Marketplace.

Information Schema views require a warehouse to execute, whereas the SHOW command does not require one. (TRUE OR FALSE)

TRUE

Multi-factor Authentication can be used for connecting to Snowflake via the Snowflake JDBC driver ​ TRUE ​ FALSE

TRUE

Non-native values such as dates and timestamps are stored as strings when loaded into a VARIANT column ​ TRUE ​ FALSE

TRUE

Only named stages (internal or external) and user stages are supported for COPY transformations (TRUE OR FALSE)

TRUE

Organizing input data by granular path can improve load performance ​ TRUE ​ FALSE

TRUE

Query load is calculated by dividing the execution time (in seconds) of all queries in an interval by the total time (in seconds) for the interval. ​ TRUE ​ FALSE

TRUE

SNOWPIPE REST APIs support both internal and external stage (TRUE OR FALSE)

TRUE

Snowflake automatically stores data in encrypted form in all editions (TRUE OR FALSE)

TRUE

Snowflake can carry out transformations after loading files staged by partner software (ELT) (TRUE OR​ FALSE)

TRUE

Snowflake has scalar and tabular user-defined functions (UDFs), with support for both SQL and JavaScript. (TRUE OR FALSE)

TRUE

Snowflake is a SaaS offering ​ TRUE ​ FALSE

TRUE

Snowflake recommends compressing your data files when you are loading large data sets. (TRUE OR FALSE)

TRUE

Snowflake supports multi-factor authentication (MFA) to provide increased login security for users connecting to Snowflake. At this time, Snowflake users are not automatically enrolled in MFA. (TRUE OR ​FALSE)

TRUE

Snowflake supports specifying a SELECT statement instead of a table in the COPY INTO <location> command (TRUE OR FALSE)

TRUE

Snowpipe continuously loads micro-batches of data from an external stage location (Amazon S3, Google Cloud Storage, or Microsoft Azure) into a staging table. ​ TRUE ​ FALSE

TRUE

Snowpipe is a serverless function (TRUE OR FALSE)

TRUE

Streams cannot track changes in materialized views ​ TRUE ​ FALSE

TRUE

Streams ensure exactly once semantics for new or changed data in a table (TRUE OR FALSE)

TRUE

The Kafka connector continuously loads records from one or more Apache Kafka topics into an internal (Snowflake) stage and then into a staging table using Snowpipe. ​ TRUE ​ FALSE

TRUE

The Snowflake Web UI data loading wizard is only intended for loading small numbers of files of limited size (up to 50 MB). ​ (TRUE OR ​FALSE)

TRUE

The VALIDATION_MODE parameter does not support COPY statements that transform data during a load. (TRUE OR FALSE)

TRUE

The amount of storage used by Fail-safe can be viewed under the Account -> Billing & Usage section ​ TRUE ​ FALSE

TRUE

The interactions with data are initialized through the services layer ​ TRUE ​ FALSE

TRUE

The table functions in INFORMATION_SCHEMA can be used to return account-level usage and historical information for storage, warehouses, user logins, and queries (TRUE OR FALSE)

TRUE

To help avoid exceeding the rate limit (error code 429), snowflake recommends relying more heavily on insertReport than loadHistoryScan (TRUE OR FALSE)

TRUE

Two virtual warehouses can access the same data at the same time without any contention issues ​ TRUE ​ FALSE

TRUE

When a network policy includes values in both the allowed and blocked IP address lists, Snowflake applies the blocked IP address list first ​ TRUE ​ FALSE

TRUE

When cloning a table, if the COPY GRANTS keywords are not included in the CREATE <object> statement, then the new object does not inherit any explicit access privileges granted on the original table but does inherit any future grants defined for the object type in the schema ​ TRUE ​ FALSE

TRUE

When staging uncompressed files in a Snowflake stage, the files are automatically compressed using gzip, unless compression is explicitly disabled. (TRUE OR ​FALSE)

TRUE

When you clone a table, it does not include the load history of the source table, and hence the data files that were already loaded into the source table can be reloaded into the clone ​ TRUE ​ FALSE

TRUE

While sharing data, consumer account and provider account must be in the same region (TRUE​ OR FALSE)

TRUE

With respect to snowflake security, Tri-secret requires that customers manage their keys ​ TRUE ​ FALSE

TRUE

You have two types of named stages, one is external stage and the other one is internal stage. External stage will always require a cloud storage provider ​ TRUE ​ FALSE

TRUE

You need Snowflake instances in different regions. You will need a separate account for each region (TRUE OR FALSE)

TRUE

A warehouse can be assigned to a single resource monitor only (TRUE OR ​FALSE)

TRUE Note: A single monitor can be set at the account level to control credit usage for all warehouses in your account. In addition, a monitor can be assigned to one or more warehouses, thereby controlling the credit usage for each assigned warehouse. Note, however, that a warehouse can be assigned to only a single resource monitor.

How do you set a return value in a task? ​ create task set_return_value warehouse=return_task_wh schedule='1 minute' as call system$set_return_value('The quick brown fox jumps over the lazy dog'); ​ create task set_return_value warehouse=return_task_wh schedule='1 minute' as call system$set_return_code('The quick brown fox jumps over the lazy dog'); ​ create task set_return_value warehouse=return_task_wh schedule='1 minute' as call set_return_value('The quick brown fox jumps over the lazy dog');

create task set_return_value warehouse=return_task_wh schedule='1 minute' as call system$set_return_value('The quick brown fox jumps over the lazy dog');

To provide failover protection, how many availability zones does snowflake replicate to ​ one ​ two ​ three ​ as configured by user

two

You want to get the DDL statement of a snowflake table. What is the command that you will use? ​ select get_ddl('table', 'mydb.public.salesorders'); ​ show table 'mydb.public.salesorders'; ​ show table like 'mydb.public.salesorders';

select get_ddl('table', 'mydb.public.salesorders');

Which of the below statement will you use to recreate a specified object? ​ A) GET_DDL ​ B) GET_SCHEMA ​ C) SHOW SCHEMA

A) GET_DDL

Which two options mentioned below will you choose to concat a column named 'FRUIT_DESCRIPTION' between two % signs? ​ '%'||FRUIT_DESCRIPTION||'%' ​ CONCAT('%',CONCAT(FRUIT_DESCRIPTION,'%')) ​ %+FRUIT_DESCRIPTION+% ​ SUBSTR(%,FRUIT_DESCRIPTION,%)

'%'||FRUIT_DESCRIPTION||'%' ​ CONCAT('%',CONCAT(FRUIT_DESCRIPTION,'%'))

Which command is used to download data from Snowflake stage ​ PUT ​ GET ​ COPY ​ DUMP

GET

An empty table with no micro-partition will have a clustering depth of ​ 1 ​ 0 ​ -1 ​ 99

0 Note: The clustering depth for a populated table measures the average depth (1 or greater) of the overlapping micro-partitions for specified columns in a table. The smaller the average depth, the better clustered the table is with regards to the specified columns. Clustering depth can be used for a variety of purposes, including: 1. Monitoring the clustering "health" of a large table, particularly over time as DML is performed on the table. 2. Determining whether a large table would benefit from explicitly defining a clustering key. A table with no micro-partitions (i.e. an unpopulated/empty table) has a clustering depth of 0.

How much does Snowpipe charge for files in the queue? ​ 0.06 credits per 100 files queued ​ No charges for files in queue ​ 0.06 credits per 10000 files queued ​ 0.06 credits per 1000 files queued

0.06 credits per 1000 files queued https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare.html#:~:text=Continuous%20Data%20Loads%20(i.e.%20Snowpipe)%20and%20File%20Sizing,-Snowpipe%20is%20designed&text=This%20overhead%20increases%20in%20relation,credits%20per%201000%20files%20queued.

What is the recommended maximum data size limitations for Parquet files? ​ 100 MB ​ 10 MB ​ 1 GB ​ 3 GB

1 GB Note: Currently, data loads of large Parquet files (e.g. greater than 3 GB) could time out. Split large files into files 1 GB in size (or smaller) for loading.

What is the maximum number of child tasks that a task can have? ​ 50 ​ 100 ​ 1000 ​ 10

100

What is the recommended file size, to optimize the number of parallel operation for a load ​ 100-250 MB (Compressed) ​ 1 MB to 10 MB (Compressed) ​ None of the above ​ 50 MB to 150 MB (Compressed)

100-250 MB (Compressed) The number of load operations that run in parallel cannot exceed the number of data files to be loaded. To optimize the number of parallel operations for a load, we recommend aiming to produce data files roughly 100-250 MB (or larger) in size compressed. Refer Link - https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare.html

What is the recommended file size to optimize the number of parallel operations for a load? ​ 100-250 MB(or larger) in size compressed ​ 10 MB to 100 MB in size un compressed. ​ less than 1 MB compressed

100-250 MB(or larger) in size compressed Note: https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare.html#file-sizing-best-practices-and-limitations

What is the size limit on the variant data type in snowflake ​ 16 MB ​ 64 MB ​ 32 MB

16 MB

What is the size limit imposed by the VARIANT data type? ​ 100 MB compressed ​ 16 MB compressed ​ 16 MB uncompressed ​ 100 MB uncompressed

16 MB compressed

You are using a 2XL warehouse which has 32 nodes and 8 cores in each node. When you submit a query to this warehouse, snowflake will try to split the query into how many partitions? (Provided that the query is eligible for splitting) ​ 256 ​ 32 ​ 8

256 Note: partitions = cores x nodes; here, 8 cores x 32 nodes = 256.

If a transaction is running on a session and the session disconnects abruptly, the transaction will be in a detached state. It will not allow the transaction to commit or roll back. The locks on the resources will also not be released. If you do not do anything and let snowflake abort the transaction, how long will you need to wait? ​ 4 hours ​ 15 minutes ​ 1 hour

4 hours Note: If a transaction is running in a session and the session disconnects abruptly, preventing the transaction from committing or rolling back, the transaction is left in a detached state, including any locks that the transaction is holding on resources. If this happens, you might need to abort the transaction. To abort a running transaction, the user who started the transaction or an account administrator can call the system function, SYSTEM$ABORT_TRANSACTION. If the transaction is left open, Snowflake typically rolls back the transaction after it has been idle for four hours. https://docs.snowflake.com/en/sql-reference/transactions.html#aborting-transactions
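A minimal sketch of aborting it yourself instead of waiting, assuming you found the transaction ID via SHOW TRANSACTIONS (the ID below is a placeholder):
select system$abort_transaction(1530042321085);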

After how many days does the load metadata expire ​ 30 ​ 60 ​ 64

64

True or False: A table in Snowflake can contain both a variant column and a timestamp

True

True or False: The warehouse cache may be reset if a running warehouse is suspended and then resumed.

True

Which of the following DML commands isn't supported by Snowflake? A. UPSERT B. MERGE C. UPDATE D. TRUNCATE TABLE

A. UPSERT

Please look at the below two queries A. SELECT * FROM TABLE1 WHERE YEAR = 2015 AND UPPER(COL1) = 'COLVAL'; B. SELECT * FROM TABLE1 WHERE YEAR = 2015 AND COL1 ='COLVAL'; Which one of these will scan more partitions? Note: COL1 stores values both in upper and lower case ​ A ​ B ​ Both will scan the same number of partitions

A Note: This probably will not come up in the certification exam, but it is important to know when you are working on a project. Snowflake stores metadata for each column in a metadata services layer, including the range of values of the column in each micro-partition. In this case, the range of values stored in the metadata layer for COL1 is stored in its raw form, which is a mix of upper and lower case. Because of this, when you apply the UPPER function, the metadata services layer cannot use the stored range information to prune micro-partitions, so more partitions are scanned.

Which of the following are true of Multi-Cluster warehouses? A. Adds clusters automatically based on query activity B. Sizes each cluster optimally based on the queries C. Scales down when query activity slows down D. Multi-cluster warehouses will never auto-suspend

A & C

Which of the following statements apply to Snowflake in terms of security? A. Snowflake leverages a role-based access control model B. Snowflake requires you to configure an IAM user in order to connect to the database C. All data in Snowflake is encrypted D. Snowflake can run within a customer's VPC

A & C

Select all of the answers that describe SF micro-partitions. Micro-partitions A. Are the physical data files that comprise SF's logical tables B. Are written to a customer configured cloud storage location C. Will be updated as their row & column values are updated D. Enable horizontal and vertical query pruning

A & D

Which of the following are TRUE about SF data sharing? A. Consumers can query shared tables in the same query as their own tables B. Producers can share data to other databases such as Postgres and MySQL C. You don't need a SF virtual WH to query shared data D. Data sharing is integrated with role-based access control

A & D

What is the recommended approach for making a variant column accessible in a BI tool ​ A pre-defined mapping ​ A view ​ Leveraging a json parser ​ BI tool cannot access json

A view

You have a warehouse. It ran for 62 seconds and then was suspended. After that it resumed and ran for 20 seconds. For how many seconds will you be billed ​ A) 122 ​ B) 20 ​ C) 92

A) 122 Note: The first run lasted 62 seconds, which exceeds the 60-second minimum, so it is billed for 62 seconds. The resume is billed at the 60-second minimum even though the warehouse ran for only 20 seconds: 62 + 60 = 122. In general, warehouses are only billed for credit usage when they are running; a suspended warehouse does not accrue any credit usage. Credits are billed per second, with a 60-second (i.e. 1-minute) minimum: each time a warehouse is started or resized to a larger size, it is billed for 1 minute's worth of usage, and after 1 minute all subsequent billing is per second. Stopping and restarting a warehouse within the first minute does not change the amount billed; the minimum billing charge is 1 minute. When a warehouse is increased in size, credits are billed only for the additional servers that are provisioned; for example, changing from Small (2) to Medium (4) results in billing charges for 1 minute's worth of 2 credits. https://docs.snowflake.com/en/user-guide/credits.html#virtual-warehouse-credit-usage

Which character below identifies a table stage ​ A) @% ​ B) @ ​ C) %

A) @%

Which character combination identifies a user stage ​ A) @~ ​ B) ~ ​ C) @

A) @~
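To illustrate both prefixes, a minimal sketch (the table name is hypothetical):
list @%mytable;  -- table stage for table MYTABLE
list @~;         -- current user's stage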

Please select the ones that are true for a Snowflake task ​ A) A task can execute a single SQL statement ​ B) A task can execute a call to a stored procedure ​ C) A task can execute multiple SQL statements ​ D) A task can execute a function

A) A task can execute a single SQL statement ​ B) A task can execute a call to a stored procedure

What can you expect if the filters specified in an INFORMATION_SCHEMA query are not sufficiently selective? ​ A) AN ERROR ​ B) WILL SHOW RESULTS WITH A WARNING ​ C) WILL SWITCH TO A LARGE WAREHOUSE

A) AN ERROR

If you logon to snowflake using https://mycompany.snowflakecomputing.com/, then what is the cloud provider and region you are logging into? ​ A) AWS and us-west-2 ​ B) Azure and us-east-1 ​ C) AWS and us-east-2

A) AWS and us-west-2 Note: If an account has just the account name followed by snowflakecomputing.com, it means that the cloud provider is AWS and the region is us-west-2. This is the first cloud provider and region where snowflake was deployed

Which role or privileges are required to view TASK_HISTORY? ​ A) Account administrator (i.e. users with the ACCOUNTADMIN role) ​ B) Task owner (i.e. role that has the OWNERSHIP privilege on a task) ​ C) Any role that has the global MONITOR EXECUTION privilege ​ D) Any one with SYSADMIN role

A) Account administrator (i.e. users with the ACCOUNTADMIN role) ​ B) Task owner (i.e. role that has the OWNERSHIP privilege on a task) ​ C) Any role that has the global MONITOR EXECUTION privilege

Which roles can use SQL to view the task history within a specified date range? ​ A) Account administrator (i.e. users with the ACCOUNTADMIN role). ​ B) Task owner (i.e. role that has the OWNERSHIP privilege on a task). ​ C) Any role that has the global MONITOR EXECUTION privilege. ​ D) SECURITYADMIN

A) Account administrator (i.e. users with the ACCOUNTADMIN role). ​ B) Task owner (i.e. role that has the OWNERSHIP privilege on a task). ​ C) Any role that has the global MONITOR EXECUTION privilege.

What is the storage hierarchy in snowflake? ​ A) Account->Database->Schemas->Objects ​ B) Organization->folder->project->resource ​ C) Database->Schemas->Objects

A) Account->Database->Schemas->Objects

Select all options that are true for ORDER BY in snowflake ​ A) All data is sorted according to the numeric byte value of each character in the ASCII table. UTF-8 encoding is supported ​ B) For numeric values, leading zeros before the decimal point and trailing zeros (0) after the decimal point have no effect on sort order. ​ C) Unless specified otherwise, NULL values are considered to be higher than any non-NULL values. As a result, the ordering for NULLS depends on the sort order: 1. If the sort order is ASC, NULLS are returned last; to force NULLS to be first, use NULLS FIRST. 2. If the sort order is DESC, NULLS are returned first; to force NULLS to be last, use NULLS LAST.

A) All data is sorted according to the numeric byte value of each character in the ASCII table. UTF-8 encoding is supported ​ B) For numeric values, leading zeros before the decimal point and trailing zeros (0) after the decimal point have no effect on sort order. ​ C) Unless specified otherwise, NULL values are considered to be higher than any non-NULL values. As a result, the ordering for NULLS depends on the sort order: 1. If the sort order is ASC, NULLS are returned last; to force NULLS to be first, use NULLS FIRST. 2. If the sort order is DESC, NULLS are returned first; to force NULLS to be last, use NULLS LAST.
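A minimal sketch of overriding the default NULL ordering (the table and column names are hypothetical):
select col1 from t1 order by col1 asc nulls first;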

You have a schema with a table and a stream. You cloned the schema. What will happen to the stream ​ A) Any unconsumed records in the stream (in the clone) are inaccessible B) ​Any unconsumed records in the stream (in the clone) will also be accessible in the clone ​ C) Any unconsumed records in the stream (in the clone) will automatically flow into the table

A) Any unconsumed records in the stream (in the clone) are inaccessible

To download files from the stage/location loaded through COPY INTO <LOCATION> command, you will use ​ A) GET ​ B) UNLOAD ​ C) COPY INTO

A) GET

How is the data storage cost computed for snowflake? ​ A) Based on amount of compressed data ​ B) Based on amount of un-compressed data ​ C) Based on daily average of data stored ​ D) Amount stored on first day of month ​ E) Amount stored on last day of month

A) Based on amount of compressed data C) Based on daily average of data stored Note: Usage for data storage is calculated on the daily average amount of data (in bytes) stored in the system for: 1. Files staged for bulk data loading/unloading (can be stored compressed or uncompressed). 2. Database tables, including historical data for Time Travel (always compressed by Snowflake). 3. Fail-safe for database tables (always compressed by Snowflake). 4. Clones of database tables that reference data deleted in the table that owns the clones. Please look at point #3: Snowflake charges storage for Fail-safe, and transient tables do not have Fail-safe. So for use cases where it makes sense, you can choose a transient table to save some cost. Note that storage costs are relatively small, so weigh what you are gaining by going with a transient table.

Which one of the below requires a user-specified warehouse to execute COPY statements ​ A) Bulk data load ​ B) Snowpipe ​ C) Both

A) Bulk data load

You can specify the below action to perform when an error is encountered while loading data from a file ​ A) CONTINUE ​ B) SKIP_FILE ​ C) SKIP_FILE_<num> ​ D) SKIP_FILE_<num>% ​ E) ABORT_STATEMENT ​ F) ERROR_STATEMENT

A) CONTINUE ​ B) SKIP_FILE ​ C) SKIP_FILE_<num> ​ D) SKIP_FILE_<num>% ​ E) ABORT_STATEMENT
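A minimal sketch of one of these options in a COPY statement (names are hypothetical); SKIP_FILE_10 skips a file once at least 10 error rows are found in it:
copy into mytable
from @my_int_stage
on_error = 'SKIP_FILE_10';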

What will you use to query Snowflake data loading history for the last 365 days (1 year)? ​ A) COPY_HISTORY View ​ B) LOAD_HISTORY View from INFORMATION SCHEMA ​ C) INSERT_HISTORY View

A) COPY_HISTORY View Note: COPY_HISTORY View This Account Usage view can be used to query Snowflake data loading history for the last 365 days (1 year). The view displays load activity for both COPY INTO <table> statements and continuous data loading using Snowpipe. The view avoids the 10,000 row limitation of the LOAD_HISTORY View. https://docs.snowflake.com/en/sql-reference/account-usage/copy_history.html#copy-history-view LOAD_HISTORY View from Information schema This Information Schema view enables you to retrieve the history of data loaded into tables using the COPY INTO <table> command within the last 14 days. The view displays one row for each file loaded. https://docs.snowflake.com/en/sql-reference/info-schema/load_history.html#load-history-view Please also read about the LOAD_HISTORY view from ACCOUNT USAGE schema https://docs.snowflake.com/en/sql-reference/account-usage/load_history.html#load-history-view
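A minimal sketch against the Account Usage view named in the note (the 7-day filter is illustrative):
select *
from snowflake.account_usage.copy_history
where last_load_time > dateadd('day', -7, current_timestamp());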

Which of the below will you consider while choosing a cluster key ​ A) Columns that are most actively used in selective filters ​ B) Columns frequently used in join predicates ​ C) Columns with extremely high cardinality ​ D) Columns with extremely low cardinality

A) Columns that are most actively used in selective filters ​ B) Columns frequently used in join predicates

Which of the below statements are true about snowflake data sharing? ​ A) Consumers can query shared tables in the same query as their own tables ​ B) Producers can share data to other databases such as Postgres and MySQL ​ C) You don't need a Snowflake Virtual Warehouse to query shared data ​ D) Data Sharing is integrated with role-based access controls

A) Consumers can query shared tables in the same query as their own tables ​​ D) Data Sharing is integrated with role-based access controls Note: https://docs.snowflake.com/en/user-guide/data-sharing-intro.html#how-does-secure-data-sharing-work

Which role permissions are required to create and manage streams ? Please specify the OBJECT and the PERMISSION. ​ A) DATABASE - USAGE ​ B) SCHEMA - USAGE, CREATE STREAM ​ C) TABLE - SELECT ​ D) TABLE - DELETE

A) DATABASE - USAGE ​ B) SCHEMA - USAGE, CREATE STREAM ​ C) TABLE - SELECT Note: https://docs.snowflake.com/en/user-guide/streams.html#required-access-privileges

Which types of data incur Snowflake storage costs? ​ A) Data stored in permanent tables ​ B) Data retained to enable data recovery (fail-safe and time travel) ​ C) Cached results ​ D) Semi-structured data stored in cloud storage (like AWS S3, GCS)

A) Data stored in permanent tables B) Data retained to enable data recovery (fail-safe and time travel) Note: Storage cost is charged for the below types of data: 1. Files stored in Snowflake locations (i.e. user and table stages or internal named stages) for bulk data loading/unloading. The user who stages a file can choose whether or not to compress the file to reduce storage. 2. Data stored in database tables, including historical data maintained for Time Travel. Snowflake automatically compresses all data stored in tables and uses the compressed file size to calculate the total storage used for an account. 3. Historical data maintained for Fail-safe.

What is the best way to get the latest ODBC connector for use with Snowflake? ​ A) Download it from snowflake web UI ​ B) Search google and download any version of ODBC ​ C) Compile in .NET

A) Download it from snowflake web UI

Column level security in Snowflake allows the application of a masking policy to a column within a table or view. Which two features are related to column level security? ​ A) Dynamic data masking ​ B) External tokenization ​ C) Data loss prevention (DLP)

A) Dynamic data masking ​ B) External tokenization

Which of the below are true with respect to snowflake web ui? ​ A) Each worksheet can have different role, warehouse, schema and database ​ B) Each worksheet is a separate session ​ C) You will be able to run a 'PUT' command from worksheet ​ D) You must use the same role and warehouse for all worksheets

A) Each worksheet can have different role, warehouse, schema and database ​ B) Each worksheet is a separate session

By default, result reuse is enabled in snowflake. if you want to disable it what will you do? ​ A) Execute ALTER SESSION SET USE_CACHED_RESULT=FALSE ​ B) Execute ALTER SESSION SET QUERY_CACHE=FALSE ​ C) Execute ALTER SESSION SET QUERY_RESULT_CACHE =FALSE

A) Execute ALTER SESSION SET USE_CACHED_RESULT=FALSE

How will you know if a stream has become stale? ​ A) Execute DESCRIBE STREAM or SHOW STREAMS ​ B) Execute SHOW TABLE STREAMS ​ C) Execute SHOW PIPES

A) Execute DESCRIBE STREAM or SHOW STREAMS

The following transformations are not supported in a COPY command. Select three. ​ A) FLATTEN ​ B) JOIN ​ C) GROUP BY ​ D) NVL

A) FLATTEN ​ B) JOIN ​ C) GROUP BY

A stream stores data in the same shape as the source table with some additional columns. Which are those additional columns? ​ A) METADATA$ACTION ​ B) METADATA$ISUPDATE ​ C) METADATA$ROW_ID ​ D) METADATA$COLUMN_ID

A) METADATA$ACTION ​ B) METADATA$ISUPDATE ​ C) METADATA$ROW_ID
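A minimal sketch (names are hypothetical); selecting from a stream returns the source columns plus these metadata columns:
create stream mystream on table mytable;
select * from mystream;  -- includes METADATA$ACTION, METADATA$ISUPDATE, METADATA$ROW_ID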

Which are the two metadata columns for staged files ​ A) METADATA$FILENAME ​ B) METADATA$FILEFORMAT ​ C) METADATA$FILE_ROW_NUMBER

A) METADATA$FILENAME ​ C) METADATA$FILE_ROW_NUMBER
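A minimal sketch of selecting these columns directly from a stage (the stage and file format names are hypothetical):
select metadata$filename, metadata$file_row_number, t.$1, t.$2
from @my_int_stage (file_format => 'myformat') t;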

Which of the below services are provided by the cloud services layer? ​ A) Metadata management ​ B) Authentication ​ C) Infrastructure management ​ D) Query execution

A) Metadata management ​ B) Authentication ​ C) Infrastructure management

Zero-copy cloning in snowflake is powered by which service? ​ A) Metadata store of the service layer ​ B) SSD cache of Virtual warehouse ​ C) Query result cache

A) Metadata store of the service layer

Using COPY INTO <location> command, you can unload data from a table (or query) into the below locations. ​ A) Named internal stage (or table/user stage) ​ B) Named external stage ​ C) External location (Amazon S3, Google Cloud Storage, or Microsoft Azure) ​ D) Local drive

A) Named internal stage (or table/user stage) ​ B) Named external stage ​ C) External location (Amazon S3, Google Cloud Storage, or Microsoft Azure)

You will use this parameter to specify the number of threads to use for downloading the files using GET command ​ A) PARALLEL = <INTEGER> ​ B) DOWNLOAD_PARALLEL=<INTEGER> ​ C) DUMP=<INTEGER>

A) PARALLEL = <INTEGER>
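For example, a sketch run from SnowSQL, since GET cannot run from worksheets (the stage and local path are hypothetical):
get @my_int_stage/results/ file:///tmp/data/ parallel = 10;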

What table types are available in snowflake ​ A) PERMANENT ​ B) TEMPORARY ​ C) TRANSIENT ​ D) EXTERNAL E) INTERNAL

A) PERMANENT ​ B) TEMPORARY ​ C) TRANSIENT ​ D) EXTERNAL

COPY does not purge loaded files from the location. What option will you specify to purge files after successful load of a table from table stage ​ A) PURGE=TRUE ​ B) DELETE=TRUE ​ C) REMOVE=TRUE ​ D) TRUNCATE=TRUE

A) PURGE=TRUE

The following commands cannot be executed from worksheets ​ A) PUT ​ B) GET ​ C) SHOW ​ D) LIST <STAGE>

A) PUT ​ B) GET

What types of view are available in snowflake? ​ A) REGULAR ​ B) SECURE ​ C) MATERIALIZED ​ D) HASH VIEW ​ E) TABLE VIEW

A) REGULAR ​ B) SECURE ​ C) MATERIALIZED Note: REGULAR views are the default view type; the underlying DDL is accessible to any role that has access to the view. SECURE views are those where the query definition and details are accessible to authorized users only. Please note that since the underlying DDL is not visible to the optimizer, this view type may be slower than a normal view. MATERIALIZED views are more like tables: results are stored as they are in a table, and Snowflake can auto-refresh a materialized view. Please take extra care while using materialized views, as they may increase your cost.

In your organization, you have the Snowflake Enterprise edition. You notice that your queries are consistently getting queued on the warehouses, delaying your ETL process. What are the possible solution options? ​ A) Resize the warehouse ​ B) Use a multi-cluster warehouse ​ C) Set the auto-resize parameter of the warehouse ​ D) Contact Snowflake support to increase the number of servers for the warehouse

A) Resize the warehouse ​ B) Use multi-cluster warehouse

Which command will you run to list all users and roles to which a role has been granted. ​ A) SHOW GRANTS OF ROLE <role_name> ​ B) SHOW GRANTS TO ROLE <role_name> ​ C) SHOW GRANTS ON ACCOUNT

A) SHOW GRANTS OF ROLE <role_name>

This command can be used to list streams for the current/specified database or schema, or across your entire account. ​ A) SHOW STREAMS ​ B) DISPLAY STREAMS ​ C) LIST STREAMS

A) SHOW STREAMS

Which feature in snowflake allows you to load data in micro batches? ​ A) SNOWPIPE ​ B) COPY INTO COMMAND ​ C) LOAD COMMAND

A) SNOWPIPE

Which SNOWFLAKE object stores a generated identity and access management (IAM) entity for your external cloud storage, along with an optional set of allowed or blocked storage locations (Amazon S3, Google Cloud Storage, or Microsoft Azure)? ​ A) STORAGE INTEGRATION ​ B) STORAGE GROUP ​ C) STORAGE CLUSTERS

A) STORAGE INTEGRATION Note: A storage integration is a Snowflake object that stores a generated identity and access management (IAM) entity for your external cloud storage, along with an optional set of allowed or blocked storage locations (Amazon S3, Google Cloud Storage, or Microsoft Azure). Cloud provider administrators in your organization grant permissions on the storage locations to the generated entity. This option allows users to avoid supplying credentials when creating stages or when loading or unloading data

Which parameter will you use while creating a STORAGE integration to limit external stages that use the integration to reference one or more storage locations. ​ A) STORAGE_ALLOWED_LOCATIONS ​ B) STORAGE_ALLOWED_SITES ​ C) STORAGE_OBJECTS_ALLOWED_LOCATIONS

A) STORAGE_ALLOWED_LOCATIONS
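A minimal sketch for S3 (the integration name, role ARN, and bucket paths are all hypothetical):
create storage integration my_s3_int
  type = external_stage
  storage_provider = 'S3'
  enabled = true
  storage_aws_role_arn = 'arn:aws:iam::001234567890:role/my_snowflake_role'
  storage_allowed_locations = ('s3://mybucket/load/', 's3://mybucket/unload/');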

Which system functions are available in snowflake to view/monitor the clustering metadata for a table ​ A) SYSTEM$CLUSTERING_DEPTH ​ B) SYSTEM$CLUSTERING_INFORMATION (including clustering depth) ​ C) CLUSTERING_DEPTH_VIEW

A) SYSTEM$CLUSTERING_DEPTH ​ B) SYSTEM$CLUSTERING_INFORMATION (including clustering depth)
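A minimal sketch of both functions (the table and column names are hypothetical):
select system$clustering_information('mytable', '(col1, col2)');
select system$clustering_depth('mytable', '(col1)');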

The current offset for a stream can be determined by querying which function? ​ A) SYSTEM$STREAM_GET_TABLE_TIMESTAMP ​ B) SYSTEM$DISABLE_BEHAVIOR_CHANGE_BUNDLE ​ C) SYSTEM$TASK_DEPENDENTS_ENABLE ​ D) SYSTEM$USER_TASK_CANCEL_ONGOING_EXECUTIONS

A) SYSTEM$STREAM_GET_TABLE_TIMESTAMP

What are the deployment options available for Snowflake? ​ A) Snowflake hosted accounts on AWS ​ B) Hybrid on-premise and cloud ​ C) On-premise ​ D) Snowflake hosted accounts on Azure

A) Snowflake hosted accounts on AWS ​​ D) Snowflake hosted accounts on Azure

What are the usual data loading steps in snowflake? ​ A) Source -> Snowflake Stage -> Snowflake table ​ B) Source -> Cloud storage -> Snowflake stage -> Snowflake table ​ C) Source -> Snowflake temp table -> Snowflake transient table-> Snowflake permanent table

A) Source -> Snowflake Stage -> Snowflake table

What are the various types of table streams? ​ A) Standard ​ B) Append-only ​ C) Insert-only ​ D) Upsert

A) Standard ​ B) Append-only ​ C) Insert-only Note: Standard: A standard (i.e. delta) table stream tracks all DML changes to the source table, including inserts, updates, and deletes (including table truncates). This stream type performs a join on inserted and deleted rows in the change set to provide the row-level delta. As a net effect, for example, a row that is inserted and then deleted between two transactional points of time in a table is removed in the delta (i.e. is not returned when the stream is queried). Append-only: An append-only table stream tracks row inserts only. Update and delete operations (including table truncates) are not recorded. For example, if 10 rows are inserted into a table and then 5 of those rows are deleted before the offset for an append-only stream is advanced, the stream records 10 rows. An append-only stream returns the appended rows only and therefore can be much more performant than a standard stream for extract, load, transform (ELT) and similar scenarios that depend exclusively on row inserts. For example, the source table can be truncated immediately after the rows in an append-only stream are consumed, and the record deletions do not contribute to the overhead the next time the stream is queried or consumed. Insert-only: Supported on external tables only. An insert-only stream tracks row inserts only; it does not record delete operations that remove rows from an inserted set (i.e. no-ops). For example, in-between any two offsets, if File1 is removed from the cloud storage location referenced by the external table and File2 is added, the stream returns records for the rows in File2 only. Unlike when tracking CDC data for standard tables, Snowflake cannot access the historical records for files in cloud storage.

What are the Snowflake editions available as of today? ​ A) Standard Edition, Enterprise Edition, Business Critical Edition, Virtual Private Snowflake ​ B) Standard Edition, Enterprise Edition, ESD, Business Critical Edition, Virtual Private Snowflake ​ C) Standard Edition, Enterprise Edition

A) Standard Edition, Enterprise Edition, Business Critical Edition, Virtual Private Snowflake

You would like to execute a merge statement in snowflake every one hour. If you want to schedule this query using a snowflake object what will you use? ​ A) TASK ​ B) STREAM ​ C) SNOWPIPE

A) TASK
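A minimal sketch of such a task (the warehouse, table, and column names are hypothetical; a newly created task is suspended and must be resumed before it runs):
create task merge_hourly
  warehouse = etl_wh
  schedule = '60 minute'
as
  merge into dim_customer t
  using stg_customer s on t.id = s.id
  when matched then update set t.name = s.name
  when not matched then insert (id, name) values (s.id, s.name);
alter task merge_hourly resume;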

One of your team members accidentally truncated a table in your Snowflake database. What feature of Snowflake will you use to restore data into the table? ​ A) TIME TRAVEL ​ B) FAIL SAFE ​ C) UNDROP TABLE

A) TIME TRAVEL
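A minimal sketch, assuming the truncate happened within the last hour (the offset, in seconds, is illustrative):
insert into employee
select * from employee at (offset => -3600);  -- the table's contents one hour ago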

Which of the following object types are stored within schema ​ A) Tables ​ B) Views ​ C) User defined functions ​ D) File formats ​ E) Sequences ​ F) Roles

A) Tables ​ B) Views ​ C) User defined functions ​ D) File formats ​ E) Sequences

After you have granted a share to a consumer, you added more objects to the share. You have provided the required access by using the GRANT <privilege> ... TO SHARE command. Please select the true statement ​ A) The added objects will be accessible to the consumers immediately ​ B) You will need to recreate the share

A) The added objects will be accessible to the consumers immediately Note: objects granted to a share flow in to consumers immediately; you do not need to recreate the share

You are trying to set a variable using SET MY_VARIABLE = 'XXXX' and you got the error below: Assignment to 'MY_VARIABLE' not done because value exceeds size limit of variables. Its size is 312; the limit is 256 What is the reason for this error? ​ A) The size of string or binary variables is limited to 256 bytes ​ B) This is a temporary error due to insufficient memory in the VM instance ​ C) Your warehouse is not big enough to accommodate this

A) The size of string or binary variables is limited to 256 bytes

When the owner role of a given task (i.e. the role with the OWNERSHIP privilege on the task) is deleted, what happens to the task? ​ A) The task is "re-possessed" by the role that dropped the owner role ​ B) The task is deleted immediately ​ C) The task is suspended immediately

A) The task is "re-possessed" by the role that dropped the owner role

Which of the below mentioned clustering metadata for the micro-partitions is maintained by snowflake in a table? ​ A) The total number of micro-partitions that comprise the table ​ B) The number of micro-partitions containing values that overlap with each other ​ C) The depth of the overlapping micro-partitions ​ D) None of the above

A) The total number of micro-partitions that comprise the table ​ B) The number of micro-partitions containing values that overlap with each other ​ C) The depth of the overlapping micro-partitions

From a Snowflake perspective, what is common about Fivetran, Informatica, Matillion, Segment, Stitch and Talend? ​ A) They are all Snowflake data integration partners ​ B) They are all Snowflake competitors ​ C) They are all programming interface patterns

A) They are all Snowflake data integration partners

You have a multi-threaded client program that interacts with snowflake. If you are following snowflake recommendations what will you do ​ A) Use a separate connection for each thread ​ B) Execute the threads synchronously ​ C) Execute the threads asynchronously ​ D) Use one connection for all the threads

A) Use a separate connection for each thread ​ B) Execute the threads synchronously Note: Although multiple sessions cannot share the same transaction, multiple threads using a single connection share the same session, and thus share the same transaction. This can lead to unexpected results, such as one thread rolling back work done in another thread. This situation can occur when a client application using a Snowflake driver (such as the Snowflake JDBC Driver) or connector (such as the Snowflake Connector for Python) is multi-threaded. If two or more threads share the same connection, then those threads also share the current transaction in that connection. A BEGIN, COMMIT, or ROLLBACK by one thread affects all threads using that shared connection. If the threads are running asynchronously, the results can be unpredictable. Similarly, changing the AUTOCOMMIT setting in one thread affects the AUTOCOMMIT setting in all other threads that use the same connection. Note that changing the AUTOCOMMIT setting commits an existing transaction. Snowflake recommends that multi-threaded client programs do at least one of the following: Use a separate connection for each thread. Note that even with separate connections, your code can still hit race conditions that generate unpredictable output; for example, one thread might delete data before another thread tries to update it. Execute the threads synchronously rather than asynchronously, to control the order in which steps are performed. https://docs.snowflake.com/en/sql-reference/transactions.html#transactions-and-multi-threading

Which of the below semi structured data types are supported by snowflake? ​ A) VARIANT ​ B) OBJECT ​ C) ARRAY ​ D) GEOGRAPHY

A) VARIANT ​ B) OBJECT ​ C) ARRAY Note: Snowflake supports three types of semi structured data types VARIANT - A tagged universal type, which can store values of any other type, including OBJECT and ARRAY, up to a maximum size of 16 MB compressed. OBJECT - Used to represent collections of key-value pairs, where the key is a non-empty string, and the value is a value of VARIANT type. Snowflake does not currently support explicitly-typed objects. ARRAY - Used to represent dense or sparse arrays of arbitrary size, where index is a non-negative integer (up to 2^31-1), and values have VARIANT type. Snowflake does not currently support fixed-size arrays or arrays of elements of a specific non-VARIANT type.

A query is running slow and you want to check whether this is due to an overloaded warehouse. Where in the Snowflake UI will you go to check this? ​ A) Warehouse Tab ​ B) Database Tab ​ C) Query Tab

A) Warehouse Tab

What are the different caching mechanisms that are available in snowflake? ​ A) Warehouse caching ​ B) Index Caching ​ C) Metadata caching ​ D) Query result caching

A) Warehouse caching ​ C) Metadata caching ​ D) Query result caching Note: https://blog.ippon.tech/innovative-snowflake-features-caching/#:~:text=Metadata%20Cache,-Fully%20Managed%20in&text=Snowflake%20automatically%20collects%20and%20manages,Table%20Size%20in%20Bytes

You ran a query in snowflake and it took 2 minute 3 seconds to run, you ran the query again and it returned the results in less than a second. What might have happened? ​ A) When the query ran for the second time, it used the persisted query results from the query result cache ​ B) The query used the meta data cache ​ C) All the partitions were scanned from the SSD cache

A) When the query ran for the second time, it used the persisted query results from the query result cache

Snowpipe is recommended to load data in which of the below scenario? ​ A) You have small volume of frequent data ​ B) You have a huge volume of data generated as part of a batch schedule

A) You have small volume of frequent data

Which of the below are SNOWPIPE REST APIs. Select three. ​ A) insertFiles ​ B) insertReport ​ C) loadHistoryScan ​ D) loadHistoryProgress

A) insertFiles ​ B) insertReport ​ C) loadHistoryScan

When calling the REST endpoints in Snowpipe, which of the below authentication methods are supported? ​ A) key pair authentication with JSON Web Token ​ B) user id and password ​ C) Both

A) key pair authentication with JSON Web Token

Which command will you run to view a list of files that have been unloaded to a stage named mystage. ​ A) list @mystage; ​ B) show @mystage; ​ C) display @mystage;

A) list @mystage;

Which permissions are required by Snowflake in AWS S3 to unload data to S3? ​ A) s3:DeleteObject ​ B) s3:PutObject ​ C) s3:ListObject

A) s3:DeleteObject ​ B) s3:PutObject

You have stored the below JSON in a table named car_sales as a variant column named src { "customer": [ { "address": "San Francisco, CA", "name": "Joyce Ridgely", "phone": "16504378889" } ], "date": "2017-04-28", "dealership": "Valley View Auto Sales", "salesperson": { "id": "55", "name": "Frank Beasley" }, "vehicle": [ { "extras": [ "ext warranty", "paint protection" ], "make": "Honda", "model": "Civic", "price": "20275", "year": "2017" } ] } How will you query the table to get the dealership data? ​ A) select src:dealership from car_sales; ​ B) select src:$dealership from car_sales; ​ C) select dealership from car_sales;

A) select src:dealership from car_sales;
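A few more path-notation queries against the same car_sales table and its src variant column may help illustrate the syntax; the casts and aliases are illustrative only:
SELECT src:dealership::string       AS dealership,
       src:salesperson.name::string AS salesperson,
       src:vehicle[0].price::number AS first_vehicle_price
FROM car_sales;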

Which GCP regions are supported by Snowflake in US and Canada ​ A) us-central1 ​ B) us-central2 ​ C) us-east1

A) us-central1

Which AWS regions are supported by snowflake in US and Canada ​ A) us-west-2 ​ B) us-east-2 ​ C) us-east-1 ​ D) ca-central-1 ​ E) us-central-2

A) us-west-2 ​ B) us-east-2 ​ C) us-east-1 ​ D) ca-central-1 Note: https://docs.snowflake.com/en/user-guide/intro-regions.html#government-regions

If file format options are specified in multiple locations, the load operation applies the options in the following order of precedence. A)​ 1. COPY INTO TABLE statement. 2. Stage definition. 3. Table definition. B)​ 1. Stage definition. 2. COPY INTO TABLE statement. 3. Table definition. C)​ 1. COPY INTO TABLE statement. 2. Table definition. 3. Stage definition.

A)​ 1. COPY INTO TABLE statement. 2. Stage definition. 3. Table definition.

Snowflake supports which of the following file formats for data loading? A. Parquet B. ORC C. CSV D. PDF

A, B, & C

The Snowflake metadata repository contains which of the following? A. Table definitions B. References to all of the micro partition files for that table C. Tracking of all versions of the table data within the data retention window D. Aggregated data from the table

A, B, & C

Which of the following are true of multi-cluster Warehouses? Select all that apply below. A. A multi-cluster Warehouse can add clusters automatically based on query activity B. A multi-cluster Warehouse can automatically turn itself off after a period of inactivity C. A multi-cluster Warehouse can scale down when query activity slows D. A multi-cluster Warehouse can automatically turn itself on when a query is executed against it

A, B, C, D

Which of the following are types of caching used by SF? A. Warehouse Caching B. Index Caching C. Metadata Caching D. Query result Caching

A, C, & D

On which of the following cloud platforms can a Snowflake account be hosted? Choose 2 answers A. Amazon Web Services B. Private Virtual Cloud C. Oracle Cloud D. Microsoft Azure Cloud

A. Amazon Web Services & D. Microsoft Azure Cloud

What command is used to load files into an Internal Stage within Snowflake? A. PUT B. COPY INTO C. TRANSFER D. INSERT

A. PUT

True or False: Snowpipe via REST API can only reference External Stages as source.

A. True

Select the statements which are true for an external table ​ External tables are read-only ​ External tables can be used for query and join operations ​ Views can be created against external tables ​ Data can be updated in external tables

External tables are read-only ​ External tables can be used for query and join operations ​ Views can be created against external tables
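A minimal sketch of creating and querying an external table, assuming a hypothetical external stage my_ext_stage holding Parquet files; an external table exposes each file row through a VARIANT column named VALUE:
CREATE EXTERNAL TABLE ext_sales
  WITH LOCATION = @my_ext_stage/sales/
  FILE_FORMAT = (TYPE = PARQUET);
-- Read-only: SELECTs and joins work, but INSERT/UPDATE/DELETE do not
SELECT value:order_id::number AS order_id FROM ext_sales;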

The PIPE_EXECUTION_PAUSED parameter can be set at which levels ​ DATABASE ​ ACCOUNT ​ SCHEMA ​ PIPE

ACCOUNT ​ SCHEMA ​ PIPE Note: Complete the following steps to transfer ownership of a pipe: Set the PIPE_EXECUTION_PAUSED parameter to TRUE. This parameter enables pausing or resuming a pipe. The parameter is supported at the following levels: Account Schema Pipe At the pipe level, the object owner (or a parent role in a role hierarchy) can set the parameter to pause or resume an individual pipe. An account administrator (user with the ACCOUNTADMIN role) can set this parameter at the account level to pause or resume all pipes in the account. Likewise, a user with the MODIFY privilege on the schema can pause or resume pipes at the schema level. Note that this larger domain control only affects pipes for which the parameter was not already set at a lower level; e.g., by the owner at the object level. https://docs.snowflake.com/en/user-guide/data-load-snowpipe-manage.html#transferring-pipe-ownership
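Setting the parameter at the pipe level might look like the following sketch, assuming a hypothetical pipe mypipe:
-- Pause the pipe (the object owner or a parent role can run this)
ALTER PIPE mypipe SET PIPE_EXECUTION_PAUSED = TRUE;
-- Resume it later
ALTER PIPE mypipe SET PIPE_EXECUTION_PAUSED = FALSE;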

Which roles can create, alter, or drop network policies? ​ ACCOUNTADMIN ​ SECURITYADMIN ​ SYSADMIN

ACCOUNTADMIN ​ SECURITYADMIN

Who can setup a data share ​ SECURITYADMIN ​ SYSADMIN ​ ACCOUNTADMIN or a role granted the CREATE SHARES global privilege

ACCOUNTADMIN or a role granted the CREATE SHARES global privilege Note: https://docs.snowflake.com/en/user-guide/data-sharing-provider.html
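A minimal sketch of setting up a share, assuming hypothetical objects sales_db and sales_db.public.orders and a hypothetical consumer account XY12345:
CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share;
GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share;
ALTER SHARE sales_share ADD ACCOUNTS = xy12345;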

What is the encryption algorithm used by snowflake? ​ AES 256-bit encryption ​ DES ​ BLOWFISH

AES 256-bit encryption

When working with a cloned table, you can use the below SQL statements ​ SELECT ​ DROP ​ SHOW ​ ALL OF THE ABOVE

ALL OF THE ABOVE

You have created a TASK in snowflake. How will you resume it? ​ No need to resume, the creation operation automatically enables the task ​ ALTER TASK mytask1 RESUME; ​ ALTER TASK mytask1 START;

ALTER TASK mytask1 RESUME; Note: After creating a task, you must execute ALTER TASK ... RESUME before the task will run based on the parameters specified in the task definition.
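A sketch of the full lifecycle, assuming a hypothetical warehouse etl_wh and hypothetical staging/target tables; a newly created task always starts out suspended:
CREATE TASK mytask1
  WAREHOUSE = etl_wh
  SCHEDULE = '60 MINUTE'
AS
  INSERT INTO target SELECT * FROM staging;
-- The task stays suspended until explicitly resumed
ALTER TASK mytask1 RESUME;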

You want to scale out query performance. Which strategy will you follow with respect to your warehouses? ​ Resize the warehouse ​ Add clusters to the warehouse ​ Create another warehouse

Add clusters to the warehouse Note: Scale out = add clusters; scale up = resize to a larger warehouse
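Scaling out is configured on the warehouse itself. A sketch assuming a hypothetical warehouse analytics_wh (multi-cluster warehouses require Enterprise edition or higher):
ALTER WAREHOUSE analytics_wh SET
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4            -- Snowflake adds clusters up to this limit as load grows
  SCALING_POLICY = 'STANDARD';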

Your business team runs a set of identical queries every day after the batch ETL run is complete. From the following actions, what is the best action that you will recommend. ​ After the ETL run, execute the identical queries so that they remain in the result cache ​ After the ETL run, resize the warehouse to a larger warehouse ​ After the ETL run, copy the tables to another schema for the business users to query

After the ETL run, execute the identical queries so that they remain in the result cache Note: Please note the key word here, which is IDENTICAL queries. When a query is run the first time and the same query is run a second time, the second run picks up the results from the query result cache and does not cost you any compute. The query result cache is valid for 24 hours, and it gets extended for another 24 hours every time you access it (even if you access it at 23 hours 59 minutes).
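The behavior can be demonstrated (and controlled) through the USE_CACHED_RESULT session parameter; the sales table below is hypothetical:
ALTER SESSION SET USE_CACHED_RESULT = TRUE;  -- this is the default
SELECT region, SUM(amount) FROM sales GROUP BY region;  -- first run: executes on the warehouse
SELECT region, SUM(amount) FROM sales GROUP BY region;  -- identical text: served from the result cache, no compute used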

Having separate accounts in snowflake enables users to have ​ Different editions of snowflake and different regions ​ Billing can be done at account level ​ Database object deployment between environments becomes simpler ​ All of the above

All of the above

Which approach would result in improved performance through linear scaling of data ingestion workload? ​ Resize the virtual warehouse ​ Consider the practice of organizing data by granular path ​ Consider the practice of splitting input files within the recommended size range of 10 MB to 100 MB ​ All of the above

All of the above

Which best describes zero-copy cloning? ​ Metadata only operation ​ No replication of data ​ Unmodified data stored once and modified data stored as new micro partitions ​ All of the above

All of the above Note: When a clone is created of a table, the clone utilizes no data storage because it shares all the existing micro-partitions of the original table at the time it was cloned; however, rows can then be added, deleted, or updated in the clone independently from the original table. Each change to the clone results in new micro-partitions that are owned exclusively by the clone and are protected through CDP (Continuous Data Protection).
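A sketch of cloning, assuming a hypothetical orders table; the AT clause shows that cloning can also be combined with Time Travel:
-- Metadata-only operation: no data is copied
CREATE TABLE orders_clone CLONE orders;
-- Clone the table as it existed one hour ago
CREATE TABLE orders_1h_ago CLONE orders AT (OFFSET => -3600);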

OVERWRITE option for PUT is supported in which cloud providers ​ Amazon AWS ​ Microsoft Azure ​ Google cloud platform

Amazon AWS ​ Microsoft Azure

A Stream has been created on a table. A row in the table got inserted and subsequently got updated. The stream will capture both the events. ​(TRUE OR FALSE)

FALSE

Select the two true statements about multi-cluster warehouses ​ These types of warehouses cannot be resumed or suspended ​ Multi-cluster warehouses can be created on snowflake hosted on AWS only ​ As query demand decreases, clusters are removed from this warehouse ​ When query workload increases, clusters are automatically added

As query demand decreases, clusters are removed from this warehouse ​ When query workload increases, clusters are automatically added Note: In Auto-scale mode, a multi-cluster warehouse eliminates the need for resizing the warehouse or starting and stopping additional warehouses to handle fluctuating workloads. Snowflake automatically starts and stops additional clusters as needed. In Maximized mode, you can control the capacity of the warehouse by increasing or decreasing the number of clusters as needed.

Snowflake on GCP is not available in which region currently (as of Year 2020) ​ Europe ​ Asia Pacific ​ North America

Asia Pacific Refer to the link for details of Snowflake availability in different regions - https://docs.snowflake.com/en/user-guide/intro-regions.html

Cloud services layer is one of the layers in snowflake. Which services are provided by this layer ​ Authentication ​ Infrastructure management ​ Metadata management ​ Query parsing and optimization ​ Access control ​ Query processing

Authentication ​ Infrastructure management ​ Metadata management ​ Query parsing and optimization ​ Access control

Which of the following are options when creating a virtual warehouse ​ Auto Suspend ​ Auto Resume ​ Storage Size ​ Server count

Auto Suspend ​ Auto Resume

Which two modes are available in Multi cluster warehouse ​ Auto-scale ​ Peak Scale ​ Flexible ​ Maximized

Auto-scale ​ Maximized

What are the two mechanisms by which Snowpipe detects that new files have arrived in a stage? ​ Automating Snowpipe using cloud messaging ​ Calling Snowpipe REST endpoints ​ Calling the custom APIs exposed through AWS EKS

Automating Snowpipe using cloud messaging ​ Calling Snowpipe REST endpoints

What is the maximum number of consumer accounts that can be added to a Share object? A. One B. Unlimited C. 10 D. 100

B

What is the recommended approach for making a variant column accessible in a BI tool? A. A pre-defined mapping B. A View C. Leveraging a JSON parser D. There is no way to do this

B

What technique does Snowflake use to limit the number of micro-partitions scanned by each query? A. Indexing B. Pruning C. Map Reduce D. B-True

B

Which of the following techniques can be used in SF to improve query performance? A. Index Definition B. Cluster Keys/Reclustering C. Distribution Keys D. Query Hints

B

Which type of data integration tools leverage SF's scalable compute for data transformation? A. Data Replication B. ELT C. ETL D. Streaming

B

Load history of Snowpipe expires after how many days. ​ A) 30 ​ B) 14 ​ C) 64

B) 14

You have executed a task. Due to a coding issue, the task went into a loop. How long will the task run before it is terminated by Snowflake? ​ A) 24 hours ​ B) 60 minutes ​ C) 30 minutes

B) 60 minutes

One of your users complained that a query is taking a long time to return any result. What techniques would you consider to improve performance? ​ A) Create an index on the search columns ​ B) Create cluster keys and turn on auto clustering on the table ​ C) Enable query hints on the table ​ D) Define the partition keys for the table

B) Create cluster keys and turn on auto clustering on the table
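A sketch of adding a clustering key, assuming a hypothetical large table events that is commonly filtered by event_date; Automatic Clustering then maintains the ordering in the background:
ALTER TABLE events CLUSTER BY (event_date);
-- Automatic Clustering can be suspended or resumed per table
ALTER TABLE events SUSPEND RECLUSTER;
ALTER TABLE events RESUME RECLUSTER;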

Tick the statements which are true for fail-safe ​ A) Fail-safe period is configurable ​ B) Data can be recovered from fail-safe storage only by a Snowflake employee ​ C) Using fail-safe you can create dev/test environments ​ D) Tables with fail-safe turned on incur additional storage costs compared to tables where fail-safe is not turned on

B) Data can be recovered from fail-safe storage only by a Snowflake employee D) Tables with fail-safe turned on incur additional storage costs compared to tables where fail-safe is not turned on

The three key layers of snowflake are ​ A) Extraction, Ingestion, Load ​ B) Database storage, Query Processing, Cloud Services ​ C) Database, Virtual Warehouse, Data Experience

B) Database storage, Query Processing, Cloud Services Note: https://docs.snowflake.com/en/user-guide/intro-key-concepts.html

You are an account administrator and you want to use a third party product in snowflake. Where will you go to enable the third party product? ​ A) The third party product's webpage and contact them through contact us ​ B) Call snowflake support to enable the product ​ C) Enable the product through Partner Connect in Snowflake web console

C) Enable the product through Partner Connect in Snowflake web console

A snowflake account hosted on AWS cannot load data from a file that is staged in GCP or Azure ​ FALSE ​ TRUE

FALSE

Select two true statements about fail safe from the following ​ A) Fail safe is same as time travel ​ B) Fail-safe provides a (non-configurable) 7-day period during which historical data is recoverable by Snowflake ​ C) Fail-safe provides a (non-configurable) 90-day period during which historical data is recoverable by Snowflake ​ D) Fail-safe ensures historical data is protected in the event of a system failure or other catastrophic event, e.g. a hardware failure or security breach

B) Fail-safe provides a (non-configurable) 7-day period during which historical data is recoverable by Snowflake ​ D) Fail-safe ensures historical data is protected in the event of a system failure or other catastrophic event, e.g. a hardware failure or security breach

Which of the below is true about snowflake architecture ​ A) One node shared data ​ B) Multi Cluster Shared data ​ C) Multi Node distributed data

B) Multi Cluster Shared data

What is the default action to perform when an error is encountered while loading data from a file using SNOWPIPE? ​ A) ABORT_STATEMENT ​ B) SKIP_FILE ​ C) CONTINUE

B) SKIP_FILE

Select two choices that are true about snowflake roles ​ A) Snowflake users have a limit on the number of roles that they can assume ​ B) Snowflake user can have one or more roles ​ C) Privileges can be directly assigned to a user ​ D) For a particular session, only one role can be active at a given time

B) Snowflake user can have one or more roles ​ D) For a particular session, only one role can be active at a given time

Which object in snowflake records data manipulation language changes made to tables, including inserts, updates, and deletes, as well as metadata about each change? ​ A) Tasks ​ B) Table stream ​ C) Snowpipe ​ D) Change Data Capture

B) Table stream Note: A stream object records data manipulation language (DML) changes made to tables, including inserts, updates, and deletes, as well as metadata about each change, so that actions can be taken using the changed data. This process is referred to as change data capture (CDC). An individual table stream tracks the changes made to rows in a source table. A table stream (also referred to as simply a "stream") makes a "change table" available of what changed, at the row level, between two transactional points of time in a table. This allows querying and consuming a sequence of change records in a transactional fashion.
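A sketch of a stream in action, assuming a hypothetical orders table (and an orders_audit table shaped to match the stream output); selecting from the stream returns the changed rows plus the METADATA$ columns:
CREATE STREAM orders_stream ON TABLE orders;
-- After some inserts/updates/deletes on orders:
SELECT * FROM orders_stream;  -- includes METADATA$ACTION, METADATA$ISUPDATE, METADATA$ROW_ID
-- Consuming the stream in a DML statement advances its offset
INSERT INTO orders_audit SELECT * FROM orders_stream;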

When you remove the predecessor of a child task, what may happen to the child task? Select two possibilities. ​ A) The child task is removed from the system ​ B) The child task may become the root task ​ C) The child task may become a stand-alone task

B) The child task may become the root task ​ C) The child task may become a stand-alone task

How often does snowflake release new features? ​ A) Daily ​ B) Weekly ​ C) Monthly ​ D) Annually

B) Weekly

What are the three things customers want most from their enterprise data warehouse solution? Choose 3 answers A. On-premise availability B. Simplicity C. Open source based D. Concurrency E. Performance

B, D, E

What are the two data loading approaches in snowflake ​ BULK LOADING ​ CONTINUOUS LOADING ​ INGEST LOADING

BULK LOADING (COPY command) ​ CONTINUOUS LOADING (Snowpipe)
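Side by side, the two approaches might look like the following sketch; the stage, pipe, and table names are hypothetical, and AUTO_INGEST assumes an external stage with cloud event notifications configured:
-- Bulk loading: an explicit COPY run on a user-managed warehouse
COPY INTO sales FROM @my_stage/sales/
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
-- Continuous loading: a pipe that Snowpipe executes as new files arrive
CREATE PIPE sales_pipe AUTO_INGEST = TRUE AS
  COPY INTO sales FROM @my_ext_stage/sales/
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);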

How would you execute a series of SQL statements using a Task? A. Include the SQL statements in the body of the task: CREATE TASK mytask ... AS INSERT INTO target1 SELECT ... FROM stream_s1 WHERE ...; INSERT INTO target2 SELECT ... FROM stream_s1 WHERE ...; B. A stored procedure can have only one DML statement per stored procedure invocation, and therefore you should sequence stored procedure calls in the task definition: CREATE TASK mytask ... AS CALL stored_proc1(); CALL stored_proc2(); C. Write a stored procedure that executes multiple SQL statements and invoke the stored procedure from the task: CREATE TASK mytask ... AS CALL stored_proc_multiple_statements_inside(); D. Create a task for each SQL statement (e.g. resulting in task1, task2, etc.) and string the series of SQL statements together by having a control task call task1, task2 sequentially.

C

Which of the following terms best describes Snowflake's database architecture? A. Columnar shared nothing B. Shared disk C. Multi-cluster shared data D. Cloud native shared memory

C

Let's say you executed a transaction in a snowflake session. Due to some reason the session disconnects and the transaction is now in a detached state. The transaction cannot be committed or rolled back. The object on which the transaction was applied is also now locked. If you do not do anything and let snowflake eventually abort the transaction, how long will you need to wait? ​ A) 60 minutes ​ B) 15 minutes ​ C) 4 hours

C) 4 hours

Your customer needs to setup Failover/Failback mechanism in their snowflake instance. Which edition of Snowflake will you suggest? ​ A) STANDARD ​ B) PREMIER ​ C) BUSINESS CRITICAL(OR HIGHER)

C) BUSINESS CRITICAL(OR HIGHER) Note: https://docs.snowflake.com/en/user-guide/database-replication-failover.html

Your customer deals with extremely sensitive data like PHI data. They need to be compliant with HIPAA. What is the minimum edition that you will recommend ​ A) STANDARD ​ B) PREMIUM ​ C) Business Critical Edition ​ D) Virtual Private Snowflake

C) Business Critical Edition

What is the easiest way to monitor the queries that are run on snowflake? ​ A) Create a tableau dashboard and connect to snowflake ​ B) All queries go to cloudwatch and use cloudwatch to monitor ​ C) Click on the History tab and monitor all queries that are executed in the last 14 days

C) Click on the History tab and monitor all queries that are executed in the last 14 days

Pick the true statement for snowflake architecture ​ A) Shared nothing architecture ​ B) Shared disk architecture ​ C) Multi-Cluster Shared Data architecture

C) Multi-Cluster Shared Data architecture Note: Snowflake's architecture is a hybrid of traditional shared-disk database architectures and shared-nothing database architectures. Similar to shared-disk architectures, Snowflake uses a central data repository for persisted data that is accessible from all compute nodes in the data warehouse. But similar to shared-nothing architectures, Snowflake processes queries using MPP (massively parallel processing) compute clusters where each node in the cluster stores a portion of the entire data set locally. This approach offers the data management simplicity of a shared-disk architecture, but with the performance and scale-out benefits of a shared-nothing architecture. https://docs.snowflake.com/en/user-guide/intro-key-concepts.html

What happens when you execute the below query CREATE OR REPLACE TABLE MyTable_V2 CLONE MyTable ​ A) MyTable_V2 is created and all the data from MyTable is copied into MyTable_V2 ​ B) MyTable_V2 is created and a batch job is submitted to copy data from MyTable ​ C) Snowflake creates a new entry in the metadata store to keep track of the new clone. The existing micro partitions of table MyTable are also mapped to the new table. This is called zero copy clone.

C) Snowflake creates a new entry in the metadata store to keep track of the new clone. The existing micro partitions of table MyTable are also mapped to the new table. This is called zero copy clone.

What influences snowflake pricing? ​ A) Amount of data queried from snowflake ​ B) Amount of data scanned during querying snowflake ​ C) Snowflake pricing is based on usage and it charges only for storage and compute

C) Snowflake pricing is based on usage and it charges only for storage and compute

Query processing in snowflake is done by... ​ A) Snowflake processes queries using AWS EMR running spark ​ B) Snowflake processes queries running spark on an EC2 instance ​ C) Snowflake processes queries using 'Virtual Warehouses'

C) Snowflake processes queries using 'Virtual Warehouses'

How is data loaded into snowflake? ​ A) Snowflake loads the data in parquet format on the underlying cloud storage ​ B) Snowflake loads the data in JSON format on the underlying cloud storage ​ C) Snowflake reorganizes the data into its internal optimized, compressed, columnar format and stores it on the underlying cloud storage

C) Snowflake reorganizes the data into its internal optimized, compressed, columnar format and stores it on the underlying cloud storage

If you want customer dedicated virtual warehouse, which is the lowest snowflake edition that you should opt for ​ A) Business Critical ​ B) Enterprise ​ C) Standard

C) Standard

You are developing an ETL query. One of your ETL steps requires you to store intermediate data temporarily. This data will be used only by your ETL. The data will not be required outside the ETL. If you want to optimize storage cost, what type of table will you create to store this data? ​ A) PERMANENT ​ B) TRANSIENT ​ C) TEMPORARY ​ D) EXTERNAL

C) TEMPORARY
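A sketch with hypothetical names; a temporary table exists only for the duration of the session and has no Fail-safe, which keeps the storage cost of intermediate ETL data to a minimum:
CREATE TEMPORARY TABLE etl_intermediate AS
  SELECT order_id, SUM(amount) AS total_amount
  FROM raw_orders
  GROUP BY order_id;
-- Dropped automatically when the session ends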

You have created a lot of shares with role as ACCOUNTADMIN. You want to create the shares with SYSADMIN as role, so you have granted 'CREATE SHARE' privilege to SYSADMIN. How do you change the ownership of the existing shares? ​ A) Execute 'GRANT OWNERSHIP ON <SHARE NAME> TO ROLE SYSADMIN' ​ B) Execute 'GRANT OWNERSHIP TO ROLE SYSADMIN ON <SHARE NAME>' ​ C) The only option is to drop and recreate the share. Ownership of a share cannot be granted to another role

C) The only option is to drop and recreate the share. Ownership of a share cannot be granted to another role

When a warehouse does not have enough resources available to process the queries, what happens to the incoming queries? ​ A) Queries are aborted immediately ​ B) Snowflake automatically resizes the warehouse ​ C) The queries are queued and then executed when the resources are available again

C) The queries are queued and then executed when the resources are available again

Compute cost in snowflake depends on ​ A) The actual query execution time ​ B) The query execution time and the time the query waits for the resource ​ C) The warehouse size and how long the warehouse runs

C) The warehouse size and how long the warehouse runs

How do you truncate a date (from a timestamp) down to the year, month, and day ​ A) Use concatenation of the date part ​ B) Split the date into year, month and day ​ C) Use date_trunc and execute a query as below select to_date('2015-05-08T23:39:20.123-07:00') as "DATE1", date_trunc('YEAR', "DATE1") as "TRUNCATED TO YEAR", date_trunc('MONTH', "DATE1") as "TRUNCATED TO MONTH", date_trunc('DAY', "DATE1") as "TRUNCATED TO DAY";

C) Use date_trunc and execute a query as below:
select to_date('2015-05-08T23:39:20.123-07:00') as "DATE1",
       date_trunc('YEAR', "DATE1") as "TRUNCATED TO YEAR",
       date_trunc('MONTH', "DATE1") as "TRUNCATED TO MONTH",
       date_trunc('DAY', "DATE1") as "TRUNCATED TO DAY";

Your customer requires the highest level of data protection and needs a completely isolated environment for snowflake. Which edition will you recommend? ​ A) STANDARD ​ B) PREMIER ​ C) Virtual Private Snowflake ​ D) Business Critical Edition

C) Virtual Private Snowflake

Compute in snowflake is provided by ​ A) The cloud providers VM instances ​ B) Only EC2 instances on AWS ​ C) Virtual warehouses

C) Virtual warehouses

In your snowflake environment, you have a medium warehouse. The medium warehouse is used by the business to run ad hoc queries. The warehouse has auto suspend set at 15 minutes. You have noticed that all of the queries that run on this warehouse finish within a minute. What will you do to optimize the cost of compute in this case? ​ A) Since all the queries are completing in a minute, you should delete this warehouse and tell the users to use another existing warehouse ​ B) You will tell the users that they do not need snowflake to run their queries, they should load their data in an on-premise database and run queries from there ​ C) You will recommend to reduce the auto suspend time to 1 minute

C) You will recommend to reduce the auto suspend time to 1 minute
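The change itself is a single statement; the warehouse name is hypothetical, and AUTO_SUSPEND is specified in seconds:
ALTER WAREHOUSE adhoc_wh SET AUTO_SUSPEND = 60;  -- suspend after 60 seconds of inactivity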

A successful response from the SNOWPIPE insertFiles API means that the files are ingested. (TRUE OR FALSE)

FALSE

A table in snowflake can only be queried using the virtual warehouse used to load the data ​ TRUE ​ FALSE

FALSE

You are a snowflake architect hired by an organization. They want you to design their warehouse strategy. Which one of the strategies will you pick? ​ A) Both loading and analysis of data will be done by a single warehouse to reduce cost ​ B) You do not need to use a warehouse for loading data, only query analysis will require a warehouse ​ C) You will recommend to use a multi-warehouse strategy. The load workload will be done by one warehouse and the query analysis workload will be done by another warehouse

C) You will recommend to use a multi-warehouse strategy. The load workload will be done by one warehouse and the query analysis workload will be done by another warehouse

When loading data into Snowflake, the COPY command supports: Choose 2 answers A. Joins B. Filters C. Data type conversions D. Column reordering E. Aggregates

C. Data Type Conversion & D. Column Reordering
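A sketch of a transforming COPY, with hypothetical stage and table names; it shows a cast ($1 to NUMBER) and column reordering ($3 placed before $2), assuming target_table has three columns in that order:
COPY INTO target_table
FROM (
  SELECT $1::number, $3, $2   -- cast the first staged column, swap the next two
  FROM @my_stage/data.csv
)
FILE_FORMAT = (TYPE = CSV);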

The commands to load data into snowflake are ​ COPY AND PUT ​ COPY AND INSERT ​ INSERT AND PUT ​ NONE OF THE ABOVE

COPY AND PUT ​ COPY AND INSERT ​ INSERT AND PUT

Which of the below function returns the name of the warehouse of the current session? ​ ACTIVE_WAREHOUSE() ​ SESSION_WAREHOUSE() ​ CURRENT_WAREHOUSE() ​ WAREHOUSE()

CURRENT_WAREHOUSE()

What makes a Partner Connect Partner different from other partners? ​ Can be connected to Snowflake using a streamlined wizard ​ Requires Enterprise Edition ​ Can be connected from within the WebUI​ Includes automated role, user and staging database set up ​ Includes a streamlined Partner Trial Account Signup

Can be connected to Snowflake using a streamlined wizard ​ Can be connected from within the WebUI ​ Includes automated role, user and staging database set up ​ Includes a streamlined Partner Trial Account Signup Explanation: Refer to the link for details - https://docs.snowflake.com/en/user-guide/ecosystem-partner-connect.html

In snowflake data storage cost depends on which factors ​ Cloud service provider ​ Region ​ Type of account we chose (on-demand/capacity) ​ Edition

Cloud service provider ​ Region ​ Type of account we chose (on-demand/capacity)

To improve the query performance, which of the below techniques can be used in snowflake ​ Indexes ​ Distribution keys ​ Query hints ​ Cluster keys/Reclustering

Cluster keys/Reclustering

Why would you not recommend a high cardinality column for including in the clustering key? (Select 2) ​ Extremely high cardinality column results in skewed distribution across micro-partitions and hence will impact query pruning ​ High cardinality will lead to low clustering depth, so as you add more data query performance will degrade ​ Cluster maintenance with high cardinality keys is expensive ​ Point lookup queries will only benefit, whereas range queries will not effectively utilize the clustering key

Cluster maintenance with high cardinality keys is expensive ​ Point lookup queries will only benefit, whereas range queries will not effectively utilize the clustering key

Which transformations are available when using the COPY INTO command to load data files into Snowflake from a stage? Filters Aggregates Column data type conversion Column concatenation

Column data type conversion Column concatenation

Select the options that differentiate a Partner Connect partner from a regular partner ​ Connect with snowflake through a wizard ​ Includes a partner trial account signup ​ Can be connected from the WEB UI ​ Includes automated role, user and staging database setup ​ None of the above

Connect with snowflake through a wizard ​ Includes a partner trial account signup ​ Can be connected from the WEB UI ​ Includes automated role, user and staging database setup Note: https://docs.snowflake.com/en/user-guide/ecosystem-partner-connect.html#snowflake-partner-connect

Which statements are true about a micro partition in snowflake ​ Contiguous unit of storage ​ Each micro partition contains between 50 MB and 500 MB of uncompressed data ​ Organized in a columnar fashion ​ Micro partitions can be updated using snowflake query

Contiguous unit of storage ​ Each micro partition contains between 50 MB and 500 MB of uncompressed data ​ Organized in a columnar fashion Note: Micro partitions are immutable, which means once created they cannot be changed. If a row is updated in snowflake, the micro partition holding the row is copied into a new micro partition and the updated row is inserted in that micro partition. The old micro-partition is marked for deletion. All data in Snowflake tables is automatically divided into micro-partitions, which are contiguous units of storage. Each micro-partition contains between 50 MB and 500 MB of uncompressed data (note that the actual size in Snowflake is smaller because data is always stored compressed). Groups of rows in tables are mapped into individual micro-partitions, organized in a columnar fashion. This size and structure allows for extremely granular pruning of very large tables, which can be comprised of millions, or even hundreds of millions, of micro-partitions. Snowflake stores metadata about all rows stored in a micro-partition, including: The range of values for each of the columns in the micro-partition. The number of distinct values. Additional properties used for both optimization and efficient query processing. https://docs.snowflake.com/en/user-guide/tables-clustering-micropartitions.html

What are the three features provided by Snowflake for continuous data pipelines ​ Continuous data loading ​ Change data tracking ​ Recurring tasks ​ Cloud functions ​ Table pipes

Continuous data loading ​ Change data tracking ​ Recurring tasks

What is the best practice for handling semi-structured data with 3rd party BI tools?

Create a Snowflake view that parses the semi-structured column into structured columns for the BI tool to retrieve
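A minimal sketch of such a view, reusing the car_sales table and src variant column assumed in the earlier JSON question:
CREATE OR REPLACE VIEW car_sales_v AS
SELECT src:date::date                AS sale_date,
       src:dealership::string       AS dealership,
       src:salesperson.name::string AS salesperson
FROM car_sales;
-- BI tools can now query car_sales_v like any structured table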

You want to convert an existing permanent table to a transient table (or vice versa) while preserving data and other characteristics such as column defaults and granted privileges. What is the best way to do it? ​ Run an ALTER TABLE command to convert the tables ​ Unload the data from the existing table into a CSV file. Create the new table and then load the data back in ​ Create a new table and use the COPY GRANTS clause

Create a new table and use the COPY GRANTS clause
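The documented workaround is sketched below with hypothetical names: create the new table with LIKE ... COPY GRANTS (preserving column defaults and granted privileges), then copy the data across:
CREATE TRANSIENT TABLE my_table_t LIKE my_table COPY GRANTS;
INSERT INTO my_table_t SELECT * FROM my_table;
-- Optionally drop or rename the original table afterwards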

Your business team wants to load JSON data into a table in snowflake. What will you recommend them? ​ Create a table with a variant column and load the JSON data in that column ​ Tell the team to convert JSON to CSV ​ Construct a python program to convert the JSON into CSV and then load the data.

Create a table with a variant column and load the JSON data in that column
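A sketch of the load, assuming a hypothetical stage and file; each top-level JSON object lands in a single VARIANT value:
CREATE TABLE car_sales (src VARIANT);
COPY INTO car_sales
FROM @my_stage/car_sales.json
FILE_FORMAT = (TYPE = JSON);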

Which of the below features are only available to Business Critical edition and above? ​ Customer-managed encryption keys through Tri-Secret Secure. ​ Support for secure, direct proxy to your other virtual networks or on-premises data centers using AWS PrivateLink or Azure Private Link. ​ Support for PHI data (in accordance with HIPAA regulations). ​ Support for PCI DSS. ​ Extended Time Travel (up to 90 days).

Customer-managed encryption keys through Tri-Secret Secure. ​ Support for secure, direct proxy to your other virtual networks or on-premises data centers using AWS PrivateLink or Azure Private Link. ​ Support for PHI data (in accordance with HIPAA regulations). ​ Support for PCI DSS.

You have cloned a table. Which of the below queries will work on the cloned table? ​ A) DROP TABLE <TABLE_NAME> ​ B) SELECT * FROM <TABLE_NAME> ​ C) SHOW TABLES LIKE '<TABLE_NAME>' ​ D) ALL OF THE ABOVE

D) ALL OF THE ABOVE

When the snowflake staged release happens, which accounts get the updates first? ​ A) Enterprise or above ​ B) Trial accounts ​ C) Premier accounts ​ D) Accounts that have opted for early access

D) Accounts that have opted for early access

Which snowflake edition supports private communication between Snowflake and your other VPCs through AWS PrivateLink ​ A) Standard ​ B) Premier ​ C) Enterprise ​ D) Business critical

D) Business critical

Which of the following best describes Snowflake's processing engine? ​ A) Leverages Apache Spark ​ B) Leverages Map Reduce ​ C) A derivative of Presto ​ D) Native SQL

D) Native SQL

For federated authentication and MFA, what is the minimum snowflake edition required? ​ A) Business critical ​ B) Enterprise ​ C) Premier ​ D) Standard

D) Standard

What are the product offerings for secure data sharing? ​ DIRECT SHARE ​ DATA MARKETPLACE ​ DATA EXCHANGE ​ DATA STORE

DIRECT SHARE ​ DATA MARKETPLACE ​ DATA EXCHANGE Note: https://docs.snowflake.com/en/user-guide/data-sharing-product-offerings.html#direct-share

Which of the following is not a layer in SF's Architecture? ​ Storage ​ Cloud Services ​ Data Catalog ​ Virtual Warehouses

Data Catalog

What are the available partner categories in snowflake? ​ Data integration ​ Business Intelligence ​ Security and governance ​ ML and data science ​ Data Warehousing

Data integration ​ Business Intelligence ​ Security and governance ​ ML and data science Note: https://docs.snowflake.com/en/user-guide/ecosystem.html

There is no query performance difference between a column with a maximum length declaration (e.g. VARCHAR(16777216)) and a smaller precision. Still, it is recommended to define an appropriate column length because of the below reasons ​ Data loading operations are more likely to detect issues such as columns loaded out of order, e.g. a 50-character string loaded erroneously into a VARCHAR(10) column. Such issues produce errors ​ When the column length is unspecified, some third-party tools may anticipate consuming the maximum size value, which can translate into increased client-side memory usage or unusual behavior ​ Data unloading will be performant if appropriate column lengths are defined

Data loading operations are more likely to detect issues such as columns loaded out of order, e.g. a 50-character string loaded erroneously into a VARCHAR(10) column. Such issues produce errors ​ When the column length is unspecified, some third-party tools may anticipate consuming the maximum size value, which can translate into increased client-side memory usage or unusual behavior

Which of the following conditions are required for sharing data in Snowflake? Note - the data sharing referred to here is not via Data Replication. ​ Data providers with ACCOUNTADMIN role can set up shares ​ Consumer accounts must be in the same Snowflake region as the provider account ​ Secure views are not required when query performance is priority ​ Each data share must contain a single database

Data providers with ACCOUNTADMIN role can set up shares ​ Consumer accounts must be in the same Snowflake region as the provider account Details for setting up a share can be found at the following link - https://docs.snowflake.com/en/user-guide/data-share-providers.html

What is the best way to store semi-structured data?

Data should be loaded and stored in a VARIANT type column

Which layer in snowflake contains data in compressed, columnar format ​ Cloud services ​ Query processing ​ Database storage ​ None

Database storage

Which feature is only available in VPS edition? ​ Dedicated metadata store and pool of virtual servers (used in virtual warehouses). ​ SOC 2 Type II certification. ​ Automatic encryption of all data.

Dedicated metadata store and pool of virtual servers (used in virtual warehouses). Note: https://docs.snowflake.com/en/user-guide/intro-editions.html#business-critical-edition

Each object has an owner, who can in turn grant access to that object. What is this model called? ​ Discretionary Access Control (DAC) ​ Role-based Access Control (RBAC) ​ Object ownership model

Discretionary Access Control (DAC) Note: Snowflake's approach to access control combines aspects from both of the following models: 1. Discretionary Access Control (DAC): Each object has an owner, who can in turn grant access to that object. 2. Role-based Access Control (RBAC): Access privileges are assigned to roles, which are in turn assigned to users. https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#access-control-framework

To avoid unexpected task executions due to daylight saving time, you will do one of the below activities. ​ Do not schedule tasks to run at a specific time between 1 AM and 3 AM (daily, or on days of the week that include Sundays) ​ Manually adjust the cron expression for tasks scheduled during those hours twice each year to compensate for the time change due to daylight saving time ​ Add the USE DAYLIGHT SAVINGS parameter for task

Do not schedule tasks to run at a specific time between 1 AM and 3 AM (daily, or on days of the week that include Sundays) ​ Manually adjust the cron expression for tasks scheduled during those hours twice each year to compensate for the time change due to daylight saving time Note: The cron expression in a task definition supports specifying a time zone. A scheduled task runs according to the specified cron expression in the local time for a given time zone. Special care should be taken with regard to scheduling tasks for time zones that recognize daylight saving time. Tasks scheduled during specific times on days when the transition from standard time to daylight saving time (or the reverse) occurs can have unexpected behaviors. For example: During the autumn change from daylight saving time to standard time, a task scheduled to start at 1 AM in the America/Los_Angeles time zone (i.e. 0 1 * * * America/Los_Angeles) would run twice: once at 1 AM and then again when 1:59:59 AM shifts to 1:00:00 AM local time. That is, there are two points in time when the local time is 1 AM. During the spring change from standard time to daylight saving time, a task scheduled to start at 2 AM in the America/Los_Angeles time zone (i.e. 0 2 * * * America/Los_Angeles) would not run at all because the local time shifts from 1:59:59 AM to 3:00:00 AM. That is, there is no point during that day when the local time is 2 AM.
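A sketch of a cron-scheduled task that stays clear of the 1 AM-3 AM window; the warehouse and procedure names are hypothetical:
CREATE TASK nightly_task
  WAREHOUSE = etl_wh
  SCHEDULE = 'USING CRON 0 4 * * * America/Los_Angeles'  -- 4 AM: outside the DST transition window
AS
  CALL nightly_load_proc();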

To have multi-cluster virtual warehouses, what should be the minimum snowflake subscription ​ ENTERPRISE ​ BUSINESS CRITICAL ​ STANDARD

ENTERPRISE

Your organization has a need for Column-level security by applying masking policies to columns in tables or views. Which editions will be suitable for this requirement? ​ Standard ​ Enterprise ​ Business Critical ​ VPS

Enterprise ​ Business Critical ​ VPS

Extended Time Travel (up to 90 days) is available in which editions ​ Standard ​ Enterprise ​ Business Critical ​ VPS

Enterprise ​ Business Critical ​ VPS Note: https://docs.snowflake.com/en/user-guide/intro-editions.html#security-data-protections

What was Business Critical Edition formerly known as ​ Standard Edition ​ Enterprise Edition ​ Business Edition ​ Enterprise for Sensitive Data (ESD)

Enterprise for Sensitive Data (ESD)

With respect to Snowflake UI, which of the following is true? ​ A single session can be shared between multiple worksheets ​ Every worksheet can have a different role, warehouse and a database ​ Worksheets cannot have different role, warehouse and database ​ Every worksheet has its own session

Every worksheet can have a different role, warehouse and database ​ Every worksheet has its own session

How will you check if a share is inbound or outbound? ​ Execute SHOW SHARES and look at kind column ​ Execute describe share and look at location column ​ Execute DISPLAY SHARE and look at kind column

Execute SHOW SHARES and look at kind column

One of your users has accidentally dropped a table T1 in production. What can you do to restore the table? ​ Execute UNDROP TABLE T1; ​ Execute UNDELETE TABLE T1; ​ Call Snowflake team and request them to recover the data from cloud storage ​ Nothing can be done, whatever is lost is lost

Execute UNDROP TABLE T1; Note: UNDROP is a very useful feature in Snowflake. It makes it easy to retrieve your data because snowflake retains dropped data and objects (in encrypted form) for 24 hours by default, and you have the option to purchase an even longer retention period if required. Any lost data can be instantaneously recovered with the UNDROP feature. It is a single short command: UNDROP TABLE <tablename> In addition to this, you can also do UNDROP SCHEMA and UNDROP DATABASE if something bigger gets dropped by mistake. https://docs.snowflake.com/en/sql-reference/sql/undrop-table.html#undrop-table

What are the recommendations from Snowflake to effectively and efficiently use a warehouse. ​ Experiment with different types of queries and different warehouse sizes to determine the combinations that best meet your specific query needs and workload ​ Don't focus on warehouse size. Snowflake utilizes per-second billing, so you can run larger warehouses (Large, X-Large, 2X-Large, etc.) and simply suspend them when not in use ​ Be very careful on what size of warehouse you are picking up. The credits charged are based on number of queries running on the warehouse and the size of the warehouse

Experiment with different types of queries and different warehouse sizes to determine the combinations that best meet your specific query needs and workload ​ Don't focus on warehouse size. Snowflake utilizes per-second billing, so you can run larger warehouses (Large, X-Large, 2X-Large, etc.) and simply suspend them when not in use Note: Remember one more thing. The credits are not dependent on the number of queries running in your warehouse; they depend on how long your warehouse is active.

A transient table can be cloned to a permanent table ​ TRUE ​ FALSE

FALSE

You have a small table in snowflake which has only 10,000 rows. Specifying a clustering key will further improve the performance of queries that run on this table ​ FALSE ​ TRUE

FALSE Note: Specifying a clustering key is not necessary for most tables. Snowflake performs automatic tuning via the optimization engine and micro-partitioning. In many cases, data is loaded and organized into micro-partitions by date or timestamp, and is queried along the same dimension. When should you specify a clustering key for a table? First, note that clustering a small table typically doesn't improve query performance significantly. For larger data sets, you might consider specifying a clustering key for a table when: The order in which the data is loaded does not match the dimension by which it is most commonly queried (e.g. the data is loaded by date, but reports filter the data by ID). If your existing scripts or reports query the data by both date and ID (and potentially a third or fourth column), you may see some performance improvement by creating a multi-column clustering key. Query Profile indicates that a significant percentage of the total duration time for typical queries against the table is spent scanning. This applies to queries that filter on one or more specific columns. Note that reclustering rewrites existing data with a different order. The previous ordering is stored for 7 days to provide Fail-safe protection. Reclustering a table incurs compute costs that correlate to the size of the data that is reordered. https://docs.snowflake.com/en/user-guide/table-considerations.html#when-to-set-a-clustering-key

PUT command can be run from Snowflake worksheets ​ TRUE ​ FALSE

FALSE Note: The PUT command cannot be executed from the Worksheets page in the Snowflake web interface; instead, use the SnowSQL client to upload data files, or check the documentation for the specific Snowflake client to verify support for this command. https://docs.snowflake.com/en/sql-reference/sql/put.html#usage-notes

A COUNT(*) and COUNT(<COLUMN_NAME>) on a table will always give you the same result ​ TRUE ​ FALSE

FALSE Note: A COUNT(*) will return a total count of rows in the table, while COUNT(<column_name>) will return a count of rows with a non-NULL value in that particular column. COUNT(A,B) only counts the rows that have no NULL values in either the A or B column while COUNT(<ALIAS>.*) can be used to count all the rows containing no NULL columns.

When you clone a table in snowflake, the storage requirements are doubled as it needs to now store two copies of data. ​ TRUE ​ FALSE

FALSE Note: A massive benefit of zero-copy cloning in snowflake is that the underlying data is never duplicated. Only the metadata/pointers to the micro-partitions are replicated for the new cloned table. Hence, when you clone a table the actual data is never copied over to the new cloned table.

Auto clustering can be switched off at an account level ​ FALSE ​ TRUE

FALSE Note: Auto clustering cannot be switched off at the database or account level; it will need to be done at the table level.

Resource monitors can be used to control credit usage for the Snowflake-provided warehouses, including the snowpipe warehouse ​ TRUE ​ FALSE

FALSE Note: Resource monitors provide control over virtual warehouse credit usage; however, you cannot use them to control credit usage for the Snowflake-provided warehouses, including the SNOWPIPE warehouse.

A consumer of shared database can add as many tables and views they want in the shared database ​ FALSE ​ TRUE

FALSE Note: Shares are read only

Snowflake guarantees that only one instance of a task with a defined predecessor task is running at a given time ​ TRUE ​ FALSE

FALSE Note: Snowflake cannot guarantee that only one instance of a task with a defined predecessor task is running at a given time.

If you clone a database, the internal snowflake stages in that database are also cloned ​ FALSE ​ TRUE

FALSE Note: The following rules apply to cloning stages or objects that contain stages (i.e. databases and schemas): Individual external named stages can be cloned; internal named stages cannot be cloned. When cloning a database or schema: External named stages that were present in the source when the cloning operation started are cloned. Tables are cloned, which means their internal stages are also cloned. Internal named stages are not cloned. Regardless of how a stage was cloned, the clone does not include any of the files from the source. i.e. all cloned stages are empty.

You want to delete the snowpipe file loading metadata. You decided to truncate the table to be able to do this. This will remove the metadata from the pipe ​ TRUE ​ FALSE

FALSE Note: Unable to Reload Modified Data, Modified Data Loaded Unintentionally Snowflake uses file loading metadata to prevent reloading the same files (and duplicating data) in a table. Snowpipe prevents loading files with the same name even if they were later modified (i.e. have a different eTag). The file loading metadata is associated with the pipe object rather than the table. As a result: Staged files with the same name as files that were already loaded are ignored, even if they have been modified, e.g. if new rows were added or errors in the file were corrected. Truncating the table using the TRUNCATE TABLE command does not delete the Snowpipe file loading metadata. However, note that pipes only maintain the load history metadata for 14 days.

A user can be defaulted to a role which user does not have access to ​ TRUE ​ FALSE

FALSE Note: You will be able to create a user with a default role to which the user does not have access. However, the user will not be able to log on to snowflake if he/she does not have access to the default role. Hence a user cannot be defaulted to a role which he/she does not have access to

You have just loaded a file named student_data.csv from your snowflake stage to snowflake table. You try to reload again using the COPY INTO command. You will be able to load the file. ​ TRUE ​ FALSE

FALSE Note: By default, COPY INTO skips files that the load metadata shows were already loaded. There is a way to override this: use FORCE=TRUE

Which COPY INTO <table> copy options are not supported by SNOWPIPE? ​ FILES = ( 'file_name1' [ , 'file_name2', ... ] ) ​ ON_ERROR = ABORT_STATEMENT ​ PURGE = TRUE | FALSE ​ FORCE = TRUE | FALSE ​ FILE_FORMAT =( { FORMAT_NAME = '[<namespace>.]<file_format_name>' | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ]

FILES = ( 'file_name1' [ , 'file_name2', ... ] ) ​ ON_ERROR = ABORT_STATEMENT ​ PURGE = TRUE | FALSE ​ FORCE = TRUE | FALSE Note: All COPY INTO <table> copy options are supported except for the following: FILES = ( 'file_name1' [ , 'file_name2', ... ] ) ON_ERROR = ABORT_STATEMENT SIZE_LIMIT = num PURGE = TRUE | FALSE (i.e. automatic purging while loading) MATCH_BY_COLUMN_NAME = CASE_SENSITIVE | CASE_INSENSITIVE | NONE FORCE = TRUE | FALSE Note that you can manually remove files from an internal (i.e. Snowflake) stage (after they've been loaded) using the REMOVE command. RETURN_FAILED_ONLY = TRUE | FALSE VALIDATION_MODE = RETURN_n_ROWS | RETURN_ERRORS | RETURN_ALL_ERRORS Please go through this link thoroughly https://docs.snowflake.com/en/sql-reference/sql/create-pipe.html#usage-notes Please note that PATTERN = 'regex_pattern' copy option is provided as a preview feature . If this is listed as one of the option, it will fall under not supported option.

Select the TABLE function which helps to convert semi-structured data to a relational representation ​ FLATTEN ​ CHECK_JSON ​ PARSE_JSON

FLATTEN
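A sketch of FLATTEN, reusing the hypothetical car_sales table and src variant column from the earlier JSON question; each element of the vehicle array becomes its own row:
SELECT f.value:make::string  AS make,
       f.value:model::string AS model,
       f.value:price::number AS price
FROM car_sales,
     LATERAL FLATTEN(input => src:vehicle) f;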

Which OPTION will you set on the copy if you would like to ignore the load metadata? ​ FORCE ​ LOAD_UNCERTAIN_FILES ​ IGNORE METADATA

FORCE Note: To load files whose metadata has expired, set the LOAD_UNCERTAIN_FILES copy option to true. The copy option references load metadata, if available, to avoid data duplication, but also attempts to load files with expired load metadata. Alternatively, set the FORCE option to load all files, ignoring load metadata if it exists. Note that this option reloads files, potentially duplicating data in a table. https://docs.snowflake.com/en/user-guide/data-load-considerations-load.html#workarounds
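A sketch with hypothetical names; note the duplication risk called out in the note above:
COPY INTO student_table
FROM @my_stage/student_data.csv
FILE_FORMAT = (TYPE = CSV)
FORCE = TRUE;  -- reloads files even if the load metadata says they were already loaded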

What are the two types of data consumer accounts available in snowflake ​ FULL ACCOUNT ​ READER ACCOUNT ​ SHARED ACCOUNT ​ QUERY ONLY ACCOUNT

FULL ACCOUNT ​ READER ACCOUNT

True or False: A customer using SnowSQL / native connectors will be unable to also use the Snowflake Web interface (UI) unless access to the UI is explicitly granted by support.

False

True or False: A table in Snowflake can only be queried using the Virtual Warehouse used to load the data.

False

True or False: SF offers tools to extract data from source systems.

False

True or False: Virtual Warehouses cannot be resized while queries are running.

False

True or False: You can point Snowflake at any S3 bucket to directly query files in that bucket as long as the files are in Parquet or ORC format

False

Integration object is a first-class database object ​ True ​ False

False Note: Integration object is an account level object and not a database object

To improve performance, materialized views can be created on top of external tables without any additional cost ​ TRUE ​ False

False Note: Materialized views are designed to improve query performance for workloads composed of common, repeated query patterns. However, materializing intermediate results incurs additional costs. As such, before creating any materialized views, you should consider whether the costs are offset by the savings from re-using these results frequently enough.

SF is available on premise? (TRUE OR FALSE)

False Note: Snowflake's data warehouse is a true SaaS offering. More specifically: There is no hardware (virtual or physical) for you to select, install, configure, or manage. There is no software for you to install, configure, or manage. Ongoing maintenance, management, and tuning is handled by Snowflake. Snowflake runs completely on cloud infrastructure. All components of Snowflake's service (other than an optional command line client), run in a public cloud infrastructure. Snowflake uses virtual compute instances for its compute needs and a storage service for persistent storage of data. Snowflake cannot be run on private cloud infrastructures (on-premises or hosted). Snowflake is not a packaged software offering that can be installed by a user. Snowflake manages all aspects of software installation and updates. https://docs.snowflake.com/en/user-guide/intro-key-concepts.html

Snowflake has administration settings for resource consumption to help with which of the below activities? ​ Help control costs associated with unexpected warehouse credit usage ​ Manage access to snowflake for specific users ​ Manage availability of the product

Help control costs associated with unexpected warehouse credit usage Note: What is a resource monitor? To help control costs and avoid unexpected credit usage caused by running warehouses, Snowflake provides resource monitors. A virtual warehouse consumes Snowflake credits while it runs. Resource monitors can be used to impose limits on the number of credits that are consumed by: 1. User-managed virtual warehouses 2. Virtual warehouses used by cloud services

Which feature does not come with query profile? ​ Graphical representation of the main components of the processing plan of the query ​ Details and statistics for the overall query ​ Hints for improving the query performance ​ Statistics for each component of the query

Hints for improving the query performance

Which properties of a resource monitor can be modified? ​ Increase or decrease the credit quota for the monitor ​ If the monitor is monitoring your account, convert it to monitor individual warehouses ​ If the monitor is monitoring individual warehouses: 1. Add or remove warehouses from the list. 2. Convert it to monitor your account ​ Customize the schedule (frequency, start timestamp, and end timestamp) for the monitor ​ Add or remove actions, or modify the threshold percentages for existing actions

Increase or decrease the credit quota for the monitor ​ If the monitor is monitoring your account, convert it to monitor individual warehouses ​ If the monitor is monitoring individual warehouses: 1. Add or remove warehouses from the list. 2. Convert it to monitor your account ​ Customize the schedule (frequency, start timestamp, and end timestamp) for the monitor ​ Add or remove actions, or modify the threshold percentages for existing actions

Which of the below are automatically provided by snowflake compared to other databases? ​ Installation and Hardware Configurations ​ Patch releases ​ Physical Security ​ Metadata and Collection statistics documentation

Installation and Hardware Configurations ​ Patch releases ​ Physical Security

Which of the below objects will contribute to your account level storage usage? ​ Secure Views ​ Internal Stage ​ External stage ​ Database and all tables inside it ​ Functions

Internal Stage ​ External stage ​ Database and all tables inside it Note: SECURE VIEWS and FUNCTION definitions are stored in metadata store. They do not store any data physically. One of my students asked a valid question that external stage is a pointer to the cloud provider's storage and hence from a snowflake perspective there is no cost. This is absolutely true. But when this question is presented to you and it is not explicitly asking 'snowflake account', you should tick external stage also. If the question specifically asks about 'snowflake account', then you should not tick this option. This is kind of a trick question so please pay attention to the question text.

Which of the below are considered as best practices while loading data into snowflake? ​ Isolate data loading workload into its own virtual warehouse ​ Split large files into smaller files ​ Compress the source files ​ If format is in CSV, convert them into ORC

Isolate data loading workload into its own virtual warehouse ​ Split large files into smaller files ​ Compress the source files

In your organization, you use a 2XL warehouse to load data into Snowflake as part of your ETL workload. Your BI team reached out to you and wanted access to the data stored in Snowflake. You gave them access to the tables required for them to query the data. What will be your recommendation with respect to the warehouse that they will need to use? ​ Since you already have a 2XL warehouse for loading data, you will let them use that warehouse ​ Warehouse is not required for querying data in snowflake ​ It is recommended to use a multi-warehouse approach for workload isolation in snowflake. Since you would not like the load workload to interfere with the read workload of the BI team, you will create a separate warehouse for them

It is recommended to use a multi-warehouse approach for workload isolation in snowflake. Since you would not like the load workload to interfere with the read workload of the BI team, you will create a separate warehouse for them Note: Snowflake's unique multi-cluster, shared data architecture makes it possible to allocate multiple independent, isolated clusters for processing while sharing the same data. In snowflake, each cluster can both read and write data while maintaining transactional consistency, which is ensured by the cloud services layer. The compute clusters in Snowflake are called virtual warehouses. The size and resources of each virtual warehouse can be chosen independently by the user based on the characteristics and performance requirements for the workload(s) that will run on each virtual warehouse. It is, therefore, recommended to isolate workloads and have each workload use its own warehouse. It helps to track usage better, and it also ensures that one workload does not interfere with another. Hence you will create a separate warehouse for the BI team
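A minimal sketch of this setup, assuming a hypothetical BI_WH warehouse and BI_ROLE role (names are illustrative, not part of the question):

CREATE WAREHOUSE IF NOT EXISTS bi_wh
  WAREHOUSE_SIZE = 'MEDIUM'  -- sized independently of the 2XL load warehouse
  AUTO_SUSPEND = 300         -- suspend after 5 minutes of inactivity
  AUTO_RESUME = TRUE;

GRANT USAGE ON WAREHOUSE bi_wh TO ROLE bi_role;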

What is the benefit of client-side encryption in snowflake? ​ It provides a secure system for managing data in cloud storage ​ It helps to decrypt data faster at a later period ​ It reduces the cost of encryption by the cloud provider

It provides a secure system for managing data in cloud storage

Select the ones that are true for snowflake kafka connector ​ Kafka connector guarantees exactly-once delivery ​ Kafka connector guarantees that rows are inserted in the order ​ Kafka connector guarantees reprocessing of messages

Kafka connector guarantees exactly-once delivery

What does snowflake use for monitoring network traffic and user activity? ​ Lacework ​ Sumo logic ​ Threat Stack

Lacework

You are a snowflake architect and you are investigating a long running query to optimize the query. You want to know if the query retrieved data from long-term centralized storage. If you are looking at the Query History table in the history area of the WebUI, where will you look for this information? ​ Look in the Bytes Scanned Column for a green bar ​ Look in the Bytes scanned column for the acronym LTCS ​ Look in the Rows column for a red bar ​ Look in the Rows column for the acronym LTCS

Look in the Bytes Scanned Column for a green bar

A stream stores data in the same shape as the source table (i.e. the same column names and ordering). What additional columns does it store? ​ METADATA$ACTION ​ METADATA$ISUPDATE ​ METADATA$ROW_ID ​ METADATA$STREAM_ID

METADATA$ACTION ​ METADATA$ISUPDATE ​ METADATA$ROW_ID
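As a quick illustration, assuming a hypothetical ORDERS table, the metadata columns can be selected directly from a stream:

CREATE OR REPLACE STREAM orders_stream ON TABLE orders;

-- Returns the change records along with the stream metadata columns
SELECT METADATA$ACTION, METADATA$ISUPDATE, METADATA$ROW_ID
FROM orders_stream;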

Snowflake supports multi-factor authentication (MFA) to provide increased login security for users connecting to Snowflake. Which statements are true about MFA security? ​ MFA is not enabled by default. It is available to all, but users need to activate it ​ MFA is an integrated feature powered by the DUO Security Service ​ MFA can be used to connect to Snowflake using the Snowflake JDBC Driver ​ MFA login is designed to connect to snowflake only through the web interface

MFA is not enabled by default. It is available to all, but users need to activate it ​ MFA is an integrated feature powered by the DUO Security Service ​ MFA can be used to connect to Snowflake using the Snowflake JDBC Driver

Which access control privileges must be granted on specified resource monitors for roles other than account administrators to view and modify resource monitors as needed ​ MONITOR ​ CREATE ​ MODIFY

MONITOR ​ MODIFY

This property of resource monitor specifies whether the resource monitor is used to monitor the credit usage for your entire Account (i.e. all warehouses in the account) or a specific set of individual warehouses ​ CREDIT QUOTA ​ MONITOR LEVEL ​ SCHEDULE

MONITOR LEVEL Note: Monitor Level This property specifies whether the resource monitor is used to monitor the credit usage for your entire Account (i.e. all warehouses in the account) or a specific set of individual warehouses. If this property is not set, the resource monitor doesn't monitor any credit usage. It simply remains dormant. https://docs.snowflake.com/en/user-guide/resource-monitors.html#monitor-level

Snowflake provides specific administration features and capabilities to support the following activities except? ​ Manage databases and warehouses within snowflake account ​ Manage roles and users within a snowflake account ​ Monitor usage and manage resources to control costs in a snowflake account ​ Manage 3rd party applications providing data to a snowflake account

Manage 3rd party applications providing data to a snowflake account

Snowflake provides specific administration features and capabilities to support the following activities ​ Managing database and warehouses within a snowflake account ​ Managing roles and users within a snowflake account ​ Monitor usage and manage resources to control costs in a snowflake account ​ Manage 3rd party applications providing data to snowflake

Managing database and warehouses within a snowflake account ​ Managing roles and users within a snowflake account ​ Monitor usage and manage resources to control costs in a snowflake account

What are the two modes in which a multi-cluster warehouse can run? ​ Static ​ Dynamic ​ Maximized ​ Auto-scale ​ None of the above

Maximized ​ Auto-scale Note: Maximized This mode is enabled by specifying the same value for both maximum and minimum clusters (note that the specified value must be larger than 1). In this mode, when the warehouse is started, Snowflake starts all the clusters so that maximum resources are available while the warehouse is running. This mode is effective for statically controlling the available resources (i.e. servers), particularly if you have large numbers of concurrent user sessions and/or queries and the numbers do not fluctuate significantly. Auto-scale This mode is enabled by specifying different values for maximum and minimum clusters. In this mode, Snowflake starts and stops clusters as needed to dynamically manage the load on the warehouse: As the number of concurrent user sessions and/or queries for the warehouse increases, and queries start to queue due to insufficient resources, Snowflake automatically starts additional clusters, up to the maximum number defined for the warehouse. Similarly, as the load on the warehouse decreases, Snowflake automatically shuts down clusters to reduce the number of running servers and, correspondingly, the number of credits used by the warehouse. To help control the usage of credits in Auto-scale mode, Snowflake provides a property, SCALING_POLICY, that determines the scaling policy to use when automatically starting or shutting down additional clusters. For more information, see Setting the Scaling Policy for a Multi-cluster Warehouse (in this topic). https://docs.snowflake.com/en/user-guide/warehouses-multicluster.html#maximized-vs-auto-scale
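A hedged sketch of both modes (multi-cluster warehouses require Enterprise edition or higher; warehouse names are illustrative):

-- Maximized mode: minimum = maximum (and greater than 1); all clusters start together
CREATE WAREHOUSE max_wh
  MIN_CLUSTER_COUNT = 3
  MAX_CLUSTER_COUNT = 3;

-- Auto-scale mode: minimum < maximum; clusters start and stop with the load
CREATE WAREHOUSE scale_wh
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 5
  SCALING_POLICY = 'STANDARD';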

When configuring a Warehouse using a Snowflake edition that has Elastic Data Warehousing enabled, what facets or components will you need to configure that are not needed in accounts where Elastic Data Warehousing is not enabled ​ Minimum and Maximum Clusters ​ Minimum and Maximum Servers ​ Scaling Policy ​ Auto-Suspend ​ Auto-Resume

Minimum and Maximum Clusters ​ Scaling Policy Note: Snowflake Standard edition does not support multi-cluster warehouses, so there is no need to configure the min and max clusters or the scaling policy.

Select all characteristics of snowflake's multi-cluster environment ​ Multiple virtual warehouses in a deployment ​ Individual warehouses scale out/in based on query activity ​ Multi-cluster warehouses support the same properties and actions as single-cluster warehouse ​ User must specify which cluster each query will use

Multiple virtual warehouses in a deployment ​ Individual warehouses scale out/in based on query activity ​ Multi-cluster warehouses support the same properties and actions as single-cluster warehouse Note: Multiple virtual warehouses in a deployment means specifying more than one warehouse while creating a multi-cluster warehouse. In a multi-cluster warehouse, you can specify a maximum number of server clusters (or virtual warehouses) greater than 1 and up to 10, and also a minimum number of clusters (equal to or less than the maximum). Additionally, multi-cluster warehouses support all the same properties and actions as single-cluster warehouses, including: 1. Specifying a warehouse size. 2. Resizing a warehouse at any time. 3. Auto-suspending a running warehouse due to inactivity; note that this does not apply to individual clusters, but rather the entire warehouse. 4. Auto-resuming a suspended warehouse when new queries are submitted. In a multi-cluster warehouse, individual clusters can scale out or in based on query load, which is not possible in a single-cluster warehouse. However, as mentioned above, it supports the same properties and actions as a single-cluster warehouse

In load metadata, Snowflake maintains detailed metadata for each table into which data is loaded. What else is stored in this metadata ​ Name of each file from which data was loaded ​ File size ​ Number of columns in the file ​ ETag for the file ​ Number of rows parsed in the file ​ Timestamp of the last load for the file ​ Information about any errors encountered in the file during loading

Name of each file from which data was loaded ​ File size ​ ETag for the file ​ Number of rows parsed in the file ​ Timestamp of the last load for the file ​ Information about any errors encountered in the file during loading Note: https://docs.snowflake.com/en/user-guide/data-load-considerations-load.html#load-metadata

PUT command in snowflake can upload files from a local folder/directory of a client machines to the below locations ​ Named internal stage ​ Internal stage for a specified table ​ Internal stage for the current user ​ External Stage

Named internal stage ​ Internal stage for a specified table ​ Internal stage for the current user
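For illustration, assuming a local file /tmp/sales.csv and a table MYTABLE (both hypothetical), the three supported targets look like this:

PUT file:///tmp/sales.csv @my_int_stage;  -- named internal stage
PUT file:///tmp/sales.csv @%mytable;      -- table stage for MYTABLE
PUT file:///tmp/sales.csv @~;             -- user stage for the current user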

Which of the following best describes SF's processing engine? ​ Leverages Apache Spark ​ Based on Map Reduce ​ A derivative of Presto ​ Native SQL

Native SQL

Which of the following best describes Snowflake's processing engine ​ EMR(Elastic Map Reduce) ​ Spark Engine ​ Presto ​ Native SQL Database engine

Native SQL Database engine Note: https://docs.snowflake.com/en/user-guide/intro-key-concepts.html#key-concepts-architecture

Which of the following is not a characteristic of micro-partitions in snowflake? ​ New partitions are created in logical properties ​ Avoids skews between partitions ​ Partitioning is automatically completed on the natural ingestion order of the data ​ The natural ingestion order maintains correlations between columns which could be useful for pruning

New partitions are created in logical properties

create or replace table result_scan_table_1 like result_scan_table; Will the above query cost compute credits? ​ No, since it is a metadata operation only ​ Yes, it will need compute as the table structure is also getting created ​ It is hard to say without looking at query profile

No, since it is a metadata operation only Note: Why do you think it will not cost any compute credits? The reason is that the LIKE clause only creates the definition of the table (i.e. the structure of the table) in the metadata repository. Metadata operations do not require any compute credits.

What will the below query return SELECT TOP 10 GRADES FROM STUDENT; ​ The top 10 highest grades ​ The 10 lowest grades ​ Non-deterministic list of 10 grades

Non-deterministic list of 10 grades

Resource monitors support the below actions ​ Notify & Suspend ​ Notify & Suspend Immediately ​ Notify ​ Notify and automatically call the account administrator

Notify & Suspend ​ Notify & Suspend Immediately ​ Notify
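A minimal sketch combining the three actions, assuming a hypothetical MONTHLY_LIMIT monitor and MY_WH warehouse:

CREATE RESOURCE MONITOR monthly_limit
  WITH CREDIT_QUOTA = 100
       FREQUENCY = MONTHLY
       START_TIMESTAMP = IMMEDIATELY
       TRIGGERS ON 75 PERCENT DO NOTIFY
                ON 95 PERCENT DO SUSPEND
                ON 100 PERCENT DO SUSPEND_IMMEDIATE;

ALTER WAREHOUSE my_wh SET RESOURCE_MONITOR = monthly_limit;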

You are sizing a snowflake warehouse, what factors would you consider? ​ Number of users ​ Number of concurrent queries ​ Number of tables being queried ​ Data size and composition

Number of concurrent queries ​ Number of tables being queried ​ Data size and composition

When you size a warehouse, which of the below factors are not required to be considered ​ Number of users ​ Number of concurrent queries ​ Number of tables queried ​ Data size and composition ​ All of the above

Number of users

If you plan to regularly unload similarly-formatted data, it is recommended to use named file formats ​(TRUE OR ​FALSE)

TRUE
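A sketch of the idea, assuming a hypothetical pipe-delimited, gzip-compressed CSV unload (all names are illustrative):

CREATE FILE FORMAT my_unload_format
  TYPE = 'CSV'
  FIELD_DELIMITER = '|'
  COMPRESSION = GZIP;

COPY INTO @my_unload_stage/out/
FROM mytable
FILE_FORMAT = (FORMAT_NAME = 'my_unload_format');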

If you clone a permanent table (bar) into a transient table (foo) using the below command create transient table foo clone bar copy grants; What will happen to the partitions? ​ Old partitions will not be affected, but new partitions added to the clone will follow the transient lifecycle ​ All the partitions will be affected ​ Only the old partitions will be affected

Old partitions will not be affected, but new partitions added to the clone will follow the transient lifecycle

The Cloning feature in Snowflake requires less storage because ​ Only metadata is copied; no physical data is copied ​ The cloned data is constantly updated to synchronize with original table data ​ Data is replicated to ensure integrity of data in the original table data

Only metadata is copied; no physical data is copied There is no data copy when a clone is created. It's just one more object that gets created, pointing to the same underlying data as the original table. Only when changes are made to the clone does it start incurring storage costs.

The following factors affect data load rates: A. Physical location of the stage B. RAM on the virtual warehouse C. Gzip compression efficiency D. Thread size

Physical location of the stage ​ Gzip compression efficiency

A Snowflake mechanism that is used to limit the number of micro-partitions scanned by a query is called ________?

Pruning

If you recreate a storage integration after it has been linked to one or more stages, you must reestablish the association between each stage by running the ALTER STAGE command. ​ (TRUE OR ​FALSE)

TRUE
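For example, assuming a hypothetical stage MY_EXT_STAGE and a recreated integration MY_S3_INT, the association is reestablished with:

ALTER STAGE my_ext_stage SET STORAGE_INTEGRATION = my_s3_int;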

Query execution is performed by which layers of Snowflake architecture ​ Storage Layer ​ Query Processing Layer ​ Cloud Services Layer

Query Processing Layer Query execution is performed using virtual warehouses; this layer is also called the compute layer or query processing layer.

Which layer in the snowflake architecture contains virtual warehouse ​ Cloud services ​ Query processing ​ Database Storage ​ None

Query processing

How do you remove files from a stage? ​ REMOVE ​ DELETE ​ PURGE ​ TRUNCATE

REMOVE Note: Removes files that have been staged (i.e. uploaded from a local file system or unloaded from a table) in one of the following Snowflake internal stages: Named internal stage. Stage for a specified table. Stage for the current user. Note that using the command to remove files from an external stage might work but is not officially supported. REMOVE can be abbreviated to RM. https://docs.snowflake.com/en/sql-reference/sql/remove.html#remove
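For example (stage names and the pattern are illustrative):

REMOVE @my_int_stage/path/;                  -- remove everything under a path
RM @%mytable PATTERN='.*sales.*[.]csv[.]gz'; -- abbreviated form with a regex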

Which is not a characteristic of time travel? ​ Protection from accidental data operations ​ Previous versions of data automatically retained ​ Retention period selected by customers (up to 90 days for Enterprise edition) ​ Recover data with the cost of running backups

Recover data with the cost of running backups

Which of the below activities can be done leveraging time-travel feature of snowflake? ​ Restoring data-related objects (tables, schemas, and databases) that may have been accidentally or intentionally deleted ​ Duplicating and backing up data from key points in the past ​ Analyzing data usage/manipulation over specified periods of time ​ To backup data for compliance and regulatory requirements

Restoring data-related objects (tables, schemas, and databases) that may have been accidentally or intentionally deleted ​ Duplicating and backing up data from key points in the past ​ Analyzing data usage/manipulation over specified periods of time

Which cache runs on a 24 hour clock? ​ Result Cache ​ Metadata cache ​ Warehouse cache

Result Cache Note: The warehouse cache is dropped when the warehouse goes offline; the result cache runs on a 24-hour clock; the metadata cache is always changing as tables change

You want to update the value of a specific column in a table. But the update is not based on any specific condition. You just need to ensure that 80% of the rows in the table are updated with the new value. Which snowflake feature will you use? ​ SAMPLE ​ CONCAT ​ AVERAGE ​ MAX

SAMPLE
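A hedged sketch of one way to do this, assuming a hypothetical MYTABLE with an ID key column (SAMPLE (80) returns roughly 80% of the rows):

UPDATE mytable
SET mycolumn = 'new_value'
WHERE id IN (SELECT id FROM mytable SAMPLE (80));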

If you want to clone a table, your role must have which access on the Table ​ SELECT ​ USAGE ​ READ ​ CLONE

SELECT Note: Tables - SELECT Pipes, Streams, Tasks - OWNERSHIP Other objects - USAGE

You have a Snowflake table which is defined as below CREATE OR REPLACE TABLE FRUITS(FRUIT_NUMBER NUMBER, FRUIT_DESCRIPTION VARCHAR, AVAILABILITY VARCHAR); If you would like to convert the fruit_number column to be a decimal with a certain precision and scale, which command will you run? ​ SELECT FRUIT_NUMBER::DECIMAL(10,5) FROM FRUITS; ​ SELECT FRUIT_NUMBER(DECIMAL(10,5)) FROM FRUITS; ​ SELECT FRUIT_NUMBER AS DECIMAL(10,5) FROM FRUITS; ​ SELECT FRUIT_NUMBER.DECIMAL(10,5) FROM FRUITS;

SELECT FRUIT_NUMBER::DECIMAL(10,5) FROM FRUITS;

Which of the below operations are allowed on an inbound share data? ​ MERGE ​ CREATE TABLE ​ ALTER TABLE ​ SELECT WITH GROUP BY ​ SELECT WITH JOIN ​ INSERT INTO

SELECT WITH GROUP BY ​ SELECT WITH JOIN

Output of a FLATTEN query has the below columns. Select three. ​ SEQ ​ PATH ​ VALUE ​ RECORD_METADATA

SEQ ​ PATH ​ VALUE

Choose the continuous data loading options from below ​ SNOWPIPE ​ SNOWFLAKE CONNECTOR FOR KAFKA ​ Third-party data integration tools ​ BULK COPY

SNOWPIPE ​ SNOWFLAKE CONNECTOR FOR KAFKA ​ Third-party data integration tools

Snowflake supports the following performance optimization techniques ​ SSD CACHING/DATA CACHING ​ QUERY RESULT CACHING ​ B-TREE INDEXES

SSD CACHING/DATA CACHING ​ QUERY RESULT CACHING Note: https://community.snowflake.com/s/article/Caching-in-Snowflake-Data-Warehouse

All of the below are valid executionState of a snowpipe except: ​ RUNNING ​ STOPPED_FEATURE_DISABLED ​ STALLED_EXECUTION_ERROR ​ PAUSED ​ STOPPED

STOPPED Note: Current execution state of the pipe; could be any one of the following: RUNNING (i.e. everything is normal; Snowflake may or may not be actively processing files for this pipe) STOPPED_FEATURE_DISABLED STOPPED_STAGE_DROPPED STOPPED_FILE_FORMAT_DROPPED STOPPED_MISSING_PIPE STOPPED_MISSING_TABLE STALLED_COMPILATION_ERROR STALLED_INITIALIZATION_ERROR STALLED_EXECUTION_ERROR STALLED_INTERNAL_ERROR PAUSED PAUSED_BY_SNOWFLAKE_ADMIN PAUSED_BY_ACCOUNT_ADMIN

Snowflake adds the system defined roles to each new account. Check the system defined roles below. ​ SYSADMIN ​ ACCOUNTADMIN ​ SECURITYADMIN ​ PUBLIC ​ USERADMIN ​ SUPERUSER

SYSADMIN ​ ACCOUNTADMIN ​ SECURITYADMIN ​ PUBLIC ​ USERADMIN

To help control the usage of credits in Auto-scale mode, Snowflake provides a property, _________________, that determines the scaling policy to use when automatically starting or shutting down additional clusters. Select the property ​ Auto Scale ​ Scaling Policy ​ Max no of cluster ​ Min no of cluster

Scaling Policy Note: To help control the credits consumed by a multi-cluster warehouse running in Auto-scale mode, Snowflake provides scaling policies, which are used to determine when to start or shut down a cluster. The scaling policy for a multi-cluster warehouse only applies if it is running in Auto-scale mode. In Maximized mode, all clusters run concurrently so there is no need to start or shut down individual clusters. https://docs.snowflake.com/en/user-guide/warehouses-multicluster.html#setting-the-scaling-policy-for-a-multi-cluster-warehouse
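For example, to switch an existing multi-cluster warehouse to the credit-conserving policy (the warehouse name is illustrative):

ALTER WAREHOUSE my_wh SET SCALING_POLICY = 'ECONOMY';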

An entity to which access can be granted. Unless allowed by a grant, access will be denied. in snowflake terms, what is this called? ​ Securable object ​ Role ​ Privilege ​ User

Securable object Note: Securable object: An entity to which access can be granted. Unless allowed by a grant, access will be denied. Role: An entity to which privileges can be granted. Roles are in turn assigned to users. Note that roles can also be assigned to other roles, creating a role hierarchy. Privilege: A defined level of access to an object. Multiple distinct privileges may be used to control the granularity of access granted. User: A user identity recognized by Snowflake, whether associated with a person or program. https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#access-control-framework

All security related information is stored in which layer of snowflake architecture ​ Service ​ Storage ​ Compute ​ All of the above

Service Note: Service(or Cloud service) layer is the one which stores all security related information The services included in this layer are 1. Authentication 2. Infrastructure management 3. Metadata management 4. Query parsing and optimization 5. Access control

You are speaking to a CTO of an organization and would like to define the architecture of snowflake to him in two words. What will you use? ​ Shared Disk ​ Shared Nothing ​ Shared Data ​ Shared Memory

Shared Data Note: Snowflake's architecture is a hybrid of traditional shared-disk database architectures and shared-nothing database architectures. Similar to shared-disk architectures, Snowflake uses a central data repository for persisted data that is accessible from all compute nodes in the data warehouse. But similar to shared-nothing architectures, Snowflake processes queries using MPP (massively parallel processing) compute clusters where each node in the cluster stores a portion of the entire data set locally. This approach offers the data management simplicity of a shared-disk architecture, but with the performance and scale-out benefits of a shared-nothing architecture. https://docs.snowflake.com/en/user-guide/intro-key-concepts.html

Select the statements that are true for a shared database ​ Shared databases are read-only ​ A clone can be created from a shared database ​ Time travel can be done on a shared database ​ Can edit the comments for a shared database

Shared databases are read-only Note: https://docs.snowflake.com/en/user-guide/data-share-consumers.html#general-limitations-for-shared-databases

Select the statements that are true for a data share ​ Shares are named Snowflake objects ​ Share consists of privileges that grant access to the database(s) and the schema containing the objects to share ​ Share consists of privileges that grant access to the specific objects in the database ​ Share consists of consumer accounts with which the database and its objects are shared

Shares are named Snowflake objects ​ Share consists of privileges that grant access to the database(s) and the schema containing the objects to share ​ Share consists of privileges that grant access to the specific objects in the database ​ Share consists of consumer accounts with which the database and its objects are shared Note: https://docs.snowflake.com/en/user-guide/data-sharing-intro.html#what-is-a-share

Which factors influence the unit cost of snowflake credits and data storage? ​ Snowflake Edition ​ The region where your Snowflake account is located ​ Whether it is an On Demand or Capacity account ​ Number of users of snowflake

Snowflake Edition ​ The region where your Snowflake account is located ​ Whether it is an On Demand or Capacity account Note: The Snowflake Edition that your organization chooses determines the unit costs for the credits and the data storage you use. Other factors that impact unit costs are the region where your Snowflake account is located and whether it is an On Demand or Capacity account: 1. On Demand: Usage-based pricing with no long-term licensing requirements. 2. Capacity: Discounted pricing based on an up-front Capacity commitment.

Mark all the true statements ​ Snowflake Stages can be defined as either External or Internal. ​ Snowflake Stages can be defined as either FTP or SFTP. ​ External Stages require a cloud storage provider. ​ FTP Stages require three cloud storage providers. ​ Cloud storage providers must be linked to internal stage objects.

Snowflake Stages can be defined as either External or Internal. ​ External Stages require a cloud storage provider.

With default settings, how long will a query run on snowflake ​ Snowflake will cancel the query if it runs more than 48 hours ​ Snowflake will cancel the query if it runs more than 24 hours ​ Snowflake will cancel the query if the warehouse runs out of memory ​ Snowflake will cancel the query if the warehouse runs out of memory and hard disk storage

Snowflake will cancel the query if it runs more than 48 hours

Select the limitations with materialized views(Select 3) ​ Snowflake's "Time Travel" feature is not supported on materialized views ​ using CURRENT_TIME or CURRENT_TIMESTAMP is not permitted ​ A materialized view cannot include Window functions ​ Clustering keys cannot be defined in Materialized views

Snowflake's "Time Travel" feature is not supported on materialized views ​ using CURRENT_TIME or CURRENT_TIMESTAMP is not permitted ​ A materialized view cannot include Window functions

What is a point-in-time snapshot of data that can be updated by users called? ​ Snowflake's cloning ​ Time travel ​ Fail safe

Snowflake's cloning

Which of the following is not a snowpipe feature? ​ Snowpipe can load data from any internal or external stage ​ It is a server-less compute model ​ The service provides REST end points and uses snowflake provided compute resources to load the data and retrieve history reports ​ Snowpipe loads data after it is in stage and the user executes the LOADDATA command

Snowpipe loads data after it is in stage and the user executes the LOADDATA command Note: There is no LOADDATA command; data is automatically loaded by snowpipe after the files are added to the stage https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro.html#introduction-to-snowpipe

What scaling policies are supported by snowflake? ​ Standard ​ Economy ​ Premium

Standard ​ Economy Note: Standard (default) Prevents/minimizes queuing by favoring starting additional clusters over conserving credits. Economy Conserves credits by favoring keeping running clusters fully-loaded rather than starting additional clusters, which may result in queries being queued and taking longer to complete. https://docs.snowflake.com/en/user-guide/warehouses-multicluster.html#setting-the-scaling-policy-for-a-multi-cluster-warehouse

What are the two types of snowflake data listings which define how data is shared and consumed via data marketplace? ​ Standard Data Listings ​ Personalized Data Listings ​ Business Critical Data Listings ​ Enterprise Data Listings

Standard Data Listings ​ Personalized Data Listings Note: The Snowflake Data Marketplace offers two types of data listings which define how data is shared and consumed. Once published, both types of listings are displayed to consumers in the selected regions. The key difference between standard and personalized listings is the ability to access the data share. Personalized listings allow you to control which consumers can access the data. Standard Data Listings A standard listing provides instant access to a published data set. This type of listing is best for providing generic, aggregated, or non-customer-specific data. Each listing includes details about the shared data, sample queries, and information about the data provider. To access data from a standard data listing, consumers click Get, opening a prompt where they name the database for Snowflake users, permission access among Snowflake users, and agree to the provider's terms of use and Snowflake's consumer terms. Clicking Create Database immediately creates the database inside their Snowflake account in the Databases tab. Only users that have a role with the IMPORT SHARE privilege can create databases from the Snowflake Data Marketplace. Personalized Data Listings A personalized listing allows customers to request specific data sets. This can be premium data that a provider charges for or data that is specific to each consumer. Each listing includes details about the shared data, sample usage examples, and information about the data provider. To access data from a personalized listing, consumers must submit a request by clicking the Request button, and provide their contact information. Once a request is submitted, the data provider is notified. The provider then contacts the consumer. Each data provider can have different commercial terms. Once those are agreed to, then the personalized data set is created and shared with the consumer.

Which is the preferred way to store a generated identity and access management (IAM) entity for external cloud storage ​ Access Token ​ Stage ​ Storage integration ​ File Format

Storage integration
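A sketch under assumed AWS placeholders (the role ARN and bucket are hypothetical):

CREATE STORAGE INTEGRATION s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::001234567890:role/my_snowflake_role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://mybucket/path/');

-- A single integration like this can back multiple external stages:
CREATE STAGE my_ext_stage
  URL = 's3://mybucket/path/'
  STORAGE_INTEGRATION = s3_int;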

Which of the below columns are usually a good choice for clustering key. ​ UUID column in a Customer fact table ​ Gender with Male/Female values in a multi-terabyte table ​ Timestamp column in a 10TB order fact table ​ Store_Id in a 2 TB Item Store Sales fact table

Store_Id in a 2 TB Item Store Sales fact table Note: Remember that a cluster key should have neither a very high cardinality nor a very low cardinality. If it has high cardinality, a lot of partitions will be created; with low cardinality, very few partitions will be created, and query performance will not be able to take advantage of partition pruning. We need to take a middle ground.

Select two true statements related to streams ​ Stream itself does not contain any table data ​ A stream only stores the offset for the source table ​ The hidden columns used by a stream do not consume any storage

Stream itself does not contain any table data ​ A stream only stores the offset for the source table

Select two options which are true about variant data type in snowflake? ​ Supports querying using JSON path notation ​ Optimizes storage based on repeated elements ​ A separate file format is used to store variant data ​ Requires custom mapping for each type of record

Supports querying using JSON path notation ​ Optimizes storage based on repeated elements

When credit usage has hit the resource monitor's limit or is about to hit it, what can the resource monitor do? ​ Suspend the warehouse ​ Send alert notifications ​ Kill the query that is running

Suspend the warehouse ​ Send alert notifications ​ Kill the query that is running

The FLATTEN command in snowflake has two versions. One version uses a join and the other version uses an object keyword. Please select the two words that represent the options used with the command ​ OBJECT_CONSTRUCT ​ TABLE ​ TRY_CAST ​ LATERAL

TABLE ​ LATERAL
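Both forms sketched below; table and column names are illustrative:

-- TABLE keyword form:
SELECT f.value
FROM TABLE(FLATTEN(INPUT => PARSE_JSON('[1,2,3]'))) f;

-- LATERAL join form:
SELECT t.id, f.value
FROM mytable t,
LATERAL FLATTEN(INPUT => t.json_col) f;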

Snowflake has three types of stages: USER, TABLE, NAMED. Named stages are again divided into External and Internal. Which of the below two stages are automatically created and do not need explicit configuration by the user ​ TABLE ​ EXTERNAL ​ INTERNAL ​ USER

TABLE ​ USER

Select the table type that is automatically deleted after the session is closed and hence it has no fail-safe or time travel option post closure of the session ​ TRANSIENT ​ TEMPORARY ​ PERMANENT ​ EXTERNAL

TEMPORARY Note: Temporary tables are valid for a session; transient tables are the same as permanent tables except that they do not have a fail-safe period

A simple tree of tasks is limited to a maximum of 1000 tasks total ​ TRUE ​ FALSE

TRUE

A simple tree of tasks is limited to a maximum of 1000 tasks total (including the root task) in a resumed state ​ (TRUE OR FALSE)

TRUE

A single storage integration can support multiple external stages (TRUE OR FALSE)

TRUE

A stream object records the delta of change data capture (CDC) information for a table (such as a staging table), including inserts and other data manipulation language (DML) changes. ​ TRUE ​ FALSE

TRUE

A warehouse can be assigned to only a single resource monitor. ​ TRUE ​ FALSE

TRUE

ACCOUNTADMIN ROLE encapsulates the SYSADMIN and SECURITYADMIN system-defined roles and ACCOUNTADMIN sits on top of the role hierarchy ​ TRUE ​ FALSE

TRUE

According to snowflake architecture, data storage is independent of compute ​ TRUE ​ FALSE

TRUE

All virtual warehouses in snowflake has access to all data ​ TRUE ​ FALSE

TRUE

Automatic reclustering in snowflake is triggered only if/when the table would benefit from the operation (TRUE OR FALSE)

TRUE

Before any PHI data can be stored in Snowflake, a signed business associate agreement (BAA) must be in place between your agency/organization and Snowflake Inc ​ TRUE ​ FALSE

TRUE

COPY and INSERT operations in snowflake are non-blocking ​ TRUE ​ FALSE

TRUE

COPY transformations are supported only for named stages (both internal and external) and user stages ​ TRUE ​ FALSE

TRUE

Clustering is generally most cost-effective for tables that are queried frequently and do not change frequently ​ (TRUE OR ​FALSE)

TRUE

Each time a persisted result for a query is reused, Snowflake resets the 24-hour retention period for the result, up to a maximum of 31 days from the date and time that the query was first executed. After 31 days, the result is purged and the next time the query is submitted, a new result is generated and persisted. (TRUE OR FALSE)

TRUE

Each time a warehouse is started or resized to a larger size, the warehouse is billed for 1 minute's worth of usage. After 1 minute, all subsequent billing is per-second ​ TRUE ​ FALSE

TRUE

Failsafe period for temporary and transient table is zero (TRUE OR ​FALSE)

TRUE

Federated authentication in snowflake is compliant with SAML 2.0 ​ TRUE ​ FALSE

TRUE

Files that are already copied from the stage to the source table can be loaded again into a table cloned from the source table (TRUE​ OR FALSE)

TRUE

GET does not support downloading files from external stages (TRUE OR FALSE)

TRUE

If multiple instances of the kafka connector are started on the same topic or partitions, duplicate records may flow into the snowflake table ​ TRUE ​ FALSE

TRUE

If query performance degrades over time, the table is likely no longer well-clustered and may benefit from clustering ​ (TRUE OR ​FALSE)

TRUE

If you are on Snowflake Enterprise Edition (and higher), the time travel retention period can be set to any value from 0 up to 90 days for databases, schemas, and tables ​ TRUE ​ FALSE

TRUE

Metadata columns for staged files can only be queried by name (TRUE OR FALSE)

TRUE Note: Metadata cannot be inserted into existing table rows. Metadata columns can only be queried by name; as such, they are not included in the output of any of the following statements: SELECT * SHOW <objects> DESCRIBE <object> Queries on INFORMATION_SCHEMA views https://docs.snowflake.com/en/user-guide/querying-metadata.html#query-limitations
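For example, assuming a hypothetical internal stage MY_INT_STAGE and a named file format MYFORMAT:

SELECT METADATA$FILENAME, METADATA$FILE_ROW_NUMBER, t.$1, t.$2
FROM @my_int_stage (FILE_FORMAT => 'myformat') t;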

A table stream(also known as stream) itself does not contain any table data (TRUE OR FALSE)

TRUE Note: Note that a stream itself does not contain any table data. A stream only stores the offset for the source table and returns CDC records by leveraging the versioning history for the source table. When the first stream for a table is created, a pair of hidden columns are added to the source table and begin storing change tracking metadata. These columns consume a small amount of storage. The CDC records returned when querying a stream rely on a combination of the offset stored in the stream and the change tracking metadata stored in the table. https://docs.snowflake.com/en/user-guide/streams.html#overview-of-table-streams

You are loading data to a snowflake internal stage area using the PUT Command, the data will be encrypted on the client's machine and you will not be able to turn off the encryption. (TRUE OR FALSE)

TRUE Note: One of the biggest worries people have about moving to the cloud is security. One key piece of providing enterprise class security is the ability to encrypt the data in your data warehouse environment. With Snowflake, your data is automatically encrypted by default. https://www.snowflake.com/blog/automatic-encryption-data/

Snowflake does not support nested transactions (TRUE OR FALSE)

TRUE Note: Overlapping Transactions This section describes overlapping transactions. A stored procedure that contains a transaction can be called from within another transaction. The outer transaction can be in an outer stored procedure or can be outside any stored procedure. The inner transaction is not treated as nested; instead, the inner transaction is a separate transaction. Snowflake calls these "autonomous scoped transactions" (or simply "scoped transactions"), because each transaction executes in a conceptually independent scope. Note Terminology note: The terms "inner" and "outer" are commonly used when describing nested operations, such as nested stored procedure calls. Although Snowflake supports nested procedure calls, Snowflake does not support nested transactions; https://docs.snowflake.com/en/sql-reference/transactions.html#overlapping-transactions

Snowflake waits till all the servers are provisioned for a new virtual warehouse, before it executes query on that warehouse (TRUE​ OR FALSE)

TRUE Note: Snowflake does not begin executing SQL statements submitted to a warehouse until all of the servers for the warehouse are successfully provisioned, unless any of the servers fail to provision: 1. If any of the servers for the warehouse fail to provision during start-up, Snowflake attempts to repair the failed server(s). 2. During the repair process, the warehouse starts processing SQL statements once 50% or more of the requested servers are successfully provisioned.

Streams support repeatable read isolation (TRUE OR ​FALSE)

TRUE Note: Streams support repeatable read isolation. In repeatable read mode, multiple SQL statements within a transaction see the same set of records in a stream. This differs from the read committed mode supported for tables, in which statements see any changes made by previous statements executed within the same transaction, even though those changes are not yet committed. https://docs.snowflake.com/en/user-guide/streams.html#repeatable-read-isolation

When you load data into snowflake, snowflake reorganizes the data into its internal optimized compressed columnar format ​ TRUE ​ FALSE

TRUE Note: When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format. Snowflake stores this optimized data in cloud storage. Snowflake manages all aspects of how this data is stored — the organization, file size, structure, compression, metadata, statistics, and other aspects of data storage are handled by Snowflake. The data objects stored by Snowflake are not directly visible nor accessible by customers; they are only accessible through SQL query operations run using Snowflake.

The warehouse cache may be reset if a running warehouse is suspended and then resumes ​ TRUE ​ FALSE

TRUE Note: You do not need to remember this, just understand the concept. Warehouses are nothing but VM instances underneath, and the cache is nothing but SSD cache attached to the instances. If the warehouse is suspended and resumed again, it may not get the same instances.

You have many queries that need to run at the same time on a warehouse and you need high concurrency when they execute. You will go for a multi-cluster warehouse in this case. ​ TRUE ​ FALSE

TRUE Note: 1. Multi Cluster warehouse can scale compute resources 2. The scaling is based on concurrent users and queries Multi-cluster warehouses enable you to scale compute resources to manage your user and query concurrency needs as they change, such as during peak and off hours.

A DML statement that selects from a stream consumes all of the change data in the stream as long as the transaction commits successfully ​ TRUE ​ FALSE

TRUE Note: A DML statement that selects from a stream consumes all of the change data in the stream as long as the transaction commits successfully. To ensure multiple statements access the same change records in the stream, surround them with an explicit transaction statement (BEGIN .. COMMIT). This locks the stream. DML updates to the source table in parallel transactions are tracked by the change tracking system but do not update the stream until the explicit transaction statement is committed and the existing change data is consumed. This is very important to know: if you are given a TASK that consumes from the stream and a set of query statements, and you are asked which query statement will consume all the change data, then please select the DML query statement (like an INSERT query)
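A minimal sketch of the explicit-transaction pattern, assuming a stream RAWSTREAM1 on a landing table RAW(VAR VARIANT) and two hypothetical target tables, RAW_ARCHIVE and RAW_AUDIT:

BEGIN;
-- Both statements see the same change records because the explicit
-- transaction locks the stream until COMMIT.
INSERT INTO raw_archive SELECT var FROM rawstream1;
INSERT INTO raw_audit SELECT METADATA$ROW_ID, CURRENT_TIMESTAMP() FROM rawstream1;
COMMIT;  -- the stream offset advances only when the transaction commits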

The COPY command is more performant than the INSERT statement? ​ TRUE ​ FALSE

TRUE Note: Do you remember that Snowflake stores data physically in immutable micro-partitions? So, how does insert/update work in snowflake? Since the micro-partitions are immutable, any insert or update will have to copy the entire micro-partition into a new copy and then insert the new record in that partition. For an update, the old record will be marked for deletion and the new record will be added. So, imagine how many micro-partitions will get created if you insert rows one by one. Snowflake's architecture is optimized for bulk loads. Small, high-frequency DMLs trigger the engine to create a lot of small partition files (each DML creates a new micro-partition file; a lot of small partition files are bad for data processing, and the database engine has to regularly compact them into larger ones for more optimal query performance). Hope this clarifies the concept.

Multiple streams can be created for the same table and consumed by different tasks ​ TRUE ​ FALSE

TRUE Note: Multiple tasks that consume change data from a single table stream retrieve different deltas. When a task consumes the change data in a stream using a DML statement, the stream advances the offset. The change data is no longer available for the next task to consume. Currently, we recommend that only a single task consumes the change data from a stream. Multiple streams can be created for the same table and consumed by different tasks.

A share cannot be shared ​ TRUE ​ FALSE

TRUE Note: read-only

Time travel and fail safe requires additional storage which will cost you storage cost ​ TRUE ​ FALSE

TRUE Note: The fees are calculated for each 24-hour period (i.e. 1 day) from the time the data changed. The number of days historical data is maintained is based on the table type and the Time Travel retention period for the table. Also, Snowflake minimizes the amount of storage required for historical data by maintaining only the information required to restore the individual table rows that were updated or deleted. As a result, storage usage is calculated as a percentage of the table that changed. Full copies of tables are only maintained when tables are dropped or truncated.

Which three objects do we explicitly refer to when using the COPY INTO command to load data from external stages? ​ Table ​ File Format ​ View ​ Stage

Table ​ File Format ​ Stage
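Put together, a load from an external stage names all three objects (names are illustrative):

COPY INTO mytable
FROM @my_ext_stage
FILE_FORMAT = (FORMAT_NAME = 'myformat');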

Select two true statements for FLATTEN ? ​ Takes a VARIANT, OBJECT, or ARRAY column and produces a lateral view ​ Can be used to convert semi-structured data to a relational representation ​ Cannot be used on permanent tables

Takes a VARIANT, OBJECT, or ARRAY column and produces a lateral view ​ Can be used to convert semi-structured data to a relational representation

Create a task and a stream following the below steps. So, when the system$stream_has_data('rawstream1') condition returns false, what will happen to the task?

-- Create a landing table to store raw JSON data.
-- Snowpipe could load data into this table.
create or replace table raw (var variant);

-- Create a stream to capture inserts to the landing table.
-- A task will consume a set of columns from this stream.
create or replace stream rawstream1 on table raw;

-- Create a second stream to capture inserts to the landing table.
-- A second task will consume another set of columns from this stream.
create or replace stream rawstream2 on table raw;

-- Create a table that stores the names of office visitors identified in the raw data.
create or replace table names (id int, first_name string, last_name string);

-- Create a table that stores the visitation dates of office visitors identified in the raw data.
create or replace table visits (id int, dt date);

-- Create a task that inserts new name records from the rawstream1 stream into the names table
-- every minute when the stream contains records.
-- Replace the 'etl_wh' warehouse with a warehouse that your role has USAGE privilege on.
create or replace task raw_to_names
  warehouse = etl_wh
  schedule = '1 minute'
  when system$stream_has_data('rawstream1')
as
  merge into names n
    using (select var:id id, var:fname fname, var:lname lname from rawstream1) r1
    on n.id = to_number(r1.id)
    when matched then update set n.first_name = r1.fname, n.last_name = r1.lname
    when not matched then insert (id, first_name, last_name) values (r1.id, r1.fname, r1.lname);

-- Create another task that merges visitation records from the rawstream2 stream into the visits table
-- every minute when the stream contains records.
-- Records with new IDs are inserted into the visits table;
-- Records with IDs that exist in the visits table update the DT column in the table.
-- Replace the 'etl_wh' warehouse with a warehouse that your role has USAGE privilege on.
create or replace task raw_to_visits
  warehouse = etl_wh
  schedule = '1 minute'
  when system$stream_has_data('rawstream2')
as
  merge into visits v
    using (select var:id id, var:visit_dt visit_dt from rawstream2) r2
    on v.id = to_number(r2.id)
    when matched then update set v.dt = r2.visit_dt
    when not matched then insert (id, dt) values (r2.id, r2.visit_dt);

-- Resume both tasks.
alter task raw_to_names resume;
alter task raw_to_visits resume;

-- Insert a set of records into the landing table.
insert into raw
  select parse_json(column1)
  from values
    ('{"id": "123","fname": "Jane","lname": "Smith","visit_dt": "2019-09-17"}'),
    ('{"id": "456","fname": "Peter","lname": "Williams","visit_dt": "2019-09-17"}');

-- Query the change data capture records in the table streams.
select * from rawstream1;
select * from rawstream2;

​ Task will be executed but no rows will be merged ​ Task will return a warning message ​ Task will be skipped

Task will be skipped

Which of these are Snowflake table types? ​ Materialized ​ Secure ​ Temporary ​ Permanent ​ Transient ​ External

Temporary ​ Permanent ​ Transient ​ External Explanation: Secure and Materialized are types of views, while the rest are types of tables.

In a stream, if a row is added and then updated in the current offset, how will the delta change be represented? ​ The delta change is a new row ​ The delta change will be a pair of delete and insert records ​ The delta change will be a pair of insert and update records

The delta change is a new row Note: You know it's an update from METADATA$ISUPDATE

What factors influence the credits charged for a warehouse? ​ The number of servers per cluster (determined by warehouse size) ​ The number of clusters (if using multi-cluster warehouses) ​ The size of the query that runs on the warehouse ​ The length of time each server in each cluster runs

The number of servers per cluster (determined by warehouse size) ​ The number of clusters (if using multi-cluster warehouses) ​ The length of time each server in each cluster runs

In which scenarios would you consider to use materialized views ​ The query results contain a small number of rows and/or columns relative to the base table ​ Query results contain results that require significant processing ​ Query is on an external table ​ View's base table does not change frequently ​ None of the above

The query results contain a small number of rows and/or columns relative to the base table ​ Query results contain results that require significant processing ​ Query is on an external table ​ View's base table does not change frequently

Select all the options which are true about the data that is stored in a micro partition metadata ​ The range of values for each of the columns in the micro-partition ​ Mean and standard deviation of each column ​ The number of distinct values ​ Additional properties used for both optimization and efficient query processing

The range of values for each of the columns in the micro-partition ​ The number of distinct values ​ Additional properties used for both optimization and efficient query processing

You ran the below query. You have a warehouse with auto-suspend set at 5 seconds, and the result cache is disabled. SELECT * FROM INVENTORY; In the query profile, 'Percentage scanned from cache' is 0%. You ran the query again before 5 seconds had elapsed, and the query profile now shows 'Percentage scanned from cache' as 75%. You ran the query again after 5 seconds, and 'Percentage scanned from cache' is zero again. Why is this happening? ​ The second run of the query used data cache to retrieve part of the result since it ran before the warehouse was suspended ​ The second run of the query used query result cache ​ The third run of the query used query result cache

The second run of the query used data cache to retrieve part of the result since it ran before the warehouse was suspended Note: Virtual warehouses are an abstraction on the compute instances of the cloud provider (in the case of AWS, EC2 instances). Each virtual warehouse is a cluster of these compute instances. The compute instances have local SSDs attached to them. When you ran the query for the first time, the results were retrieved from the remote storage (the object store of the cloud provider, S3 in the case of AWS), and part of the results also got cached in the local SSD storage of the compute instances. So, when we ran the query the second time, part of the results got retrieved from the SSD cache, also known as the data cache. OK, if that is the case, why did it not retrieve from the data cache the third time? The third run of the query happened after 5 seconds. The virtual warehouse had an auto-suspend setting of 5 seconds. Since there was no activity for 5 seconds, the warehouse suspended itself. When the warehouse is suspended, it loses the data cache. Why? Because when it resumes, it may be a completely new set of compute instances that were not used in the earlier runs. Hope this clarifies, and you probably will never forget this concept now. Please remember that the SSD cache is also called the data cache. In a real-world scenario, how can you leverage this concept? Well, if you have a nightly ETL that runs against several tables, group the ETL jobs which use similar tables to run at the same time. This will help you to utilize the data cache effectively. I named this strategy ETL-Colocation :)

When will you consider creating a clustering key on a table? Please select all that apply ​ The table houses multi-terabyte data ​ The table has variant columns ​ The query performance on the table has degraded over time

The table houses multi-terabyte data ​ The query performance on the table has degraded over time Note: 1. Clustering will be effective only when you have enough partitions in the table and the table is more than 1 TB 2. As you insert/update rows in the table, the partitions are moved around and hence it no longer remains clustered. With Auto clustering, Snowflake automatically reclusters the table based on a proprietary algorithm

Select the ones that are true about data sharing through data shares ​ There is only one copy of data, which lives in the data provider's account ​ Shared data is always live, real-time and immediately available to consumers ​ Providers can establish revocable, fine-grained access grants to shares ​ Data Sharing is supported between accounts in the same or different Snowflake Provider and Region

There is only one copy of data, which lives in the data provider's account ​ Shared data is always live, real-time and immediately available to consumers ​ Providers can establish revocable, fine-grained access grants to shares

The storage architecture of snowflake has two key features, they are ​ Time travel ​ Replication ​ Zero-copy cloning ​ Query tag

Time travel ​ Zero-copy cloning

Time travel is available for which table types? ​ Transient ​ Permanent ​ External ​ Temporary

Transient Permanent Temporary Refer Link for details - https://docs.snowflake.com/en/user-guide/data-time-travel.html A key component of Snowflake Time Travel is the data retention period. When data in a table is modified, including deletion of data or dropping an object containing data, Snowflake preserves the state of the data before the update. The data retention period specifies the number of days for which this historical data is preserved and, therefore, Time Travel operations (SELECT, CREATE ... CLONE, UNDROP) can be performed on the data. The standard retention period is 1 day (24 hours) and is automatically enabled for all Snowflake accounts: For Snowflake Standard Edition, the retention period can be set to 0 (or unset back to the default of 1 day) at the account and object level (i.e. databases, schemas, and tables). For Snowflake Enterprise Edition (and higher): For transient databases, schemas, and tables, the retention period can be set to 0 (or unset back to the default of 1 day). The same is also true for temporary tables. For permanent databases, schemas, and tables, the retention period can be set to any value from 0 up to 90 days.

True or False: When data is staged to a SF internal staging area using the PUT command, the data is encrypted on the client's machine.

True
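
A minimal sketch (stage name and file path are illustrative); the client (e.g. SnowSQL) encrypts the file on the local machine before uploading it:

PUT file:///tmp/sales.csv @my_int_stage AUTO_COMPRESS = TRUE;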

Warehouse cache size changes with warehouse size ​ True ​ False

True Note: The size of the cache is determined by the number of servers in the warehouse (i.e. the larger the warehouse, the more servers it has and, therefore, the larger the cache). https://docs.snowflake.com/en/user-guide/warehouses-considerations.html#how-does-warehouse-caching-impact-queries

True or False: Micro-partitions are immutable?

True - subsequent changes of any type to the data will be written to additional micro-partitions

You are running your query in the worksheet using a large warehouse. You want to change the warehouse. How do you do that? Choose 2 answers ​ USE WAREHOUSE ETL_WH; ​ SET WAREHOUSE CONTEXT=ETL_WH ​ Update the Warehouse field in the Context Menu located above the worksheet ​ Go to the Worksheet properties page and set the Warehouse field to ETL_WH

USE WAREHOUSE ETL_WH; Update the Warehouse field in the Context Menu located above the worksheet

In Snowflake, you want to compare whether two expressions are equal. The expressions can have NULL values. What is the best way to compare in this case? ​ Use EQUAL_NULL ​ Run a query with WHERE EXPR1 = EXPR2 ​ Run a query with WHERE EXPR2 = EXPR1

Use EQUAL_NULL Note: This is an important concept to understand in Snowflake; it has caused me pain in production. If you do not follow this, you will have incorrect data in production. The equality comparison operator (=) treats NULLs as unknown values; like most SQL dialects, comparing NULL = NULL does not return TRUE. In Snowflake, it returns NULL. The EQUAL_NULL function is NULL-safe, meaning it treats NULLs as known values when comparing for equality. https://docs.snowflake.com/en/sql-reference/functions/equal_null.html#equal-null
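
A quick demonstration you can run as-is:

SELECT NULL = NULL            AS equals_result,      -- returns NULL, not TRUE
       EQUAL_NULL(NULL, NULL) AS equal_null_result;  -- returns TRUE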

You want to identify the potential performance bottlenecks and improvement opportunities of a query. What will you do? ​ Use Query Profile ​ Use Explain plan ​ Call snowflake support

Use Query Profile

You need to handle a huge number of concurrent users for your Snowflake data warehouse. All of them are from the same team and need the same type of access. Which is the most appropriate warehouse strategy you will recommend? ​ Create one warehouse for each user ​ Create one warehouse for each group of users ​ Use multi-cluster warehouse

Use multi-cluster warehouse
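
A minimal sketch of a multi-cluster warehouse (names and sizes are illustrative):

CREATE WAREHOUSE reporting_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4          -- additional clusters start as concurrency grows
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE;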

How can you validate the data that will be unloaded by the COPY INTO command? ​ Load the data in a relational table and validate the rows ​ Load the data in a CSV and validate the rows ​ Use validation_mode='RETURN_ROWS'

Use validation_mode='RETURN_ROWS' Note: Execute COPY in validation mode to return the result of a query and view the data that will be unloaded from the orderstiny table if COPY is executed in normal mode: https://docs.snowflake.com/en/sql-reference/sql/copy-into-location.html#validating-data-to-be-unloaded-from-a-query
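
A minimal sketch (stage and table names are illustrative); validation mode returns the rows that would be unloaded without actually writing any files:

COPY INTO @my_stage/unload/
  FROM orderstiny
  VALIDATION_MODE = 'RETURN_ROWS';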

Semi structured data strings in snowflake are stored in a column with the below datatype ​ VARIANT ​ OBJECT ​ VARCHAR ​ NONE OF THE ABOVE

VARIANT

Your organization requires a snowflake instance that is a completely separate Snowflake environment, isolated from all other Snowflake accounts. Which edition will you suggest? ​ Standard ​ Premier ​ Virtual Private Snowflake (VPS)

Virtual Private Snowflake (VPS)

In order to load data into snowflake, which of the below are required? ​ Virtual Warehouse ​ Predefined target table ​ Staging location with data staged ​ File format

Virtual Warehouse ​ Predefined target table ​ Staging location with data staged ​ File format Note: A virtual warehouse is required to provide compute resources. The target table must be pre-defined; the load does not auto-create tables. The data must be staged (e.g. through the PUT command). A file format must be defined to indicate whether CSV, JSON, XML, etc. is being loaded.
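
A minimal end-to-end sketch (all object names are illustrative):

USE WAREHOUSE load_wh;                                           -- compute resources

CREATE FILE FORMAT my_csv_format TYPE = 'CSV' SKIP_HEADER = 1;   -- file format

-- stage the data (PUT runs from a client such as SnowSQL)
PUT file:///tmp/sales.csv @my_int_stage;

COPY INTO sales                                                  -- pre-defined target table
  FROM @my_int_stage
  FILE_FORMAT = (FORMAT_NAME = my_csv_format);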

You ran a query in Snowflake and went to the query history tab. The query history shows you the below columns 1. QueryID 2. SQL TEXT 3. WAREHOUSE NAME 4. WAREHOUSE SIZE 5. SESSION ID 6. START TIME 7. END TIME Which of the above columns indicates whether compute cost was incurred to run the query? ​ WAREHOUSE NAME ​ CREDIT ​ WAREHOUSE SIZE ​ SESSION ID

WAREHOUSE SIZE Note: Any time a query incurs compute cost, you will see the warehouse size populated for it. Try an experiment: 1. Run the query SHOW TABLES 2. Go to query history Do you see the warehouse size? No, because SHOW TABLES is a metadata query and does not incur any compute cost

If you attempt to PUT a file but cannot because a file with the same name already exists in the stage, what can you do to load the file? ​ Wait until the existing file's data is loaded, then retry PUT ​ Rename the file that you want to PUT ​ Set OVERWRITE = TRUE ​ Set FORCE COPY = TRUE

Wait until the existing file's data is loaded, then retry PUT ​ Rename the file that you want to PUT ​ Set OVERWRITE = TRUE

Which cache type gets purged in a predictable way so that it ends up empty of all cached information? ​ Warehouse Cache ​ Results Cache ​ Metadata Cache

Warehouse Cache Note: Physical storage that gets purged every time the warehouse is suspended

When creating a warehouse, what are the two most critical factors to consider from a cost and performance perspective? ​ Warehouse size (i.e. number of servers per cluster) ​ Manual vs automated management (for starting/resuming and suspending warehouses) ​ Cloud provider selection

Warehouse size (i.e. number of servers per cluster) ​ Manual vs automated management (for starting/resuming and suspending warehouses)

Resource monitor notifications can be received by account administrators through the following means ​ Web interface ​ Email ​ Pager ​ Mobile

Web interface ​ Email

Cloud services also use compute. Does Snowflake charge you for cloud services? ​ Yes, if it exceeds 10% of the daily usage of the compute resources ​ Yes, irrespective of how much compute resources are used ​ Snowflake never charges for cloud services

Yes, if it exceeds 10% of the daily usage of the compute resources Note: Similar to virtual warehouse usage, Snowflake credits are used to pay for cloud services, but only for the portion that exceeds 10% of the daily usage of the compute resources. The charge is calculated daily (in the UTC time zone), which ensures the 10% adjustment is accurately applied each day, at the credit price for that day. For example, if your warehouses consume 100 credits in a day, the first 10 credits of cloud services consumption that day are free and only usage beyond that is billed. https://docs.snowflake.com/en/user-guide/credits.html#cloud-services-credit-usage

In which of the below scenarios will you use an external table? ​ You have data on the cloud provider's object store (AWS S3/GCS/Azure Blob) but the data cannot be copied or moved to any other location due to compliance regulations ​ You have a high volume of data on the cloud provider's object store (AWS S3/GCS/Azure Blob) but only a part of the data is needed in Snowflake ​ You have data on the cloud provider's object store (AWS S3/GCS/Azure Blob) which needs to be updated by Snowflake after applying required transformations ​ You have XML data on the cloud provider's object store

You have data on the cloud provider's object store (AWS S3/GCS/Azure Blob) but the data cannot be copied or moved to any other location due to compliance regulations ​ You have a high volume of data on the cloud provider's object store (AWS S3/GCS/Azure Blob) but only a part of the data is needed in Snowflake Note: Please use the elimination method to answer this question 1. External tables are read-only, hence #3 is not correct 2. External tables do not support XML, hence #4 is not correct
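
A minimal sketch of an external table over staged files (stage, path, and format are illustrative):

CREATE EXTERNAL TABLE ext_sales
  WITH LOCATION = @my_s3_stage/sales/
  FILE_FORMAT = (TYPE = PARQUET)
  AUTO_REFRESH = TRUE;
-- the data stays in the cloud object store; Snowflake reads it in place (read-only)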

What command will you run to pause a pipe? ​ alter pipe <pipe name> set pipe_execution_paused = true; ​ alter pipe <pipe name> set pipe_execution_paused = stop; ​ alter pipe <pipe name> set pipe_execution_paused = halt;

alter pipe <pipe name> set pipe_execution_paused = true;

You have created a secure view view1. Now you want to convert it back to a regular view. Which of the below can be used for the conversion? ​ alter view view1 set type = unsecure; ​ alter view view1 unset secure; ​ alter view view1 set secure; ​ None of the above

alter view view1 unset secure; Note: https://docs.snowflake.com/en/sql-reference/sql/alter-view.html#examples

You created a warehouse (ETL_WH) which is sized at XSMALL, and you want to resize the warehouse to SMALL. How will you do that? ​ alter warehouse etl_wh set warehouse_size=SMALL; ​ You will need to recreate the warehouse with the new size ​ Once created, a warehouse can never be re-sized

alter warehouse etl_wh set warehouse_size=SMALL;

Which Snowpipe REST API endpoint fetches a report about ingested files whose contents have been added to the table? ​ loadHistoryScan and insertReport ​ insertPipeReport ​ insertFiles

loadHistoryScan and insertReport

How are virtual warehouse credits charged? ​ per minute ​ per second ​ per-second, with a 60-second (i.e. 1-minute) minimum: ​ per hour

per-second, with a 60-second (i.e. 1-minute) minimum

You want to get the clustering information for a table T1 that is clustered on column C1. Which of the below queries will you run to get the details? ​ select system$clustering_information('T1') ​ select clustering('T1') from information_schema.clusteringinfo; ​ Show clustering info for T1;

select system$clustering_information('T1')

Let's say that you have two JSONs as below 1. {"stuId":2000, "stuName":"Amy"} 2. {"stuId":2000,"stuCourse":"Snowflake"} How will you write a query that will check if the stuId in JSON #1 is also there in JSON #2? ​ with stu_demography as (select parse_json(column1) as src, src:stuId as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src:stuId as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo; ​ with stu_demography as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo; ​ SELECT CONTAINS('{"stuId":2000, "stuName":"Amy"}','{"stuId":2000,"stuCourse":"Snowflake"}'); ​ with stu_demography as (select parse_json(column1) as src, src['STUID'] as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo;

with stu_demography as (select parse_json(column1) as src, src:stuId as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src:stuId as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo;

with stu_demography as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo;

Note: This may not be exactly how the question appears in the certification exam, but why are we still learning this? 1. This course prepares you not just for the certification but also to work on a project 2. With this hands-on exercise, you learn that: a. there are two ways to query JSON, dot notation (src:stuId) and bracket notation (src['stuId']); b. JSON keys are case-sensitive - if you run the 4th query with src['STUID'], it returns 'False' because src['STUID'] returns NULL, since there is a mismatch in the case. In the certification you may be asked questions on a) and b) - but do you agree that if you learn this way you will never forget the concept :)

One of the government agencies has decided to use Snowflake for their data warehouse. They require compliance with US federal privacy and security standards, such as FIPS 140-2 and FedRAMP (Moderate Impact). Which cloud providers can you choose to set up the Snowflake instance? ​ A) AWS ​ B) AZURE ​ C) GOOGLE ​ D) IBM

​ A) AWS ​ B) AZURE Note: https://docs.snowflake.com/en/user-guide/intro-regions.html#government-regions

What are the default roles that snowflake automatically adds for any new account? ​ A) AccountAdmin ​ B) SecurityAdmin ​ C) SysAdmin ​ D) Public ​ E) Security Admin

​ A) AccountAdmin ​ B) SecurityAdmin ​ C) SysAdmin ​ D) Public

What are the two ways in Snowflake to enable change tracking metadata to be recorded for a table? ​ A) Change tracking is enabled on the table (using ALTER TABLE ... CHANGE_TRACKING = TRUE) ​ B) A stream is created for the table (using CREATE STREAM) ​ C) Enable CDC on the table

​ A) Change tracking is enabled on the table (using ALTER TABLE ... CHANGE_TRACKING = TRUE) ​ B) A stream is created for the table (using CREATE STREAM)
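
A minimal sketch of both approaches (table and stream names are illustrative):

ALTER TABLE orders SET CHANGE_TRACKING = TRUE;   -- explicitly enable change tracking

CREATE STREAM orders_stream ON TABLE orders;     -- creating a stream also enables it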

What would you check to see if a large table will benefit from explicitly defining a clustering key ​ A) Clustering depth ​ B) Clustering percentage ​ C) Clustering ratio

​ A) Clustering depth

Which are the required parameters while creating a storage integration? ​ A) Name ​ B) Type ​ C) Enabled ​ D) Storage_Allowed_Locations ​ E) Storage_blocked_locations

​ A) Name ​ B) Type ​ C) Enabled ​ D) Storage_Allowed_Locations
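
A minimal sketch for S3 (the name, role ARN, and bucket are illustrative placeholders; STORAGE_PROVIDER and the provider-specific role/identity parameters are also needed in practice):

CREATE STORAGE INTEGRATION s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::001234567890:role/my_snowflake_role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/load/');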

Which factors affect your snowflake data loading rate? ​ A) Physical location of the stage ​ B) RAM on the virtual warehouse ​ C) GZIP compression efficiency ​ D) Thread size

​ A) Physical location of the stage ​ C) GZIP compression efficiency Note: https://www.snowflake.com/blog/how-to-load-terabytes-into-snowflake-speeds-feeds-and-techniques/ Load rates for your own data files may differ based on a number of factors:
- Location of your S3 buckets - for the test in the blog post, both the Snowflake deployment and the S3 buckets were located in us-west-2
- Number and types of columns - a larger number of columns may require more time relative to the number of bytes in the files
- Gzip compression efficiency - more data read from S3 per uncompressed byte may lead to longer load times

Snowflake supports the below types of streams. Please select three ​ A) Standard ​ B) Update-only ​ C) Append-only ​ D) Insert-only

​ A) Standard ​ C) Append-only ​ D) Insert-only
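
A minimal sketch of the three stream types (table names are illustrative):

CREATE STREAM std_stream    ON TABLE orders;                     -- standard: inserts, updates and deletes
CREATE STREAM append_stream ON TABLE orders APPEND_ONLY = TRUE;  -- append-only: inserts on standard tables
CREATE STREAM insert_stream ON EXTERNAL TABLE ext_sales INSERT_ONLY = TRUE;  -- insert-only: external tables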

Which Azure regions are supported by Snowflake in US and Canada ​ A) west-us-2 ​ B) east-us-2 ​ C) canada-central ​ D) west-us-1

​ A) west-us-2 ​ B) east-us-2 ​ C) canada-central

Which are true with respect to SMTs (Single Message Transforms) when neither key.converter nor value.converter is set? ​ All SMTs are supported ​ Most SMTs are supported ​ regex.router is not supported

​ Most SMTs are supported ​ regex.router is not supported

The RECORD_METADATA contains which information ​ Topic ​ Partition ​ Key ​ CreateTime / LogAppendTime ​ Value

​ Topic ​ Partition ​ Key ​ CreateTime / LogAppendTime

SQL functionality in snowflake can be extended by ​ USER DEFINED FUNCTIONS USING SQL ​ USER DEFINED FUNCTIONS USING JAVASCRIPT ​ SESSION VARIABLES ​ NONE OF THE ABOVE

​ USER DEFINED FUNCTIONS USING SQL ​ USER DEFINED FUNCTIONS USING JAVASCRIPT ​ SESSION VARIABLES

How will you store JSON data in snowflake ​ A) Using a column with datatype as JSON ​ B) Using a column with datatype as VARCHAR ​ C) Using a column with datatype as VARIANT

​C) Using a column with datatype as VARIANT

