SnowPro Certification
Which of the following are options when creating a Virtual Warehouse? A. Auto-suspend B. Storage size C. Auto-resume D. Server Count
A & C
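Note: a minimal sketch of these options at warehouse creation time (the warehouse name and values are hypothetical):
CREATE WAREHOUSE IF NOT EXISTS demo_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 300    -- suspend after 300 seconds of inactivity
  AUTO_RESUME = TRUE;   -- resume automatically when a query is submitted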
Which of the following are true about the variant data type in Snowflake? A. Optimized storage based on repeated elements B. Stored in a separate file format from structured data C. Can be queried using JSON path notation D. Requires a custom mapping for each record type
A & C
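Note: a small sketch of JSON path notation against a VARIANT column (the table name and JSON contents are hypothetical):
CREATE OR REPLACE TABLE raw_events (payload VARIANT);
INSERT INTO raw_events SELECT PARSE_JSON('{"user": {"name": "Ana", "city": "Lisbon"}}');
SELECT payload:user.name::STRING AS user_name FROM raw_events;  -- JSON path notation with a cast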
Which are the supported data types for a JSON name/value pair? A number (integer or floating point) A string (in double quotes) A Boolean (true or false) An array (in square brackets) An object (in curly braces) Null complex datatype
A number (integer or floating point) A string (in double quotes) A Boolean (true or false) An array (in square brackets) An object (in curly braces) Null
Currently in Snowflake, replication is supported for which type of objects? A) Databases B) Users C) Roles D) Warehouses
A) Databases
What is the name of the technique Snowflake uses to limit the number of micro-partitions retrieved as part of a query? A) Pruning B) Selective Filter C) Indexing D) Clustering
A) Pruning
In snowflake, what are the two major cost categories A) Storage B) Compute C) Cloud services D) Network Traffic
A) Storage B) Compute
While loading data through the COPY command, you can transform the data. Which of the below transformations are allowed? A) TRUNCATECOLUMNS B) REORDER COLUMNS C) OMIT COLUMNS D) CAST E) AGGREGATE
A) TRUNCATECOLUMNS B) REORDER COLUMNS C) OMIT COLUMNS D) CAST
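Note: a hedged sketch of reordering, omitting and casting columns during a COPY load (the stage, file format and table names are hypothetical):
COPY INTO target_table (order_id, amount)
  FROM (SELECT t.$3, t.$1::NUMBER(10,2)   -- reorder and cast; other staged columns are omitted
        FROM @my_int_stage t)
  FILE_FORMAT = (FORMAT_NAME = 'myformat');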
Which database objects can be shared using snowflake's secure data sharing A) Tables B) External tables C) Secure views D) Secure materialized views E) Secure UDFs F) Roles
A) Tables B) External tables C) Secure views D) Secure materialized views E) Secure UDFs
Select the objects which do not have a fail safe period A) Transient table B) Temporary table C) Permanent table
A) Transient table B) Temporary table
This object in snowflake returns a set of rows instead of a single, scalar value, and can be accessed in the FROM clause of a query A) UDTF B) UDF C) Stored procedure
A) UDTF
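Note: a minimal UDTF sketch (the orders table and its columns are hypothetical); the function is then called in the FROM clause via TABLE():
CREATE OR REPLACE FUNCTION orders_for_customer(cust_id NUMBER)
  RETURNS TABLE (order_id NUMBER, amount NUMBER)
  AS 'SELECT order_id, amount FROM orders WHERE customer_id = cust_id';
SELECT * FROM TABLE(orders_for_customer(42));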
Imagine you have a permanent table named EMPLOYEE and you have dropped it. Then you created another table with the same name. If you execute the UNDROP command now, what will happen? A) UNDROP command will fail B) It will work similar to insert overwrite C) The new table will be renamed with a suffix of v1.
A) UNDROP command will fail
Which of the below privileges are required to add clustering on a table A) USAGE or Ownership on the schema and database B) MONITOR USAGE on the table C) ALTER privilege on the table
A) USAGE or Ownership on the schema and database
What are the valid data loading options in Snowflake? A) Using SnowSQL and SQL B) Using Snowpipe C) Using a 3rd party ETL tool D) Using the cloud provider's data upload tools
A) Using SnowSQL and SQL B) Using Snowpipe C) Using a 3rd party ETL tool
You want to query the history of task usage within a specified date range. What will you do? A) You will query the TASK_HISTORY table function B) You will run 'SHOW TASK HISTORY' C) You will run 'DESCRIBE TASKS'
A) You will query the TASK_HISTORY table function
You have several CSV files loaded into your named snowflake internal stage. You want to load files from the stage into a table using pattern matching to only load uncompressed CSV files whose names include the string sales. Which is the command that you will use to do the same? A) copy into mytable from @my_int_stage file_format = (format_name = myformat) pattern='.*sales.*[.]csv'; B) copy into mytable from @my_int_stage regex='.*sales.*[.]csv'; C) copy into mytable from @my_int_stage match_pattern='.*sales.*[.]csv';
A) copy into mytable from @my_int_stage file_format = (format_name = myformat) pattern='.*sales.*[.]csv';
What is the default option for ON_ERROR clause of COPY command for bulk loading? CONTINUE SKIP_FILE SKIP_FILE_10 ABORT_STATEMENT
ABORT_STATEMENT
Which role in Snowflake allows a user to administer users and manage all database objects? SYSADMIN SECURITYADMIN ACCOUNTADMIN ROOT
ACCOUNTADMIN
Select the most accurate statement regarding Snowflake's transaction support ACID Compliant Requires special user configuration BASE compliant Works like a NoSQL database
ACID Compliant
If you would like to disable the query cache, what will you do? ALTER SESSION SET USE_CACHED_RESULT = FALSE; ALTER SESSION SET USE_CACHED_RESULT = OFF; ALTER SESSION SET USE_QUER_CACHE = TRUE;
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
Choose the answer which is true about key rotation in Snowflake Account and table master keys are automatically rotated by Snowflake when they are more than 30 days old Only account master key is rotated by snowflake when it is required Only table master key is rotated when it is required Key rotation is an expensive operation hence it is done manually when required
Account and table master keys are automatically rotated by Snowflake when they are more than 30 days old
When you sign up for a Snowflake trial account, what are the three Snowflake editions that are offered? A) Free-Tier B) Standard C) Enterprise Data Security D) Enterprise E) Business critical
B) Standard D) Enterprise E) Business critical
A task can execute multiple SQL statements. TRUE FALSE
FALSE
You need to contact Snowflake Support team if retrieval of data is required from fail safe (TRUE OR FALSE)
TRUE
What are the various stages available in Snowflake A) USER B) TABLE C) NAMED INTERNAL D) NAMED EXTERNAL E) ACCOUNT STAGE F) SCHEMA STAGE
A) USER B) TABLE C) NAMED INTERNAL D) NAMED EXTERNAL Note: Snowflake supports three types of stages: 1. USER 2. TABLE 3. NAMED. Named stages are in turn of two types, internal and external. USER and TABLE stages are automatically available in Snowflake and do not need to be created or configured. https://docs.snowflake.com/en/user-guide/data-load-local-file-system-create-stage.html
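Note: the three stage types are referenced as follows (table and stage names are hypothetical):
LIST @~;              -- user stage
LIST @%employee;      -- table stage of the EMPLOYEE table
LIST @my_named_stage; -- named stage (internal or external)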
After how many days does the load metadata of a table expire? A) 64 B) 14 C) 7 D) 365
A) 64
Which of the below are binary formats? AVRO JSON ORC PARQUET
AVRO ORC PARQUET
Tri-Secret Secure option is available in which snowflake edition Business critical or higher Enterprise Edition All editions
Business critical or higher
Select the term that is associated with compute layer? A) Query optimization B) Query planning C) Query processing
C) Query processing
Load performance in snowflake is fastest for which file format CSV ORC AVRO PARQUET
CSV
True or False: Snowflake charges a premium for storing semi-structured data.
False
True or False: Snowflake only works with cloud based tools.
False
True or False: The query profiler view is only available for completed queries.
False
Which parameter controls the maximum number of days for which Snowflake can extend the data retention period for tables to prevent streams on the tables from becoming stale. MAX_DATA_EXTENSION_TIME_IN_DAYS MAX_DATA_RETENTION_TIME_IN_DAYS MAX_DATA_EXTENSION_TIME_IN_MONTHS
MAX_DATA_EXTENSION_TIME_IN_DAYS
Which constraints are enforced in Snowflake? Referential integrity constraints NOT NULL constraint UNIQUE constraint
NOT NULL constraint
Which two statements are true about the variant data type in Snowflake? Optimized storage based on repeated elements Stored in a separate file format from structured data Can be queried using JSON path notation Requires a custom mapping for each record type
Optimized storage based on repeated elements Can be queried using JSON path notation
Which are the two variant columns available in a snowflake table loaded by kafka connector RECORD_CONTENT RECORD_METADATA RECORD_KEY
RECORD_CONTENT RECORD_METADATA
When adding Consumers to Outbound Shares, what types of Consumer Accounts can be selected? Reader GCP Azure Full AWS
Reader Full Explanation Refer Link for details - https://docs.snowflake.com/en/user-guide/data-sharing-intro.html
Which are the key concepts that will need to be considered while loading data into snowflake STAGE OBJECT FILE FORMAT TRANSFORMATION AND ERROR VALIDATION FILE SIZE
STAGE OBJECT FILE FORMAT TRANSFORMATION AND ERROR VALIDATION
You want to load a JSON file but would like to remove the outer array structure and load the records into separate table rows. Which file format option would you use? STRIP_OUTER_ARRAY TRUNCATE_OUTER_ARRAY UNSET_OUTER_ARRAY DROP_OUTER_ARRAY
STRIP_OUTER_ARRAY
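Note: a hedged example of the option in a JSON load (the stage, file and table names are hypothetical; json_events has a single VARIANT column):
COPY INTO json_events
  FROM @my_int_stage/events.json
  FILE_FORMAT = (TYPE = 'JSON' STRIP_OUTER_ARRAY = TRUE);  -- each array element becomes a separate row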
What is the name of the Snowflake's Command Line Interface tool? A) SnowCLI B) SnowSQL C) SnowCMD D) SnowSpark
SnowSQL
Which of the following are Snowflake view types? Standard External Transient Materialized Permanent Secure
Standard Materialized Secure https://docs.snowflake.com/en/user-guide/views-introduction.html
Insert-only streams are supported on external tables only and it is currently a preview feature (TRUE OR FALSE)
TRUE
JSON data is a hierarchical collection of name/value pairs grouped into objects and arrays TRUE FALSE
TRUE
Looker is a snowflake business intelligence partner (TRUE OR FALSE)
TRUE
True or False: Snowflake caches are automatically invalidated if the underlying data changes.
True
You have two virtual warehouses in your Snowflake account. You have updated the data in the storage layer using one of the warehouses. When will the other warehouse be able to see the data? A) Immediately B) After 30 minutes once Snowflake completes data synchronization C) You will need to trigger the data synchronization process for the other warehouse to see the data
A) Immediately
Let's say that you have scheduled a task. At time T0, the task has started executing. If it is still executing when the next scheduled execution time occurs, what happens to that task? A) It will be skipped B) It will wait for the previous task to complete C) It will be aborted
A) It will be skipped
Which cloud providers are supported by snowflake A) Google Cloud Platform B) Azure C) AWS
A) Google Cloud Platform B) Azure C) AWS
Snowflake is compliant with which certifications? A) HIPAA, PCI DSS and FedRAMP B) HIPAA, PCI DSS C) Only HIPAA D) Only FedRAMP
A) HIPAA, PCI DSS and FedRAMP
When you use this parameter with SHOW STREAMS, the output also includes an additional dropped_on column, which displays 1. date and timestamp (for dropped streams) 2. NULL (for active streams). A) HISTORY B) TERSE C) DROPPED=TRUE
A) HISTORY
When unloading data into multiple files, you will use this copy option to specify the maximum size of each file created A) MAX_FILE_SIZE B) MAX_SIZE C) MAX_FILE_BYTES
A) MAX_FILE_SIZE
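Note: a sketch of an unload that caps each output file at roughly 100 MB (stage and table names are hypothetical):
COPY INTO @my_int_stage/unload/
  FROM sales
  FILE_FORMAT = (TYPE = 'CSV')
  MAX_FILE_SIZE = 104857600;  -- in bytes, ~100 MB per file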
Select the Snowflake edition that allows only a maximum of 1 day of time travel A) Standard B) Enterprise C) VPS D) Business Critical
A) Standard
What should be your minimum snowflake edition for data sharing capability A) Standard B) Enterprise C) Business critical
A) Standard
Load history for bulk data load is stored in the metadata of the target table for 64 days (TRUE OR FALSE)
TRUE Note: Bulk data load: stored in the metadata of the target table for 64 days; available upon completion of the COPY statement as the statement output. Snowpipe: stored in the metadata of the pipe for 14 days; must be requested from Snowflake via a REST endpoint, SQL table function, or ACCOUNT_USAGE view. https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro.html#load-history
A client has ODBC or JDBC available on their system but does not have the Snowflake drivers. Will the client still be able to connect to Snowflake? TRUE FALSE
FALSE
In a multi-cluster warehouse, you will need to manually scale the clusters (between the minimum and maximum cluster count) (TRUE OR FALSE)
FALSE
JSON and PARQUET files can be loaded to columns in the same table TRUE FALSE
FALSE
Loading data using snowpipe REST API is supported for external stage only TRUE FALSE
FALSE
Once you have selected a warehouse size, you cannot resize the warehouse (TRUE OR FALSE)
FALSE
Reclustering in Snowflake requires manual configuration (TRUE OR FALSE)
FALSE
Scaling down a virtual warehouse (e.g. from a large warehouse to a small one) is an automated process. TRUE FALSE
FALSE
Snowpipe can load a file with the same name if it has been modified later. (TRUE OR FALSE)
FALSE
Snowflake allows only loading of structured data (TRUE OR FALSE)
FALSE
Snowpipe guarantees that files are loaded in the same order they are staged (TRUE OR FALSE)
FALSE
Temporary tables cannot be created with a clustering key defined TRUE FALSE
FALSE
The insertReport Snowpipe API can retrieve file ingestion events and report them. The events are retained indefinitely, until deleted manually. (TRUE OR FALSE)
FALSE
When migrating from Teradata to snowflake, you will also need to migrate the indexes. TRUE FALSE
FALSE
When you are loading data through the COPY command, the table and the file being loaded must have the same order of columns FALSE TRUE
FALSE
When you load a table from a file through COPY command, the file should have the same number and order of columns as the table (TRUE OR FALSE)
FALSE
When you load data using Snowpipe, loads are always performed in a single transaction FALSE TRUE
FALSE
You can load data using PUT command through worksheets in snowflake web ui (TRUE OR FALSE)
FALSE
You can map Snowflake to any S3 bucket and query the data directly as long as the data is in Parquet or ORC format TRUE FALSE
FALSE
You can use the query profiler view only for completed queries (TRUE OR FALSE)
FALSE
You have a materialized view and you want to track changes in the view. You can use streams to track that change (TRUE OR FALSE)
FALSE
You have specified PURGE=TRUE in your COPY command. Due to a network failure the data files could not be purged after successful load. You will get an error returned by snowflake due to this failure. (TRUE OR FALSE)
FALSE
The data stored in temporary table does not contribute to storage cost (TRUE OR FALSE)
FALSE Note: For the duration of the existence of a temporary table, the data stored in the table contributes to the overall storage charges that Snowflake bills your account. To prevent any unexpected storage changes, particularly if you create large temporary tables in sessions that you maintain for periods longer than 24 hours, Snowflake recommends explicitly dropping these tables once they are no longer needed. You can also explicitly exit the session in which the table was created to ensure no additional charges are accrued.
Micropartitions can be modified after creation of the table TRUE FALSE
FALSE Note: In Snowflake, data is physically stored in micro-partitions in a proprietary columnar format. As a user, you do not have direct access to micro-partitions; Snowflake manages them. Micro-partitions are also immutable, which means that once a micro-partition is created, it cannot be changed. When a row is updated, the micro-partition containing that row is copied to a new micro-partition, the updated row is inserted into that new partition, and the old micro-partition is marked for deletion. As a user, you can however define clustering on your table, which helps re-arrange the micro-partitions based on the cluster keys for faster query retrieval.
By executing the 'SHOW TABLES' command, we can list all the tables in all the schemas even if we do not have access to all the tables TRUE FALSE
FALSE Note: SHOW TABLES Lists the tables for which you have access privileges, including dropped tables that are still within the Time Travel retention period and, therefore, can be undropped. The command can be used to list tables for the current/specified database or schema, or across your entire account.
Time travel can be disabled at the account level TRUE FALSE
FALSE Note: Time Travel cannot be disabled for an account; however, it can be disabled for individual databases, schemas, and tables by specifying DATA_RETENTION_TIME_IN_DAYS with a value of 0 for the object. https://docs.snowflake.com/en/user-guide/data-time-travel.html#enabling-and-disabling-time-travel
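Note: for example, to disable Time Travel on a single table (the table name is hypothetical):
ALTER TABLE mydb.public.orders SET DATA_RETENTION_TIME_IN_DAYS = 0;  -- disables Time Travel for this table only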
Virtual warehouses cannot be resized while they are running TRUE FALSE
FALSE Note: Warehouses can be started and stopped at any time. They can also be resized at any time, even while running, to accommodate the need for more or less compute resources, based on the type of operations being performed by the warehouse. https://docs.snowflake.com/en/user-guide/warehouses-overview.html
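Note: for example, resizing a running warehouse (the warehouse name is hypothetical):
ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = 'LARGE';  -- allowed even while the warehouse is running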
Providers can list data on the Marketplace without contacting Snowflake TRUE FALSE
FALSE Explanation Refer Link - https://docs.snowflake.com/en/user-guide/data-marketplace-intro.html If you would like to request new data to be added to the Snowflake Data Marketplace you can click the Is there data you'd like to see on the Marketplace? link in the product UI or use this web registration form to make your request. Snowflake team needs to be consulted to get the data listed on Marketplace.
Information Schema Views will require a warehouse to execute whereas SHOW command does not require one. (TRUE OR FALSE)
TRUE
Multi-factor Authentication can be used for connecting to Snowflake via the Snowflake JDBC driver TRUE FALSE
TRUE
Non-native values such as dates and timestamps are stored as strings when loaded into a VARIANT column TRUE FALSE
TRUE
Only named stages (internal or external) and user stages are supported for COPY transformations (TRUE OR FALSE)
TRUE
Organizing input data by granular path can improve load performance TRUE FALSE
TRUE
Query load is calculated by dividing the execution time (in seconds) of all queries in an interval by the total time (in seconds) for the interval. TRUE FALSE
TRUE
SNOWPIPE REST APIs support both internal and external stage (TRUE OR FALSE)
TRUE
Snowflake automatically stores data in encrypted form in all editions (TRUE OR FALSE)
TRUE
Snowflake can carry out transformations after loading files staged by partner software (ELT) (TRUE OR FALSE)
TRUE
Snowflake has scalar and tabular user-defined functions (UDFs), with support for both SQL and JavaScript. (TRUE OR FALSE)
TRUE
Snowflake is a SaaS offering TRUE FALSE
TRUE
Snowflake recommends to compress your data files when you are loading large data sets. (TRUE OR FALSE)
TRUE
Snowflake supports multi-factor authentication (MFA) to provide increased login security for users connecting to Snowflake. At this time, Snowflake users are not automatically enrolled in MFA. (TRUE OR FALSE)
TRUE
Snowflake supports specifying a SELECT statement instead of a table in the COPY INTO <location> command (TRUE OR FALSE)
TRUE
Snowpipe continuously loads micro-batches of data from an external stage location (Amazon S3, Google Cloud Storage, or Microsoft Azure) into a staging table. TRUE FALSE
TRUE
Snowpipe is a serverless function (TRUE OR FALSE)
TRUE
Streams cannot track changes in materialized views TRUE FALSE
TRUE
Streams ensure exactly once semantics for new or changed data in a table (TRUE OR FALSE)
TRUE
The Kafka connector continuously loads records from one or more Apache Kafka topics into an internal (Snowflake) stage and then into a staging table using Snowpipe. TRUE FALSE
TRUE
The Snowflake Web UI data loading wizard is only intended for loading small numbers of files of limited size (up to 50 MB). (TRUE OR FALSE)
TRUE
The VALIDATION_MODE parameter does not support COPY statements that transform data during a load. (TRUE OR FALSE)
TRUE
The amount of storage used by failsafe can be viewed under Account->Billing & Usage section TRUE FALSE
TRUE
The interactions with data are initialized through the services layer TRUE FALSE
TRUE
The table functions in INFORMATION_SCHEMA can be used to return account-level usage and historical information for storage, warehouses, user logins, and queries (TRUE OR FALSE)
TRUE
To help avoid exceeding the rate limit (error code 429), snowflake recommends relying more heavily on insertReport than loadHistoryScan (TRUE OR FALSE)
TRUE
Two virtual warehouses can access the same data at the same time without any contention issues TRUE FALSE
TRUE
When a network policy includes values in both the allowed and blocked IP address lists, Snowflake applies the blocked IP address list first TRUE FALSE
TRUE
When cloning a table, if the COPY GRANTS keywords are not included in the CREATE <object> statement, then the new object does not inherit any explicit access privileges granted on the original table but does inherit any future grants defined for the object type in the schema TRUE FALSE
TRUE
When staging uncompressed files in a Snowflake stage, the files are automatically compressed using gzip, unless compression is explicitly disabled. (TRUE OR FALSE)
TRUE
When you clone a table, it does not include the load history of the source table, and hence the data files that were already loaded in source table can be reloaded in the clone table TRUE FALSE
TRUE
While sharing data, consumer account and provider account must be in the same region (TRUE OR FALSE)
TRUE
With respect to snowflake security, Tri-secret requires that customers manage their keys TRUE FALSE
TRUE
There are two types of named stages: internal and external. An external stage will always require a cloud storage provider TRUE FALSE
TRUE
You need snowflake instances in different regions. You will need to have separate account for each region (TRUE OR FALSE)
TRUE
A warehouse can be assigned to a single resource monitor only (TRUE OR FALSE)
TRUE Note: A single monitor can be set at the account level to control credit usage for all warehouses in your account. In addition, a monitor can be assigned to one or more warehouses, thereby controlling the credit usage for each assigned warehouse. Note, however, that a warehouse can be assigned to only a single resource monitor.
How do you set a return value in a task? create task set_return_value warehouse=return_task_wh schedule='1 minute' as call system$set_return_value('The quick brown fox jumps over the lazy dog'); create task set_return_value warehouse=return_task_wh schedule='1 minute' as call system$set_return_code('The quick brown fox jumps over the lazy dog'); create task set_return_value warehouse=return_task_wh schedule='1 minute' as call set_return_value('The quick brown fox jumps over the lazy dog');
create task set_return_value warehouse=return_task_wh schedule='1 minute' as call system$set_return_value('The quick brown fox jumps over the lazy dog');
To provide failover protection, how many availability zones does snowflake replicate to one two three as configured by user
three
You want to get the DDL statement of a snowflake table. What is the command that you will use? select get_ddl('table', 'mydb.public.salesorders'); show table 'mydb.public.salesorders'; show table like 'mydb.public.salesorders';
select get_ddl('table', 'mydb.public.salesorders');
Which of the below statement will you use to recreate a specified object? A) GET_DDL B) GET_SCHEMA C) SHOW SCHEMA
A) GET_DDL
Which two options mentioned below will you choose to concatenate a column named 'FRUIT_DESCRIPTION' between two % signs? '%'||FRUIT_DESCRIPTION||'%' CONCAT('%',CONCAT(FRUIT_DESCRIPTION,'%')) %+FRUIT_DESCRIPTION+% SUBSTR(%,FRUIT_DESCRIPTION,%)
'%'||FRUIT_DESCRIPTION||'%' CONCAT('%',CONCAT(FRUIT_DESCRIPTION,'%'))
Which command is used to download data from Snowflake stage PUT GET COPY DUMP
GET
An empty table with no micro-partition will have a clustering depth of 1 0 -1 99
0 Note: The clustering depth for a populated table measures the average depth (1 or greater) of the overlapping micro-partitions for specified columns in a table. The smaller the average depth, the better clustered the table is with regards to the specified columns. Clustering depth can be used for a variety of purposes, including: 1. Monitoring the clustering "health" of a large table, particularly over time as DML is performed on the table. 2. Determining whether a large table would benefit from explicitly defining a clustering key. A table with no micro-partitions (i.e. an unpopulated/empty table) has a clustering depth of 0.
How much does Snowpipe charge for the files in the queue? 0.06 credits per 100 files queued No charges for files in Queue 0.06 credits per 10000 files queued 0.06 credits per 1000 files queued
0.06 credits per 1000 files queued https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare.html#:~:text=Continuous%20Data%20Loads%20(i.e.%20Snowpipe)%20and%20File%20Sizing,-Snowpipe%20is%20designed&text=This%20overhead%20increases%20in%20relation,credits%20per%201000%20files%20queued.
What is the recommended maximum data size limitations for Parquet files? 100 MB 10 MB 1 GB 3 GB
1 GB Note: Currently, data loads of large Parquet files (e.g. greater than 3 GB) could time out. Split large files into files 1 GB in size (or smaller) for loading.
What is the maximum number of child tasks that a task can have? 50 100 1000 10
100
What is the recommended file size to optimize the number of parallel operations for a load? 100-250 MB (Compressed) 1 MB to 10 MB (Compressed) None of the above 50 MB to 150 MB (Compressed)
100-250 MB (Compressed) The number of load operations that run in parallel cannot exceed the number of data files to be loaded. To optimize the number of parallel operations for a load, we recommend aiming to produce data files roughly 100-250 MB (or larger) in size compressed. Refer Link - https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare.html
What is the recommended file size to optimize the number of parallel operations for a load? 100-250 MB(or larger) in size compressed 10 MB to 100 MB in size un compressed. less than 1 MB compressed
100-250 MB(or larger) in size compressed Note: https://docs.snowflake.com/en/user-guide/data-load-considerations-prepare.html#file-sizing-best-practices-and-limitations
What is the size limit on the variant data type in snowflake 16 MB 64 MB 32 MB
16 MB
What is the size limit imposed by VARIANT data type. 100 MB compressed 16 MB compressed 16 MB uncompressed 100 MB compressed
16 MB compressed
You are using a 2XL warehouse which has 32 nodes and 8 cores in each node. When you submit a query to this warehouse, snowflake will try to split the query into how many partitions? (Provided that the query is eligible for splitting) 256 32 8
256 Note: partitions = cores x nodes
If a transaction is running on a session and the session disconnects abruptly, the transaction will be in a detached state. It will not allow the transaction to commit or roll back. The locks on the resources will also not be released. If you do not do anything and let snowflake abort the transaction, how long will you need to wait? 4 hours 15 minutes 1 hour
4 hours Note: If a transaction is running in a session and the session disconnects abruptly, preventing the transaction from committing or rolling back, the transaction is left in a detached state, including any locks that the transaction is holding on resources. If this happens, you might need to abort the transaction. To abort a running transaction, the user who started the transaction or an account administrator can call the system function, SYSTEM$ABORT_TRANSACTION. If the transaction is left open, Snowflake typically rolls back the transaction after it has been idle for four hours. https://docs.snowflake.com/en/sql-reference/transactions.html#aborting-transactions
After how many days does the load metadata expire 30 60 64
64
True or False: A table in Snowflake can contain both a variant column and a timestamp
True
True or False: The warehouse cache may be reset if a running warehouse is suspended and then resumed.
True
Which of the following DML commands isn't supported by Snowflake? A. UPSERT B. MERGE C. UPDATE D. TRUNCATE TABLE
A
Please look at the below two queries A. SELECT * FROM TABLE1 WHERE YEAR = 2015 AND UPPER(COL1) = 'COLVAL'; B. SELECT * FROM TABLE1 WHERE YEAR = 2015 AND COL1 ='COLVAL'; Which one of these will scan more partitions? Note: COL1 stores values both in upper and lower case A B Both will scan the same number of partitions
A Note: This probably will not come up in the certification exam, but it is important to know when you are working on a project. Snowflake stores the metadata for each column in a metadata services layer, which includes the range of values of the columns in each micro-partition. In this case, the range of values stored in the metadata layer for COL1 is stored in its raw form, which is a mix of upper and lower case. Because of this, when you apply the UPPER function, the metadata services layer cannot use the stored range information to prune micro-partitions, so more partitions are scanned.
Which of the following are true of Multi-Cluster warehouses? A. Adds clusters automatically based on query activity B. Sizes each cluster optimally based on the queries C. Scales down when query activity slows down D. Multi-cluster warehouses will never auto-suspend
A & C
Which of the following statements apply to Snowflake in terms of security? A. Snowflake leverages a role-based access control model B. Snowflake requires you to configure an IAM user in order to connect to the database C. All data in Snowflake is encrypted D. Snowflake can run within a customer's VPC
A & C
Select all of the answers that describe SF micro-partitions. Micro-partitions A. Are the physical data files that comprise SF's logical tables B. Are written to a customer configured cloud storage location C. Will be updated as their row & column values are updated D. Enable horizontal and vertical query pruning
A & D
Which of the following are TRUE about SF data sharing? A. Consumers can query shared tables in the same query as their own tables B. Producers can share data to other databases such as Postgres and MySQL C. You don't need a SF virtual WH to query shared data D. Data sharing is integrated with role-based access control
A & D
What is the recommended approach for making a variant column accessible in a BI tool A pre-defined mapping A view Leveraging a json parser BI tool cannot access json
A view
You have a warehouse. It ran for 62 seconds and then was suspended. After that it resumed and ran for 20 seconds. For how many seconds will you be billed A) 122 B) 20 C) 92
A) 122 Note: When a virtual warehouse starts (or resumes), the first minute is always charged; after that, billing is per second. In this example: 62 seconds for the first run, plus the 60-second minimum on resume, equals 122 seconds. Warehouses are only billed for credit usage when they are running. When a warehouse is suspended, it does not accrue any credit usage. Credits are billed per-second, with a 60-second (i.e. 1-minute) minimum: each time a warehouse is started or resized to a larger size, the warehouse is billed for 1 minute's worth of usage. After 1 minute, all subsequent billing is per-second. Stopping and restarting a warehouse within the first minute does not change the amount billed; the minimum billing charge is 1 minute. When a warehouse is increased in size, credits are billed only for the additional servers that are provisioned. For example, changing from Small (2) to Medium (4) results in billing charges for 1 minute's worth of 2 credits. https://docs.snowflake.com/en/user-guide/credits.html#virtual-warehouse-credit-usage
Which character below identifies a table stage A) @% B) @ C) %
A) @%
Which character combination identifies a user stage A) @~ B) ~ C) @
A) @~
Please select the ones that are true for a Snowflake task A) A task can execute a single SQL statement B) A task can execute a call to a stored procedure C) A task can execute multiple SQL statements D) A task can execute a function
A) A task can execute a single SQL statement B) A task can execute a call to a stored procedure
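Note: a minimal task sketch that calls a stored procedure on a schedule (the warehouse, task and procedure names are hypothetical):
CREATE OR REPLACE TASK nightly_merge
  WAREHOUSE = etl_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  CALL merge_daily_sales();
ALTER TASK nightly_merge RESUME;  -- tasks are created in a suspended state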
What can you expect if the filters specified in an INFORMATION_SCHEMA query are not sufficiently selective? A) AN ERROR B) WILL SHOW RESULTS WITH A WARNING C) WILL SWITCH TO A LARGE WAREHOUSE
A) AN ERROR
If you logon to snowflake using https://mycompany.snowflakecomputing.com/, then what is the cloud provider and region you are logging into? A) AWS and us-west-2 B) Azure and us-east-1 C) AWS and us-east-2
A) AWS and us-west-2 Note: If an account has just the account name followed by snowflakecomputing.com, it means that the cloud provider is AWS and the region is us-west-2. This is the first cloud provider and region where snowflake was deployed
Which role or privileges are required to view TASK_HISTORY? A) Account administrator (i.e. users with the ACCOUNTADMIN role) B) Task owner (i.e. role that has the OWNERSHIP privilege on a task) C) Any role that has the global MONITOR EXECUTION privilege D) Any one with SYSADMIN role
A) Account administrator (i.e. users with the ACCOUNTADMIN role) B) Task owner (i.e. role that has the OWNERSHIP privilege on a task) C) Any role that has the global MONITOR EXECUTION privilege
Which roles can use SQL to view the task history within a specified date range? A) Account administrator (i.e. users with the ACCOUNTADMIN role). B) Task owner (i.e. role that has the OWNERSHIP privilege on a task). C) Any role that has the global MONITOR EXECUTION privilege. D) SECURITYADMIN
A) Account administrator (i.e. users with the ACCOUNTADMIN role). B) Task owner (i.e. role that has the OWNERSHIP privilege on a task). C) Any role that has the global MONITOR EXECUTION privilege.
What is the storage hierarchy in snowflake? A) Account->Database->Schemas->Objects B) Organization->folder->project->resource C) Database->Schemas->Objects
A) Account->Database->Schemas->Objects
Select all options that are true for ORDER BY in snowflake A) All data is sorted according to the numeric byte value of each character in the ASCII table. UTF-8 encoding is supported B) For numeric values, leading zeros before the decimal point and trailing zeros (0) after the decimal point have no effect on sort order. C) Unless specified otherwise, NULL values are considered to be higher than any non-NULL values. As a result, the ordering for NULLS depends on the sort order: 1. If the sort order is ASC, NULLS are returned last; to force NULLS to be first, use NULLS FIRST. 2. If the sort order is DESC, NULLS are returned first; to force NULLS to be last, use NULLS LAST.
A) All data is sorted according to the numeric byte value of each character in the ASCII table. UTF-8 encoding is supported B) For numeric values, leading zeros before the decimal point and trailing zeros (0) after the decimal point have no effect on sort order. C) Unless specified otherwise, NULL values are considered to be higher than any non-NULL values. As a result, the ordering for NULLS depends on the sort order: 1. If the sort order is ASC, NULLS are returned last; to force NULLS to be first, use NULLS FIRST. 2. If the sort order is DESC, NULLS are returned first; to force NULLS to be last, use NULLS LAST.
You have a schema with a table and a stream. You cloned the schema. What will happen to the stream A) Any unconsumed records in the stream (in the clone) are inaccessible B) Any unconsumed records in the stream (in the clone) will also be accessible in the clone C) Any unconsumed records in the stream (in the clone) will automatically flow into the table
A) Any unconsumed records in the stream (in the clone) are inaccessible
To download files from the stage/location loaded through COPY INTO <LOCATION> command, you will use A) GET B) UNLOAD C) COPY INTO
A) GET
How is the data storage cost computed for snowflake? A) Based on amount of compressed data B) Based on amount of un-compressed data C) Based on daily average of data stored D) Amount stored on first day of month E) Amount stored on last day of month
A) Based on amount of compressed data C) Based on daily average of data stored Note: Usage for data storage is calculated on the daily average amount of data (in bytes) stored in the system for: 1. Files staged for bulk data loading/unloading (can be stored compressed or uncompressed). 2. Database tables, including historical data for Time Travel (always compressed by Snowflake). 3. Fail-safe for database tables (always compressed by Snowflake). 4. Clones of database tables that reference data deleted in the table that owns the clones. Please look at point #3 above. Snowflake has a storage cost for fail-safe. Now by this time you may already know that Transient tables do not have fail-safe. So for use cases where it makes sense to use transient tables, you can choose transient table to save some costs. Note that storage costs are not that much so weigh in on what you are gaining by going with transient table.
Which one of the below requires a user-specified warehouse to execute COPY statements A) Bulk data load B) Snowpipe C) Both
A) Bulk data load
You can specify the below action to perform when an error is encountered while loading data from a file A) CONTINUE B) SKIP_FILE C) SKIP_FILE_<num> D) SKIP_FILE_<num>% E) ABORT_STATEMENT F) ERROR_STATEMENT
A) CONTINUE B) SKIP_FILE C) SKIP_FILE_<num> D) SKIP_FILE_<num>% E) ABORT_STATEMENT
What will you use to query Snowflake data loading history for the last 365 days (1 year)? A) COPY_HISTORY View B) LOAD_HISTORY View from INFORMATION SCHEMA C) INSERT_HISTORY View
A) COPY_HISTORY View Note: COPY_HISTORY View This Account Usage view can be used to query Snowflake data loading history for the last 365 days (1 year). The view displays load activity for both COPY INTO <table> statements and continuous data loading using Snowpipe. The view avoids the 10,000 row limitation of the LOAD_HISTORY View. https://docs.snowflake.com/en/sql-reference/account-usage/copy_history.html#copy-history-view LOAD_HISTORY View from Information schema This Information Schema view enables you to retrieve the history of data loaded into tables using the COPY INTO <table> command within the last 14 days. The view displays one row for each file loaded. https://docs.snowflake.com/en/sql-reference/info-schema/load_history.html#load-history-view Please also read about the LOAD_HISTORY view from ACCOUNT USAGE schema https://docs.snowflake.com/en/sql-reference/account-usage/load_history.html#load-history-view
Which of the below will you consider while choosing a cluster key A) Columns that are most actively used in selective filters B) Columns frequently used in join predicates C) Columns with extremely high cardinality D) Columns with extremely low cardinality
A) Columns that are most actively used in selective filters B) Columns frequently used in join predicates
Which of the below statements are true about snowflake data sharing? A) Consumers can query shared tables in the same query as their own tables B) Producers can share data to other databases such as Postgres and MySQL C) You don't need a Snowflake Virtual Warehouse to query shared data D) Data Sharing is integrated with role-based access controls
A) Consumers can query shared tables in the same query as their own tables D) Data Sharing is integrated with role-based access controls Note: https://docs.snowflake.com/en/user-guide/data-sharing-intro.html#how-does-secure-data-sharing-work
Which role permissions are required to create and manage streams ? Please specify the OBJECT and the PERMISSION. A) DATABASE - USAGE B) SCHEMA - USAGE, CREATE STREAM C) TABLE - SELECT D) TABLE - DELETE
A) DATABASE - USAGE B) SCHEMA - USAGE, CREATE STREAM C) TABLE - SELECT Note: https://docs.snowflake.com/en/user-guide/streams.html#required-access-privileges
Which type of data incur snowflake storage cost? A) Data stored in permanent tables B) Data retained to enable data recovery(fail-safe and time travel) C) Cached results D) Semi structured data stored in the cloud storage(like AWS S3, GCS)
A) Data stored in permanent tables B) Data retained to enable data recovery(fail-safe and time travel) Note: Storage cost is charged for below type of data 1. Files stored in Snowflake locations (i.e. user and table stages or internal named stages) for bulk data loading/unloading. The user who stages a file can choose whether or not to compress the file to reduce storage. 2. Data stored in database tables, including historical data maintained for Time Travel. Snowflake automatically compresses all data stored in tables and uses the compressed file size to calculate the total storage used for an account. 3. Historical data maintained for Fail-safe.
What is the best way to get the latest ODBC connector for use with Snowflake? A) Download it from snowflake web UI B) Search google and download any version of ODBC C) Compile in .NET
A) Download it from snowflake web UI
Column level security in Snowflake allows the application of a masking policy to a column within a table or view. Which two features are related to column level security A) Dynamic data masking B) External tokenization C) Data loss prevention(DLP)
A) Dynamic data masking B) External tokenization
Which of the below are true with respect to snowflake web ui? A) Each worksheet can have different role, warehouse, schema and database B) Each worksheet is a separate session C) You will be able to run a 'PUT' command from worksheet D) You must use the same role and warehouse for all worksheets
A) Each worksheet can have different role, warehouse, schema and database B) Each worksheet is a separate session
By default, result reuse is enabled in Snowflake. If you want to disable it, what will you do? A) Execute ALTER SESSION SET USE_CACHED_RESULT=FALSE B) Execute ALTER SESSION SET QUERY_CACHE=FALSE C) Execute ALTER SESSION SET QUERY_RESULT_CACHE =FALSE
A) Execute ALTER SESSION SET USE_CACHED_RESULT=FALSE
How will you know if a stream has become stale? A) Execute DESCRIBE STREAM or SHOW STREAMS B) Execute SHOW TABLE STREAMS C) Execute SHOW PIPES
A) Execute DESCRIBE STREAM or SHOW STREAMS
Following transformations are not supported in a COPY command. Select three. A) FLATTEN B) JOIN C) GROUP BY D) NVL
A) FLATTEN B) JOIN C) GROUP BY
A stream stores data in the same shape as the source table with some additional columns. Which are those additional columns? A) METADATA$ACTION B) METADATA$ISUPDATE C) METADATA$ROW_ID D) METADATA$COLUMN_ID
A) METADATA$ACTION B) METADATA$ISUPDATE C) METADATA$ROW_ID
Which are the two metadata columns for staged files A) METADATA$FILENAME B) METADATA$FILEFORMAT C) METADATA$FILE_ROW_NUMBER
A) METADATA$FILENAME C) METADATA$FILE_ROW_NUMBER
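Note: a hedged sketch of selecting the metadata columns alongside staged data (the stage and file format names are hypothetical):
SELECT METADATA$FILENAME, METADATA$FILE_ROW_NUMBER, t.$1, t.$2
FROM @my_int_stage (FILE_FORMAT => 'myformat') t;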
Which of the below services are provided by the cloud services layer? A) Metadata management B) Authentication C) Infrastructure management D) Query execution
A) Metadata management B) Authentication C) Infrastructure management
Zero-copy cloning in snowflake is powered by which service? A) Metadata store of the service layer B) SSD cache of Virtual warehouse C) Query result cache
A) Metadata store of the service layer
Using COPY INTO <location> command, you can unload data from a table (or query) into the below locations. A) Named internal stage (or table/user stage) B) Named external stage C) External location (Amazon S3, Google Cloud Storage, or Microsoft Azure) D) Local drive
A) Named internal stage (or table/user stage) B) Named external stage C) External location (Amazon S3, Google Cloud Storage, or Microsoft Azure)
You will use this parameter to specify the number of threads to use for downloading the files using GET command A) PARALLEL = <INTEGER> B) DOWNLOAD_PARALLEL=<INTEGER> C) DUMP=<INTEGER>
A) PARALLEL = <INTEGER>
What table types are available in snowflake A) PERMANENT B) TEMPORARY C) TRANSIENT D) EXTERNAL E) INTERNAL
A) PERMANENT B) TEMPORARY C) TRANSIENT D) EXTERNAL
COPY does not purge loaded files from the location. What option will you specify to purge files after successful load of a table from table stage A) PURGE=TRUE B) DELETE=TRUE C) REMOVE=TRUE D) TRUNCATE=TRUE
A) PURGE=TRUE
Following commands cannot be executed from worksheets A) PUT B) GET C) SHOW D) LIST <STAGE>
A) PUT B) GET
What types of view are available in snowflake? A) REGULAR B) SECURE C) MATERIALIZED D) HASH VIEW E) TABLE VIEW
A) REGULAR B) SECURE C) MATERIALIZED Note: REGULAR views are the default view type; the underlying DDL is accessible to any role that has access to the view. SECURE views are those where the query definition and details are accessible to authorized users only; note that since the underlying DDL is not visible to the optimizer, this view type can be slower than a normal view. MATERIALIZED views are more like tables; the results are stored as they would be in a table, and Snowflake can automatically refresh a materialized view. Take extra care while using materialized views, as they may increase your cost.
In your organization, you have Snowflake Enterprise edition. You notice that your queries are consistently getting queued on the warehouses and delaying your ETL process. What are the possible solution options that you can think of? A) Resize the warehouse B) Use multi-cluster warehouse C) Set the auto-resize parameter of the warehouse D) Contact Snowflake support to increase the number of servers for the warehouse
A) Resize the warehouse B) Use multi-cluster warehouse
Which command will you run to list all users and roles to which a role has been granted. A) SHOW GRANTS OF ROLE <role_name> B) SHOW GRANTS TO ROLE <role_name> C) SHOW GRANTS ON ACCOUNT
A) SHOW GRANTS OF ROLE <role_name>
This command can be used to list streams for the current/specified database or schema, or across your entire account. A) SHOW STREAMS B) DISPLAY STREAMS C) LIST STREAMS
A) SHOW STREAMS
Which feature in snowflake allows you to load data in micro batches? A) SNOWPIPE B) COPY INTO COMMAND C) LOAD COMMAND
A) SNOWPIPE
Which SNOWFLAKE object stores a generated identity and access management (IAM) entity for your external cloud storage, along with an optional set of allowed or blocked storage locations (Amazon S3, Google Cloud Storage, or Microsoft Azure)? A) STORAGE INTEGRATION B) STORAGE GROUP C) STORAGE CLUSTERS
A) STORAGE INTEGRATION Note: A storage integration is a Snowflake object that stores a generated identity and access management (IAM) entity for your external cloud storage, along with an optional set of allowed or blocked storage locations (Amazon S3, Google Cloud Storage, or Microsoft Azure). Cloud provider administrators in your organization grant permissions on the storage locations to the generated entity. This option allows users to avoid supplying credentials when creating stages or when loading or unloading data
Which parameter will you use while creating a STORAGE integration to limit external stages that use the integration to reference one or more storage locations. A) STORAGE_ALLOWED_LOCATIONS B) STORAGE_ALLOWED_SITES C) STORAGE_OBJECTS_ALLOWED_LOCATIONS
A) STORAGE_ALLOWED_LOCATIONS
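Note: a sketch of a storage integration restricted to specific locations (the IAM role ARN and bucket paths are placeholders):
CREATE STORAGE INTEGRATION s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::000000000000:role/my_snowflake_role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://mybucket/load/', 's3://mybucket/unload/');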
Which system functions are available in snowflake to view/monitor the clustering metadata for a table A) SYSTEM$CLUSTERING_DEPTH B) SYSTEM$CLUSTERING_INFORMATION (including clustering depth) C) CLUSTERING_DEPTH_VIEW
A) SYSTEM$CLUSTERING_DEPTH B) SYSTEM$CLUSTERING_INFORMATION (including clustering depth)
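Note: example calls (table and column names are hypothetical):
SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(region, order_date)');
SELECT SYSTEM$CLUSTERING_DEPTH('sales', '(region, order_date)');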
The current offset for a stream can be determined by querying which function? A) SYSTEM$STREAM_GET_TABLE_TIMESTAMP B) SYSTEM$DISABLE_BEHAVIOR_CHANGE_BUNDLE C) SYSTEM$TASK_DEPENDENTS_ENABLE D) SYSTEM$USER_TASK_CANCEL_ONGOING_EXECUTIONS
A) SYSTEM$STREAM_GET_TABLE_TIMESTAMP
What are the deployment options available for Snowflake? A) Snowflake hosted accounts on AWS B) Hybrid on-premise and cloud C) On-premise D) Snowflake hosted accounts on Azure
A) Snowflake hosted accounts on AWS D) Snowflake hosted accounts on Azure
What are the usual data loading steps in snowflake? A) Source -> Snowflake Stage -> Snowflake table B) Source -> Cloud storage -> Snowflake stage -> Snowflake table C) Source -> Snowflake temp table -> Snowflake transient table-> Snowflake permanent table
A) Source -> Snowflake Stage -> Snowflake table
What are the various types of table streams? A) Standard B) Append-only C) Insert-only D) Upsert
A) Standard B) Append-only C) Insert-only Note: Standard A standard (i.e. delta) table stream tracks all DML changes to the source table, including inserts, updates, and deletes (including table truncates). This stream type performs a join on inserted and deleted rows in the change set to provide the row level delta. As a net effect, for example, a row that is inserted and then deleted between two transactional points of time in a table is removed in the delta (i.e. is not returned when the stream is queried). Append-only An append-only table stream tracks row inserts only. Update and delete operations (including table truncates) are not recorded. For example, if 10 rows are inserted into a table and then 5 of those rows are deleted before the offset for an append-only stream is advanced, the stream records 10 rows. An append-only stream returns the appended rows only and therefore can be much more performant than a standard stream for extract, load, transform (ELT) and similar scenarios that depend exclusively on row inserts. For example, the source table can be truncated immediately after the rows in an append-only stream are consumed, and the record deletions do not contribute to the overhead the next time the stream is queried or consumed. Insert-only Supported on external tables only. An insert-only stream tracks row inserts only; they do not record delete operations that remove rows from an inserted set (i.e. no-ops). For example, in-between any two offsets, if File1 is removed from the cloud storage location referenced by the external table, and File2 is added, the stream returns records for the rows in File2 only. Unlike when tracking CDC data for standard tables, Snowflake cannot access the historical records for files in cloud storage.
What are the snowflake editions available as of today? A) Standard Edition,Enterprise Edition,Business Critical Edition,Virtual Private Snowflake B) Standard Edition,Enterprise Edition,ESD,Business Critical Edition,Virtual Private Snowflake C) Standard Edition,Enterprise Edition
A) Standard Edition,Enterprise Edition,Business Critical Edition,Virtual Private Snowflake
You would like to execute a merge statement in snowflake every one hour. If you want to schedule this query using a snowflake object what will you use? A) TASK B) STREAM C) SNOWPIPE
A) TASK
One of your team members accidentally truncated a table in your Snowflake database. Which feature of Snowflake will you use to restore data into the table? A) TIME TRAVEL B) FAIL SAFE C) UNDROP TABLE
A) TIME TRAVEL
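Note: a hedged sketch of recovering truncated rows with Time Travel (the table name and offset are illustrative):
INSERT INTO employee SELECT * FROM employee AT (OFFSET => -3600);  -- state from one hour ago
-- or keep the recovered data in a separate table:
CREATE OR REPLACE TABLE employee_restored CLONE employee AT (OFFSET => -3600);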
Which of the following object types are stored within schema A) Tables B) Views C) User defined functions D) File formats E) Sequences F) Roles
A) Tables B) Views C) User defined functions D) File formats E) Sequences
After you have granted a share to a consumer, you added more objects to the share. You have provided the required access by using the GRANT <privilege> ... TO SHARE command. Please select the true statement A) The added objects will be accessible to the consumers immediately B) You will need to recreate the share
A) The added objects will be accessible to the consumers immediately Note: objects granted to the share flow through to consumers immediately; the consumer cannot change access on the share
You are trying to set a variable using the SET command: SET MY_VARIABLE = 'XXXX'. You got an error as below: Assignment to 'MY_VARIABLE' not done because value exceeds size limit of variables. Its size is 312; the limit is 256. What is the reason for this error? A) The size of string or binary variables is limited to 256 bytes B) This is a temporary error due to insufficient memory in VM instance C) Your warehouse is not big enough to accommodate this
A) The size of string or binary variables is limited to 256 bytes
When the owner role of a given task (i.e. the role with the OWNERSHIP privilege on the task) is deleted, what happens to the task? A) The task is "re-possessed" by the role that dropped the owner role B) The task is deleted immediately C) The task is suspended immediately
A) The task is "re-possessed" by the role that dropped the owner role
Which of the below mentioned clustering metadata for the micro-partitions is maintained by snowflake in a table? A) The total number of micro-partitions that comprise the table B) The number of micro-partitions containing values that overlap with each other C) The depth of the overlapping micro-partitions D) None of the above
A) The total number of micro-partitions that comprise the table B) The number of micro-partitions containing values that overlap with each other C) The depth of the overlapping micro-partitions
From a Snowflake perspective, what is common about Fivetran, Informatica, Matillion, Segment, Stitch and Talend? A) They are all Snowflake data integration partners B) They are all Snowflake competitors C) They are all programming interface patterns
A) They are all Snowflake data integration partners
You have a multi-threaded client program that interacts with snowflake. If you are following snowflake recommendations what will you do A) Use a separate connection for each thread B) Execute the threads synchronously C) Execute the threads asynchronously D) Use one connection for all the threads
A) Use a separate connection for each thread B) Execute the threads synchronously Note: Although multiple sessions cannot share the same transaction, multiple threads using a single connection share the same session, and thus share the same transaction. This can lead to unexpected results, such as one thread rolling back work done in another thread. This situation can occur when a client application using a Snowflake driver (such as the Snowflake JDBC Driver) or connector (such as the Snowflake Connector for Python) is multi-threaded. If two or more threads share the same connection, then those threads also share the current transaction in that connection. A BEGIN, COMMIT, or ROLLBACK by one thread affects all threads using that shared connection. If the threads are running asynchronously, the results can be unpredictable. Similarly, changing the AUTOCOMMIT setting in one thread affects the AUTOCOMMIT setting in all other threads that use the same connection. Note that changing the state of the AUTOCOMMIT setting commits any existing transaction. Snowflake recommends that multi-threaded client programs do at least one of the following: Use a separate connection for each thread. Note that even with separate connections, your code can still hit race conditions that generate unpredictable output; for example, one thread might delete data before another thread tries to update it. Execute the threads synchronously rather than asynchronously, to control the order in which steps are performed. https://docs.snowflake.com/en/sql-reference/transactions.html#transactions-and-multi-threading
Which of the below semi structured data types are supported by snowflake? A) VARIANT B) OBJECT C) ARRAY D) GEOGRAPHY
A) VARIANT B) OBJECT C) ARRAY Note: Snowflake supports three types of semi structured data types VARIANT - A tagged universal type, which can store values of any other type, including OBJECT and ARRAY, up to a maximum size of 16 MB compressed. OBJECT - Used to represent collections of key-value pairs, where the key is a non-empty string, and the value is a value of VARIANT type. Snowflake does not currently support explicitly-typed objects. ARRAY - Used to represent dense or sparse arrays of arbitrary size, where index is a non-negative integer (up to 2^31-1), and values have VARIANT type. Snowflake does not currently support fixed-size arrays or arrays of elements of a specific non-VARIANT type.
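Note: a minimal sketch of the three semi-structured types in one table (the table name and values are hypothetical):
CREATE OR REPLACE TABLE semi_demo (v VARIANT, o OBJECT, a ARRAY);
INSERT INTO semi_demo
  SELECT PARSE_JSON('{"k": 1}'), OBJECT_CONSTRUCT('name', 'Ana'), ARRAY_CONSTRUCT(1, 2, 3);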
If a query is running slowly and you want to check whether this is due to an overloaded warehouse, where in the Snowflake UI will you go to check? A) Warehouse Tab B) Database Tab C) Query Tab
A) Warehouse Tab
What are the different caching mechanisms that are available in snowflake? A) Warehouse caching B) Index Caching C) Metadata caching D) Query result caching
A) Warehouse caching C) Metadata caching D) Query result caching Note: https://blog.ippon.tech/innovative-snowflake-features-caching/#:~:text=Metadata%20Cache,-Fully%20Managed%20in&text=Snowflake%20automatically%20collects%20and%20manages,Table%20Size%20in%20Bytes
You ran a query in snowflake and it took 2 minute 3 seconds to run, you ran the query again and it returned the results in less than a second. What might have happened? A) When the query ran for the second time, it used the persisted query results from the query result cache B) The query used the meta data cache C) All the partitions were scanned from the SSD cache
A) When the query ran for the second time, it used the persisted query results from the query result cache
Snowpipe is recommended to load data in which of the below scenarios? A) You have a small volume of frequent data B) You have a huge volume of data generated as part of a batch schedule
A) You have a small volume of frequent data
Which of the below are SNOWPIPE REST APIs. Select three. A) insertFiles B) insertReport C) loadHistoryScan D) loadHistoryProgress
A) insertFiles B) insertReport C) loadHistoryScan
When calling the Snowpipe REST endpoints, which of the below authentication methods are supported? A) key pair authentication with JSON Web Token B) user id and password C) Both
A) key pair authentication with JSON Web Token
Which command will you run to view a list of files that have been unloaded to a stage named mystage. A) list @mystage; B) show @mystage; C) display @mystage;
A) list @mystage;
Which permissions are required by Snowflake in AWS S3 to unload data to S3? A) s3:DeleteObject B) s3:PutObject C) s3:ListObject
A) s3:DeleteObject B) s3:PutObject
You have stored the below JSON in a table named car_sales as a variant column { "customer": [ { "address": "San Francisco, CA", "name": "Joyce Ridgely", "phone": "16504378889" } ], "date": "2017-04-28", "dealership": "Valley View Auto Sales", "salesperson": { "id": "55", "name": "Frank Beasley" }, "vehicle": [ { "extras": [ "ext warranty", "paint protection" ], "make": "Honda", "model": "Civic", "price": "20275", "year": "2017" } ] } How will you query the table to get the dealership data? A) select src:dealership from car_sales; B) select src:$dealership from car_sales; C) select dealership from car_sales;
A) select src:dealership from car_sales;
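Extending the same example, a minimal sketch (assuming the variant column is named src, as in the answer above) of traversing nested objects and array elements and casting them to relational types:
select src:salesperson.name::string    as salesperson_name,   -- nested object traversal with a cast
       src:vehicle[0].price::number    as vehicle_price,      -- array element access by index
       src:customer[0].address::string as customer_address
from car_sales;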
Which GCP regions are supported by Snowflake in US and Canada A) us-central1 B) us-central2 C) us-east1
A) us-central1
Which AWS regions are supported by snowflake in US and Canada A) us-west-2 B) us-east-2 C) us-east-1 D) ca-central-1 E) us-central-2
A) us-west-2 B) us-east-2 C) us-east-1 D) ca-central-1 Note: https://docs.snowflake.com/en/user-guide/intro-regions.html#government-regions
If file format options are specified in multiple locations, the load operation applies the options in the following order of precedence. A) 1. COPY INTO TABLE statement. 2. Stage definition. 3. Table definition. B) 1. Stage definition. 2. COPY INTO TABLE statement. 3. Table definition. C) 1. COPY INTO TABLE statement. 2. Table definition. 3. Stage definition.
A) 1. COPY INTO TABLE statement. 2. Stage definition. 3. Table definition.
Snowflake supports which of the following file formats for data loading? A. Parquet B. ORC C. CSV D. PDF
A, B, & C
The Snowflake metadata repository contains which of the following? A. Table definitions B. References to all of the micro partition files for that table C. Tracking of all versions of the table data within the data retention window D. Aggregated data from the table
A, B, & C
Which of the following are true of multi-cluster Warehouses? Select all that apply below. A. A multi-cluster Warehouse can add clusters automatically based on query activity B. A multi-cluster Warehouse can automatically turn itself off after a period of inactivity C. A multi-cluster Warehouse can scale down when query activity slows D. A multi-cluster Warehouse can automatically turn itself on when a query is executed against it
A, B, C, D
Which of the following are types of caching used by SF? A. Warehouse Caching B. Index Caching C. Metadata Caching D. Query result Caching
A, C, & D
On which of the following cloud platform can a Snowflake account be hosted? Choose 2 answers A. Amazon Web Services B. Private Virtual Cloud C. Oracle Cloud D. Microsoft Azure Cloud
A. Amazon Web Services & D. Microsoft Azure Cloud
What command is used to load files into an Internal Stage within Snowflake? A. PUT B. COPY INTO C. TRANSFER D. INSERT
A. PUT
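A minimal sketch of PUT run from SnowSQL (the stage name and local file path are illustrative); by default PUT compresses the local file and uploads it into the named internal stage:
put file:///tmp/data/employees.csv @my_int_stage auto_compress = true;
list @my_int_stage;   -- verify the uploaded file (stored as employees.csv.gz)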
True or False: Snowpipe via REST API can only reference External Stages as source.
A. True
Select the statements which are true for an external table External tables are read-only External tables can be used for query and join operations Views can be created against external tables Data can be updated in external tables
External tables are read-only External tables can be used for query and join operations Views can be created against external tables
The PIPE_EXECUTION_PAUSED parameter can be set at which levels DATABASE ACCOUNT SCHEMA PIPE
ACCOUNT SCHEMA PIPE Note: Complete the following steps to transfer ownership of a pipe: Set the PIPE_EXECUTION_PAUSED parameter to TRUE. This parameter enables pausing or resuming a pipe. The parameter is supported at the following levels: Account Schema Pipe At the pipe level, the object owner (or a parent role in a role hierarchy) can set the parameter to pause or resume an individual pipe. An account administrator (user with the ACCOUNTADMIN role) can set this parameter at the account level to pause or resume all pipes in the account. Likewise, a user with the MODIFY privilege on the schema can pause or resume pipes at the schema level. Note that this larger domain control only affects pipes for which the parameter was not already set at a lower level; e.g., by the owner at the object level. https://docs.snowflake.com/en/user-guide/data-load-snowpipe-manage.html#transferring-pipe-ownership
Which roles can create, alter, or drop network policies? ACCOUNTADMIN SECURITYADMIN SYSADMIN
ACCOUNTADMIN SECURITYADMIN
Who can set up a data share? SECURITYADMIN SYSADMIN ACCOUNTADMIN or a role granted the CREATE SHARES global privilege
ACCOUNTADMIN or a role granted the CREATE SHARES global privilege Note: https://docs.snowflake.com/en/user-guide/data-sharing-provider.html
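A minimal sketch of setting up a share as ACCOUNTADMIN (the database, schema, table, and consumer account names are hypothetical):
create share sales_share;
grant usage on database sales_db to share sales_share;
grant usage on schema sales_db.public to share sales_share;
grant select on table sales_db.public.orders to share sales_share;
alter share sales_share add accounts = partner_account;   -- add the consumer account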
What is the encryption algorithm used by snowflake? AES 256-bit encryption DES BLOWFISH
AES 256-bit encryption
When working with a cloned table, you can use the below SQL statements SELECT DROP SHOW ALL OF THE ABOVE
ALL OF THE ABOVE
You have created a TASK in snowflake. How will you resume it? No need to resume, the creation operation automatically enables the task ALTER TASK mytask1 RESUME; ALTER TASK mytask1 START;
ALTER TASK mytask1 RESUME; Note: After creating a task, you must execute ALTER TASK ... RESUME before the task will run based on the parameters specified in the task definition.
You want to scale out query performance. Which strategy will you follow with respect to your warehouses? Resize the warehouse Add clusters to the warehouse Create another warehouse
Add clusters to the warehouse Note: Scale out = add clusters scale up = make WH larger
Your business team runs a set of identical queries every day after the batch ETL run is complete. From the following actions, what is the best action that you will recommend. After the ETL run, execute the identical queries so that they remain in the result cache After the ETL run, resize the warehouse to a larger warehouse After the ETL run, copy the tables to another schema for the business users to query
After the ETL run, execute the identical queries so that they remain in the result cache Note: Please note the key word here, which is IDENTICAL queries. When a query is run the first time and the same query is run a second time, it picks up the results from the query result cache and does not cost you any compute. The query result cache is valid for 24 hours and it gets extended for another 24 hours every time you access it (even if you access it at 23 hours 59 seconds).
Having separate accounts in snowflake enables users to have Different editions of snowflake and different regions Billing can be done at account level Database object deployment between environments become simpler All of the above
All of the above
Which approach would result in improved performance through linear scaling of data ingestion workload? Resize virtual warehouse Consider practice of organizing data by granular path Consider practice of splitting input file batch within the recommended size of 10 MB to 100 MB All of the above
All of the above
Which best describes zero-copy cloning? Metadata only operation No replication of data Unmodified data stored once and modified data stored as new micro partitions All of the above
All of the above Note: When a clone is created of a table, the clone utilizes no data storage because it shares all the existing micro-partitions of the original table at the time it was cloned; however, rows can then be added, deleted, or updated in the clone independently from the original table. Each change to the clone results in new micro-partitions that are owned exclusively by the clone and are protected through CDP(continuous data protection).
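For illustration, a zero-copy clone is a single metadata-only statement, and the optional AT clause combines cloning with Time Travel (table names and the offset below are illustrative):
create table mytable_dev clone mytable;
-- clone the table as it existed 24 hours ago (86400 seconds)
create table mytable_yesterday clone mytable at (offset => -86400);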
OVERWRITE option for PUT is supported in which cloud providers Amazon AWS Microsoft Azure Google cloud platform
Amazon AWS Microsoft Azure
A Stream has been created on a table. A row in the table got inserted and subsequently got updated. The stream will capture both the events. (TRUE OR FALSE)
FALSE Note: A standard stream records the net change between two transactional points in time, so a row that is inserted and then updated after the stream offset appears as a single insert containing the updated values, not as two separate events.
Select the two true statements about multi cluster warehouse These type of warehouses cannot be resumed or suspended Multi cluster warehouses can be created on snowflake hosted on AWS only As query demand decreases, clusters are removed from this warehouse When query workload increases, clusters are automatically added
As query demand decreases, clusters are removed from this warehouse When query workload increases, clusters are automatically added Note: In Auto-scale mode, a multi-cluster warehouse eliminates the need for resizing the warehouse or starting and stopping additional warehouses to handle fluctuating workloads. Snowflake automatically starts and stops additional clusters as needed. In Maximized mode, you can control the capacity of the warehouse by increasing or decreasing the number of clusters as needed.
Snowflake on GCP is not available in which region currently (as of Year 2020) Europe Asia Pacific North America
Asia Pacific Refer to the link for details of Snowflake availability in different regions - https://docs.snowflake.com/en/user-guide/intro-regions.html
Cloud services layer is one of the layers in snowflake. Which services are provided by this layer Authentication Infrastructure management Metadata management Query parsing and optimization Access control Query processing
Authentication Infrastructure management Metadata management Query parsing and optimization Access control
Which of the following are options when creating a virtual warehouse Auto Suspend Auto Resume Storage Size Server count
Auto Suspend Auto Resume
Which two modes are available in Multi cluster warehouse Auto-scale Peak Scale Flexible Maximized
Auto-scale Maximized
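A sketch of creating a multi-cluster warehouse (names and sizes are illustrative); setting MIN_CLUSTER_COUNT lower than MAX_CLUSTER_COUNT selects Auto-scale mode, while equal values select Maximized mode:
create warehouse reporting_wh
  warehouse_size    = 'MEDIUM'
  min_cluster_count = 1
  max_cluster_count = 4        -- min < max => Auto-scale mode
  scaling_policy    = 'STANDARD'
  auto_suspend      = 60
  auto_resume       = true;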
What are the two mechanisms to detect if new stage file is there in a snowpipe? Automating Snowpipe using cloud messaging Calling Snowpipe REST endpoints Calling the custom APIs exposed through AWS EKS
Automating Snowpipe using cloud messaging Calling Snowpipe REST endpoints
What is the maximum number of consumer accounts that can be added to a Share object? A. One B. Unlimited C. 10 D. 100
B
What is the recommended approach for making a variant column accessible in a BI tool? A. A pre-defined mapping B. A View C. Leveraging a JSON parser D. There is no way to do this
B
What technique does Snowflake use to limit the number of micro-partitions scanned by each query? A. Indexing B. Pruning C. Map Reduce D. B-Tree
B
Which of the following techniques can be used in SF to improve query performance? A. Index Definition B. Cluster Keys/Reclustering C. Distribution Keys D. Query Hints
B
Which type of data integration tools leverage SF's scalable compute for data transformation? A. Data Replication B. ELT C. ETL D. Streaming
B
Load history of Snowpipe expires after how many days. A) 30 B) 14 C) 64
B) 14
You have executed a task. Due to some coding issue, the task went into a loop. How long will the task run before it is terminated by Snowflake? A) 24 hours B) 60 minutes C) 30 minutes
B) 60 minutes
One of your users complained that a query is taking a long time to return any result. What techniques would you consider to improve performance? A) Create an index on the search columns B) Create cluster keys and turn on auto clustering on the table C) Enable query hints on the table D) Define the partition keys for the table
B) Create cluster keys and turn on auto clustering on the table
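A minimal sketch of adding a clustering key and checking how well the table is clustered (the table and column names are hypothetical):
alter table big_sales cluster by (sale_date, region);
-- inspect clustering depth and overlap for the chosen key
select system$clustering_information('big_sales', '(sale_date, region)');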
Tick the statements which are true for fail safe A) Fail safe period is configurable B) Data can be recovered from fail-safe storage only by a Snowflake employee C) Using fail-safe you can create dev/test environments D) Tables with fail-safe turned on incur additional storage costs compared to tables where fail-safe is not turned on
B) Data can be recovered from fail-safe storage only by a Snowflake employee D) Tables with fail-safe turned on incur additional storage costs compared to tables where fail-safe is not turned on
The three key layers of snowflake are A) Extraction, Ingestion, Load B) Database storage, Query Processing, Cloud Services C) Database, Virtual Warehouse, Data Experience
B) Database storage, Query Processing, Cloud Services Note: https://docs.snowflake.com/en/user-guide/intro-key-concepts.html
You are an account administrator and you want to use a third party product in snowflake. Where will you go to enable the third party product? A) The third party product's webpage and contact them through contact us B) Call snowflake support to enable the product C) Enable the product through Partner Connect in Snowflake web console
C) Enable the product through Partner Connect in Snowflake web console
A snowflake account hosted on AWS cannot load data from a file that is staged in GCP or Azure FALSE TRUE
FALSE
Select two true statements about fail safe from the following A) Fail safe is same as time travel B) Fail-safe provides a (non-configurable) 7-day period during which historical data is recoverable by Snowflake C) Fail-safe provides a (non-configurable) 90-day period during which historical data is recoverable by Snowflake D) Fail-safe ensures historical data is protected in the event of a system failure or other catastrophic event, e.g. a hardware failure or security breach
B) Fail-safe provides a (non-configurable) 7-day period during which historical data is recoverable by Snowflake D) Fail-safe ensures historical data is protected in the event of a system failure or other catastrophic event, e.g. a hardware failure or security breach
Which of the below is true about snowflake architecture A) One node shared data B) Multi Cluster Shared data C) Multi Node distributed data
B) Multi Cluster Shared data
What is the default action to perform when an error is encountered while loading data from a file using SNOWPIPE? A) ABORT_STATEMENT B) SKIP_FILE C) CONTINUE
B) SKIP_FILE
Select two choices that are true about snowflake roles A) Snowflake users have a limit on the number of roles that they can assume B) A Snowflake user can have one or more roles C) Privileges can be directly assigned to a user D) For a particular session, only one role can be active at a given time
B) A Snowflake user can have one or more roles D) For a particular session, only one role can be active at a given time
Which object in snowflake records data manipulation language changes made to tables, including inserts, updates, and deletes, as well as metadata change about each change? A) Tasks B) Table stream C) Snowpipe D) Change Data Capture
B) Table stream Note: A stream object records data manipulation language (DML) changes made to tables, including inserts, updates, and deletes, as well as metadata about each change, so that actions can be taken using the changed data. This process is referred to as change data capture (CDC). An individual table stream tracks the changes made to rows in a source table. A table stream (also referred to as simply a "stream") makes a "change table" available of what changed, at the row level, between two transactional points of time in a table. This allows querying and consuming a sequence of change records in a transactional fashion.
When you remove the predecessor of a child task, what may happen to the child task. Select two possibilities? A) The child task is removed from the system B) The child task may become the root task C) The child task may become a stand-alone task
B) The child task may become the root task C) The child task may become a stand-alone task
How often does snowflake release new feature A) Daily B) Weekly C) Monthly D) Annually
B) Weekly
What are the three things customers want most from their enterprise data warehouse solution? Choose 3 answers A. On-premise availability B. Simplicity C. Open source based D. Concurrency E. Performance
B, D, E
What are the two data loading approaches in snowflake BULK LOADING CONTINUOUS LOADING INGEST LOADING
BULK LOADING (COPY command) CONTINUOUS LOADING (Snowpipe)
How would you execute a series of SQL statements using a Task? A. Include the SQL statements in the body of the task: CREATE TASK mytask ... AS INSERT INTO target1 SELECT ... FROM stream_s1 WHERE ...; INSERT INTO target2 SELECT ... FROM stream_s1 WHERE ...; B. A stored procedure can have only one DML statement per stored procedure invocation and therefore you should sequence stored procedure calls in the task definition: CREATE TASK mytask ... AS CALL stored_proc1(); CALL stored_proc2(); C. Create a stored procedure executing multiple SQL statements and invoke the stored procedure from the task: CREATE TASK mytask ... AS CALL stored_proc_multiple_statements_inside(); D. Create a task for each SQL statement (e.g. resulting in task1, task2, etc.) and string the series of SQL statements together by having a control task call task1, task2 sequentially.
C
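A sketch of option C under assumed names (the warehouse, procedure, and schedule are illustrative); the stored procedure wraps the multiple statements, the task simply calls it, and a newly created task must be resumed before it runs:
create or replace task mytask
  warehouse = etl_wh
  schedule  = '60 minute'
as
  call stored_proc_multiple_statements_inside();

alter task mytask resume;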
Which of the following terms best describes Snowflake's database architecture? A. Columnar shared nothing B. Shared disk C. Multi-cluster shared data D. Cloud native shared memory
C
Let's say you executed a transaction in a snowflake session. Due to some reason the session disconnects and the transaction is now in a detached state. The transaction cannot be committed or rolled back. The object on which the transaction was applied is also now locked. If you do not do anything and let snowflake eventually abort the transaction, how long will you need to wait? A) 60 minutes B) 15 minutes C) 4 hours
C) 4 hours
Your customer needs to setup Failover/Failback mechanism in their snowflake instance. Which edition of Snowflake will you suggest? A) STANDARD B) PREMIER C) BUSINESS CRITICAL(OR HIGHER)
C) BUSINESS CRITICAL(OR HIGHER) Note: https://docs.snowflake.com/en/user-guide/database-replication-failover.html
Your customer deals with extremely sensitive data like PHI data. They need to be compliant with HIPAA. What is the minimum edition that you will recommend A) STANDARD B) PREMIUM C) Business Critical Edition D) Virtual Private Snowflake
C) Business Critical Edition
What is the easiest way to monitor the queries that are run on snowflake? A) Create a tableau dashboard and connect to snowflake B) All queries go to cloudwatch and use cloudwatch to monitor C) Click on the History tab and monitor all queries that are executed in the last 14 days
C) Click on the History tab and monitor all queries that are executed in the last 14 days
Pick the true statement for snowflake architecture A) Shared nothing architecture B) Shared disk architecture C) Multi-Cluster Shared Data architecture
C) Multi-Cluster Shared Data architecture Note: Snowflake's architecture is a hybrid of traditional shared-disk database architectures and shared-nothing database architectures. Similar to shared-disk architectures, Snowflake uses a central data repository for persisted data that is accessible from all compute nodes in the data warehouse. But similar to shared-nothing architectures, Snowflake processes queries using MPP (massively parallel processing) compute clusters where each node in the cluster stores a portion of the entire data set locally. This approach offers the data management simplicity of a shared-disk architecture, but with the performance and scale-out benefits of a shared-nothing architecture. https://docs.snowflake.com/en/user-guide/intro-key-concepts.html
What happens when you execute the below query CREATE OR REPLACE TABLE MyTable_V2 CLONE MyTable A) MyTable_V2 is created and all the data from MyTable is copied into MyTable_V2 B) MyTable_V2 is created and a batch job is submitted to copy data from MyTable C) Snowflake creates a new entry in the metadata store to keep track of the new clone. The existing micro partitions of table MyTable are also mapped to the new table. This is called zero copy clone.
C) Snowflake creates a new entry in the metadata store to keep track of the new clone. The existing micro partitions of table MyTable are also mapped to the new table. This is called zero copy clone.
What influences snowflake pricing? A) Amount of data queried from snowflake B) Amount of data scanned during querying snowflake C) Snowflake pricing is based on usage and it charges only for storage and compute
C) Snowflake pricing is based on usage and it charges only for storage and compute
Query processing in snowflake is done by... A) Snowflake processes queries using AWS EMR running spark B) Snowflake processes queries running spark on a EC2 instance C) Snowflake process queries using 'Virtual Warehouses'
C) Snowflake process queries using 'Virtual Warehouses'
How is data loaded into snowflake? A) Snowflake loads the data in parquet format on the underlying cloud storage B) Snowflake loads the data in JSON format on the underlying cloud storage C) Snowflake reorganizes the data into its internal optimized, compressed, columnar format and stores it on the underlying cloud storage
C) Snowflake reorganizes the data into its internal optimized, compressed, columnar format and stores it on the underlying cloud storage
If you want customer dedicated virtual warehouse, which is the lowest snowflake edition that you should opt for A) Business Critical B) Enterprise C) Standard
C) Standard
You are developing an ETL query. one of your ETL logic requires you to store the intermediate data temporarily. This data will be used only by your ETL. The data will not be required outside the ETL. If you want to optimize storage cost, what type of table will you create to store this data A) PERMANENT B) TRANSIENT C) TEMPORARY D) EXTERNAL
C) TEMPORARY
You have created a lot of shares with role as ACCOUNTADMIN. You want to create the shares with SYSADMIN as role, so you have granted 'CREATE SHARE' privilege to SYSADMIN. How do you change the ownership of the existing shares? A) Execute 'GRANT OWNERSHIP ON <SHARE NAME> TO ROLE SYSADMIN' B) Execute 'GRANT OWNERSHIP TO ROLE SYSADMIN ON <SHARE NAME> C) The only option is to drop and recreate the share. Ownership of a share cannot be granted to another role
C) The only option is to drop and recreate the share. Ownership of a share cannot be granted to another role
When a warehouse does not have enough resources available to process the queries, what happens to the incoming queries? A) Queries are aborted immediately B) Snowflake automatically resizes the warehouse C) The queries are queued and then executed when the resources are available again
C) The queries are queued and then executed when the resources are available again
Compute cost in snowflake depends on A) The actual query execution time B) The query execution time and the time the query waits for the resource C) The warehouse size and how long the warehouse runs
C) The warehouse size and how long the warehouse runs
How do you truncate a date (from a timestamp) down to the year, month, and day A) Use concatenation of the date part B) Split the date into year, month and day C) Use date_trunc and execute a query as below select to_date('2015-05-08T23:39:20.123-07:00') as "DATE1", date_trunc('YEAR', "DATE1") as "TRUNCATED TO YEAR", date_trunc('MONTH', "DATE1") as "TRUNCATED TO MONTH", date_trunc('DAY', "DATE1") as "TRUNCATED TO DAY";
C) Use date_trunc and execute a query as below select to_date('2015-05-08T23:39:20.123-07:00') as "DATE1", date_trunc('YEAR', "DATE1") as "TRUNCATED TO YEAR", date_trunc('MONTH', "DATE1") as "TRUNCATED TO MONTH", date_trunc('DAY', "DATE1") as "TRUNCATED TO DAY";
Your customer required the highest level of data protection and needs a completely isolated environment for snowflake. Which edition will you recommend A) STANDARD B) PREMIER C) Virtual Private Snowflake D) Business Critical Edition
C) Virtual Private Snowflake
Compute in snowflake is provided by A) The cloud providers VM instances B) Only EC2 instances on AWS C) Virtual warehouses
C) Virtual warehouses
In your snowflake environment, you have a medium warehouse. The medium warehouse is used by business to run adhoc queries. The warehouse has auto suspend set at 15 minutes. You have noticed that all of the queries that run on this warehouse finishes within a minute. What will you do to optimize cost of compute in this case? A) Since all the queries are completing in a minute, you should delete this warehouse and tell the users to use another existing warehouse B) You will tell the users that they do not need snowflake to run their queries, they should load their data in another on premise database and run query from there C) You will recommend to reduce the auto suspend time to 1 minute
C) You will recommend to reduce the auto suspend time to 1 minute
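The change itself is a one-line statement (the warehouse name is illustrative); AUTO_SUSPEND is specified in seconds:
alter warehouse adhoc_wh set auto_suspend = 60;   -- suspend after 1 minute of inactivity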
A successful response from the SNOWPIPE insertFiles API means that the files are ingested. (TRUE OR FALSE)
FALSE Note: A successful response from insertFiles only means the request was accepted and the files were queued for loading; use the insertReport or loadHistoryScan endpoints to confirm whether the files were actually ingested.
A table in snowflake can only be queried using the virtual warehouse used to load the data TRUE FALSE
FALSE
You are a snowflake architect hired by an organization. They want you to design their warehouse strategy. Which strategy will you pick? A) Both loading and analysis of data will be done by a single warehouse to reduce cost B) You do not need to use a warehouse for loading data, only query analysis will require a warehouse C) You will recommend to use a multi-warehouse strategy. The load workload will be done by one warehouse and the query analysis workload will be done by another warehouse
C) You will recommend to use a multi-warehouse strategy. The load workload will be done by one warehouse and the query analysis workload will be done by another warehouse
When loading data into Snowflake, the COPY command supports: Choose 2 answers A. Joins B. Fitters C. Data type conversions D. Column reordering E. Aggregates
C. Data Type Conversion & D. Column Reordering
The commands to load data into snowflake are COPY AND PUT COPY AND INSERT INSERT AND PUT NONE OF THE ABOVE
COPY AND PUT COPY AND INSERT INSERT AND PUT
Which of the below function returns the name of the warehouse of the current session? ACTIVE_WAREHOUSE() SESSION_WAREHOUSE() CURRENT_WAREHOUSE() WAREHOUSE()
CURRENT_WAREHOUSE()
What makes a Partner Connect Partner different from other partners? Can be connected to Snowflake using a streamlined wizard Requires Enterprise Edition Can be connected from within the WebUI Includes automated role, user and staging database set up Includes a streamlined Partner Trial Account Signup
Can be connected to Snowflake using a streamlined wizard Can be connected from within the WebUI Includes automated role, user and staging database set up Includes a streamlined Partner Trial Account Signup Explanation: Refer to the link for details - https://docs.snowflake.com/en/user-guide/ecosystem-partner-connect.html
In snowflake data storage cost depends on which factors Cloud service provider Region Type of account we chose (on-demand/capacity) Edition
Cloud service provider Region Type of account we chose (on-demand/capacity)
To improve the query performance, which of the below techniques can be used in snowflake Indexes Distribution keys Query hints Cluster keys/Reclustering
Cluster keys/Reclustering
Why would you not recommend a high cardinality column for inclusion in the clustering key? (Select 2) An extremely high cardinality column results in skewed distribution across micro-partitions and hence will impact query pruning High cardinality will lead to low clustering depth, so as you add more data query performance will degrade Cluster maintenance with high cardinality keys is expensive Point lookup queries will only benefit, whereas range queries will not effectively utilize the clustering key
Cluster maintenance with high cardinality keys is expensive Point lookup queries will only benefit, whereas range queries will not effectively utilize the clustering key
Which transformations are available when using the COPY INTO command to load data files into Snowflake from a stage? Filters Aggregates Column data type conversion Column concatenation
Column data type conversion Column concatenation
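A sketch of a COPY INTO load that applies both transformations (the stage, file, and column layout are hypothetical); staged columns are referenced positionally as $1, $2, and so on:
copy into employees (id, full_name)
from (
  select $1::number,            -- data type conversion
         $2 || ' ' || $3        -- column concatenation (first and last name)
  from @my_stage/employees.csv
)
file_format = (type = csv);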
Select the options that differentiates a Partner Connect partner from a regular partner Connect with snowflake through a wizard Includes a partner trial account signup Can be connected from the WEB UI Includes automated role, user and staging database setup None of the above
Connect with snowflake through a wizard Includes a partner trial account signup Can be connected from the WEB UI Includes automated role, user and staging database setup Note: https://docs.snowflake.com/en/user-guide/ecosystem-partner-connect.html#snowflake-partner-connect
Which statements are true about a micro partition in snowflake Contiguous unit of storage Each micro partition contains between 50 MB and 500 MB of uncompressed data Organized in a columnar fashion Micro partitions can be updated using snowflake query
Contiguous unit of storage Each micro partition contains between 50 MB and 500 MB of uncompressed data Organized in a columnar fashion Note: Micro partitions are immutable which means once created they cannot be changed. If a row is updated in snowflake, the micro partition holding the row is copied into a new micro partition and the updated row is inserted in that micro partition. The old micro partition is marked for deletion. All data in Snowflake tables is automatically divided into micro-partitions, which are contiguous units of storage. Each micro-partition contains between 50 MB and 500 MB of uncompressed data (note that the actual size in Snowflake is smaller because data is always stored compressed). Groups of rows in tables are mapped into individual micro-partitions, organized in a columnar fashion. This size and structure allows for extremely granular pruning of very large tables, which can be comprised of millions, or even hundreds of millions, of micro-partitions. Snowflake stores metadata about all rows stored in a micro-partition, including: The range of values for each of the columns in the micro-partition. The number of distinct values. Additional properties used for both optimization and efficient query processing. https://docs.snowflake.com/en/user-guide/tables-clustering-micropartitions.html
What are the three features provided by Snowflake for continuous data pipeline Continuous data loading Change data tracking Recurring tasks Cloud functions Table Pipes
Continuous data loading Change data tracking Recurring tasks
What is the best practice for handling semi-structured data with 3rd party BI tools?
Create a Snowflake view that parses the semi-structured column into structured columns for the BI tool to retrieve
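A sketch of such a view, reusing the earlier car_sales example (the table and its src variant column are assumed from that question); each JSON path is cast to a typed column that a BI tool can consume directly:
create or replace view car_sales_flat as
select src:dealership::string       as dealership,
       src:salesperson.name::string as salesperson,
       src:date::date               as sale_date,
       src:vehicle[0].price::number as price
from car_sales;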
You want to convert an existing permanent table to a transient table (or vice versa) while preserving data and other characteristics such as column defaults and granted privileges. What is the best way to do it? Run an ALTER TABLE command to convert the tables Unload the data from the existing table into a CSV file. Create the new table and then load the data back in Create a new table and use the COPY GRANTS clause
Create a new table and use the COPY GRANTS clause
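One possible sketch of converting a permanent table to a transient one while keeping data and grants (the table names are hypothetical); cloning into a transient table with COPY GRANTS preserves both in a single statement:
create transient table employee_t clone employee copy grants;
-- once the new table is verified, rename or drop the original
alter table employee rename to employee_old;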
Your business team wants to load JSON data into a table in snowflake. What will you recommend them? Create a table with a variant column and load the JSON data in that column Tell the team to convert JSON to CSV Construct a python program to convert the JSON into CSV and then load the data.
Create a table with a variant column and load the JSON data in that column
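A minimal sketch of that recommendation (the stage and table names are illustrative): create a table with a single VARIANT column and COPY the JSON files into it:
create or replace table raw_json (v variant);

copy into raw_json
from @my_json_stage
file_format = (type = 'json');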
Which of the below features are only available to Business Critical edition and above? Customer-managed encryption keys through Tri-Secret Secure. Support for secure, direct proxy to your other virtual networks or on-premises data centers using AWS PrivateLink or Azure Private Link. Support for PHI data (in accordance with HIPAA regulations). Support for PCI DSS. Extended Time Travel (up to 90 days).
Customer-managed encryption keys through Tri-Secret Secure. Support for secure, direct proxy to your other virtual networks or on-premises data centers using AWS PrivateLink or Azure Private Link. Support for PHI data (in accordance with HIPAA regulations). Support for PCI DSS.
You have cloned a table. Which of the below queries will work on the cloned table? A) DROP TABLE <TABLE_NAME> B) SELECT * FROM <TABLE_NAME> C) SHOW TABLES LIKE '<TABLE_NAME>' D) ALL OF THE ABOVE
D) ALL OF THE ABOVE
When the snowflake staged release happens, which accounts get the updates first A) Enterprise or above B) Trial accounts C) Premier accounts D) Accounts who has opted for early access
D) Accounts who has opted for early access
Which snowflake edition supports private communication between Snowflake and your other VPCs through AWS PrivateLink A) Standard B) Premier C) Enterprise D) Business critical
D) Business critical
Which of the following best describes Snowflake's processing engine? A) Leverages Apache Spark B) Leverages Map Reduce C) A derivative of Presto D) Native SQL
D) Native SQL
For federated authentication and MFA, what is the minimum snowflake edition required? A) Business critical B) Enterprise C) Premier D) Standard
D) Standard
What are the product offerings for secure data sharing? DIRECT SHARE DATA MARKETPLACE DATA EXCHANGE DATA STORE
DIRECT SHARE DATA MARKETPLACE DATA EXCHANGE Note: https://docs.snowflake.com/en/user-guide/data-sharing-product-offerings.html#direct-share
Which of the following is not a layer in SF's Architecture? Storage Cloud Services Data Catalog Virtual Warehouses
Data Catalog
What are the available partner categories in snowflake? Data integration Business Intelligence Security and governance ML and data science Data Warehousing
Data integration Business Intelligence Security and governance ML and data science Note: https://docs.snowflake.com/en/user-guide/ecosystem.html
There is no query performance difference between a column with a maximum length declaration (e.g. VARCHAR(16777216)), and a smaller precision. Still it is recommended to define an appropriate column length because of the below reasons Data loading operations are more likely to detect issues such as columns loaded out of order, e.g. a 50-character string loaded erroneously into a VARCHAR(10) column. Such issues produce errors When the column length is unspecified, some third-party tools may anticipate consuming the maximum size value, which can translate into increased client-side memory usage or unusual behavior Data unloading will be performant if appropriate column lengths are defined
Data loading operations are more likely to detect issues such as columns loaded out of order, e.g. a 50-character string loaded erroneously into a VARCHAR(10) column. Such issues produce errors When the column length is unspecified, some third-party tools may anticipate consuming the maximum size value, which can translate into increased client-side memory usage or unusual behavior
Which of the following conditions are required for sharing data in Snowflake? Note - Here Data Sharing being referred is not via Data Replication. Data providers with ACCOUNTADMIN role can set up shares Consumer accounts must be in same Snowflake region as the provider account Secure views are not required when query performance is priority Each data share must contain a single database
Data providers with ACCOUNTADMIN role can set up shares Consumer accounts must be in same Snowflake region as the provider account Details for setting up a share can be referred at following link - https://docs.snowflake.com/en/user-guide/data-share-providers.html
What is the best way to store semi-structured data?
Data should be loaded and stored in a VARIANT type column
Which layer in snowflake contains data in compressed, columnar format Cloud services Query processing Database storage None
Database storage
Which feature is only available in VPS edition? Dedicated metadata store and pool of virtual servers (used in virtual warehouses). SOC 2 Type II certification. Automatic encryption of all data.
Dedicated metadata store and pool of virtual servers (used in virtual warehouses). Note: https://docs.snowflake.com/en/user-guide/intro-editions.html#business-critical-edition
Each object has an owner, who can in turn grant access to that object. What is this model called? Discretionary Access Control (DAC) Role-based Access Control (RBAC) Object ownership model
Discretionary Access Control (DAC) Note: Snowflake's approach to access control combines aspects from both of the following models: 1. Discretionary Access Control (DAC): Each object has an owner, who can in turn grant access to that object. 2. Role-based Access Control (RBAC): Access privileges are assigned to roles, which are in turn assigned to users. https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#access-control-framework
To avoid unexpected task executions due to daylight saving time, you will do one of the below activities. Do not schedule tasks to run at a specific time between 1 AM and 3 AM (daily, or on days of the week that include Sundays) Manually adjust the cron expression for tasks scheduled during those hours twice each year to compensate for the time change due to daylight saving time Add the USE DAYLIGHT SAVINGS parameter for task
Do not schedule tasks to run at a specific time between 1 AM and 3 AM (daily, or on days of the week that include Sundays) Manually adjust the cron expression for tasks scheduled during those hours twice each year to compensate for the time change due to daylight saving time Note: The cron expression in a task definition supports specifying a time zone. A scheduled task runs according to the specified cron expression in the local time for a given time zone. Special care should be taken with regard to scheduling tasks for time zones that recognize daylight saving time. Tasks scheduled during specific times on days when the transition from standard time to daylight saving time (or the reverse) occurs can have unexpected behaviors. For example: During the autumn change from daylight saving time to standard time, a task scheduled to start at 1 AM in the America/Los_Angeles time zone (i.e. 0 1 * * * America/Los_Angeles) would run twice: once at 1 AM and then again when 1:59:59 AM shifts to 1:00:00 AM local time. That is, there are two points in time when the local time is 1 AM. During the spring change from standard time to daylight saving time, a task scheduled to start at 2 AM in the America/Los_Angeles time zone (i.e. 0 2 * * * America/Los_Angeles) would not run at all because the local time shifts from 1:59:59 AM to 3:00:00 AM. That is, there is no point during that day when the local time is 2 AM.
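For illustration, a task scheduled with a cron expression in a specific time zone, avoiding the 1 AM to 3 AM window discussed above (the task, warehouse, and procedure names are hypothetical):
create or replace task nightly_rollup
  warehouse = etl_wh
  schedule  = 'USING CRON 0 4 * * * America/Los_Angeles'   -- 4 AM local time, outside the DST transition window
as
  call refresh_rollups();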
To have multi cluster data virtual warehouse, what should be the minimum snowflake subscription ENTERPRISE BUSINESS CRITICAL STANDARD
ENTERPRISE
Your organization has a need for Column-level security by applying masking policies to columns in tables or views. Which editions will be suitable for this requirement? Standard Enterprise Business Critical VPS
Enterprise Business Critical VPS
Extended Time Travel (up to 90 days) is available in which editions Standard Enterprise Business Critical VPS
Enterprise Business Critical VPS Note: https://docs.snowflake.com/en/user-guide/intro-editions.html#security-data-protections
What was Business Critical Edition formerly known as Standard Edition Enterprise Edition Business Edition Enterprise for Sensitive Data (ESD)
Enterprise for Sensitive Data (ESD)
With respect to Snowflake UI, which of the following is true? A single session can be shared between multiple worksheets Every worksheet can have a different role, warehouse and a database Worksheets cannot have different role, warehouse and database Every worksheet has its own session
Every worksheet can have a different role, warehouse and a database Every worksheet has its own session
How will you check if a share is inbound or outbound? Execute SHOW SHARES and look at kind column Execute describe share and look at location column Execute DISPLAY SHARE and look at kind column
Execute SHOW SHARES and look at kind column
One of your users has accidentally dropped a table T1 in production. What can you do to restore the table? Execute UNDROP TABLE T1; Execute UNDELETE TABLE T1; Call Snowflake team and request them to recover the data from cloud storage Nothing can be done, whatever is lost is lost
Execute UNDROP TABLE T1; Note: UNDROP is a very useful feature in Snowflake. It makes it easy to retrieve your data because Snowflake retains dropped objects for 24 hours by default (the Time Travel data retention period). You have an option to purchase an even longer retention period, if required. Any lost data can be instantaneously recovered with the UNDROP feature. It is a single short command UNDROP TABLE <tablename> In addition to this, you can also do UNDROP SCHEMA and UNDROP DATABASE if someone messes it up even bigger. https://docs.snowflake.com/en/sql-reference/sql/undrop-table.html#undrop-table
What are the recommendations from Snowflake to effectively and efficiently use a warehouse. Experiment with different types of queries and different warehouse sizes to determine the combinations that best meet your specific query needs and workload Don't focus on warehouse size. Snowflake utilizes per-second billing, so you can run larger warehouses (Large, X-Large, 2X-Large, etc.) and simply suspend them when not in use Be very careful on what size of warehouse you are picking up. The credits charged are based on number of queries running on the warehouse and the size of the warehouse
Experiment with different types of queries and different warehouse sizes to determine the combinations that best meet your specific query needs and workload Don't focus on warehouse size. Snowflake utilizes per-second billing, so you can run larger warehouses (Large, X-Large, 2X-Large, etc.) and simply suspend them when not in use Note: Remember one more thing. The credits are not dependent on number of queries running in your warehouse, it depends on how long your warehouse is active.
A transient table can be cloned to a permanent table TRUE FALSE
FALSE
You have a small table in snowflake which has only 10,000 rows. Specifying a clustering key will further improved the queries that run on this table FALSE TRUE
FALSE Note: Specifying a clustering key is not necessary for most tables. Snowflake performs automatic tuning via the optimization engine and micro-partitioning. In many cases, data is loaded and organized into micro-partitions by date or timestamp, and is queried along the same dimension. When should you specify a clustering key for a table? First, note that clustering a small table typically doesn't improve query performance significantly. For larger data sets, you might consider specifying a clustering key for a table when: The order in which the data is loaded does not match the dimension by which it is most commonly queried (e.g. the data is loaded by date, but reports filter the data by ID). If your existing scripts or reports query the data by both date and ID (and potentially a third or fourth column), you may see some performance improvement by creating a multi-column clustering key. Query Profile indicates that a significant percentage of the total duration time for typical queries against the table is spent scanning. This applies to queries that filter on one or more specific columns. Note that reclustering rewrites existing data with a different order. The previous ordering is stored for 7 days to provide Fail-safe protection. Reclustering a table incurs compute costs that correlate to the size of the data that is reordered. https://docs.snowflake.com/en/user-guide/table-considerations.html#when-to-set-a-clustering-key
PUT command can be run from Snowflake worksheets TRUE FALSE
FALSE Note: The PUT command cannot be executed from the Worksheets page in the Snowflake web interface; instead, use the SnowSQL client to upload data files, or check the documentation for the specific Snowflake client to verify support for this command. https://docs.snowflake.com/en/sql-reference/sql/put.html#usage-notes
A COUNT(*) and COUNT(<COLUMN_NAME>) on a table will always give you the same result TRUE FALSE
FALSE Note: A COUNT(*) will return a total count of rows in the table, while COUNT(<column_name>) will return a count of rows with a non-NULL value in that particular column. COUNT(A,B) only counts the rows that have no NULL values in either the A or B column while COUNT(<ALIAS>.*) can be used to count all the rows containing no NULL columns.
When you clone a table in snowflake, the storage requirements are doubled as it needs to now store two copies of data. TRUE FALSE
FALSE Note: A massive benefit of zero-copy cloning in snowflake is that the underlying data is never duplicated. Only the metadata/pointers of the micro-partitions are replicated for the new cloned table. Hence, when you clone a table, actual data is never copied over to the new cloned table.
Auto clustering can be switched off at an account level FALSE TRUE
FALSE Note: Auto clustering cannot be switched off at the database or account level; it will need to be done at the table level.
Resource monitors can be used to control credit usage for the Snowflake-provided warehouses, including the snowpipe warehouse TRUE FALSE
FALSE Note: Resource monitors provide control over virtual warehouse credit usage; however, you cannot use them to control credit usage for the Snowflake-provided warehouses, including the SNOWPIPE warehouse.
A consumer of shared database can add as many tables and views they want in the shared database FALSE TRUE
FALSE Note: Shares are read only
Snowflake guarantees that only one instance of a task with a defined predecessor task is running at a given time TRUE FALSE
FALSE Note: Snowflake cannot guarantee that only one instance of a task with a defined predecessor task is running at a given time.
If you clone a database, the internal snowflake stages in that database are also cloned FALSE TRUE
FALSE Note: The following rules apply to cloning stages or objects that contain stages (i.e. databases and schemas): Individual external named stages can be cloned; internal named stages cannot be cloned. When cloning a database or schema: External named stages that were present in the source when the cloning operation started are cloned. Tables are cloned, which means their internal stages are also cloned. Internal named stages are not cloned. Regardless of how a stage was cloned, the clone does not include any of the files from the source. i.e. all cloned stages are empty.
You want to delete the snowpipe file loading metadata. You decided to truncate the table to be able to do this. This will remove the metadata from the pipe TRUE FALSE
FALSE Note: Unable to Reload Modified Data, Modified Data Loaded Unintentionally Snowflake uses file loading metadata to prevent reloading the same files (and duplicating data) in a table. Snowpipe prevents loading files with the same name even if they were later modified (i.e. have a different eTag). The file loading metadata is associated with the pipe object rather than the table. As a result: Staged files with the same name as files that were already loaded are ignored, even if they have been modified, e.g. if new rows were added or errors in the file were corrected. Truncating the table using the TRUNCATE TABLE command does not delete the Snowpipe file loading metadata. However, note that pipes only maintain the load history metadata for 14 days.
A user can be defaulted to a role which user does not have access to TRUE FALSE
FALSE Note: You will be able to create a user with a default role to which the user does not have access. However, the user will not be able to log on to snowflake if he/she does not have access to the default role. Hence a user cannot be defaulted to a role which he/she does not have access to
You have just loaded a file named student_data.csv from your snowflake stage to snowflake table. You try to reload again using the COPY INTO command. You will be able to load the file. TRUE FALSE
FALSE Note: By default the file will be skipped because its load metadata is already recorded; however, you can override this behavior by using FORCE=TRUE in the COPY command.
Which COPY INTO <table> copy options are not supported by SNOWPIPE? FILES = ( 'file_name1' [ , 'file_name2', ... ] ) ON_ERROR = ABORT_STATEMENT PURGE = TRUE | FALSE FORCE = TRUE | FALSE FILE_FORMAT =( { FORMAT_NAME = '[<namespace>.]<file_format_name>' | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ]
FILES = ( 'file_name1' [ , 'file_name2', ... ] ) ON_ERROR = ABORT_STATEMENT PURGE = TRUE | FALSE FORCE = TRUE | FALSE Note: All COPY INTO <table> copy options are supported except for the following: FILES = ( 'file_name1' [ , 'file_name2', ... ] ) ON_ERROR = ABORT_STATEMENT SIZE_LIMIT = num PURGE = TRUE | FALSE (i.e. automatic purging while loading) MATCH_BY_COLUMN_NAME = CASE_SENSITIVE | CASE_INSENSITIVE | NONE FORCE = TRUE | FALSE Note that you can manually remove files from an internal (i.e. Snowflake) stage (after they've been loaded) using the REMOVE command. RETURN_FAILED_ONLY = TRUE | FALSE VALIDATION_MODE = RETURN_n_ROWS | RETURN_ERRORS | RETURN_ALL_ERRORS Please go through this link thoroughly https://docs.snowflake.com/en/sql-reference/sql/create-pipe.html#usage-notes Please note that the PATTERN = 'regex_pattern' copy option is provided as a preview feature. If it is listed as one of the options, it also falls under the not-supported options.
Select the TABLE function which helps to convert semi-structured data to a relational representation FLATTEN CHECK_JSON PARSE_JSON
FLATTEN
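A minimal FLATTEN sketch, again assuming the car_sales table and src variant column from the earlier question; LATERAL FLATTEN expands the extras array into one row per element:
select f.value::string as extra
from car_sales,
     lateral flatten(input => src:vehicle[0].extras) f;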
Which OPTION will you set on the copy if you would like to ignore the load metadata? FORCE LOAD_UNCERTAIN_FILES IGNORE METADATA
FORCE Note: To load files whose metadata has expired, set the LOAD_UNCERTAIN_FILES copy option to true. The copy option references load metadata, if available, to avoid data duplication, but also attempts to load files with expired load metadata. Alternatively, set the FORCE option to load all files, ignoring load metadata if it exists. Note that this option reloads files, potentially duplicating data in a table. https://docs.snowflake.com/en/user-guide/data-load-considerations-load.html#workarounds
What are the two types of data consumer accounts available in snowflake FULL ACCOUNT READER ACCOUNT SHARED ACCOUNT QUERY ONLY ACCOUNT
FULL ACCOUNT READER ACCOUNT
True or False: A customer using SnowSQL / native connectors will be unable to also use the Snowflake Web interface (UI) unless access to the UI is explicitly granted by support.
False
True or False: A table in Snowflake can only be queried using the Virtual Warehouse used to load the data.
False
True or False: SF offers tools to extract data from source systems.
False
True or False: Virtual Warehouses cannot be resized while queries are running.
False
True or False: You can point Snowflake at any S3 bucket to directly query files in that bucket as long as the files are in Parquet or ORC format
False
Integration object is a first-class database object True False
False Note: Integration object is an account level object and not a database object
To improve performance materialized view can created on top of external tables without any additional cost TRUE False
False Note: Materialized views are designed to improve query performance for workloads composed of common, repeated query patterns. However, materializing intermediate results incurs additional costs. As such, before creating any materialized views, you should consider whether the costs are offset by the savings from re-using these results frequently enough.
SF is available on premise? (TRUE OR FALSE)
False Note: Snowflake's data warehouse is a true SaaS offering. More specifically: There is no hardware (virtual or physical) for you to select, install, configure, or manage. There is no software for you to install, configure, or manage. Ongoing maintenance, management, and tuning is handled by Snowflake. Snowflake runs completely on cloud infrastructure. All components of Snowflake's service (other than an optional command line client), run in a public cloud infrastructure. Snowflake uses virtual compute instances for its compute needs and a storage service for persistent storage of data. Snowflake cannot be run on private cloud infrastructures (on-premises or hosted). Snowflake is not a packaged software offering that can be installed by a user. Snowflake manages all aspects of software installation and updates. https://docs.snowflake.com/en/user-guide/intro-key-concepts.html
Snowflake has administration settings for resource consumption to help with below activities Help control costs associated with unexpected warehouse credit usage Manage access to snowflake for specific users Manage availability of the product
Help control costs associated with unexpected warehouse credit usage Note: What is a resource monitor? To help control costs and avoid unexpected credit usage caused by running warehouses, Snowflake provides resource monitors. A virtual warehouse consumes Snowflake credits while it runs. Resource monitors can be used to impose limits on the number of credits that are consumed by: 1. User-managed virtual warehouses 2. Virtual warehouses used by cloud services
Which feature does not come with query profile? Graphical representation of the main components of the processing plan of the query Details and statistics for the overall query Hints for improving the query performance Statistics for each component of the query
Hints for improving the query performance
Which properties of a resource monitor can be modified? Increase or decrease the credit quota for the monitor If the monitor is monitoring your account, convert it to monitor individual warehouses If the monitor is monitoring individual warehouses: 1. Add or remove warehouses from the list. 2. Convert it to monitor your account Customize the schedule (frequency, start timestamp, and end timestamp) for the monitor Add or remove actions, or modify the threshold percentages for existing actions
Increase or decrease the credit quota for the monitor If the monitor is monitoring your account, convert it to monitor individual warehouses If the monitor is monitoring individual warehouses: 1. Add or remove warehouses from the list. 2. Convert it to monitor your account Customize the schedule (frequency, start timestamp, and end timestamp) for the monitor Add or remove actions, or modify the threshold percentages for existing actions
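A sketch showing a resource monitor being created, assigned to a warehouse, and then modified (the names and thresholds are illustrative):
create resource monitor monthly_quota with
  credit_quota = 100
  frequency = monthly
  start_timestamp = immediately
  triggers
    on 75 percent do notify
    on 100 percent do suspend;

alter warehouse adhoc_wh set resource_monitor = monthly_quota;

-- later, adjust the quota
alter resource monitor monthly_quota set credit_quota = 150;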
Which of the below are automatically provided by snowflake compared to other databases? Installation and Hardware Configurations Patch releases Physical Security Metadata and Collection statistics documentation
Installation and Hardware Configurations Patch releases Physical Security
Which of the below objects will contribute to your account level storage usage? Secure Views Internal Stage External stage Database and all tables inside it Functions
Internal Stage External stage Database and all tables inside it Note: SECURE VIEWS and FUNCTION definitions are stored in metadata store. They do not store any data physically. One of my students asked a valid question that external stage is a pointer to the cloud provider's storage and hence from a snowflake perspective there is no cost. This is absolutely true. But when this question is presented to you and it is not explicitly asking 'snowflake account', you should tick external stage also. If the question specifically asks about 'snowflake account', then you should not tick this option. This is kind of a trick question so please pay attention to the question text.
Which of the below are considered as best practices while loading data into snowflake? Isolate data loading workload into its own virtual warehouse Split large files into smaller files Compress the source files If format is in CSV, convert them into ORC
Isolate data loading workload into its own virtual warehouse Split large files into smaller files Compress the source files
In your organization, you use a 2XL warehouse to load data into Snowflake as part of your ETL workload. Your BI team reached out to you and wanted access to the data stored in Snowflake. You gave them access to the tables required for them to query the data. What will be your recommendation with respect to the warehouse that they will need to use? Since you already have a 2XL warehouse for loading data, you will let them use that warehouse Warehouse is not required for querying data in snowflake It is recommended to use a multi-warehouse approach for workload isolation in snowflake. Since you would not like the load workload to interfere with the read workload of the BI team, you will create a separate warehouse for them
It is recommended to use a multi-warehouse approach for workload isolation in snowflake. Since you would not like the load workload to interfere with the read workload of the BI team, you will create a separate warehouse for them Note: Snowflake's unique multi-cluster, shared data architecture makes it possible to allocate multiple independent, isolated clusters for processing while sharing the same data. In snowflake, each cluster can both read and write data while complying with transactional consistency, which is ensured by the cloud services layer. The compute clusters in Snowflake are called virtual warehouses. The size and resources of each virtual warehouse can be chosen independently by the user based on the characteristics and performance requirements of the workload(s) that will run on it. It is, therefore, recommended to isolate workloads and have each workload use its own warehouse. This helps you track usage better and also ensures that one workload does not interfere with another. Hence you will create a separate warehouse for the BI team.
What is the benefit of client-side encryption in snowflake? It provides a secure system for managing data in cloud storage It helps to decrypt data faster at a later period It reduces the cost of encryption by the cloud provider
It provides a secure system for managing data in cloud storage
Select the ones that are true for snowflake kafka connector Kafka connector guarantees exactly-once delivery Kafka connector guarantees that rows are inserted in the order Kafka connector guarantees reprocessing of messages
Kafka connector guarantees exactly-once delivery
What does snowflake use for monitoring network traffic and user activity? Lacework Sumo logic Threat Stack
Lacework
You are a snowflake architect and you are investigating a long running query to optimize the query. You want to know if the query retrieved data from long-term centralized storage. If you are looking at the Query History table in the history area of the WebUI, where will you look for this information? Look in the Bytes Scanned Column for a green bar Look in the Bytes scanned column for the acronym LTCS Look in the Rows column for a red bar Look in the Rows column for the acronym LTCS
Look in the Bytes Scanned Column for a green bar
A stream stores data in the same shape as the source table (i.e. the same column names and ordering). What additional columns does it store? METADATA$ACTION METADATA$ISUPDATE METADATA$ROW_ID METADATA$STREAM_ID
METADATA$ACTION METADATA$ISUPDATE METADATA$ROW_ID
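A quick sketch to see those columns (the table and stream names are hypothetical):
CREATE OR REPLACE STREAM mystream ON TABLE mytable;
SELECT * FROM mystream;   -- returns the source columns plus METADATA$ACTION, METADATA$ISUPDATE and METADATA$ROW_ID
SELECT METADATA$ACTION, METADATA$ISUPDATE, METADATA$ROW_ID FROM mystream;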
Snowflake supports multi-factor authentication (MFA) to provide increased login security for users connecting to Snowflake. Which statements are true about MFA security? MFA is not enabled by default. It is available to all, but users need to activate it MFA is an integrated feature powered by DUO Security Service MFA can be used to connect to SNOWFLAKE using Snowflake JDBC Driver MFA Login is designed to connect to snowflake only through the web interface
MFA is not enabled by default. It is available to all, but users need to activate it MFA is an integrated feature powered by DUO Security Service MFA can be used to connect to SNOWFLAKE using Snowflake JDBC Driver
Which access control privileges must be granted on specified resource monitors for roles other than account administrators to view and modify resource monitors as needed MONITOR CREATE MODIFY
MONITOR MODIFY
This property of resource monitor specifies whether the resource monitor is used to monitor the credit usage for your entire Account (i.e. all warehouses in the account) or a specific set of individual warehouses CREDIT QUOTA MONITOR LEVEL SCHEDULE
MONITOR LEVEL Note: Monitor Level This property specifies whether the resource monitor is used to monitor the credit usage for your entire Account (i.e. all warehouses in the account) or a specific set of individual warehouses. If this property is not set, the resource monitor doesn't monitor any credit usage. It simply remains dormant. https://docs.snowflake.com/en/user-guide/resource-monitors.html#monitor-level
Snowflake provides specific administration features and capabilities to support the following activities except? Manage databases and warehouses within snowflake account Manage roles and users within a snowflake account Monitor usage and manage resources to control costs in a snowflake account Manage 3rd party applications providing data to a snowflake account
Manage 3rd party applications providing data to a snowflake account
Snowflake provides specific administration features and capabilities to support the following activities Managing database and warehouses within a snowflake account Managing roles and users within a snowflake account Monitor usage and manage resources to control costs in a snowflake account Manage 3rd party applications providing data to snowflake
Managing database and warehouses within a snowflake account Managing roles and users within a snowflake account Monitor usage and manage resources to control costs in a snowflake account
What are the two modes in which a multi-cluster warehouse can run? Static Dynamic Maximized Auto-scale None of the above
Maximized Auto-scale Note: Maximized This mode is enabled by specifying the same value for both maximum and minimum clusters (note that the specified value must be larger than 1). In this mode, when the warehouse is started, Snowflake starts all the clusters so that maximum resources are available while the warehouse is running. This mode is effective for statically controlling the available resources (i.e. servers), particularly if you have large numbers of concurrent user sessions and/or queries and the numbers do not fluctuate significantly. Auto-scale This mode is enabled by specifying different values for maximum and minimum clusters. In this mode, Snowflake starts and stops clusters as needed to dynamically manage the load on the warehouse: As the number of concurrent user sessions and/or queries for the warehouse increases, and queries start to queue due to insufficient resources, Snowflake automatically starts additional clusters, up to the maximum number defined for the warehouse. Similarly, as the load on the warehouse decreases, Snowflake automatically shuts down clusters to reduce the number of running servers and, correspondingly, the number of credits used by the warehouse. To help control the usage of credits in Auto-scale mode, Snowflake provides a property, SCALING_POLICY, that determines the scaling policy to use when automatically starting or shutting down additional clusters. For more information, see Setting the Scaling Policy for a Multi-cluster Warehouse (in this topic). https://docs.snowflake.com/en/user-guide/warehouses-multicluster.html#maximized-vs-auto-scale
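A minimal sketch of the two modes (warehouse names and sizes are hypothetical):
-- Auto-scale mode: minimum and maximum cluster counts differ
CREATE WAREHOUSE autoscale_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4
  SCALING_POLICY = 'STANDARD';
-- Maximized mode: minimum equals maximum (and is greater than 1)
CREATE WAREHOUSE maximized_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 3
  MAX_CLUSTER_COUNT = 3;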
When configuring a Warehouse using a Snowflake edition that has Elastic Data Warehousing enabled, what facets or components will you need to configure that are not needed in accounts where Elastic Data Warehousing is not enabled Minimum and Maximum Clusters Minimum and Maximum Servers Scaling Policy Auto-Suspend Auto-Resume
Minimum and Maximum Clusters Scaling Policy Snowflake Standard Edition does not support multi-cluster warehouses, so there is no need to configure the minimum and maximum clusters or the scaling policy.
Select all characteristics of snowflake's multi-cluster environment Multiple virtual warehouses in a deployment Individual warehouses scale out/in based on query activity Multi-cluster warehouses support the same properties and actions as single-cluster warehouse User must specify which cluster each query will use
Multiple virtual warehouses in a deployment Individual warehouses scale out/in based on query activity Multi-cluster warehouses support the same properties and actions as single-cluster warehouse Note: Multiple virtual warehouses in a deployment means specifying more than one warehouse (cluster) when creating a multi-cluster warehouse. In a multi-cluster warehouse, you can specify a maximum number of server clusters (or virtual warehouses) greater than 1 and up to 10, and a minimum number of clusters equal to or less than the maximum (up to 10). Additionally, multi-cluster warehouses support all the same properties and actions as single-cluster warehouses, including: 1. Specifying a warehouse size. 2. Resizing a warehouse at any time. 3. Auto-suspending a running warehouse due to inactivity; note that this does not apply to individual clusters, but rather the entire warehouse. 4. Auto-resuming a suspended warehouse when new queries are submitted. In a multi-cluster warehouse, individual clusters can scale out or in based on query load, which is not possible in a single-cluster warehouse. Otherwise, as mentioned above, it supports the same properties and actions as a single-cluster warehouse
In load metadata, Snowflake maintains detailed metadata for each table into which data is loaded. What else is stored in this metadata Name of each file from which data was loaded File size Number of columns in the file ETag for the file Number of rows parsed in the file Timestamp of the last load for the file Information about any errors encountered in the file during loading
Name of each file from which data was loaded File size ETag for the file Number of rows parsed in the file Timestamp of the last load for the file Information about any errors encountered in the file during loading Note: https://docs.snowflake.com/en/user-guide/data-load-considerations-load.html#load-metadata
PUT command in snowflake can upload files from a local folder/directory of a client machines to the below locations Named internal stage Internal stage for a specified table Internal stage for the current user External Stage
Named internal stage Internal stage for a specified table Internal stage for the current user
Which of the following best describes SF's processing engine? Leverages Apache Spark Based on Map Reduce A derivative of Presto Native SQL
Native SQL
Which of the following best describes Snowflake's processing engine EMR(Elastic Map Reduce) Spark Engine Presto Native SQL Database engine
Native SQL Database engine Note: https://docs.snowflake.com/en/user-guide/intro-key-concepts.html#key-concepts-architecture
Which of the following is not a characteristic of micro-partitions in snowflake? New partitions are created in logical properties Avoids skews between partitions Partitioning is automatically completed on the natural ingestion order of the data The natural ingestion order maintains correlations between columns which could be useful for pruning
New partitions are created in logical properties
create or replace table result_scan_table_1 like result_scan_table; Will the above query cost compute credits? No, since it is a metadata operation only Yes, it will need compute as the table structure is also getting created It is hard to say without looking at query profile
No, since it is a metadata operation only Note: Why do you think it will not cost any compute credit? The reason is that the LIKE clause only creates the definition of the table (i.e. the structure of the table) in the metadata repository. Metadata operations do not require any compute credits.
What will the below query return SELECT TOP 10 GRADES FROM STUDENT; The top 10 highest grades The 10 lowest grades Non-deterministic list of 10 grades
Non-deterministic list of 10 grades
Resource monitors support the below actions Notify & Suspend Notify & Suspend Immediately Notify Notify and automatically call the account administrator
Notify & Suspend Notify & Suspend Immediately Notify
You are sizing a snowflake warehouse, what factors would you consider? Number of users Number of concurrent queries Number of tables being queried Data size and composition
Number of concurrent queries Number of tables being queried Data size and composition
When you size a warehouse, which of the below factors are not required to be considered Number of users Number of concurrent queries Number of tables queried Data size and composition All of the above
Number of users
If you plan to regularly unload similarly-formatted data, it is recommended to use named file formats (TRUE OR FALSE)
TRUE
If you clone a permanent table(bar) into a transient table(foo) using the below command create transient table foo clone bar copy grants; What will happen to the partitions? Old partitions will not be affected,but new partitions added to the clone will follow the transient lifecycle All the partitions will be affected Only the old partitions will be affected
Old partitions will not be affected,but new partitions added to the clone will follow the transient lifecycle
The Cloning feature in Snowflake requires less storage because Only metadata is copied; no physical data is copied The cloned data is constantly updated to synchronize with original table data Data is replicated to ensure integrity of data in the original table data
Only metadata is copied; no physical data is copied There is no data copy that happens when a clone is created. It is just one more object that gets created, which points to the same underlying data as the original table. Only when changes are made to the clone does it start incurring storage costs.
The following factors affect data load rates: A. Physical location of the stage B. RAM on the virtual warehouse C. Gzip compression efficiency D. Thread size
Physical location of the stage Gzip compression efficiency
A Snowflake mechanism that is used to limit the number of micro-partitions scanned by a query is called ________?
Pruning
If you recreate a storage integration after it has been linked to one or more stages, you must reestablish the association between each stage by running the ALTER STAGE command. (TRUE OR FALSE)
TRUE
Query execution is performed by which layers of Snowflake architecture Storage Layer Query Processing Layer Cloud Services Layer
Query Processing Layer Query execution is performed using virtual warehouses, which form the compute layer, also called the Query Processing Layer.
Which layer in the snowflake architecture contains virtual warehouse Cloud services Query processing Database Storage None
Query processing
How do you remove files from a stage? REMOVE DELETE PURGE TRUNCATE
REMOVE Note: Removes files that have been staged (i.e. uploaded from a local file system or unloaded from a table) in one of the following Snowflake internal stages: Named internal stage. Stage for a specified table. Stage for the current user. Note that using the command to remove files from an external stage might work but is not officially supported. REMOVE can be abbreviated to RM. https://docs.snowflake.com/en/sql-reference/sql/remove.html#remove
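For illustration (the stage, path, and table names are hypothetical):
REMOVE @my_int_stage/path/ PATTERN = '.*sales.*[.]csv';
RM @%mytable;      -- RM is the abbreviated form; @%mytable is a table stage
RM @~/staged/;     -- @~ is the current user's stage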
Which is not a characteristic of time travel? Protection from accidental data operations Previous versions of data automatically retained Retention period selected by customers (up to 90 days for Enterprise edition) Recover data with the cost of running backups
Recover data with the cost of running backups
Which of the below activities can be done leveraging time-travel feature of snowflake? Restoring data-related objects (tables, schemas, and databases) that may have been accidentally or intentionally deleted Duplicating and backing up data from key points in the past Analyzing data usage/manipulation over specified periods of time To backup data for compliance and regulatory requirements
Restoring data-related objects (tables, schemas, and databases) that may have been accidentally or intentionally deleted Duplicating and backing up data from key points in the past Analyzing data usage/manipulation over specified periods of time
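A few hedged examples of these operations (the object names and offsets are hypothetical):
SELECT * FROM orders AT (OFFSET => -3600);                    -- query the table as it was one hour ago
CREATE TABLE orders_backup CLONE orders AT (OFFSET => -3600); -- duplicate/back up data from a point in the past
UNDROP TABLE orders;                                          -- restore an accidentally dropped object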
Which cache runs on a 24 hour clock? Result Cache Metadata cache Warehouse cache
Result Cache Note: The warehouse cache is lost when the warehouse goes offline; the result cache runs on a 24-hour clock; the metadata cache is continuously updated as tables change
You want to update the value of a specific column in a table, but the update is not based on any specific condition. You just need to ensure that 80% of the rows in the table are updated with the new value. What snowflake feature will you use SAMPLE CONCAT AVERAGE MAX
SAMPLE
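One possible sketch, assuming the table has an ID column (the table and column names are hypothetical; SAMPLE (80) returns roughly, not exactly, 80% of the rows):
UPDATE mytable
SET status = 'reviewed'
FROM (SELECT id FROM mytable SAMPLE (80)) s
WHERE mytable.id = s.id;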
If you want to clone a table, your role must have which access on the Table SELECT USAGE READ CLONE
SELECT Note: Tables - SELECT Pipes, Streams, Tasks - OWNERSHIP Other objects - USAGE
You have a Snowflake table which is defined as below CREATE OR REPLACE TABLE FRUITS(FRUIT_NUMBER NUMBER, FRUIT_DESCRIPTION VARCHAR, AVAILABILITY VARCHAR); If you would like to convert the fruit_number column to be a decimal with a certain precision and scale, which command will you run? SELECT FRUIT_NUMBER::DECIMAL(10,5) FROM FRUITS; SELECT FRUIT_NUMBER(DECIMAL(10,5)) FROM FRUITS; SELECT FRUIT_NUMBER AS DECIMAL(10,5) FROM FRUITS; SELECT FRUIT_NUMBER.DECIMAL(10,5) FROM FRUITS;
SELECT FRUIT_NUMBER::DECIMAL(10,5) FROM FRUITS;
Which of the below operations are allowed on an inbound share data? MERGE CREATE TABLE ALTER TABLE SELECT WITH GROUP BY SELECT WITH JOIN INSERT INTO
SELECT WITH GROUP BY SELECT WITH JOIN
Output of a FLATTEN query has the below columns. Select three. SEQ PATH VALUE RECORD_METADATA
SEQ PATH VALUE
Choose the continuous data loading options from below SNOWPIPE SNOWFLAKE CONNECTOR FOR KAFKA Third-party data integration tools BULK COPY
SNOWPIPE SNOWFLAKE CONNECTOR FOR KAFKA Third-party data integration tools
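A minimal Snowpipe sketch (the pipe, stage, and table names are hypothetical; AUTO_INGEST relies on cloud event notifications being configured):
CREATE PIPE mypipe AUTO_INGEST = TRUE AS
  COPY INTO mytable
  FROM @my_ext_stage
  FILE_FORMAT = (TYPE = 'CSV');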
Snowflake supports the following performance optimization techniques SSD CACHING/DATA CACHING QUERY RESULT CACHING B-TREE INDEXES
SSD CACHING/DATA CACHING QUERY RESULT CACHING Note: https://community.snowflake.com/s/article/Caching-in-Snowflake-Data-Warehouse
All of the below are valid executionState of a snowpipe except: RUNNING STOPPED_FEATURE_DISABLED STALLED_EXECUTION_ERROR PAUSED STOPPED
STOPPED Note: Current execution state of the pipe; could be any one of the following: RUNNING (i.e. everything is normal; Snowflake may or may not be actively processing files for this pipe) STOPPED_FEATURE_DISABLED STOPPED_STAGE_DROPPED STOPPED_FILE_FORMAT_DROPPED STOPPED_MISSING_PIPE STOPPED_MISSING_TABLE STALLED_COMPILATION_ERROR STALLED_INITIALIZATION_ERROR STALLED_EXECUTION_ERROR STALLED_INTERNAL_ERROR PAUSED PAUSED_BY_SNOWFLAKE_ADMIN PAUSED_BY_ACCOUNT_ADMIN
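To see the executionState of a pipe, you can query its status (the pipe name is hypothetical):
SELECT SYSTEM$PIPE_STATUS('mydb.myschema.mypipe');
-- Returns a JSON document whose executionState field holds one of the values listed above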
Snowflake adds the system defined roles to each new account. Check the system defined roles below. SYSADMIN ACCOUNTADMIN SECURITYADMIN PUBLIC USERADMIN SUPERUSER
SYSADMIN ACCOUNTADMIN SECURITYADMIN PUBLIC USERADMIN
To help control the usage of credits in Auto-scale mode, Snowflake provides a property, _________________, that determines the scaling policy to use when automatically starting or shutting down additional clusters. Select the property Auto Scale Scaling Policy Max no of cluster Min no of cluster
Scaling Policy Note: To help control the credits consumed by a multi-cluster warehouse running in Auto-scale mode, Snowflake provides scaling policies, which are used to determine when to start or shut down a cluster. The scaling policy for a multi-cluster warehouse only applies if it is running in Auto-scale mode. In Maximized mode, all clusters run concurrently so there is no need to start or shut down individual clusters. https://docs.snowflake.com/en/user-guide/warehouses-multicluster.html#setting-the-scaling-policy-for-a-multi-cluster-warehouse
An entity to which access can be granted. Unless allowed by a grant, access will be denied. in snowflake terms, what is this called? Securable object Role Privilege User
Securable object Note: Securable object: An entity to which access can be granted. Unless allowed by a grant, access will be denied. Role: An entity to which privileges can be granted. Roles are in turn assigned to users. Note that roles can also be assigned to other roles, creating a role hierarchy. Privilege: A defined level of access to an object. Multiple distinct privileges may be used to control the granularity of access granted. User: A user identity recognized by Snowflake, whether associated with a person or program. https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#access-control-framework
All security related information is stored in which layer of snowflake architecture Service Storage Compute All of the above
Service Note: Service(or Cloud service) layer is the one which stores all security related information The services included in this layer are 1. Authentication 2. Infrastructure management 3. Metadata management 4. Query parsing and optimization 5. Access control
You are speaking to a CTO of an organization and would like to define the architecture of snowflake to him in two words. What will you use? Shared Disk Shared Nothing Shared Data Shared Memory
Shared Data Note: Snowflake's architecture is a hybrid of traditional shared-disk database architectures and shared-nothing database architectures. Similar to shared-disk architectures, Snowflake uses a central data repository for persisted data that is accessible from all compute nodes in the data warehouse. But similar to shared-nothing architectures, Snowflake processes queries using MPP (massively parallel processing) compute clusters where each node in the cluster stores a portion of the entire data set locally. This approach offers the data management simplicity of a shared-disk architecture, but with the performance and scale-out benefits of a shared-nothing architecture. https://docs.snowflake.com/en/user-guide/intro-key-concepts.html
Select the statements that are true for a shared database Shared databases are read-only A clone can be created from a shared database Time travel can be done on a shared database Can edit the comments for a shared database
Shared databases are read-only Note: https://docs.snowflake.com/en/user-guide/data-share-consumers.html#general-limitations-for-shared-databases
Select the statements that are true for a data share Shares are named Snowflake objects Share consists of privileges that grant access to the database(s) and the schema containing the objects to share Share consists of privileges that grant access to the specific objects in the database Share consists of consumer accounts with which the database and its objects are shared
Shares are named Snowflake objects Share consists of privileges that grant access to the database(s) and the schema containing the objects to share Share consists of privileges that grant access to the specific objects in the database Share consists of consumer accounts with which the database and its objects are shared Note: https://docs.snowflake.com/en/user-guide/data-sharing-intro.html#what-is-a-share
Which factors influence the unit cost of snowflake credits and data storage? Snowflake Edition The region where your Snowflake account is located Whether it is an On Demand or Capacity account Number of users of snowflake
Snowflake Edition The region where your Snowflake account is located Whether it is an On Demand or Capacity account Note: The Snowflake Edition that your organization chooses determines the unit costs for the credits and the data storage you use. Other factors that impact unit costs are the region where your Snowflake account is located and whether it is an On Demand or Capacity account: 1. On Demand: Usage-based pricing with no long-term licensing requirements. 2. Capacity: Discounted pricing based on an up-front Capacity commitment.
Mark all the true statements Snowflake Stages can be defined as either External or Internal. Snowflake Stages can be defined as either FTP or SFTP. External Stages require a cloud storage provider. FTP Stages require three cloud storage providers. Cloud storage providers must be linked to internal stage objects.
Snowflake Stages can be defined as either External or Internal. External Stages require a cloud storage provider.
With default settings, how long will a query run on snowflake Snowflake will cancel the query if it runs more than 48 hours Snowflake will cancel the query if it runs more than 24 hours Snowflake will cancel the query if the warehouse runs out of memory Snowflake will cancel the query if the warehouse runs out of memory and hard disk storage
Snowflake will cancel the query if it runs more than 48 hours
Select the limitations with materialized views(Select 3) Snowflake's "Time Travel" feature is not supported on materialized views using CURRENT_TIME or CURRENT_TIMESTAMP is not permitted A materialized view cannot include Window functions Clustering keys cannot be defined in Materialized views
Snowflake's "Time Travel" feature is not supported on materialized views using CURRENT_TIME or CURRENT_TIMESTAMP is not permitted A materialized view cannot include Window functions
What is a point-in-time snapshot of data that can be updated by users called? Snowflake's cloning Time travel Fail safe
Snowflake's cloning
Which of the following is not a snowpipe feature? Snowpipe can load data from any internal or external stage It is a server-less compute model The service provides REST end points and uses snowflake provided compute resources to load the data and retrieve history reports Snowpipe loads data after it is in stage and the user executes the LOADDATA command
Snowpipe loads data after it is in stage and the user executes the LOADDATA command Note: There is no LOADDATA command; data is automatically loaded by snowpipe after the files are added to the stage https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro.html#introduction-to-snowpipe
What scaling policies are supported by snowflake? Standard Economy Premium
Standard Economy Note: Standard (default) Prevents/minimizes queuing by favoring starting additional clusters over conserving credits. Economy Conserves credits by favoring keeping running clusters fully-loaded rather than starting additional clusters, which may result in queries being queued and taking longer to complete. https://docs.snowflake.com/en/user-guide/warehouses-multicluster.html#setting-the-scaling-policy-for-a-multi-cluster-warehouse
What are the two types of snowflake data listings which define how data is shared and consumed via data marketplace? Standard Data Listings Personalized Data Listings Business Critical Data Listings Enterprise Data Listings
Standard Data Listings Personalized Data Listings Note: The Snowflake Data Marketplace offers two types of data listings which define how data is shared and consumed. Once published, both types of listings are displayed to consumers in the selected regions. The key difference between standard and personalized listings is the ability to access the data share. Personalized listings allow you to control which consumers can access the data. Standard Data Listings A standard listing provides instant access to a published data set. This type of listing is best for providing generic, aggregated, or non-customer-specific data. Each listing includes details about the shared data, sample queries, and information about the data provider. To access data from a standard data listing, consumers click Get, opening a prompt where they name the database for Snowflake users, permission access among Snowflake users, and agree to the provider's terms of use and Snowflake's consumer terms. Clicking Create Database immediately creates the database inside their Snowflake account in the Databases tab. Only users that have a role with the IMPORT SHARE privilege can create databases from the Snowflake Data Marketplace. Personalized Data Listings A personalized listing allows customers to request specific data sets. This can be premium data that a provider charges for or data that is specific to each consumer. Each listing includes details about the shared data, sample usage examples, and information about the data provider. To access data from a personalized listing, consumers must submit a request by clicking the Request button, and provide their contact information. Once a request is submitted, the data provider is notified. The provider then contacts the consumer. Each data provider can have different commercial terms. Once those are agreed to, then the personalized data set is created and shared with the consumer.
Which is the preferred way to store a generated identity and access management (IAM) entity for external cloud storage Access Token Stage Storage integration File Format
Storage integration
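A hedged sketch for S3 (the integration name, role ARN, and bucket are placeholders):
CREATE STORAGE INTEGRATION s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::001234567890:role/my_snowflake_role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://mybucket/load/');
CREATE STAGE my_ext_stage
  URL = 's3://mybucket/load/'
  STORAGE_INTEGRATION = s3_int;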
Which of the below columns are usually a good choice for clustering key. UUID column in a Customer fact table Gender with Male/Female values in a multi-terabyte table Timestamp column in a 10TB order fact table Store_Id in a 2 TB Item Store Sales fact table
Store_Id in a 2 TB Item Store Sales fact table Note: Remember that a clustering key should have neither very high cardinality nor very low cardinality. With high cardinality, a lot of partitions will be created; with low cardinality, very few partitions will be created, and query performance will not be able to take advantage of partition pruning. We need to take a middle ground.
Select two true statement related to streams Stream itself does not contain any table data A stream only stores the offset for the source table The hidden columns used by a stream does not consume any storage
Stream itself does not contain any table data A stream only stores the offset for the source table
Select two options which are true about variant data type in snowflake? Supports querying using JSON path notation Optimizes storage based on repeated elements A separate file format is used to store variant data Requires custom mapping for each type of record
Supports querying using JSON path notation Optimizes storage based on repeated elements
When a resource limit has hit the limit or is about to hit the limit, what can the resource monitor do Suspend the warehouse Send alert notifications Kill the query that is running
Suspend the warehouse Send alert notifications Kill the query that is running
The FLATTEN command in snowflake has two versions. One version uses a join and the other version uses an object keyword. Please select two words that represent the options used with the command? OBJECT_CONSTRUCT TABLE TRY_CAST LATERAL
TABLE LATERAL
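A sketch showing both forms (the table, variant column, and key names are hypothetical):
-- TABLE form
SELECT f.seq, f.path, f.value
FROM TABLE(FLATTEN(INPUT => PARSE_JSON('{"items": [1, 2, 3]}'), PATH => 'items')) f;
-- LATERAL form
SELECT t.v:name::string AS name, f.value AS item
FROM mytable t,
     LATERAL FLATTEN(INPUT => t.v:items) f;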
Snowflake has three types of stages: USER, TABLE, NAMED. Named stages are further divided into External and Internal. Which of the below two stages are automatically created and do not need explicit configuration by the user TABLE EXTERNAL INTERNAL USER
TABLE USER
Select the table type that is automatically deleted after the session is closed and hence it has no fail-safe or time travel option post closure of the session TRANSIENT TEMPORARY PERMANENT EXTERNAL
TEMPORARY Note: Temporary tables are valid only for the duration of the session; transient tables are the same as permanent tables except they do not have a fail-safe period
A simple tree of tasks is limited to a maximum of 1000 tasks total TRUE FALSE
TRUE
A simple tree of tasks is limited to a maximum of 1000 tasks total (including the root task) in a resumed state (TRUE OR FALSE)
TRUE
A single storage integration can support multiple external stages (TRUE OR FALSE)
TRUE
A stream object records the delta of change data capture (CDC) information for a table (such as a staging table), including inserts and other data manipulation language (DML) changes. TRUE FALSE
TRUE
A warehouse can be assigned to only a single resource monitor. TRUE FALSE
TRUE
ACCOUNTADMIN ROLE encapsulates the SYSADMIN and SECURITYADMIN system-defined roles and ACCOUNTADMIN sits on top of the role hierarchy TRUE FALSE
TRUE
According to snowflake architecture, data storage is independent of compute TRUE FALSE
TRUE
All virtual warehouses in snowflake has access to all data TRUE FALSE
TRUE
Automatic reclustering in snowflake is triggered only if/when the table would benefit from the operation (TRUE OR FALSE)
TRUE
Before any PHI data can be stored in Snowflake, a signed business associate agreement (BAA) must be in place between your agency/organization and Snowflake Inc TRUE FALSE
TRUE
COPY and INSERT operations in snowflake are non-blocking TRUE FALSE
TRUE
COPY transformations are supported only for named stages(both internal and external) and user stages TRUE FALSE
TRUE
Clustering is generally most cost-effective for tables that are queried frequently and do not change frequently (TRUE OR FALSE)
TRUE
Each time a persisted result for a query is reused, Snowflake resets the 24-hour retention period for the result, up to a maximum of 31 days from the date and time that the query was first executed. After 31 days, the result is purged and the next time the query is submitted, a new result is generated and persisted. (TRUE OR FALSE)
TRUE
Each time a warehouse is started or resized to a larger size, the warehouse is billed for 1 minute's worth of usage. After 1 minute, all subsequent billing is per-second TRUE FALSE
TRUE
Failsafe period for temporary and transient table is zero (TRUE OR FALSE)
TRUE
Federated authentication in snowflake is compliant with SAML2.0 TRUE FALSE
TRUE
Files that are already copied from the stage to the source table can be loaded again into a table cloned from the source table (TRUE OR FALSE)
TRUE
GET does not support downloading files from external stages (TRUE OR FALSE)
TRUE
If multiple instances of the kafka connector are started on the same topic or partitions, duplicate records may flow into the snowflake table TRUE FALSE
TRUE
If query performance degrades over time, the table is likely no longer well-clustered and may benefit from clustering (TRUE OR FALSE)
TRUE
If you are on Snowflake Enterprise Edition (and higher), the time travel retention period can be set to any value from 0 up to 90 days for databases, schemas, and tables TRUE FALSE
TRUE
Metadata columns for staged files can only be queried by name (TRUE OR FALSE)
TRUE Note: Metadata cannot be inserted into existing table rows. Metadata columns can only be queried by name; as such, they are not included in the output of any of the following statements: SELECT * SHOW <objects> DESCRIBE <object> Queries on INFORMATION_SCHEMA views https://docs.snowflake.com/en/user-guide/querying-metadata.html#query-limitations
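For illustration (the stage and file format names are hypothetical):
SELECT METADATA$FILENAME, METADATA$FILE_ROW_NUMBER, t.$1, t.$2
FROM @mystage (FILE_FORMAT => 'myformat') t;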
A table stream(also known as stream) itself does not contain any table data (TRUE OR FALSE)
TRUE Note: Note that a stream itself does not contain any table data. A stream only stores the offset for the source table and returns CDC records by leveraging the versioning history for the source table. When the first stream for a table is created, a pair of hidden columns are added to the source table and begin storing change tracking metadata. These columns consume a small amount of storage. The CDC records returned when querying a stream rely on a combination of the offset stored in the stream and the change tracking metadata stored in the table. https://docs.snowflake.com/en/user-guide/streams.html#overview-of-table-streams
You are loading data to a snowflake internal stage area using the PUT Command, the data will be encrypted on the client's machine and you will not be able to turn off the encryption. (TRUE OR FALSE)
TRUE Note: One of the biggest worries people have about moving to the cloud is security. One key piece of providing enterprise class security is the ability to encrypt the data in your data warehouse environment. With Snowflake, your data is automatically encrypted by default. https://www.snowflake.com/blog/automatic-encryption-data/
Snowflake does not support nested transactions (TRUE OR FALSE)
TRUE Note: Overlapping Transactions This section describes overlapping transactions. A stored procedure that contains a transaction can be called from within another transaction. The outer transaction can be in an outer stored procedure or can be outside any stored procedure. The inner transaction is not treated as nested; instead, the inner transaction is a separate transaction. Snowflake calls these "autonomous scoped transactions" (or simply "scoped transactions"), because each transaction executes in a conceptually independent scope. Note Terminology note: The terms "inner" and "outer" are commonly used when describing nested operations, such as nested stored procedure calls. Although Snowflake supports nested procedure calls, Snowflake does not support nested transactions; https://docs.snowflake.com/en/sql-reference/transactions.html#overlapping-transactions
Snowflake waits till all the servers are provisioned for a new virtual warehouse, before it executes query on that warehouse (TRUE OR FALSE)
TRUE Note: Snowflake does not begin executing SQL statements submitted to a warehouse until all of the servers for the warehouse are successfully provisioned, unless any of the servers fail to provision: 1. If any of the servers for the warehouse fail to provision during start-up, Snowflake attempts to repair the failed server(s). 2. During the repair process, the warehouse starts processing SQL statements once 50% or more of the requested servers are successfully provisioned.
Streams support repeatable read isolation (TRUE OR FALSE)
TRUE Note: Streams support repeatable read isolation. In repeatable read mode, multiple SQL statements within a transaction see the same set of records in a stream. This differs from the read committed mode supported for tables, in which statements see any changes made by previous statements executed within the same transaction, even though those changes are not yet committed. https://docs.snowflake.com/en/user-guide/streams.html#repeatable-read-isolation
When you load data into snowflake, snowflake reorganizes the data into its internal optimized compressed columnar format TRUE FALSE
TRUE Note: When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format. Snowflake stores this optimized data in cloud storage. Snowflake manages all aspects of how this data is stored — the organization, file size, structure, compression, metadata, statistics, and other aspects of data storage are handled by Snowflake. The data objects stored by Snowflake are not directly visible nor accessible by customers; they are only accessible through SQL query operations run using Snowflake.
The warehouse cache may be reset if a running warehouse is suspended and then resumes TRUE FALSE
TRUE Note: You do not need to remember this, just understand the concept. Warehouses are nothing but VM instances underneath, and the cache is nothing but the SSD cache attached to those instances. If the warehouse is suspended and resumed again, it may not get the same instances.
You have many queries that need to run at the same time on a warehouse and you need high concurrency when they execute. You will go for a multi-cluster warehouse in this case. TRUE FALSE
TRUE Note: 1. Multi Cluster warehouse can scale compute resources 2. The scaling is based on concurrent users and queries Multi-cluster warehouses enable you to scale compute resources to manage your user and query concurrency needs as they change, such as during peak and off hours.
A DML statement that selects from a stream consumes all of the change data in the stream as long as the transaction commits successfully TRUE FALSE
TRUE Note: A DML statement that selects from a stream consumes all of the change data in the stream as long as the transaction commits successfully. To ensure multiple statements access the same change records in the stream, surround them with an explicit transaction statement (BEGIN .. COMMIT). This locks the stream. DML updates to the source table in parallel transactions are tracked by the change tracking system but do not update the stream until the explicit transaction statement is committed and the existing change data is consumed. This is very important to know: if you are given a TASK which consumes from a stream along with a set of query statements, and you are asked which statement will consume all the change data, select the DML statement (for example, an INSERT query)
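A minimal sketch of wrapping the statements in an explicit transaction (the stream, target tables, and columns are hypothetical):
BEGIN;
-- Both statements see the same set of change records from the stream
INSERT INTO target_a SELECT id, name FROM mystream;
INSERT INTO target_b SELECT id, name FROM mystream;
COMMIT;   -- the stream offset advances only when the transaction commits successfully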
The COPY command is more performant than the INSERT statement? TRUE FALSE
TRUE Note: Do you remember that Snowflake physically stores data in immutable micro-partitions? So, how do insert and update work in snowflake? Since the micro-partitions are immutable, any insert or update has to copy the entire micro-partition into a new copy and then write the new record into that partition. For an update, the old record is marked for deletion and the new record is added. So, imagine how many micro-partitions would get created if you inserted rows one by one. Snowflake's architecture is optimized for bulk loads. Small DMLs at high frequency cause the engine to create a lot of small partition files (because each DML creates a new micro-partition file); a lot of small partition files are bad for data processing, and the database engine has to regularly compact them into larger ones for more optimal query performance. Hope this clarifies the concept.
Multiple streams can be created for the same table and consumed by different tasks TRUE FALSE
TRUE Note: Multiple tasks that consume change data from a single table stream retrieve different deltas. When a task consumes the change data in a stream using a DML statement, the stream advances the offset. The change data is no longer available for the next task to consume. Currently, we recommend that only a single task consumes the change data from a stream. Multiple streams can be created for the same table and consumed by different tasks.
A share cannot be shared TRUE FALSE
TRUE Note: read-only
Time travel and fail safe requires additional storage which will cost you storage cost TRUE FALSE
TRUE Note: The fees are calculated for each 24-hour period (i.e. 1 day) from the time the data changed. The number of days historical data is maintained is based on the table type and the Time Travel retention period for the table. Also, Snowflake minimizes the amount of storage required for historical data by maintaining only the information required to restore the individual table rows that were updated or deleted. As a result, storage usage is calculated as a percentage of the table that changed. Full copies of tables are only maintained when tables are dropped or truncated.
Which three objects did we explicitly refer to when using the COPY INTO command to load data from external stages? Table File Format View Stage
Table File Format Stage
Select two true statements for FLATTEN ? Takes a VARIANT, OBJECT, or ARRAY column and produces a lateral view Can be used to convert semi-structured data to a relational representation Cannot be used on permanent tables
Takes a VARIANT, OBJECT, or ARRAY column and produces a lateral view Can be used to convert semi-structured data to a relational representation
Create a task and a stream following the below steps. So, when the system$stream_has_data('rawstream1') condition returns false, what will happen to the task?
-- Create a landing table to store raw JSON data.
-- Snowpipe could load data into this table.
create or replace table raw (var variant);
-- Create a stream to capture inserts to the landing table.
-- A task will consume a set of columns from this stream.
create or replace stream rawstream1 on table raw;
-- Create a second stream to capture inserts to the landing table.
-- A second task will consume another set of columns from this stream.
create or replace stream rawstream2 on table raw;
-- Create a table that stores the names of office visitors identified in the raw data.
create or replace table names (id int, first_name string, last_name string);
-- Create a table that stores the visitation dates of office visitors identified in the raw data.
create or replace table visits (id int, dt date);
-- Create a task that inserts new name records from the rawstream1 stream into the names table
-- every minute when the stream contains records.
-- Replace the 'etl_wh' warehouse with a warehouse that your role has USAGE privilege on.
create or replace task raw_to_names
  warehouse = etl_wh
  schedule = '1 minute'
  when system$stream_has_data('rawstream1')
as
  merge into names n
  using (select var:id id, var:fname fname, var:lname lname from rawstream1) r1
  on n.id = to_number(r1.id)
  when matched then update set n.first_name = r1.fname, n.last_name = r1.lname
  when not matched then insert (id, first_name, last_name) values (r1.id, r1.fname, r1.lname);
-- Create another task that merges visitation records from the rawstream2 stream into the visits table
-- every minute when the stream contains records.
-- Records with new IDs are inserted into the visits table;
-- records with IDs that exist in the visits table update the DT column in the table.
-- Replace the 'etl_wh' warehouse with a warehouse that your role has USAGE privilege on.
create or replace task raw_to_visits
  warehouse = etl_wh
  schedule = '1 minute'
  when system$stream_has_data('rawstream2')
as
  merge into visits v
  using (select var:id id, var:visit_dt visit_dt from rawstream2) r2
  on v.id = to_number(r2.id)
  when matched then update set v.dt = r2.visit_dt
  when not matched then insert (id, dt) values (r2.id, r2.visit_dt);
-- Resume both tasks.
alter task raw_to_names resume;
alter task raw_to_visits resume;
-- Insert a set of records into the landing table.
insert into raw
  select parse_json(column1)
  from values
    ('{"id": "123","fname": "Jane","lname": "Smith","visit_dt": "2019-09-17"}'),
    ('{"id": "456","fname": "Peter","lname": "Williams","visit_dt": "2019-09-17"}');
-- Query the change data capture records in the table streams.
select * from rawstream1;
select * from rawstream2;
Task will be executed but no rows will be merged Task will return a warning message Task will be skipped
Task will be skipped
Which of these are Snowflake table types? Materialized Secure Temporary Permanent Transient External
Temporary Permanent Transient External Explanation: Secure and Materialized are types of views, while the rest are types of tables.
In a stream, if a row is added and then updated in the current offset, how will the delta change be represented The delta change is a new row The delta change will be a pair of delete and insert records The delta change will be a pair of insert and update records
The delta change is a new row Note: You can tell it is an update from METADATA$ISUPDATE
What factors influence the credits charged for a warehouse? The number of servers per cluster (determined by warehouse size) The number of clusters (if using multi-cluster warehouses) The size of the query that runs on the warehouse The length of time each server in each cluster runs
The number of servers per cluster (determined by warehouse size) The number of clusters (if using multi-cluster warehouses) The length of time each server in each cluster runs
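As a rough worked example (assuming the standard credit table, where a Medium warehouse consumes 4 credits per hour per cluster): a Medium warehouse running 2 clusters for 30 minutes consumes approximately 4 x 2 x 0.5 = 4 credits, billed per second after the first minute.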
In which scenarios would you consider to use materialized views The query results contain a small number of rows and/or columns relative to the base table Query results contain results that require significant processing Query is on an external table View's base table does not change frequently None of the above
The query results contain a small number of rows and/or columns relative to the base table Query results contain results that require significant processing Query is on an external table View's base table does not change frequently
Select all the options which are true about the data that is stored in a micro partition metadata The range of values for each of the columns in the micro-partition Mean and standard deviation of each column The number of distinct values Additional properties used for both optimization and efficient query processing
The range of values for each of the columns in the micro-partition The number of distinct values Additional properties used for both optimization and efficient query processing
You ran the below query on a warehouse with auto-suspend set at 5 seconds, and the result cache is disabled. SELECT * FROM INVENTORY; In the query profile, 'Percentage scanned from cache' is 0%. You ran the query again before 5 seconds had elapsed, and in the query profile 'Percentage scanned from cache' is 75%. You ran the query again after 5 seconds, and 'Percentage scanned from cache' is zero again. Why is this happening? The second run of the query used data cache to retrieve part of the result since it ran before the warehouse was suspended The second run of the query used query result cache The third run of the query used query result cache
The second run of the query used data cache to retrieve part of the result since it ran before the warehouse was suspended Note: Virtual warehouses are an abstraction over the compute instances of the cloud provider (in the case of AWS, EC2 instances). Each virtual warehouse is a cluster of these compute instances, and the instances have local SSDs attached to them. When you ran the query the first time, the results were retrieved from remote storage (the object store of the cloud provider, S3 in the case of AWS), and part of the data also got cached in the local SSD storage of the compute instances. So, when you ran the query the second time, part of the result was retrieved from the SSD cache, also known as the data cache. Then why did it not retrieve from the data cache the third time? The third run happened after 5 seconds. The virtual warehouse had an auto-suspend setting of 5 seconds, so after 5 seconds of inactivity the warehouse suspended itself. When the warehouse is suspended, it loses the data cache, because when it resumes it may be a completely new set of compute instances that were not used in the earlier runs. Hope this clarifies the concept, and you will probably never forget it now. Please remember that the SSD cache is also called the data cache. In a real-world scenario, how can you leverage this concept? If you have a nightly ETL that runs against several tables, group the ETL jobs which use similar tables to run at the same time. This will help you utilize the data cache effectively. I call this strategy ETL co-location :)
When will you consider creating a clustering key on a table. Please select all that applies The table houses multi-terabyte data The table has variant columns The query performance on the table has degraded over time
The table houses multi-terabyte data The query performance on the table has degraded over time note: 1. Clustering will be effective only when you have enough partitions in the table and the table is more than 1 TB 2. As you insert/update rows in the table, the partitions are moved around and hence it no longer remains clustered. With Auto clustering, Snowflake automatically reclusters the table based on a proprietary algorithm
Select the ones that are true about data sharing through data shares There is only one copy of data, which lives in the data provider's account Shared data is always live, real-time and immediately available to consumers Providers can establish revocable, fine-grained access grants to shares Data Sharing is supported between accounts in the same or different Snowflake Provider and Region
There is only one copy of data, which lives in the data provider's account Shared data is always live, real-time and immediately available to consumers Providers can establish revocable, fine-grained access grants to shares
The storage architecture of snowflake has two key features, they are Time travel Replication Zero-copy cloning Query tag
Time travel Zero-copy cloning
Time travel is available for which table types? Transient Permanent External Temporary
Transient Permanent Temporary Refer Link for details - https://docs.snowflake.com/en/user-guide/data-time-travel.html A key component of Snowflake Time Travel is the data retention period. When data in a table is modified, including deletion of data or dropping an object containing data, Snowflake preserves the state of the data before the update. The data retention period specifies the number of days for which this historical data is preserved and, therefore, Time Travel operations (SELECT, CREATE ... CLONE, UNDROP) can be performed on the data. The standard retention period is 1 day (24 hours) and is automatically enabled for all Snowflake accounts: For Snowflake Standard Edition, the retention period can be set to 0 (or unset back to the default of 1 day) at the account and object level (i.e. databases, schemas, and tables). For Snowflake Enterprise Edition (and higher): For transient databases, schemas, and tables, the retention period can be set to 0 (or unset back to the default of 1 day). The same is also true for temporary tables. For permanent databases, schemas, and tables, the retention period can be set to any value from 0 up to 90 days.
True or False: When data is staged to a SF internal staging area using the PUT command, the data is encrypted on the client's machine.
True
Warehouse cache size changes with warehouse size True False
True Note: The size of the cache is determined by the number of servers in the warehouse (i.e. the larger the warehouse and, therefore, the number of servers in the warehouse), the larger the cache. https://docs.snowflake.com/en/user-guide/warehouses-considerations.html#how-does-warehouse-caching-impact-queries
True or False: Micro-partitions are immutable?
True - subsequent changes of any type to the data will be written to additional micro-partitions
You are running your query in the worksheet using a large warehouse. You want to change the warehouse. How do you do that? Choose 2 answers USE WAREHOUSE ETL_WH; SET WAREHOUSE CONTEXT=ETL_WH Update the Warehouse field in the Context Menu located above the worksheet Go to the Worksheet properties page and set the Warehouse field to ETL_WH
USE WAREHOUSE ETL_WH; Update the Warehouse field in the Context Menu located above the worksheet
In snowflake, you want to compare whether two expressions are equal. The expressions can have null values. What is the best way to compare in this case Use EQUAL_NULL Run a query with WHERE EXPR1 = EXPR2 Run a query with WHERE EXPR2=EXPR1
Use EQUAL_NULL Note: This is an important concept to understand in Snowflake. It has caused me pain in production. If you do not follow this, you will have incorrect data in production. The EQUAL comparison operator (=) treats NULLs as unknown values. Like most SQL languages, comparing NULL = NULL does not return TRUE; in Snowflake, it returns NULL. The EQUAL_NULL function is NULL-safe, meaning it treats NULLs as known values when comparing equality. https://docs.snowflake.com/en/sql-reference/functions/equal_null.html#equal-null
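For illustration (the table and column names are hypothetical):
SELECT NULL = NULL;                                  -- returns NULL, not TRUE
SELECT EQUAL_NULL(NULL, NULL);                       -- returns TRUE
SELECT * FROM t1 JOIN t2 ON EQUAL_NULL(t1.c, t2.c);  -- NULL-safe join condition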
You want to identify the potential performance bottlenecks and improvement opportunities of a query. What will you do? Use Query Profile Use Explain plan Call snowflake support
Use Query Profile
You need to handle a huge number of concurrent users for your snowflake data warehouse. All of them are from the same team and need the same type of access. Which is the most appropriate warehouse strategy you will recommend? Create one warehouse for each user Create one warehouse for each group of users Use multi-cluster warehouse
Use multi-cluster warehouse
How do you validate the data that will be unloaded by a COPY INTO command? Load the data in a relational table and validate the rows Load the data in a CSV and validate the rows Use validation_mode='RETURN_ROWS'
Use validation_mode='RETURN_ROWS' Note: Execute COPY in validation mode to return the result of a query and view the data that will be unloaded from the orderstiny table if COPY is executed in normal mode: https://docs.snowflake.com/en/sql-reference/sql/copy-into-location.html#validating-data-to-be-unloaded-from-a-query
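For example (the stage and table names are hypothetical); the rows are returned for review instead of being written to the stage:
COPY INTO @mystage/unload/ FROM (SELECT * FROM orders LIMIT 5)
  VALIDATION_MODE = 'RETURN_ROWS';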
Semi structured data strings in snowflake are stored in a column with the below datatype VARIANT OBJECT VARCHAR NONE OF THE ABOVE
VARIANT
Your organization requires a snowflake instance that is a completely separate Snowflake environment, isolated from all other Snowflake accounts. Which edition will you suggest? Standard Premier Virtual Private Snowflake (VPS)
Virtual Private Snowflake (VPS)
In order to load data into snowflake, which of the below are required? Virtual Warehouse Predefined target table Staging location with data staged File format
Virtual Warehouse Predefined target table Staging location with data staged File format Note: A virtual warehouse is required to provide compute resources The target table must be pre-defined; the load does not auto-create tables Data must be staged through the PUT command A file format must be defined to indicate whether CSV, JSON, XML etc. is being loaded
You ran a query in snowflake and went to the query history tab. The query history shows you the below columns 1. QueryID 2. SQL TEXT 3. WAREHOUSE NAME 4. WAREHOUSE SIZE 5. SESSION ID 6. START TIME 7. END TIME Which of the above columns will indicate if a compute cost was incurred to run the query? WAREHOUSE NAME CREDIT WAREHOUSE SIZE SESSION ID
WAREHOUSE SIZE Note: Any time a query incurs compute cost, you will see the warehouse size populated for that query. Try an experiment as below 1. Run the query SHOW TABLES 2. Go to query history Do you see the warehouse size? No, because SHOW TABLES is a metadata query and does not incur any compute cost
If you attempt to PUT a file but cannot because a file with the same name already exists in the stage, what can you do to load the file? Wait until the existing file's data is loaded, then retry PUT Rename the file that you want to PUT Set OVERWRITE = TRUE Set FORCE COPY = TRUE
Wait until the existing file's data is loaded, then retry PUT Rename the file that you want to PUT Set OVERWRITE = TRUE
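A minimal sketch of the OVERWRITE option (the file path and stage name are hypothetical; PUT is run from SnowSQL, not the web UI):
-- Re-upload a file even though one with the same name already exists in the stage
put file:///tmp/sales.csv @my_int_stage overwrite = true;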
Which cache type gets purged in a predictable way so that it ends up empty of all cached information? Warehouse Cache Results Cache Metadata Cache
Warehouse Cache Note: Physical storage that gets purged every time the warehouse is suspended
When creating a warehouse, what are the two most critical factors to consider from a cost and performance perspective? Warehouse size (i.e. number of servers per cluster) Manual vs automated management (for starting/resuming and suspending warehouses) Cloud provider selection
Warehouse size (i.e. number of servers per cluster) Manual vs automated management (for starting/resuming and suspending warehouses)
Resource monitor notifications can be received by account administrators through the following means Web interface Email Pager Mobile
Web interface Email
Cloud services also use compute. Does Snowflake charge you for cloud services? Yes, if it exceeds 10% of the daily usage of the compute resources Yes, irrespective of how much compute resources are used Snowflake never charges for cloud services
Yes, if it exceeds 10% of the daily usage of the compute resources Note: Similar to virtual warehouse usage, Snowflake credits are used to pay for cloud services usage that exceeds 10% of the daily usage of the compute resources. Cloud services usage is charged only if the daily consumption of cloud services exceeds 10% of the daily usage of the compute resources. The charge is calculated daily (in the UTC time zone), which ensures that the 10% adjustment is accurately applied each day, at the credit price for that day. https://docs.snowflake.com/en/user-guide/credits.html#cloud-services-credit-usage
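A worked example of the rule (numbers are illustrative): if your warehouses consume 100 credits on a given day and cloud services consume 12 credits, the 10% adjustment is 10 credits, so only 12 - 10 = 2 cloud services credits are billed for that day. If cloud services had consumed only 8 credits, nothing would be billed for cloud services that day.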
In which of the below scenarios will you use an external table? You have data in the cloud provider's object store (AWS S3/GCS/Azure Blob) but the data cannot be copied or moved to any other location due to compliance regulations You have a high volume of data in the cloud provider's object store (AWS S3/GCS/Azure Blob) but only a part of the data is needed in Snowflake You have data in the cloud provider's object store (AWS S3/GCS/Azure Blob) which needs to be updated by Snowflake after applying required transformations You have XML data in the cloud provider's object store
You have data in the cloud provider's object store (AWS S3/GCS/Azure Blob) but the data cannot be copied or moved to any other location due to compliance regulations You have a high volume of data in the cloud provider's object store (AWS S3/GCS/Azure Blob) but only a part of the data is needed in Snowflake Note: Please use the elimination method to answer this question 1. External tables are read-only, hence #3 is not correct 2. External tables do not support XML, hence #4 is not correct
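A minimal sketch of an external table over staged CSV files (the stage, path, and column names are hypothetical):
create or replace external table ext_sales (
  sale_id number as (value:c1::number),
  amount  number as (value:c2::number)
)
location = @my_ext_stage/sales/
file_format = (type = csv)
auto_refresh = false;
-- The data stays in the cloud object store; Snowflake reads it in place and cannot update it.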
What command will you run to pause a pipe? alter pipe <pipe name> set pipe_execution_paused = true; alter pipe <pipe name> set pipe_execution_paused = stop; alter pipe <pipe name> set pipe_execution_paused = halt;
alter pipe <pipe name> set pipe_execution_paused = true;
You have created a secure view view1. Now you want to convert it back to a regular view. Which of the below can be used for the conversion? alter view view1 set type = unsecure; alter view view1 unset secure; alter view view1 set secure; None of the above
alter view view1 unset secure; Note: https://docs.snowflake.com/en/sql-reference/sql/alter-view.html#examples
You created a warehouse (ETL_WH) which is sized at XSMALL. You want to resize the warehouse to SMALL. How will you do that? alter warehouse etl_wh set warehouse_size=SMALL; You will need to recreate the warehouse with the new size Once created, a warehouse can never be re-sized
alter warehouse etl_wh set warehouse_size=SMALL;
Which Snowpipe REST API endpoints fetch a report about ingested files whose contents have been added to a table? loadHistoryScan and insertReport insertPipeReport insertFiles
loadHistoryScan and insertReport
How are virtual warehouse credits charged? per minute per second per-second, with a 60-second (i.e. 1-minute) minimum per hour
per-second, with a 60-second (i.e. 1-minute) minimum
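A worked example (using the standard rate of 1 credit per hour for an X-Small warehouse): running the warehouse for 30 seconds is billed at the 60-second minimum, i.e. 60/3600 ≈ 0.017 credits; running it for 90 seconds is billed per second, i.e. 90/3600 = 0.025 credits.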
You want to get the clustering information for a table T1 that is clustered on column C1. Which of the below queries will you run to get the details? select system$clustering_information('T1') select clustering('T1') from information_schema.clusteringinfo; Show clustering info for T1;
select system$clustering_information('T1')
Let's say that you have two JSONs as below 1. {"stuId":2000, "stuName":"Amy"} 2. {"stuId":2000,"stuCourse":"Snowflake"} How will you write a query that checks whether stuId in JSON #1 is also present in JSON #2? with stu_demography as (select parse_json(column1) as src, src:stuId as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src:stuId as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo; with stu_demography as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo; SELECT CONTAINS('{"stuId":2000, "stuName":"Amy"}','{"stuId":2000,"stuCourse":"Snowflake"}'); with stu_demography as (select parse_json(column1) as src, src['STUID'] as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo;
with stu_demography as (select parse_json(column1) as src, src:stuId as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src:stuId as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo; with stu_demography as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000, "stuName":"Amy"}')), stu_course as (select parse_json(column1) as src, src['stuId'] as ID from values('{"stuId":2000,"stuCourse":"Snowflake"}')) select case when stdemo.ID in(select ID from stu_course) then 'True' else 'False' end as result from stu_demography stdemo; Note: This may not be exactly how the question appears in the certification exam, so why are we still learning it? 1. When you take this course, I prepare you not just for the certification but also to work on a project. 2. With this hands-on exercise you will learn: a. There are two ways to query JSON, dot notation (src:stuId) and bracket notation (src['stuId']). b. JSON keys are case-sensitive; if you run the 4th query with src['STUID'], it will return 'False' because src['STUID'] returns NULL due to the mismatch in case. In the certification you may be asked questions on a) and b) - but do you agree that if you learn it this way you will never forget the concept? :)
One of the government agencies has decided to use Snowflake for their data warehouse. They require compliance with US federal privacy and security standards, such as FIPS 140-2 and FedRAMP (Moderate Impact). Which cloud providers can you choose to set up the Snowflake instance? A) AWS B) AZURE C) GOOGLE D) IBM
A) AWS B) AZURE Note: https://docs.snowflake.com/en/user-guide/intro-regions.html#government-regions
What are the default roles that snowflake automatically adds for any new account? A) AccountAdmin B) SecurityAdmin C) SysAdmin D) Public E) Security Admin
A) AccountAdmin B) SecurityAdmin C) SysAdmin D) Public
What are the two ways in Snowflake to begin recording change tracking metadata for a table? A) Change tracking is enabled on the table (using ALTER TABLE ... CHANGE_TRACKING = TRUE) B) A stream is created for the table (using CREATE STREAM) C) Enable CDC on the table
A) Change tracking is enabled on the table (using ALTER TABLE ... CHANGE_TRACKING = TRUE) B) A stream is created for the table (using CREATE STREAM)
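A minimal sketch of both ways (the table name orders is hypothetical):
-- Option A: enable change tracking explicitly
alter table orders set change_tracking = true;
-- Option B: creating a stream on the table also starts recording change tracking metadata
create or replace stream orders_stream on table orders;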
What would you check to see if a large table would benefit from explicitly defining a clustering key? A) Clustering depth B) Clustering percentage C) Clustering ratio
A) Clustering depth
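A minimal sketch of how you might check it (the table and column names are hypothetical):
select system$clustering_depth('orders');                 -- average depth on the existing clustering key
select system$clustering_depth('orders', '(order_date)'); -- depth if clustered on a candidate column
-- A large average depth that keeps growing suggests the table may benefit from an explicit clustering key.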
Which are the required parameters while creating a storage integration? A) Name B) Type C) Enabled D) Storage_Allowed_Locations E) Storage_blocked_locations
A) Name B) Type C) Enabled D) Storage_Allowed_Locations
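A minimal sketch for S3 (the integration name, role ARN, and bucket are hypothetical; in practice the cloud-specific provider and role/identity parameters are also needed):
create or replace storage integration my_s3_int
  type = external_stage
  storage_provider = 'S3'
  enabled = true
  storage_aws_role_arn = 'arn:aws:iam::123456789012:role/my_snowflake_role'
  storage_allowed_locations = ('s3://my-bucket/load/');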
Which factors affect your snowflake data loading rate? A) Physical location of the stage B) RAM on the virtual warehouse C) GZIP compression efficiency D) Thread size
A) Physical location of the stage C) GZIP compression efficiency Note: https://www.snowflake.com/blog/how-to-load-terabytes-into-snowflake-speeds-feeds-and-techniques/ Load rates for your own data files may differ based on a number of factors: location of your S3 buckets (for our test, both our Snowflake deployment and S3 buckets were located in us-west-2); number and types of columns (a larger number of columns may require more time relative to the number of bytes in the files); gzip compression efficiency (more data read from S3 per uncompressed byte may lead to longer load times).
Snowflake supports which of the below types of streams? Please select three A) Standard B) Update-only C) Append-only D) Insert-only
A) Standard C) Append-only D) Insert-only
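A minimal sketch of the three types (the table names are hypothetical):
create or replace stream orders_std on table orders;                                   -- standard: tracks inserts, updates, and deletes
create or replace stream orders_app on table orders append_only = true;                -- append-only: tracks inserts on standard tables
create or replace stream ext_sales_ins on external table ext_sales insert_only = true; -- insert-only: for external tables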
Which Azure regions are supported by Snowflake in the US and Canada? A) west-us-2 B) east-us-2 C) canada-central D) west-us-1
A) west-us-2 B) east-us-2 C) canada-central
Which statements are true with respect to SMT (Simple Message Transformation) in the Snowflake Kafka connector when neither key.converter nor value.converter is set? All SMTs are supported Most SMTs are supported regex.router is not supported
Most SMTs are supported regex.router is not supported
In the Snowflake Kafka connector, the RECORD_METADATA column contains which information? Topic Partition Key CreateTime / LogAppendTime Value
Topic Partition Key CreateTime / LogAppendTime
SQL functionality in snowflake can be extended by USER DEFINED FUNCTIONS USING SQL USER DEFINED FUNCTIONS USING JAVASCRIPT SESSION VARIABLES NONE OF THE ABOVE
USER DEFINED FUNCTIONS USING SQL USER DEFINED FUNCTIONS USING JAVASCRIPT SESSION VARIABLES
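A minimal sketch of each extension mechanism (the function and variable names are hypothetical):
-- SQL UDF
create or replace function area_of_circle(radius float)
  returns float
  as 'pi() * radius * radius';
-- JavaScript UDF (argument names are referenced in uppercase inside the JavaScript body)
create or replace function double_it(x float)
  returns float
  language javascript
  as 'return X * 2;';
-- Session variable
set tax_rate = 0.07;
select 100 * $tax_rate as tax;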
How will you store JSON data in Snowflake? A) Using a column with data type JSON B) Using a column with data type VARCHAR C) Using a column with data type VARIANT
C) Using a column with datatype as VARIANT
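A minimal sketch (the table and column names are hypothetical):
create or replace table json_events (payload variant);
insert into json_events select parse_json('{"stuId": 2000, "stuName": "Amy"}');
select payload:stuName::string as stu_name from json_events;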