D465 - Data Applications


Array

A collection of values in spreadsheet cells

Advantage of storing code in R

Allows reproducibility and collaboration among analysts.

Advantage of tidyverse

Cohesive data manipulation packages in R.

Fill in the blank The SQL command _____ combines table rows with the same values into summary rows. WITH GROUP BY TABLE ORDER BY

GROUP BY

Which HAVING clause indicates to only retrieve products that have been sold more than 100 times? HAVING COUNT(order_items.product_id) > 100 HAVING COUNT(order_items.product_id) < 100 HAVING (order_items.product_id) > 100 HAVING (order_items.product_id > 100)

HAVING COUNT(order_items.product_id) > 100
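
A minimal sketch of how GROUP BY and HAVING work together, run through Python's built-in `sqlite3` module. The `order_items` rows below are invented for illustration; only the table and column names come from the question above.

```python
import sqlite3

# Hypothetical order_items table; product IDs and row counts are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_items (order_id INTEGER, product_id INTEGER)")

# Product 1 appears in 150 order lines, product 2 in exactly 100, product 3 in 5.
rows = (
    [(i, 1) for i in range(150)]
    + [(i, 2) for i in range(100)]
    + [(i, 3) for i in range(5)]
)
conn.executemany("INSERT INTO order_items VALUES (?, ?)", rows)

# GROUP BY collapses the rows into one summary row per product;
# HAVING then filters those summary rows by their count.
popular = conn.execute(
    """
    SELECT product_id, COUNT(order_items.product_id) AS times_sold
    FROM order_items
    GROUP BY product_id
    HAVING COUNT(order_items.product_id) > 100
    """
).fetchall()
print(popular)
```

Product 2 is excluded because its count is exactly 100, which is not greater than 100.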

Output formats for documents

HTML, PDF, Word (docx), Markdown.

Different JOIN functions in SQL

INNER, LEFT, RIGHT, FULL OUTER JOIN types.

Symbol for comments in R

Pound sign (`#`) precedes comments in R.

Programming languages and use cases

Python: web dev, data science, ML, automation.

Nested function usage

Simplifies operations, improves code readability.

R unique challenges

Steeper learning curve, limited web dev capabilities.

Aggregation

The process of collecting or gathering many separate pieces into a whole

Absolute reference

A reference within a function that is locked so that rows and columns won't change if the function is copied

Knit button in R

Compiles R Markdown into desired output formats.

COUNTIF function in spreadsheets

Counts cells meeting a specified condition.

sample() function for biased data

Draws a random sample from a dataset, which helps avoid sampling bias.

dplyr filter() function

Subset rows based on specific conditions in R.

Data validation process

The process of checking and rechecking the quality of data so that it is complete, accurate, secure and consistent

Delimiter for code chunks

Triple backticks or markup to define code sections.

Smoothing line usage

Visual representation of trends in data.

Which of the following queries contain subqueries? Select all that apply. SELECT call FROM recordings ORDER BY call.employee_id, call.start_time; SELECT employee_id FROM employees WHERE department_id IN (SELECT department_id FROM departments WHERE location_id = 1000); SELECT product_name, CASE WHEN price < 10 THEN 'Low price' WHEN price >= 10 AND price < 20 THEN 'Medium price' ELSE 'High price' END AS price_category FROM products; SELECT price FROM sales WHERE price = (SELECT MAX (salary) FROM sales);

SELECT price FROM sales WHERE price = (SELECT MAX (salary) FROM sales); SELECT employee_id FROM employees WHERE department_id IN (SELECT department_id FROM departments WHERE location_id = 1000);

A spreadsheet cell contains the coldest temperature ever recorded in Austria: -37 degrees Celsius. Which function would convert that to Fahrenheit? =CONVERT(-37, F, C) =CONVERT(-37, C, F) =CONVERT(-37, "F", "C") =CONVERT(-37, "C", "F")

=CONVERT(-37, "C", "F")
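
As a sanity check on the arithmetic behind a Celsius-to-Fahrenheit conversion, the standard formula F = C × 9/5 + 32 can be sketched in Python. The function name is hypothetical and is not part of any spreadsheet API.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Austria's record low from the question above.
austria_record_f = celsius_to_fahrenheit(-37)
print(round(austria_record_f, 1))
```

The result is about -34.6 degrees Fahrenheit, so a conversion that returns a value far from that has its units in the wrong order.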

A data analyst at an engineering company calculates the number of spreadsheet rows that contain the value turbine. Which function do they use? =COUNTIF(C1:C100,"turbine") =COUNTIF(C1:C100,turbine) =COUNTIF(C1:C100,"=turbine") =COUNTIF(turbine=C1:C100)

=COUNTIF(C1:C100,"turbine")

Which function will return the number of characters in spreadsheet cell F8 in order to confirm it contains exactly 15 characters? =LEN(F8, 15) =LEN(15) =LEN(F8) =LEN(15, F8)

=LEN(F8)

A data analyst works with a spreadsheet containing product information that often has very long text strings. To check for consistency, they use a function to count the number of characters in cell P12. What is the correct syntax of the function? =LEN(P:12) =LEN(P,12) =LEN(P:P12) =LEN(P12)

=LEN(P12)

Which function will calculate the sum of the products of the corresponding items in the arrays M1:M4 and P1:P4? =SUMPRODUCT(M1:M4, P1:P4) =MULTIPLY(M1:M4, P1:P4) =PRODUCT(M1:M4, P1:P4) =ARRAY(M1:M4, P1:P4)

=SUMPRODUCT(M1:M4, P1:P4)

GROUP BY

A SQL clause that groups rows that have the same values from a table into summary rows

LIMIT

A SQL clause that specifies the maximum number of records returned in a query

OUTER JOIN

A SQL clause that combines RIGHT and LEFT JOIN to return all records from both tables, pairing rows where a match exists

JOIN

A SQL function that is used to combine rows from two or more tables based on a related column

COUNT DISTINCT

A SQL function that counts only the distinct values in a specified range

ROUND

A SQL function that returns a number rounded to a certain number of decimal places

INNER JOIN

A SQL function that returns records with matching values in both tables

RIGHT JOIN

A SQL function that will return all records from the right table and only the matching records from the left.

LEFT JOIN

A SQL function that will return all the records from the left table and only the matching records from the right table
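
The difference between INNER JOIN and LEFT JOIN is easiest to see side by side. The sketch below uses Python's `sqlite3` module with two small hypothetical tables (`books` and `biographies`); the titles and subjects are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE books (book_id INTEGER, title TEXT);
    CREATE TABLE biographies (book_id INTEGER, subject TEXT);
    INSERT INTO books VALUES (1, 'Book A'), (2, 'Book B'), (3, 'Book C');
    INSERT INTO biographies VALUES (1, 'Subject X'), (3, 'Subject Y');
    """
)

# INNER JOIN keeps only rows with a match in both tables.
inner_rows = conn.execute(
    """
    SELECT b.title, bio.subject
    FROM books AS b
    INNER JOIN biographies AS bio ON b.book_id = bio.book_id
    ORDER BY b.book_id
    """
).fetchall()

# LEFT JOIN keeps every row from the left table (books);
# unmatched rows get NULL (None in Python) for the right table's columns.
left_rows = conn.execute(
    """
    SELECT b.title, bio.subject
    FROM books AS b
    LEFT JOIN biographies AS bio ON b.book_id = bio.book_id
    ORDER BY b.book_id
    """
).fetchall()

print(inner_rows)
print(left_rows)
```

'Book B' has no biography row, so it is missing from the INNER JOIN result and appears with `None` in the LEFT JOIN result. The `AS b` and `AS bio` names are aliases that exist only for the duration of each query.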

Subquery

A SQL query that is nested inside a larger query
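
A minimal sketch of a subquery in action, again using Python's `sqlite3` module. The table and column names follow the employees/departments question earlier in this set; the rows themselves are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (department_id INTEGER, location_id INTEGER);
    CREATE TABLE employees (employee_id INTEGER, department_id INTEGER);
    INSERT INTO departments VALUES (10, 1000), (20, 2000);
    INSERT INTO employees VALUES (1, 10), (2, 20), (3, 10);
    """
)

# The inner SELECT runs first and yields the department IDs at location 1000;
# the outer SELECT then keeps only employees in those departments.
matching = conn.execute(
    """
    SELECT employee_id
    FROM employees
    WHERE department_id IN (
        SELECT department_id
        FROM departments
        WHERE location_id = 1000
    )
    ORDER BY employee_id
    """
).fetchall()
print(matching)
```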

Temporary table

A database table that is created and exists temporarily on a database server

SUMPRODUCT

A function that multiplies arrays and returns the sum of those products
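
What SUMPRODUCT computes can be sketched in a few lines of Python. The helper name and the sample numbers are hypothetical; the two lists stand in for ranges such as M1:M4 and P1:P4.

```python
def sumproduct(xs, ys):
    """Multiply corresponding items of two equal-length sequences and sum the results."""
    if len(xs) != len(ys):
        raise ValueError("both ranges must have the same length")
    return sum(x * y for x, y in zip(xs, ys))


quantities = [2, 3, 4, 1]     # stands in for M1:M4
unit_prices = [10, 5, 1, 20]  # stands in for P1:P4

# 2*10 + 3*5 + 4*1 + 1*20
total = sumproduct(quantities, unit_prices)
print(total)
```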

Calculated field

A new field within a pivot table that carries out certain calculations based on the values of other fields

Profit margin

A percentage that indicates how many cents of profit has been generated for each dollar of sale

VALUE

A spreadsheet function that converts a text string that represents a number to a numeric value

MATCH

A spreadsheet function used to locate the position of a specific lookup value

Summary table

A table used to summarize statistical information about data

Logical operators

AND (& or &&), OR (| or ||), NOT (!).

Plus sign in ggplot2

Adds layers to ggplot objects for customization.

When working with a temporary table in a SQL database, at what point will the table be automatically deleted? After completing all calculations in the table After running a report from the table After ending the session in the SQL database After running the query in the SQL database

After ending the session in the SQL database

What will this query return? SELECT * FROM Books_table LEFT JOIN Biography_table All records in the biography table and any matching rows from the books table All records in both the books table and the biography table All rows from the books table joined together with the biography table All records in the books table and any matching rows from the biography table

All records in the books table and any matching rows from the biography table

Modulo

An operator (%) that returns the remainder when one number is divided by another
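
A quick illustration of the modulo operator, shown both in a SQL query (run through Python's `sqlite3` module) and in plain Python. The numbers are arbitrary, and the sketch sticks to positive operands because languages differ on the sign of the remainder for negative ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 17 divided by 5 is 3 with a remainder of 2; % returns that remainder.
sql_remainder = conn.execute("SELECT 17 % 5").fetchone()[0]
python_remainder = 17 % 5

print(sql_remainder, python_remainder)
```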

Main operators in R

Arithmetic, relational, logical, assignment operators.

R and Python similarities

Both widely used in data science with extensive libraries.

Fill in the blank: A data professional uses the SQL _____ statement to return records that meet conditions by including an if/then statement in a query. CASE HAVING WHEN CONCAT

CASE
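
The if/then behavior of CASE can be sketched as follows, using Python's `sqlite3` module. The price bands mirror the CASE expression in the subquery question earlier in this set; the product rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (product_name TEXT, price REAL);
    INSERT INTO products VALUES ('pen', 2.5), ('notebook', 12.0), ('backpack', 45.0);
    """
)

# CASE checks each WHEN condition in order and returns the first match;
# ELSE covers every row that matched none of them.
labeled = conn.execute(
    """
    SELECT product_name,
           CASE
               WHEN price < 10 THEN 'Low price'
               WHEN price >= 10 AND price < 20 THEN 'Medium price'
               ELSE 'High price'
           END AS price_category
    FROM products
    ORDER BY price
    """
).fetchall()
print(labeled)
```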

Which SQL function combines groups of text strings from multiple cells in order to create a new string? CONCAT COMBINE CONSOLIDATE CONNECT

CONCAT

Fill in the blank: The spreadsheet function _____ can be used to tally the number of cells in a range that are not empty. RANGE COUNT COUNT DISTINCT RETURN

COUNT

You write a SQL query that will count values in a specified range. Which function should you include in your query to only count each value once, even if it appears multiple times? COUNT RANGE COUNT DISTINCT COUNT COUNT VALUES

COUNT DISTINCT

COUNT vs. COUNT DISTINCT in SQL

COUNT: total rows, COUNT DISTINCT: unique values.
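
A small sketch of the difference, using Python's `sqlite3` module and a hypothetical `orders` table in which some customers appear more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [("ann",), ("bob",), ("ann",), ("cy",), ("bob",)],
)

# COUNT tallies every non-NULL value; COUNT(DISTINCT ...) tallies each value once.
total_orders, unique_customers = conn.execute(
    "SELECT COUNT(customer), COUNT(DISTINCT customer) FROM orders"
).fetchone()
print(total_orders, unique_customers)
```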

Fill in the blank: The spreadsheet function _____ returns the number of cells within a range that match a specified value. COUNTIF ARRAY COUNT DISTINCT VALUE

COUNTIF

Which spreadsheet tool finds an average value using values generated within a pivot table? Filter Data validation Conditional formatting Calculated field

Calculated field

A data analyst selects Format Cells and the option Text Is Exactly Baseball. This changes the color of all the cells that contain the word "Baseball." What spreadsheet tool is the analyst using? Conditional formatting Filtering Data validation CONVERT

Conditional formatting

A junior data analyst writes the following formula: =AVERAGE($C$1:$C$100). What are the purposes of the dollar signs ($)? Select all that apply. Average the values in cells C1 to C100 regardless of whether the formula is copied. Ensure rows and columns do not change. Create an absolute reference. Perform the calculation more efficiently.

Create an absolute reference. Ensure rows and columns do not change.

You prepare a project tracker spreadsheet. Next to each project is the name of the team member responsible. What spreadsheet tool will create a drop-down list with team member names to save you time when assigning the projects? Data validation Conditional formatting Pop-up menus Find

Data validation

Fill in the blank: A junior data analyst at a healthcare organization uses the spreadsheet _____ function to locate specific characters from insurance provider account numbers. FROM FIND WHERE IDENTIFY

FIND

Why might a data professional add a CREATE TABLE statement to a temporary table? Include metadata about the data in the table Automate calculations in the table Give multiple people access to the table Create a second table within the temporary table

Give multiple people access to the table

What will GROUP BY do in this query? SELECT apartment, AVG(price) AS apt_prices FROM rent_data GROUP BY apartment; Group together the apartment and rent_data tables Group only the rows in the apt_prices table Group together the rent_data by apartment prices Group the rows in the table by apartment

Group the rows in the table by apartment

A data analyst wants to retrieve only records from a database that have matching values in two different tables. Which JOIN function should they use? OUTER JOIN RIGHT JOIN INNER JOIN LEFT JOIN

INNER JOIN

JOIN commands in SQL

INNER, LEFT, RIGHT, FULL OUTER JOIN types.

Common errors in ggplot2

Incorrect aesthetic mappings, syntax misunderstanding.

When working with subqueries, which query will execute first? Rightmost Outermost Innermost Leftmost

Innermost

What SQL clause can be added to this query to ensure only the first 50 results are returned? 1. SELECT * 2. FROM Leaf_Database 3. WHERE tree_type = maple LIMIT 50 FIRST 50 RETURN 50 ONLY 50

LIMIT 50

Underscores

Lines used to underline words and connect text characters

Tibbles vs. data frames

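Tibbles are modernized data frames: they print only the first rows and the columns that fit on screen, and they never change input types or variable names.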

A data professional writes a query that uses more than one arithmetic operator. What do they add to the query to control the order of the calculations? Dollar sign ($) Parenthesis [()] Colon [:] Backslash [/]

Parenthesis [()]

What data will appear in the temporary table created through this query? WITH plant_variety AS ( SELECT * FROM bigquery-public-data.plants.African_species WHERE daily_growth_rate_percentage = 0.05 ) Plant varieties that grow exactly 0.05 percent per day A random subset of African plant species Plant varieties that are equal to 0.05 inches tall All plant species that exist in the public dataset

Plant varieties that grow exactly 0.05 percent per day
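
A minimal sketch of the WITH clause, run locally with Python's `sqlite3` module. The BigQuery table from the question is replaced by a small hypothetical `african_species` table with made-up rows, so the output only illustrates the filtering behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE african_species (species TEXT, daily_growth_rate_percentage REAL);
    INSERT INTO african_species VALUES ('aloe', 0.05), ('baobab', 0.01), ('protea', 0.05);
    """
)

# WITH defines a temporary, named result set (plant_variety) that exists
# only for the query that follows it.
rows = conn.execute(
    """
    WITH plant_variety AS (
        SELECT *
        FROM african_species
        WHERE daily_growth_rate_percentage = 0.05
    )
    SELECT species
    FROM plant_variety
    ORDER BY species
    """
).fetchall()
print(rows)
```

Only the rows whose growth rate equals exactly 0.05 reach `plant_variety`; the row with 0.01 is filtered out before the outer SELECT runs.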

Locking table array in VLOOKUP

Prevents range changes for formula accuracy.

Data security

Protecting data from unauthorized access or corruption by adopting safety measures

A data professional runs a query that will return a dataset containing numbers out to five decimal places. Which SQL function will limit the records to two decimal places? LEN NUM LIMIT ROUND

ROUND
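
A short sketch of ROUND with a second argument for the number of decimal places, run through Python's `sqlite3` module. The input value is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# ROUND(value, 2) keeps two digits after the decimal point.
rounded = conn.execute("SELECT ROUND(3.14159, 2)").fetchone()[0]
print(rounded)
```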

Which data-validation menu option highlights data entry errors to ensure spreadsheet formulas continue to run correctly? Reject invalid inputs Forbid entry Deny text Remove validation

Reject invalid inputs

SELECT command in SQL

Retrieves data from one or more database tables.

SELECT statement usage in SQL

Retrieving data from one or more tables.

In a SQL query, what is the purpose of the modulo (%) operator? Return the remainder of a division calculation Convert a decimal to a percent Apply an exponent to a value Find the square root of a number

Return the remainder of a division calculation

MIN function in spreadsheets

Returns the smallest value in a cell range.

Pivot table elements

Rows, columns, values, filters for data aggregation.

Fill in the blank: A data analyst uses _____ to copy data from one table into a temporary table without adding the new table to the database. TEMP COPY TO WITH SELECT INTO

SELECT INTO

VLOOKUP function in spreadsheets

Searches for values in a vertical column.

Presentation formats

Slides, Dashboards, Interactive web apps.

FROM statement in SQL

Specifies tables for data retrieval in SQL queries.

You use VLOOKUP in a spreadsheet containing weather data. While searching for rainfall levels in Chicago, you encounter an error because your spreadsheet value has a trailing space after the city name. What function should you use to eliminate this space? CUT TRIM NOSPACE VALUE

TRIM
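
Why a trailing space breaks a lookup can be sketched in Python. The rainfall figure is a made-up placeholder, and `str.strip()` is used here as a rough stand-in for TRIM: it removes leading and trailing whitespace but, unlike the spreadsheet function, does not collapse repeated spaces inside the text.

```python
# Hypothetical lookup table keyed by city name.
rainfall_by_city = {"Chicago": 36.9}

raw_key = "Chicago "  # note the trailing space

# The exact-match lookup fails until the stray whitespace is removed.
found_before = raw_key in rainfall_by_city
found_after = raw_key.strip() in rainfall_by_city

print(found_before, found_after)
```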

Aliasing

Temporarily naming a table or column in a query to make it easier to read and write

Data aggregation

The process of gathering data from multiple sources and combining it into a single, summarized collection

What will this spreadsheet function return? =SUMIF(K20:K70, ">=50", L20:L70) The sum of all values in cells L20 to L70 that correspond to values in cells K20 to K70 that are greater than or equal to 50. The sum of any values in cells K20 to K70 and cells L20 to L70 that are greater than or equal to 50. The sum of all values in cells K20 to K70 for which the value in cells L20 to L70 is greater than or equal to 50. The count of the number of cells in the array K20:K70 that have a value greater than or equal to 50.

The sum of all values in cells L20 to L70 that correspond to values in cells K20 to K70 that are greater than or equal to 50.
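
The three-argument form of SUMIF can be sketched in Python as below. The helper name and sample values are hypothetical; the first list plays the role of the criteria range (K20:K70) and the second plays the role of the sum range (L20:L70).

```python
def sumif(criteria_range, predicate, sum_range):
    """Sum the values in sum_range whose paired criteria_range value passes the test."""
    return sum(
        value
        for criterion, value in zip(criteria_range, sum_range)
        if predicate(criterion)
    )


k_values = [40, 50, 75, 10]  # tested against ">=50"
l_values = [1, 2, 3, 4]      # summed where the test passes

# Rows 2 and 3 pass the test, so the result is 2 + 3.
total = sumif(k_values, lambda k: k >= 50, l_values)
print(total)
```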

Which of the following statements accurately describe pivot tables? Select all that apply. The calculated field in a pivot table is used to apply filters based on specific criteria. The values in a pivot table are used to calculate and count data. A pivot table is a data summarization tool. The rows of a pivot table organize and group data horizontally.

The values in a pivot table are used to calculate and count data. A pivot table is a data summarization tool. The rows of a pivot table organize and group data horizontally.

What is an example of an array in a spreadsheet? All cells with number values Cells D7, E14, and F20 The values in cells B2 through B31 All cells with values greater than 100

The values in cells B2 through B31

A data analyst at a recycling company manually recalculates the new column materials_sorter. They want to identify any rows with values that do not match those in the original column, compost_sorter. Which SQL clauses would enable them to do so? Select all that apply. WHERE materials_sorter !! compost_sorter WHERE materials_sorter >< compost_sorter WHERE materials_sorter <> compost_sorter WHERE materials_sorter != compost_sorter

WHERE materials_sorter <> compost_sorter WHERE materials_sorter != compost_sorter
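
Both not-equal operators can be checked against the same data. The sketch below uses Python's `sqlite3` module, which accepts `<>` and `!=`; the table name `sorting` and its rows are invented, while the two column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sorting (row_id INTEGER, compost_sorter TEXT, materials_sorter TEXT);
    INSERT INTO sorting VALUES (1, 'A', 'A'), (2, 'B', 'C'), (3, 'D', 'D');
    """
)

# <> and != both mean "not equal", so the two queries return the same rows.
mismatch_angle = conn.execute(
    "SELECT row_id FROM sorting WHERE materials_sorter <> compost_sorter"
).fetchall()
mismatch_bang = conn.execute(
    "SELECT row_id FROM sorting WHERE materials_sorter != compost_sorter"
).fetchall()

print(mismatch_angle, mismatch_bang)
```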

Functions in ggplot2

ggplot(), geom_point(), geom_line(), aes().

Spreadsheet cell D5 contains the decimal .74. Which formula will convert it to a percentage? =D5%100 =D5,100 =D5(100) =D5*100

=D5*100

Fill in the blank: Aliasing involves _____ naming a table or column to make a query easier to read and write. permanently perpetually temporarily privately

temporarily

Fill in the blank: To copy data from one table into a _____, a data professional uses the SELECT INTO statement. temporary table new table defined function table view

temporary table

Basic aesthetic attributes in ggplot2

x-axis, y-axis, color for plot customization.

Fill in the blank: A data professional uses _____ in order to ensure spreadsheet values are static, rather than carrying over a preexisting formula or function. conditional formatting formatting paste values only data validation

paste values only

Fill in the blank: To combine rows from two or more tables based on a _____ column, data professionals use the SQL JOIN clause. unique dissimilar foreign related

related

Fill in the blank: The _____ of a pivot table organize and group the selected data horizontally. columns rows filters values

rows

