Week 3: Data aggregation
A subquery can't be nested in a SET command.
A SET command can't have a nested subquery because it is used with UPDATE to adjust specific columns and values in a table
JOIN
A SQL clause that is used to combine rows from two or more tables based on a related column
OUTER JOIN
A clause that combines RIGHT and LEFT JOIN to return all records from both tables, matched where possible
VALUE
A function that converts a text string that represents a number to a numerical value
INNER JOIN
A clause that returns only the records with matching values in both tables
RIGHT JOIN
A clause that returns all records from the right table and only the matching records from the left table
LEFT JOIN
A clause that returns all records from the left table and only the matching records from the right table
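A minimal sketch of how INNER JOIN and LEFT JOIN differ, run through Python's built-in sqlite3 module. The `employees` and `departments` tables and their contents are hypothetical, made up only for illustration. A RIGHT JOIN would give the mirror image of the LEFT JOIN result, keeping every department instead of every employee.

```python
import sqlite3

# Hypothetical tables, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept_id INTEGER);
    CREATE TABLE departments (dept_id INTEGER, dept_name TEXT);
    INSERT INTO employees VALUES ('Ana', 1), ('Ben', 2), ('Cy', NULL);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Ops'), (3, 'Legal');
""")

# INNER JOIN: only rows that have a match in both tables.
inner_rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employees AS e
    INNER JOIN departments AS d ON e.dept_id = d.dept_id
    ORDER BY e.name
""").fetchall()

# LEFT JOIN: every employee, with NULL (None in Python) where no department matches.
left_rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employees AS e
    LEFT JOIN departments AS d ON e.dept_id = d.dept_id
    ORDER BY e.name
""").fetchall()

print(inner_rows)  # Cy is dropped: no matching department
print(left_rows)   # Cy is kept, paired with None
```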
COUNT DISTINCT
A function that returns the number of distinct values in a specified range
COUNT in SQL
A function that returns the number of rows in a specified range
Subquery
A query within another query
Absolute reference
A reference that is locked so that rows and columns won't change when copied
HAVING
A clause that filters the results of your query rather than the rows of the underlying table; it is used with aggregate functions
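A small sketch of HAVING in action, using Python's sqlite3 module. The `orders` table is a hypothetical example. The filter is applied to the grouped counts, not to individual rows of the table.

```python
import sqlite3

# Hypothetical orders table, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('Ana', 20), ('Ana', 35), ('Ben', 10),
        ('Cy', 50), ('Cy', 5), ('Cy', 15);
""")

# GROUP BY builds one row per customer; HAVING then keeps only
# the groups whose aggregate (the order count) is at least 2.
having_rows = conn.execute("""
    SELECT customer, COUNT(*) AS order_count
    FROM orders
    GROUP BY customer
    HAVING COUNT(*) >= 2
    ORDER BY customer
""").fetchall()

print(having_rows)  # Ben has a single order, so his group is filtered out
```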
You will use COUNT and COUNT DISTINCT
Anytime you want to answer questions about "how many"
Data can also be aggregated over a given time period to provide statistics such as:
Averages, minimums, maximums, and sums
What is the key difference between COUNT and COUNT DISTINCT in a database query?
COUNT returns the number of rows in a specified range. COUNT DISTINCT returns the number of distinct values in a specified range, so each value is counted only once.
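To make the difference concrete, here is a short sketch using Python's sqlite3 module. The `visits` table is hypothetical; it holds five rows but only three different visitors.

```python
import sqlite3

# Hypothetical visits table, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (visitor TEXT);
    INSERT INTO visits VALUES ('Ana'), ('Ben'), ('Ana'), ('Cy'), ('Ana');
""")

# COUNT counts every row; COUNT(DISTINCT ...) counts each value once.
total_visits, unique_visitors = conn.execute(
    "SELECT COUNT(visitor), COUNT(DISTINCT visitor) FROM visits"
).fetchone()

print(total_visits, unique_visitors)
```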
COUNT in spreadsheets
Can be used to count the total number of numerical values within a specific range in spreadsheets
Aggregation
Collecting or gathering many separate pieces into a whole
You can use comparison operators such as >, <, or = within subqueries.
Comparison operators such as >, <, or = help you compare data in subqueries. You can also use multiple row operators including IN, ANY, or ALL.
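As an illustration of a comparison operator used with a subquery, the sketch below (Python's sqlite3 module, hypothetical `products` table) keeps only the products priced above the average. The inner query returns a single value, so a single-value operator such as > is appropriate.

```python
import sqlite3

# Hypothetical products table, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (name TEXT, price REAL);
    INSERT INTO products VALUES ('pen', 2), ('bag', 30), ('lamp', 40), ('mug', 8);
""")

# The subquery in parentheses computes the average price (20 here);
# the outer query compares each row's price against that one value.
above_average = conn.execute("""
    SELECT name
    FROM products
    WHERE price > (SELECT AVG(price) FROM products)
    ORDER BY name
""").fetchall()

print(above_average)
```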
A subquery can have more than one column specified in the SELECT clause.
For a subquery to compare multiple columns, those columns must be selected in the main query.
Troubleshooting questions
How should I prioritize these issues? In a single sentence, what's the issue I'm facing? What resources can help me solve the problem? How can I stop this problem from happening in the future?
If you enter FALSE as the last input parameter in a VLOOKUP function, VLOOKUP will search for _____.
If you enter FALSE as the last input parameter in a VLOOKUP function, VLOOKUP will search for an exact match.
In SQL, what is a subquery?
In SQL, a subquery is a query nested within another query.
In VLOOKUP, TRUE tells the function to search for exact matches, and FALSE tells the function to look for approximate matches.
In VLOOKUP, TRUE tells the function to search for approximate matches, and FALSE tells the function to look for exact matches.
In the function =VLOOKUP(K2,'Sheet 4'!A:B,2,TRUE), what does the word TRUE indicate?
In the function =VLOOKUP(K2,'Sheet 4'!A:B,2,TRUE), TRUE tells VLOOKUP to search for approximate matches.
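The two match modes can be imitated in a few lines of Python. This is only a rough analogue written for illustration, not how any spreadsheet implements VLOOKUP; the `vlookup` helper and the `grades` table are hypothetical. It assumes, as spreadsheets do for approximate matching, that the first column is sorted in ascending order.

```python
from bisect import bisect_right


def vlookup(search_key, table, index, is_sorted=True):
    """Rough Python analogue of a spreadsheet VLOOKUP (illustrative sketch only)."""
    if is_sorted:
        # Approximate match (TRUE): assumes the first column is sorted ascending
        # and picks the last row whose key is less than or equal to search_key.
        keys = [row[0] for row in table]
        pos = bisect_right(keys, search_key)
        if pos == 0:
            return None  # a spreadsheet would show #N/A here
        return table[pos - 1][index - 1]
    # Exact match (FALSE): return from the first row whose key equals search_key.
    for row in table:
        if row[0] == search_key:
            return row[index - 1]
    return None  # a spreadsheet would show #N/A here


# Hypothetical lookup table: minimum score in column 1, letter grade in column 2.
grades = [(0, "F"), (60, "D"), (70, "C"), (80, "B"), (90, "A")]

approx = vlookup(85, grades, 2, True)       # falls back to the 80 row
exact_miss = vlookup(85, grades, 2, False)  # no row has key 85
exact_hit = vlookup(80, grades, 2, False)   # key 80 exists

print(approx, exact_miss, exact_hit)
```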
A data analyst wants to combine rows from four tables in a database. Which SQL clause combines two or more tables based on a related column?
JOIN is a SQL clause that combines rows from two or more tables based on a related column. INNER JOIN, OUTER JOIN, LEFT JOIN, and RIGHT JOIN variations are available. If JOIN is used, an INNER JOIN is assumed.
Subqueries don't have to be enclosed within parentheses.
Parentheses are used to mark the beginning and end of a subquery.
When do you need to use VLOOKUP?
Populating data in a spreadsheet; merging data from one spreadsheet with data in another
Example of Aggregation
Puzzle pieces = data; organization = aggregation; pile of pieces = summary; putting the pieces together = gaining insights
CASE
Returns records with your conditions by allowing you to include if/then statements in your query
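A brief sketch of CASE as an if/then statement inside a query, run with Python's sqlite3 module. The `scores` table and the band thresholds are hypothetical choices made for the example.

```python
import sqlite3

# Hypothetical scores table, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (student TEXT, score INTEGER);
    INSERT INTO scores VALUES ('Ana', 92), ('Ben', 71), ('Cy', 48);
""")

# Conditions are checked top to bottom; the first one that is true
# decides the label, and ELSE covers everything left over.
case_rows = conn.execute("""
    SELECT student,
           CASE
               WHEN score >= 90 THEN 'high'
               WHEN score >= 60 THEN 'medium'
               ELSE 'low'
           END AS band
    FROM scores
    ORDER BY student
""").fetchall()

print(case_rows)
```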
A subquery may be nested in a SELECT clause.
Subqueries are usually nested in the SELECT, FROM, and/or WHERE clauses. Subqueries can't be nested in SET queries.
Subqueries that return more than one row can only be used with multiple value operators.
Subqueries that return more than one row rely on multiple value operators such as the IN command.
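The following sketch shows a subquery that returns several rows being consumed by the IN operator. It uses Python's sqlite3 module; the `customers` and `orders` tables are hypothetical.

```python
import sqlite3

# Hypothetical tables, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 120), (3, 300), (3, 20), (2, 15);
""")

# The inner query returns more than one customer_id (1 and 3),
# so the outer query uses IN rather than a single-value operator like =.
big_spenders = conn.execute("""
    SELECT name
    FROM customers
    WHERE id IN (SELECT customer_id FROM orders WHERE amount > 100)
    ORDER BY name
""").fetchall()

print(big_spenders)
```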
A data analyst is working with two tables in a database. Which JOIN clause enables them to combine RIGHT and LEFT JOIN functionality to return matching records from either table?
The OUTER JOIN clause enables them to combine RIGHT and LEFT JOIN functionality to return matching records from either table.
A data analyst applies the VALUE function to a text string that represents a number, but is formatted as text. What does the VALUE function do to that text string?
The VALUE function converts the text string to a numerical value.
The parent query executes before its inner query.
The innermost query executes first. Its parent query executes last so it can use the results returned by inner queries.
Data aggregation
The process of gathering data from multiple sources in order to combine it into a single summarized collection
What is the purpose of an absolute reference within a function, such as "$C$3"?
The purpose of an absolute reference is to lock the reference to a row or column so values won't change when a function is copied.
A subquery is also called an outer query or outer select. The statement containing a subquery is called an inner query or inner select.
The statement containing a subquery is called an outer query or outer select. The subqueries nested within it are called inner queries or inner selects.
To change a text string in spreadsheet cell F8 to a numerical value, what is the correct function?
To change the text string in spreadsheet cell F8 to a numerical value, the correct syntax is =VALUE(F8). Within the parentheses, the VALUE syntax must include a reference to the specific cell whose value the function should convert.
A data analyst wants to retrieve only records from a database that have matching values in two different tables. Which JOIN function should they use?
To retrieve only records from a database that have matching values in two different tables, the analyst should use INNER JOIN.
To search for the height of the building in Mecca, what is the correct VLOOKUP syntax?
To search for the height of the building in Mecca, the correct syntax is =VLOOKUP("Mecca", A2:D7, 3, false). "Mecca" is the reference. A2:D7 is the table array. The 3 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.
To search for the population of Nigeria, what is the correct VLOOKUP syntax?
To search for the population of Nigeria, the syntax is =VLOOKUP("Nigeria", A2:C10, 2, false). "Nigeria" is the reference. A2:C10 is the table array. The 2 indicates the position of the column from which the value should be returned. And the word false instructs the function to return an exact match. A reference typed directly into the formula as text must be enclosed in quotation marks.
You are writing a SQL query to instruct a database to count values in a specified range. You only want to count each value once, even if it appears multiple times. Which function should you include in your query?
To tell a database to count each distinct value in a specified range only once, the analyst should use COUNT DISTINCT in their query.
A data analyst wants to temporarily name a column in their query to make it easier to read and write. What technique should they use?
To temporarily name a column in a query to make it easier to read and write, the analyst should use aliasing.
VLOOKUP has certain limitations. One limitation is that it only returns the first match it finds within the specified range. Another limitation is that VLOOKUP can only search through the first column in a spreadsheet.
VLOOKUP only returns the first match it finds within a specified range, and it can only return values from columns to the right of the column it searches.
Aliasing
When you temporarily name a table or column in your query to make it easier to read and write
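A short sketch of aliasing with AS, run through Python's sqlite3 module. The long table name `warehouse_inventory_items` is hypothetical and chosen only to show why a short alias helps. The alias exists only for the duration of the query; the table itself is not renamed.

```python
import sqlite3

# Hypothetical table with a deliberately long name, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouse_inventory_items (item_description TEXT, qty INTEGER);
    INSERT INTO warehouse_inventory_items VALUES ('bolt', 40), ('nut', 60);
""")

# "w" is a temporary table alias; "total_qty" is a temporary column alias.
cursor = conn.execute("""
    SELECT SUM(w.qty) AS total_qty
    FROM warehouse_inventory_items AS w
""")

# The column alias is what the result set reports as the column name.
column_names = [col[0] for col in cursor.description]
total_qty = cursor.fetchone()[0]

print(column_names, total_qty)
```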