advanced SQL notes

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

how would SELECT film_id, title, rental_rate FROM film WHERE rental_rate > (SELECT AVG(rental_rate) FROM film); be written in two queries? original question: how would you find the films whose rental rate is higher than the average rental rate?

1.) SELECT AVG(rental_rate) FROM film; result: 2.98 2.) SELECT title, rental_rate FROM film WHERE rental_rate > 2.98; **basically, the nested statement (the one in the WHERE clause) would be the statement the "first" question (the question you're actually trying to solve) is based off of (the "second" question) context: in the subquery example, the first statement found the average rental rate and then had a "normal" SELECT statement with a WHERE clause condition that selected all the rental rates that were higher than the average rental rate found in the first SELECT statement (the subquery calculated the nested query FIRST): SELECT title, rental_rate FROM film WHERE rental_rate > (SELECT AVG(rental_rate) FROM film);

general order of WHERE, HAVING, GROUP BY , and ORDER BY

1.) WHERE 2.) GROUP BY 3.) HAVING 4.) ORDER BY;

what are JOIN statements used for?

JOIN statements are used for relating data from one table to data from another table. **usually used if one table has information/a record that another doesn't have and you want to obtain data in the output from both of them. for each row in table1, sql will scan table2 and check if there's any row that matches the condition specified in the INNER JOIN statement ex.) INNER JOIN B ON A.pka = B.fka; if it finds a match, it combines columns of both rows into one row and add the combined row to the returned result set. **think of the overlapping areas of two circles in a venn diagram

what is the syntax for the string concatenate operator? how do you put a space in between both columns when they are concatenated?

SELECT col1 || col2 FROM table; add a blank space in single quotes between two double bars (and then second column) ex.) SELECT first_name || ' ' || last_name FROM customer;

how would syntax look when using the GROUP BY clause?

SELECT column1, aggregate_function(column2) FROM tbl GROUP BY column1; the output would return column1, and a new column with the aggregate function **it's good practice to always put the same column in the SELECT statement in the GROUP BY statement because some sql engines require that (postgre doesn't) **context: basically, GROUP BY's output returns column1 as the main/first column in the output and then the aggregate function will be the second column that shows a number, the aggregate function that you chose, like COUNT, will show how many times column1 happened.

how would syntax looks when using the GROUP BY clause but with two columns in it?

SELECT column1, column2, aggregate_function(column2) FROM tbl GROUP BY column1, column 2; the difference between having 2 columns versus having 1 column in the GROUP BY clause is that 1 column: if there are multiple instances for column1, then the output table would just list col1 once (because it's being grouped by col1) and then the aggregate function (like COUNT) , which would list how many instances occurred for column 1 ex.) students taking courses in college: SELECT subject, COUNT(*) FROM Subject_Selection GROUP BY subject; output table: subject count art 5 math 2 english 3 2 columns: if there are two columns you would want to group by, like subject and semester, then you'd have to have the same columns in the SELECT clause in the WHERE clause (it's good practice to do so because most SQL engines require that. postgresql doesn't require the col in the SELECT clause to be the same as in the WHERE clause, but when using two columns in the WHERE clause, then postgresql needs those two columns to also be in the SELECT clause as well) ex.) students taking courses in college per semester: SELECT subject, semester, COUNT(*) FROM Subject_Selection GROUP BY subject, semester output table: subject semester count art 1 2 art 2 3 math 2 2 english 1 3 here, there may be only 2 students taking the art class in semester 1, but 3 in semester 2. the output table is grouping by both the subject AND the semester columns.

what is a subquery? what does a subquery do? how do you construct the syntax of a subquery?

a query nested inside another query allows you to use multiple SELECT statements, where a query is used within a query put the second query in (round) brackets and use it in the WHERE clause as an expression ex.) SELECT film_id, title, rental_rate FROM film WHERE rental_rate > (SELECT AVG(rental_rate) FROM film);

what type of JOIN is used by default if you just type "JOIN" without specifying whether it's an INNER JOIN or an OUTER JOIN?

an INNER JOIN

how many aggregate functions can you put in the SELECT clause?

as many as you want, just separate them with a comma: ex.) SELECT staff_id, COUNT(amount), SUM(amount) FROM payment GROUP BY staff_id; **the aggregate functions must all contain the same column, though

what is the operator for counting the amount of characters in a string? what statement is this usually placed/associated with?

char_length(column) SELECT statement ex.) SELECT first_name, char_length(first_name) FROM customer; output will return a column with the first names of the customer table and a second column with a number that would be the amount of characters in each cell of the "first_name" column ex.) Jared 5

what does the UNION operator do? how do you type the syntax? what rules must be followed to use it?

combine result sets of two or more SELECT statements into a single result set syntax: SELECT col1, col2.... FROM tab1 UNION SELECT col1, col2.... FROM tab2; 1.) both SELECT queries must return the same amount of columns (if you're pulling only 2 columns in one SELECT statement, then the other SELECT statement must be pulling only 2 columns as well) 2.) the corresponding columns in the queries must have the compatible data types (if the data type of col1 in the first SELECT query is an integer, then the data type of col1 from the second SELECT query must be an integer as well) **can use a UNION ALL operator, which shows all rows from each SELECT query (if say an employee named sally made $100 in tbl_monday and tbl_tuesday, then sally would only show up once in the result set with $100. if she made $100 in tbl_monday and $110 in tbl_tuesday, then she'd show up twice with her respective earnings)

what does the extract function do? where does the extract function generally go? how does the syntax look like?

extract parts from (specifically) dates the SELECT statement extract (units from col) where: "units" is the type of unit (like day/month/year) that you want to extract "from" is in lowercase to differentiate from the FROM statement that pulls from an actual column "col" is the column where you're extracting the timestamp from ex.) SELECT extract(day from payment_date) ***where the payment_date would be something like "2007-02-15 22:25:46:996577" and the output would result in "15", which is the day from the timestamp **extract is useful when you want to filter by something (in a column like a timestamp column)

how would you know you would need to use a JOIN?

if you wanted to return output based on another column that is only available in another table (even if both tables you are using have the same column) ex.) rental table and inventory table both have column "inventory_id" but only the inventory table has

what does the UNION operator generally do?

it removes all duplicate rows (unless the UNION ALL operator is used)

random info on the OR operator

logical operator OR unlike the AND operator, you could use the same column when using the OR operator *when using the OR operator, you must state the column for each condition of the OR statement ex.) SELECT * FROM facilities WHERE faceid = 1 OR faceid = 5; you CAN'T state the WHERE clause as: WHERE faceid = 1 OR 5;

does the column in the SELECT clause have to be the same column in the WHERE clause?

no WHERE states a condition that must be met, whether that involves another column or not, in order for the output in the SELECT statement to come up. ex.) SELECT customer_id, SUM(amount) FROM payment WHERE staff_id = 2 GROUP BY customer_id HAVING SUM(amount) > 110;

how many databases can you work with in postgresql at one time? how do you switch between databases in postgresql?

only 1 you must disconnect from the database currently active and then open the query tool in another database disconnect by: right-clicking active database select "disconnect database..."

what exactly does an INNER JOIN return as output?

only the set or records that match in both tables (only the records that appear in the column that both tables share) venn diagram: only the area shared by both circles

what does a LEFT OUTER JOIN return as output?

produces a complete set of records from table A (left circle of venn diagram) with the matching records (where available) in table B (right circle in venn diagram) if there is no match, the right side/circle will contain the null venn diagram: everything in the left circle including the overlapping area with the right circle **the output would show all rows in table A whether or not there's a matching value in table B. If there's a matching value in table A with a value in table B (both tables have a value of "bird") it will show in the output for both tables. However, all the rows in table A without a matching value in table B will show the value in table A, but it will be matched with the word "null" **the difference between a LEFT OUTER JOIN and a FULL OUTER JOIN is that it returns everything from the left table/circle and if there is no match from the right table/circle, then it will show there is a null. if there is no match from the left table, then it wouldn't show a null.

what does a FULL OUTER JOIN return as output?

produces the set of all records in both tables with matching records from both sides where available. if there is no match, the missing side will contain null (the output will show the word "null" where there is no shared records between the tables - if one table has a record the other doesn't, it will show for that record but not the other) venn diagram: everything is shaded

how would the syntax look when using a JOIN versus using a self join for the following question: retrieve the customers whose last name match the first name of another customer

self join: SELECT a.first_name, a.last_name, b.first_name, b.last_name FROM customer AS a, customer AS b WHERE a.first_name = b.last_name JOIN: SELECT a.first_name, a.last_name, b.first_name, b.last_name FROM customer AS a JOIN customer AS b ON a.first_name = b.last_name ** 1.) basically, you're deleting the comma between the aliases of the table in the FROM clause and inserting the JOIN clause between them. 2.) then you're replacing the WHERE clause with the ON clause that's it.

what does SQL assume when you type simply JOIN? what does SQL assume when you type LEFT JOIN

that it's an INNER JOIN that it's a LEFT OUTER JOIN

what is the string operator for concatenation? what is this not? where is it on the keyboard?

the double "bar" (||) the "l" (as in love) key shift + backlash (the key usually above the enter key, which is a slash leaning to the "left")

what does a FULL OUTER JOIN with WHERE return as output? how do you perform it?

the set of records unique to table A and table B syntax: perform the same syntax as a FULL OUTER JOIN but exclude the records you don't want fro BOTH SIDES (overlapping area of the venn diagram) with a WHERE clause venn diagram: everything in the left and right circle except the overlapping area ex.) SELECT * FROM tableA FULL OUTER JOIN tableB ON tableA.col1 = tableB.col1 WHERE tableA.id IS null OR tableB.id is null **WHERE tableA.id IS null OR tableB.id is null **the output will return all rows from both tables/circles that DON'T have a match (no matching values will show)

what does a LEFT OUTER JOIN with WHERE return as output? how do you type the syntax to perform it?

the set or records only from table A but not in table B. venn diagram: everything in the left circle excluding the right circle AND the overlapping area of the right circle with the left circle syntax: perform the same syntax as a LEFT OUTER JOIN but exclude the records you don't want from the right side/table B with a WHERE clause ex.) SELECT * FROM tableA LEFT OUTER JOIN tableB ON tableA.col1 = tableB.col1 WHERE tableB.id IS null **"WHERE tableB.id IS null" - is the syntax that takes out the overlapping area between both tables. **"tableB" in tableB.id will specify the table that you don't want to show in the output so all the records from table A will show but none from table B - the only thing that will show would be a "null" **this will return the output with rows from table A that DOESN'T have a match in table B

what can the LIKE operator NOT work with?

timestamps. LIKE is used with strings - timestamps are NOT STRINGS so you can't use them with the LIKE operator ex.) THIS WOULDN'T WORK! SELECT starttime FROM bookings WHERE starttime LIKE '2012-09%'; for timestamps, you need to use the BETWEEN operator

what is the general syntax of JOINs and how is it written?

1.) specify all the columns from both tables that you want to return as output, with the column that both tables share as the first (it's generally good practice to do this but not necessary) 2.) specify the main table (you choose which one it is) in the FROM clause 3.) specify the table that the main table joins to in the INNER JOIN clause. in addition, you put a join condition after the ON keyword i.e., A.pka = B.fka. **the column you choose with the ON keyword will be the column that both tables share ex.) SELECT A.pka, A.c1, B.fka, B.c2 FROM A INNER JOIN B ON A.pka = B.fka; where: A = alias for table1 B = alias for table2 pka = primary key of table1 fka = foreign key of table 2 that's the same as the primary key of table 1 c1 = column1 c2 = column 2 **when using INNER JOIN, you must specify the table of the column you are selecting in the SELECT CLAUSE: A in A.pka is the table, pka in A.pka is the column. **the part of the INNER JOIN statement after "ON" can be interchangeable - either type "A.pka =B.fka" or "B.fka = A.pka". the column "pka" or "fka" don't have to show up in the SELECT statement, but it can depending on if that's what you want to see in the output (you'd just need to specify the column in the select statement) **if a column only shows up in ONE table, then you don't need to specify the table in the SELECT statement. you could if you want to but it wouldn't make a difference. same is true for the ORDER BY statement **INNER JOIN statements will come before the WHERE statement, which means it comes before WHERE, GROUP BY, HAVING, and ORDER BY **usually, what you'd be selecting in the SELECT statement would be columns exclusive to either table but are related to each table.

how many joins are there? which is the most basic? what statement are JOINs found in?

3: inner join, outer join, and self-join inner join is the most basic JOIN statements are found in the SELECT statements

what does the HAVING clause do? what is it usually used in conjunction with? how does the syntax look?

HAVING clause: filters group rows that don't satisfy a given condition (almost like the WHERE clause but you're using it in association with GROUP BY. basically replacing WHERE with HAVING) used in conjunction with the GROUP BY clause syntax: SELECT col1, aggregate_function(column2) FROM tbl GROUP BY col1 HAVING condition;

what is the AS statement used for? what statement is it usually in conjunction with?

AS is used to rename columns/table selections with an alias. ideally used to rename aggregate columns from the SELECT statement in the output, but can also be used for non-aggregate columns (when you use an aggregate, the title of the column for something like SUM(amount) that appears in the output will be "sum numeric", which is a general title. you could rename it as something else with AS) ex.) SELECT payment_id AS my_payment_column FROM payment; or SELECT customer_id, SUM(amount) AS total_spent FROM payment GROUP BY customer_id; AS can be used anywhere if it's giving an alias to something - it can give a long table name an alias in FROM, JOIN, or a long column name in SELECT **for advanced sql users, you can just replace the AS statement with a simple space and sql would automatically give an alias to whatever comes before the space ex.) SELECT payment_id AS my_payment_column FROM payment; is the same as: (AS is replaced by the space between "payment_id" and "my_payment_column" SELECT payment_id my_payment_column FROM payment;

when using GROUP BY and ORDER BY in the same query, which must go first?

GROUP BY (and then ORDER BY goes last) **therefore, WHERE goes before ORDER BY, but WHERE also goes before GROUP BY

what does the GROUP BY clause do? what is it usually associated with?

GROUP BY divides rows returned by the select statement into groups aggregate functions: for each group, you can apply an aggregate function like: 1.) calculating the sum of items 2.) counting the number of items in the groups **GROUP BY also acts as a DISTINCT function because it doesn't count duplicates. if you don't use GROUP BY, you'd get everything including duplicates **aggregate functions are not allowed in the GROUP BY clause **when you have a aggregate, there must be a GROUP BY clause to complete the query.

what's the difference between the WHERE clause and the HAVING clause?

WHERE: sets the condition for individual rows BEFORE GROUP BY clause applies. (sets condition before calling the GROUP BY - basically sets conditions for the SELECT statement in general. ) HAVING: sets the condition for group rows created by the GROUP BY clause AFTER the GROUP BY clause applies (sets condition after calling the GROUP BY - which basically means you can create a condition for aggregates statements in the SELECT clause (that are usually associated with GROUP BY) like WHERE does for general statements.) ex.) if you had an aggregate like SUM(payments) , which means GROUP BY will be used, and you wanted the output to show the SUM(payments) > 300, you'd need to use a HAVING statement instead of WHERE. you can't create a condition for aggregates with WHERE statements. ***basically HAVING is for setting conditions for aggregate columns in SELECT statements and WHERE is for setting conditions for general columns in SELECT statements

what is a self join? when would you use a self join? what must you always do when doing a self join if you're using a self join, what could you also do that's also a valid way to solve the question and why would you preferably not use it?

a join where you join a table to itself when you want to combine rows with other rows in the same table you must use a table alias (use the AS clause) to let SQL distinguish the left table from the right table because you're using only one table with a self join - you'll be using a comma to separate the table's aliases *you're not limited to an inner join, you can do a left/right/left outer/etc join as well. ex.) SELECT a.first_name, a.last_name, b. first_name, b.last_name FROM customer AS a, customer AS b WHERE a.first_name = b.last_name a subquery, but a self join is more efficient **the INNER JOIN or any type of JOIN clause isn't stated anywhere because it's a self join, whereas you'd have to state the type of join in any other non-self-join query.

what is the difference between a normal JOIN and a self join?

a normal JOIN will use the INNER JOIN/LEFT JOIN/etc. clause while a self join doesn't need those clauses - it just continues onto the WHERE clause. a self join will always need two aliases for the same table in the FROM clause in order to work. *you can do use a subquery where the inner query (the subquery) will use the same table as the outer query, but you might as well use a self join to avoid use the JOIN clause, which would require more work for sql.

what is the syntax for the following question in two queries and in one subquery?: given: 1.) film_id is only in the inventory table 2.) rental table contains columns: rental_id, rental_date, inventory_id, customer_id, return_date, staff_id, last_update 3.) inventory table contains columns: inventory_id, film_id, store_id, last_update "what are the titles of the films where their return date is between may 29, 2005 and may 30, 2005?"

two queries: SELECT inventory.film_id FROM rental INNER JOIN inventory ON inventory.inventory_id = rental.inventory_id WHERE return_date BETWEEN '2005-05-29' AND '2005-05-30'; SELECT film_id, title FROM film WHERE film_id IN (a long list of film_ids you'd have to get in pgadmin/sql/etc.) subquery: SELECT film_id, title FROM film WHERE film_id IN (SELECT inventory.film_id FROM rental INNER JOIN inventory ON inventory.inventory_id = rental.inventory_id WHERE return_date BETWEEN '2005-05-29' AND '2005-05-30';

what operator makes all the cells in a selected column upper/lower case?

upper: makes the characters in a column all upper cased lower: makes the characters in a column all lower cased ex.) SELECT upper(first_name) FROM customer; result: JARED AMANDA LORY SELECT lower(lower_name) FROM customer; result: jared amanda lory

how can you use mathematical functions in the SELECT statement?

when two columns have the same data type, like integer, you could just add "+", "-", "/", or "*" to add, subtract, divide, or multiply them by. ex.) SELECT payment_id+customer_id FROM payment; where: payment_id = 1 and customer_id = 5, output would result in a number (6).

when would you use a self join generally?

when you want the output of a query to pull something related to something in the same table ex.) you want the list of customers whose first name are the same as the last name of customers in the same table jenny rose rose smith

site for various string operators 88

www.postgresql.org/docs/9.5/static/functions-string/html google: manager employee self join

in what order do days, months, and years go in timestamps?

year-month-day ex.) march 29, 2005" 2005-05-29

when using aliases for aggregates, can you interchange the name of the column in general syntax?

yes

can you add more columns in the SELECT statement along with an extract function?

yes, this means you can also use aggregates along with extract (separated by a column, of course)

what can you use to specify a condition in the WHERE clause when the condition is for timestamps?

you must either use the operators (>, <, >=, <=) or the BETWEEN/NOT BETWEEN keyword when using operators you MUST restate the column/timestamps for each condition in the WHERE clause ex.) do this: SELECT * FROM bookings WHERE starttime >= '2012-09-01' AND starttime < '2012-10-01'; DON'T DO THIS: SELECT * FROM bookings WHERE starttime >= '2012-09-01' AND < '2012-10-01';

if there are two schemas in a database, what must you do when using the SELECT clause to return output from a table?

you must specify the schema in the FROM clause ex.) schema.table_name SELECT * FROM cd.members or SELECT * FROM cd.bookings or SELECT * FROM cd.facilities given there are three tables named members, bookings, and facilities

when doing a self join, what must happen?

you must use an alias for the table


Ensembles d'études connexes

Chapter 2 Introduction to Java Applications Q1

View Set

Nutrition Final study guide - Chittom @ ICC

View Set

Florida statutes, rules, and regulations common to all lines

View Set

Property and Casualty insurance (General insurance)

View Set

ACIS 2504 Python Unit 1 11/3/2020 & 11/12/2020

View Set

Modern Database Management - Chapter 4

View Set

General Insurance Quiz questions

View Set

Alabama (Department Of Insurance) Life and Health Insurance

View Set