SQL Distilled

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

A1 GOTCHAS -not putting all vars in GROUP BY

-not prefacing with table name if necessary when doing a subquery join, link up the cols in the WHERE clause when subquery join, if selecting a col, the table it's from needs to be listed in the FROM statement

aggregate function returns a single value, so you can use it as part of a comparison predicate in a search condition.

...WHERE ContractPrice >= (SELECT AVG(ContractPrice) FROM Engagements) You have to place the aggregate function within a subquery, however, and then use the subquery as part of the comparison predicate.

copy from one table to another

CREATE table mytable SELECT (*) from oldtable

GROUP BY Use only when calculating aggregates by group. Placement after WHERE and before ORDER BY

EVERY SELECTED COL NAME MUST BE IN THE GROUP BY STATEMENT. If used without aggregate, it acts as a DISTINCT statement

COUNT when want to include ZERO as a valid result. Don't using HAVING. Use a WHERE referencing the outside select.

Example is count(staffid) ...where staff_type ='Professor' AND categorydescription = 'Biology'. Put the SELECT ...WHERE clause inside a SELECT and generalize to... AND categorydescription = outside.categorydescription. That will make it for each category, creating the same effect as HAVING or GROUP

GROUP BY You can only use aggregates for comparison in the HAVING clause:

GROUP BY ... HAVING SUM(cash) > 500 can use in a SELECT clause

HAVING for 'fewer than' cannot show entries with 0, need to use subquery technique in the where statement and the select

HAVING won't show 0 in count() select concat(stffirstname, ' ', stflastname), (select count(classid) from faculty_classes where faculty_classes.staffid = staff.staffid) from staff where (select count(classid) from faculty_classes f2 where staff.staffid = f2.staffid) < 3

JOIN NATURAL JOIN is structured in such a way that, columns with the same name of associate tables will appear once only.

Natural Join: Guidelines: The associated tables have one or more pairs of identically named columns. The columns must be the same data type. Don't use ON clause in a NATURAL JOIN.

CASE expression SELECT CASE case_value WHEN when_value THEN statement_list [WHEN when_value THEN statement_list] ... [ELSE statement_list] END from table ;

Or: SELECT CASE WHEN search_condition THEN statement_list [WHEN search_condition THEN statement_list] ... [ELSE statement_list] END as newcolname from table; shows how to add an alias for the col if THEN is a string, make sure to use quotes eg case when value then "HIGH"

nth highest record To select the nth highest record, you need to perform the following steps:

SELECT * FROM (SELECT * FROM table_name ORDER BY column_name ASC LIMIT N) AS tbl ORDER BY column_name DESC LIMIT 1; First you get the n highest records and sort them in ascending order. The nth highest record is the last record in the result set. Then you sort the result set in descending order and get the first one. maybe needs to be n distinct highest records

Delete only the deadline rows: DELETE `deadline` FROM `deadline` LEFT JOIN `job` .... ` ....

SELECT * FROM `deadline` LEFT JOIN `job` ON deadline.job_id = job.job_id WHERE `status` = 'szamlazva' OR `status` = 'szamlazhato' OR `status` = 'fizetve' OR `status` = 'szallitva' OR `status` = 'storno' Delete the deadline and job rows: DELETE `deadline`, `job` FROM `deadline` LEFT JOIN `job' Delete only the job rows: DELETE `job` FROM `deadline` LEFT JOIN `job` ..

SUBQUERY CORRELATED A correlated subquery is a subquery that contains a reference to a table that also appears in the outer query. For example:

SELECT * FROM t1 WHERE column1 = ANY (SELECT column1 FROM t2 WHERE t2.column2 = t1.column2);

GROUP BY you can use a GROUP BY clause and not include any aggregate functions in your SELECT clause? Sure you can! When you do this, you get the same effect as using the DISTINCT keyword

SELECT Customers.CustCityName FROM Customers GROUP BY Customers.CustCityName But DISTINCT is preferred takes out the duplicates

BOTH, shop has both hammer and thermometer

SELECT DISTINCT s.shopname FROM shops s JOIN sale s1 ON s.shopcode = s1.shopcode AND s1.product='hammer' JOIN sale s2 ON s.shopcode = s2.shopcode AND s2.product='thermometer';

median

SELECT M.MEDIAN_COL FROM MEDIAN_TABLE M WHERE (SELECT COUNT(MEDIAN_COL) FROM MEDIAN_TABLE WHERE MEDIAN_COL < M.MEDIAN_COL ) = (SELECT COUNT(MEDIAN_COL) FROM MEDIAN_TABLE WHERE MEDIAN_COL > M.MEDIAN_COL );

INSERT if it is one or more rows use INSERT INTO table (cols) VALUES. if from another table use INSERT INTO .. SELECT

SELECT from another table INSERT INTO tbl_name (a,b,c) VALUES (1,2,3), (4,5,6) if all cols don't need to repeat in INSERT but if partial do have to give name of col

TIME find overlapping date/time rows in a table

SELECT m1.meetingID, m1.meetingStart, m1.meetingEnd, m2.meetingID FROM t_meeting m1, t_meeting m2 WHERE (m2.meetingStart BETWEEN m1.meetingStart AND m1.meetingEnd OR m2.meetingEnd BETWEEN m1.meetingStart AND m1.meetingEnd) AND m2.meetingID > m1.meetingID ie they aren't the same meeting

aggregate find max per group, including all that are max

SELECT o.* FROM `Persons` o # 'o' from 'oldest person in group' LEFT JOIN `Persons` b # 'b' from 'bigger age' ON o.Group = b.Group AND o.Age < b.Age WHERE b.Age is NULL # bigger age not found

error code 1452: cannot add or update a child row: a foreign key constraint fails A simple hack can be to disable foreign key checks before performing any operation on the table. Simply query

SET FOREIGN_KEY_CHECKS=0 This will disable foreign key matching against any other tables. After you are done with the table enable it again

Alias You can only use column aliases in GROUP BY if defined in FROM or WHERE, ORDER BY if defined in SELECT, FROM or WHERE, or HAVING if defined in FROM or WHERE clauses, in WHERE If defined in FROM.

SQL is implemented as if a query was executed in the following order: FROM clause WHERE clause GROUP BY clause HAVING clause SELECT clause ORDER BY clause

DATETIME type is used when you need values that contain both date and time information. MySQL retrieves and displays DATETIME values in 'YYYY-MM-DD HH:MM:SS' format. The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'.

The TIMESTAMP data type has a range of '1970-01-01 00:00:01' UTC to '2038-01-09 03:14:07' UTC. It has varying properties, depending on the MySQL version and the SQL mode the server is running in.

JOIN use WHERE or ON The ON clause defines the relationship between the tables.

The WHERE clause describes which rows you are interested in. Often you can swap which way you do it and still get the same result, however this is not always the case with a left outer join. If the ON clause fails you still get a row with columns from the left table but with nulls in the columns from the right table. If the WHERE clause fails you won't get that row at all.

JOIN subqueries and joins. In some cases, the subquery and the join might return different results. This occurs when table2 contains multiple instances of column2a.

The subquery form produces only one instance of each column2a value, but the join produces them all and its output includes duplicate rows. To suppress these duplicates, begin the join with SELECT DISTINCT rather than SELECT.

subquery Note that you should not include an alias when you write a subquery in a conditional statement.

This is because the subquery is treated as an individual value (or set of values in the IN case) rather than as a table. Conditional statement is WHERE, JOIN/ON, and CASE

HAVING Any reference to a column in a predicate within the search condition of a HAVING clause either must name a column listed in the GROUP BY clause or must be enclosed within an aggregate function.

This makes sense because any column comparisons must use something generated from the grouped rows—either a grouping value or an aggregate calculation across rows in each group. In standard SQL, a query cannot refer to non-aggregated columns in the HAVING clause that are not named in the GROUP BY clause.

NOT IN a gotcha SELECT a.* FROM a WHERE a.column NOT IN (SELECT column FROM b) This query will never return anything if there is but a single NULL in b.column. As with other predicates, both IN and NOT IN against NULL evaluate to NULL.

This should be rewritten using a NOT EXISTS: SELECT a.* FROM a WHERE NOT EXISTS ( SELECT NULL FROM b WHERE b.column = a.column )

UPDATE When building an UPDATE statement, you can put only the target table after the UPDATE key- word, so you must take the other tables and put them in a subquery that returns the column that you can link back to the target table.

UPDATE Classes SET StartTime = '14:00:00' WHERE ClassID IN (SELECT ClassID FROM Faculty_Classes INNER JOIN Staff ON Faculty_Classes.StaffID = Staff.StaffID WHERE StfFirstName = 'Kathryn' AND StfLastName = 'Patterson')

RAND()

Use the built-in RAND function in conjunction with LIMIT and ORDER BY: 1 select ename,job 2 from emp 3 order by rand() limit 5

HAVING group level filtering. Like WHERE statement but for a group rather than individual. same commands as WHERE.

WHERE filters before grouping, HAVING filters after grouping. HAVING replaces "WHERE",. only use it with GROUP BY. The GROUP BY statement comes before the HAVING statement.

UNION Both tables must have the same number of columns The columns must have the same data types in the same order as the first table

While the column names don't necessarily have to be the same, you will find that they typically are. This is because most of the instances in which you'd want to use UNION involve stitching together different parts of the same dataset.

CAST() - these are the allowed types

binary, char, datetime, date, time, decimal, unsigned, signed

AGGREGATE can't reference result of an aggregate in a WHERE clause for same query several ways to handle. no aggregate in WHERE explanation: The sum() works on a set of values specified in the argument. Here 'salary' column is passed , so it must add up all the values of this column. But inside where clause, the records are checked one by one , like first record 1 is checked for the test and so on. Thus sum(salary) will not be computed as it needs access to all the column values and then only it will return a value.

canonical solution SELECT t.firstName , t.Lastname , t.id FROM mytable t JOIN ( SELECT MAX(mx.id) AS max_id FROM mytable mx) m ON m.max_id = t.id (NOTE: select vars from first table , join with max(var) as maxvar from copy of table, on maxvar = var 2nd approach: WHERE t.id = (SELECT MAX(id) FROM ... ) 3rd approach just for max or min: SELECT t.firstName , t.Lastname, t.id FROM mytable t ORDER BY t.id DESC LIMIT 1

NULL transform NULLs into values. use coalesce

coalesce(comm, 0) returns values of comm that exist, if NULL, returns a 0.

subquery can be used as alternative to GROUP BY uses correlated subqueries when there is an aggregate

correlation is within in the subquery

SELECT * INTO <table> from <table> SELECT (col list) INTO

creates a new table from existing tables, exports data

CAST(), CONVERT() convert(expr, type), cast(expr as type)

data manipulation

DATE functions adddate( date, INTERVAL value unit) datediff(firstdate, seconddate) curdate(), cur(time), current_timestamp()

date_add('2016-03-01', INTERVAL 3 days) adddate() and date_add() are synonymous when used with INTERVAL

STRING manipulation

example: trim(both '()' from string), ltrim/rtrim(string, number of characters), length(str), position('A' in string), strpos(string, 'A'), substr(string, starting position, number of chars), concat() or ||, upper(), lower()

LIMIT {[offset,] row_count | row_count OFFSET offset}

how to find the nth, order then limit 1 then offset.. offset x, means start with the x+1

round(value, number)

number is number of decimals or if negative the number of places, e.g., -3 is rounding to nearest thousand

COALESCE To substitute the NULL value in the result set, you can use the COALESCE function as follows: SELECT customerName, city, COALESCE(state, 'N/A'), country FROM customers;

replaces NULLs with 'N/A' in this example IFNULL() is for just 2 values, COALESCE works for more

IFNULL(expr1, expr2) if expr1 is not NULL return value of expr1. If expr1 is null, return expr2

return null

join equi-join SELECT t1.*, t2.* FROM t1, t2 WHERE t1.i1 = t2.i2

same as inner join select * from t1 join t2 on t1.i1 = t2.i2

nth how to choose the nth highest record . The first has 0 bigger than all, means it is the biggest, the first, n=1 easier to do this with limits and desc

select (*) from tbl_name t1 where n-1 = (select count(primary_key_col) from tbl_name t2 where t2.col_name > t1.col_name); NOTE: n-1 of col_name are bigger than all the rest in the col_name or select * from tbl t1 where n = (select count(distinct tbl2.col) from tbl2 where tbl2.col <= t1.col) at which row are n <= to other values in the table, ie values in the other table

JOIN FULL OUTER UNION ALL with exclusion join the second join excludes except where a.price is null. This gets rid of duplicates

select * from apples a left outer join oranges as o on a.price = o.price union all select * from apples a right outer join oranges as o on a.price = o.price where a.price is null;

dataset

select * from information_schema.columns where table_name ='winemag_p2'

AGGREGATE: An aggregate may not appear in the WHERE clause unless it is in a subquery, contained in a HAVING clause, or a select list, and the column being aggregated is an outer reference. Because WHERE works on row, aggregate needs all the members of a col

select * from staff where salary > (select avg(salary) from staff); select deptid, COUNT(*) as TotalCount from staff group by deptid having count(*) > 2 Use a subselect: e.g. SELECT row FROM table WHERE id=( SELECT max(id) FROM table)

nth find the nth

select * from tbl order by var [desc] limit n-1, 1

SUBQUERY use subquery for generating a column

select a, b, (select c from xyz where..) from qtz where

COUNT or SUM with CASE COUNT counts any non-null value if using case .. then 1 else..0, use SUM() or use NULL instead of 0 if want to use COUNT()

select state, 100.0 * count(case when elevation >= 2000 then 1 else null end)/count(*) as percentage_high_elevation_airport from airports group by state;

user defined variable For SET, either = or := can be used as the assignment operator.

set @i = 2; select @i = @i + 1 ... mysql> SET @t1=1, @t2=2, @t3:=4; mysql> SELECT @t1, @t2, @t3, @t4 := @t1+@t2+@t3; For SET, either = or := can be used as the assignment operator. You can also assign a value to a user variable in statements other than SET. In this case, the assignment operator must be := and not = because the latter is treated as the comparison operator = in non-SET statements: User variables can be assigned a value from a limited set of data types: integer, decimal, floating-point, binary or nonbinary string, or NULL value.

GROUP BY must refer to columns created by the FROM and WHERE clauses. It cannot use an expression you create in your SELECT clause, but can use a col from select

this explains why can't do an aggregate in where- it is creating an expression

GROUP BY another way to do it with correlated subquery:

this will allow zeros in the count if there aren't any in faculty_classes, for example NOTE: use subquery where variable from inside subq = var from outside select - this puts it in a group select staffid, count(classid) from faculty_classes join staff on staff.staffid =faculty_classes.staffid group by staffid select staffid, (select count(classid) from faculty_classes where faculty_classes.staffid = staff.staffid) from staff

UNION UNION ALL

to select rows from different tables, union does not have duplicates but union all does

DELETE from table where ... limit ..

use to get rid of one but not both duplicates

SUBQUERY predicates

where value expression (<>, <,>,=) some,any, all(select...)


Ensembles d'études connexes

Quiz 2 (Nazi Berlin & Divided Berlin)

View Set

Autism spectrum disorder (ASD) (Sherpath)

View Set

Life-Limited to the Payment of Funeral and Burial Expense

View Set

ECO-2050 HW Assignment 12, Chapter 11

View Set

A35 CH8: QUIZ- EMERGENCY CARE, FIRST AID, & DISASTERS - MELBENZ

View Set

Chapter 5: Small Business Entry: Paths to Part-Time Entrepreneurship

View Set

Cloud & Storage Concepts Lesson 5 Quiz

View Set