SQL Part 3

Lakukan tugas rumah & ujian kamu dengan baik sekarang menggunakan Quizwiz!

NULL AND NOT NULL = (AND as in logical AND)

NULL

what is meant by "the WHERE clause is value to value"

where clauses are value to value - ie you call where on strings/ints/refs to columns, or any expression or function call that produces a single value - like a select query if it returns > 1 value, you get a syntax error

(Using elections) How successful (in % of votes cast) was the Conservative party in each ward?

# inner join!! SELECT Ward.name, SUM(votes) * 100 / t.totalVotes AS percent FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id INNER JOIN Party ON Candidate.party = Party.id INNER JOIN ( SELECT ward, SUM(Candidate.votes) as totalVotes FROM Candidate GROUP BY ward ) AS t ON t.ward = Candidate.ward WHERE Party.name = 'Conservative' GROUP BY Ward.name;

what kind of valued logic does SQL use

3 valued logic (0, 1 and NULL) ie false, true, NULL - NULL means there is a value but im not tell you about it

In a GROUP BY query, one row appears for each block. So every column must be one of the following:

A statistic - computed once per block anyway A column in the GROUP BY clause (these are automatically unique per block) A constant - these are repeated for each block, ie select(1)

what is the SQL statistical function that counts how many rows there are in a table (ie a table called Student)

COUNT SELECT COUNT(1) FROM Student;

say we want a list of students and average grades, but just want students who have an average > 50 we use this command: SELECT student, AVG(grade) AS average FROM Enrol GROUP BY student WHERE average > 50.0; but what's wrong with it what's the fix

Can't write WHERE average > 50.0; As the DB uses the where clause before the GROUP BY clause to determine the rows you're grouping on in the first place will complain the average is not a basic column of your table Solution: replace WHERE with HAVING: SELECT student, AVG(grade) AS average FROM Enrol GROUP BY student HAVING average > 50.0;

what is the syntax for the delete command what is good practice when using this command

DELETE FROM <table> [WHERE <condition>]; if you just do delete from table will delete all data WHERE row match the condition clauses - would be sensible to check the rows you want to delete with a SELECT & WHERE QUERY before deleting

why will this command fail? SELECT Student.id, Student.name, AVG(grade) FROM Enrol INNER JOIN Student ON Student.id = Enrol.student GROUP BY Student.id; how to fix it? what will this fix do?

GROUP BY, so statistics and GROUPBY rules applied: avg grade allowed Student id allowed student name not allowed as its not a grouped by / statistic / constant solution: Add to group by clause, i.e., GROUP BY Student.id, Student.name will do: this says gives me one block per id and per name - note, in this example, id implies name so it doesn't change the grouping of the blocks

What is the IN clause Where do you use it What is the main rule when using IN (or NOT IN) How do you use it in the following example: Which units are taught by lecturers in the hardware research group? Table is Lecturer {id, name, rgroup}

IN clauses takes a list of values It is used in subselects - ie when subselect produces a table with a single column and many rows, can select specific items from that list The rule is with subselect, subselect needs to produce a table with a single column (can be many rows) So for the example: - find titles of all units whose director is in the following list - the list is: id's of all lecturers who are in the hardware research group so: SELECT title FROM Unit WHERE director IN ( SELECT id FROM Lecturer INNER JOIN RGroup ON RGroup.id = Lecturer.rgroup WHERE RGroup.name = 'Hardware' );

What is the syntax for the Insert command

INSERT INTO <table> (<col>, [,<col>...]) VALUES (<value> [,<value>...]) [, (<value> [,<value>...])...] ;

insert this into a table called Person, which has a column called name: John O' Donnell

INSERT INTO Person(name) VALUE ('John O\'Donnell'); if you have a single quote in a string, do a \ before it - backslashes escape single quotes, backslashes, newlines etc in single-quoted strings

insert the following rows into the volunteer table, which has the columns id and name Fred (has id 1) Sarah (has id 2)

INSERT INTO Volunteer (id, name) VALUES (1, 'Fred'), (2, 'Sarah') ;

what are the INTERCEPT and EXCEPT set commands

INTERSECT "do both these queries and only return rows that appear identically in both of them" EXCEPT "do first query, throw out all the rows that appear in second query and just return the remainder"

if something in your column could be null, what should we always consider using

If something could be NULL, always consider the IS NULL case If doing where clause on a column where null is an allowed value, then always consider the case there may be a null there and write clause accordingly if u want to find all the rows

What is the EXISTS clause Where do you use it How does it work? What is another term for the way it is used? How would you use it for the following example: Which lecturers teach at least one unit? Table is Lecturer {id, name, rgroup}

It is used in subselects, so you can say WHERE EXISTS (...) and the list that follows is the subselect This is called a correlated subselect - idea is that within the subselect there is a reference to a thing in the outer table - ie if you're selecting from a particular table, WHERE EXists (...), there are values in the subselect that use the reference to that particular table (like Lecturer.id referencing Lecturer in the example below) What happens is DB goes through the referenced table, for each item selected from the table, as the subselect refers to values of the outer tabled, it will fire off a new subquery The solution to the example is: SELECT name FROM Lecturer WHERE EXISTS ( SELECT id FROM Unit WHERE Unit.director = Lecturer.id ); - so for each lecturer it selects id from unit where the unit director matches the rerferenced table.id - then it returns a list of units

what are the 2 orders you need to know in SQL + what happens if don't follow them Write them both out (7 steps in first, 8 in second)

Query Order - the order in which you write SQL queries - if keywords not in this order get syntax error Evaluation Order - the order in which you evaluate SQL queries Query Order: SELECT <columns> FROM <tables> <type> JOIN <joins> WHERE <conditions> GROUP BY <groups> HAVING <conditions> ORDER BY <orders>; Evaluation Order: 1. Load FROM tables -- load tables in from clause, if > 1 do a cross join 2. Process JOINs -- join into big table 3. Filter rows with WHERE -- where clause to filter, doesn't change cols but kills rows that don't match clause 4. GROUP BY and aggregate -- goes over remaining table, split into blocks for the group by, then apply statistics -- once per block (if don't have Group BY, but do have statistical function, this is the -- point where DB applies statistical function to WHOLE talbe!) 5. Apply aliases (SELECT) -- comes after to give DB freedom on how to organise things 6. Filter again with HAVING -- some DBs let you put anything you want in here, might complain if you could've done it -- with WHERE clause 7. Sort with ORDER BY -- sort out rows 8. Filter columns with SELECT -- SELECT comes at the very end

SQL implements 3 valued logic imagine we have a table where the column names are 0, 1 and NULL, and the row names are 0, 1 and NULL. Each cell represents the result of an AND operation between the row and column i.e., the top left cell represents the output if we did 0 AND 0 (ie a logical expression) fill in the values for this table

Reasons for some of the cells: Top row, right col: 0 and something is always 0 Bottom row, left col: 0 and something is always 0 Bottom row, middle col: 1 and something depends on the something, therefore cannot be evaluated by SQL so puts null there Bottom row, right col: NULL and NULL is NULL Middle row, right col: 1 and something depends on the something, therefore cannot be evaluated by SQL so puts null there

what is the requirement for set operations - like : UNION [ALL] INTERSECT EXCEPT

Requirement: all queries return the same number of columns. Column names don't matter. - Can't merge tables with different number of columns! - Tables can have different number of rows and different names for columns and data type - but the number of columns must be the same in all tables!

(Using elections) Which rank did Labour end up in the 'Whitchurch Park' ward? Your statement should produce a table with a single row and column containing the answer as a number. You can assume no ties.

SELECT 1 + COUNT(1) AS 'Labour Rank' FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id INNER JOIN Party ON Candidate.party = Party.id WHERE Ward.name = 'Whitchurch Park' AND Candidate.votes > ( SELECT votes FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id INNER JOIN Party ON Candidate.party = Party.id WHERE Ward.name = 'Whitchurch Park' AND Party.name = 'Labour' ) ORDER BY votes DESC;

(Using census) Find the average size of a ward's working population in the London (E12000007) region. --- The same as the last question but now for every region - your query should produce a table with one row per region. The intention here is not to repeat the above query 9 times.

SELECT AVG(Statistic.data) AS 'Average working population' FROM Statistic INNER JOIN Ward on Ward.code = Statistic.wardId INNER JOIN County on County.code = Ward.parent WHERE County.parent = 'E12000007' GROUP BY Ward.code; --- SELECT Region.name, AVG(Statistic.data) AS 'Average working population' FROM Statistic INNER JOIN Ward on Ward.code = Statistic.wardId INNER JOIN County on County.code = Ward.parent INNER JOIN Region on Region.code = County.parent GROUP BY Region.code;

how to count the number of distinct rows in a column - ie count the number of distinct names (where name is a column) in the table Student

SELECT COUNT(DISTINCT name) FROM Student;

SQL command to count the number of rows in the column "grade" in the table "Student" what rows will this not count?

SELECT COUNT(grade) FROM Student; rows in which grade is not null

(Using elections) List the names, parties and percentage of votes obtained for all candidates in the Southville ward. Order the candidates by percentage of votes obtained descending.

SELECT Candidate.name AS candidate, Party.name AS party, Candidate.votes AS votes, 100 * Candidate.votes/( SELECT SUM(votes) AS total_votes FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id WHERE Ward.name = 'Southville') AS percent FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id INNER JOIN Party ON Candidate.party = Party.id WHERE Ward.name = 'Southville' ORDER BY percent DESC;

(Using census) For all CLUs in the London (E12000007) region, produce a table with three columns called CLU, occupation and count such that: CLU is the CLU name. count is the number of people of the occupation class in question in the given CLU. occupation is the name of the occupation class. Only rows with count >= 10000 appear in the table. The table is sorted by count ascending.

SELECT County.code as CLU, Occupation.name as Occupation, SUM(Statistic.data) as count FROM Statistic INNER JOIN Occupation on Statistic.occId = Occupation.id INNER JOIN Ward ON Statistic.wardId = Ward.code INNER JOIN County ON Ward.parent = County.code WHERE County.parent = 'E12000007' GROUP BY CLU, Occupation HAVING count >= 10000 ORDER BY count ASC;

(Using census) In the Cabot ward (E05001979), produce a table listing the names of the 9 occupation classes and the number of people in each of the classes in this ward.

SELECT Occupation.name AS occupation, SUM(data) AS number_people FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId INNER JOIN Occupation ON Occupation.id = Statistic.occId WHERE Ward.code = 'E05001979' GROUP BY Occupation.name;

(Using census) Create a table with three columns occupation, women and men and one row per occupation class. The occupation column should list the occupation class names. The women and men columns in each row should list the total number of women resp. men in the row's occupation class in the whole dataset. The intention here is not to have to copy-paste a subquery 9 times.

SELECT Occupation.name as Occupation, women.sum as Women, men.sum as Men FROM Occupation INNER JOIN (SELECT occId, SUM(data) as sum FROM Statistic WHERE gender = 1 GROUP BY occId) as women ON women.occId = Occupation.id INNER JOIN (SELECT occId, SUM(data) as sum FROM Statistic WHERE gender = 0 GROUP BY occId) as men ON men.occId = Occupation.id;

(Using elections) What is the total number of votes that each party got in the elections? Your result should be a table with two columns party, votes.

SELECT Party.name AS party, SUM(votes) AS tot_votes FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id INNER JOIN Party ON Candidate.party = Party.id GROUP BY party;

(Using census) Produce a table that lists, for each of the 9 regions of England, the percentage of people in managerial (class 1) occupations who are women. --- The same as question above, but now with a 10th row in the table listing the value for all of England. You can use the string 'England' for the region column.

SELECT Region.name, 100 * SUM(Statistic.data) / t.Pop AS 'Percentage of people in class one' FROM Statistic INNER JOIN Ward on Ward.code = Statistic.wardId INNER JOIN County on County.code = Ward.parent INNER JOIN Region on Region.code = County.parent INNER JOIN Occupation ON Statistic.occId = Occupation.id INNER JOIN (SELECT SUM(Statistic.data) as Pop, County.parent as RegionId FROM Statistic INNER JOIN Ward on Ward.code = Statistic.wardId INNER JOIN County on County.code = Ward.parent GROUP BY County.parent ) AS t ON t.RegionId = County.parent WHERE Occupation.name = 'Managers, directors and senior officials' AND Statistic.gender = 1 GROUP BY County.parent; --- ( SELECT Region.name, 100 * SUM(Statistic.data) / t.Pop AS 'Percentage of people in class one' FROM Statistic INNER JOIN Ward on Ward.code = Statistic.wardId INNER JOIN County on County.code = Ward.parent INNER JOIN Region on Region.code = County.parent INNER JOIN Occupation ON Statistic.occId = Occupation.id INNER JOIN ( SELECT SUM(Statistic.data) as Pop, County.parent as RegionId FROM Statistic INNER JOIN Ward on Ward.code = Statistic.wardId INNER JOIN County on County.code = Ward.parent GROUP BY County.parent ) AS t ON t.RegionId = County.parent WHERE Occupation.name = 'Managers, directors and senior officials' AND Statistic.gender = 1 GROUP BY County.parent ) UNION ALL ( SELECT 'England' as 'Region', SUM(Statistic.data) / t.Pop as 'Percentage of people in class one' FROM (SELECT 'England' as e, SUM(Statistic.data) as Pop FROM Statistic) as t, Statistic INNER JOIN Occupation ON occId = Occupation.id WHERE Occupation.name = 'Managers, directors and senior officials' AND Statistic.gender = 1 );

(Using census) The same as the last question, but now produce a table with two rows, one for the smallest and one for the largest ward. There's no quicker way than repeating the last query twice, the question is how to stick the two "copies" together. Last Question here refers to "Find the working population, ward name and CLU name for the smallest ward (by working population) in the 2011 census."

SELECT SUM(Statistic.data) AS population, Ward.name, Ward.parent FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId GROUP BY Ward.name HAVING population <= ALL ( SELECT SUM(Statistic.data) AS population FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId GROUP BY Ward.name ) UNION ALL SELECT SUM(Statistic.data) AS population, Ward.name, Ward.parent FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId GROUP BY Ward.name HAVING population >= ALL ( SELECT SUM(Statistic.data) AS population FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId GROUP BY Ward.name );

(Using census) Find the working population, ward name and CLU name for the smallest ward (by working population) in the 2011 census.

SELECT SUM(Statistic.data) AS population, Ward.name, Ward.parent FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId GROUP BY Ward.name HAVING population <= ALL ( SELECT SUM(Statistic.data) AS population FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId GROUP BY Ward.name );

assume we have the tables with relationships shown in diagram on the answer side how would you find the weighted average of student with id = 1234567 the column name should be called "average"

SELECT SUM(cp * E.grade) / SUM(cp) AS average FROM Enrol E INNER JOIN Student ON E.Student = Student.id INNER JOIN Unit ON E.Unit = Unit.id WHERE Student.id = 1234567; each time you mention a statistical function, i.e., sum, it goes over the whole table and computes the value

(Using census) How many people work in sales and customer service occupations and live in the Cabot ward of Bristol (E05001979)?

SELECT SUM(data) FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId INNER JOIN County ON Ward.parent = County.code INNER JOIN Occupation ON Occupation.id = Statistic.occId WHERE Ward.name = 'Cabot' AND Occupation.name = 'Sales and customer service occupations';

(Using census) How many women work in sales and customer service occupations and live in the Cabot ward of Bristol (E05001979)?

SELECT SUM(data) FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId INNER JOIN County ON Ward.parent = County.code INNER JOIN Occupation ON Occupation.id = Statistic.occId WHERE Ward.name = 'Cabot' AND gender = 1 AND Occupation.name = 'Sales and customer service occupations';

(Using census) How many people work in caring, leisure and other service occupations (occupation class 6) in all of the City of Bristol CLU (E06000023)?

SELECT SUM(data) FROM Statistic INNER JOIN Ward ON Ward.code = Statistic.wardId WHERE Statistic.occId = 6 AND Ward.parent = 'E06000023';

(Using elections) How many votes were cast in the 'Windmill Hill' ward and what percentage of the electorate in this ward does this represent? Your statement should produce a table with one row and two columns called 'votes' and 'percentage'.

SELECT SUM(votes) AS total_votes, SUM(votes)/electorate AS percent FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id WHERE Ward.name = 'Windmill Hill';

(Using elections) How many votes were cast in all of Bristol in the 2014 elections?

SELECT SUM(votes) FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id INNER JOIN Party ON Candidate.party = Party.id;

given these relationships, state the command you'd use to compute the average grade for each student - producing a table with the column "student" and a column "average"

SELECT Student, AVG(grade) AS average FROM Enrol GROUP BY Student; since we used group by, this allows us to combine a statistics query (avg) with a non-statistic (student) this computes the statistic once for each block!

(Using elections) Find all wards where the Green party beat Labour and create a table with two columns ward, difference where the difference column is the number of Green votes minus the number of Labour votes. Your table should be ordered by difference, with the highest one first.

SELECT Ward.name AS ward, (Candidate.votes - l.labour_votes) AS difference FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id INNER JOIN Party ON Candidate.party = Party.id INNER JOIN ( SELECT Ward.id AS ward, Candidate.votes AS labour_votes FROM Candidate INNER JOIN Ward ON Candidate.ward = Ward.id INNER JOIN Party ON Candidate.party = Party.id WHERE Party.name = 'Labour' ) AS l ON l.ward = Candidate.ward WHERE Party.name = 'Green' AND Candidate.votes > l.labour_votes GROUP BY Ward.name ORDER BY difference DESC;

how would you get the table on the right from this diagram

SELECT grade, COUNT(1) FROM Enrol WHERE Unit = ... GROUP BY grade ORDER BY grade;

Find all the lectures in the same research group as Peter. Table is: Lecturer {id, name, group} Single command, no self joins

SELECT name FROM Lecturer WHERE rgroup = ( SELECT rgroup FROM Lecturer WHERE name = 'Peter' ); - used a subselect in a where clause - ensured susbselect returns a single value (rgroup)

how would you implement the ALL command for the following scenario? let's say have a list of students with names and grades (where grade is overall grade) how to get list of student names with the top grade explain your command

SELECT name FROM Student WHERE grade >= ALL ( SELECT grade FROM Student ); takes this list, matches the grade against every entry of the list, and if all of them return true then it returns true, if at least one doesn't it returns false

say we have a table of names and genders, where genders are 1 for female, 0 for male and other values for other genders write a command to select the names and genders of a table called Person, and intead of 1 returning for female, have 'F', 'M' for male and 'X' for everything else

SELECT name, CASE gender WHEN 1 THEN 'F' WHEN 0 THEN 'M' ELSE 'X' END FROM Person; this whole CASE statement just selects a column (name) case statements return values based on the IF statements

we have a table Ward {id, name, women, men} Say we want the the ward names where the ratio of women to men is > 0.5, where ratio is the proportion of women. implement a command that outputs this in a table with 2 columns: name and ratio

SELECT name, women/(women+men) AS ratio, FROM Ward ... HAVING ratio > 0.5;

what is the best way to do a rank query? ie selecting the name and grade of students that achieved the highest mark?

SELECT name, grade FROM Student WHERE grade = (SELECT MAX(grade) FROM Student);

what is the older version of the WITH command what is the object from this clause what is the little difference between this clause and the WITH clause

SUBSELECT with FROM The object of a FROM clause can be anything that produces a table one little quirk with FROM, you have to give an alias in mariaDB even if you don't refer to it (hence 'AS t' below) an example: SELECT name, title FROM ( SELECT name, title, Enrol.grade FROM Student INNER JOIN Enrol ON Enrol.student = Student.id INNER JOIN Unit ON Enrol.unit = Unit.id ) AS t WHERE grade >= 40;

what do set operations do and what are the three types of set operation what is the syntax what is the requirement when using set operations

Set operations allow you to merge results of 2 queries into 1 by sticking the rows together in some way UNION [ALL]. (most common one) INTERSECT EXCEPT syntax: --------------------------------------------------- SELECT ... [ UNION [ALL] | INTERSECT | EXCEPT ] SELECT ... --------------------------------------------------- --> when you have completed a select, then instead of semi colon at tend can give one of the 4 keywords (union, union all, intersect, except) --> then follow it up with second select queery and more if you want Requirement: all queries return the same number of columns. Column names don't matter. - Can't merge tables with different number of columns! - Tables can have different number of rows and different names for columns and data type - but the number of columns must be the same in all tables!

What quotes do we use for strings in SQL

Single quotes Like 'John' or any other bits of data, or table/column names Double quotes - DB identifiers (Column names / table names) that you can't just type out because its a DB keyword (like "group" or "table) - names that contain spaces, like "John Smith"

when does insert fail when does update fail so what command should we use in these cases

Sometimes you just want to make a row exist whether or not its already there, insert fails if a key already exists - respectively a constraint is violated, basically one of the constraints on insert is if you want to insert a thing that pre-exists it will fail update doesn't create new rows if they do exist yet so we do upsert

what is the syntax for the UPDATE command update Sarah's email address (email) to [email protected] in a table called Volunteer {note, her id = 3}

UPDATE TABLE <table> SET <col>=<value> [,<col>=<value>...] [WHERE <condition>]; UPDATE TABLE Volunteer SET email='[email protected]' WHERE id=3;

The Question is: Which lecturers teach at least one unit? Table is Lecturer {id, name, rgroup} And the solution is: SELECT name FROM Lecturer WHERE EXISTS ( SELECT id FROM Unit WHERE Unit.director = Lecturer.id ); Explain the solution

Uses EXISTS, which is a correlated subselect. What it does is for each item in the outer table (Lecturer), it fires off a subqery, and inside the EXISTS command, there is a list of values, which references the outer table. EXISTS returns true, so for each item in Lecturer table, returns true if the list is not empty - ie it returns true for a lecturer if they direct a unit

what is the syntax for the where clause

WHERE <condition> <condition> ::= <value> <op> <value> | ... <value> ::= <literal> -- 'Peter', single quotes! | <column> -- name | <subselect> -- (SELECT ...) | <function> -- f(...), like upper, concat, substring etc

from these relationships (diagram showed on answer side), implement a WITH clause which finds all the students who have passed (where passing is defined as achieving a grade >= 40) Now implement it using the FROM clause

WITH Uni AS ( SELECT name, title, Enrol.grade FROM Student INNER JOIN Enrol ON Enrol.student = Student.id INNER JOIN Unit ON Enrol.Unit = Unit.id ) SELECT name, title FROM Uni WHERE grade >= 40; SELECT name, title FROM ( SELECT name, title, Enrol.grade FROM Student INNER JOIN Enrol ON Enrol.student = Student.id INNER JOIN Unit ON Enrol.unit = Unit.id ) AS t WHERE grade >= 40;

what escapes single quotes, backslases and new lines in single-quoted strings

a backslash

implement the query that gives answers this question: What rank was the student with id = 21 within their cohort? - table and an example output table is shown on the right - talk about the idea behind it

a way of formulating this is that your rank is equal to the number of students with grades strictly higher than yours, + 1 SELECT 1 + COUNT(1) FROM ( SELECT id FROM Student S INNER JOIN Student T ON S.cohort = T.cohort AND S.grade > T.grade WHERE T.id = 21 );

what range of things can we use for the 2 things we compare in a WHERE clause

any kind of thing that produces a value - literal, column, subselect, function

what is a statistical query what is the important condition that must hold for a statistical query - why is this

any query containing a statistical function, i.e., SELECT COUNT(1) FROM Student; In a statistics query, all columns must be statistics (or constants), you cannot mix normal columns and statistic columns: statistics return a column containing one row, - cant have 2 different columns of 2 different lengths in same table - as soon as you do statistics, then all your columns have to be columns that return one row!!

why can GROUP BY allow you to combine statistical / non statistical queries

as GROUP BY is a statistical query even if not doing statistics

if we're selecting students and their average grades from a table, why will we get an error how can we fix this how does it fix it

because combining statistical query (average) with non statistic (student) you should group by student group by queries are always statistics - even if not doing statistics

how does SQL evaluate (T/F) these expressions (used for comparing two rows in a where clause for example) NULL > x NULL = NULL NULL != NULL x > 1 OR x <= 1 {where x = NULL}

cant evaluate cant evaluate cant evaluate cant evaluate - the way SQL goes about this one is it looks at first part "x > 1", cant evaluate since its NULL > 1, then looks at next NULL <=1 so cant evaluate

what will this command do? SELECT Student.id, Student.name, AVG(grade) FROM Enrol INNER JOIN Student ON Student.id = Enrol.student GROUP BY Student.id;

fail - GROUP BY, so statistics and GROUPBY rules applied: avg grade allowed Student id allowed student name not allowed as its not a grouped by / statistic / constant solution: Add to group by clause, i.e., GROUP BY Student.id, Student.name

what will this command do? SELECT student, unit, MAX(grade) FROM Enrol GROUP BY student;

fail - can't use unit, not in group by, not a stat, not a constant

what will this command do? SELECT student, MAX(grade) FROM Enrol;

fail - not allowed to use students and a statistical function together as haven't used group by

how to make a statistic query return more than one row

group by - it'll return one statistical result per group

what's short circuit evaluation

ie look at these two expressions they get short circuited, meaning one of the sides doesnt matter 0 and something is always 0 1 or something is always 1

explain the choice to use HAVING in this command SELECT name, women/(women+men) AS ratio, FROM Ward ... HAVING ratio > 0.5;

if you used WHERE would complain as ratio is an alias and aliases are dealt with after where clause so use having instead

why should we not try to use multiple queries when you can use a single query - think about where your DB is actually situated

in standard setup your DB lives on a different machine situated somewhere else o nthe network each time u make a query you get the full overhead of making a network request, most of the time of a query will be spent setting up the network, transferring data over the network, firing up DB, parsing the query etc - if you do 2 queries youre almost twice as low as with one

what is the point of a CASE statement

it returns a value! - ie for each item in your query, it gets translated into another value in a case statement, which is returned to you

What does that EXIST query return

it returns true if the list produced is not empty

what will this command do SELECT name, COUNT(1) FROM Student;

it will fail! "In a statistics query, all columns must be statistics (or constants), you cannot mix normal columns and statistic columns:" as soon as you do statistics, then all your columns have to be columns that return one row!! name is not a statistic

what is this in SQL -- INSERT INTO TABLE

its a comment saying INSERT INTO TABLE

what is the WITH clause used for what is the syntax

its another way of doing subqueries WITH <alias> AS ( ... ) SELECT ...; the AS clause (inside brackets) is a clause that creates a table, then we can SELECT from it Can just build a big table using WITH then select from it

what is the HAVING command

like a second WHERE clause you can use when the first doesn't work HAVING is a second WHERE that is evaluated after statistics, aliases (i.e., if u had student.id AS s, wouldn't be able to use WHERE s = ...) and grouping same syntax as WHERE

what are the two meanings of null in a DB - give an example of each

missing value (someone didn't prvide their name, but they do have one) NULL: not applicable (car registration, for someone who doesn't have a car)

how many rows should a statistic query return

one!!

If you ever have it where you want a column to display that is not part of GROUP BY DB column, but there is a functional dependency on what you're already grouping by - doesn't change anything, just removes error

read again

how can we compare values in a normal column with a scalar/number/string, if we have to use a statistical function to find that scalar/number/string?

since we cannot combine statistical columns with normal columns in the same table, we should get the statistical column out of a subselect! When you have a select that creates a table with exactly one row and one column, can use the value in that cell as a scalar/number/string to match our normal column value

what does GROUP BY do

takes a column or list of columns... splits table into blocks where each block is a group of rows which are the same in the group columns

what is the SQL command to find non-null items

the IS clause - used to find all the NULLs in a table <condition> ::= ... | <item> IS [NOT] NULL

what are the ANY and ALL SQL equivalents of in mathemtatics

there exists for all

what does the UNION ALL set command do whats it useful for an example is " find all students that have passed their degree (grade >= 50). If doing masters then >= 50, but if UG then >= 40" - how would you implement this, if you had a table of username, cohort and grade columns, where cohort is either MASTERS or UNDERGRADUATE what is UNION (ie not UNION ALL)

this just means stick the rows of new query onto end of rows of original query useful for writing a table that shows you 2 different things SELECT username, cohort FROM Student WHERE cohort LIKE 'M%' AND grade >= 50 UNION ALL SELECT username, cohort FROM Student WHERE cohort NOT LIKE 'M%' AND grade >= 40; if worried about repeated rows then alternative (just Union), says select rows from both tables, filter out rows that appear in both of them, then stick remaining ones together UNION ALL: "total" rows

what is the alternative to DELETE which deletes a whole table - what is the syntax - what does this effectively tell the DB

truncate table TRUNCATE TABLE <table>; its a shortcut - you will delete DB to delete whole table, tells DB not to extra checks on individual rows and just deletes all rows

what is a good way to create your own "buckets" for histograms, where buckets are like ranges between values, and return a table showing the number of occurrences in that bucket

use CASE statement, like this: SELECT CASE WHEN grade >= 70 THEN 'First' WHEN grade >= 65 THEN '2:1' WHEN grade >= 60 THEN '2:2' ELSE 'Third' END AS g ...

what is a good SQL command to use if you wanted to show the average grade of students, and in the same table, a row for for the total grade of all students (ie 2 different thingsi n same table)

use UNION ALL

how to use NULL in a condition used in a WHERE clause

we cant! WHERE only returns rows where the condition evaluates to 1. NULL is not 1.

since the fact that WHERE clause is value to value, how should we use subselects in where clauses

we should ensure that our subselect returns a single value, like in this example SELECT name FROM Lecturer WHERE rgroup = ( SELECT rgroup FROM Lecturer WHERE name = 'Peter' );

what does this show you: SELECT unit, MIN(grade), MAX(grade), AVG(grade), COUNT(1) AS N, SUM (CASE WHEN grade >= 50 THEN 1 ELSE 0 END) AS pass FROM Enrol GROUP BY Unit;

well pass column is defined as the sum of students in case when grade >= 50 its basically showing how to use case statements in statistical queries

in a CASE statement, where is the break statement

when clause of each case statement is a condition that is evaluated as a Boolean and if that condition returns true then it takes the then clause and stops processing - no break statement here

when will you get a syntax error with a where clause (fundamental reason) why is this

where clauses are value to value - ie you call where on strings/ints/refs to columns, or any expression or function call that produces a single value - like a select query if it returns > 1 value, you get a syntax error

what does this command do? SELECT name, grade FROM Student WHERE grade = MAX(grade); what is a better way of writing it?

will fail with syntax error you are mixing a statistical function MAX with a normal column function "grade = " since we cannot combine statistical columns with normal columns in the same table, we should get the statistical column out of a subselect! SELECT name, grade FROM Student WHERE grade = (SELECT MAX(grade) FROM Student); -- this subqeury gives you a table with one row -- that's fine since you're not selecting any other columns here -- also will have one column as only one column in this select

is this command valid? SELECT MIN(grade), MAX(grade), AVG(grade), COUNT(grade) FROM Student;

yes - Can do several different statistics in same query

are these commands valid? SELECT 1, COUNT(1) FROM Student; SELECT 9, COUNT(1) FROM Student;

yes - they are statistical queries, and all cols in a statistical query need to be statistics or constants

say we have a table for students, a table for units, and a table called enrol (a join table which holds a grade a student got on a unit) without specifying the command / anything, say what you'd use to compute the average grade for every student

you would use a GROUP BY - this way we can group by student in the enrol table, and compute the average in that group


Set pelajaran terkait

Physical Fitness: Unit 4: A COMBINATION OF RESISTANCE AND REPETITIONS

View Set

Arithmetic and Geometric Sequences

View Set

GS MGT 303 CH 15 Building Careers and Writing Résumés

View Set

Chapter 18 - Blood Vessels and Circulation

View Set

N123 Chp 37 Inflammatory Disorders of the Heart

View Set

Business Strategy Exam 1 questions from Connect

View Set

Economics Unit 6: Macroeconomics

View Set

NCLEX Precautions review; standard, contact, airborne, droplet

View Set

Baroon WCU Patho Week 2 check you understanding

View Set