CSC 301A
SQL
We can issue commands to the database in order to make new databases, make new tables, add data, manipulate the data, delete data, and so on. The most common language for talking to a relational database in this way is called SQL: structured query language. SQL is pronounced either "S-Q-L" or "Sequel". You must never, ever say "squeal" because that's just silly. SQL is a declarative language which relies on commands. Commands include SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, ALTER. There are more commands, but these are the main ones we will use over and over. Standard SQL has no inherent ability to do much logic (for example "if" statements); logic must be handled in separate programming languages or via add-ons to the language. In this way, SQL is probably unlike any other programming language you have ever used.
Client-Server Model
Web Browser <- Web server running PHPMyAdmin <---> Database server running MySQL RDBMS Software
True or false? The following queries are all equivalent in that they return the same rows, just using a different syntax. (1) SELECT * FROM table1 INNER JOIN table2 ON table1.foo = table2.foo; (2) SELECT * FROM table1, table2WHERE table1.foo = table2.foo; (3) SELECT * FROM table1 NATURAL JOIN table2
Yes, these are three different ways of showing the same query: ANSI-style join, and the older Theta-style join, and a natural join. The correct answer is 'True'
Suppose your co-worker, the infamous Mr. Chris P. Nugget, produces the following SQL query: SELECT foozle_type, sum(foozle_cost)FROM foozlesWHERE foozle_status='Yes'AND SUM(foozle_cost) > 10GROUP BY 1ORDER BY 2 DESC; He wants to know why he's getting an error. What do you tell him? Select one: a. The SUM expression on line 4 needs to be replaced by a HAVING and moved to follow the GROUP BY. b. He can't group by and order by different things. c. The WHERE and GROUP BY are in the wrong order. d. The sum() on line 1 and the SUM() on line 4 need to be capitalized the same way. e. He needs to group by the opposite item on the SELECT line.
You can't have aggregate functions in a WHERE clause. This should be a HAVING. The correct answer is: The SUM expression on line 4 needs to be replaced by a HAVING and moved to follow the GROUP BY.
True or false: Every time you are performing an aggregate function(count, min, max, sum, etc) you need to use a GROUP BY statement or you'll get an error like "not a single group by function". Select one: True False
You will ONLY get this error IF you are selecting another column (or more) in addition to the column you are aggregating AND you forget to use GROUP BY. The statement SELECT avg(salary) FROM table; is fine. The statement SELECT department_code, avg(salary) FROM table; is NOT correct. The correct answer is 'False'.
When joining tables using an INNER JOIN that matches on equal values for a "column in common", a good rule of thumb is that you will need ___. Select one: a. at minimum, one fewer "INNER JOIN ... ON ..." clause than you have tables being joined. b. the same number of "INNER JOIN ... ON ..." clauses as the number of tables being joined. c. the same number of "INNER JOIN...ON..." clauses as primary key columns represented in the tables being joined. d. one extra "INNER JOIN ... ON ... " clause for each column represented in any primary key in the tables being joined.
You will need one INNER JOIN ... ON ... clause for each relationship between tables. The number of relationships is one minus the total number of tables. The correct answer is: at minimum, one fewer "INNER JOIN ... ON ..." clause than you have tables being joined.
What goes into the blank below (for [--] in allmystuff) in order to make the code below work? selectQuery = "SELECT foo, bar, baz FROM table WHERE fiddle=%s"cursor.execute(selectQuery,(myFiddle))allmystuff = cursor.fetchall()for __ in allmystuff:myFoo = f[0]myBar = f[1]myBaz = f[2]
f
Database tasks
>Database designer: Designing the database structure >Database programmer: Manipulating the data inside the database. >Database administrator: Keeping the database healthy, adding users, making backups.
Other Things Relevant to Each Column
>Name: The name of the column. Avoid hyphens, spaces, and funky characters. Underscores are fine. >Length: only relevant for character columns. The "length" of an integer column is irrelevant, so don't use it. >Required/nullable: is the column allowed to be empty (null)? >Auto_increment: Upon INSERT, a NULL value in the column will allow the database to automatically insert the next value. The database software itself will keep track of the next available value. Values are not re-used once deleted. >Primary Key (PK): Every row needs a unique identifier. This is called a Primary Key. We will learn more about these later in our Design unit, but for now, just know that each table will have one or more columns that comprise a Primary Key.
Values we hold dear in Database World
>Organized - Creating databases is fundamentally an organizational task >Accurate - We value correctness >Efficient - We value speed; we want things to go faster >Safe - We want to reduce risk of error, reduce risk of harm >Persistence / Longevity - We don't throw away data, don't want to lose information. The data will still be there later.
Database software options
>RDBMS "relational database management system" >Examples of RDBMS brands/companies (MySQL Server, Oracle, PostgreSQL, etc) >NO-SQL "not only SQL"
Varchar
>Short for "variable character". Varchars are just like chars, except that if the data entered into the column doesn't use up the entire reserved block, then the storage is freed. >In other words, if you make a varchar(50) column but then only add a word like "Mary" which is 4 characters, that data value will only take up 4 characters worth of memory. (A char(50) would have still taken up the whole 50 chars worth!) >Various pronunciations of 'varchar' are acceptable. I've heard all of the following: "vahr care", "vahr char", "vahr car", "vare care", "vare char", "vare car", and more.
INTEGER or INT - whole numbers
>Sometimes INT looks like INT(11) or the like. Don't worry about that "length" specifier here. An integer is always the same size, the "length" is just for display purposes, and that is 11 by default. There are also other sizes of integers that take up more or less memory, for example you can use BIGINT to story a very long number. >TINYINT = 1 byte (8 bit) >SMALLINT = 2 bytes (16 bit) >MEDIUMINT = 3 bytes (24 bit) INT = 4 bytes (32 bit) >BIGINT = 8 bytes (64 bit) >Not all numeric data values need to be integers. Consider a ZIP code. Those are numbers, but do we need the memory of an INTEGER to store them? A CHAR(5) might be a better choice.
Options for connecting to the database
>by typing commands, via a command line >by issuing commands through point-and-click, via a GUI ('graphical based client like PhpMyAdmin') >by issuing commands via a programming language (such as python) >programmatically via an API
Database
A database is a collection of information that goes together in some logical way. For example, we could imagine a database for information about Elon University. Things we might track in this database would be people (students, professors, staff members), courses taught, buildings, finances, and so on. New databases can be created by issuing a special database command (a SQL CREATE command, to be precise) directly to the database. These commands can be typed into an interface, such as PhpMyAdmin, or issued programmatically via a language like Python. Database software, such as MySQL, usually is installed on a computer (sometimes a remote server) and you access it through some kind of graphical user interface. In this class, our database software MySQL is installed on a machine called grid8.cs.elon.edu and we access that through a PhpMyAdmin interface at the following URL: http://grid8.cs.elon.edu/PhpMyAdmin Sometimes a database is called a schema. In PhpMyAdmin you have various schemas in the left-hand pane, including one named after yourself.
Relational Database
A relational database is one that is made up of relations, or tables. There are many other types of databases, but relational databases have become the most popular over the past 40 years. They are the primary database in use today, although non-relational or so-called "NoSQL" (Not Only SQL) databases are increasing in popularity and we will definitely learn about these later in this course as well.
Table
A table is the main structure inside a relational database. Tables store data concepts that are similar to each other. For example, we could imagine inside the Elon University database, we might have a table of students and a different table of courses. I like to think of these tables as storing "nouns" - each table describes one "noun". Synonyms for "table" are: entity, relation (hence the term 'relational' database management systems, or RDBMS). The same way we create entire databases, new tables can be created using the point-and-click interface, such as PhpMyAdmin, or by using the SQL CREATE command. To remove a table we use the DROP command.
INSERT
Adds a new record (row) into a table. Example: >We have a list of 30 students, and we just got a new student in the class, so we want to INSERT a new row for this student
Data Types
Data types describe what kind of data values we have. (Is it a number or words? Is it a date or a number? Is it a lot of text or just one or two characters?) Data types are assigned on a column-by-column basis. All data entered into that column must match the data type specified. Assigning a data type to data helps us keep the data correct, reduces the risk of making data errors. Example data types include int, decimal, date, enum, char, varchar, boolean, blob, etc. Unit 1.3 covers more about the specific data types. There are more data types than this, and they differ between the different DBMSs. The name for a number type might be INT in one system and INTEGER in another system. The data types of a column in a table can be changed using the SQL ALTER command. What happens if you enter data into a database using the wrong type? If you enter the wrong type of data into a column, you will either get an error or the data will be modified/changed/shortened to match what the database is expecting. These changes can be quite destructive and unexpected, so it's best to choose the correct data type for the data you have!
Date and Time data types
Date - for data formatted as YYYY-MM-DD like 2016-12-31. All three pieces (month, day, year) must be provided. The supported range is '1000-01-01' to '9999-12-31'. Time - for data formatted as HH:mm:SS like 13:15:00. If you leave off the last 00, you are leaving off the seconds. '13:12' means '13:12:00', not '00:13:12'. 24 hour clock is used by default. Datetime - for combined date and time in the format YYYY-MM-DD HH:mm:SS, like 2016-12-31 13:15:00. TIP: You can set datetime types to automatically update themselves each time a record is touched by setting their default value to CURRENT_TIMESTAMP. This is useful for creating a "last updated" field. I use these a lot. All date and time types are inserted with quotation marks around them, just like strings. Example: INSERT INTO mytable (birthdate) VALUES('2002-12-15');
SELECT
Displays data inside one or more tables.
Column
Each row is made up of at least one column describing something about that row. For example, Elon University Students might include columns called "first name" and "last name" and "birthdate". Synonyms of column are: attribute, field. Columns can be added to a new table at the time the table is created. Or, existing columns can be changed or removed using the SQL ALTER command.
Your cubicle-dwelling nightmare of a co-worker Chris P. Nugget offers you free donut delivery for a week if you show him how to perform a JOIN that shows ALL employees and what department they work in. He is confused because he wants the query to show all employees EVEN IF they don't have a department listed. He wants a "NULL" to show up if the department is empty for that employee. Here is the query he's come up with so far: SELECT e.emp_id, e.emp_name, d.dept_name FROM employee e ______department d ON e.dept_id=d.dept_id; What goes in the blank? Select one: a. LEFT OUTER JOIN b. RIGHT OUTER JOIN c. INNER JOIN d. FULL OUTER JOIN e. NATURAL JOIN f. JOIN g. CROSS JOIN
Employees is the "big" table - the one he wants to show everything from - so it goes on the left. Then we outer join it to the department table. If there is no department listed, NULL will appear for that employee. The correct answer is: LEFT OUTER JOIN
In database terminology, ERD stands for BLANK
Entity Relationship Diagram
Enum
Enumerated types are sort of like a set, or choosing an answer from a list. You specify the values you want to be allowed, and the user is only allowed to choose one of those. To add an ENUM in PhpMyAdmin, use the following structure in a CREATE statement: ENUM('foo','bar','baz') Note that the items are surrounded by single quotes with a comma in between. In PhpMyAdmin, the interface wants you to fill it in like this: Be aware that the order of the items in the ENUM creation is the sort order that you will get in an ORDER BY clause. In other words, in the previous example, red will be ordered before black, even though that is not alphabetically correct.
True or false: `backticks` and 'single quotes' are basically the same in MySQL and they can be substituted for one another with no effect. Quiz 2
False. `Backticks` are optionally used to enclose the names of databases, tables, and columns. 'Single' and "double" quotes are used to enclose non-numeric data values such as strings and dates. You can review this in Unit 1: https://github.com/megansquire/CSC301Spr2019/blob/master/Unit1/1.3Notes.md#136-a-word-about-quotation-marks
The following tables are for CUSTOMERs who have rented cars (RENTALS). Notice that RENTALS can have customers that are not in the CUSTOMER table. (e.g. Rental #70 is in RENTALS but not in CUSTOMER). Further, note that CUSTOMERs may not have rented any cars (e.g. customer #60). Given the tables shown, which type of JOIN should be used to show all customers (CID) and all rentals (RTN), even those where the customer did not rent a car yet AND rentals that may not have a customer number? Quiz4
Here we want to show everything from CUSTOMER and everything from RENTAL, even if they don't have matching rows, so we need a FULL OUTER JOIN. The correct answer is: full outer join
Which data type is best for storing positive or negative whole numbers?
INT
True or false: Any 1NF table with a single column primary key and no other candidate keys is automatically in 2NF.
If the table is in 1NF, and it has a PK of only one column, and no other candidate keys exist, then yes, it is in 2NF, True
Your annoying co-worker Chris P. Nugget is at it again. This time he wrote the following SQL and wonders why it doesn't work to locate all customers who placed orders totaling more than $2000. 1 SELECT Customer, SUM(OrderPrice) 2 FROM Orders3 GROUP BY 14 HAVING 2 > 2000; What do you tell him? Select one: a. The problem is on line 2. b. You tell Chris to restart his computer. c. The problem is on line 3. d. The problem is on line 1. e. The problem is on line 4.
If you use the shortcuts "1" and "2" to refer to columns in SQL, you can't use them in boolean expressions. 2 is always less than 2000 so this will always return an empty set. The correct answer is: The problem is on line 4.
The attribute list for the substring() function is: substring(string, position, length) What letters will be returned from the following query: SELECT substring('alphabet',4,3);
In SQL, substring() position starts with 1, not 0. So the 4th character is the start position ('h'), and the substring takes 3 characters starting from that position. Thus, h-a-b. The correct answer is: hab
Row
Inside the tables are rows, sometimes called records. Each row or record is one instance of the noun being described by the table. For example, in our imaginary Elon University database, inside the Students table, we might have a row for "Sally Smith" and another row for "Joe Johnson". New rows are added to a table using the SQL INSERT command. To remove a row we use the DELETE command.
CREATE
Makes a new table or database. For tables, at the point of CREATE you will define all the columns (fields) your table will have. For databases, you will only name the database during the CREATE.
All about NULLs
NULL values occur when no data is entered in a cell. A column can be "NULLABLE", meaning that it is allowed to have NULL values, or "not NULLABLE" meaning that NULL values are not allowed. For example, a column for middle_name should probably be NULLABLE since not everyone has a middle name. NULL is not the same thing as an empty string (sometimes shown as two quotes right next to each other like this ''), and it is not the same as 0. Interesting: NULL can't be measured against other things using the equality operator or math operators, since it has no value.
Primary keys made up of a column (or columns) that already exist(s) in the database table - and are essential to describing some aspect of the entity - are called BLANK keys.
Natural keys are columns that would have existed in your database whether you made them a PK or not. These are not "made up" keys. For example, if you chose a combination of first and last name. Those are natural keys. A person_id such as 1234, however is a synthetic key. The correct answer is: natural
Consider a SQL table `schlump` with three columns in it: `foo`, `bar`, and `baz`. All the columns are data type varchar(30). SELECT foo, bar FROM schlump ORDER BY baz; On which line of the code snippet is the FIRST error found?
No problems here. It is possible to sort by a column that was not on the SELECT line. The correct answer is: There are no errors in this code.
Data
Now we are getting into the meat of the database - the data values are found at the intersection of a row and a column. Sometimes the data values are referred to as "cell values" since they might remind us a little of the cells in a spreadsheet, in that they are found at the intersection of a column and a row. Our goal in database design is that data values should be atomic. Atomic means the data value can not be broken down any further and still remain logical. For example, "Name" can be broken into "first name" (Sally) and "last name" (Smith), but it is probably not necessary to break "first name" into its individual letters (S-A-L-L-Y). Having atomic data helps us keep the data correct, and reduces the risk of making data errors. Remember that accuracy is an important value of database systems. Entire rows of data are added to the database using SQL's INSERT command, and individual data values can be changed using the UPDATE command.
The SQL phrase used to sort a result set is BLANK
ORDER BY
Scenario: 1. You have a table with an auto_increment Primary Key. With auto_increments, the database keeps track of the next number to be used. Typically they start with 1 for the first row, and then increment 2,3,4 and so on. 2. Into this new table, you add rows 1-99 and the auto_increments are correctly applied 1-99. 3. Then, for whatever reason, you delete row 99. 4. You then add another new row to the table. What will be the Primary Key value for the new row added in step #4 above?
Once 99 is assigned as a PK value in an auto_increment field. 99 can never be used again (unless you do some alterations to the database manually - NOT described in the above scenario!) The row number will be 100. The auto_increment value, once used in a table, can not be used again. The correct answer is: 100
A few words about quotation marks
PhpMyAdmin will automatically insert ` (backquotes or backticks, located near the '1' key on the keyboard) around column names and table names. The reason it is doing this is to help you in case you accidentally create a column or table name that has spaces or other weird characters in it. String values must be enclosed in ' or ", but do not use ' or " around column names or table names CORRECT: SELECT last_name FROM mytable WHERE first_name = 'Mary'; (uses single quotes around data value) SELECT last_name FROM mytable WHERE first_name = "Mary"; (uses double quotes around a data value) SELECT \`last_name\` FROM mytable WHERE first_name = 'Mary'; (uses backticks around table name and single quotes around data value) INCORRECT: SELECT last_name FROM mytable WHERE first_name = \`Mary\`; (uses backticks around a data value) SELECT last_name FROM 'mytable' WHERE first_name = 'Mary'; (uses quotes around a table name)
TRUNCATE
Remove all the rows from a table but leaves the table structure intact. Example: >We have a table of students in a class, but the semester is over. We want to empty out this particular group of students, but leave the columns ready for the next batch of students in the following semester.
DELETE
Removes a row. Example: >A student dropped the class so we want to remove them from the table of students in this class Question: >What if you don't want to remove an entire row, but you just want to remove one or two cell values? Then you don't want DELETE, you want UPDATE! UPDATE will retain the row, but allow you to remove (or change) individual values.
DROP
Removes a table or database, as well as all the data inside it.
Smartquotes
Smart quotes are NOT SQL-compliant and will cause you no end of headaches. Many times these are inserted after we copy/paste SQL code from somewhere on the web, or where we use a non-text editor for doing our work (e.g. Word or Google Docs) - don't do it! SELECT 'last_name' FROM mytable WHERE first_name = 'Mary'; (uses quotes around a column name)
Suppose I have a table called toys with 30 records in it. Further suppose that I have another table called boys with 6 records in it. Each table has a unique identifier column with the ID of the boy or toy. These columns are called boy_id and toy_id, respectively. If I run the following query, how many rows (records) will be returned as a result of my query? SELECT t.toy_id, b.boy_idFROM toys t, boys b; Quiz 4
Suppose I have a table called toys with 30 records in it. Further suppose that I have another table called boys with 6 records in it. Each table has a unique identifier column with the ID of the boy or toy. These columns are called boy_id and toy_id, respectively. If I run the following query, how many rows (records) will be returned as a result of my query? SELECT t.toy_id, b.boy_idFROM toys t, boys b;
Text
Text data type is used for larger strings. There are medium text and bigtext types also. Avoid blobs (binary large objects) even though they are allowed in MySQL. The database is not very good at storing binary data such as pictures. A better solution would be to let the file system store binary objects and we just store the location in the database.
Suppose you have a table called students in which you have four columns, studentID, studentName, studentType, and studentTuition.A few of the sample student records look like this:1, Jackie Hwang, FT, 1500 2, Brenda McNabb, PT, 1200 3, Martha Penelope Griswald, FT, 1800 There are 1000 records in the table. Suppose you need to write a query to calculate the average tuition paid by all FT ('studentType') students. Which of the following is correct?
The command will be: SELECT avg(studentTuition) FROM students WHERE studentType='FT'; The correct answer is: I will need an avg() aggregate function and a WHERE clause.
Imagine you are at a dinner party for work and you meet a woman whose name was either 'Sara' or 'Sarah'. You decide to look her up in the corporate database later. Complete this SQL command for finding all records which could be a match using LIKE and an approximate string: SELECT * FROM people WHERE first_name BLANK; NOTE: complete the query exactly as it would have to be completed to run without error in MySQL.
The construction is: SELECT * FROM people WHERE first_name LIKE 'Sara%'; The correct answer is: like 'Sara%'
True or false: Any table that has a foreign key column that points back to the primary key of a different table is in 3NF. Quiz 6
The correct answer is 'False'.
True or false: In general, INNER JOINs are more efficient for the database to process than Subqueries, and usually easier for the human user to understand as well, so given a situation where both queries can accomplish our goal, we should choose the INNER JOIN rather than the Subquery.
The correct answer is 'True'.
When forward-engineering an ERD into an actual relational table design and SQL CREATE statements, the primary key of the parent side of a 1-M relationship (the 1 or "one" side) will become a foreign key in the child side (the M or "many" side) of the relationship. Quiz 5
The correct answer is 'True'.
Match the technology with its description. 1. Browser based interface to a MySQL database 2. Relational Database Management System (RDBMS) 3. Language for manipulating data inside a RDBMS Quiz 1
The correct answer is: Browser-based interface to a MySQL database → PhpMyAdmin, Relational Database Management System (RDBMS) → MySQL, Language for manipulating data inside a RDBMS → SQL
In SQL, the INSERT command is used for which of the following tasks? Select one: a. INSERT is used to make a new table or database. b. INSERT is used to enter a new row/record in a table. c. INSERT is used to make a new column in an existing table. d. INSERT is used to change the data type or length of an existing column in a table. e. INSERT is used to make changes to the value in a column for an existing row.
The correct answer is: INSERT is used to enter a new row/record in a table.
For there to be a violation of 2NF in a table... Select one: a. There must be some specific nontrivial multivalue dependencies. b. There must be transitive dependencies. c. Part of a primary key determines a non-key column. d. Part of the primary key must be nullable when a non-key column is taken into account. Quiz 5b
The correct answer is: Part of a primary key determines a non-key column.
Imagine you are SELECTing information associated with a specific date from a column that has been given the DATE data type in a MySQL database table. Identify the correct SQL to accomplish this goal. Select one: a. SELECT * FROM myTable WHERE importantDate = '2018-07-11'; b. SELECT * FROM myTable WHERE importantDate = 2018-07-11; c. SELECT * FROM myTable WHERE importantDate = '11-07-2018'; d. SELECT * FROM myTable WHERE importantDate = '2018-Jul-11'; e. SELECT * FROM myTable WHERE importantDate = 'July 11, 2018' Quiz 5
The correct answer is: SELECT * FROM myTable WHERE importantDate = '2018-07-11';
Suppose I want to find records in `my_table` for which there are missing values in the `foo` column. Which of the following is the correct command to run?
The correct answer is: SELECT * FROM my_table WHERE foo IS NULL;
Imagine you have a table like the following: pet_type: Cat | Dog | Fish | Cat | Dog pet_name: Suzie|Sam|Sparkles|Sam|Murphy pet_age: 12 | 5 | 1 | 3 | 2 Which of the SQL statements below is correct for counting how many unique names are in the 'pet_name' column? (There are 4 unique names in the pet_name column; Sam is used twice.) Quiz 2
The correct answer is: SELECT count(DISTINCT pet_name) FROM pets;
Suppose I have a table called foozles with three columns in it.Which of the following commands is appropriate to generate the desired output shown below?Desired output:foozle_type Number of Foozles=========== =================Flarg 19Kneedorf 23Prongle 5 Select one: a. SELECT foozle_type, COUNT(DISTINCT *) AS "Number of Foozles"FROM foozlesGROUP BY 1; b. SELECT foozle_type, COUNT(foozle_type) AS "Number of Foozles"FROM foozlesORDER BY foozle_type; c. SELECT foozle_type, COUNT(*) AS "Number of Foozles"FROM foozlesORDER BY foozle_type ASCGROUP BY foozle_type; d. SELECT foozle_type, COUNT(*) AS "Number of Foozles"FROM foozlesGROUP BY 1ORDER BY 1;
The correct answer is: SELECT foozle_type, COUNT(*) AS "Number of Foozles"FROM foozlesGROUP BY 1ORDER BY 1;
Put the steps of a Python database connection in the correct order that they should happen. -(2) Set up a cursor to the database -(3) Set up & execute a query -(4) Fetch results and process -(5) Close the database connection -(1) Open the database connection
The correct answer is: Set up a cursor to the database → 2, Set up & execute a query → 3, Fetch the results of the query & process them → 4, Close the database connection → 5, Open the database connection → 1
Match the SQL command to the correct usage of it. 1. Used to add a new row to a table. Assumes the table is either empty, or the row you want to add is not currently in the table : INSERT 2. Used to change a blank (null) middle initial column of an existing row to 'S' : UPDATE 3. Used to add new data to a column that is blank, even if the rest of the row is filled in with data : UPDATE 4. Used to change the value in a column called first_name from 'Mary' to 'Marie' : UPDATE 5. Used to modify an existing row to a table. The row must already exist to use this command : UPDATE 6. Used to change the number of rows/records in a table from 100 to 101 : INSERT Quiz 6
The correct answer is: Used to add a new row to a table. Assumes the table is either empty, or the row you want to add is not currently in the table. → INSERT, Used to change a blank (null) middle initial column of an existing row to 'S' → UPDATE, Used to add new data to a column that is blank, even if the rest of the row is filled in with data. → UPDATE, Used to change the value in a column called first_name from 'Mary' to 'Marie' → UPDATE, Used to modify an existing row to a table. The row must already exist to use this command. → UPDATE, Used to change the number of rows/records in a table from 100 to 101. → INSERT
Match the MySQL Function to its purpose. 1. Used to generate a random number between 0 and 1 2. Used to put two or more strings together. 3. Used to convert a string to uppercase 4. Used to extract the year from a 'date' data type. 5. Used to get the position of the first instance of a particular character in a string. 6. Used to pull some arbitrary piece out of a string.
The correct answer is: Used to generate a random number between 0 and 1 → rand(), Used to put two or more strings together. → concat(), Used to convert a string to uppercase → upper(), Used to extract the year from a 'date' data type. → year(), Used to get the position of the first instance of a particular character in a string. → substring_index(), Used to pull some arbitrary piece out of a string. → substring()
Match the processing function to the correct usage. -Used with a SELECT that will return only one row -Used with a SELECT that will return more than one column and more than one row -Used with a SELECT that will return more then one row -Used with SELECT count(*)
The correct answer is: Used with a SELECT that will return only one row → fetchone(), Used with a SELECT that will return more than one column and more than one row → fetchall(), Used with a SELECT that will return more then one row → fetchall(), Used with SELECT count(*) → fetchone()
Which of the following statements is true? (Select one.) Select one: a. WHERE is used to narrow down results before the GROUP BY, but HAVING is used to narrow down a result set after the GROUP BY is done. b. WHERE and HAVING are both used to narrow down result sets, but WHERE is able to be used with aggregate functions, whereas HAVING is not. c. HAVING is to be written last on the SELECT statement. It can therefore narrow down results last. This means that HAVING can effectively narrow down results of a WHERE, GROUP BY or ORDER BY. d. HAVING is used to narrow down results before the GROUP BY, but WHERE is used to narrow down a result set after the GROUP BY.
The correct answer is: WHERE is used to narrow down results before the GROUP BY, but HAVING is used to narrow down a result set after the GROUP BY is done.
In which type of JOIN do I require the "column(s) in common" to be named identically across the tables being joined? Select one: a. natural join b. cross join c. equi-join d. inner join e. synthetic join
The correct answer is: natural join
A subquery that will work independent of any other part of the query (doesn't depend on anything else from the outer query) is called a _______ subquery. An example would be the following subquery shown in bold on line 3: SELECT first_nameFROM studentsWHERE gpa = (SELECT max(gpa) from students);
The correct answer is: non-correlated
For each item, choose its closest synonym in database terminology: table, database, column, row, cell
The correct answer is: table → entity, database → schema, column → attribute, row → record, cell → data value
After this quiz, you spitefully decide to change the value of your favorite professor's last name from 'Squire' to 'Hollingsworth' in a database. To change the value of this one field, in a column called 'last_name', you need to use the BLANK command. Quiz 3
The correct answer is: update
Which data type is best in MySQL for storing a string of mixed numbers and letters? For example any street address like "123 Main St." or "11876 East 65th St NW" Select one: a. varchar b. char c. integer d. address e. string f. boolean
The correct answer is: varchar
Suppose you have a particular column you are creating in your MySQL database table. You are 100% positive that the only data that will ever go into that column is one of two words: either 'foo' or 'bar'. There are no other possible values. Data will be inputted into that table using a web form which has a drop-down box with only those two choices in it (foo and bar) and the user will get an error if they try to submit the form without one of those two choices filled in. Which of the following are the options you should choose for this column? (You may select one or more.) Select one or more: a. ENUM('foo','bar') b. CHAR(3) c. NULLABLE d. NOT NULLABLE e. PRIMARY KEY f. INTEGER g. AUTO_INCREMENT h. VARCHAR(3)
The correct answers are: ENUM('foo','bar'), NOT NULLABLE
Which of the following are true about ERDs? (check as many as apply) Select one or more: a. ERDs are a visual way of showing everything about a database design, including its purpose and examples of precise data values we can expect to populate each table. b. ERDs can be used across cultures and language barriers to describe a database design. c. None of these are true. d. ERDs are an efficient and manufacturer-independent way to describe most of the rules that govern how the database should be designed. (Manufacturer-independent means that an ERD can be implemented in many different database software packages: MySQL, SQL Server, Oracle, etc.) Quiz 5b
The correct answers are: ERDs are an efficient and manufacturer-independent way to describe most of the rules that govern how the database should be designed. (Manufacturer-independent means that an ERD can be implemented in many different database software packages: MySQL, SQL Server, Oracle, etc.), ERDs can be used across cultures and language barriers to describe a database design.
True or false: If you want to connect two tables together with a foreign key relationship, the data types and lengths of the columns you are connecting ("the columns in common") must be exactly the same, and the names of the columns being connected must also be exactly the same.
The data types and lengths of the columns in common MUST be the same, but the column names do NOT have to be the same. The correct answer is 'False'.
"I have a query that looks like this: SELECT article FROM table1 ORder By publish_date LIMIT 20" Which of the following is the best answer to give this person (assume they are using a MySQL database)?
The database must put all the records in order first, in order to know which 20 it should return to you. The correct answer is: MySQL will first put all the qualifying records in order, then it will return the first 20 records from that complete set.
Relationships
The tables in a database are usually logically connected to each other. These logical connections are called the relationships between the tables. For example, Elon University students might be listed in their own table, and then we might have a table for dorm rooms (imagine it lists the dormitory building, the room number, how many people live in that room, and so on). We can use a relationship to connect students to dormitories. Sometimes we will see drawings/diagrams that express the relationships between tables by using boxes for the tables and lines for the relationships between them. This kind of drawing is called an ERD, or entity-relationship diagram.
Decimal - numbers with decimal points
These are numbers with decimal points. They have parentheses after them, and inside the parens are two things: precision first and scale second. Example: salary DECIMAL(5,2) In this example, 5 is the precision, or total number of digits allowed in the number. 2 is the scale, or number of digits that go on the right of the decimal point. Therefore, for this example: >999.56 - is allowed >111.11 - is allowed >1000.56 - is disallowed (too many digits total) >1000.599 - is disallowed (too many digits on the right) All number types are inserted with NO quotation marks around them. Example: INSERT INTO mytable (salary) VALUES (199.99); If you accidentally type quotation marks when inserting numeric data into a numeric column, the database will probably be okay with that (it will ignore them).
Char - character data
This data type can hold any keyboard characters, as well as emojis and various world language characters also. A char data type is always a fixed length in the computer's storage, even if you type fewer characters. For example, if you set a field to be char(10) and then you only type 5 characters into the field, the value you typed will still be allotted 10 characters worth of memory. TIP: A character column needs a collation which tells us what the allowed language is, and alphabetical order is for that language. By default the collation is usually latin1-swedish-ci. (Why? If you need to store a character set that is more complicated than latin1 characters (e.g. has emojis or multi-byte character sets), use a different collation. More about this in Unit 5.
Booleans
This is a kind of fake one; it just creates a tinyint where 0 is false and any other integer > 0 is true.
Suppose you and your co-worker Chris P. Nugget are debating whether to make a column a char or a varchar. The column is designed to hold a street address (for example, "123 Main St."). Mr. Nugget says it doesn't matter whether we declare it char(50) or varchar(50) because when all the customer data is filled in, a varchar and char column both will use up the same amount of space in the database's memory. Is this true or false?
This is definitely false. A varchar(50) will only take as much space as is needed to store the values, up to a limit of 50 bytes. A char(50) will always take 50 bytes, and will right-pad the values with spaces if needed. (see page 18)
True or false: Grouping ("group by") works with five aggregate functions only (count, sum, min, max, average), and cannot be used in SQL statements that include other functions, such as substring() or concat().
This is definitely false. There is no prohibition against using 'group by' with many types of functions, including the aggregates. In this course, we have mostly seen 'group by' used with aggregates, because that is where it really shines (and is many times required!), but it can also be used with other functions, where it will have the effect of removing duplicates (similar to DISTINCT). The correct answer is 'False'.
What normal form is this table in? Assume the Order_ID column is the Primary Key and there are no other candidate keys. (Assume that each customer may place many orders each day, even though only two orders are shown here, one per customer.)
This is in 2NF. (It has a single-column PK and there are no other candidate keys.) It is not in 3NF though, because the customer fields are all dependent on Cust_id which is a non-key column. The correct answer is: This table is in 2NF but not in 3NF
True or false: The following two SQL constructions will produce the same results. Assume both queries run perfectly, with no syntax errors.(1) SELECT foo FROM bar WHERE NOT baz < 99;(2) SELECT foo FROM bar WHERE baz >= 99;
This is true. >= and NOT... < are equivalent. Even NULL values are not returned from the result of a logical equivalence operation. The correct answer is 'True'.
True or false: If we have a 1NF table that has only two columns, and both columns comprise the composite Primary Key, then the table is automatically in 3NF.
This is true. If there are only two columns, and both are part of the PK, then it is already in 2NF, and there can be no transitive dependencies, so it is also automatically in 3NF. The correct answer is 'True'.
The following question was recently published on Stack Overflow by your hapless co-worker Chris P. Nugget (he was using his alias user1853266 at the time). True or false? The correct answer is: SELECT a.name, b.name as bnameFROM user aLEFT JOIN course b -- or LEFT OUTER JOIN course bON a.course = b.idWHERE a.name='Alice';
This is true. Remember that LEFT JOIN is basically a synonym for LEFT OUTER JOIN. Chris just messed up his WHERE clause here - he had it in the wrong place. Here is the original Stack question if you want to read it. http://stackoverflow.com/questions/22444507/use-where-in-mysql-join The correct answer is 'True'.
True or false: In ERD diagramming notation, all M:N relationships should be decomposed into a series of 1:M relationships if we plan to actually forward-engineer (build) the diagram into an actual database schema.
This is true. The database will not know how to build a M:N schema otherwise. The correct answer is 'True'.
What normal form is this table in? table: `animal_colors` animal | colors pig | pink, white cow | brown, white, black, purple horse | brown, white, black
This table has a "multi-value" attribute, which is 'colors'. The 'colors' attribute right now has multiple values for some of the rows. This means that the table is not in 1NF. The correct answer is: The table is in 0NF (not yet in 1NF).
Using only the data shown in this table, what is the highest normal form the table is in? The PK is underlined. You may assume that a given "Soup Brand" is only made by one "Parent Company". Select one: a. This table is not yet in a normal form (0NF). b. This table is in 1NF but not yet in 2NF. c. The table is in 2NF but not yet in 3NF. d. The table is in 3NF or higher.
This table is in 1NF because it has no multi-value attributes (all data is atomic), and it has a valid PK (every row is unique). It is not yet in 2NF because "Parent Company" appears to be dependent on only part of the PK (Soup Brand). (Parent Company obviously can't determine Soup Flavor since there are two different brands and companies making Tomato soup.) The correct answer is: This table is in 1NF but not yet in 2NF.
True or False: When you set up a column inside a database table, you decide what "type" the data in that column will be (integer, varchar, date, etc), and from that point on, the data you enter into that column can be of that type only. Data that is not of that type will either be rejected (an error condition) or will be converted into the specified type.
True
True or false: In Python programming with the PyMySQL database libraries, all database query parameters must be passed to the database as strings, regardless of their true data type.
True
True or false: The following syntactically correct commands are basically equivalent in MySQL, in that they will produce the same results, although they may take more or less time to run. SELECT * FROM foo WHERE bar != 'Pickle'; SELECT * FROM foo WHERE bar <> 'Pickle' SELECT * FROM foo WHERE NOT bar = 'Pickle'; SELECT * FROM foo WHERE bar NOT IN ('Pickle'); SELECT * FROM foo WHERE NOT bar IN ('Pickle');
True
True or false: In our version of MySQL, if you want to connect two tables together with an INNER JOIN, the data types and lengths of the columns you are connecting ("the columns in common") must be the same, but the columns don't have to be named the same thing.
True. The columns can be named differently or they can be named the same. However, no matter what they are named, the columns being joined must have the same data type and length. The correct answer is 'True'
Database Vocabulary
Using the correct vocabulary is important. It helps us to be Accurate and Efficient (two of our values from the prior section!) Experts in any topic use specialized vocabulary (sometimes derisively called "jargon") to convey information and have a richer conversation than non-experts can understand The following vocabulary words are particular to relational database systems.
ALTER
Change the structure of an existing table. Examples: >change the name of a column >remove a column >change the length of a column, for example last_name needs to be 50 characters instead of 20 characters >add a new column to an existing table
UPDATE
Changes the data in an existing row Example: >We entered the wrong birthdate for a student, so we need to change that in that student's row >We left out the middle name for a student, so we want to add that piece of data to that student's row
On which line is the problem with the following SQL statement:DELETE toy_costFROM toysWHERE toy_name = 'Bob the Builder'; Select one: a. Line 1 b. Line 2 c. Line 3 d. There are no problems with this query. Quiz 3
DELETE does not get a list of columns. If you want to remove just the cost, but leave the rest of the row intact, use UPDATE. The correct answer is: Line 1
What is the SQL command used to remove a table?
DROP command is used to remove a table. Note that the "DELETE" command is used to remove a particular row/record from a table.
What is a database for?
Data storage, optimized for: volume, velocity, veracity, value.
Which of the following are true statements about SQL? Select as many as apply. Select one or more: a. SQL is a general purpose language designed to get data in and out of a relational database management system (RDBMS). b. SQL is used by all modern relational database management systems (RDBMS), but each of these has their own variants of the language. Most of the variation in SQL between the different RDBMSs occurs in the commands used to manipulate the structure of the database (for example in what data types, lengths, are used). c. SQL (and specifically the SELECT command) is considered a declarative programming language. This means that it is designed to do one specific computational task ("give me the data") without specifying HOW that task is done or the logical flow of the task. Because of this, standard SQL does not have an "if" command, or any facilities for looping. d. SQL is a programming language that is conceptually related to mathematical sets, so knowledge of set theory can be helpful when thinking about tables (and the relationships between them), and retrieving the columns and rows inside a table. Quiz 1
All of these are true. The correct answers are: SQL is a general purpose language designed to get data in and out of a relational database management system (RDBMS)., SQL is used by all modern relational database management systems (RDBMS), but each of these has their own variants of the language. Most of the variation in SQL between the different RDBMSs occurs in the commands used to manipulate the structure of the database (for example in what data types, lengths, are used)., SQL (and specifically the SELECT command) is considered a declarative programming language. This means that it is designed to do one specific computational task ("give me the data") without specifying HOW that task is done or the logical flow of the task. Because of this, standard SQL does not have an "if" command, or any facilities for looping., SQL is a programming language that is conceptually related to mathematical sets, so knowledge of set theory can be helpful when thinking about tables (and the relationships between them), and retrieving the columns and rows inside a table.
String Types
All string types are inserted with quotation marks around the values. Example: INSERT INTO mytable (myname) VALUES('Sally Smith'); or INSERT INTO mytable (myname) VALUES("Sally Smith");
The term for data that can't be broken down any further into its component parts is ____.
Atomic