D191 Study Cards

You need to update the OrderQty as 10 in the uc_pizzaorder table where PizzaOrderID = 1

UPDATE uc_pizzaorder SET OrderQty = 10 WHERE PizzaOrderID = 1;

What storage size is the "double precision" data type?

8

The -U flag is succeeded by the unique database value. True or false?

False

The -p flag is used to define the primary key of the database. True or False?

False

A table, test1 has already been created. Delete the rows where id is equal to 35 or 23. Structure of test1 table is given with columns name and id

DELETE FROM test1 WHERE id = 23 OR id = 35;

Deleting Rows from a Table Deleting a row from a table can be done using the DELETE statement, which looks like this:

DELETE FROM {table_name} WHERE {conditional};

DELETE FROM customers WHERE email='[email protected]';

Delete all the data from the exercise_routine table but leave the structure intact.

TRUNCATE TABLE exercise_routine;

For Windows users, it is assumed that you will have secured the permissions of the file so that other users cannot access it. Once you have created the file, you can test that it works by calling psql as follows in the terminal

"psql -h my_host -p 5432 -d my_database -U my_username"

*** stopped at CROSS JOIN on page 25

**** make sure to look at that subreddit post you sent yourself via email about D191... there are specific chapters they said to read vs. skip -- also do the labs! Don't read more than you need to!

If our dataset was sufficiently large, we could encounter issues if we didn't optimize the queries first; the most common issue would simply be the time taken to execute the queries. While this doesn't sound like a significant issue, unnecessarily long processing times can cause:

- A reduction in the depth of the completed analysis. As each query takes a long time, the practicalities of project schedules may limit the number of queries, and so the depth and complexity of the analysis may be limited
- The limiting of the selection of data for analysis. By artificially reducing the dataset using sub-sampling, we may be able to complete the analysis in a reasonable time but would have to sacrifice the number of observations being used. This may, in turn, lead to biases being accidentally included in the analysis
- The need to use many more resources in parallel to complete the analysis in a reasonable time, thereby increasing the project cost

In order to provide scheduling functionality in PostgreSQL, you will need to use one of the external tools. Examples of these include:

- Linux crontab
- pgAgent (agent)
- pg_cron (extension)

Similarly, another potential issue with sub-optimal queries is an increase in the required system memory and compute power. This can result in either of the following two scenarios:

- Prevention of the analysis due to insufficient resources
- A significant increase in the cost of the project to recruit the required resources

Here is a breakdown of the command "\copy customers FROM 'my_file.csv' CSV HEADER DELIMITER ',';"

- \copy invokes the Postgres COPY ... FROM STDIN command to load the data into the database
- customers specifies the name of the table that we want to append to
- FROM 'my_file.csv' specifies that we are uploading records from my_file.csv -- the FROM keyword specifies that we are uploading records, as opposed to the TO keyword that we used to download records
- The CSV HEADER parameters operate the same as before
- DELIMITER ',' specifies what the delimiter is in the file. For a CSV file, this is assumed to be a comma, so we do not need this parameter. However, for readability, it might be useful to explicitly define this parameter, for no other reason than to remind yourself how the file has been formatted

For all columns in table A and table B, write an outer equi join for A and B on A.key and B.key
- select * from A full outer join B on A.key = C.key
- select * from A full outer join B on A.key = A.key;
- select * from A full outer join B on A.key = B.key;
- select * from A full outer join B on B.key = B.key;

- select * from A full outer join B on A.key = B.key;

For all columns in A and B, write a right equi join for table A and table B on A.key and B.key
- select * from A right join B on A.key = D.key;
- select * from A right join B on A.key = C.key;
- select * from A right join B on A.key = B.key;
- select * from A right join B on A.key = A.key;

- select * from A right join B on A.key = B.key;

What storage size is the "smallint" data type?

2

boolean

A data type used to represent true or false

relational database

A database that utilizes the relational model of data

WHERE clause

A piece of conditional logic that limits the amount of data returned

Which of the following events initiate a trigger?
- A row is inserted into a table
- A field within a row is updated
- A row within a table is deleted
- A table is truncated
- A row within a table is retrieved
- A table is selected

A row is inserted into a table
A field within a row is updated
A row within a table is deleted
A table is truncated

Which two are true regarding subqueries?
- Subqueries need not be enclosed within parentheses
- A subquery must have more than one column in the SELECT clause
- A subquery is nested within another SQL query
- A subquery returns conditional data necessary to complete the main query

A subquery is nested within another SQL query
A subquery returns conditional data necessary to complete the main query

Which statement is true regarding triggers?
- A trigger can be marked FOR EACH STATEMENT
- A trigger needs to be dropped before dropping the table
- Multiple triggers cannot be defined on the same event
- A trigger can be created on a view

A trigger can be marked FOR EACH STATEMENT

Which keywords determine when the trigger actions will be executed?
- AFTER
- INSTEAD OF
- FOR EACH ROW
- EXCEPT
- BEFORE

AFTER
INSTEAD OF
BEFORE

Add a new column to the CHEESES table. It should be called "description" and should be a multi-lined text field that is required.

ALTER TABLE CHEESES ADD description TEXT NOT NULL DEFAULT 'foo';
ALTER TABLE CHEESES ALTER COLUMN description DROP DEFAULT;

*** I wonder why we keep adding and dropping defaults here, and if there is an alternative solution where we don't have to keep adding/removing these
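To answer the note above: a common alternative (sketched below; the placeholder value is hypothetical) is to add the column as nullable first, backfill the existing rows, and only then attach the NOT NULL constraint, so no DEFAULT is ever needed:

```sql
-- Sketch of an alternative approach (PostgreSQL); assumes CHEESES already
-- contains rows and that 'no description yet' is a placeholder we pick.
ALTER TABLE CHEESES ADD description TEXT;                   -- nullable at first
UPDATE CHEESES SET description = 'no description yet';      -- backfill existing rows
ALTER TABLE CHEESES ALTER COLUMN description SET NOT NULL;  -- now enforce the constraint
```

The DEFAULT-then-DROP pattern from the card does the same thing in fewer statements; this version just makes the backfill step explicit.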

You have to change the data type of "pass_key" to bytea data type (this column should not be NULL) in the Pass_keys table.

ALTER TABLE Pass_keys ALTER COLUMN pass_key TYPE BYTEA USING (pass_key::bytea), ALTER COLUMN pass_key SET NOT NULL;

All that's left to do is execute the procedure! To add funds to an account, execute your procedure by using the CALL keyword, like this:

CALL add_funds(50000, 118);
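For context, a procedure like this would have been defined with CREATE PROCEDURE. The body below is a hypothetical sketch only (the accounts table, its columns, and the argument meanings are assumptions, not from the course material):

```sql
-- Hypothetical sketch only; table and column names are assumed.
CREATE PROCEDURE add_funds(amount NUMERIC, account_id INT)
LANGUAGE PLPGSQL
AS $$
BEGIN
    UPDATE accounts SET balance = balance + amount WHERE id = account_id;
    COMMIT;  -- unlike functions, procedures may control transactions
END;
$$;
```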

What functions are used to clean data?

CASE WHEN
COALESCE
NULLIF
LEAST/GREATEST
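COALESCE and NULLIF behave the same way in SQLite, so here is a minimal runnable sketch (the contacts table and its values are made up):

```python
import sqlite3

# Quick sketch of COALESCE and NULLIF; the table and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT, backup_phone TEXT)")
conn.execute("INSERT INTO contacts VALUES ('Ann', NULL, '555-0100'), ('Bob', 'n/a', NULL)")

# COALESCE returns the first non-NULL argument
rows = conn.execute(
    "SELECT name, COALESCE(phone, backup_phone, 'unknown') FROM contacts"
).fetchall()
print(rows)  # [('Ann', '555-0100'), ('Bob', 'n/a')]

# NULLIF(a, b) returns NULL when a = b -- handy for turning sentinel values like 'n/a' into NULL
rows = conn.execute(
    "SELECT name, COALESCE(NULLIF(phone, 'n/a'), backup_phone, 'unknown') FROM contacts"
).fetchall()
print(rows)  # [('Ann', '555-0100'), ('Bob', 'unknown')]
```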

A local specialty foods store wants to target a new marketing campaign based on customer spending habits. The store wants to classify its customers as follows: >$100/week, "Big Spender;" $50 to <$100 a week, "Medium Spender;" and <$50/week, "Small Spender." Which PostgreSQL function should be used by the store to classify its customers?
- DISTINCT ON
- NULLIF
- COALESCE
- CASE WHEN

CASE WHEN
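The classification from the card above can be sketched with CASE WHEN; this runs in SQLite too (the customers and their weekly_spend values are invented):

```python
import sqlite3

# Sketch of the spender classification; the table and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, weekly_spend REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [('Ann', 120.0), ('Bob', 75.0), ('Cat', 20.0)],
)
rows = conn.execute(
    """
    SELECT name,
           CASE WHEN weekly_spend > 100 THEN 'Big Spender'
                WHEN weekly_spend >= 50 THEN 'Medium Spender'
                ELSE 'Small Spender'
           END AS segment
    FROM customers
    """
).fetchall()
print(rows)  # [('Ann', 'Big Spender'), ('Bob', 'Medium Spender'), ('Cat', 'Small Spender')]
```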

Allows a query to map various values in a column to other values

CASE WHEN

________________________ is a function that allows a query to map various values in a column to other values

CASE WHEN

Create a database named CustomerDB. Fill in the blank to complete the query in order to accomplish the task. _______________________________ CustomerDB

CREATE DATABASE CustomerDB

You need to create a database named PizzaOrderingDB. Fill in the blank to complete the query in order to accomplish the task. __________________________________ PizzaOrderingDB

CREATE DATABASE PizzaOrderingDB;

Functions in SQL are contained sections of code, which provides a lot of benefits, such as efficient code reuse and simplified troubleshooting. We can use functions to repeat or modify statements or queries without re-entering the statement each time or searching for its use throughout longer code segments. One of the most powerful aspects of functions is that they allow us to break code into smaller, testable chunks. As the popular computer science expression goes, "If the code is not tested, it cannot be trusted." How do we define functions in SQL?

CREATE FUNCTION some_function_name (function_arguments)
RETURNS return_type AS $return_name$
DECLARE return_name return_type;
BEGIN
    <function statements>;
    RETURN <some_value>;
END;
$return_name$ LANGUAGE PLPGSQL;
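Filling in the template, a concrete example might look like this (the function name, argument, and customers table are hypothetical):

```sql
-- Hypothetical example following the template above.
CREATE FUNCTION count_customers_in_state (state_code TEXT)
RETURNS INT AS $total$
DECLARE total INT;
BEGIN
    SELECT COUNT(*) INTO total FROM customers WHERE state = state_code;
    RETURN total;
END;
$total$ LANGUAGE PLPGSQL;
```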

Create an index named I_EEName, which is indexing the lastname and then firstname columns of the Employees table by using the CREATE INDEX index_name ON Employees (column1, column2, ...); command in Psql Editor

CREATE INDEX I_EEName ON Employees (lastname, firstname);

In this exercise, we will create a table using the CREATE TABLE statement. The marketing team at ZoomZoom would like to create a table called countries to analyze the data of different countries. It should have four columns: an integer column, a unique name column, a founding year column, and a capital column

CREATE TABLE countries ( key INT PRIMARY KEY, name text UNIQUE, founding_year INT, capital text );

Say you wanted to create a table based on the products table that only had products from the year 2014. Let's call this table products_2014. You could then write the following query:

CREATE TABLE products_2014 AS ( SELECT * FROM products WHERE year=2014 );
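CREATE TABLE ... AS SELECT works the same way in SQLite, so the products_2014 idea can be verified with a small script (the product rows are made up):

```python
import sqlite3

# CREATE TABLE ... AS SELECT, demonstrated on a made-up products table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INT, model TEXT, year INT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, 'Lemon', 2013), (2, 'Bat', 2014), (3, 'Lemon Limited', 2014)],
)

# Snapshot only the 2014 products into a new table
conn.execute("CREATE TABLE products_2014 AS SELECT * FROM products WHERE year = 2014")

count = conn.execute("SELECT COUNT(*) FROM products_2014").fetchone()[0]
print(count)  # 2
```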

Consider the following Animals table: ***Shows table with columns ID, Species, Name, Notes, Exhibit, Weight, Gender, LastWeighDate Create a view called LargeAnimals that only includes Name and Weight for all animals over 100 from the Animals table

CREATE VIEW LargeAnimals AS SELECT Name, Weight FROM Animals WHERE Weight > 100;

Create a view in Psql Editor that contains the following information: For the given customer table, use the appropriate CREATE VIEW V_Customer AS SELECT FROM Customer; command so that it concatenates the FirstName, with a blank space and the LastName fields to create an attribute CustomerName. Show all other columns in the view as they are. Hint: Use this command to combine first name and last name: CONCAT(FirstName, ' ', LastName) AS CustomerName

CREATE VIEW V_CUSTOMER AS SELECT CustomerID, CONCAT(FirstName, ' ', LastName) AS CustomerName, Street, Apt, City, State, Zip, HomePhone, MobilePhone, OtherPhone FROM Customer;

Returns every possible combination of rows from the first table and the second table

CROSS JOIN

Which statement is true about joins?
- RIGHT JOIN fetches rows based on condition
- CROSS JOIN creates cartesian product between two tables
- LEFT JOIN fetches only rows based on the condition

CROSS JOIN creates cartesian product between two tables

Delete the rows in the MenuItems table where Popularity is less than 3 and TakeOut is equal to Y.

DELETE FROM MenuItems WHERE Popularity < 3 AND TakeOut = 'Y';

Eliminates all duplicate records and fetches only unique records

DISTINCT

Often, when looking through a dataset, you may be interested in determining the unique values in a column or group of columns. This is the primary use case of the ________________ keyword

DISTINCT

Which PostgreSQL function should be used to locate duplicate data?
- NULLIF
- DISTINCT
- LEAST
- COALESCE

DISTINCT

_____________________ allows you to ensure that only one row is returned where one or more columns are always unique in the set.

DISTINCT ON

Delete a database named CustomerDB. Fill in the blank by dragging the correct command from the bottom to complete the query.

DROP DATABASE CustomerDB;

How do you remove all existing indexes?

DROP INDEX <index name>; -- run once per index (PostgreSQL also accepts a comma-separated list of index names)

Delete the MEATS table completely

DROP TABLE MEATS;

If we wanted to delete all of the data in the customers table along with the table itself, we would write:

DROP TABLE customers;

Delete the daily_diet_plan table completely.

DROP TABLE daily_diet_plan;

Deleting Tables To delete the table along with the data completely, you can just use the DROP TABLE statement with the following syntax:

DROP TABLE {table_name};

What does ETL stand for?

ETL is short for extract, transform, load

Extract, Transform, and Load (ETL)

ETL stands for Extract, Transform and Load, which is a process used to collect data from various sources, transform the data depending on business rules/needs and load the data into a destination database.

Displays the plan for a query before it is executed

EXPLAIN

The __________________ command displays the plan for the query before it is executed.

EXPLAIN ** When we use the EXPLAIN command in combination with a SQL statement, the SQL interpreter will not execute the statement, but rather return the steps that are going to be executed (a query plan) by the interpreter in order to return the desired results. There is a lot of information returned in a query plan and being able to comprehend the output is vital in tuning the performance of our database queries.

In a relational database, relations are usually implemented as tables

Each row of the table is a tuple, and the attributes are represented as columns of the table.
While not technically required, most tables in a relational database have a column referred to as the primary key.
Every column also has a data type, which describes the data for the column.

What is an example of the ABC sequential aliasing convention?

FROM customers AS a INNER JOIN orders as b

What is an example of the First Letter aliasing convention?

FROM customers AS c INNER JOIN orders AS o

Tables are usually referred to in queries in the format [schema].[table]

For example, a product table in the analytics schema would be generally referred to as analytics.product

_______________________ is a clause that divides the rows of a dataset into multiple groups based on some sort of key specified in the clause. An aggregate function is then applied to all the rows within a single group to produce a single number.

GROUP BY *** the GROUP BY key and the aggregate value for the group are then displayed in the SQL output

The _______________________ clause divides the rows of a dataset into multiple groups based on some sort of key. An ______________ function is then applied to all rows within a single group to produce a single number

GROUP BY aggregate

The B-tree index is a type of binary search tree and is characterized by the fact that it has a self-balancing structure, maintaining its own data structure for efficient searching. In a generic binary tree, each node has no more than two children (thus providing balance). B-trees generalize this: each node is limited to n elements, which forces a split into child nodes once that limit is reached. The branches of the tree terminate at leaf nodes, which, by definition, have no children.

Girl you know what a binary search tree is lmao

What are the three most common transformations in ETL processes?

I don't even know!! lol

Which operator is used in a WHERE clause when the subquery returns more than one row of data?

IN
ALL
ANY

Returns only the records that have matching values in the joined field of two tables

INNER JOIN

**** Please look back at Figure 3.3 on page 25! Really helpful venn diagram figure that shows how these things look

INNER JOIN -- selects records that have matching values in both tables
LEFT JOIN -- returns all rows in the left table (A) and all the matching rows found in the right table (B)
RIGHT JOIN -- returns all rows in the right table (B) and all the matching rows found in the left table (A)
FULL JOIN -- returns all the rows from both tables, whether they are matched or not. Essentially, a full join combines the functions of left join and right join. For matching rows, the two are joined into a group
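The difference between INNER and LEFT joins can be seen on two toy tables (runnable in SQLite; all names and IDs are invented):

```python
import sqlite3

# Toy tables to contrast INNER JOIN and LEFT JOIN; all data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INT, name TEXT)")
conn.execute("CREATE TABLE orders (order_id INT, customer_id INT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, 'Ann'), (2, 'Bob'), (3, 'Cat')])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)])

# INNER JOIN: only customers with at least one matching order appear
inner = conn.execute(
    "SELECT c.name, o.order_id FROM customers c "
    "INNER JOIN orders o ON c.customer_id = o.customer_id"
).fetchall()
print(len(inner))  # 3

# LEFT JOIN: every customer appears; Cat has no orders, so order_id is NULL (None)
left = sorted(conn.execute(
    "SELECT c.name, o.order_id FROM customers c "
    "LEFT JOIN orders o ON c.customer_id = o.customer_id"
).fetchall())
print(left)  # [('Ann', 10), ('Ann', 11), ('Bob', 12), ('Cat', None)]
```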

Which three events can activate a trigger?

INSERT UPDATE DELETE

Consider the following table structure of Albums: *** Shows table with columns Artist, Name, Cost, Purchased Insert an album by Artist 3 you purchased on August 10, 2019 named "Reality All Stars" that costs $17.50

INSERT INTO Albums (Artist, Name, Cost, Purchased) VALUES (3, 'Reality All Stars', 17.50, '2019-08-10');

Consider the following garage table: *** shows table with columns named make, model, and year The for_sale table structure is the same as the garage table. You have to copy all of the "Alfa Romeo" cars from the garage table to the for_sale table

INSERT INTO for_sale (make, model, year)
SELECT make, model, year FROM garage WHERE make = 'Alfa Romeo';

Consider the following garage table structure: *** shows table with columns named make, model, and year Enter a new car in the garage table, a 1971 Chevrolet Camaro

INSERT INTO garage (make, model, year) VALUES ('Chevrolet', 'Camaro', 1971);

Consider the table named widgets. *** Shows table with columns Field, Type, NULL, and Key Insert an order for "Mary Wills" for 3 widgets at a total of $33.33

INSERT INTO widgets VALUES (3, 'Mary Wills', 33.33);

pg_cron can run several jobs in parallel, but only one instance of a program can be running at a time

If a second run should be started before the first one finishes, then it is queued and will be started as soon as the first run completes

If we wanted to return product_id column followed by the model column of the products table, we would write the following query: SELECT product_id, model FROM products;

If we wanted to return the model column first and the product_id column second, we would write this: SELECT model, product_id FROM products;

What are four characteristics of dirty data?
- Inaccurate
- Inconsistent
- Duplicate
- Standardized
- Integrity
- Incomplete
- Ordered

Inaccurate
Inconsistent
Duplicate
Incomplete

What are the benefits of using stored procedures?

Increased execution speed
Safe execution via transactions
Modularity

**** Go back and review page 47 too right before the test...

It's basically just going over jupyter notebooks and why pairing SQL with Python for database management is a good idea

The ___________ clause is a SQL clause that allows a user to join one or more tables together based on distinct conditions

JOIN

What are Outer Joins?

Left Join - returns all records from the left table along with any matching records from the right table
Right Join - returns all records from the right table along with any matching records from the left table

Left vs. Right Joins

Left outer joins are far more prevalent in practice
Can be easier to read and interpret

What are the downsides to SQL databases?

Lower specificity
Limited scalability
Object-relational impedance mismatch

Returns the maximum value in columnX

MAX(columnX)

Returns the minimum value in columnX. For text columns, it returns the value that would appear first alphabetically.

MIN(columnX)

Which of the following are aggregate functions used in SQL?
- TRIM(string)
- MIN(columnX)
- TRANSLATE(string, characters, translations)
- COUNT(columnX)
- REGR_SLOPE(columnX, columnY)

MIN(columnX)
COUNT(columnX)
REGR_SLOPE(columnX, columnY)

What is Set Theory?

Mathematical concept. A set is a collection of elements that is:
- unduplicated
- unordered
Object-to-set membership

*** Results of a query can be considered a set. The only rules are: SQL sets must have the same number of columns, and SQL set columns have to be the same data type

The ________________ constraint guarantees that no value in a column can be null

NOT NULL

Column constraints are keywords that give special properties to a column. Some major column constraints are:

NOT NULL: This constraint guarantees that no value in a column can be null
UNIQUE: This constraint guarantees that every single row for a column has a unique value and no value is repeated
PRIMARY KEY: This is a special constraint that is unique for each row and helps to find the row quicker. Only one column in a table can be a primary key

___________________ is, in a sense, the opposite of COALESCE.

NULLIF

What are the major SQL data types?

Numeric - numeric data types are data types that represent numbers
Character - character data types store text information
Boolean - boolean data types are used to represent True or False
Datetime - the datetime data type is used to store time-based information such as dates and times

In addition to everything mentioned so far, it is also a good idea to set up a .pgpass file. A .pgpass file specifies the parameters that you use to connect to your database, including your password. All of the programmatic methods of accessing the database discussed in this lesson (using psql, R, and Python) will allow you to skip the password parameter if your .pgpass file contains the password for the matching hostname, database, and username

On UNIX-based systems and macOS, you can create the .pgpass file in your home directory. On Windows, you can create the file at %APPDATA%\postgresql\pgpass.conf. The file should contain one line for every database connection that you want to store, and it should follow this format (customized for your database parameters):

hostname:port:database:username:password

What are three examples of dirty data?
- Current information
- Unique records
- Outdated information
- Duplicated records
- Misspelled words
- Range constrained fields

Outdated information
Duplicated records
Misspelled words

The ______________________________ constraint is unique for each row and helps to find the row quicker

PRIMARY KEY

Returns the intercept of linear regression for columnX as the response variable, and columnY as the predictor variable

REGR_INTERCEPT(columnX, columnY)

Returns the slope of linear regression for columnX as the response variable, and columnY as the predictor variable

REGR_SLOPE(columnX, columnY)

What is a Full Join?

Returns all rows from two or more tables regardless of whether the join condition is met If no match, the missing side will contain null

Inner Join

Returns all rows from two or more tables that meet the join condition. Joined fields must exist in both tables. Only those rows that exist in both Table A and Table B will be selected.

SELECT customers.*, orders.* FROM customers INNER JOIN orders ON customers.customer_id = orders.customer_id;

*** ON specifies the primary key and foreign key relationship here

What does UNION do?

Returns results from multiple queries as one result set
UNION - returns a distinct list of rows
UNION ALL - returns all rows from both tables (including duplicates)
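The distinction shows up clearly on two tiny tables (runnable in SQLite; the values are made up):

```python
import sqlite3

# UNION removes duplicates across the combined result; UNION ALL keeps them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INT)")
conn.execute("CREATE TABLE b (x INT)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

union = conn.execute("SELECT x FROM a UNION SELECT x FROM b ORDER BY x").fetchall()
print(union)      # [(1,), (2,), (3,)] -- the duplicate 2 collapses into one row

union_all = conn.execute("SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x").fetchall()
print(union_all)  # [(1,), (2,), (2,), (3,)] -- both 2s survive
```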

Retrieves zero or more rows from one or more database tables or database views

SELECT

The most common operation in a database is reading data from a database. This is almost exclusively done through the use of the _________________ keyword

SELECT

A right outer join is very similar to a left join, except the table on the "right" (the second listed table) will now have every row show up, and the "left" table will have NULLs if the join condition is not met. To illustrate, let's "flip" the last query by right-joining the emails table to the customers table with the following query:

SELECT * FROM emails e RIGHT JOIN customers c ON e.customer_id = c.customer_id ORDER BY c.customer_id LIMIT 1000;

Here's an example of using the equal's condition in the join predicate:

SELECT * FROM salespeople INNER JOIN dealerships ON salespeople.dealership_id = dealerships.dealership_id ORDER BY 1;

Consider the following garage table: *** Shows table with three columns called make, model, and year Select all the "Ferrari" cars from the garage table.

SELECT * FROM garage WHERE make='Ferrari';

Example of COUNT being used to return the number of rows without a NULL value in the column:

SELECT COUNT(customer_id) FROM customers;

Let's say if you wanted to get a unique list of salespeople where each salesperson has a unique first name. In the case that two salespeople have the same first name, we will return the one that started earlier. This query would look like this:

SELECT DISTINCT ON (first_name) * FROM salespeople ORDER BY first_name, hire_date;

For example, if you wanted to know all the unique model years in the products table, you could use the following query:

SELECT DISTINCT year FROM products ORDER BY 1;

Consider the following Animals table with columns ID, Species, Name, Notes, Exhibit, Weight, Gender, LastWeighDate List all the names and weights of animals in species 2, but list them by weights in reverse numerical order

SELECT Name, Weight FROM Animals WHERE Species = 2 ORDER BY Weight DESC;

Consider the following Animals table: *** Shows table with columns ID, Species, Name, Notes, Exhibit, Weight, Gender, and LastWeighDate Select just the Name, Weight, and Gender from the Animals table

SELECT Name, Weight, Gender FROM Animals;

Consider the following Places table with columns Continent, Country, City, Region List all the regions that have more than two cities in them

SELECT Region FROM Places GROUP BY Region HAVING COUNT(City)>2;
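The same GROUP BY ... HAVING pattern runs in SQLite; here is a sketch with invented rows where only one region has more than two cities:

```python
import sqlite3

# GROUP BY ... HAVING on a made-up Places table, mirroring the card above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Places (Continent TEXT, Country TEXT, City TEXT, Region TEXT)")
conn.executemany(
    "INSERT INTO Places VALUES (?, ?, ?, ?)",
    [
        ('Europe', 'France', 'Paris', 'West'),
        ('Europe', 'France', 'Lyon', 'West'),
        ('Europe', 'France', 'Nice', 'West'),
        ('Europe', 'Spain', 'Madrid', 'South'),
    ],
)
# Only regions whose group contains more than two cities survive the HAVING filter
rows = conn.execute(
    "SELECT Region FROM Places GROUP BY Region HAVING COUNT(City) > 2"
).fetchall()
print(rows)  # [('West',)]
```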

Consider the following Buildings table: *** Shows table with columns id, building_name, address, monthly_rent Select the building_name and monthly_rent from the Buildings table

SELECT building_name, monthly_rent FROM Buildings;

Here's an example of using the CASE WHEN function to get regional lists

SELECT c.customer_id,
    CASE WHEN c.state IN ('MA', 'NH', 'VT', 'ME', 'CT', 'RI') THEN 'New England'
         WHEN c.state IN ('GA', 'FL', 'MS', 'AL', 'LA', 'KY', 'VA', 'NC', 'SC', 'TN', 'VI', 'WV', 'AR') THEN 'Southeast'
         ELSE 'Other'
    END AS region
FROM customers c
ORDER BY 1;

Example of Full Outer Join

SELECT c.first_name, c.last_name, o.order_date, o.order_amount FROM customers AS c FULL OUTER JOIN orders AS o ON c.customer_id = o.customer_id;

Consider the following Services table: *** Shows table with three columns called id, company, and monthly_fee Select the company and monthly_fee from the Services table.

SELECT company, monthly_fee FROM Services;

Example INTERSECT

SELECT curr.* FROM customers_current AS curr INTERSECT SELECT old.* FROM customers_archive old ORDER BY last_name;

*** INTERSECT operator returns the results of the two sets when the result appears in both sets (so it is essentially an identical record in both tables)

Inner Joins are the most common types of Joins. Here are some other ways of doing an inner join:

SELECT example_column, example_column2 FROM customers JOIN orders ON customers.customer_id = orders.customer_id;

SELECT example_column, example_column2 FROM customers, orders WHERE customers.customer_id = orders.customer_id;

Consider the following garage table with columns make, model, and year List all the makes that have more than 3 models in them

SELECT make from garage GROUP BY make HAVING count(DISTINCT(model))>3;

Consider the following garage table with columns make, model, and year Write a query to retrieve the make and do a count of all models of each make with model year after 1950

SELECT make, COUNT(model) FROM garage WHERE year > 1950 GROUP BY make;

Let's say you want to see all of the products listed by the date when they were first produced, from earliest to latest. The method for doing this in SQL would be as follows:

SELECT model FROM products ORDER BY production_start_date;

Conversely, you can use NOT IN to return all values that are not in a list of values. For instance, if we wanted all products that were not produced in the years 2014, 2016, and 2019, we could write:

SELECT model FROM products WHERE year NOT IN (2014, 2016, 2019);

Let's say we wanted to return models that not only were built in 2014, but also have a manufacturer's suggested retail price (MSRP) of less than $1,000. We can write:

SELECT model FROM products WHERE year=2014 AND msrp<=1000;

Now, let's say that we wanted to return models that were released in the year 2014 or had a product type of automobile. We would then write the following query:

SELECT model FROM products WHERE year=2014 OR product_type='automobile';

Write a query in the Psql Editor that joins (Inner Join) the three tables given below and meets the following requirements: - Retrieve PizzaOrderID, LastName, PizzaName, OrderQty, Pizzaprice from the following tables - Use the table name alias as o for uc_pizzaorder, p for uc_pizza, and c for uc_customer tables

SELECT o.PizzaOrderID, c.LastName, p.PizzaName, o.OrderQty, p.PizzaPrice FROM uc_pizzaorder AS o INNER JOIN uc_pizza AS p ON p.PizzaID = o.PizzaID INNER JOIN uc_customer AS c ON c.CustomerID = o.CustomerID; **** study this one using whiteboard and memorize

Alternatively, you can also put the AS keyword between the table name and the alias to make the alias more explicit

SELECT s.* FROM salespeople AS s INNER JOIN dealerships AS d ON d.dealership_id = s.dealership_id WHERE d.state = 'CA' ORDER BY 1;

_____________ is the language utilized by users of an RDBMS to access and interact with a relational database

SQL

According to Forbes, it is estimated that almost 80% of the time spent by analytics professionals involves preparing data for use in analysis. Building models with unclean data harms the analysis by leading to poor conclusions.

SQL can help in this tedious but important task, by providing ways to build datasets which are clean, in an efficient manner.

Returns the sample standard deviation of all values in columnX

STDDEV(columnX)

Returns the sum of all values in columnX

SUM(columnX)

LIMIT clause

Sets an upper limit on the number of tuples returned by a SQL query

FROM clause

Specifies the database tables or views involved in the SELECT statement

What are stored procedures?

Stored procedures are functions that are given the added capability of transactions. This is extremely powerful as it allows you to define pre-compiled workflows that can be safely committed and rolled back. Not to mention, stored procedures are fast because they consist of pre-compiled SQL that the database engine can immediately execute.

SQL functions are subprograms that can be called repeatedly throughout SQL database applications for processing or manipulating data
- FALSE
- TRUE

TRUE

Alternatively, if you want to delete all the data in a query without deleting the table, you could use the TRUNCATE keyword as follows:

TRUNCATE TABLE customers;

Delete all the data from the funds table but leave the structure intact.

TRUNCATE TABLE funds;

ETL Process Step 1 -- Extraction

The extraction step of an ETL process involves connecting to the source systems, and both selecting and collecting the necessary data needed for analytical processing within the data warehouse or data mart. Usually data is consolidated from numerous, disparate source systems that may store the data in a different format. Thus, the extraction process must convert the data into a format suitable for transformation processing. The complexity of the extraction process may vary and it depends on the type and amount of source data.

What is ETL?

The extraction, transformation, and loading process is a batch or scheduled data integration process that includes extracting data from their operational or external data sources, transforming the data into an appropriate format, and loading the data into a warehouse repository.

Model the entities that result from normalizing the model from 1st normal form to 2nd normal form to fulfill all the requirements for 1st normal form and also these requirements:
- Redundant data appearing in multiple rows must be moved to separate tables
- Resulting tables must be related through foreign keys
*** NOTE: To simplify the complexity of the task, a customer is restricted to ordering a single pizza at a time, and the records in the order table are transient.
The "Customer Pizza Order" table has the following columns: Pizza Order ID, Last Name, First Name, Street, Apartment, City, State, Zip Code, Home Phone, Mobile Phone, Other Phone, Qty of Pizzas, Special Handling Notes, Pizza ID, Pizza Name, Pizza Description, Pizza Size, Pizza Price, Pizza Photo URL, PizzaOrderTimestamp

The following shows the 2NF version of the previously 1NF table (note: the single table has been split into two tables called "Pizza" and "Customer Pizza Order"). Table "Pizza": Pizza ID, Pizza Name, Pizza Description, Pizza Size, Pizza Price, Pizza Photo URL. Table "Customer Pizza Order": Pizza Order ID, Last Name, First Name, Street, Apartment, City, State, Zip Code, Home Phone, Mobile Phone, Other Phone, Qty of Pizzas, Special Notes, Pizza ID, PizzaOrderTimestamp

Installation of pg_cron

The installation of this extension only requires the following command: sudo apt-get -y install postgresql-10-cron

Which statement accurately describes how jobs are scheduled in PostgreSQL?

The pgAgent is a job scheduling agent available for PostgreSQL

Index scans are one method of improving the performance of our database queries. Index scans differ from sequential scans in that a pre-processing step is executed before the search of database records can occur.

The simplest way to think of an index scan is just like the index of a text or reference book

ETL Process Step 2 -- Transformation

The transformation step of an ETL process applies a series of rules or functions to the extracted data to convert it to a standard format. It includes validating records and rejecting them if they are not acceptable. The amount of manipulation needed for the transformation process depends on the data. Good data sources require little transformation, whereas others may require one or more transformation techniques to meet the business and technical requirements of the target database or the data warehouse. The most common processes used for transformation are conversion, removing duplicates, standardizing, filtering, sorting, translating, and looking up or verifying whether the data sources are inconsistent.

What are the Set Operations?

These operations combine the results of multiple queries into a single result set - UNION - INTERSECT - EXCEPT
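
A tiny runnable demo helps here; this sketch uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (the three operators behave the same in standard SQL), and the stores/warehouses tables and city values are invented for illustration:

```python
import sqlite3

# Invented example data: cities that have stores vs. cities that have warehouses.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE stores (city TEXT)")
cur.execute("CREATE TABLE warehouses (city TEXT)")
cur.executemany("INSERT INTO stores VALUES (?)",
                [("Austin",), ("Boston",), ("Chicago",)])
cur.executemany("INSERT INTO warehouses VALUES (?)", [("Boston",), ("Denver",)])

# UNION: rows appearing in either result set (duplicates removed).
union_rows = [r[0] for r in cur.execute(
    "SELECT city FROM stores UNION SELECT city FROM warehouses ORDER BY city")]

# INTERSECT: only rows appearing in both result sets.
intersect_rows = [r[0] for r in cur.execute(
    "SELECT city FROM stores INTERSECT SELECT city FROM warehouses")]

# EXCEPT: rows in the first result set but not the second.
except_rows = [r[0] for r in cur.execute(
    "SELECT city FROM stores EXCEPT SELECT city FROM warehouses ORDER BY city")]
```

All three operators require the queries on each side to return the same number of columns with compatible types.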

Of the following statements, which is true regarding the SQLAlchemy Engine object? - It contains information about the type of connection and a script engine - It automatically commits changes to the database due to CREATE TABLE, UPDATE, INSERT, or other modification statements - It provides a nice interface that works well with other packages (such as pandas). - It maximizes the time of the connection and reduces the load on the database

These two are true: - It automatically commits changes to the database due to CREATE TABLE, UPDATE, INSERT, or other modification statements - It provides a nice interface that works well with other packages (such as pandas).

Creation of pg_cron extension

This extension will create the metadata and the procedures to manage it, so the following command should be executed in psql: postgres=# CREATE EXTENSION pg_cron; CREATE EXTENSION

The datetime data type is used to store time-based information such as dates and times. The following are some of the datetime data types:

timestamp without time zone -- 8 bytes, both date and time (no time zone)
timestamp with time zone -- 8 bytes, both date and time, with time zone
date -- 4 bytes, date (no time of day)
time without time zone -- 8 bytes, time of day (no date)
interval -- 16 bytes, time interval

Which event you could possibly write a trigger for, for auditing purposes? - To audit system tables - To audit drop on specific table - To audit select on specific table - To audit update on specific table

To audit update on specific table

A result of LEFT OUTER join can be seen as INNER JOIN + unmatched rows of left table while the result of the RIGHT OUTER join is equal to INNER JOIN + unmatched rows from the right side table - True - False

True

The -d flag is used to specify the database name. True or False?

True

The string that comes after the -h flag should be the hostname of the database. True or false?

True

Booleans are a data type used to represent True or False. What are the boolean data types?

True -- t, true, y, yes, on, 1
False -- f, false, n, no, off, 0
*** While all of these values are accepted, the values True and False are considered compliant with best practice. Boolean columns can also have NULL values

What is the most common Set operator?

UNION

To illustrate the use of the update statement, let's say that for the rest of the year, the company has decided to sell all scooter models before 2018 for $299.99. We could change the data in the products table using the following query:

UPDATE products SET base_msrp = 299.99 WHERE product_type = 'scooter' AND year < 2018;

Change all menu items DateAdded dates to 2019-08-10 in the MenuItems table.

UPDATE MenuItems SET DateAdded = '2019-08-10';

Returns the sample variance of all values in columnX

VAR_SAMP(columnX) -- also available as VARIANCE(columnX); PostgreSQL has no plain VAR() aggregate

The _____________ clause is a piece of conditional logic that limits the amount of data returned.

WHERE All of the rows returned in a SELECT statement with a WHERE clause in it meet the conditions of the WHERE clause. The WHERE clause can usually be found after the FROM clause of a SELECT statement

Common table expressions establish temporary tables by using the WITH clause. For example SELECT * FROM salespeople INNER JOIN ( SELECT * FROM dealerships WHERE dealerships.state = 'CA' ) d ON d.dealership_id = salespeople.dealership_id ORDER By 1 could also be written as

WITH d as ( SELECT * FROM dealerships WHERE dealerships.state = 'CA' ) SELECT * FROM salespeople INNER JOIN d ON d.dealership_id = salespeople.dealership_id ORDER BY 1;
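
To check that the two forms really are equivalent, here is a minimal sketch using Python's built-in sqlite3 module (standing in for PostgreSQL; the dealership and salespeople rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE dealerships (dealership_id INTEGER, state TEXT)")
cur.execute("CREATE TABLE salespeople (salesperson_id INTEGER, dealership_id INTEGER)")
cur.executemany("INSERT INTO dealerships VALUES (?, ?)", [(1, "CA"), (2, "NY")])
cur.executemany("INSERT INTO salespeople VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)])

# Version 1: filter dealerships in an inline subquery.
subquery_rows = cur.execute("""
    SELECT salespeople.salesperson_id
    FROM salespeople
    INNER JOIN (SELECT * FROM dealerships WHERE state = 'CA') d
        ON d.dealership_id = salespeople.dealership_id
    ORDER BY 1""").fetchall()

# Version 2: the same filter expressed as a common table expression.
cte_rows = cur.execute("""
    WITH d AS (SELECT * FROM dealerships WHERE state = 'CA')
    SELECT salespeople.salesperson_id
    FROM salespeople
    INNER JOIN d ON d.dealership_id = salespeople.dealership_id
    ORDER BY 1""").fetchall()
```

Both queries return the same rows; the CTE just moves the temporary table definition up front, which tends to read better as queries grow.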

Using R allows us to take an approach to connect to our database directly in our analytics code. But why use R?

While we have managed to perform aggregate-level descriptive statistics on our data using pure SQL, R allows us to perform other statistical analysis, including machine learning, regression analysis, and significance testing. R also allows us to create data visualizations that make trends clear and easier to interpret. R has arguably more statistical functionality than just about any other analytics software available

Use the \d [tablename] command to display the following table structures: *** Shows three tables named Customer, Donut, and CustomerDonutOrder

\d Customer; \d Donut; \d CustomerDonutOrder;

Use the \d command to display the list of tables and \d [tablename] command to display the table structure: ***shows two tables named Customer and Pizza

\d Customer; \d Pizza;

The ______________ command in PostgreSQL is used to get a list of functions available in memory, while the _________________ command in PostgreSQL is used to review the function definition for already-defined functions

\df \sf

An ________________ is a temporary label in a query

alias

Character data types store text information. What are the character data types?

character varying(n), varchar(n) -- variable length with limit character(n), char(n) -- fixed length, blank padded text -- variable unlimited length ** Under the hood, all of the character data types use the same underlying data structure in PostgreSQL and many other SQL databases, and most modern developers do not use char(n)

For Unix and Mac users, you will need to change permissions on the file using the following command on the command line (in the Terminal):

chmod 0600 ~/.pgpass

What does CRUD stand for?

create, read, update, and delete

What does the term ETL stand for?

extraction, transformation, and loading aka: extract, transform, load

ALL

keyword includes duplicate rows

Joins where at least one table will be represented in every row after the join operation are known as ______________ joins

outer

Causes the interpreter to end the query specified by the process ID (pid)

pg_cancel_backend

Which statement is true about scheduling jobs in Postgres? - pgadmin scheduler runs in Mac operation systems - Cron runs in Windows operating systems - pg_cron scheduler runs inside a Postgres database - Task scheduler runs in Linux operating systems

pg_cron scheduler runs inside a Postgres database

The software used to manage relational databases on a computer is referred to as a _____________________________________________________________________

relational database management system (RDBMS)

DISTINCT

removes all duplicates

The process that the SQL server uses to search through a database is known as ________________

scanning

The hash index only recently gained stability as a feature within PostgreSQL, with previous versions issuing warnings that the feature is unsafe and reporting that the method is typically not as performant as B-tree indexes. At the time of writing, the hash index feature is relatively limited in the comparative statements it can run, with equality (=) being the only one available. So, given that the feature is only just stable and somewhat limited in options for use, why would anyone use it? Well, hash indices are able to describe large datasets (on the order of tens of thousands of rows or more) using very little data, allowing more of the data to be kept in memory and reducing search times for some queries. This is particularly important for databases that are at least several gigabytes in size. The following example shows how to create a hash index:

sqlda=# CREATE INDEX ix_gender ON customers USING HASH(gender);

The types of tasks that the pg_cron scheduler could be any of the following ones: - stored procedures - SQL statements - PostgreSQL commands (as VACUUM, or VACUUM ANALYZE)

Yep, those are all valid pg_cron task types: stored procedures, SQL statements, and PostgreSQL commands such as VACUUM.

Returns the average of all values in columnX

AVG(columnX)

The ________________ command efficiently transfers data from a database to a file, or from a file to a database. It retrieves data from the database and dumps it in the chosen file format. It can be called remotely using the _____________ command-line program.

COPY psql

COPY moves data between PostgreSQL tables and standard file-system files. COPY TO copies the contents of a table to a file COPY FROM copies data from a file to a table (appending data to whatever is in the table already)

COPY TO can also copy the results of a SELECT query

Calculates the Pearson correlation between columnX and columnY in the data

CORR(columnX, columnY)

Counts the number of rows in the output table

COUNT(*)

We assemble data using ____________ and _______________

JOINs and UNIONs

There are several options to configure COPY and \copy commands:

- FORMAT format_name can be used to specify the format. The options for format_name are csv, text, or binary. Alternatively, you can simply specify CSV or BINARY without the FORMAT keyword, or not specify the format at all and let the output default to a text file format
- DELIMITER 'delimiter_character' can be used to specify the character that separates columns within each row (for example, ',' for a CSV file)
- NULL 'null_string' can be used to specify how null values should be represented (for example, ' ' if blanks represent null values, or 'NULL' if that's how missing values should be represented in the data)
- HEADER specifies that the header should be output
- QUOTE 'quote_character' can be used to specify how fields with special characters (for example, a comma in a text value within a CSV file) can be wrapped in quotes so that they are ignored by COPY
- ESCAPE 'escape_character' specifies the character that can be used to escape the following character
- ENCODING 'encoding_name' allows specification of the encoding, which is particularly useful when you are dealing with foreign languages that contain special characters or user input

Extracting data using the SELECT command directly from the database executes a sequential scan, where the database traverses through each record in the database and compares each record against the search criteria, returning those records that match. This is essentially a brute-force scan and, thus, can always be called upon to execute a search. The SELECT command retrieves zero or more rows from one or more database tables or database views. In many situations, a sequential scan is also often the most efficient method that will be automatically selected by the SQL server. This is particularly the case if any of the following is true:

- The table is quite small. For instance, it may not contain a large number of records - The field used in searching contains a high number of duplicates - The planner determines that the sequential scan would be equally efficient or more efficient for the given criteria than any other scan

The COPY and \copy commands are far more efficient at uploading data than an INSERT statement. There are a few reasons for this:

- When using COPY, there is only one commit, which occurs after all the rows have been inserted - There is less communication between the database and the client, so there is less network latency - Postgres includes optimizations for COPY that would not be available through INSERT

What does this query do? SELECT first_name FROM customers WHERE state='AZ' ORDER BY first_name

1. We start with the customers table.
2. The customers table is filtered to where the state column equals 'AZ'.
3. We capture the first_name column from the filtered table.
4. The first_name column is then ordered in alphabetical order.
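
Those steps can be watched end to end with a small sketch in Python's built-in sqlite3 module (a stand-in for PostgreSQL; the customer names are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (first_name TEXT, state TEXT)")
cur.executemany("INSERT INTO customers VALUES (?, ?)",
                [("Zoe", "AZ"), ("Amy", "AZ"), ("Bob", "CA"), ("Mia", "AZ")])

# Filter to AZ rows, keep only first_name, then sort alphabetically.
names = [r[0] for r in cur.execute(
    "SELECT first_name FROM customers WHERE state='AZ' ORDER BY first_name")]
```

Bob is dropped by the WHERE clause, and the remaining names come back in alphabetical order.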

What storage size is the "integer" data type?

4

What storage size is the "real" data type?

4

We are often interested in multiple conditions being met at once. For this, we put multiple statements together using the ___________ or __________ clause.

AND, OR

Another useful technique is to replace NULL values with a standard value. This can be accomplished easily by means of the ____________________ function.

COALESCE
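
A minimal sketch of COALESCE using Python's built-in sqlite3 module (a stand-in for PostgreSQL; the contacts table and placeholder string are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
cur.executemany("INSERT INTO contacts VALUES (?, ?)",
                [("Ana", "555-1234"), ("Ben", None)])

# COALESCE returns the first non-NULL argument, so NULL phone numbers
# are replaced with a standard placeholder value.
phones = [r[0] for r in cur.execute(
    "SELECT COALESCE(phone, 'no phone on file') FROM contacts ORDER BY name")]
```

Ana's real number passes through untouched; Ben's NULL is swapped for the placeholder.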

You need to execute a query to change the "pass_key" column of the blob data type to include the comment "id of employees" in the Pass_keys table

ALTER TABLE Pass_keys ALTER COLUMN pass_key TYPE BYTEA USING pass_key::bytea; COMMENT ON COLUMN Pass_keys.pass_key IS 'id of employees';

You have to add the "created" column in the Pass_keys table that automatically holds the timestamp of when the row was created.

ALTER TABLE Pass_keys add created timestamp default current_timestamp;

Add the "date_added" column in the Places table that automatically holds the timestamp of when the row was created.

ALTER TABLE Places ADD date_added timestamp default current_timestamp;

Change the data of type "Description" to a variable length starting between 1 and 50 characters in length in the Restaurants table. This column should not be NULL.

ALTER TABLE Restaurants ALTER COLUMN Description TYPE CHARACTER VARYING(50), ALTER COLUMN Description SET NOT NULL;

Add a new column to the Things table. It should be called "color_int" and be an integer

ALTER TABLE Things ADD color_int INT;

Delete the "color" column from the Things table because this column is no longer needed.

ALTER TABLE Things DROP color;

Delete the "description" column from the UC_CHEESES table because this column is no longer needed

ALTER TABLE UC_CHEESES DROP description;

Run two DDL commands one after the other to perform the following tasks: 1. Delete the "P_budget" column of the UC_project table 2. Create a "P_cost" column with the data type INT that is required

ALTER TABLE UC_project DROP P_budget; ALTER TABLE UC_project ADD COLUMN P_cost INT NOT NULL DEFAULT 0; ALTER TABLE UC_project ALTER COLUMN P_cost DROP DEFAULT; ** lol this is a weird one. Make sure to memorize this

Add a new column called "day_of_the_week" to the exercise_routine table. The column type should be a char to store 'M', 'T', or 'F'

ALTER TABLE exercise_routine ADD day_of_the_week char;

Run two DDL commands one after the other. The first one should delete the "exercise_name" column of the exercise_routine table. The second DDL command should create the "exercise_name" column again, but this time with the correct VARCHAR(45) data type.

ALTER TABLE exercise_routine DROP exercise_name;ALTER TABLE exercise_routine ADD exercise_name VARCHAR(45);

Let's say, for example, we wanted to add a new column to the products table that we will use to store the products' weight in kilograms called weight. We could do this by using the following query:

ALTER TABLE products ADD COLUMN weight INT;

Let's imagine that you decide to delete the weight column you just created. You could get rid of the column using the query:

ALTER TABLE products DROP COLUMN weight;

Add a new column called "P_budget" to the project table. The column type should be a numeric with a total of 8 digits to store the budget amount in US dollars with the appropriate placeholders for cents.

ALTER TABLE project ADD P_budget NUMERIC(8,2);

To add new columns to an existing table, you can use the ADD statement. It changes the definition of a base table or the schema elements

ALTER TABLE {table_name} ADD COLUMN {column_name} {data_type};
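
The ADD statement can be sketched with Python's built-in sqlite3 module (a stand-in for PostgreSQL; the products table is invented for illustration), checking the result with sqlite's PRAGMA table_info:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE products (product_id INTEGER, model TEXT)")

# Add a new integer column; existing rows get NULL in the new column.
cur.execute("ALTER TABLE products ADD COLUMN weight INTEGER")

# PRAGMA table_info lists one row per column (the column name is field 1).
columns = [row[1] for row in cur.execute("PRAGMA table_info(products)")]
```

The new column is appended after the existing ones.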

If you want to remove a column from a table, you can use the DROP statement. It deletes the elements of the schema, such as tables, domains, types or constraints

ALTER TABLE {table_name} DROP COLUMN {column_name};

Collects statistics about tables and indices and stores them in internal tables

ANALYZE

The _____________________ command collects statistics about tables and indices and stores them in internal tables

ANALYZE

Which PostegreSQL function should be used to locate missing data? - DISTINCT - LEAST - COALESCE - NULLIF

COALESCE

Change the popularity column of INT data type to include the comment "Popularity based on 2019 sales" in the MenuItems table.

COMMENT ON COLUMN MenuItems.popularity IS 'Popularity based on 2019 sales';

Add a comment to the Rating table that says 'Ratings of all movies released in 2019'

COMMENT ON TABLE Rating IS 'Ratings of all movies released in 2019';

Allows users to list any number of columns and scalar values

COALESCE

We have also learned how SQL has a wide variety of functions and keywords that allow users to map new data, fill in missing data, and remove duplicated data. Keywords such as ______________, _________________, ______________, and _________________ allow us to make changes to data quickly and easily

CASE WHEN, COALESCE, NULLIF, and DISTINCT

Add a comment to the exercise_routine table that says 'Exercise routine for January 1 to January 20'. Note: To check to see if the comment has been added to the table, run the following code in the Psql Editor: \d+ [tablename]

COMMENT ON TABLE exercise_routine IS 'Exercise routine for January 1 to January 20';

The ___________________ statement retrieves data from your database and dumps it in the file format of your choosing.

COPY

The _________________ function will return the number of rows without a NULL value in the column.

COUNT

Counts the number of rows in columnX that have a non-NULL value

COUNT(columnX)
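
A quick sketch of COUNT(*) versus COUNT(columnX) with Python's built-in sqlite3 module (a stand-in for PostgreSQL; the people table is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE people (name TEXT, email TEXT)")
cur.executemany("INSERT INTO people VALUES (?, ?)",
                [("Ana", "ana@example.com"), ("Ben", None), ("Cal", None)])

# COUNT(*) counts rows; COUNT(email) skips rows where email IS NULL.
total_rows = cur.execute("SELECT COUNT(*) FROM people").fetchone()[0]
non_null_emails = cur.execute("SELECT COUNT(email) FROM people").fetchone()[0]
```

The gap between the two counts is a handy way to measure how many values are missing in a column.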

To create an index for a set of data, we can use the following syntax:

CREATE INDEX <index name> ON <table name>(table column);
We can also add additional conditions and constraints to make the index more selective:
CREATE INDEX <index name> ON <table name>(table column) WHERE [condition];
We can also specify the type of index:
CREATE INDEX <index name> ON <table name> USING TYPE(table column);
For example:
CREATE INDEX ix_customers ON customers USING BTREE(customer_id);
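
The basic CREATE INDEX form can be sketched with Python's built-in sqlite3 module (sqlite builds B-tree indexes by default and does not support the USING clause, so only the plain form is shown; the customers table is invented for illustration). EXPLAIN QUERY PLAN confirms the planner switches from a full scan to an index search:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
cur.executemany("INSERT INTO customers VALUES (?, ?)",
                [(i, f"customer {i}") for i in range(1000)])

# Same basic CREATE INDEX syntax as in the card above.
cur.execute("CREATE INDEX ix_customers ON customers(customer_id)")

# The query plan now reports an index search instead of a table scan.
plan = " ".join(row[-1] for row in cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM customers WHERE customer_id = 42"))
```

The plan text names the index it uses, which is an easy way to verify an index is actually being picked up.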

Create an index named i_PizzaName, which is indexing the PizzaName column of the Pizza table by using the CREATE INDEX index_name ON Pizza (column1, column2, ...); command in Psql Editor

CREATE INDEX i_PizzaName ON Pizza (PizzaName);

Consider the following tables *** Shows table "Artists" and "Albums". Create a procedure named count_rating to return the count of Albums based on user input of the Rating level of Artists

CREATE OR REPLACE FUNCTION count_rating(var1 integer) RETURNS varchar(64) AS $result$
DECLARE result varchar(64);
BEGIN
SELECT count(*) INTO result FROM Artists a, Albums b WHERE a.Rating = var1 AND a.ID = b.Artist;
RETURN result;
END;
$result$ LANGUAGE plpgsql;

Okay great! Now let's create a stored procedure that will consist of both an UPDATE to the account table to update the total for an account and an INSERT into the transaction_history table

CREATE OR REPLACE PROCEDURE add_funds(IN in_amount INT, IN in_account_id INT)
AS $$
BEGIN
UPDATE account
SET total = total + in_amount
WHERE account_id = in_account_id;

INSERT INTO transaction_history(is_credit, amount, account_id)
VALUES(TRUE, in_amount, in_account_id);

COMMIT;

EXCEPTION WHEN OTHERS THEN
ROLLBACK;
END;
$$ LANGUAGE plpgsql;
****The COMMIT keyword here is used to wrap the insertion and update statements within a transaction. Finally, on any error, ensure that you roll back the transaction! This is a key part of the procedure and ensures that if an update to the account table fails, you don't keep track of the transaction in the transaction_history table.

Stored procedures consume parameters just like functions. These parameters can be specified as either IN or INOUT parameters. To capture output values from the result of calling a stored procedure, you must define your stored procedure to consume an INOUT parameter. Let's create our first stored procedure!

CREATE OR REPLACE PROCEDURE insert_funds(IN amount INT, IN account_id INT)
AS $$
BEGIN
-- ...
END;
$$ LANGUAGE plpgsql;
*** This is an example of the bare bones of a procedure: the procedure name, the parameters, BEGIN, END, and the language definition.

Create a table named Cipher_keys to store encryption keys for the data center model. The table should have the following data: - Automatically generated integer primary key - The name of the key up to a length of 32 (required, name) - Multi-line text field for the key (cipher_key)

CREATE TABLE Cipher_keys ( id BIGSERIAL PRIMARY KEY, name VARCHAR(32) NOT NULL, cipher_key text);

Create a table named customerpizzaorder represented in the following Physical Schema by using the CREATE TABLE command. PizzaOrderID must have BIGSERIAL as the default value. The customerpizzaorder table looks like this: PK -> PizzaOrderID INT(11) CustomerID INT(11) PizzaOrderTimestamp TIMESTAMP SpecialNotes VARCHAR(500)

CREATE TABLE customerpizzaorder (PizzaOrderID BIGSERIAL NOT NULL PRIMARY KEY, CustomerID INT, PizzaOrderTimestamp TIMESTAMP DEFAULT NOW(), SpecialNotes VARCHAR(500), FOREIGN KEY (CustomerID) REFERENCES Customer(CustomerID));

Create a table of sales for a diamond retail company called diamond_sales. The columns are: - buyer_name (as VARCHAR(100)) - sales_price (as NUMERIC(8,2)) - cut_quality (1 = fair, 2 = good, 3 = very good, 4 = premium, 5 = ideal) - purchase_date (as DATE)

CREATE TABLE diamond_sales ( buyer_name VARCHAR(100), sales_price NUMERIC(8,2), cut_quality INT, purchase_date DATE);

Create a table of foods for a diet called diet_plan. The columns are: - food_name (which should be a VARCHAR(20) - calories (as INT) - food_group (1 = dairy group, 2 = meat group, 3 = fruits and vegetables, 4 = bread group, 5 = misc, which is in INT)

CREATE TABLE diet_plan ( food_name VARCHAR(20), calories INT, food_group INT); ** pretty tricky question! They tried to trip you up on this one

Create a table named log_ins to store user login information for a website. The table should have the following data: - An automatically generated integer primary key (id) - A multi-line text field for the IP address of the user (required, ip_address) - A timestamp of when the user logged in (log_time)

CREATE TABLE log_ins ( id BIGSERIAL PRIMARY KEY, ip_address TEXT NOT NULL, log_time TIMESTAMP);

Suppose we want to create a table called state_populations, and it has columns with states' initials and populations. The query would look like this:

CREATE TABLE state_populations ( state VARCHAR(2) PRIMARY KEY, population NUMERIC );

Create a table named uc_donutcustomers represented in the following Physical Schema by using the CREATE TABLE command: uc_donutcustomers table looks like this: - CustomerID INT(11) - CustomerAddress TEXT - CustomerPhone VARCHAR(30) Set the CustomerAddress column to required and the CustomerPhone default value to 555-555-5555

CREATE TABLE uc_donutcustomers ( CustomerID BIGSERIAL NOT NULL PRIMARY KEY, CustomerAddress TEXT NOT NULL, CustomerPhone VARCHAR(30) DEFAULT '555-555-5555');

To create a new blank table, we use the CREATE TABLE statement. This statement takes the following structure:

CREATE TABLE {table_name} ( {column_name_1} {data_type_1} {column_constraint_1}, {column_name_2} {data_type_2} {column_constraint_2}, {column_name_3} {data_type_3} {column_constraint_3}, ... {column_name_last} {data_type_last} {column_constraint_last} );

We know how to create a table. However, say that you wanted to create a table using data from an existing table. This can be done using a modification of the CREATE TABLE statement:

CREATE TABLE {table_name} AS ( {select_query} );
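
A sketch of CREATE TABLE ... AS SELECT using Python's built-in sqlite3 module (a stand-in for PostgreSQL; sqlite omits the parentheses around the SELECT, and the products rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE products (model TEXT, year INTEGER)")
cur.executemany("INSERT INTO products VALUES (?, ?)",
                [("Bat", 2013), ("Nimbus", 2014), ("Comet", 2014)])

# New table built directly from the result of a SELECT query.
cur.execute("CREATE TABLE products_2014 AS SELECT * FROM products WHERE year = 2014")
rows_2014 = cur.execute("SELECT model FROM products_2014 ORDER BY model").fetchall()
```

The new table inherits its column names and types from the SELECT result.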

Consider the Employees_uc table: *** Shows table with columns first_name, last_name, address, salary, birth_month, birth_day, birth_year Create a view called BirthdaySquad that only includes first_name, last_name, birth_month, and birth_day from the Employees_uc table.

CREATE VIEW BirthdaySquad AS SELECT first_name, last_name, birth_month, birth_day FROM Employees_uc;

Consider the following tables: ***Shows tables Employees_uc, Buildings, and Services Create a BurnRate view that will achieve the following tasks and combine them using the UNION operator. - Select each employees first_name and last_name and concatenate them together to form the first column. Then, select the salary column as the second column and divide it by 12 to get the monthly salary for each employee and call this second column as monthly_cost - Select building_name, monthly_rent from the Buildings table - SELECT company, monthly_fee from the Services table

CREATE VIEW BurnRate AS SELECT CONCAT(first_name, ' ', last_name), (salary/12) AS monthly_cost FROM Employees_uc UNION SELECT building_name, monthly_rent FROM Buildings UNION SELECT company, monthly_fee FROM Services;

For the given Employees_uc table, use the appropriate CREATE VIEW V_Commission AS SELECT FROM Employees_uc; command so that it concatenates the first_name and last_name with a blank space to create an attribute customer_name. Then it computes 10% of the salary to create an attribute commission. Finally, it concatenates the birth_year, birth_month, and birth_day with a hyphen (-) to create an attribute birthday. Show all other columns in the view as they are. Note: In the code on the left, the 10% is applied by multiplying by .10. However, other alternatives work in SQL, such as multiplying by 0.1 or dividing by 10. Due to limitations in this system, multiplying by .10 is the only code allowed to be marked correct.

CREATE VIEW V_Commission AS SELECT id, CONCAT(first_name, ' ', last_name) AS customer_name, address, (salary * .10) AS commission, CONCAT(birth_year, '-', birth_month, '-', birth_day) AS birthday FROM Employees_uc;

Consider the garage table with columns named make, model, and year Delete all the Ferraris from the garage table

DELETE FROM garage WHERE make = 'Ferrari';

What would be the query to remove the products from the table whose Price is less than 2.00?

DELETE FROM Product WHERE Price < 2.00;

If we wanted to delete all of the data in the customers table without deleting the table, we could write the following query:

DELETE FROM customers;

Consider the following garage table with columns make, model, and year Delete all cars with the model year before 1960 from the garage table

DELETE FROM garage WHERE year < 1960;
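
This DELETE can be sketched with Python's built-in sqlite3 module (a stand-in for PostgreSQL; the garage rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE garage (make TEXT, model TEXT, year INTEGER)")
cur.executemany("INSERT INTO garage VALUES (?, ?, ?)",
                [("Ford", "Model T", 1927),
                 ("Ferrari", "F40", 1987),
                 ("Chevy", "Bel Air", 1957)])

# Remove every car from before 1960; only rows matching the WHERE clause go.
cur.execute("DELETE FROM garage WHERE year < 1960")
remaining = [r[0] for r in cur.execute("SELECT make FROM garage")]
```

Only the 1987 Ferrari survives; without the WHERE clause, DELETE would have emptied the whole table.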

What is an example of the Shortcut aliasing convention?

FROM customers cust INNER JOIN orders ord

The _____________________ clause enables users to specify conditions that filter which group results appear in the results.

HAVING *** This clause is similar to the WHERE clause, except it is specifically designed for GROUP BY queries

Consider the following table structures: *** Shows table titled "ColorLookup" with columns id and color and another table titled "Things" with columns object and color Take distinct colors from the Things table and insert them into the ColorLookup table

INSERT INTO ColorLookup (color) SELECT DISTINCT color FROM Things;

Consider the following uc_pizza table: *** Shows table with columns PizzaID, PizzaName, PizzaDescription, PizzaPrice, PizzaPhotoID The Pizza table structure is the same as the uc_pizza table. Copy the pizza with PizzaID = 1 from the uc_pizza table to the Pizza table

INSERT INTO Pizza(PizzaID, PizzaName, PizzaDescription, PizzaPrice, PizzaPhotoID) (SELECT * FROM uc_pizza WHERE PizzaID = 1);

Consider the following Restaurants table structure *** Shows table with columns Field, Type, Null, Key, Default, and Extra Enter a new type 3 restaurant in the Restaurants table with the name Sarah's Salad Shack and the description, Fresh salads made to order

INSERT INTO Restaurants (Name, Description, RestaurantType) VALUES (E'Sarah\'s Salad Shack', 'Fresh salads made to order', 3);

You can add new data in a table using several methods in SQL. One method is to simply insert values straight into a table using the INSERT INTO...VALUES statement. As an example, let's say you wanted to insert a new scooter into the products table. This could be done with the following query:

INSERT INTO products (product_id, model, year, product_type, base_msrp, production_start_date, production_end_date) VALUES (13, 'Nimbus 5000', 2019, 'scooter', 500.00, '2019-03-03', '2020-03-03');

Take the example of the products_2014 table we discussed earlier. Imagine that instead of creating it with a SELECT query, we created it as a blank table with the same structure as the products table. If we wanted to insert the same data as we did earlier, we could use the following query:

INSERT INTO products_2014 (product_id, model, year, product_type, base_msrp, production_start_date, production_end_date) SELECT * FROM products WHERE year=2014;

Imagine there are two tables called test1 and test2. Take the names and ids of names beginning with the letter m through z from test1 table and insert them into the test2 table.

INSERT INTO test2 (name,id) SELECT name,id FROM test1 WHERE name > 'm';
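
The INSERT INTO ... SELECT pattern from this card can be sketched with Python's built-in sqlite3 module (a stand-in for PostgreSQL; the names and ids are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE test1 (name TEXT, id INTEGER)")
cur.execute("CREATE TABLE test2 (name TEXT, id INTEGER)")
cur.executemany("INSERT INTO test1 VALUES (?, ?)",
                [("alice", 1), ("mike", 2), ("zara", 3)])

# Copy only the rows whose name sorts after 'm' into test2.
cur.execute("INSERT INTO test2 (name, id) SELECT name, id FROM test1 WHERE name > 'm'")
copied = cur.execute("SELECT name FROM test2 ORDER BY name").fetchall()
```

Note the comparison is a plain string comparison, so any name sorting after 'm' (including names starting with 'm') qualifies.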

**** GO back to page 46 and review the information on installing R

It also shows you how to successfully export data from our database into R

Since the release of Oracle Database in 1979, SQL has become an industry standard for data in nearly all computer applications - and for good reason. SQL databases provide a ton of advantages that make it the de facto choice for many applications:

It is:
- intuitive
- efficient
- declarative
- robust

What is pg_cron?

It is a cron-based job scheduler for PostgreSQL that runs inside the database as an extension (similar to the DBMS_SCHEDULER in Oracle) and allows the execution of database tasks directly from the database, by means of a background worker

Many versions of modern SQL also support data structures such as ____________________________ and arrays.

JavaScript Object Notation (JSON)

LEFT OUTER JOIN is the same as

LEFT JOIN

Returns all the records from the first table and only matching records from the second table

LEFT JOIN

Left outer joins are where the left table (that is, the table mentioned first in a join clause) will have every row returned. If a row from the other table is not found, a row of NULL is returned.

LEFT OUTER JOIN is the same thing as LEFT JOIN, so might as well just use LEFT JOIN tbh

Which PostgreSQL function should be used to remove invalid column entries from a query? - DISTINCT ON - NULLIF - COALESCE - CASE WHEN

NULLIF

If you do a right join on table1 and table2, what happens?

Only records from table1 that match records from table2 will be included, and all of the records from table2 will be included. As a result, all records in table2 that don't have a relationship with table1 records will have null columns. SO SIMPLE! Don't get it confused lol it's really straightforward and exactly what it sounds like So in a right join, all the records in the right table that don't have a match will come over and just have null columns.

If you do a left join on table1 and table2, what happens

Only records from table2 that match records from table1 will be included (essentially those that have a foreign key matching a primary key in table1)... and all the records from table1 will be brought over, even if they have no match with table2. As a result, all records in table1 that don't have a relationship with a record in table2 will have null columns (for table2's fields). SO SIMPLE! Surprised I got this confused earlier lol. So in a left join, it's the unmatched rows from the left table that carry the nulls.

Generally speaking, a query can be broken down into five parts:

Operation - The first part of a query describes what is going to be done. In this case, this is the word SELECT, followed by the names of columns combined with functions.
Data - The next part of the query is the data, which is the FROM keyword followed by one or more tables connected together with reserved keywords indicating what data should be scanned for filtering, selection, and calculating.
Conditional - A part of the query that filters the data to only rows that meet a condition, usually indicated with WHERE.
Grouping - A special clause that takes the rows of a data source, assembles them together using a key created by a GROUP BY clause, and then calculates a value using the values from all rows with the same value.
Post-processing - A part of the query that takes the results of the data and formats them by sorting and limiting the data, often using keywords such as ORDER BY and LIMIT.

Returns all the records from the second table and only matching records from the first table

RIGHT JOIN

What does the set theory operation INTERSECT do?

Returns results from multiple queries as one result set only if the record appears in the result set from both queries. In other words, if the two tables have literally the exact same record, the record is included in the result.
INTERSECT - returns a distinct list of rows
INTERSECT ALL - does not remove duplicate rows (includes all duplicates and nulls)
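Reusing the customers_current and customers_archive tables from the UNION and EXCEPT cards, an INTERSECT might look like this (a sketch; it returns only the customers present in both tables):

```sql
SELECT curr.* FROM customers_current AS curr
INTERSECT
SELECT old.* FROM customers_archive AS old
ORDER BY last_name;
```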

What does the EXCEPT set operator do?

Returns results from two queries that appear in only the left query. Useful when trying to find rows that are in one set but not the other.
EXCEPT - returns a distinct list of rows
EXCEPT ALL - does not remove duplicate rows (includes all duplicates and nulls)

LEFT OUTER join includes unmatched rows from the left table while RIGHT OUTER join includes unmatched rows from the ____________________ - Left table - Inner table - Outer table - Right table

Right table

Which of the following are advantages of SQL databases? Robust Reducible Intuitive Imperative Efficient

Robust Intuitive Efficient

For set operators like UNION, INTERSECT, and EXCEPT, the two queries need to return the same number of columns with compatible data types (the column names don't have to match exactly) -- just keep that in mind. Here's an example of a UNION

SELECT 'Active' AS status, curr.* FROM customers_current AS curr UNION SELECT 'Inactive' AS status, old.* FROM customers_archive AS old ORDER BY last_name; *** so essentially what we're doing here is combining the two tables and adding an additional column first called 'status' that will have a row value as "Active" if the record is from the first table and "Inactive" if the record is from the second table

We can now query this "super-dataset" the same way we would query one large table using two clauses. For example, going back to our multi-query issue to determine which sales query works in California, we can now address it with one easy query:

SELECT * FROM salespeople INNER JOIN dealerships ON salespeople.dealership_id = dealerships.dealership_id WHERE dealerships.state = 'CA' ORDER BY 1;

To show how left outer joins work, let's examine two tables: the customers table and the emails table

SELECT * FROM customers c LEFT OUTER JOIN emails e ON e.customer_id=c.customer_id ORDER BY c.customer_id LIMIT 1000;

The FULL OUTER JOIN will return all rows from the left and right tables, regardless of whether the join predicate is matched. For rows where the join predicate is met, the two rows are combined in a group. For rows where they are not met, the row has NULL filled in. Don't forget that FULL OUTER JOIN is the same as FULL JOIN. The full outer join has the following syntax:

SELECT * FROM emails e FULL JOIN customers c ON e.customer_id=c.customer_id;

Whatever the reason, we are often interested in finding rows where data is not filled in for a certain value. In SQL, blank values are often represented by the NULL value. For instance, in the products table, the production_end_date column having a NULL value indicates that the product is still being made. In this case, if we want to list all products that are still being made, we can use the following query:

SELECT * FROM products WHERE production_end_date IS NULL

If we are only interested in products that are not being produced, we can use the IS NOT NULL clause, as in the following query:

SELECT * FROM products WHERE production_end_date IS NOT NULL

Consider the following Artists table: *** Shows table with columns ID, Name, and Rating Select all rows with a 2 rating from the Artists table.

SELECT * FROM Artists WHERE Rating='2';

When using more than one AND/OR condition, use parentheses to separate and position pieces of logic together. This will make sure that your query works as expected and that it is as readable as possible.

SELECT * FROM products WHERE (year>2014 AND year<2016) OR product_type='scooter';

The most basic SELECT query follows the pattern SELECT...FROM <table_name>;

SELECT * FROM products;

The general example format of a CASE WHEN statement is:

SELECT *, CASE WHEN postal_code='33111' THEN 'Elite Customer' WHEN postal_code='33124' THEN 'Premium Customer' ELSE 'Standard Customer' END AS customer_type FROM customers; Note that CASE appears only once; each additional condition gets its own WHEN. Essentially this works when we want to return all rows for customers from the customers table. Additionally, we want to add a column that labels a user as being an Elite Customer if they live in postal code 33111, or as a Premium Customer if they live in 33124. Otherwise, it will mark the customer as a Standard Customer. This column will be called customer_type. That's what we just did with the CASE statement above :D

Consider the Employees_uc table with columns id, first_name, last_name, address, salary, birth_month, birth_day, birth_year Select the birth_year, birth_month, and birth_day and concatenate them with a hyphen(-) to form a column named birthday. Then select the salary column and multiply it by 10% to get the maximum commission for each employee. Call this column max_commission. Note: In the code to the left, the 10% is applied by multiplying by .10, but alternatives could also work such as multiplying by 0.1 or dividing by 10. The answer in this lab only accepts multiplying by .10 due to a limitation in the system

SELECT CONCAT(birth_year, '-', birth_month, '-', birth_day) AS birthday, (salary*.10) AS max_commission FROM Employees_uc;

Consider the following Employees_uc table with columns first_name, last_name, address, salary, birth_month, birth_year Select each employees first_name and last_name and concatenate them together to form the first column. Then, select salary column and divide it by 12 to get the monthly salary for each employee. Finally call that second column as monthly_cost so that we can use it later

SELECT CONCAT(first_name, ' ', last_name), (salary/12) AS monthly_cost FROM Employees_uc;

Aggregate functions can also be used with the WHERE clause in order to calculate aggregate values for specific subsets of data. For example, if you wanted to know how many customers ZoomZoom had in California, you could use the following query:

SELECT COUNT(*) FROM customers WHERE state='CA';

To get a count of the number of rows in a situation where you want to include those with NULL values for that column as well, you could alternatively use the COUNT function with an asterisk, (*), to get the total count of rows:

SELECT COUNT(*) FROM customers;

You can also do arithmetic with aggregate functions. In the following query, you can divide the count of rows in the customers table by two like so:

SELECT COUNT(*)/2 FROM customers;

Let's say, however, that you are interested in the number of unique countries in the customer list. This answer could be queried using COUNT (DISTINCT expression)

SELECT COUNT(DISTINCT country) FROM customers;

Consider the Places table with columns Continent, Country, City, Region Write a query to retrieve countries and count how many cities are in each country using a GROUP BY clause

SELECT Country, COUNT(City) FROM Places GROUP BY Country;

The general syntax of a DISTINCT ON query is:

SELECT DISTINCT ON (distinct_column) column_1, column_2, ... column_n FROM table ORDER BY order_column; *** Here, distinct_column is the column or columns you want to be distinct in your query, column_1 through column_n are the columns you want in the query, and order_column allows you to determine the first row that will be returned for a DISTINCT ON query if multiple columns have the same value for distinct_column. For order_column, the first column mentioned should be distinct_column. If an ORDER BY clause is not specified, the first row will be decided randomly.
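Filling in that template with the products table used on the other cards (column names assumed from those cards), you could grab the earliest model for each product type, something like:

```sql
-- One row per product_type; the row with the smallest year wins
-- because of the ORDER BY (distinct column first, then the tiebreaker)
SELECT DISTINCT ON (product_type) product_type, model, year
FROM products
ORDER BY product_type, year;
```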

Consider the Things table, which has columns object and color Select (distinctly) all of the colors from the Things table

SELECT DISTINCT color FROM Things;

Consider the following garage table with columns make, model, and year Select (distinctly) all of the makes from the garage table

SELECT DISTINCT make FROM garage;

You can also use it with multiple columns to get all distinct column combinations present. For example, to find all distinct years and what product types were released for those model years, you can simply use the following:

SELECT DISTINCT year, product_type FROM products ORDER BY 1, 2;

Consider the following Animals table: *** Shows table with columns ID, Species, Name, Notes, Exhibit, Weight, Gender, and LastWeighDate Select the name of all animals that are of Species 2.

SELECT Name FROM Animals WHERE Species='2';

Consider the following Places table with columns Continent, Country, City, Region Write a query to retrieve region and do a count of all the cities (city) in each region, but only for the United States

SELECT Region, COUNT(City) FROM Places WHERE Country = 'United States' GROUP BY Region;

Consider the following Places table with columns Continent, Country, City, and Region List all the regions and cities in Canada, but list them by region in reverse alphabetical order

SELECT Region, City FROM Places WHERE Country = 'Canada' ORDER BY Region DESC;

Consider the Places table with columns Continent, Country, City, Region List the region and city in the United States, but make the region list descending and the cities within each region ascending

SELECT Region, City FROM Places WHERE Country = 'United States' ORDER BY Region DESC, City ASC;

Using aggregates, identifying the amount of missing data can tell you not only which columns have missing data, but also whether columns are even usable because so much of the data is missing. Depending on the extent of missing data, you will have to determine whether it makes the most sense to delete rows of missing data, fill in missing values, or just delete columns as they do not have enough data to make definitive conclusions. The easiest way to determine whether a column is missing values is to use a modified CASE WHEN statement with the SUM and COUNT functions to determine what percentage of data is missing. Generally speaking, the query looks as follows:

SELECT SUM(CASE WHEN state IS NULL OR state IN (' ') THEN 1 ELSE 0 END)::FLOAT/COUNT(*) AS missing_state FROM customers;

You can also use the aggregate functions with each other in mathematical ways. In the following query, instead of using the AVG function to calculate the average MSRP of products at ZoomZoom, you could "build" the average function using SUM and COUNT as follows:

SELECT SUM(base_msrp)::FLOAT/COUNT(*) AS avg_base_msrp FROM products;

Here is an example of an alias -- it is essentially like adding a variable to cut down on time

SELECT c.first_name, c.last_name, o.order_date, o.order_amount FROM customers AS c INNER JOIN orders AS o ON c.customer_id = o.customer_id WHERE c.last_name = 'Dodd';

Example of a left join

SELECT c.first_name, c.last_name, o.order_date, o.order_amount FROM customers AS c LEFT OUTER JOIN orders AS o ON c.customer_id = o.customer_id;

Example of a Right Join

SELECT c.first_name, c.last_name, o.order_date, o.order_amount FROM customers AS c RIGHT OUTER JOIN orders AS o ON c.customer_id = o.customer_id;

Example EXCEPT operator in use:

SELECT curr.* FROM customers_current AS curr EXCEPT SELECT old.* FROM customers_archive AS old ORDER BY last_name; *** EXCEPT operator returns the results of the two sets when the result appears only in the left set

NULLIF is a two-value function and will return NULL if the first value equals the second value. As an example, imagine that the marketing department has created a new direct mail piece to send to the customer. One of the quirks of this new piece of advertising is that it cannot accept people who have titles longer than three letters. In our database, the only known title longer than three characters is 'Honorable'. Therefore, they would like you to create a mailing list that is just all the rows with valid street addresses, replacing any title spelled 'Honorable' with NULL. This could be done with the following query:

SELECT customer_id, NULLIF(title, 'Honorable') AS title, first_name, last_name, suffix, email, gender, ip_address, phone, street_address, city, state, postal_code, latitude, longitude, date_added FROM customers c ORDER BY 1; This will blot out all mentions of 'Honorable' from the title column

COALESCE allows you to list any number of columns and scalar values, and, if the first value in the list is NULL, it will try to fill it in with the second value. The COALESCE function will keep continuing down the list of values until it hits a non-NULL value. If all values in the COALESCE function are NULL, then the function returns NULL. Here's an example:

SELECT first_name, last_name, COALESCE(phone, 'NO PHONE') as phone FROM customers ORDER BY 1;

Consider the following garage table with columns make, model, and year Write a query to retrieve makes and count how many models are in each using a GROUP BY clause

SELECT make, COUNT(model) FROM garage GROUP BY make;

Consider the following garage table with columns make, model, and year List the make and model of all Chevrolets and Jaguars, but arrange the make list to be descending and the model of each make to be ascending

SELECT make, model FROM garage WHERE make = 'Chevrolet' OR make = 'Jaguar' ORDER BY make DESC, model ASC;

However, because product_id is the first column in the table, you could instead write:

SELECT model FROM products ORDER BY 1;

Also instead of writing the name of the column you want to order by, you can instead refer to what number column it is in the natural order of the table. For instance, say you wanted to return all the models in the products table ordered by product ID. You could write:

SELECT model FROM products ORDER BY product_id;

If you would like to extract data in greatest-to-least order, you can use the DESC keyword. If we wanted to fetch manufactured models ordered from newest to oldest, we would write:

SELECT model FROM products ORDER BY production_start_date DESC;

Most tables in SQL databases tend to be quite large, and therefore returning every single row is unnecessary. Sometimes, you may want only the first few rows. For this scenario, the LIMIT keyword comes in handy. Let's imagine that you wanted to only get the first five products that were produced by the company. You could get this by using the following query:

SELECT model FROM products ORDER BY production_start_date LIMIT 5;

You can make the ascending order explicit by using the ASC keyword. For our last query, this could be achieved by writing:

SELECT model FROM products ORDER BY production_start_date ASC;

Let's say you were interested in returning all models of a car with the year 2014, 2016, or 2019. You could write a query such as this: SELECT model FROM products WHERE year = 2014 OR year = 2016 OR year = 2019; However, as you can see, this is long and tedious to write. Using IN, you can instead write:

SELECT model FROM products WHERE year IN (2014, 2016, 2019);

Write a query in the Psql Editor that joins (Inner Join) the three tables given below and meets the following requirements: - Retrieve DonutOrderID, LastName, DonutName, OrderQty, DonutPrice from the following tables - Use the table name alias as o for uc_donutorder, d for uc_donut, and c for uc_customer tables

SELECT o.DonutOrderID, c.LastName, d.DonutName, o.OrderQty, d.DonutPrice FROM uc_donutorder As o INNER JOIN uc_donut AS d ON d.DonutID = o.DonutID INNER JOIN uc_customer AS c ON c.CustomerID = o.CustomerID; **** study this one too lol

Two functions that come in handy for data preparation are the LEAST and GREATEST functions. Each function takes any number of values and returns the least or the greatest values respectively. A simple use of this variable would be to replace the value if it's too high or low. For example, the sales team may want to create a sales list where every scooter is $600 or less than that. We can create this using the following query:

SELECT product_id, model, year, product_type, LEAST(600.00, base_msrp) AS base_msrp, production_start_date, production_end_date FROM products WHERE product_type='scooter' ORDER BY 1;

Another useful data transformation is to change the data type of a column within a query. This is usually done to use a function only available to one data type, such as text, while working with a column that is a different data type, such as a numeric. To change the data type of a column, you simply need to use the column::datatype format, where column is the column name, and datatype is the data type you want to change the column to. For example, to change the year in the products table to a text column in a query, use the following query:

SELECT product_id, model, year::TEXT, product_type, base_msrp, production_start_date, production_end_date FROM products; *** this will convert the year column to text

You can also alias table names so that you do not have to type out the entire name of the table every time. Simply write the name of the alias after the first mention of the table after the join clause, and you can save a decent amount of typing. For instance, for the last preceding query, if we wanted to alias salespeople with s and dealerships with d, you could write the following statement:

SELECT s.* FROM salespeople s INNER JOIN dealerships d ON d.dealership_id = s.dealership_id WHERE d.state = 'CA' ORDER BY 1;

You could also use a column number to perform a GROUP BY operation

SELECT state, COUNT(*) FROM customers GROUP BY 1

Often though, you may be interested in ordering the aggregates themselves. The aggregates can be ordered using ORDER BY as follows:

SELECT state, COUNT(*) FROM customers GROUP BY state ORDER BY COUNT(*); which would give output like:
VT 16
WY 23
ME 25
RI 47
NH 77

Let's count the number of customers in each US state using a GROUP BY query. Using GROUP BY, a SQL user could count the number of customers in each state by querying:

SELECT state, COUNT(*) FROM customers GROUP BY state

While GROUP BY with one column is powerful, you can go even further and GROUP BY multiple columns. Let's say you wanted to get a count of not just the number of customers ZoomZoom had in each state, but also of how many male and female customers it had in each state. Multiple GROUP BY columns can query the answer as follows:

SELECT state, gender, COUNT(*) FROM customers GROUP BY state, gender ORDER BY state, gender

The general structure of a GROUP BY operation with a HAVING statement is:

SELECT {KEY}, {AGGFUNC(column1)} FROM {table1} GROUP BY {KEY} HAVING {OTHER_AGGFUNC(column2)_CONDITION} An example would be: SELECT state, COUNT(*) FROM customers GROUP BY state HAVING COUNT(*)>=1000 ORDER BY state;

Inner joins are typically written in the following form:

SELECT {columns} FROM {table1} INNER JOIN {table2} ON {table1}.{common_key_1}={table2}.{common_key_2}

UNION filters all duplicate rows and UNION ALL returns all rows

That means that UNION ALL includes nulls and duplicates
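A quick sketch of the difference, again using the customers_current and customers_archive tables from the earlier cards (the email column is assumed from the customers schema):

```sql
-- UNION: rows appearing in both tables collapse to one distinct row
SELECT email FROM customers_current
UNION
SELECT email FROM customers_archive;

-- UNION ALL: every row from both queries, duplicates and nulls included
SELECT email FROM customers_current
UNION ALL
SELECT email FROM customers_archive;
```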

What must be completed during the extract step in the ETL process?

The first step, extraction, is to collect or grab data from its source(s). The second step, transformation, is to convert, reformat, and cleanse the data into a format that can be used by the target database.

What are the steps of the ETL process?

The first step, extraction, is to collect or grab data from its source(s). The second step, transformation, is to convert, reformat, cleanse data into a format that can be used by the target database. Finally, the last step, loading, is to import the transformed data into a target database, data warehouse, or data mart.

Model the entities that result from normalizing the model from 2nd normal form to 3rd normal form. The result must fulfill all the requirements for the 1st and 2nd normal forms, and it should also eliminate fields that do not depend on the primary key (fields that depend on some other non-key field, known as transitive functional dependencies). NOTE: To simplify the complexity of the task, a customer is restricted to ordering a single pizza at a time. The two tables in 2NF form are as follows: The "Pizza" table: Pizza ID, Pizza Name, Pizza Description, Pizza Size, Pizza Price, Pizza Photo URL. The "Customer Pizza Order" table: Pizza Order ID, Last Name, First Name, Street, Apartment, City, State, Zip Code, Home Phone, Mobile Phone, Other Phone, Qty of Pizzas, Special Notes, Pizza ID, PizzaOrderTimestamp

The following shows the 3NF version of the previously 2NF tables (note... the two tables have now been converted into three tables called "Pizza", "Customer", and "Customer Pizza Order")

Model the entity that results from normalizing the model from unnormalized form to 1NF to fulfill these requirements: - Each table has a primary key to uniquely identify a record - All values in each column are atomic, and there are no repeating groups. The "Customer Pizza Order" table currently appears as follows: Last Name, First Name, Street, Apartment, City, State, Zip Code, Home Phone, Mobile Phone, Other Phone, Qty of Pizzas, Special Notes, Pizza1 Name, Pizza1 Description, Pizza1 Size, Pizza1 Photo URL, Pizza2 Name, Pizza2 Description, Pizza2 Size, Pizza2 Photo URL, Pizza3 Name, Pizza3 Description, Pizza3 Size, Pizza3 Photo URL, Pizza4 Name, Pizza4 Description, Pizza4 Size, Pizza4 Photo URL, Pizza5 Name, Pizza5 Description, Pizza5 Size, Pizza5 Photo URL, PizzaOrderTimestamp

The following table is the 1NF version of the previously unnormalized table: Pizza Order ID, Last Name, First Name, Street, Apartment, City, State, Zip Code, Home Phone, Mobile Phone, Other Phone, Qty of Pizzas, Special Handling Notes, Pizza ID, Pizza Name, Pizza Description, Pizza Size, Pizza Photo URL, Pizza Order Timestamp

ETL Process Step 3 -- Loading

The load is the last step of the ETL process. It involves importing extracted and transformed data into a target database or data warehouse. Some load processes physically insert each record as a new row into the table of the target warehouse utilizing a SQL insert statement, whereas other load processes use a massive bulk insert of data utilizing a bulk load routine. The SQL insert is a slower routine for importing data, but it does allow for integrity checking with every record. The bulk load routine may be faster for loading large amounts of data, but it does not allow for an integrity check upon the load of each individual record.

SQL provides us with many tools for mixing and cleaning data. We have learned how joins allow users to combine multiple tables, while _______________ and subqueries allow us to combine multiple queries.

UNION

The __________________ constraint guarantees that every single row for a column has a unique value and that no value is repeated

UNIQUE

Change all notes to be 'large animal' and LastWeighDate to be 2019-08-10 in the Animals table where the Weight is more than 100 and Gender is M.

UPDATE Animals SET Notes = 'large animal', LastWeighDate = '2019-08-10' WHERE Weight > 100 AND Gender = 'M';

Update TakeOut as Y in the MenuItems table where the Cost is more than 9.95

UPDATE MenuItems SET TakeOut = 'Y' WHERE Cost > 9.95;

Let's say, for instance, we have the wrong email on file for the customer with the customer ID equal to 3. To clear it out, we can use the following query:

UPDATE customers SET email = NULL WHERE customer_id=3;

Change the price for all the Camaro model to $40,000 in the for_sale table.

UPDATE for_sale SET price = 40000 WHERE model = 'Camaro';

Change the price of all the cars to a more respectable $50,000 in the for_sale table.

UPDATE for_sale SET price = 50000;

Deleting values from a row The easiest way to accomplish this task is to use the UPDATE structure we already discussed and set the column value to NULL like so:

UPDATE {table_name} SET {column_1} = NULL, {column_2} = NULL, ... {column_last} = NULL WHERE {conditional}

Sometimes, we may need to update the values of the data present in a table. To do this, you can use the UPDATE statement. To illustrate the use of the update statement, let's say that for the rest of the year, the company has decided to sell all scooter models before 2018 for $299.99. We could change the data in the products table using the following query:

UPDATE {table_name} SET {column_1} = {column_value_1}, {column_2} = {column_value_2}, ... {column_last} = {column_value_last} WHERE {conditional};
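Applying that template to the scooter scenario from the card above (assuming the products table's base_msrp column holds the price, as on the other cards):

```sql
-- Sell all scooter models from before 2018 for $299.99
UPDATE products
SET base_msrp = 299.99
WHERE product_type = 'scooter'
  AND year < 2018;
```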

Relational databases are a mature and ubiquitous technology that is used to store and query data. Relational databases store data in the form of relations, also known as tables, which allow for an excellent combination of performance, efficiency, and ease of use. SQL is the language used to access relational databases. SQL is a declarative language that allows users to focus on what to create, as opposed to how to create it. SQL supports many different data types, including numeric data, text data, and even data structures

When querying data, SQL allows a user to pick which fields to pull, as well as how to filter the data. This data can also be ordered, and SQL allows for as much or as little data as we need to be pulled. Creating, updating, and deleting data is also fairly simple and can be quite surgical.

If an order sequence is not explicitly mentioned, the rows will be returned in __________________ order. This means the rows will be ordered from the smallest value to the highest value of the chosen column or columns.

ascending

COPY naming a file or command is only allowed to database superusers or users who are granted one of the default roles:

pg_read_server_files pg_write_server_files or pg_execute_server_program ***this is because these privileges allow reading or writing any file or running a program that the server has privileges to access
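A sketch of a server-side COPY (the file path is a placeholder; the restriction exists because the server process itself reads the file):

```sql
-- Requires superuser or the pg_read_server_files role;
-- '/srv/data/customers.csv' is a hypothetical path
COPY customers FROM '/srv/data/customers.csv' WITH (FORMAT csv, HEADER);
```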

Allows the SQL interpreter to essentially do nothing for the next defined period of time

pg_sleep
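In use it looks like this:

```sql
-- Pauses the current session's query for roughly 5 seconds, then returns
SELECT pg_sleep(5);
```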

Forces the process to end without cleaning up any resources being used by the query

pg_terminate_backend

ETL enables _________________________________ from source to target data repository.

physical movement of data

The one advantage of common table expressions is that they are ____________________________.

recursive ** Recursive common table expressions can reference themselves. Because of this feature, we can use them to solve problems that other queries cannot.
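A minimal sketch of a recursive CTE: it references itself in the UNION ALL branch to count from 1 to 5.

```sql
WITH RECURSIVE counter (n) AS (
    SELECT 1                                   -- anchor member
    UNION ALL
    SELECT n + 1 FROM counter WHERE n < 5      -- recursive member (self-reference)
)
SELECT n FROM counter;   -- returns 1, 2, 3, 4, 5
```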

For all columns in A and B, write an inner equi join for table A and table B on A.key and B.key - select * from A inner join B on A.key = B.key; - select * from A inner join B on A.key = C.key; - select * from A inner join B on A.key = D.key; - select * from A inner join B on A.key = A.key;

select * from A inner join B on A.key = B.key;

For all columns in A and B, write a left equi join for table A and table B on A.key and B.key - select * from A left join B on A.key = B.key; - select * from A left join B on A.key = C.key; - select * from A left join B on A.key = A.key; - select * from A left join B on A.key = D.key;

select * from A left join B on A.key = B.key;

There are a number of different types of scans that can be used to retrieve information. The easiest to understand and the most reliable scan available within an SQL database is the ____________________ scan

sequential *** If all other scans fail, you can always fall back to the reliable sequential scan to get the information you need out of a database

Numeric data types represent numbers. What are some of the major numeric data types?

smallint - 2 bytes, small-range integer, -32768 to +32767
integer - 4 bytes, typical choice for integer, -2147483648 to +2147483647
bigint - 8 bytes, large-range integer, -9223372036854775808 to +9223372036854775807
decimal - variable, user-specified precision, exact, up to 131072 digits before the decimal point and up to 16383 digits after
numeric - variable, user-specified precision, exact, up to 131072 digits before the decimal point and up to 16383 digits after
real - 4 bytes, variable precision, inexact, 6 decimal digits precision
double precision - 8 bytes, variable precision, inexact, 15 decimal digits precision
smallserial - 2 bytes, small autoincrementing integer, 1 to 32767
serial - 4 bytes, autoincrementing integer, 1 to 2147483647
bigserial - 8 bytes, large autoincrementing integer, 1 to 9223372036854775807
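To see a few of these side by side, here's a sketch of a table definition (the table and column names are made up for illustration):

```sql
CREATE TABLE readings (
    id      serial PRIMARY KEY,   -- 4-byte autoincrementing integer
    sensor  smallint,             -- 2 bytes, -32768 to +32767
    value   double precision,     -- 8 bytes, inexact, ~15 decimal digits
    total   numeric(12, 2)        -- exact, user-specified precision
);
```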

A relational database is a database that utilizes the relational model of data. The relational model organizes data as relations or sets of _______________. This model allows __________________ operations to be performed between relations.

tuples logical

