database management final exam
SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY
The six clauses of the SELECT statement must be coded in the following order
set operations
UNION, INTERSECT, MINUS
trigger
a named set of SQl statements that are considered (triggered) when a data modification occurs on if certain data definitions are encountered; if a condition stated within a trigger is met, then a prescribed action is taken
join
a relational operation that causes two tables with a common domain to be combined into a single table/view
business intelligence
a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information
star schema
a simple database design in which dimensional data are separated from fact or event data; a dimensional model
scalar aggregate
a single value returned from a SQL query that includes an aggregate function
data warehouse
a subject-oriented, integrated, time-variant, non-updateable collection of data used in support of management decisions
informational system
a system designed to support decision making based on historical point-in-time and prediction data from complex queries or data-mining applications
HAVING
combine filtering criteria (WHERE) with grouping criteria (GROUP BY); provides conditions for group selection
union
combines all rows from the results of two queries
save
commit
virtual table
constructed automatically as needed by a database management system
base table
contains inserted raw data, corresponds to the relations that are identified in the database's concept schema
materialized view
copies/replicas of data, based on SQL queries created in the same manner as dynamic views (exists as a table so must be synchornized with associated base tables)
data manipulation language
core commands of SQL; insert, update, query data
data definition language
create, alter, and drop tables, views, and indexes
derived data
data that have been selected, formatted, and aggregated for end-user decision support applications
user-defined data type
data type that a user can define by making it a subclass of a standard type or creating a type that behaves as an object; may also have defined functions and methods
schema
defines the logical structure of related objects belonging to a single user or group
descriptive analytics
describes the past status of the domain of interest using a variety of tools through techniques such as reporting, data visualization, dashboards, and scoreboards
veracity
dimension of data's quality
number(p,s)
ex. '123.4', '456.9', Number(4,1)
date
ex. '9/24/2019' '12-25-2019'
char(x)
ex. 'NY' 'IA' 'CA", char(2)
boolean
ex. 'TRUE' 'FALSE' 'UNKNOWN'
varchar2(x)
ex. 'abc' 'John Smith', varchar2(20)
number
ex. 1, 2, 3, 234.45
variety
expansive types of data that are collected, stored, and analyzed
embedded SQL
hard-coded SQL statements included in a program written in another language, such as C or Java
subquery
include a query in SELECT, FROM, WHERE, or HAVING clauses of another query
field controls
indicates when a column is required, unique, has a default value, or has a set list of acceptable values
self join
join records in the same table based on matching values; used to build queries from unary relationships, must create at least one temporary copy of the table
concatenation function
joins fields of text in the returned query ex. returning salesperson's first and last name as full name in one field
data mining
knowledge discovery using a sophisticated blend of techniques from traditional statistics, artificial intelligence, and computer graphics
grain
level of detail in a fact table
WHERE
list of logical expressions that filter results matching the stated criteria; each is a test returning a TRUE/FALSE result
multidimensional online analytical processing (MOLAP)
loads data into an intermediate structure, usually a three-or-higher dimensional array
relational database management system
manages data as a collection of tables in which all data relationships are represented by common values in related tables
vector aggregate
multiple values returned from an SQL query that includes an aggregate function
rows subquery
return a set of values from a single column for multiple rows; include in WHERE clause with set operator or comparison operators
right outer join
return all rows on the right-hand side of the table; displays all records from last table listed in join regardless of whether there is a matching value in first (or previous) table
scalar subquery
returns a single value for a single row; include in WHERE clause with equality or comparison operator to use result as filtering condition; always include in parentheses ()
full outer join
returns all rows from every table listed in the join
volume, variety, velocity, veracity, value
the 5 v's
volume
the amounts of data collected and processed (hundreds of terabytes or more)
prescriptive analytics
uses of results of predictive analytics together with optimization and simulation tools to recommend actions that will lead to a desired outcome
aggregate functions
Calculations such as Min, Max, Avg, and Sum that are performed on a group of records. ex. finding the average price of all products
online transaction processing (OLTP)
Capturing of transaction and event information using technology to process, store, and update
SELECT COUNT(CustomerID) AS TexasCustomers FROM CUSTOMER WHERE State = 'TX'';
Command: How many customers are from Texas?
SELECT SalespersonFirst || ' || SalespersonLast AS FullName FROM SALESPERSON;
Command: Return salesperson full name as one field
SELECT OrderID, SYSDATE-OrderDate AS DaysSinceOrder FROM ORDER;
Command: Return the number of days since an order was placed
SELECT AVG(ProductPrice) AS AveragePrice FROM PRODUCT;
Command: find the average price of all products
SELECT Customer_T.CustomerID, CustomerName, OrderID FROM Customer_T NATURAL JOIN Order_T ON Customer_T.CustomerID = Order_T.CustomerID
Command: for each customer who has placed an order, what is the customer ID,name, and order number (natural join)
SELECT * FROM ORDER, CUSTOMER WHERE ORDER.CustomerID = CUSTOMER.CustomerID AND ORDER.OrderID = 1008;
Command: retrieve the customer and order information for Order 1008
SELECT ProductID, ProductDescription, ProductStandardPrice FROM PRODUCT WHERE ProductStandardPrice = (SELECT MAX(ProductStandardPrice) FROM PRODUCT);
Command: return the information (ID, Description, Price) for the most expensive product (without using FETCH)
SELECT CustomerName, CustomerAddress, CustomerCity, CustomerState, FROM CUSTOMER WHERE CUSTOMER.CustomerID = (SELECT CustomerID FROM ORDERS WHERE OrderID = 1008);
Command: return the name and address of the customer who placed Order 1008
SELECT CustomerName, CustomerAddress, CustomerCity, CustomerState FROM CUSTOMER WHERE CustomerState ='CA' OR CustomerState ='NY';
Command: return the name, address, city, and state of all customers who live in either California or New York
SELECT ProductDescription, ProductPrice FROM PRODUCT WHERE ProductPrice < 400 AND (ProductFinish='Cherry' OR ProductFinish='Walnut');
Command: return the product description and price of items less than $400 with a finish of either Cherry or Walnut
SELECT Customer_T.CustomerID, Order_T.CustomerID, CustomerName, OrderID FROM Customer_T, Order_T WHERE Customer_T.CustomerID = Order_T.CustomerID ORDER BY OrderID
Command: what are the customer IDs and names of all customers, along with the order IDs for all the orders they've placed? (equi-join)
SELECT ProductID, ProductDescription, ProductStandardPrice FROM PRODUCT WHERE ProductStandardPrice > (SELECT AVG(ProductStandardPrice) FROM PRODUCT);
Command: write a subquery to find the ID, description, and price for all Pine Valley Furniture Company products that are more expensive than the average
GROUP BY
Divides data into subsets; categorizes rows into groups based on the values of the specified field; may apply aggregate functions
data
Facts and statistics collected together for reference or analysis
relational online analytical processing (ROLAP)
OLAP tools that view the database as a traditional relational database in either a star schema or other normalized or denormalized set of tables
cross join
two tables are joined without specifying common columns, and all combinations of rows are generated (also called cartesian join); creates a large table without useful or meaningful results
dynamic views
virtual table created dynamical upon request by a user; not temporary, bur rather results (differs from materialized view, which is stored on a disk and refreshed in intervals)
value
what the data should ultimately add to the organization
comparison operators
>, =
SELECT StudentID, FirstName, LastName FROM STUDENT WHERE LastName LIKE 'Mc%';
List the student ID and name for every student whose last name begins with Mc
database management system
Software that allows you to use a computer to create a database
enterprise data warehouse
a centralized, integrated data warehouse that is the control point and single source of all data made available to end users for decision support
procedure
a collection of procedural and SQL statements that are assigned a unique name within the schema and stored in the database
logical data mart
a data mart created by a relational view of a data warehouse
dependent data mart
a data mart filled exclusively from an enterprise data warehouse and its reconciled data
independent data mart
a data mart filled with data extracted from the operational environment, without the benefit of a warehouse
data mart
a data warehousing that is limited in scope, whose data are obtained by selecting and summarizing data from a data warehouse or from separate extract, transform, and load processes from source data systems
business performance management system
allows managers to measure, monitor, and manage key activities and processes to achieve organizational goals
real-time date warehouse
an enterprise data warehouse that accepts near-real-time feeds of transactional data from the systems of record, analyzes warehouse systems of record analyzes warehouse data, and in near-real-time relays business rules to the data warehouse and systems of record so that immediate actions can be taken in response to business events
operational data store
an integrated, subject-oriented, continuously updateable current-valued, enterprise-wide, detailed database designed to serve operational users as they do decision support processing
predictive analytics
applies statistical and computational methods and models to data regarding past and current events to predict what might happen in the future
procedures
automated processes
catalog
collection of all schema in the database (description)
conformed dimension
one or more dimension tables associated with two or more fact tables for which the dimension tables have the same business emaning and primary key with each fact table
JOIN
operation bringing data from two tables into a single result table; uses common fields (FK/PK) in related tables
table subquery
passes multiple columns to the outer query (subquey yields a derived table that is joined to the data in the outer query); must be nested in the FROM clause
hardware
physical devices/services
correlated subquery
processing the inner query depends on data from the outer query
access language
programming language for creating, querying, modifying, and deleting ddata
software
programs for creating and managing data access
catalog
set of schemas that, when put together, constitute a description of database
data control language
set up access and controls for database administration; grand/revoke privileges and commit data
FETCH
shows only the top-k results; specific to Oracle
mathematical
some __________ expresions (+,-) can be applied to date fields ex. returning the number of days since an order was placed
text/character
some aggregate functions (MIN,MAX,COUNT) may be applied to ____________________________ fields
ORDER BY
sort results in ascending or descending order
dynamic SQL
specific SQL code generated on the fly while an application processes
velocity
speed at which the data arrives
SQL
structured query language
operational system
system that is used to run a business in rela time, based on current data (aka system of record)
analytics
systematic analysis and interpretation of data -- typically using mathematical, statistical, and computational tools -- to improve our understanding of real world domain
equi-join
the joining condition is based on equality between values in the common columns (redundancy in results appears)
speed, availability, access
the three ways we measure big data and advanced analytics
union compatibility
the two sub-queries must contain the same schema (name and data type)
online analytical processing (OLAP)
the use of a set of graphical tools that provides users with a multi-dimensional view of their data and allow them to analyze the data using simple windowing techniques
equi-join
two tables are joined based on equality between values in common columns
natural join
two tables are joined based on equality between values in the common columns
inner join
two tables are joined based on equality between values in the common columns; records that do not have matching values are excluded from results (equipment-join and natural join are both types of this)
outer join
two tables are joined based on equality between values in the common columns; records that do not have matching values in common columns also included in results (with NULL values)
cartesian join
number of rows is equal to number of rows in each table multiplied together ex. 15 orders x 10 customers = 150 rows
set operators
IN, NOT IN, ANY, ALL
transient data
data in which changes to existing records are written over previous records, thus, destroying the previous data content
set
data structure that stores one or more unordered, unique values; can be joined to each other based on shared values
preiodic data
data that are never physically altered or deleted once they have been added to the store
big data
data that exists in very big volumes and many different varieties (data types) and that need to be processed at a very hig velocity (speed)
hardware, software, data, procedures, access language
database management system components
reconciled data
detailed, current data intended to be the single, authoritative source for all decision support applications
persistent stored modules
extensions defined originally in SQL:1999 that include the capability to create and drop modules of code stored in the database schema across user sessions
left outer join
returns all rows on the left-hand side of the join; displays all records from first table listed in join regardles of whether there is a matching value in second (or subsequent) table
outer join
rows that do not have matching values in common columns are nevertheless included in the result table
natural join
same as an equi-join except that one of the duplicate columsn is eliminated in the result table