MIS Final

Ace your homework & exams now with Quizwiz!

Data Management Issues Today

*Massive and increasing amounts of data.* - Data collection is easy. Data arrive faster than they are deleted - Gather data without knowing how they might be useful - Hire business consultants, collect data that pertain to operational questions *Data dispersal.* - Coordinate data management between departments within your organization - Trade off between centralized data storage and federated system - Alternative option: create "federated" data sources that dynamically combine information from constituent systems. *Data fusion.* - Sharing data with outside organizations - Suppliers, retailers, etc. *Security.* - don't want hackers to have access to private information, e.g. credit cards, SSN

Limitations of One Table

*update anomaly* : if a customer calls in reference to loanA and gives you a new address, loanB might still have the old address. *insertion anomaly* : you want to keep a potential customer who doesn't have a loan with you yet, but there is no natural place in the Loan table to store this information. *deletion anomaly* : a customer pays off his last outstanding loan and is consequently deleted from the loan table. As a result, you lose all record of the customer's name and address, and you are now unable to contact them in future.

MS Access

- *Table*: stores data - *Form*: a user interface to view and manipulate data stored in tables - *Report*: a static, non-interactive page that shows data in an organized style - *Query*: generate temporary table based on some conditions (sort, sum, aggregate, etc.)

Point of Sale (POS)

A data collection system that electronically receives and stores bar code information derived from sales transactions.

INNER JOIN Operator

An inner join consists of all combinations of rows selected from two tables that meet some matching condition, formally called a join predicate. table1 INNER JOIN table2 ON condition the condition is usually applied on foreign key-primary key pairs Example: show the customer and order information from both tables SELECT customers.custid, fname, lname, orderid, orderdate FROM customers INNER JOIN orders ON customers.custid = orders.custid; you are specifying to only show combinations where the foreign key pair is correct! same customer id for all rows that are returned. Note: When you use an inner join, the query result will be as many rows as there are instances of your foreign key in the second table!

DELETE FROM

DELETE FROM department; will remove ALL tuples/rows from the table DELETE FROM department WHERE LName = "Smith"; will only remove tuples that match the criteria. DROP TABLE department will remove the whole entire table and its rows from your database! there is no way to reverse this action.

CREATE TABLE

Defines an SQL relation/table. CREATE TABLE tablename ( attribute datatype(#characters), att2 type, att3 type(2), PRIMARY KEY (att2) );

Crow's Foot Notation

Each relationship between entities in the ER Diagram shows how many instances of one entity are associated with how many instances of another entity. *cardinality constraints*: limitations on the relationships between two entities, which, once the entities have been made into tables, also ends up meaning the relationships between the values in the columns that link the two tables together

WHERE: Inequality Predicates

Equal to = Not equal to <> Less than < Less than or equal to <= Greater than > Greater than or equal to >= Example: find students born after 1983. SELECT Fname, Lname, DOB FROM students WHERE dob >= #1/1/1983# ORDER BY lname ASC, fname ASC; Example: sort all students with gpa over 3.5 by highest gpa. SELECT fname, lname, major FROM students WHERE gradepoints/credits >= 3.5 ORDER BY gradepoints/credits DESC; Notice! you can't use alias in where or order by clauses!

Sequence of SQL Operations

FROM (join tables, create table alias) *note* this is why you can use a table alias in the from clause everywhere else in your sql! from alias is made first. WHERE (use criteria to remove rows) SELECT: AGGREGATION (make calculated fields and aggregate fields) HAVING (use criteria to remove aggregated rows) ORDER BY (sort all rows) Drop Columns (select, also where select alias of columns comes in) DISTINCT (removes all rows that are duplicates)

Multiple Table Queries

Field: LOAN.* and CUSTOMER.* to have all rows from loan and customer concatenated together in one table. Adding Criteria: after creating a query with both tables, add required fields and then put criteria under relevant attributes, it will sort for you based on the criteria (ex: only show amount >100)

WHERE: IN Operator

IN operator will check the existence of a value against a list of values provided or constructed in SQL. It usually takes the place of multiple OR operators. The IN operator can be used with text, number and date data, making it suitable for a wide range of applications. Example: List all customers from NY, PA, or CT SELECT lname, fname, city, state FROM customers WHERE state IN ("ny","pa","ct"); *Note*: you can also ask for NOT IN to show everything excluding the list of attribute-values.

GROUP BY Clause

More often than not, we do not want to aggregate information over the entire dataset but over multiple groups of records. Suppose that in the previous query, we do not want a single grand total but the total revenue for each day. To perform this kind of processing, we use another standard SQL feature, the GROUP BY clause. The general syntax of the GROUP BY clause is as follows: SELECT field(s), aggregate function (field of calculated field) FROM table WHERE criteria (optional) GROUP BY field(s) ORDER BY field(s) (optional) Note: if you have attributes in your select clause that are not being put in some aggregation function, they *must* also be put in the group by clause, otherwise your query will return an error! Example: calculate total product units ordered by each customer SELECT lastname, sum(quantity) AS totalunits FROM customers GROUP BY lastname;

Good Database Design Principles

No Redundancy Every field is stored in only one table except in case of foreign keys Replication of foreign keys is okay to join two tables No "Bad" Dependencies Remove partial and transitive dependencies. Need determinant to be a whole primary key or candidate key. Use normalization when necessary.

SQL Queries on Multiple Tables

Note: inner join is only useful for two tables. For 3+ tables, you *must* use a cartesian join with the foreign key statements in the where clause!! Example: you want to see all products who sold more than 10 at once, or were sold on april 20th. SELECT quantity, productname, orderdate FROM orders, orderdetail, product WHERE orders.orderid = orderdetail.orderid AND product.productid = orderdetail.productid AND (quantity >= 10 OR orderdate=#4/20/2013#); *note* you have to have the brackets for all other criteria besides your foreign-key statements so that access is not confused! Another Example: see all products purchased after april 15 and worth more than $500. SELECT p.productid, quantity, productname, orderdate, quantity x unitprice AS lineprice FROM orders, orderdetail, product AS p WHERE orders.orderid = orderdetail.orderid AND product.productid = orderdetail.productid AND (quantity*p.unitprice >= 500 AND orderdate > #4/20/2013#);

Integrity Constraints in SQL

PRIMARY KEY (𝐴_(𝑗_1 ), 𝐴_(𝑗_2 ), ...,𝐴_(𝑗_𝑚 )): The primary key specification says that attributes 𝐴_(𝑗_1 ), 𝐴_(𝑗_2 ), ...,𝐴_(𝑗_𝑚 ) form the primary key for the relation. The primary key attributes are required to be nonnull and unique; that is, no tuple can have a null value for a primary key attribute, and no two tuples in the relation can be equal on all the primary key attributes. FOREIGN KEY (𝐴_(𝑘_1 ), 𝐴_(𝑘_2 ), ...,𝐴_(𝑘_𝑛 )) REFERENCES 𝑠: The foreign key specification says that the values of attributes 𝐴_(𝑘_1 ), 𝐴_(𝑘_2 ), ...,𝐴_(𝑘_𝑛 ) for any tuple in the relation must correspond to values of the primary key attributes of some tuple in relation 𝑠. NOT NULL: The not null constraint on an attribute specifies that the null values is not allowed for that attribute.

Expert System (ES)

Rely on artificial intelligence techniques to support knowledge-intensive decision-making processes Preserves the knowledge of retiring experts in the form of a knowledge base consisting of facts and relationships among the facts Can be distributed and used repeatedly

SELECT: Special Functions

SELECT *** FROM table will give you all the attributes from the table you asked for SELECT "string with information" FROM table is called a *literal*, which will give you the result of a table with one column and as many rows as the number of tuples in the table you requested. Each row will just be the string literal you asked for in the select clause. SELECT ("Dear" + FName + ", your GPA is" + CStr(GradePoints/Credits) + "!") From Students you can use *concatenation* in your literal request so that each row gives you the attributes you wanted in the form of a string sentence "Dear Mary, your GPA is 3.2!"

COUNT()

SELECT Count(OrderID) FROM ORDERS WHERE OrderDate BETWEEN #4/15/2013# AND #4/22/2013#; The count() function, used alone, will calculate the number of records in a dataset as soon as it is used on a field that contains no null values. It is usually applied on the primary key of the table since it is the one field that is certain not to contain nulls.

DISTINCT Predicate

SQL allows duplicates in relations as well as in query results. It is used to eliminate duplicate rows from the results of a SELECT statement. It is used right after the SELECT statement. The column on which we use the DISTINCT predicate will return only its unique values and if we apply it on a combination of columns - lastname and firstname, for example - the database will return the unique combinations of those values. Example: SELECT DISTINCT state FROM students; result will be each state in a list that has appeared at least one time in the table. Won't give duplicate answers for every row. SELECT DISTINCT city, state FROM students; will return every unique combination of city and state Manhattan, NY; Long Island, NY; Trenton, NJ; and so on

ORDER BY: Calculated Attributes

Show all students' name, id, total grade points, total credits earned and GPA. Knowing that GPA = total grade points / total credits. Sort all results by GPA in descending order SELECT LastName, FirstName, ID, GradePoints, Credits, GradePoints/Credits as GPA FROM Students ORDER BY GradePoints/Credits Note: in all clauses except SELECT, you cannot use the rename/alias to do anything! no sorting, no criteria, not allowed

Using WHERE and GROUP BY together

Some times, we want to exclude certain records from our aggregate calculations. We can use the WHERE clause to exclude records we are not interested in working with. Using WHERE, we exclude records before the GROUP BY clause takes effect. In this example we look for order totals by customer, but we want to exclude all products which are cheaper than $10 when we calculate the total amount. SELECT lastname, orderdate, sum(unitprice*quantity) AS OrderTotal FROM customers WHERE UnitPrice >= 10 GROUP BY lastname, orderdate;

ORDER BY Clause

Sort all rows in the query result according to the order of the designated attribute Sort all students' information according to their last name. SELECT * FROM students ORDER BY LName DESC We may specify desc for descending order or asc for ascending order, for each attribute; ascending order is the default. Can sort on multiple attributes, the left one is what is sorted first, then every next to the right is calculated. Example: order by LastName asc, FirstName asc

HAVING Clause

Suppose we are interested in performing the same query, but we want to see only customers who have spent a total of at least $100.00. Here, we need to instead apply a criterion to the result of the aggregation function, which can only be known *after the grouping and aggregation steps*. To apply criteria after grouping and aggregation, SQL provides an additional clause called HAVING. Example: SELECT lname, orderdate, sum(unitpricexquantity) as ordertotal FROM customers WHERE unitprice >= 10 GROUP BY lname, orderdate HAVING sum(unitpricexquantity) >= 100;

Decision Support System (DNS)

System designed to help analyze and make specific kinds of decisions. Help find the optimal course of action and answer "What if?" questions. Rely on models and formulas to produce concise tables or a single number that determines a decision

Data Types in SQL

Text - 255 characters maximum for text, strings Memo - stores up to 65,536 characters of text/string. cannot be sorted! only searched. Byte - allows whole numbers from 0 to 255, takes up 1 byte Integer - allows whole numbers from -32,768 to +32,767, takes 2 bytes Long - allows whole numbers from -+2,147,483,648 and takes 4 bytes Single - single precision floating point, handles most decimals, 4 bytes Double - double precision floating point, most decimals, 8 bytes Currency - holds up the 15 digits of whole dollars plus 4 decimal places, 8 bytes AutoNumber - starting from 1, i++, 4 bytes Date/Time - 8 bytes Yes/No - boolean, no null values allowed, only 0 and 1. 1 bit! Ole Object - stores 1GB of pictures, audio, video, binary large objects (BLOBs) Hyperlink - link to web Lookup Wizard - lets you type a list of options, which then can be chosen from a drop down list, takes up 4 bytes

WHERE: BETWEEN ... AND Operator

The BETWEEN ... AND operator is designed to retrieve subsets from a data set. It works primarily with number and date data. It can be used to retrieve records that exist between two values acting as boundaries. BETWEEN is inclusive which means that the values acting as boundaries will be included in the recordset. Example: Show all orders placed b/w 4-15 and 4-22 in 2013 SELECT orderid, orderdate FROM orders WHERE orderdate BETWEEN #4/15/2013# AND #4/22/2013#; ORDER BY orderdate DESC;

Enterprise Resources System (ERP)

The integrated management of core business processes. Goal is to use a MIS for all operations within the enterprise. Sales, Procurement (SRM), Production (PLM), Distribution (SCM), Accounting, HR, Corporate Governance, Customer Service (CRM), etc.

Normalization

The process of applying rules to a database design to ensure that information is divided into the appropriate tables. Removes partial and transitive anomalies. *First Normal Form*: all table attributes are atomic, one value of each attribute for every instance of the entity (you can't have multiple values in one cell) Not atomic if you have nested relations/tables and subtables, repeating groups/sections, or list-value attributes Ex of not 1NF: CLIENT(ClientID, ClientName, VetID, VetName, PET(PetID, PetName, PetType)) You can fix this and make it 1NF if you separate CLIENT and PET into two separate, un-nested tables. Use ClientID as foreign key to CLIENT in your pet table, and use composite primary key ClientID + PetID since petID is only unique when paired with an owned *Second Normal Form*: must be in 1NF and also have no partial dependencies Partial dependency occurs when you have a composite primary key and a non-key attribute depends on part of the primary key, but not the whole thing! Look back at original example of Member, Activity, Fee. It was in 1NF, can be turned into 2NF by splitting into two separate tables, using Activity as a foreign key. If you have a transitive dependency, you can still be in 2NF *Third Normal Form*: you are in 1NF, 2NF, and you have no transitive dependencies. Transitive Dependency happens when a non-key attribute depends on another non-key attribute, which could not be considered a candidate for primary key in any way. You must split the tables again to remove transitive dependencies! CLIENT(ClientID, Name, VetID) Vetid foreign key to VET VET(VetID, VetName) PET(PetId, ClientID, PetName, PetType) ClientID foreign key to CLIENT Special Note about 3NF: Dependencies on candidate keys do not invalidate 3NF because these are not considered transitive dependencies. Ex: SSN -> birthday, SSN -> Gender, and License <-> SSN. This is okay because SSN and License are both candidate keys.

SELECT: Calculated Fields

The select clause can contain arithmetic expressions involving the operation, +, -, *, and /, and operating on constants or attributes of tuples. Example: Show all students' name, id, total credits earned and total tuition paid for those credits. It is know that each credit costs $800. SQL: SELECT LastName, FirstName, ID, Credits, *Creditsx800* FROM Students;

Data Definition Language (DDL)

The set of relations (i.e., entities and relationships) in a database must be specified to the system by means of a data-definition language (DDL). The SQL data-definition language (DDL) allows the specification of information about relations, including: The schema for each relation. The domain of values associated with each attribute. Integrity constraints And also other information such as The set of indices to be maintained for each relations. Security and authorization information for each relation. The physical storage structure of each relation on disk. DDL is used to create/modify tables.

WHERE: Logical Operators

There are 3 points you need to remember about them: 1. Their task is to produce a subset of records from the total records in a table or query. 2. They have no effect on the columns we retrieve from that table or query. 3. The OR and AND operators can be used individually or in combination with varying results. Like arithmetic expressions, the parts inside brackets will be firstly combined, otherwise all criteria will be combined from left to right. PEMDAS. *OR*: The OR operator is inclusive so that records will be returned for every OR condition satisfied in our query. *AND*: The AND operator on the other hand is exclusive which means it will produce results only when all of the conditions are met. *NOT* The NOT operator will find all rows that match your criteria, and then remove them (essentially return the opposite) ---- Produce a report of customers who live in either hamilton or NYC in new york SELECT * FROM customers WHERE state = "NY" AND (city = "hamilton" OR city = "NYC"); Show all customers from NYC as well as from new jersey SELECT * FROM customers WHERE state = "NJ" OR city = "NYC";

Combo Box

allows you to choose from a drop-down list for a specific attribute 1. open design view, select a field 2. in field properties, click "lookup" 3. click on "text box" and select "combo box" 4. click "row source type", select "value list" - type "optionA";"optionB";"optionC" 5. save and open datasheet view

Foreign Key

an attribute or a collection of attributes whose value are intended to match the primary key of some related record (usually in a foreign table) belongs to the entity with the M cardinality constraint, "points to" the entity with the 1 cardinality constraint

Management Information System (MIS)

an information system used for *decision-making*, and for the coordination, control, analysis, and visualization of information in an organization; especially in a company. (search engines, GPS are examples of consumer information systems, NOT THE SAME)

Computer-Based Information Systems (CBIS)

an organized integration of hardware and software technologies and human elements designed to produce timely, integrated, accurate and useful information for decision making purposes - *Hardware*: devices like the monitor, processor, keyboard to interface with the computer - *Software*: the programs that allow the hardware to process the input data - *Networks*: are a connecting system that allows diverse computers to distribute resources - *Procedures*: Rules for how humans use information systems.

INSERT INTO

command to load data into an existing relation INSERT INTO tablename VALUES(primarykeyvalue, "string", "732-777-8888", True, 112); make sure you type your attribute values in the same order as the attributes are listed in your relation design!

Information Systems

consist of the ways that organizations store, move, organize, and manipulate/process their information

Functional Dependency

every value of the dependent attribute correlates to a SINGLE value of the primary/function attribute. Ex: MemberID has one value of Activity (M001 to Swimming) *Transitive Dependency*: functional dependent whose determinant is not the primary key, part of the primary key, or even a candidate key. Tends to be a bad choice! You want to be dependent on something that is unique to every instance of the entity. *Partial Dependency*: a functional dependency whose determinant is part of the primary key, but not all of it. Ex: MemberID and Activity ----> Fee, fee is partially dependent on MemberID. Causes Anomalies: 1. Deletion Anomaly: if member M001 stops taking skiing, you lose all information about the related fee for skiing 2. Insertion Anomaly: if you add a new activity, you cannot have an empty MemberID to store just the fee. If no one is taking the new activity, you can't add a new row! 3. Update Anomaly: if you change the price of golf, you have to change the fee value for ALL rows which are related to golf! Makes for typing errors or someone could forget to change all the row values. NOTE: transitive dependency also leads to the same anomalies

Form Design

more user-friendly GUI for database record entry design of this does not change design of the underlying database changing entries here will result in the same change being made in the underlying database Note: you can add calculated fields that are not stored in the database itself (Ex: GPA =[GradePoints]/[Credits])

Join

operator which combines tables based on foreign-primary key relationships *Inner Join*: An inner join requires each row in the two joined tables to have some matching column values (usually are the linked primary key and foreign key columns), and is a commonly used join operation in applications. (inner venn diagram overlap) *Left Join*: A left (outer) join returns all the values from an inner join plus all values in the left table that do not match to the right table, including rows with NULL (empty) values in the link column. *Right Join*: A right (outer) join returns all the values from an inner join plus all values in the right table that do not match to the left table, including rows with NULL (empty) values in the link column.

Information

refers to data that are structured and organized to be useful in making a decision or performing some task. "the resolution of uncertainty" Tells you what an entity is, the nature of that entity, and its properties

Entity Relationship Diagram (ERD)

A diagram that depicts an entity relationship model's entities, attributes, and relations *Relational Schema*: database map of entities, contains less information entity - rectangle attribute - oval relationship - diamond

Query Design

extracts information from the database and processes it, then stores in a temporary table ("ephemeral table"/Views/Dynasets) sorting, filtering, aggregation, calculation, etc Sort: ascending, descending (will sort by the leftmost field first, then by subsequent on the right. if a table appears on the right but needs to be first sort, make a hidden copy at very left field!) Calculated Field: in an empty field, enter "FieldName:=[field1]/[field2]" Criteria: filters records based on math (=10, >40, <=7) or based on data ("business" and "economics") *And* for same row, *Or* for different row

Transaction Processing System (TPS)

gather data about everyday business events in "real time" as they occur Fast, accurate, stable Use technology like barcodes and scanners

Structured Query Language (SQL)

language used to write and execute *queries* (data-fetch requests) for your relational database system "the language to manipulate data" IBM developed the original version of SQL, originally called Sequel, as part of the System R project in the early 1970s. The Sequel language has evolved since then, and its name has changed to SQL (Structured Query Language). Many products now support the SQL language. SQL has clearly established itself as the standard relational database language.

Report Design

mainly used for printing data from database, non-interactive. information can be grouped and sorted.

1-to-Many Relationship

one table instance is linked to multiple instances of a foreign table ex: one customer takes out several loans. every loan belongs to one customer. [customer] 1 -<>- M [loan]

Subtype Entity

special relationships used to designate a subgroup within an entity that has special attributes that do not apply to the entity as a whole ex: honor student subtype of student, must have foreign key to student and foreign key to faculty (thesis advisor) ER Diagram shows this as a double rectangle, connected by [student]1 --- 0/1[honor] The primary key of your subtype entity should match the primary key of the parent entity! And it is considered a foreign key to the parent. Don't need to repeat the same attributes, implied that they keep the parent's ones as well as whatever unique ones you add.

Table Design

two-dimensional representation of an entity in a database, aka "relation" rows: tuples/records, correspond to one *instance* of the entity. every row is unique. row order doesn't matter. columns: attributes/fields, record various *properties* of the entity. has a unique field name. order is unimportant. all entries in a column has the same datatype. note: every datacell contains atomic data (no lists, subtables, arrays. one data per one cell)

Validation Rules

used to limit entries to a text box or combo box 1. open design view, select field 2. click "field properties", click "general" 3. under "validation rule", type ="optionA" or "optionB" or "optionC" 4. under "validation text", type a message to the user when an error is found 5. save and open datasheet view

AVG()

we want to know the average difference (over all products) between the units in stock and the units on order SELECT Avg(UnitsinStock - UnitsonOrder) FROM PRODUCT;

SUM()

we want to know the total number of items ordered over the entire history covered by the database. SELECT Sum(Quantity) FROM ORDERDETAIL;

UPDATE

when you add a new column/attribute, you will have a null value for all rows. To add data, use update UPDATE department SET Room = "W102" WHERE FName = "John" AND LName = "Smith";

Multi-Path Relationships

a design having multiple distinct ways to relate records in different tables. These multiple paths are justified because they are semantically different - they mean different things When creating queries, you should only use the joins relevant to the question being asked (don't want to have customer.zip and branch.zip return the same zipcode and it be wrong). you can also add the multi-path table twice so it can have a separate column value for both paths!

Database Management System (DBMS)

a software system that enables users to define, create, maintain and control access to the database "the software that supports database and SQL"

Analytical Processing

a technique that examines stored information and looks for patterns in the data that are not yet known or suspected. computationally intensive, the data being used may not be up to date if records change faster than the program can process. simple: sort/group/sum/calculate complex: data mining, optimization, decision analysis databases cannot do complex tasks, only simple ones

Transactional Processing

a way of changing a database that keeps its records up to date at all times keeps track of day-to-day events add/delete/modify tables can slow down other processing power

ALTER TABLE

allows you to add or remove an attribute from an already existing table ALTER TABLE department ADD Room text(15); ALTER TABLE department DROP Room;

Relational Database

database which breaks datasets into individual pieces or subsets (called tables). each subset has a theme that logically binds the data records and the subset together when retrieving information, you do not need to access all subsets, only the ones with relevant data records (efficient processing)

MIS Advantages

- can perform numerical computations fast - can communicate quickly and accurately - large storage capacity - task automation and portability - data security Cons: - small errors have big impact - data processing unintuitive to users - malfunctions may go unnoticed

Excel Disadvantages

- can't handle big data - reading/writing file will be slow when file is large - can't be synchronized online across many users - hard to develop programs for

Aggregation

"Σ Totals" button leads to Aggregation function. Choice to select a field and run "group by (default), sum, count, etc." If two records are identical in every "group by" column, they are in the same group. If not, they belong to 2 different groups. Each group becomes one row in the query output. For each group, Access displays the common values of the "group by" columns. For the remaining columns, Access combines the values as specified in the "Totals" row. Note: to use a prior aggregation in a new aggregate expression, use NAME: [table.field1]-SUM([table.field2])

Parts of SQL

*Data-definition language (DDL)*: The SQL DDL provides commands for defining relation schemas, deleting relations, and modifying relation schemas. *Data-manipulation language (DML)*: The SQL DML provides the ability to query information from the database and to insert tuples into, delete tuples from, and modify tuples in the database. *Integrity*: The SQL DDL includes commands for specifying integrity constraints that the data stored in the database must satisfy. Updates that violate integrity constraints are disallowed. *View definition*: The SQL includes commands for defining views. *Transaction control*: SQL includes commands for specifying the beginning and ending of transactions. *Embedded SQL and dynamic SQL*: Embedded and dynamic SQL define how SQL statements can be embedded within general-purpose programming languages, such as C, C++, and Java. *Authorization*: The SQL DDL includes commands for specifying access rights to relations and views.

Data Manipulation Language (DML)

A typical SQL query has the form: SELECT sum(att), att2, att3 FROM table, t2, t3 WHERE primary-foreign key statement AND (predicate/criteria) GROUP BY att2, att3 HAVING sum(att) <> 0 ORDER BY att2 asc, sum(att) desc; The result of an SQL query is a relation with the sorted data.

SQL Aggregation Functions

Aggregate functions like sum(), avg(), count(), count(*), max(), min() have the main goal of summarizing data among several rows into a single piece of data.

Primary Key

An attribute or collection of attributes whose values uniquely identify each row in a table/relation should be minimal: it should not contain unnecessary attributes *composite key*: "concatenated" key. a multi-attribute key (the combination of both attributes generates a unique identifier for the row/instance)

Cartesian Joins

In addition to INNER JOIN expressions, the FROM clause of a SELECT statement may contain a list of table names separated by commas. For example, it is possible to write: SELECT customers.custid, fname, lname, orders.custid, orderid, orderdate FROM customers, orders; However, this query does not behave the same way as the first query in the previous section. The construction "CUSTOMERS, ORDERS" in the FROM clause instructs SQL to form the table obtained by combining *every possible row* of the CUSTOMER table with every possible row of the ORDERS table, without checking whether the foreign key values linking the two tables match. To turn the cartesian join into an inner join, make a primary-foreign key statement in the WHERE clause! SELECT ... FROM customers, orders WHERE customers.custid = orders.custid; *Note*: If you have 3 rows of left table, 4 rows of right table, cartesian join will give you 3*4 = 12 rows of information. Some of these rows are meaningful and others are not. The ones where Left.PrimaryKey and Right.ForeignKey match means that the row is meaningful.

Synthetic Key

Sometimes we may invent a new attribute to serve as a primary key if no suitable primary key is available, or to avoid composite keys In Access, "Autonumber" fields can serve this purpose

SELECT: Rename Operation

The SQL allows renaming attributes and tables using the as clause: old-name *AS* new-name Example: SELECT LastName, FirstName, ID, Credits, *Creditsx800 AS TotalTuition* FROM Students;

TOP Predicate

The TOP predicate used alone is not useful since it does not return records in any special order. It simply returns the first n or n% or the records as they are stored in the table. Must combine with order by, group by, and where clauses. Example: Show the top 5 students who have earned the most credits so far! SELECT TOP FirstName, LastName, Credits FROM students ORDER BY credits *DESC*; Note, the TOP predicate will find the value of the nth row and then include all rows that "tie" or have the same value as the nth row! So you may get the top 6 or 7 if many people have identical credit amount. SELECT TOP 5 PERCENT Fname, Lname, Credits FROM students ORDER BY credits DESC; Top will take the total number of rows in the table, multiply by 5%, and then round UP if there is a decimal (if you round down 0.5 you get no answer! avoid by rounding up). Again, if the last row has a "tie" it will include matching rows.

WHERE Clause

The WHERE clause specifies criteria's that each row in the result must satisfy The WHERE clause comes before the ORDER BY clause Find all students who majors in business SELECT FirstName, LastName, Major FROM Student WHERE Major = 'Business'; Fun Example: show all the field names from the orders table SELECT * FROM orders WHERE 0=1; since the criteria 0=1 is never true, you will get a relation with all the fields from the original orders table, but showing no rows!

Outline Notation

Underline all parts of primary key only (concatenated key) Foreign key statement underneath TABLE1(_table1id_, name, order, ...) TABLE2(_table2id_, ..., table1id) table1id foreign key to TABLE1

Many-to-Many Relationships

Use an intermediate table between two entities with this relationship. intermediate table represents two one-to-many relationships. INTERMEDIATE(_primary1_, _primary2_, ... related attributes) primary1 foreign key to ENTITY1 primary2 foreign key to ENTITY2

OUTER JOIN: Left and Right

Use outer joins for aggregation!! the LEFT JOIN syntax specifies that the join result always contain at least one representative of each record in the left table, even if there is no matching record in the right‐hand table to satisfy the ON condition. When there is no matching query in the right‐hand table, SQL supplies blank values for its attributes. RIGHT JOIN syntax specifies that the join result always contain at least one representative of each record in the right table, and gives a null value where there is no matching instance in the left table Example: SELECT fname, lname, nickname FROM coach LEFT JOIN team ON coach.teamid = team.teamid you will have a result of fname, lname, nickname with all your coaches listed, and only as many nicknames as there are teams with coachid assigned! for coaches with no teamid assigned, you get null value in team.nickname.

SQL Aggregation on Join Tables

We want to display the total revenue for every day for every customer. SELECT customer.custid, orderdate, sum(unitpricexquantity) as revenue FROM customer, orders, orderdetail, products WHERE customer.custid = orders.custid AND orders.orderid = orderdetail.orderid AND products.productid = orderdetail.productid AND (orderdate BETWEEN #4/22/2013# AND #4/29/2013#) GROUP BY customer.custid, orderdate HAVING sum(unitpricexquantity) >= 100 ORDER BY sum(unitpricexquantity) DESC;

Referential Integrity

When enforcing referential integrity we tell Access that the values displayed in the foreign key for a given table must be present in some records of the table to which that foreign key is pointing to. *Cascading Updates*: propagating a change from the parent's primary key to the children's foreign key. *Cascading Deletes*: you can delete a record on the one side of the relationship and all the related records on the many side will also be deleted


Related study sets

Chapter 23 The Rock Record------Relative Dating

View Set

Data Analytics: Chapter 8: Sampling Distributions and Estimation

View Set

Anatomy: Chapter 6 Check Point Quiz

View Set

Real Estate Agency in Practice study set

View Set

Nationalism and Industrialism Topic Test

View Set

5.01 Quiz: Solving Perfect Square Equations

View Set

Music of the people: The Beatles

View Set