Advanced Database Systems

What are the aims of DBMS?

-Provide users with an abstract view of data -Hide certain details of data from different users -No details about how data is stored

When are pairs of operations not in conflict?

-When two transactions only read some data item. -When two transactions read or write completely different data items
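
The rule above can be sketched as a small predicate. This is a minimal illustration, assuming each operation is described by a hypothetical `Op` tuple (transaction, kind, data item); none of these names come from the course material.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str   # transaction identifier, e.g. "T1" (illustrative)
    kind: str  # "r" for read, "w" for write
    item: str  # name of the data item, e.g. "x"


def in_conflict(a: Op, b: Op) -> bool:
    """Two operations conflict when they belong to different transactions,
    touch the same data item, and at least one of them is a write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)
```

Two reads of the same item, or any two operations on different items, return `False`, which matches the two cases listed in the answer.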

What are the main advantages of the Enhanced ER model?

-avoid describing similar concepts more than once -have relations that include a subclass but not the superclass -more semantic information to the design of a form

What is a recoverable schedule?

-Once a transaction T has committed, it should never be necessary to roll T back.

Semi-structured data

-self describing data -schema information is mixed with data values

Unstructured data

-very limited indication of the type of data

Several actions are necessary to create a database. Place these in the correct order: 1. Create the data dictionary views 2. Create the parameter file 3. Create the password file 4. Issue the CREATE DATABASE command 5. Issue the STARTUP command

2. Create the parameter file 3. Create the password file 5. Issue the STARTUP command 4. Issue the CREATE DATABASE command 1. Create the data dictionary views

Page 211 Question 1

A = Segment B = Extent C = Oracle Block D = Datafile

Database management system (DBMS)

A central set of common functions for managing a database

Field

A character or group of characters that have a specific meaning, used to define and store data

What is a database?

A collection of files storing related data.

File

A collection of related records

Primary Key

A column in a relational database whose values must be unique for each row

Foreign Key

A column or field that is a primary key in another table

Surrogate Key

A column you can create to be the record's primary key identifier

Row

A data set representing a single item

Multidimensional data model (OLAP)

A database is a set of facts (points) in a multidimensional space. Consists of fact tables and dimension tables

Relationship

A logical connection between different tables

Record

A logically connected set of one or more fields that describes a person, place, or thing

Database administrator (DBA)

A person charged with administering and maintaining the database

Database

A place to store and manage application data

Data Dictionary views

A program that allows the user to view objects by reference

Transaction log

A record of all related changes in the computer's memory

What is a non serial schedule?

A schedule where the operations of some transactions are interleaved.

What is a serial schedule?

A schedule where the operations of two transactions are not interleaved.

What is a schedule?

A sequence of operations from a set of n transactions T1,T2,...Tn such that: the order of the operations in each transaction Ti is preserved in the schedule.

Foreign key

A set of attributes within a relation R1 that matches the primary key of a relation R2.

Table

A set of rows sharing the same attributes

What files are generated when you choose the option to Generate Database Creation Scripts in the Database Configuration Assistant?

A shell script SQL scripts A parameter file A password file

What is a cascading rollback?

A single transaction failure leading to a series of transaction rollbacks

Triggers

A statement that is executed automatically by the system as a side effect of a modification to the database.

Normalization

A step-by-step process to eliminate data redundancy 1NF, 2NF, 3NF

Field

A table consists of several records, each record can be broken into several smaller entities

Raw facts

A telephone number, a birth date, a customer name, and a year to date (YTD) sales value

What is the uncommitted dependency (dirty data) problem?

A transaction is allowed to see the intermediate results of another transaction before it has committed

Roll back

A transaction log reverses back to a state before any change occurred

What is Consistency?

A transaction must transform the database from one consistent state to another consistent state.

What is the inconsistent analysis problem?

A transaction reads some values while they are being updated by another transaction.

Composite Key

A unique key created by combining two or more columns, usually containing fields that are primary keys in other tables

Oracle Net

A utility that enables network communication between the client and the server; used by Oracle servers

What are the 4 properties of a transaction?

ACID: Atomicity Consistency Isolation Durability

Which of these actions will not be recorded in the alert log? A. ALTER DATABASE commands B. ALTER SESSION commands C. ALTER SYSTEM commands D. Archiving an online redo log file E. Creating a tablespace F. Creating a user

ALTER SESSION commands Creating a user

Changing Column's Data Type

ALTER can be used to change data type § Some RDBMSs do not permit changes to data types unless column is empty § Syntax - § ALTER TABLE tablename MODIFY (columnname(datatype)) ;

Null

Absence of any data value that could represent: An unknown attribute value A known, but missing, attribute value An inapplicable condition

At what point can you not choose or change the database character set?

After database creation, using DBCA to install options.

Operations on Multidimensional Data Model

Aggregation (roll-up), navigation (drill-down), selection (slice), calculation, ranking, etc

Objects

All database objects

ALL operator

Allows comparison of a single value with a list of values returned by the first subquery § Uses a comparison operator other than equals

What is a transaction?

An action carried out by the user/ program that reads or updates the database.

How do we guarantee serializability with locks?

An additional protocol controls the positioning of locks. Two phase locking (2PL): -for every single transaction, all locking operations occur before unlocking operations
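
A minimal sketch of checking the 2PL rule for one transaction, assuming its lock activity is given as a list of hypothetical `(action, item)` pairs where the action is `"lock"` or `"unlock"`. It only checks the ordering rule stated above; it does not model lock modes or lock compatibility between transactions.

```python
def follows_two_phase_locking(actions):
    """Return True if, within this single transaction, no lock is
    acquired after the first unlock (growing phase, then shrinking phase)."""
    shrinking = False
    for action, _item in actions:
        if action == "unlock":
            shrinking = True       # the shrinking phase has started
        elif action == "lock" and shrinking:
            return False           # a lock after an unlock violates 2PL
    return True
```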

Determinant

Any attribute whose value determines other values within a row

Relationships

Association between entities that always operate in both directions

Primary key (PK)

Attribute or combination of attributes that uniquely identifies any given row

Foreign Key

Attribute or combination of attributes in one table whose values must either match the primary key in another table or be null

Super Key

Attribute or combination of attributes that uniquely identifies each row in a table

Levels of Database Backups

Backups are provided with high security

BLOB

Binary Large Object stores up to 4 GB of data

Command

CREATE SCHEMA AUTHORIZATION {creator}; § Seldom used directly

Syntax to create table

CREATE TABLE tablename();

Q.43 In the figure below, Customer_ID in the CUSTOMER Table is which type of key?

Candidate

Primary Key

Candidate key selected to uniquely identify all other attribute values in any given row; cannot contain null entries

Alternate key

Candidate keys not selected

Atomic attribute

Cannot be further subdivided

Business intelligence

Captures and processes business data to generate information that support decision making

What is a problem of timestamp protocol? What is the solution?

Cascading rollback. Solution: -A transaction is structured such that all its writes are performed at the end of its processing -All writes of a transaction form an atomic action

Database management system (DBMS)

Collection of programs, Manages the database structure, Controls access to data stored in the database

NoSQL Disadvantages

Complex programming is required, There is no relationship support, There is no transaction integrity support, In terms of data consistency, it provides an eventually consistent model

Main points of conservative method and optimistic method?

Conservative: delay the transactions when they are in conflict Optimistic: assume that transactions are rarely in conflict. Check for conflicts just before the transaction commits

Keys

Consist of one or more attributes that determine other attributes

Disjoint subtypes

Contain a unique subset of the supertype entity set, Known as nonoverlapping subtypes, Implementation is based on the value of the subtype discriminator attribute in the supertype

Processing mismatch

Conventional programming languages process one data element at a time § Newer programming environments manipulate data sets in a cohesive manner

What operation cannot be applied to a tablespace after creation?

Convert from manual segment space management to automatic segment space management

Conviction (Association Rules)

Conviction (A => B) = Pr(A)*Pr(not B) / Pr(A ∧ not B)
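
As an illustration of the formula above, the sketch below estimates each probability as a relative frequency over a list of transactions (each a set of items). The function name and the choice to return infinity when A never occurs without B are assumptions made for this example.

```python
def conviction(transactions, a, b):
    """Conviction(A => B) = Pr(A) * Pr(not B) / Pr(A and not B),
    with probabilities estimated as relative frequencies."""
    n = len(transactions)
    p_a = sum(1 for t in transactions if a in t) / n
    p_not_b = sum(1 for t in transactions if b not in t) / n
    p_a_not_b = sum(1 for t in transactions if a in t and b not in t) / n
    if p_a_not_b == 0:
        # Assumption: treat a rule that is never violated as infinite conviction.
        return float("inf")
    return p_a * p_not_b / p_a_not_b
```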

Factors Affecting Software Purchasing Decision

Cost DBMS features and tools Underlying model Portability DBMS hardware requirements

Which of these operations can be accomplished with the DBCA? A. Create a database B. Remove a database C. Upgrade a database D. Add database options E. Remove database options

Create a database, Remove a database, Add database options

Developing an ER Diagram

Create a detailed narrative of the organization's description of operations

Database Developer

Create and maintain database based applications, programming, database fundamentals, SQL

Cloud database

Created and maintained using cloud data services that provide defined performance measures for the database

Oracle application server

Creates a World Wide Web site that allows users to access Oracle databases and create dynamic web pages

Virtualization

Creates logical representations of computing resources independent of underlying physical computing resources

Which view will list all tables in the database?

DBA_TABLES

Which views could you query to find out about the temporary tablespaces and the files that make them up?

DBA_TABLESPACES DBA_TEMP_FILES V$TABLESPACE V$TEMPFILE

What instance parameter cannot be changed after database creation?

DB_BLOCK_SIZE

Which parameter controls the location of background process trace files?

DIAGNOSTIC_DEST

Listing Unique Values

DISTINCT clause: Produces list of values that are unique § Syntax - SELECT DISTINCT columnlist FROM tablelist ; § Access places nulls at the top of the list § Oracle places it at the bottom § Placement of nulls does not affect list contents

Which of these commands can be executed against a table in a read-only tablespace? A. DELETE B. DROP C. INSERT D. TRUNCATE E. UPDATE

DROP

DML

Data Manipulation Language: insertion of new data modification of stored data retrieval of data deletion of data

Data dictionary management

Data dictionary

________ is a component of the relational data model included to specify business rules to maintain the integrity of data when they are manipulated.

Data integrity

Distributed database

Data is distributed across different sites

Centralized database

Data is located at a single site

First Normal Form (1NF)

Data is structured to have a primary key and no repeating groups

Data type mismatch

Data types provided by SQL might not match data types used in different host languages

Time-Variant Data

Data whose values change over time and for which a history of the data changes must be retained

USERS

Database Users

VIEWS

Database Views

Performance Factors of an Information System

Database design and implementation Application design and implementation Administrative procedures

Database Design Challenges: Conflicting Goals

Database design must conform to design standards

Audit trail

A record of logins and logouts and of the tables that are accessed

TABLES

Database tables

ERD depicts the

Database's main components, Entities, Attributes, Relationships,

Which files must be synchronized for a database to open?

Datafiles, online redo log files, and the controlfile

SQL Functions

Date and time functions Numeric functions String functions Conversion functions

Structured Query Language (SQL)

De facto query language and data access standard supported by the majority of DBMS vendors

DROP TRIGGER trigger_name command

Deletes a trigger without deleting the table

DROP TABLE

Deletes table from database § Syntax - DROP TABLE tablename ;

Truncate table tablename

Deleting the data in the table

Attributes

Describe the properties of an object

Relationship

Describes an association among entities

Cloud Computing Data Architect

Design and implement the infrastructure for next generation cloud database systems, internet technologies, cloud storage technologies, data security, performance tuning, large databases, etc.

Database Architect

Design and implementation of database environments, DBMS fundamentals, data modeling, SQL, hardware knowledge, etc.

Database Designer

Design and maintain databases, system design, database design, SQL

Foreign key

Do not need to have unique values in the referencing relation

Optional attribute

Does not require a value, can be left empty

Relational model aimed at logical level

Does not require physical

Index

Each index is associated with only one table

Entity Integrity Purpose

Each row will have a unique identity, and foreign key values can properly reference primary key values

Inheritance

Enables an entity subtype to inherit attributes and relationships of the supertype, All entity subtypes inherit their primary key attribute from their supertype, At the implementation level, supertype and its subtype(s) maintain a 1:1 relationship, Entity subtypes inherit all relationships in which supertype entity participates, Lower-level subtypes inherit all attributes and relationships from its upper-level supertypes

Backup and recovery management

Enables recovery of the database after a failure

Schema data definition language (DDL)

Enables the database administrator to define the schema components

The External Model

End users' view of the data environment, ER diagrams are used to represent the external views

Performance tuning

Ensures efficient performance of the database in terms of storage and access speed

SQL Constraints

NOT NULL: ensures that a column does not accept nulls • UNIQUE: ensures that all values in a column are unique • DEFAULT: assigns a value to an attribute when a new row is added to the table • CHECK: validates data when an attribute value is entered
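
The four constraints can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the `employee` table, its columns, and the `rejected` helper are invented for illustration, and other RDBMSs may report violations differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        email  TEXT NOT NULL UNIQUE,        -- no nulls, no duplicates
        status TEXT DEFAULT 'active',       -- filled in when omitted
        salary REAL CHECK (salary >= 0)     -- validated on entry
    )
""")
conn.execute(
    "INSERT INTO employee (emp_id, email, salary) VALUES (1, 'a@example.com', 100)"
)


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False
```

For instance, inserting a second row with the same email or a negative salary is rejected, while the first row picks up the default status.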

Full functional dependence

Entire collection of attributes in the determinant is necessary for the relationship

Participants

Entities that participate in a relationship

Segments

Equivalent of a file system's record type

SQL engine

Executes all queries

Cardinality

Expresses the minimum and maximum number of entity occurrences associated with one occurrence of related entity

HAVING Clause

Extension of GROUP BY feature § Applied to output of GROUP BY operation § Used in conjunction with GROUP BY clause in second SQL command set § Similar to WHERE clause in SELECT statement

Theta join

Extension of natural join, denoted by adding a theta subscript after the JOIN symbol

Logical View of Data

Facilitated by the creation of data relationships based on a logical construct called a relation

Proper naming

Facilitates communication between parties, Promotes self-documentation

Well designed database

Facilitates data management, Generates accurate and valuable information

Structural independence

File structure is changed without affecting the application's ability to access the data

Big Data Aims to

Find new and better ways to manage large amounts of web and sensor-generated data, Provide high performance and scalability at a reasonable cost

Normalization Normal forms

First normal form (1NF) Second normal form (2NF) Third normal form (3NF)

CHAR

Fixed-length character data. Syntax - column_name CHAR(maximum_size), where maximum_size is in the range 0-2000

Database Design

Focuses on the design of the database structure that will be used to store and manage end user data, Well designed database, Poorly designed database causes difficult to trace errors

ALTER TABLE command

Followed by a keyword that produces the specific change one wants to make § Options include ADD, MODIFY, and DROP

Design Case 1: Implementing 1:1 Relationships

Foreign keys work with primary keys to properly implement relationships in relational model

Grouping Data

Frequency distributions created by GROUP BY clause within SELECT statement § Syntax - SELECT columnlist FROM tablelist [WHERE conditionlist ] [GROUP BY columnlist ] [HAVING conditionlist ] [ORDER BY columnlist [ASC | DESC]];
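
A small runnable illustration of the syntax above, using Python's built-in `sqlite3` module. The `product` table and its rows are made up for the example: WHERE filters rows before grouping, and HAVING filters the groups afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, v_code INTEGER, p_price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("A", 1, 10.0), ("B", 1, 20.0), ("C", 2, 5.0), ("D", 3, 8.0), ("E", 3, 12.0)],
)

# Count products and average price per vendor, keeping only vendors
# that supply at least two products.
rows = conn.execute("""
    SELECT v_code, COUNT(*), AVG(p_price)
    FROM product
    WHERE p_price > 0
    GROUP BY v_code
    HAVING COUNT(*) >= 2
    ORDER BY v_code ASC
""").fetchall()
```

Vendor 2 supplies a single product, so its group is dropped by the HAVING condition.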

Partial dependency

Functional dependence in which the determinant is only part of the primary key Assumption - One candidate key Straight forward Easy to identify

SQL Functions

Functions always use a numerical, date, or string value § Value may be part of a command or may be an attribute located in a table § Function may appear anywhere in an SQL statement where a value or an attribute can be used

Data Management

Generation, storage, and retrieval of data

The Entity Relationship Model

Graphical representation of entities and their relationships in a database structure

Repeating group

Group of multiple entries of same type can exist for any single key attribute occurrence, Existence proves the presence of data redundancies

HAVING subqueries

HAVING clause restricts the output of a GROUP BY query by applying conditional criteria to the grouped rows

HTML vs XML

HTML describes the presentation of data XML describes the content of data

Outer join

Have common values in common columns or have no matching values

Edgar Frank "Ted" Codd

An English computer scientist who invented the relational model for database management while working for IBM

Database Consultant

Help companies leverage database technologies to improve business processes and achieve specific goals, database fundamentals, data modeling, database design, SQL, DBMS, hardware vendor specific technologies, etc.

Syntax

INSERT INTO tablename SELECT columnlist FROM tablename ;

Developing an ER Diagram

Identify business rules based on the descriptions

Developing an ER Diagram

Identify main entities and relationships from the business rules

Developing an ER Diagram

Identify the attributes and primary keys that adequately describe entities

Fully functional dependence (composite key)

If attribute B is functionally dependent on a composite key A but not on any Subset of that composite key, the attribute B is fully functionally dependent on A.

When is a schedule cascadeless?

If cascade rollbacks cannot occur

How can a tablespace be made larger?

If it is a SMALLFILE tablespace, add files Resize the existing file(s)

When is a non-serial schedule serializable?

If it produces a database state that can be produced by some serial execution of the same transactions
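
One common way to test this in practice is conflict serializability, which is a sufficient (not necessary) condition for the definition above: build a precedence graph from conflicting operations and check that it has no cycle. The sketch below assumes a schedule is a list of hypothetical `(transaction, kind, item)` tuples with kind `"r"` or `"w"`.

```python
def precedence_edges(schedule):
    """Edge (Ti, Tj) means an operation of Ti conflicts with, and
    precedes, an operation of Tj."""
    edges = set()
    for i, (t1, k1, x1) in enumerate(schedule):
        for t2, k2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                edges.add((t1, t2))
    return edges


def is_conflict_serializable(schedule):
    """True if the precedence graph is acyclic."""
    edges = precedence_edges(schedule)
    nodes = {t for t, _, _ in schedule}
    while nodes:
        # Transactions with no incoming edge from a remaining transaction
        # can be placed next in an equivalent serial order.
        free = {n for n in nodes
                if not any(dst == n and src in nodes for src, dst in edges)}
        if not free:
            return False  # every remaining node is on or behind a cycle
        nodes -= free
    return True
```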

What conditions are needed for a recoverable schedule?

A schedule S is recoverable if no transaction T in S commits until every transaction T' from which T has read has committed.
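
A minimal sketch of this condition, assuming a schedule is a list of hypothetical `(transaction, kind, item)` tuples with kind `"r"`, `"w"`, or `"c"` (commit, item `None`). For simplicity it ignores aborts and treats the most recent writer of an item as the transaction that a read "reads from".

```python
def is_recoverable(schedule):
    last_writer = {}   # data item -> transaction that wrote it most recently
    reads_from = {}    # transaction -> set of transactions it has read from
    committed = set()
    for txn, kind, item in schedule:
        if kind == "w":
            last_writer[item] = txn
        elif kind == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif kind == "c":
            # T may commit only after every T' it read from has committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True
```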

Order of rows and columns

Immaterial to the dbms

Oracle Sequences

Independent object in the database § Have a name and can be used anywhere a value expected § Not tied to a table or column § Generate a numeric value that can be assigned to any column in any table § Table attribute with an assigned value can be edited and modified § Can be created and deleted any time

Unique index

Index key can have only one pointer value associated with it

Index key

Index's reference point that leads to data location identified by the key

Single Data Value

Intersection of a row and column

Primary key

Is a candidate key that is most appropriate to become main key of the table

Foreign key

Is a field in a relational table that matches the primary key column of another table

Data integrity

Is a fundamental component of information security

Composite index:

Is based on two or more attributes § Prevents data duplication

Conversion to Third Normal Form Table is in 3NF when it:

Is in 2NF Contains no transitive dependencies

Domain

It can be considered a constraint on a value of the attribute

Unstructured data

It exists in their original state

One to one

It exists when one row in a table may be linked with one row in another table and vice versa

One to many

It exists when one row in table A may be linked with many rows in table B but one row in table B is linked to only one row in table A

Relational database

It is a collection of data items organized as a set of formally described tables from which data can be accessed or reassembled in many different ways without having to reorganize the database tables

Database

It is a collection of logically related data

Schema

It is the overall design of a database

Structured query language

Meaning of SQL

Stored function

Named group of procedural and SQL statements that returns a value § As indicated by a RETURN statement in its program code

Unnormalized

No primary keys and redundant data

Partial completeness

Not every supertype occurrence is a member of a subtype

Arithmetic operators

Perform: § Operations within parentheses § Power operations § Multiplications and divisions § Additions and subtractions

In which data model would a code table appear?

Physical

Data types

Primitive data types

UNION ALL

Produces a relation that retains duplicate rows § Can be used to unite more than two queries

Data quality

Promoting accuracy, validity, and timeliness of data

Public Key

Publicly published key used to encrypt data, but cannot be used to decrypt data.

Subquery

Query embedded/nested inside another query

Database access languages and application programming interfaces

Query language, Structured query language (SQL)

Existence independence

Referred to as a strong entity or regular entity

Data integrity

Refers to the accuracy and consistency of data stored in a database

Recursive relationship

Relationship exists between occurrences of the same entity set

The Internal Model

Representing database as seen by the DBMS mapping conceptual model to the DBMS, Is software dependent and hardware independent

The Conceptual Model

Represents a global view of the entire database by the entire organization, Has a macro-level view of data environment, Is software and hardware independent

Extensible Markup Language (XML)

Represents data elements in textual format

Client database processes

Requests for data across the network

Design Case 2: Maintaining History of Time-Variant Data

Requires creating a new entity in a 1:M relationship with the original entity, New entity contains the new value, date of the change, and other pertinent attribute

Hierarchical Model Disadvantages

Requires knowledge of physical data storage characteristics, Navigational system requires knowledge of hierarchical path, Changes in structure require changes in all application programs, Implementation limitations, No data definition, Lack of standards

Class hierarchy

Resembles an upside-down tree in which each class has only one parent

Tuple

Rows

Specific integrity rule

Rules that apply to a particular relation

General integrity rule

Rules that apply to all relations or database

Constraints

Rules that restrict the data values that can be entered into a column in a database table

Desktop database

Runs on PC

Attribute List Subqueries

SELECT statement uses attribute list to indicate what columns to project in the resulting set § Column alias cannot be used in attribute list computation if alias is defined in the same attribute list

Comparison Operators: Computed Columns and Column Aliases

SQL accepts any valid expressions/formulas in the computed columns § Computed column, an alias, and date arithmetic can be used in a single query

Relational Set Operators

SQL data manipulation commands are set-oriented § UNION, INTERSECT, and EXCEPT (MINUS) work properly when relations are union-compatible

Homonym

Same name is used to label different attributes

SEQUENCES

Sequences used to generate surrogate key values automatically

Triggering event

Statement that causes the trigger to execute

Conf(LHS => RHS) (in terms of supports)

Supp(LHS ∪ RHS) / Supp(LHS)
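
The ratio above can be computed directly from a list of transactions. In this sketch, support is taken as the fraction of transactions containing every item of an itemset; the function names are illustrative, and the code assumes the left-hand side occurs in at least one transaction.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Conf(LHS => RHS) = Supp(LHS ∪ RHS) / Supp(LHS).
    Assumes Supp(LHS) > 0; otherwise this raises ZeroDivisionError."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)
```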

Extended relational data model (ERDM)

Supports OO features and complex data representation

Workgroup databases

Supports a small number of users or a specific department

Database Design

Supports company's operations and objectives Checks the ultimate final product from all perspectives Pointers for examining completion procedures Data component is an element of whole system System analysts/programmers design procedures to convert data into information Database design is an iterative process

COMMIT: Command to save changes

Syntax - COMMIT [WORK]; • Ensures database update integrity

DELETE: Command to delete

Syntax - DELETE FROM tablename [WHERE conditionlist ];

Data Manipulation Commands INSERT: Command to insert data into table

Syntax - INSERT INTO tablename VALUES(); • Used to add table rows with NULL and NOT NULL attributes

ROLLBACK: Command to restore the database

Syntax - ROLLBACK; • Undoes the changes since last COMMIT command

SELECT: Command to list the contents

Syntax - SELECT columnlist FROM tablename ; • Wildcard character (*): Substitute for other characters/command

UPDATE: Command to modify data

Syntax - UPDATE tablename SET columnname = expression [, columnname = expression ] [WHERE conditionlist ];

Which protocols can Oracle Net 12c use?

TCP, SDP, TCP with secure sockets, Named Pipes

Which of these are types of segment? A. Sequence B. Stored procedure C. Table D. Table partition E. View

Table Table partition

Attribute or combination of attributes

Table must have to uniquely identify each row

Difference

Tables must be union-compatible to yield valid results

Intersect

Tables must be union-compatible to yield valid results

Base tables

Tables on which the view is based

Union-compatible

Tables share the same number of columns, and their corresponding columns share compatible domains

Logical design

Task of creating a conceptual data model

How does multiversion timestamp work?

Tell me!

Connectivity

Term used to label the relationship types

Q.15 The figure below is an example of mapping which type of relationship?

Ternary

What tools can be used to manage templates?

The Database Configuration Assistant

What happens when a user issues a COMMIT?

The LGWR flushes the log buffer to the online redo log

Which of these tools can configure a listener.ora file? A. The Database Configuration Assistant B. Database Express C. The lsnrctl utility D. The Net Configuration Assistant E. The Net Manager

The Net Configuration Assistant The Net Manager

Data independence

The ability to modify a scheme definition in one level without affecting a scheme definition in a higher level

Logical data independence

The ability to modify the conceptual scheme without causing application programs to be rewritten

Functional dependence

The attribute B is fully functionally dependent on the attribute A if each value of A determines one and only one value of B

Primary key

The candidate key selected to identify rows uniquely with the table

Schema objects (database objects)

The data within a user schema

What memory structures are a required part of the SGA?

The database buffer cache The log buffer The shared pool

You shut down your instance with SHUTDOWN IMMEDIATE. What will happen on the next startup?

The database will open without recovery

Which SGA memory structure(s) cannot be resized automatically after instance startup?

The log buffer

Which SGA memory structure(s) cannot be resized dynamically after instance startup?

The log buffer

Structured Query Language

The standard query language for relational databases is Structured Query Language

If a statement is suspended because of a space error, what will happen when the problem is fixed?

The statement will continue executing from the point it had reached immediately after the problem is fixed

Attribute

The table columns

You issue the command SHUTDOWN, and it seems to hang. What could be the reason?

There are other sessions logged on

Surrogate Keys Primary key used to simplify the identification of entity instances are useful when:

There is no natural key, Selected candidate key has embedded semantic contents or is too long

Which statement is correct regarding the online redo log?

There must be at least two log file groups, with at least one member each

Ternary relationship

Three entities are associated

How to handle deadlocks?

Timeouts: -a transaction that requests a lock waits for at most a system-defined period of time. Deadlock detection: -construct a wait-for graph -one node for each transaction -a cycle indicates a deadlock
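
The deadlock-detection idea can be sketched as cycle detection on a wait-for graph. Here the graph is assumed to be a dictionary mapping each transaction to the set of transactions it is waiting on; the representation and function name are illustrative only.

```python
def has_deadlock(wait_for):
    """Depth-first search for a cycle in the wait-for graph."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}

    def visit(t):
        colour[t] = GREY
        for u in wait_for.get(t, ()):
            state = colour.get(u, WHITE)
            if state == GREY:
                return True        # back edge: u is already on the path
            if state == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in wait_for)
```

Once a cycle is found, a DBMS would typically pick one transaction on it as the victim to roll back; that choice is outside this sketch.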

Data transformation and presentation

Transforms entered data to conform to required data structures

One to one One to many Many to one Many to many

Types of relationship between entities

Advanced Data Updates

UPDATE command updates only data in existing rows § If a relationship is established between entries and existing columns, the relationship can assign values to appropriate slots § Arithmetic operators are useful in data updates § In Oracle, ROLLBACK command undoes uncommitted changes (in the source example, those made by the last two UPDATE statements)

Select (Restrict)

Unary operator that yields a horizontal subset of a table

Disadvantages of Not Storing Derived Attributes

Uses CPU processing cycles, Increases data access time, Adds coding complexity to queries

Entity relationship diagram (ERD)

Uses graphic representations to model database components

WHERE Subqueries

Uses inner SELECT subquery on the right side of a WHERE comparison expression § Value generated by the subquery must be of a comparable data type § If the query returns more than a single value, the DBMS will generate an error § Can be used in combination with joins

Divide

Uses one 2-column table as the dividend and one single-column table as the divisor
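
A minimal sketch of the DIVIDE operator, assuming the dividend is a list of `(shared, other)` pairs whose first column matches the divisor's single column. The result is the set of `other` values that are paired with every value in the divisor. Names and sample data are invented for illustration.

```python
def divide(dividend, divisor):
    """Relational division of a 2-column table by a 1-column table."""
    divisor = set(divisor)
    groups = {}
    for shared, other in dividend:
        # Collect, for each 'other' value, the shared values it appears with.
        groups.setdefault(other, set()).add(shared)
    return {other for other, seen in groups.items() if divisor <= seen}
```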

EER diagram (EERD)

Uses the EER model

Validates logical model

Using normalization Integrity constraints Against user requirements

Relvar

Variable that holds a relation

Left outer join

Yields all of the rows in the first table, including those that do not have a matching value in the second table

Right outer join

Yields all of the rows in the second table, including those that do not have matching values in the first table
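
Both outer joins can be illustrated with Python's built-in `sqlite3` module. The tables and rows below are made up. Because older SQLite versions lack RIGHT JOIN, the right outer join is written here as a LEFT JOIN with the table order swapped, which yields the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1), (11, 3);
""")

# Left outer join: every customer, with NULL where no order matches.
left_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customer c LEFT JOIN orders o ON o.cust_id = c.cust_id
    ORDER BY c.cust_id
""").fetchall()

# Equivalent of customer RIGHT OUTER JOIN orders: every order,
# with NULL where no customer matches.
right_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM orders o LEFT JOIN customer c ON c.cust_id = o.cust_id
    ORDER BY o.order_id
""").fetchall()
```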

Product

Yields all possible pairs of rows from two tables

Difference

Yields all rows in one table that are not found in the other table

Intersect

Yields only the rows that appear in both tables

The LOG_BUFFER parameter is a static parameter. How can you change it?

You can change it in the parameter file, but the new value will only come into effect at the next startup

Consider this line from a listener.ora file: L1=(description=(address=(protocol=tcp)(host=serv1)(port=1521))) What will happen if you issue this connect string: connect scott/tiger@L1

You can't tell - it depends on how the client side is configured

You issue the URL https://127.0.0.1:5500/em and receive an error. What could be the problem?

You have not started the database listener Database Express is running on a different port You are not logged on to the database server node You have not started the database

Client

a program that requests and uses a server's resources

User Schema

The part of a user's account where the user creates tables; this area is known as the user's schema

Multidimensional OLAP (MOLAP)

array based storage structures, direct access to array data structures, so fast access times

A form of denormalization where the same data are purposely stored in multiple places in the database is called:

data replication

A detailed coding scheme recognized by system software for representing organizational data is called a(n):

data type

Average linkage (clustering)

distance b/w 2 clusters A and B = (1/(|A|*|B|)) ∑a∑b d(a,b)

Complete linkage (clustering)

distance b/w 2 clusters A and B = max{d(a,b): a∈A, b∈B}
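The average-linkage and complete-linkage formulas can be written directly as functions. This is a minimal sketch; the function names, the 1-D sample clusters, and the absolute-difference metric are assumptions chosen for illustration.

```python
from itertools import product

def average_linkage(a, b, d):
    """Mean of d(x, y) over all pairs with x in A and y in B."""
    return sum(d(x, y) for x, y in product(a, b)) / (len(a) * len(b))

def complete_linkage(a, b, d):
    """Largest d(x, y) over all pairs with x in A and y in B."""
    return max(d(x, y) for x, y in product(a, b))

def abs_diff(x, y):
    # Hypothetical distance metric for 1-D points.
    return abs(x - y)

cluster_a = [1.0, 2.0]
cluster_b = [4.0, 8.0]
print(average_linkage(cluster_a, cluster_b, abs_diff))   # (3 + 7 + 2 + 6) / 4
print(complete_linkage(cluster_a, cluster_b, abs_diff))  # farthest pair: 1 and 8
```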

The ________ states that no primary key attribute may be null.

entity integrity rule

A command used in Oracle to display how the query optimizer intends to access indexes, use parallel servers and join tables to prepare query results is the:

explain plan

Practical significance of data dependence

is difference between logical and physical format

Market Basket (Association Rules)

items bought by one customer in a single transaction; more generally, a group of entities that can be considered related for the purposes of data mining

An index on columns from two or more tables that come from the same domain of values is called a:

join index.

Column constraint

limits the value that can be placed in a specific column, irrespective of values that exist in other table rows

A form of database specification which maps conceptual requirements is called:

logical specifications

What are the three types of problems by interleaving transactions?

lost update problem, uncommitted dependency problem, inconsistent analysis problem
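The first of these, the lost update problem, can be shown with a tiny hand-interleaved schedule. The balance and the two transactions T1 and T2 are hypothetical; no real DBMS or locking is involved.

```python
# Hypothetical account balance shared by two interleaved transactions.
balance = 100

t1_seen = balance          # T1 reads 100
t2_seen = balance          # T2 reads 100 before T1 has written
balance = t1_seen + 50     # T1 writes 150
balance = t2_seen - 30     # T2 writes 70, overwriting T1's update

serial_result = 100 + 50 - 30   # what either serial order would give
print(balance, serial_result)
```

The final balance differs from the result of any serial execution, which is why this interleaving is incorrect.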

The need to ________ relations commonly occurs when different views need to be integrated.

merge

Attribute Hierarchies

most common form of attribute relation. Either linear or lattice structure

Entity integrity

no attribute in a primary key is null

A(n) ________ is a field of data used to locate a related field or record.

pointer

Server DBMS process

retrieves data from the database and performs the requested functions on the data

What will be the setting of the OPTIMIZER_MODE parameter for your session after the next startup if you issue these commands: alter system set optimizer_mode=all_rows scope=spfile; alter system set optimizer_mode=rule; alter session set optimizer_mode=first_rows;

rule

One field or combination of fields for which more than one record may have the same combination of values is called a(n):

secondary key

A key decision in the physical design process is:

selecting structures

Snowflake Schema

set up like a star schema, but with normalized (and connected) versions of the dimension tables (each dimension may now have more than one table)

How does the timestamp method guarantee serializability?

since all the arcs in the precedence graph are: Transaction with smaller timestamp ---> transaction with larger timestamp. Therefore no cycle and no deadlocks! But may not be recoverable

Fixed-Point

specific number of digits on both the left and right of the decimal

Exec SQL statement

statement is used to identify embedded SQL requests to the preprocessor. EXEC SQL <embedded SQL statement> END_EXEC

Knowledge Discovery in Databases (KDD)

the nontrivial process of identifying valid, potentially useful, and ultimately understandable patterns in data.

You have to use local naming. Which file(s) must you create on the client machine?

tnsnames.ora only

When a session changes data, where does the change get written?

To the data block in the cache, and the redo log buffer

Computer-Aided Systems Engineering (CASE)

Tool that produces: Time and cost effective systems Structured, documented, and standardized applications

Specialization

Top-down process, Identifies lower-level, more specific entity subtypes from a higher-level entity supertype, Based on grouping unique characteristics and relationships of the subtypes

Systems Development Life Cycle (SDLC)

Traces history of an information system Provides a picture within which database design and application development are mapped out and evaluated Iterative rather than sequential process

What is Isolation?

Transactions execute independently - the partial effects of an incomplete transaction must not be visible to other transactions

Binary relationship

Two entities are associated

Physical data independence Logical data independence

Two kinds of data independence

Specific integrity rule General integrity rule

Two relational integrity rules

Table

Two-dimensional structure composed of rows and columns

Project

Unary operator that yields a vertical subset of a table

Entity Relationship Model Advantages

Visual modeling yields conceptual simplicity, Visual representation makes it an effective communication tool, Is integrated with the dominant relational model

Big Data Challenges

Volume does not allow the usage of conventional structures, Expensive, OLAP tools proved inconsistent dealing with unstructured data

Big Data Characteristics

Volume, Velocity, Variety

Use of Composite Primary Keys Identifiers of weak entities

Weak entity has a strong identifying relationship with the parent entity

What is a lock?

When a transaction accesses the database the lock denies access to other transactions to prevent incorrect results

When is a non-serial schedule view serializable?

When it is view equivalent to a serial schedule (returns the same result)

Under what circumstances would a connection through a Database Resident Connection Pool (SERVER=POOLED) connection be suitable?

When many short-lived connections share a schema

When are pairs of operations in conflict?

When one transaction writes a data item and another one either reads or writes the same data item.
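The conflict rule can be expressed as a small predicate. This is a sketch only; the `Op` record and its field names are assumptions introduced for the example.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str    # transaction identifier, e.g. "T1"
    kind: str   # "R" for read, "W" for write
    item: str   # data item accessed

def in_conflict(a: Op, b: Op) -> bool:
    """Two operations conflict when they belong to different transactions,
    touch the same data item, and at least one of them is a write."""
    return a.txn != b.txn and a.item == b.item and "W" in (a.kind, b.kind)

print(in_conflict(Op("T1", "W", "x"), Op("T2", "R", "x")))  # write vs read, same item
print(in_conflict(Op("T1", "R", "x"), Op("T2", "R", "x")))  # both only read
print(in_conflict(Op("T1", "W", "x"), Op("T2", "W", "y")))  # different items
```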

SQL Indexes

When a primary key is declared, the DBMS automatically creates a unique index

How is deadlock formed?

When two or more transactions wait for each other -> They can wait forever.
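One common way to detect this situation is to look for a cycle in a wait-for graph. The sketch below assumes a hypothetical dictionary `waits_for` that maps each transaction to the set of transactions it is waiting on; it is not taken from any particular DBMS.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle (depth-first search)."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = {}

    def visit(txn):
        colour[txn] = GREY
        for other in waits_for.get(txn, ()):
            state = colour.get(other, WHITE)
            if state == GREY:                 # back edge: we returned to our own path
                return True
            if state == WHITE and visit(other):
                return True
        colour[txn] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(waits_for))

print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # T1 and T2 wait for each other
print(has_deadlock({"T1": {"T2"}, "T2": set()}))    # T2 can finish, then T1 proceeds
```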

Data Series

When you turn a sequence of ordered points into a "time series", but ordering unrelated to time; then can use similar/the same techniques on these series

Time Series

a collection of observations made sequentially in time (useful because intuitive, easy to understand).

Server

a computer that shares its resources with other computers (terminals)

Data Warehouse

a decision support DB that is maintained separately from the organization's operational DBs. Collection of data that is used primarily for organizational decision making/ analysis Usually pulls data from different sources, important to sanitize, normalize, and preprocess the data

Server process

a program that listens to requests for resources from clients and then responds to those requests

Star Schema

a single fact table and a single table for each dimension; generated keys are used for maintenance and performance reasons. Fact table: keys are foreign keys into the dimension tables (dimension attributes); other attributes are measure (aka dependent) attributes. Fact table is much larger than the dimension tables, normalized

Data Mining

a step in the KDD process consisting of applying data analysis and discovery algorithms that, under acceptable computational efficiency limitations, produce a particular enumeration of patterns over the data

What is timestamp?

a unique identifier TS(Ti) that indicates the relative starting time of a transaction Ti

Syntax to create SQL indexes

CREATE INDEX indexname ON tablename();

Syntax to delete an index

DROP INDEX indexname;
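The index cards above can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for the course DBMS, and the table `product`, its columns, and the index name `idx_price` are made-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Declaring the primary key makes SQLite build a unique index automatically.
conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL)")
# Explicit index on a non-key column.
conn.execute("CREATE INDEX idx_price ON product(p_price)")

def index_names():
    """List the indexes SQLite currently records for the product table."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'product'"
    )
    return {name for (name,) in rows}

before = index_names()
conn.execute("DROP INDEX idx_price")
after = index_names()
print(before, after)
```

After the DROP, only the automatically created primary-key index remains.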

On-Line Analytical Processing (OLAP)

information technology to help the knowledge worker (executive, manager, analyst) to make faster and better decisions (element of decision support systems, more high level than SQL) - mostly read operations, can optimize on that

Data access layer

interfaces between business logic layer and the underlying database provides mapping from object model of business layer to relational model of database

Dynamic Time Warping

Intuition: we copy an element multiple times so as to achieve a better matching. Create a matrix W of size |Q| by |C|, where W_i,j = dist(q_i, c_j) for some distance metric. Compute the best (shortest) "warping path" between Q and C with DTW(Q, C) = min{sqrt(∑(k=1 to K) w_k)/K}, where K is the number of weights being added - can use dynamic programming to find the shortest path
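The dynamic-programming step can be sketched as follows. This version is a simplification: it returns the plain cumulative cost of the cheapest warping path and leaves out the square root and the division by K from the formula above. The function name and the absolute-difference distance are assumptions.

```python
def dtw_distance(q, c):
    """Cumulative cost of the cheapest warping path between sequences q and c."""
    inf = float("inf")
    n, m = len(q), len(c)
    # cost[i][j] = cheapest way to align the first i points of q with the first j of c.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            w = abs(q[i - 1] - c[j - 1])          # W_i,j for this pair of points
            cost[i][j] = w + min(cost[i - 1][j],      # q point repeated
                                 cost[i][j - 1],      # c point repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # the repeated 2 is absorbed by warping
print(dtw_distance([0, 0], [1, 1]))
```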

A method that speeds query processing by running a query at the same time against several partitions of a table using multiprocessors is called:

parallel query processing

A functional dependency in which one or more nonkey attributes are functionally dependent on part, but not all, of the primary key is called a ________ dependency.

partial functional

An attribute (or attributes) that uniquely identifies each row in a relation is called a:

primary key

Business-logic layer

provides high level view of data and actions on data hides details of data storage schema

Precision

total number of digits both to the left and to the right of the decimal point

A method for handling missing data is to:

track missing data with special reports

Augment transactions to get generalized association rules

transaction T -> {items in T + all of the ancestors of items in T} NOTE: if we have a large itemset Y∪x, we can eliminate all other itemsets Y∪ancestor(x) as redundant
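Augmenting a transaction with ancestors can be sketched with a child-to-parent map. The hierarchy below (jacket, outerwear, clothes, and so on) is a hypothetical example, as are the function names.

```python
# Hypothetical type hierarchy: each item maps to its direct parent.
parent = {
    "jacket": "outerwear",
    "ski pants": "outerwear",
    "outerwear": "clothes",
    "shirt": "clothes",
}

def ancestors(item, parent):
    """Walk up the hierarchy and collect every ancestor of item."""
    found = set()
    while item in parent:
        item = parent[item]
        found.add(item)
    return found

def augment(transaction, parent):
    """Return the items in the transaction plus all of their ancestors."""
    extended = set(transaction)
    for item in transaction:
        extended |= ancestors(item, parent)
    return extended

print(augment({"jacket", "shirt"}, parent))
```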

Support for generalized association rules

transaction T supports an item x if: x is an item in T, or x is an ancestor of an item in T note: if {x,y} has above threshold support, then {x, \hat(y)} where \hat(y) is an ancestor of y also has above threshold support (sim for {\hat(x), y} and {\hat(x), \hat(y)})

Database access frequencies are estimated from:

transaction volumes

A functional dependency between two or more nonkey attributes is called a:

transitive dependency

Interestingness (Association Rules)

try to model departure from independence of A and B = Pr(A, B) / (Pr(A)*Pr(B)) -> can model with Supp(A∪B)/(Supp(A)*Supp(B)) range 0 to positive inf - = 1 if A and B are independent
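The ratio Supp(A∪B)/(Supp(A)*Supp(B)) can be computed directly from a list of transactions. The function names and the four sample transactions are assumptions for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def interestingness(a, b, transactions):
    """Supp(A ∪ B) / (Supp(A) * Supp(B)); equals 1 when A and B are independent."""
    joint = support(set(a) | set(b), transactions)
    return joint / (support(a, transactions) * support(b, transactions))

# Hypothetical data in which "a" and "b" occur independently of each other.
independent = [{"a", "b"}, {"a"}, {"b"}, set()]
# Hypothetical data in which "a" and "b" always occur together.
together = [{"a", "b"}, {"a", "b"}, set(), set()]

print(interestingness({"a"}, {"b"}, independent))
print(interestingness({"a"}, {"b"}, together))
```

Values above 1 suggest the items co-occur more often than independence would predict; values below 1 suggest the opposite.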

Generalized Association Rules

using (forest of) type hierarchies to generalize items to a less specific type (and then getting association rules based on some of these)

Floating-Point

variable number of decimal places

A relation that contains minimal redundancy and allows easy use is considered to be:

well-structured

Horizontal partitioning makes sense:

when different categories of a table's rows are processed separately

Fact Constellation

when multiple fact tables share common dimension tables

Integer

whole numbers

Sources of Business Rules

Company managers, Policy makers, Department managers, Written documentation, Direct interviews with end users

Data Hardware Software End users Procedure

Components of the dbms environment (5)

Associative Entities

Composed of the primary key attributes of each parent entity

Q.23 In the figure below, the primary key for "Order Line" is which type of key?

Composite

Database Design Process

Conceptual Design: data analysis and requirements; entity relationship modeling and normalization; data model verification; distributed database design. DBMS Selection: select the DBMS. Logical Design: map conceptual model to logical model components; validate logical model using normalization; validate logical model integrity constraints; validate logical model against user requirements. Physical Design: define data storage organization; define integrity and security measures; determine performance measures

ERD depicts the

Conceptual database as viewed by end user

Schema

Conceptual organization of the entire database as viewed by the database administrator

Network Model Advantages

Conceptual simplicity, Handles more relationship types, Data access is flexible, Data owner/member relationship promotes data integrity, Conformance to standards, Includes data definition language (DDL) and data manipulation language (DML)

Database Environments

Conceptual, Logical, and physical

Entity integrity

Condition in which each row in the table has its own unique identity

Overlapping subtypes

Contain nonunique subsets of the supertype entity set, Implementation requires the use of one discriminator attribute for each subtype

Character data type

Contain strings that are alphanumeric not used in calculations

General purpose databases

Contains a wide variety of data used in multiple disciplines

Object

Contains data and their relationships with operations that are performed on it

Discipline specific databases

Contains data focused on specific subject areas

Linking Table

Contains the primary key of each of the tables being connected

Entity subtype

Contains unique characteristics of each entity subtype

During the transition from NOMOUNT to MOUNT mode, which files are required?

Controlfiles

PL/SQL Processing with Cursors

Cursor-style processing involves retrieving data from the cursor one row at a time; the current row is copied to PL/SQL variables

DDL DML can provide security system Can provide an integrity system Can provide a concurrency control system Can provide a recovery control system

Different facilities that DBMS provide? (6)

Synonym

Different names are used to describe the same attribute

Data inconsistency

Different versions of the same data appear in different places

Use of Composite Primary Keys Identifiers of composite entities

Each primary key combination is allowed only once in an M:N relationship. When used as identifiers of weak entities, composite primary keys represent a real-world object that is existence-dependent on another real-world object and is represented in the data model as two separate entities in a strong identifying relationship

Collection of tables stored in the database

Each table is independent from another, Rows in different tables are related based on common values in common attributes

Normalization Process Objective is to ensure that each table conforms to the concept of well-formed relations

Each table represents a single subject, No data item will be unnecessarily stored in more than one table, All nonprime attributes in a table are dependent on the primary key, Each table is void of insertion, update, and deletion anomalies

Conversion to First Normal Form

Enable reducing data redundancies Steps Eliminate the repeating groups Identify the primary key Identify all dependencies All relational tables satisfy 1NF requirements Some tables contain partial dependencies Subject to data redundancies and various anomalies

Online analytical processing (OLAP)

Enable retrieving, processing, and modeling data from the data warehouse

Security management

Enforces user security and data privacy

Keys Used to

Ensure that each row in a table is uniquely identifiable

Existence independence

Entity exists apart from all of its related entities

Existence dependence

Entity exists in the database only when it is associated with another related entity occurrence

Data manipulation language (DML)

Environment in which data can be managed and is used to work with the data in the database

Keys Used to

Establish relationships among tables and to ensure the integrity of the data

The Rule of Precedence

Establish the order in which computations are completed

Requirements for Good Normalized Set of Tables

Evaluate PK assignments and naming conventions Refine attribute atomicity, Identify new attributes and new relationships, Refine primary keys as required for data granularity, Maintain historical accuracy and evaluate using derived attributes

Normalization

Evaluating and correcting table structures to minimize data redundancies Reduces data anomalies Assigns attributes to tables based on determination, Properly designed 3NF structures meet the requirement of fourth normal form (4NF)

Updatable view restrictions

GROUP BY expressions or aggregate functions cannot be used; set operators cannot be used; JOINs or group operators cannot be used

Entity supertype

Generic entity type related to one or more entity subtypes, Contains common characteristics

What protocol(s) can be used to contact Database Express?

HTTP HTTPS

Big Data New Technologies

Hadoop, Hadoop Distributed File System (HDFS), MapReduce, NoSQL

Weak Entity Conditions

Has a primary key that is partially or totally derived from parent entity in the relationship

In which type of file is multiple key retrieval not possible?

Hashed

Which type of file is easiest to update?

Hashed

Natural join

Have common values in common columns

Relvar

Heading contains the names of the attributes and the body contains the relation

Reasons for Identifying and Documenting Business Rules

Help standardize company's view of data, Communications tool between users and designers

Composite entity (Bridge or associative entity):

Helps avoid problems inherent to M:N relationships, includes the primary keys of tables to be linked

NoSQL Advantages

High scalability, availability, and fault tolerance are provided, Uses low-cost commodity hardware, Supports Big Data, Key-value model improves storage efficiency

Structural point of view of normal forms

Higher normal forms are better than lower normal forms

Explicit cursor

Holds the output of a SQL statement that may return two or more rows

Data Dictionary and the System Catalog

Homonyms and synonyms must be avoided to lessen confusion

Deadlock thoughts...

How far to roll back 'victim'? Choice of deadlock victim? How long a transaction has been running? How many still to update? Avoid starvation? Solution: store number of times it has been aborted. Next time use different selection criteria.

Translating Business Rules into Data Model Components Questions to identify the relationship type

How many instances of B are related to one instance of A?, How many instances of A are related to one instance of B?

Which form of compression uses compression algorithms rather than de-duplication algorithms?

Hybrid Columnar Compression

International Business Machines

IBM means

The Physical Model

Operates at lowest level of abstraction, Describes the way data are saved on storage media such as disks or tapes, Requires the definition of physical storage and data access methods-level details

Index

Orderly arrangement to logically access rows in a table

Secondary key

Other term for alternative key

Attribute

Other term for column

Tuple/record

Other term for row

Relation

Other term for table

Divide

Output is a single column that contains all values from the second column of the dividend that are associated with every row in the divisor
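Relational division can be sketched with sets of tuples. Here the dividend is a set of (first, second) rows and the divisor is a set of first-column values; the function name and the sample data are assumptions.

```python
def divide(dividend, divisor):
    """Return second-column values that are paired with every divisor value."""
    firsts_by_second = {}
    for first, second in dividend:
        firsts_by_second.setdefault(second, set()).add(first)
    # Keep a value only if its group of first-column values covers the whole divisor.
    return {second for second, firsts in firsts_by_second.items()
            if divisor <= firsts}

# Hypothetical 2-column dividend and single-column divisor.
dividend = {("A", 5), ("B", 5), ("C", 5), ("A", 4), ("B", 1)}
divisor = {"A", "B"}

print(divide(dividend, divisor))  # only 5 appears with both A and B
```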

No value

Output of the outer query might result in an error or a null empty set

What will happen if you do not run the CATALOG.SQL and CATPROC.SQL scripts after creating a database?

It will not be possible to query the data dictionary views.

Data modeling

Iterative and progressive process of creating a specific data model for a determined problem domain

JDBC

Java Database Connectivity - works with Java Connects a java program to the SQL server (mysql-connector-java-5.1.39-bin.jar)

Java Server Page

Java technology that allows dynamically generated HTML pages. Server executes the script embedded within the HTML before it generates a web page. Java code enclosed between <% ... %> (index.jsp)

Client side scripting

JavaScript often used. Security is needed to prevent malicious scripts from damaging client machines. Easy to limit in scripting languages, harder to limit general use languages like Java. Java applet does not make system calls directly. Prevents file writes and notifies user about potentially dangerous actions.

Cross Correlation (time series)

Keep one sequence static and slide the other to compute the correlation (inner product) of each shift; determine shift that gives maximum correlation. Eq: see notes
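The slide-and-multiply idea can be sketched in plain Python. The function name, the handling of non-overlapping positions (they contribute nothing), and the sample sequences are assumptions; the equation from the notes is not reproduced here.

```python
def best_shift(static, moving):
    """Slide moving across static and return (shift, inner product) for the best shift."""
    best = None
    for shift in range(-(len(moving) - 1), len(static)):
        total = 0.0
        for j, value in enumerate(moving):
            i = shift + j
            if 0 <= i < len(static):       # positions outside static are skipped
                total += static[i] * value
        if best is None or total > best[1]:
            best = (shift, total)
    return best

static = [0, 0, 1, 2, 1, 0]
moving = [1, 2, 1]
print(best_shift(static, moving))  # the pattern lines up at offset 2
```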

Private Key

Key known only to individual user, and used to decrypt data. Need not be transmitted to the site doing encryption.

Composite key

Key that consists of two or more attributes that uniquely identify an entity occurrence

Composite key

Key that is composed of more than one attribute

Advanced Data Definition Commands

Keywords used with the ALTER TABLE command: ADD - adds a column; MODIFY - changes column characteristics; DROP - deletes a column. Also used to add table constraints and remove table constraints

Disadvantages of Storing Derived Attributes

Requires constant maintenance to ensure derived value is current, especially if any values used in the calculation change

Relational Model Disadvantages

Requires substantial hardware and system software overhead, Conceptual simplicity gives untrained people the tools to use a good system poorly, May promote information problems

UNIQUE constraint

Restriction placed on a column to ensure that no duplicate values exist for that column

Extended Entity Relationship Model (EERM)

Result of adding more semantic constructs to the original entity relationship (ER) model

Developing an ER Diagram

Revise and review ERD

What is required before shrinking a table?

Row movement must be enabled; Automatic segment space management must be enabled

Entity instance or entity occurrence

Rows in the relational table

Following syntax enables to specify which rows to select

SELECT columnlist FROM tablelist [WHERE conditionlist];

Copying Parts of Tables

SQL permits copying the contents of selected table columns, so data need not be reentered manually into newly created table(s). The table structure is created; rows are added to the new table using rows from another table

Format of Association Rules

LHS => RHS, meaning something along the lines of "if item(s) of the LHS are in a transaction, item(s) of the RHS are also likely to be in the same transaction"

Problems with File System Data Processing

Lengthy development times, Difficulty of getting quick answers, Complex system administration, Lack of security and limited data sharing, Extensive programming

Query language

Lets the user specify what must be done without having to specify how

Granularity

Level of detail represented by the values stored in a table's row

Entity Relationship Model Disadvantages

Limited constraint representation, Limited relationship representation, No data manipulation language, Loss of information content occurs when attributes are removed from entities to avoid crowded displays

Natural join

Links tables by selecting only the rows with common values in their common attributes

Equijoin

Links tables on the basis of an equality condition that compares specified columns of each table

Relationships

Links that show how different records are related

The Database Schema

Logical group of database objects related to each other

Additional SELECT Query Keywords

Logical operators work well in the query environment. SQL provides useful functions that count, find minimum and maximum values, and calculate averages. SQL allows the user to limit queries to entries having no duplicates or whose duplicates may be grouped

Logical View of Data

Logical simplicity yields simple and effective database design methodologies

Database Systems

Logically related data stored in a single logical data repository, DBMS eliminates most of file system's problems, Current generation DBMS software

Which process is responsible for sending the alert when a tablespace usage critical threshold is reached?

MMON, the Manageability Monitor process

Density - MOLAP vs. ROLAP

MOLAP tends to work well with dense data, and ROLAP with sparser data (hybrid systems can pick representation based on density of data to exploit this difference)

DDL DML DQL DCL TCC

Main categories of sql commands (5)

Conversion to Second Normal Form Steps

Make new tables to eliminate partial dependencies Reassign corresponding dependent attributes

Conversion to Third Normal Form Steps

Make new tables to eliminate transitive dependencies Reassign corresponding dependent attributes

Database Administrator

Manage and maintain DBMS and databases, database fundamentals, SQL, vendor courses

Hierarchical Models

Manage large amounts of data for complex manufacturing projects, Represented by an upside-down tree which contains segments, Depicts a set of one-to-many (1:M) relationships

Extensible Markup Language (XML)

Manages unstructured data for efficient and effective exchange of all data types

Evolution of File System Data Processing

Manual File Systems -> Computerized File Systems -> File System Redux: Modern End User Productivity Tools

Outer join

Matched pairs are retained and unmatched values in the other table are left null

Relation or table

Matrix composed of intersecting tuple and attribute

Associative Entities

May also contain additional attributes that play no role in connective process

Data definition language

Meaning of ddl

Equality or inequality

Meet a given join condition

Candidate Key

Minimal (irreducible) super key; a superkey that does not contain a subset of attributes that is itself a superkey

Integrated

Minimize data redundancy

Data integrity management

Minimizes redundancy and maximizes consistency

________ are anomalies that can be caused by editing data in tables.

Modification

Create table Alter table Drop table Create index Alter index Drop index create view Drop view

Most fundamental DDL commands (8)

A virtual table

Multicolumn, multirow set of values

Which of the following statements about listeners is correct? A. A listener can connect you to one instance only B. A listener can connect you to one service only C. Multiple listeners can share one network interface card D. An instance will only accept connections from the listener specified on the local_listener parameter

Multiple listeners can share one network interface card

Second Normal Form (2NF)

Must be in 1NF and have no partial dependencies; a partial dependency means that fields in the table depend on only part of the primary key

Third Normal Form (3NF)

Must be in 2NF and have no transitive dependencies; a transitive dependency means that a field depends on another field in the table that is not the primary key

Values in a column

Must conform to same data format

Required attribute

Must have a value, cannot be left empty

Features of table creating command sequence

NOT NULL specification; UNIQUE specification

NUMBER

NUMBER [([Precision,] [scale])] Three subtypes of number int, fixed_point, and floating_point

Stored Procedures

Named collection of procedural and SQL statements. Advantages: reduce network traffic and increase performance; reduce code duplication by means of code isolation and code sharing

Database Design Challenges: Conflicting Goals

Need for high processing speed may limit the number and complexity of logically desirable relationships

Database Design Challenges: Conflicting Goals

Need for maximum information generation may lead to loss of clean design structures and high transaction speed

What are the advantages of multiversion timestamp?

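Read operations are never rejected: each write creates a new version, and a read is served from the version with the appropriate timestamp. This increases concurrency, at the cost of storing multiple versions of each data item.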

Entity Integrity Example

No invoice can have a duplicate number, nor can it be null

Entity integrity

No key attribute in the primary key can contain a null

Which of the following are properties of relations?

No two rows in a relation are identical

Desirable Primary Key Characteristics

Non intelligent, No change over time, Preferably single-attribute, Preferably numeric, Security-compliant

What action should you take after terminating the instance with SHUTDOWN ABORT?

None, Recovery will be automatic

Structured Query Language (SQL)

Nonprocedural language with a basic command vocabulary set of fewer than 100 words. Differences in SQL dialects are minor

1:M relationship

Norm for relational databases

Normalization and Database Design

Normalization should be part of the design process. Proposed entities must meet the required normal form before table structures are created. Principles and normalization procedures must be understood to redesign and modify databases. The ERD is created through an iterative process. Normalization focuses on the characteristics of specific entities

NoSQL Databases

Not based on the relational model, Support distributed database architectures, Provide high scalability, high availability, and fault tolerance, Support large amounts of sparse data, Geared toward performance rather than transaction consistency, Store data in key-value stores

Raw Data

Not yet been processed to reveal the meaning

When updating rows locally and through a database link in one transaction, what must you do to ensure a two-phase commit?

Nothing special because two-phase commit is automatic

Translating Business Rules into Data Model Components

Nouns translate into entities, Verbs translate into relationships among entities, Relationships are bidirectional

Union-compatible

Number of attributes are the same and their corresponding data types are alike

Common SQL Data Types

Numeric: NUMBER(L,D) or NUMERIC(L,D); Character: CHAR(L), VARCHAR(L) or VARCHAR2(L); Date: DATE

ANSI SQL allows use of following clauses to cover CASCADE, SET NULL, or SET DEFAULT

ON DELETE and ON UPDATE
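A minimal sketch of ON DELETE CASCADE, using Python's sqlite3 module as a stand-in for the course DBMS. The `customer` and `invoice` tables and their columns are hypothetical, and SQLite only enforces foreign keys after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys unless enabled
conn.execute("CREATE TABLE customer (cus_code INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE invoice ("
    " inv_number INTEGER PRIMARY KEY,"
    " cus_code INTEGER REFERENCES customer(cus_code)"
    "   ON DELETE CASCADE ON UPDATE CASCADE)"
)
conn.execute("INSERT INTO customer VALUES (10), (11)")
conn.execute("INSERT INTO invoice VALUES (1, 10), (2, 10), (3, 11)")

# Deleting the parent row also deletes its dependent invoice rows.
conn.execute("DELETE FROM customer WHERE cus_code = 10")
remaining = conn.execute(
    "SELECT inv_number FROM invoice ORDER BY inv_number"
).fetchall()
print(remaining)
```

With SET NULL instead of CASCADE, the invoice rows would stay and their `cus_code` would become null.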

Ordering a Listing

ORDER BY clause is useful when listing order is important. Syntax: SELECT columnlist FROM tablelist [WHERE conditionlist] [ORDER BY columnlist [ASC | DESC]]; Cascading order sequence: a multilevel ordered sequence, created by listing several attributes after the ORDER BY clause

Inheritance

Object inherits methods and attributes of parent class

Hibernate

Object relational mapping system. Supports a query language that can express complex queries involving joins. Allows relationships to be mapped to sets associated with objects.

Design Case 4: Redundant Relationships

Occur when there are multiple relationship paths between related entities, Need to remain consistent across the model, Help simplify the design

Design Case 3: Design trap

Occurs when a relationship is improperly or incompletely identified, Represented in a way not consistent with the real world

Design Case 3: Fan Trap

Occurs when one entity is in two 1:M relationships to other entities, Produces an association among other entities not expressed in the model

A list of values

One column and multiple rows

One single value

One column and one row

1:1 relationship

One entity can be related to only one other entity and vice versa

Optional Relationship participation

One entity occurrence does not require a corresponding entity occurrence in a particular relationship

Mandatory Relationship participation

One entity occurrence requires a corresponding entity occurrence in a particular relationship

Identifiers

One or more attributes that uniquely identify each entity instance

Many to many

One or more rows in a table can be related to 0, 1 or many rows in another table

Differential backup

Only modified/updated objects since last full backup are backed up

Inner join

Only returns matched records from the tables that are being joined

Inner join

Only rows that meet a given criterion are selected

Transaction log backup

Only the transaction log operations that are not reflected in a previous backup are backed up

ODBC

Open Database Connectivity - works with C, C++, C#, and visual basic

Set-oriented

Operate over entire sets of rows and columns at once

Dynamic SQL

SQL statement is generated at run time. The attribute list and condition are not known until the end user specifies them. Slower than static SQL; requires more computer resources

Transitive dependency

An attribute functionally depends on another nonkey attribute

What is an outer join?

An extension of the natural join operation that avoids loss of information. Computes the natural join and then adds to the result the tuples from relation R that do not match any tuple from relation S, padding the missing attributes with null values.
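A left outer join can be tried with Python's sqlite3 module. The relations `r` and `s`, their columns, and their rows are made-up, and SQLite stands in for whichever DBMS the course uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r (k INTEGER, a TEXT)")
conn.execute("CREATE TABLE s (k INTEGER, b TEXT)")
conn.executemany("INSERT INTO r VALUES (?, ?)", [(1, "x"), (2, "y")])
conn.executemany("INSERT INTO s VALUES (?, ?)", [(1, "p")])

# Inner join: the row of r with k = 2 has no match in s and is lost.
inner = conn.execute(
    "SELECT r.k, r.a, s.b FROM r JOIN s ON r.k = s.k ORDER BY r.k"
).fetchall()

# Left outer join: the unmatched row of r is kept, padded with NULL (None in Python).
outer = conn.execute(
    "SELECT r.k, r.a, s.b FROM r LEFT OUTER JOIN s ON r.k = s.k ORDER BY r.k"
).fetchall()

print(inner)
print(outer)
```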

Which statements are correct about extents? A. An extent is a consecutive grouping of Oracle blocks B. An extent is a random grouping of Oracle blocks C. An extent can be distributed across one or more datafiles D. An extent can contain blocks from one or more segments E. An extent can be assigned to only one segment

An extent is a consecutive grouping of Oracle blocks An extent can be assigned to only one segment

What statements are correct about extents? A. An extent is a grouping of several Oracle blocks B. An extent is a grouping of several operating system blocks C. An extent can be distributed across one or more datafiles D. An extent can contain blocks from one or more segments E. An extent can be assigned to only one segment

An extent is a grouping of several Oracle blocks An extent can be assigned to only one segment

Large itemset

An itemset that satisfies the minimum support requirement (i.e., Supp(itemset) ≥ min_supp)

Entity

An object about which you want to store data (e.g., students, faculty, courses)

Instance

An occurrence of a specific entity

Purpose of Database Initial Study

Analyze company situation Define problems and constraints Define objectives Define scope and boundaries

End users can use PL/SQL to create:

Anonymous PL/SQL blocks and triggers; stored procedures and PL/SQL functions

Database schema

Another term for schema

Structured data

It results from formatting, Structure is applied based on type of processing to be performed

Triggers

Procedural SQL code automatically invoked by the RDBMS when a given data manipulation event occurs
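
A trigger firing on a data manipulation event can be sketched as follows. This uses SQLite's trigger syntax through Python's `sqlite3` module, which is not PL/SQL; the `product` table, the `trg_reorder` trigger, and the threshold of 5 are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        code    TEXT PRIMARY KEY,
        qty     INTEGER,
        reorder INTEGER DEFAULT 0
    );

    -- Fires automatically after any UPDATE that touches qty.
    CREATE TRIGGER trg_reorder
    AFTER UPDATE OF qty ON product
    WHEN NEW.qty < 5
    BEGIN
        UPDATE product SET reorder = 1 WHERE code = NEW.code;
    END;

    INSERT INTO product (code, qty) VALUES ('P1', 10), ('P2', 10);
    UPDATE product SET qty = 3 WHERE code = 'P1';  -- the triggering event
""")

# Map each product code to its reorder flag.
flags = dict(conn.execute("SELECT code, reorder FROM product"))
```

Only P1 is flagged, because only its update satisfied the trigger's WHEN condition.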

Procedural SQL

Procedural code is executed as a unit by the DBMS when invoked by the end user

What is Atomicity?

"All or nothing property" -either the transaction is performed entirely or not at all -the recovery subsystem of the DBMS is responsible for ensuring this

What is the lost update problem?

"Override by mistake" -an apparently successful completed update operation by one user is overridden by another user
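
The lost update problem can be illustrated with a plain simulation of two interleaved transactions. This is only a sketch: the balance of 100 and the amounts +50 and -30 are invented, and no real DBMS or concurrency is involved.

```python
def interleaved():
    """T1 and T2 both read before either writes, so T1's update is lost."""
    balance = 100
    t1_read = balance        # T1 reads 100
    t2_read = balance        # T2 reads 100 (stale once T1 writes)
    balance = t1_read + 50   # T1 writes 150
    balance = t2_read - 30   # T2 writes 70, overwriting T1's update
    return balance


def serial():
    """T1 runs to completion before T2 starts."""
    balance = 100
    balance = balance + 50   # T1
    balance = balance - 30   # T2
    return balance
```

The interleaved run ends at 70 instead of the serial result of 120, even though both updates appeared to succeed.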

Confidence (of an association rule)

% of the transactions containing LHS that ALSO contain RHS eq: |{transactions S s.t. (LHS ∪ RHS ⊆ S)}| / |{transactions S' s.t. (LHS)⊆S'}|

Support (of a set of items/ association rule)

% of the transactions containing all of the items in the rule (in both the LHS and RHS). eq: |{transactions S s.t. (LHS ∪ RHS)⊆S}| / |all transactions|
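
The support and confidence formulas can be computed directly from a list of transactions. This is a minimal sketch; the function names and the `baskets` data are hypothetical, and `confidence` assumes the LHS occurs in at least one transaction.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Fraction of transactions containing LHS that also contain RHS."""
    lhs = set(lhs)
    both = lhs | set(rhs)
    n_lhs = sum(1 for t in transactions if lhs <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    return n_both / n_lhs  # assumes n_lhs > 0


# Made-up market baskets for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

supp_rule = support(baskets, {"bread", "milk"})       # 2 of 4 baskets
conf_rule = confidence(baskets, {"bread"}, {"milk"})  # 2 of 3 bread baskets
```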

0-I

(0,1) the one side is optional

0<-

(0,N) the many side is optional

II

(1,1) the one side is mandatory

I<-

(1,N) the many side is mandatory

One-to-one

(1:1)

One-to-many

(1:M)

Many-to-many

(M:N or M:M)

What is the cascading rollback problem? What is the solution?

-A transaction rolls back after a long time -causing a pile-up of rollbacks in the transactions that read its uncommitted data SOLUTION: hold all locks until the end of every transaction

Course outline

-Structured / Relational Data model -Relational Calculus vs Relational Algebra -Enhanced Entity-Relational model (EER) -Structured Query language -Semi-structured data (SSD) -Querying XML:XPath+XQuery -Transactions - concurrency

What is multiversion timestamp ordering?

-generalized timestamp protocol. -keeps several versions of every data item to increase concurrency.

Types of XML validation

1. Well formed: single root element and matching tags (properly nested) 2. Valid: must be well formed and elements must follow a pre-defined structure, described in the DTD.
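
Well-formedness can be checked with Python's standard XML parser, as in the sketch below. The helper name `is_well_formed` is made up, and the example covers only the first level: the standard-library parser does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as XML: one root, properly nested tags."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


nested_ok = is_well_formed("<a><b></b></a>")    # properly nested
crossed = is_well_formed("<a><b></a></b>")      # tags closed in wrong order
two_roots = is_well_formed("<a/><b/>")          # more than one root element
```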

Relational database

A database structured to recognize relations among stored items of information

The SYSAUX tablespace is mandatory. What will happen if you attempt to issue a CREATE DATABASE command that does not specify a datafile for the SYSAUX tablespace?

A default SYSAUX tablespace and datafile will be created.

Dimension Table (OLAP)

A dimension has attributes, which can occur in a hierarchy (e.g., time over days, weeks, years)

Describe a precedence graph.

A directed graph G = (N, E) with a set of nodes N and a set of directed edges E, with: -a node for each transaction -a directed edge Ti → Tj whenever: *Tj reads the value of an item written by Ti or *Tj writes a value into an item after it has been read or written by Ti
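
A precedence graph can be built from a schedule and tested for cycles, as sketched below. The schedule encoding (tuples of transaction, operation, item) and both function names are assumptions for the example; edges are added for every pair of conflicting operations, which is the usual test for conflict serializability.

```python
def precedence_edges(schedule):
    """Edges (Ti, Tj): Ti's operation precedes and conflicts with Tj's."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)


# T2 writes x between T1's read and write of x.
bad = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]
# T1 finishes with x before T2 touches it.
good = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x")]
```

A schedule whose graph has no cycle is conflict serializable; `bad` produces edges in both directions between T1 and T2, while `good` produces only T1 → T2.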

Fact Table (OLAP)

A fact has a number of dimensions, including at least one measure dimension (quantity that can be analysed by user)

Referential Integrity Requirement

A foreign key may have either a null entry or an entry that matches a primary key value in a table to which it is related
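
The foreign key rule in this card can be observed in SQLite, which enforces it once `PRAGMA foreign_keys` is switched on. The sketch below uses Python's `sqlite3` module; the `rep` and `customer` tables and their values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE rep (rep_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE customer ("
    " cus_id INTEGER PRIMARY KEY,"
    " rep_id INTEGER REFERENCES rep(rep_id))"
)
conn.execute("INSERT INTO rep VALUES (1)")

conn.execute("INSERT INTO customer VALUES (10, 1)")     # matches a primary key
conn.execute("INSERT INTO customer VALUES (11, NULL)")  # null entry is allowed

try:
    conn.execute("INSERT INTO customer VALUES (12, 99)")  # no such rep
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored_ids = [row[0] for row in
              conn.execute("SELECT cus_id FROM customer ORDER BY cus_id")]
```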

Java Servlet

A java class used to extend the capabilities of servers that host web applications. Defines an API for communicating between the web server and application program. (DBentry.java)

Column

A labelled element of a tuple

Candidate key

A minimal set of attributes that uniquely identify a tuple.

INTERSECT

Combines rows from two queries, returning only the rows that appear in both sets Syntax - query INTERSECT query

Join columns

Common columns

ANY operator

Allows comparison of a single value to a list of values and selects only the rows for which the value is greater than or less than any value in the list

End-user interface

Allows end user to interact with the data

Join

Allows information to be intelligently combined from two or more tables

Character

Alphabetic or numeric

Associative Entities

Also known as composite or bridge entities

Alias

Alternate name given to a column or table in any SQL statement to improve the readability

Which of these background processes is optional? A. ARCn, the archive process B. CKPT, the checkpoint process C. DBWn, the database writer D. LGWR, the log writer E. MMON, the manageability monitor

ARCn, the archive process

Model

Abstraction of a real-world object or event

Database communication interfaces

Accept end user requests via multiple, different network environments

Structural dependence

Access to a file is dependent on its own structure, All file system programs are modified to conform to a new file structure

Manual File Systems

Accomplished through a system of file folders and filing cabinets

Trigger action based on DML predicates

Actions depend on the type of DML statement that fires the trigger

SELECT command

Acts as a subquery and is executed first

Comparison Operators

Add conditional restrictions on selected table contents; can be used on character attributes and dates

Optional WHERE clause

Adds conditional restrictions to the SELECT statement

Encrypting

The Advanced Encryption Standard (AES), based on the Rijndael algorithm, is used. Public-key encryption is based on each user having two keys. If user1 wants to share data with user2, user1 encrypts the data using the public key of user2.

Joining Tables With an Alias

Alias identifies the source table from which data are taken; any legal table name can be used as an alias; add the alias after the table name in the FROM clause: FROM tablename alias

Full backup/dump:

All database objects are backed up in their entirety

Conversion to First Normal Form 1NF describes tabular format in which

All key attributes are defined There are no repeating groups in the table All attributes are dependent on the primary key

Entity integrity

All of the values in the primary key must be unique

Entity Integrity Requirement

All primary key entries are unique, and no part of a primary key may be null

TAB_COLUMNS

All table columns

Minimum data rule

All that is needed is there, and all that is there is needed

Study this tnsnames.ora file: test = (description = (address_list = (address = (protocol = tcp)(host = serv2)(port = 1521)) ) (connect_data = (service_name = prod) ) ) prod = (description = (address_list = (address = (protocol = tcp)(host = serv1)(port = 1521)) ) (connect_data = (service_name = prod) ) ) dev = (description = (address_list = (address = (protocol = tcp)(host = serv2)(port = 1521)) ) (connect_data = (service_name = dev) ) ) Which of the following statements are correct about the connect strings test, prod, and dev? A. All three are valid B. All three can succeed only if the instances are set up for dynamic instance registration C. The test connection will fail, because the connect string doesn't match the service name D. There will be a port conflict on serv2, because prod and dev try to use the same port

All three are valid All three can succeed only if the instances are set up for dynamic instance registration

Q.32 In the figure below, what is depicted?

An associative entity

Importance of Data Models

Are a communication tool, Give an overall view of the database, Organize data for various users, Are an abstraction for the creation of a good database

Fields

Are columns that contain data items and character descriptions.

Records

Are rows of a collection of related fields that contain related information

Unary relationship

Association is maintained within a single entity

Class

Collection of similar objects with shared structure and behavior organized in a class hierarchy

Functional dependence (Generalized definition)

Attribute A determines attribute B if all of the rows in the table that agree in value for attribute A also agree in value for attribute B
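
The generalized definition translates into a direct check over the rows of a table. This is a minimal sketch; `determines` and the sample rows are hypothetical, and rows are represented as dictionaries.

```python
def determines(rows, a, b):
    """True if all rows that agree on attribute `a` also agree on `b`."""
    seen = {}  # value of a -> first value of b observed with it
    for row in rows:
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


# Made-up table: two different students share the name 'Ann'.
rows = [
    {"stu_id": 1, "name": "Ann", "major": "CS"},
    {"stu_id": 1, "name": "Ann", "major": "Math"},
    {"stu_id": 2, "name": "Ann", "major": "CS"},
]

id_gives_name = determines(rows, "stu_id", "name")  # stu_id -> name holds
name_gives_id = determines(rows, "name", "stu_id")  # name -> stu_id fails
```

A check like this can only refute a dependency on the data at hand; whether the dependency holds in general is a matter of the business rules.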

Subtype Discriminator

Attribute in the supertype entity that determines to which entity subtype the supertype occurrence is related, Default comparison condition is the equality comparison

Secondary Key

Attribute or combination of attributes used strictly for data retrieval purposes

Composite attribute

Attribute that can be subdivided to yield additional attributes

Simple attribute

Attribute that cannot be subdivided

Single-valued attribute

Attribute that has only a single value

Key attribute

Attribute that is a part of a key

Determinant

Attribute whose value determines another

Derived attribute

Attribute whose value is calculated from other attributes, Derived using an algorithm

Dependent

Attribute whose value is determined by the other attribute

Table Column

Attribute, and each column has a distinct name

Multivalued attributes

Attributes that have many values and require creating: Several new attributes, one for each component of the original multivalued attribute, A new entity composed of the original multivalued attribute's components

Implicit cursor

Automatically created when SQL statement returns only one value

Candidate Key

Any column that could be used as a primary key

Host language

Any language that contains embedded SQL statements

Union

Combines all rows from two tables, excluding duplicate rows

Special Operators

BETWEEN • Checks whether attribute value is within a range IS NULL • Checks whether attribute value is null LIKE • Checks whether attribute value matches given string pattern IN • Checks whether attribute value matches any value within a value list EXISTS • Checks if subquery returns any rows
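
Each special operator can be tried on a small table. The sketch below runs them in SQLite through Python's `sqlite3` module; the `product` and `vendor` tables, their rows, and the `codes` helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vendor (id INTEGER, name TEXT);
    INSERT INTO vendor VALUES (10, 'Acme'), (20, 'Bolt');

    CREATE TABLE product (code TEXT, descr TEXT, price REAL, vendor_id INTEGER);
    INSERT INTO product VALUES
        ('P1', 'Claw hammer',   9.95, 10),
        ('P2', 'Hand saw',     14.50, 20),
        ('P3', 'Power drill', 109.99, NULL),
        ('P4', 'Hacksaw',       4.99, 10);
""")


def codes(where):
    """Product codes matching a WHERE condition, in code order."""
    sql = f"SELECT code FROM product WHERE {where} ORDER BY code"
    return [row[0] for row in conn.execute(sql)]


between = codes("price BETWEEN 5 AND 20")   # value within a range
is_null = codes("vendor_id IS NULL")        # value is null
like = codes("descr LIKE 'Ha%'")            # matches a string pattern
in_list = codes("vendor_id IN (20, 30)")    # matches a value in a list
exists = codes(
    "EXISTS (SELECT 1 FROM vendor WHERE vendor.id = product.vendor_id)"
)                                           # subquery returns any rows
```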

Object/Relational Database Management System (O/R DBMS)

Based on ERDM, focuses on better data management

Object-oriented database management system (OODBMS)

Based on OODM

Determination

Based on the relationships among the attributes

The Object-Oriented Data Model (OODM)

Basic building block for autonomous structures, Abstraction of real-world entity

Conceptual schema

Basis for the identification and high-level description of the main data objects

Entity Relationship Model (ERM)

Basis of an entity relationship diagram (ERD)

Entity names - Required to:

Be descriptive of the objects in the business environment, Use terminology that is familiar to the users

Advantages of the DBMS

Better data integration and less data inconsistency, Increased end user productivity, Improved: Data sharing, Data security, Data access, Decision making

Where is the division between the client and the server in the Oracle environment?

Between the user process and the server process

BFILE

Binary file stored outside the database

An appropriate datatype for adding a sound clip would be:

Blob

Persistent stored module (PSM)

Block of code containing: § Standard SQL statements § Procedural extensions that is stored and executed at the DBMS server

Generalization

Bottom-up process, Identifies a higher-level, more generic entity supertype from lower-level entity subtypes, Based on grouping common characteristics and relationships of the subtypes

DTW Path Properties

Boundary Constraint (w must start and finish in the first and last points of the sequences); Continuity (at any given pt in w, we can only travel to neighboring pts); Monotonicity (points in w must be monotonically ordered)
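
The three path properties are what the standard dynamic-programming formulation of DTW enforces. The sketch below assumes absolute difference as the local cost, which the card does not specify; `dtw_distance` is a name chosen for the example.

```python
def dtw_distance(a, b):
    """Cost of the cheapest warping path between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0  # boundary: the path starts at the first points

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Continuity and monotonicity: a cell is reached only from its
            # left, lower, or lower-left neighbor, never by moving backwards.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )

    return cost[n][m]  # boundary: the path ends at the last points


stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

Repeating a point in one sequence costs nothing here, which is the sense in which DTW tolerates stretching along the time axis.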

The normal form which removes any remaining functional dependencies because there was more than one primary key for the same nonkeys is called:

Boyce-Codd normal form

Business Rules

Brief, precise, and unambiguous description of a policy, procedure, or principle, Enable defining the basic building blocks, Describe main and distinguishing characteristics of the data

One segment can be spread across many datafiles. How?

By assigning multiple datafiles to a tablespace

UNION

Combines rows from two or more queries without including duplicate rows. Syntax: query UNION query

EXCEPT (MINUS)

Combines rows from two queries and returns only the rows that appear in the first set but not in the second. Syntax: query EXCEPT query, or query MINUS query
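
The UNION, INTERSECT, and EXCEPT cards can be compared side by side on two small tables. This sketch uses SQLite through Python's `sqlite3` module; SQLite accepts EXCEPT but not the MINUS spelling, and the tables `a` and `b` are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")


def column(sql):
    """Run a one-column query and return its values sorted."""
    return sorted(row[0] for row in conn.execute(sql))


union_rows = column("SELECT x FROM a UNION SELECT x FROM b")          # no duplicates
union_all_rows = column("SELECT x FROM a UNION ALL SELECT x FROM b")  # keeps duplicates
intersect_rows = column("SELECT x FROM a INTERSECT SELECT x FROM b")  # in both sets
except_rows = column("SELECT x FROM a EXCEPT SELECT x FROM b")        # in a, not in b
```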

Shared

Can be accessed by different users at the same time

PL/SQL Stored Functions

Can be invoked only from within stored procedures or triggers

Foreign key

Can be used to cross-reference tables

Deleting a Table from the Database

Can drop a table only if it is not the one side of any relationship; otherwise the RDBMS generates a foreign key integrity violation error message when the table is dropped

Physical independence

Changes in physical model do not affect internal model

Logical independence

Changing internal model without affecting the conceptual model

An appropriate datatype for one wanting a fixed-length type for last name would include:

Char

NCLOB

Character LOB supports 2 byte character codes

CLOB

Character Large Object, stores up to 4GB of character data

Atomicity

Characteristic of an atomic attribute

Attribute

Characteristic of an entity, Columns

Attributes

Characteristics of entities

Value Constraints

Check Condition (CC), NOT NULL (NN), Unique (UK)

Metadata

Data about data, through which the end-user data are integrated and managed, Describe data characteristics and relationships

Data dependence

Data access changes when data storage characteristics change

CREATE VIEW statement

Data definition command that stores the subquery specification in the data dictionary. Syntax: CREATE VIEW viewname AS SELECT query

Categories of SQL function

Data definition language (DDL); Data manipulation language (DML)

DDL

Data definition language: specifies entity names/ attributes/ relationships for the stored data

Computerized File Systems

Data processing (DP) specialist: Created a computer based system that would track data and produce required reports

Data Redundancy To be controlled except the following circumstances

Data redundancy must be increased to make the database serve crucial information purposes, Exists to preserve the historical accuracy of the data

Data independence

Data storage characteristics are changed without affecting the program's ability to access the data

Structured data

Data that is represented in a strict format (e.g., the relational data model)

If there are several databases created off the same Oracle Home, how will Database Express be configured?

Database Express will give access to each database through different ports.

Decrypting

Decrypting using public-key encryption is based on each user having two keys. If user1 wants to share data with user2, then after the data has been encrypted with user2's public key, user2 uses their own private key to decrypt it.

DATE

Default format (DD-MON-YY)

restricted actions

Deleting a table from a user schema, Changing an existing column's data type, Decreasing the width of an existing column, Adding a primary key constraint to an existing column, Adding a foreign key constraint, Adding a UNIQUE constraint, Adding a CHECK constraint, Changing a column's default value

________ problems are encountered when removing data with transitive dependencies.

Deletion

Dependency diagram

Depicts all dependencies found within given table structure Helps to get an overview of all relationships among table's attributes Makes it less likely that an important dependency will be overlooked

Specialization Hierarchy

Depicts arrangement of higher-level entity supertypes and lower-level entity subtypes, Relationships are described in terms of "is-a" relationships, Subtype exists within the context of a supertype, Every subtype has one supertype to which it is directly related, Supertype can have many subtypes

Conceptual data model

Describes main data entities, attributes, relationships, and constraints

Unified Modeling Language (UML)

Describes sets of diagrams and symbols to graphically model a system

Connectivity

Describes the relationship classification

Domain

Describes the set of possible values for a given attribute

Data dictionary

Description of all tables in the database created by the user and designer

Denormalization

Design goals Creation of normalized relations Processing requirements and speed Number of database tables expands when tables are decomposed to conform to normalization requirements Joining a larger number of tables: Takes additional input/output (I/O) operations and processing logic Reduces system speed

Operational database

Designed to support a company's day to day operations

Conceptual Design

Designs a database independent of database software and physical details Designed as software and hardware independent

Logical design

Designs an enterprise-wide database that is based on a specific data model but independent of physical-level details

Database Analyst

Develop Databases for decision support reporting, SQL, query optimization, data warehouses

Developing an ER Diagram

Develop the initial ERD

Data anomaly

Develops when not all of the required changes in the redundant data are made successfully

One to one One to many Many to many

Different entity relationships? (3)

The Boyce-Codd Normal Form (BCNF)

Every determinant in the table should be a candidate key, Equivalent to 3NF when the table contains only one candidate key, Violated only when the table contains more than one candidate key, Considered to be a special case of 3NF

When will the Segment Advisor run?

Every night, as an autotask On demand

Referential integrity

Every reference to an entity instance by another entity instance is valid

A priori property of large itemsets

Every subset of a large itemset is also large
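
The a priori property can be checked on a small data set. The sketch below finds large itemsets by brute-force enumeration, which is not the Apriori algorithm itself, and then verifies that every subset of each one is also large; the function names, the baskets, and the threshold of 0.5 are assumptions.

```python
from itertools import combinations


def large_itemsets(transactions, min_supp):
    """All itemsets whose support is at least min_supp (brute force)."""
    items = sorted(set().union(*transactions))
    large = set()
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            count = sum(1 for t in transactions if set(combo) <= t)
            if count / len(transactions) >= min_supp:
                large.add(frozenset(combo))
    return large


def subsets_are_large(large):
    """True if every non-empty proper subset of each large itemset is large."""
    return all(
        frozenset(sub) in large
        for itemset in large
        for k in range(1, len(itemset))
        for sub in combinations(itemset, k)
    )


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
large = large_itemsets(baskets, min_supp=0.5)
```

Apriori-style algorithms use the same property in reverse to prune candidates: an itemset with a subset that is not large cannot be large.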

Total completeness

Every supertype occurrence must be a member of at least one subtype

Lack of Design and Data Modeling Skills

Evident despite the availability of multiple personal productivity tools, vital in the data design process, decreases communication between the designer, user, and the developer

Correlated Subquery

Executes once for each row in the outer query; the inner query references a column of the outer query; can be used with the EXISTS special operator
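
A correlated subquery with EXISTS might look like the sketch below, run in SQLite through Python's `sqlite3` module. The `customer` and `invoice` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (inv_id INTEGER PRIMARY KEY, cus_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO invoice VALUES (100, 1, 50.0), (101, 1, 20.0), (102, 3, 75.0);
""")

# The inner query refers to c.cus_id from the outer query, so it is
# evaluated for each customer row.
with_invoices = [row[0] for row in conn.execute("""
    SELECT c.name
    FROM customer c
    WHERE EXISTS (SELECT 1 FROM invoice i WHERE i.cus_id = c.cus_id)
    ORDER BY c.name
""")]
```

Bob has no invoices, so the subquery returns no rows for him and he is filtered out.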

Weak Entity Conditions

Existence-dependent

Use of Composite Primary Keys, When used as identifiers of weak entities, represent a real-world object that is:

Existence-dependent on another real-world object, Represented in the data model as two separate entities in a strong identifying relationship

Module coupling

Extent to which modules are independent of one another. Low coupling decreases unnecessary intermodule dependencies

Syntax alternatives

IN and NOT IN subqueries can be used in place of INTERSECT

Normalization Process Ensures that all tables are in at least 3NF Higher forms are not likely to be encountered in business environment Works one relation at a time Starts by:

Identifying the dependencies of a relation (table), Progressively breaking the relation into new set of relations

Database Security Officer

Implement security policies for data administration, DBMS fundamentals, database administration, SQL, data security technologies, etc.

Many-to-many (M:N) relationship

Implemented by creating a new entity in 1:M relationships with the original entities
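
An M:N relationship broken into two 1:M relationships through a bridge table can be sketched as follows, using SQLite through Python's `sqlite3` module. The `student`, `course`, and `enroll` tables are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (stu_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);

    -- Bridge (associative) entity: the "many" side of two 1:M relationships.
    CREATE TABLE enroll (
        stu_id    INTEGER NOT NULL REFERENCES student(stu_id),
        course_id INTEGER NOT NULL REFERENCES course(course_id),
        PRIMARY KEY (stu_id, course_id)
    );

    INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO course VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enroll VALUES (1, 10), (1, 20), (2, 10);
""")

# Joining through the bridge recovers the student-to-course pairs.
pairs = conn.execute("""
    SELECT s.name, c.title
    FROM enroll e
    JOIN student s ON s.stu_id = e.stu_id
    JOIN course c ON c.course_id = e.course_id
    ORDER BY s.name, c.title
""").fetchall()
```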

Which of the following is an objective of selecting a data type?

Improve data integrity

Where is the current redo byte address, also known as the incremental checkpoint position, recorded?

In the controlfile

File System Redux: Modern End User Productivity Tools

Includes spreadsheet programs such as Microsoft Excel

Disadvantages of Database Systems

Increased costs, Management complexity, Maintaining currency, Vendor dependence, Frequent upgrade/replacement cycles

Relationship Degree

Indicates the number of entities or participants associated with a relationship

Triggering timing

Indicates when trigger's PL/SQL code executes

Module

Information system component that handles specific business function

Implementation and Loading

Install the DBMS Create the databases Requires the creation of special storage-related constructs to house the end-user tables Load or convert the data Requires aggregating data from multiple sources

Role of the DBMS

Intermediary between the user and the database, Enables data to be shared, Presents the end user with an integrated view of the data, Receives and translates application requests into operations required to fulfill the requests, Hides database's internal complexity from the application programs and users

Conversion to Second Normal Form Table is in 2NF when it:

Is in 1NF Includes no partial dependencies

Associative Entities

Is in a 1:M relationship with the parent entities

Determination

Is the basis for establishing the role of a key

How can you enable the suspension and resumption of statements that hit space errors?

Issue an ALTER SESSION ENABLE RESUMABLE command Set the instance parameter RESUMABLE_TIMEOUT

DDL

It allows the user to create and restructure database objects

Relational database

It is a digital database

Entity relationship diagram

It is a graphical representation of entities and their relationships to each other

Primary key

It is a key that uniquely identifies each record in a table

Entity

It is a piece of data-an object or concept about which data is stored

What is a DB management system?

It is software that allows a database to be managed efficiently (i.e., to define, create, maintain, and control access to it)

What is a Database Management system?

It is software that allows the database to be managed efficiently.

Candidate key

It is an attribute or set of attributes that can act as a primary key for a table to uniquely identify each record in that table

Database system

It is an automated system that enables users to define, create, maintain and control access to the database

Relation

It is composed of rows and columns of data

Super key

It is defined as a set of attributes within a table that uniquely identifies each record within a table

Candidate key

It is defined as the set of fields from which primary key can be selected

Relational database

It is divided into logical units called tables, each composed of rows and columns of data

Relationship

It is how the data is shared between entities

Referential Integrity Example

It is impossible to have invalid sales representative number

Referential Integrity Purpose

It is possible for an attribute not to have a corresponding value, but it is impossible to have an invalid entry. It is impossible to delete a row in a table whose primary key has mandatory matching foreign key values in another table

Super key

It is superset of candidate key

What is Database recovery?

It is the process of restoring a database to a correct state after failure

SQL

It is the standard language used to define, query, update and maintain relational databases

Alter Table

It is used to add, delete or modify columns in an existing table

Key

It is used to establish and identify relation between tables

Relational model

It organizes data into one or more tables of columns and rows, with a unique key identifying each row

Domain

It refers to a set of valid atomic values for a given attribute

Primary key

It refers to an attribute or field that serves as a unique identifier for a particular record within a relation

Triggering action

PL/SQL code enclosed between the BEGIN and END keywords

Data storage management

Performance tuning

Joining Database Tables

Performed when data are retrieved from more than one table at a time; equality comparison between foreign key and primary key of related tables; tables are joined by listing tables in the FROM clause of the SELECT statement; the DBMS creates the Cartesian product of every table in the FROM clause

Procedural SQL

Performs a conditional or looping operation by isolating critical code and making all application programs call the shared code; yields better maintenance and logic control

Relational Database Management System (RDBMS)

Performs basic functions provided by the hierarchical and network DBMS systems, Makes the relational data model easier to understand and implement, Hides the complexities of the relational model from the user

________ database specification indicates all the parameters for data storage that are then input to database implementation.

Physical

Testing Factors

Physical security Password security Access rights Audit trails Data encryption Diskless workstations Optimization

Logically related data stored in a single logical data repository

Physically distributed among multiple storage facilities

Design Case 1: Implementing 1:1 Relationships Options for selecting and placing the foreign key:

Place a foreign key in both entities, Place a foreign key in one of the entities

NOT NULL constraint

Placed on a column to ensure that every row in the table has a value for that column

Batch update routine

Pools multiple transactions into a single batch to update a master table field in a single operation

Data Redundancy Implications

Poor data security, Data inconsistency, Increased likelihood of data entry errors when complex entries are made in different files, Data anomaly

Subschema

Portion of the database seen by the application programs that produce the desired information from the data within the database

Distributed Database Design

Portions of database may reside in different physical locations, Ensures database integrity, security, and performance

Periodic Maintenance Activities

Preventive maintenance (backup) Corrective maintenance (recovery) Adaptive maintenance Assignment of access permissions and their maintenance for new and old users Generation of database access statistics Periodic security audits Periodic system-usage summaries

Integrity Constraints

Primary (PK),Foreign(FK), Composite Key(CK)

Primary Key and Foreign Key

Primary key attributes contain both a NOT NULL and a UNIQUE specification; the RDBMS will automatically enforce referential integrity for foreign keys; the command sequence ends with a semicolon

Composite identifier

Primary key composed of more than one attribute

Weak (non-identifying) relationship

Primary key of the related entity does not contain a primary key component of the parent entity

Authentication

Process the DBMS uses to verify that only registered users access the data; required for the creation of tables; the user should log on to the RDBMS using a user ID and password created by the database administrator

Systems development

Process of creating information system

Physical design

Process of data storage organization and data access characteristics of the database

Database development

Process of database design and its implementation

Systems analysis

Process that establishes need for and extent of information system

Semistructured data

Processed to some extent

The Relational Model

Produced an automatic transmission database that replaced standard transmission databases, Based on a relation, Describes a precise set of data manipulation constructs

Information

Produced by processing data, Reveals the meaning of data, Enables knowledge creation, Should be accurate, relevant, and timely to enable good decision making

Denormalization

Produces a lower normal form, Results in increased performance and greater data redundancy Defects in unnormalized tables Data updates are less efficient because tables are larger Indexing is more cumbersome No simple strategies for creating virtual tables known as views

Static SQL

Programmer uses predefined SQL statements and parameters; SQL statements will not change while the application is running

Hierarchical Model Advantages

Promotes data sharing, Parent/child relationship promotes conceptual simplicity and data integrity, Database security is provided and enforced by DBMS, Efficient with 1:M relationships

The Information System

Provides for data collection, storage, and retrieval Composed of: People, hardware, software Database(s), application programs, procedures

Description of Operations

Provides precise, up-to-date, and reviewed description of activities defining organization's operating environment

You receive an alert warning you that a tablespace is nearly full. What action could you take to prevent this becoming a problem, without any impact for your users?

Purge all recycle bin objects in the tablespace Shrink the tables in the tablespace

Design Case 1: Implementing 1:1 Relationships Rule

Put primary key of the parent entity on the dependent entity as foreign key

Tasks to be Completed Before Using a New RDBMS Create database structure

RDBMS creates the physical files that will hold the database; differs from one RDBMS to another

Data

Raw facts, Have little meaning unless they have been organized in some logical manner, Building block of information

End User Data

Raw facts of interest to end user

Natural Keys or Natural Identifier

Real-world identifier used to uniquely identify real-world objects, Familiar to end users and forms part of their day-to-day business vocabulary, Used as the primary key of the entity being modeled

Q.34 In the figure below, what type of key is depicted?

Recursive foreign

Recursive Joins

Recursive query: a table is joined to itself using an alias; use aliases to differentiate the table from itself
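
A table joined to itself through aliases can be sketched as follows, using SQLite through Python's `sqlite3` module. The `emp` table, in which `mgr_id` is a foreign key back to `emp_id`, is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, name TEXT, mgr_id INTEGER);
    INSERT INTO emp VALUES
        (1, 'Dana', NULL),
        (2, 'Eli', 1),
        (3, 'Fay', 1),
        (4, 'Gus', 2);
""")

# Alias e plays the employee role and alias m the manager role of the same table.
reports = conn.execute("""
    SELECT e.name, m.name
    FROM emp e
    JOIN emp m ON e.mgr_id = m.emp_id
    ORDER BY e.emp_id
""").fetchall()
```

Dana has no manager, so the inner join leaves her out of the employee column.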

Entity

Refers to the entity set and not to a single entity occurrence

Data Redundancy

Relational database facilitates control of data redundancies through use of foreign keys

Logical View of Data

Relational database model enables logical representation of the data and its relationships

SQL Join Operators

Relational join operation merges rows from two tables and returns rows with one of the following

Relational Algebra

Relational operators have the property of closure

unrestricted action

Renaming a table, Adding new fields, Deleting fields, Increasing the maximum_size value field Deleting constraints

Network Models

Represent complex data relationships, Improve database performance and impose a database standard, Depicts both one-to-many (1:M) and many-to-many (M:N) relationships

Attribute name

Required to be descriptive of the data represented by the attribute

Embedded SQL

SQL statements contained within an application programming language. Differences between SQL and procedural languages: run-time mismatch; SQL is executed one instruction at a time; the host language runs at the client side in its own memory space

Which of the following violates the atomic property of relations?

Sam Hinz

Candidate key

Same characteristics as primary key but not chosen to be the primary key

Advantages of Storing Derived Attributes Stored

Saves CPU processing cycles, saves data access time, data value is readily available, can be used to keep track of historical data

Advantages of Storing Derived Attributes Not Stored

Saves storage space, computation always yields current value

Islands of information

Scattered data locations, Increases the probability of having different versions of the same data

A relation that contains no multivalued attributes and has nonkey attributes solely dependent on the primary key but contains transitive dependencies is in which normal form?

Second

Object-Oriented Model Advantages

Semantic content is added, Visual representation includes semantic content, Inheritance promotes data integrity

Which type of file is most efficient with storage space?

Sequential

Sequences

Sequential lists of numbers the database automatically generates to guarantee values are unique

Server Side Scripting

Server executes the script embedded within the HTML before it generates a web page. Java Server Pages allow java code to be embedded in static HTML. Java code is enclosed between <% ... %>. Simplifies task of connecting a database to the web.

How do sessions communicate with the database?

Server processes execute SQL received from user processes

Database Role

Set of database privileges that could be assigned as a unit to a user or group

Domain

Set of possible values for a given attribute

Constraint

Set of rules to ensure data integrity

Concurrency control system

Several users accessing the database at the same time

Difference between shared (read) lock and exclusive (write) lock?

Shared lock: -T is allowed only to read some data item -any other transaction can only read this item. Exclusive lock: -T is allowed to read and write on some data item -any other transaction has no access to this item

Database

Shared, integrated computer structure that stores a collection of: End user data and metadata

Data models

Simple representations of complex real-world data structures, Useful for supporting a specific problem domain

Primary Keys

Single attribute or a combination of attributes, which uniquely identifies each entity instance, Guarantees entity integrity, Works with foreign keys to implement relationships

Tuple, Table Row

Single entity occurrence within the entity set

Object-Oriented Model Disadvantages

Slow development of standards caused vendors to supply their own enhancements, Compromised widely accepted standard, Complex navigational system, Learning curve is steep, High system overhead slows transactions

Cookies

Small piece of text containing identifying information. Sent by server to browser on first interaction. Sent by browser to server on all subsequent interactions. Server saves information about cookies it issued, and can use it when serving a request; authentication information and user preferences.

Sources of Database Failure

Software, Hardware, Programming exemptions, Transactions, External factors

Multiuser access control

Sophisticated algorithms ensure that multiple users can access the database concurrently without compromising its integrity

Flags

Special codes used to indicate the absence of some value

Cursor

Special construct used to hold data rows returned by a SQL query
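
As a rough illustration of the idea, the sketch below uses Python's built-in sqlite3 module, whose cursor object holds the rows returned by a query so they can be fetched one at a time or all at once. The table name and sample rows are made up for this example.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO instructor VALUES (?, ?)", [(1, "Ada"), (2, "Edgar")])

# The cursor holds the result set of the query.
cur = conn.cursor()
cur.execute("SELECT id, name FROM instructor ORDER BY id")

first_row = cur.fetchone()       # advance the cursor by one row
remaining_rows = cur.fetchall()  # fetch whatever rows are left
```

Procedural languages such as PL/SQL declare cursors explicitly; the Python object above only mirrors the same "fetch row by row" behaviour.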

Specialization vs Generalization

Specialisation: -the top-down process of maximising the differences between entity occurrences, by identifying their distinguishing characteristics -given superclass(es), it leads to identifying subclasses. Generalisation: -the bottom-up process of minimising the differences between entity occurrences, by identifying their common characteristics -given subclasses, it leads to identifying superclass(es)

External schema

Specific representation of an external view

Internal schema

Specific representation of an internal model, Uses the database constructs supported by the chosen database

WHERE condition

Specifies the rows to be selected

FROM Subqueries: FROM clause

Specifies the tables from which the data will be drawn, Can use SELECT subquery

Completeness Constraint

Specifies whether each supertype occurrence must also be a member of at least one subtype

Embedded SQL framework defines:

Standard syntax to identify embedded SQL code within the host language, Standard syntax to identify host variables, Communication area used to exchange status and error information between SQL and host language

To create a database, in what mode must the instance be?

Started in NOMOUNT mode

Determination

State in which knowing the value of one attribute makes it possible to determine the value of another

Triggering level

Statement-level and row-level

Large Object (LOB)

Stores binary data, e.g. images, digitized sounds

Data warehouse

Stores data in a format optimized for decision support

Current generation DBMS software

Stores data structures, relationships between structures, and access paths, Defines, stores, and manages all access paths and components

Data dictionary

Stores definitions of the data elements and their relationships

Analytical database

Stores historical data and business metrics used exclusively for tactical or strategic decision making, data warehouse

Cohesivity

Strength of the relationships among the module's entities

Q.41 In the figure below, what type of relationship do the relations depict?

Strong entity/weak entity

Relational Model Advantages

Structural independence is promoted using independent tables, Tabular view improves conceptual simplicity, Ad hoc query capability is based on SQL, Isolates the end user from physical-level details, Improves implementation and management simplicity

Inline subquery

Subquery expression included in the attribute list that must return one value

Database fragment

Subset of a database stored at a given location

Specialization Hierarchy Provides the means to:

Support attribute inheritance, Define a special supertype attribute known as the subtype discriminator, Define disjoint/overlapping constraints and complete/partial constraints

Enterprise database

Supports many users across many departments

Multiuser database

Supports multiple users at the same time, Workgroup databases, Enterprise database

Single user database

Supports one user at a time, Desktop Database

Alter table tablename add columnname datatype

Syntax of alter table, add

Alter table tablename drop column columnname

Syntax of alter table, delete

Alter Table tablename modify column columnname datatype

Syntax of alter table, modify

Create Table tablename ( Column1 datatype(size), Column2 datatype(size), Primary Key(column1) );

Syntax of create table

Drop table tablename

Syntax of deleting a table with data

Adding Primary and Foreign Key Designations

Syntax to add or modify columns: ALTER TABLE tablename {ADD | MODIFY} ( columnname datatype [ {ADD | MODIFY} columnname datatype ] ) ; Syntax to add constraints: ALTER TABLE tablename ADD constraint [ ADD constraint ] ;

Network Model Disadvantages

System complexity limits efficiency, Navigational system yields complex implementation, application development, and management, Structural changes require changes in all application programs

System catalog

System data dictionary that describes all objects within the database

TIMESTAMP

TIMESTAMP (NumberofDecimalPlaces)

IND_COLUMNS

Table Columns that have indexes

CONSTRAINTS

Table Constraints

CONS_COLUMNS

Table columns that have constraints

Relational database

Table formatted database, a matrix with columns and rows

INDEXES

Table indexes created to improve query retrieval performance

Fourth Normal Form (4NF)

Table is in 4NF when it: Is in 3NF, Has no multivalued dependencies. Rules: All attributes must be dependent on the primary key, but they must be independent of each other, No row may contain two or more multivalued facts about an entity

Clustered Tables

Technique that stores related rows from two related tables in adjacent data blocks on disk

What statements regarding instance memory and session memory are correct?

The SGA is written to by all sessions; a PGA is written by one session. The SGA is allocated at instance startup.

Physical data independence

The ability to modify the physical schema without causing application programs to be rewritten

An Oracle instance can have only one of some processes, but several of others. Which of these processes can occur several times? A. The archive process B. The checkpoint process C. The database writer process D. The log writer process E. The session server process

The archive process, The database writer process, The session server process

Alternative key

A candidate key that is not selected as the primary key

Consider this tnsnames.ora net service name: orcl=(description= (address=(protocol=tcp)(host=dbserv1)(port=1521)) (connect_data=(service_name=orcl)(server=dedicated)) ) What will happen if shared server is configured and this net service name is used?

The connect will succeed with a dedicated server connection

What files are created by the CREATE DATABASE command?

The control file, The online redo log files, The SYSAUX tablespace datafile, The SYSTEM tablespace datafile

What is Durability?

The effects of a committed transaction are permanently recorded -they should never be lost because of a failure.

If a tablespace is created with the syntax create tablespace tbs1 datafile 'tbs1.dbf' size 10m; which of these characteristics will it have? A. The datafile will autoextend, but only to double its initial size B. The datafile will autoextend with MAXSIZE UNLIMITED C. The extent management will be local D. Segment space management will be with bitmaps E. The file will be created in the DB_CREATE_FILE_DEST directory

The extent management will be local, Segment space management will be with bitmaps

What is the concurrency control protocol?

The process of managing simultaneous operations on the DB without them interfering with each other.

Prepared Statement

The same query can be compiled once and then run multiple times with different parameter values. PreparedStatement pStmt = conn.prepareStatement("insert into instructor values(?,?,?,?)");
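
The same idea can be sketched outside JDBC. The example below is an analogy, not the Java API: it uses Python's built-in sqlite3 module, where one SQL string with `?` placeholders is run repeatedly with different parameter values. The column names and sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical four-column table matching the four placeholders in the card.
conn.execute(
    "CREATE TABLE instructor (id TEXT, name TEXT, dept_name TEXT, salary REAL)"
)

# One statement text with placeholders; values are bound separately at run time.
insert_sql = "INSERT INTO instructor VALUES (?, ?, ?, ?)"
sample_rows = [
    ("1", "Ada", "Math", 100.0),
    ("2", "Edgar", "Physics", 120.0),
]
conn.executemany(insert_sql, sample_rows)  # same statement, different parameters

inserted_count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```

Binding values through placeholders instead of string concatenation also keeps user input from being interpreted as SQL.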

What is a DB schema?

The total description of the DB.

Relational Algebra

Theoretical way of manipulating table contents using relational operators

Entity Supertypes and Subtypes Criteria to determine the usage

There must be different, identifiable kinds of the entity in the user's environment, The different kinds of instances should each have one or more attributes that are unique to that kind of instance

If you stop your listener, what will happen to sessions that connected through it?

They will not be affected in any way

Reasons for Identifying and Documenting Business Rules Allow designer to

Understand the nature, role, scope of data, and business processes, Develop appropriate relationship participation rules and constraints, Create an accurate data model

Entity

Unique and distinct object used to collect and store data

Data Redundancy

Unnecessarily storing same data at different places, Islands of information

Types of Data Anomaly

Update Anomalies, Insertion Anomalies, Deletion Anomalies

Adding a column

Use ALTER and ADD, Do not include the NOT NULL clause for new column

Dropping a column

Use ALTER and DROP, Some RDBMSs impose restrictions on the deletion of an attribute

Changing Column's Data Characteristics

Use ALTER to change data characteristics. Changes in column's characteristics are permitted if changes do not alter the existing data type. Syntax: ALTER TABLE tablename MODIFY (columnname(characteristic)) ;

Procedural Language SQL (PL/SQL)

Use and storage of procedural code and SQL statements within the database, Merging of SQL and traditional programming constructs

Closure

Use of relational algebra operators on existing relations produces new relations

Creating Table Structures

Use one line per column (attribute) definition, Use spaces to line up attribute characteristics and constraints, Table and attribute names are capitalized

Surrogate Keys Require ensuring that the candidate key of entity in question performs properly

Use unique index and not null constraints

Surrogate Keys

Used by designers when the primary key is considered to be unsuitable, System-defined attribute, Created and managed via the DBMS, Have a numeric value which is automatically incremented for each new row

Forms Builder

Used for creating custom applications

Reports Builder

Used for creating reports for displaying, printing, and distributing summary data

Enterprise Manager

Used for performing database administration tasks such as creating new user accounts and configuring how the DBMS stores and manages data

Inserting Table Rows with a SELECT Subquery

Used to add multiple rows using another table as source
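
A minimal sketch of this pattern, run through Python's sqlite3 module: an INSERT whose rows come from a SELECT on another table. The table names, columns and prices are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (code TEXT, price REAL)")
conn.execute("CREATE TABLE cheap_product (code TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("A", 5.0), ("B", 50.0), ("C", 8.0)],
)

# The SELECT subquery supplies the rows; no VALUES clause is used.
conn.execute(
    "INSERT INTO cheap_product SELECT code, price FROM product WHERE price < 10"
)

copied_codes = [
    row[0] for row in conn.execute("SELECT code FROM cheap_product ORDER BY code")
]
```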

IN subqueries

Used to compare a single attribute to a list of values

SQL*Plus

Used to create and test command-line SQL queries and execute PL/SQL procedural programs

Data definition language (DDL) commands

Used to create new database objects and modify or delete existing objects. DDL Commands automatically save the change made to the database

Data manipulation language (DML) commands

Used to insert, update and view database data. DML commands must be explicitly saved

Associative Entities

Used to represent an M:N relationship between two or more entities

Selecting Rows Using Conditional Restrictions

Used to select partial table contents by placing restrictions on the rows

Updatable Views

Used to update attributes in any base tables used in the view

Need for Normalization

Used while designing a new database structure, Analyzes the relationship among the attributes within each entity, Determines if the structure can be improved, Improves the existing data structure and creates an appropriate database design

Presentation Layer

User Interface

When the database is in mount mode, what views may be queried to find what datafiles and tablespaces make up the database?

V$DATAFILE V$TABLESPACE

Which of these views can be queried successfully in NOMOUNT mode? A. DBA_DATA_FILES B. DBA_TABLESPACES C. V$DATABASE D. V$DATAFILE E. V$INSTANCE F. V$SESSION

V$INSTANCE V$SESSION

NVARCHAR2 & NCHAR

VARCHAR2 & CHAR use ASCII coding; Unicode is used with NVARCHAR2 & NCHAR to address the need to expand beyond the 256-character limitation of ASCII for other languages

Functional dependence

Value of one or more attributes determines the value of one or more other attributes

Key Fields

Values that help identify an individual row or link data from different tables

VARCHAR2

Variable-length character data, e.g. l_name VARCHAR2(maximum_sizeofstring), where the maximum size is in the range 0-4000

Data Model Verification

Verified against proposed system processes, Revision of original design, Careful reevaluation of entities, Detailed examination of attributes describing entities

________ partitioning distributes the columns of a table into several separate physical records.

Vertical

Entity Cluster

Virtual entity type used to represent multiple entities and relationships in ERD, Avoid the display of attributes to eliminate complications that result when the inheritance rules change

View

Virtual table based on a SELECT query
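
A small sketch, using Python's sqlite3 module, of what "virtual" means here: the view stores only the SELECT, so rows added to the base table later show up in the view automatically. The table, view and sample data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES ('Ana', 70000)")

# The view stores the query, not a copy of the data.
conn.execute(
    "CREATE VIEW well_paid AS SELECT name FROM employee WHERE salary > 50000"
)
before = [r[0] for r in conn.execute("SELECT name FROM well_paid ORDER BY name")]

# New base-table rows are visible through the view without redefining it.
conn.execute("INSERT INTO employee VALUES ('Ben', 90000)")
conn.execute("INSERT INTO employee VALUES ('Cy', 30000)")
after = [r[0] for r in conn.execute("SELECT name FROM well_paid ORDER BY name")]
```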

Referential integrity

If an attribute is a foreign key, then its value is either: -null or -matches a primary key value of the corresponding relation
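
The rule can be observed with Python's sqlite3 module, as a sketch. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; the two tables and their values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " dept_id INTEGER REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (10)")

conn.execute("INSERT INTO employee VALUES (1, 10)")    # matches a primary key: allowed
conn.execute("INSERT INTO employee VALUES (2, NULL)")  # null foreign key: allowed

try:
    conn.execute("INSERT INTO employee VALUES (3, 99)")  # no department 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```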

An alternative name for an attribute is called a(n):

alias

Which of the following commands will shrink space in a table or index segment and relocate the HWM? A. alter table employees shrink space compact hwm; B. alter table employees shrink space hwm; C. alter table employees shrink space compact; D. alter table employees shrink space; E. alter index employees shrink space cascade;

alter table employees shrink space;

INTERVAL YEAR TO MONTH

column_name INTERVAL YEAR[(year)] TO MONTH

Sensitivity testing involves:

checking to see if missing data will greatly impact results

A method to allow adjacent secondary memory space to contain rows from several tables is called:

clustering

Attribute domain

column has a specific range of values

INTERVAL DAY TO SECOND

column_name INTERVAL DAY(MAXIMUM DIGITS EXPRESSED IN THE COLUMN) TO SECOND(MAXIMUM DIGITS EXPRESSED IN ELAPSED SECONDS)

ALTER TABLE

command: To make changes in the table structure

A primary key that consists of more than one attribute is called a:

composite key

lookup table (pick list)

contains a list of legal values for a column in another table

In the SQL language, the ________ statement is used to make table definitions.

create table

When a regular entity type contains a multivalued attribute, one must:

create two new relations, one containing the multivalued attribute

The storage format for each attribute from the logical data model is chosen to maximize ________ and minimize storage space.

data integrity

The value a field will assume unless the user enters an explicit value for an instance of that field is called a:

default value

integrity constraints

define primary and foreign keys

value constraints

define specific data values or date ranges that must be inserted into columns (unique or null)

Answering queries with CUBE(F) - Usage

define tuples for a query q s.t.: 1) if the query q has a condition s.t. A=v for some constant v, then t must have A=v 2) if query GROUPS BY attribute A, should NOT aggregate over A; t should have concrete values (any non * value) for A 3) if attribute A does not appear in GROUP BY or WHERE clause, then t should have A = * (aggregate over A for all t's)

Designing physical files requires ________ of where and when data are used in various ways.

descriptions

A nonkey attribute is also called a(n):

descriptor

The attribute on the left-hand side of the arrow in a functional dependency is the:

determinant

Database designer

determines whether an entity is weak based on business rules

Mapping Quantitative Association Rules to Boolean Rules:

discretize the data of each column into categories and then, for each column with k category options, replace it with k new boolean columns (one for each option)
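
A minimal sketch of that mapping for one numeric column. The bin boundaries, labels and the `age_` prefix are assumptions chosen for illustration; each value falls in exactly one bin, so exactly one of the k boolean columns is true.

```python
# Hypothetical bins for an "age" column: (label, inclusive low, exclusive high).
AGE_BINS = [("young", 0, 30), ("middle", 30, 60), ("senior", 60, 200)]


def to_boolean_columns(value, bins, prefix):
    """Replace one quantitative value with k boolean columns, one per bin."""
    return {f"{prefix}_{label}": low <= value < high for label, low, high in bins}


ages = [23, 41, 67]
boolean_rows = [to_boolean_columns(age, AGE_BINS, "age") for age in ages]
```

After this step, ordinary boolean association-rule mining can treat each new column as an item.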

Single linkage (clustering)

distance b/w 2 clusters A and B = min{d(a,b): a∈A, b∈B}
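
The definition translates almost directly into code. The sketch below assumes points are coordinate tuples and that d is Euclidean distance; the sample clusters are made up.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def single_linkage(cluster_a, cluster_b):
    """Distance between clusters = distance of their closest pair of points."""
    return min(dist(a, b) for a in cluster_a for b in cluster_b)


cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (9.0, 0.0)]
linkage = single_linkage(cluster_a, cluster_b)  # closest pair is (1, 0) and (4, 0)
```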

An advantage of partitioning is:

efficiency

In most cases, the goal of ________ dominates the design process.

efficient data processing

A factor to consider when choosing a file organization is:

efficient storage

INTERVAL

elapsed time between two dates

A primary key whose value is unique across all relations is called a(n):

enterprise key

A contiguous section of disk storage space is called a(n):

extent

A disadvantage of partitioning is:

extra space and update time

The smallest unit of application data recognized by system software is a:

field

A(n) ________ is a technique for physically arranging the records of a file on secondary storage devices.

file organization

When all multivalued attributes have been removed from a relation, it is said to be in:

first normal form

Constraints on generalized association rules

for non-trivial association rules X => Y: no item in Y is an ancestor of an item in X (b/c x=>ancestor(x) is trivial/ always holds)

An attribute in a relation of a database that serves as the primary key of another relation in the same database is called a:

foreign key

The normal form which deals with multivalued dependencies is called:

fourth normal form

A constraint between two attributes is called a(n):

functional dependency

Transaction Processing

grouping of related database changes into a unit of work that will either succeed (proceed) or fail (terminate) as a whole
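
The all-or-nothing behaviour can be sketched with Python's sqlite3 module, where `with conn:` commits the unit of work on success and rolls it back if an exception is raised. The account table, the CHECK constraint and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(amount):
    """Move money from account 1 to account 2 as one unit of work."""
    try:
        with conn:  # commit if both updates succeed, roll back otherwise
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = 2", (amount,)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = 1", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = transfer(60)   # enough funds: both changes are saved
second_ok = transfer(60)  # would overdraw account 1: both changes are undone
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

Note that the failed transfer leaves no trace of its first update, even though that statement ran successfully before the second one failed.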

A file organization that uses hashing to map a key into a location in an index where there is a pointer to the actual data record matching the hash key is called a:

hash index table

A(n) ________ is a routine that converts a primary key value into a relative record number.

hashing algorithm
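
One common scheme is division-remainder hashing, sketched below. This is only one possible hashing routine, and the file size of 97 slots is an assumption picked for the example.

```python
def relative_record_number(primary_key, file_size):
    """Division-remainder hashing: map a numeric key to a slot in 0..file_size-1."""
    return primary_key % file_size


FILE_SIZE = 97  # hypothetical number of record slots (a prime spreads keys better)

slot = relative_record_number(12345, FILE_SIZE)
# Different keys can hash to the same slot (a collision), which the file
# organization must handle separately.
colliding_slot = relative_record_number(12345 + FILE_SIZE, FILE_SIZE)
```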

A file organization where files are not stored in any particular order is considered a:

heap file organization.

An attribute that may have more than one meaning is called a(n):

homonym

Distributing the rows of data into separate files is called:

horizontal partitioning

The entity integrity rule states that:

no primary key attribute can be null

A requirement to begin designing physical files and databases is:

normalized relations

Understanding the steps involved in transforming EER diagrams into relations is important because:

one must be able to check the output of a CASE tool

Q.16 In the figure below, what type of relationship do the relations depict?

one-to-Many

While Oracle has responsibility for managing data inside a tablespace, the tablespace as a whole is managed by the:

operating system

Iceberg queries

queries that generate a large set of data but only keep/return a small amount after filtering (i.e., tip of the iceberg) (can use the a priori algorithm to optimize these by first getting sets of entities that satisfy the constraints and then joining entity types to get full data candidates & applying filtering again, i.e., only include customers that bought more than a minimum number of items)

Subquery

query inside another query

An integrity control supported by a DBMS is:

range control

A rule that states that each foreign key value must match a primary key value in the other relation is called the:

referential integrity constraint

A two-dimensional table of data sometimes is called a:

relation

Relational OLAP (ROLAP)

uses relational and specialized relational DBMSs to store and manage warehouse data (set up like traditional relational DBs)

Attributes

represent different characteristics of the entity

table constraint

restricts the data value with respect to all other values in the table

Euclidean Distance between two time series

sqrt(∑(c_i - q_i)²), where Q and C are assumed to be same length. Problem: time series that are even a little staggered along time axis have very high distance for this distance function (cannot compensate for small distortions in time axis)
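
The formula and the stated problem can both be seen in a short sketch. The two sample series are invented: they have the same shape, but one is shifted by a single time step.

```python
from math import sqrt


def euclidean_distance(q, c):
    """sqrt(sum((c_i - q_i)^2)) for two time series of equal length."""
    if len(q) != len(c):
        raise ValueError("time series must have the same length")
    return sqrt(sum((ci - qi) ** 2 for qi, ci in zip(q, c)))


spike = [0, 1, 0, 0]
shifted_spike = [0, 0, 1, 0]  # same shape, staggered by one step along the time axis

same = euclidean_distance(spike, spike)
staggered = euclidean_distance(spike, shifted_spike)
```

The shifted copy is scored as far away as two unrelated spikes would be, because points are only ever compared at the same index.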

Column-oriented storage for OLAP

store each column's data together instead of storing by rows; good for read only data (but can be expensive to update/write to); easier run-length compression
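
To illustrate why storing a column together makes run-length compression easier, the sketch below encodes one column as (value, run length) pairs. The column contents are made up, and real column stores use more elaborate encodings.

```python
from itertools import groupby


def run_length_encode(column):
    """Collapse consecutive repeats in one column into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


# A hypothetical "country" column stored contiguously, as in a column store.
country_column = ["US", "US", "US", "DE", "DE", "US"]
encoded = run_length_encode(country_column)
```

In a row store these values would be interleaved with the other attributes of each row, so the long runs would not be adjacent on disk.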

datetime

stores date and time data

Association rules as implication rules

support({A,B}) = Pr(A, B); Conf(A=>B) = Pr(B|A) = Pr(A, B)/Pr(A), which equals Pr(B) if A and B are independent (problem!)
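
Both measures can be computed by counting, as in this sketch. The four transactions are invented sample data, and probabilities are estimated as fractions of transactions.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Conf(A => B) = support(A and B) / support(A)."""
    return support(antecedent | consequent, transactions) / support(
        antecedent, transactions
    )


transactions = [
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

support_bread_milk = support({"bread", "milk"}, transactions)  # 2 of 4 transactions
conf_bread_to_milk = confidence({"bread"}, {"milk"}, transactions)  # 2 of the 3 with bread
```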

Two or more attributes having different names but the same meaning are called:

synonyms

Data is represented in the form of:

tables

Within Oracle, the named set of storage elements in which physical files for database tables may be stored is called a(n):

tablespace

