DBS

Pataasin ang iyong marka sa homework at exams ngayon gamit ang Quizwiz!

Relational Algebra

σ Selection π projection x cross product A Join p Renaming - Quantity difference ÷ Division ∪ Union ∩ Mean Average F Semi-Join (left) E semi-join (right) C left outer join D right outer join

Typical problems without DBMS

- Redundancy and inconsistency - Limited accessibility - Problems with multi-user operation - loss of data - Integrity violation - Security problems - High development costs for application programs - Long term data usage - Scalability

User interfaces for DBS

1. Menu driven surfaces - Guide the user by formulating a request - User need not remember any commands or syntax 2. Form-based surfaces - User enters data in form - Forms realize interfaces for transactions 3. Natural language user interfaces Input is in plain text, system tries to understand this 4. User interfaces for the database administrator - Enter privileged commands - Setting system parameters, granting rights, defining and tuning storage structures

Logical data models

A DATA MODEL is a way of explaining the logical layout of the data and the relationship of various parts to each other and the whole. In the early years often a flat file system, or a simple text file with all the data listed in some order. Network Model contains many links among the various items of data. Interrelated indices allow access to data from a variety of directions. Hierarchical data model constructed using a tree model, with one root and several levels of subtrees. Data are accessed beginning with the root and traveling down the tree until the desired details are located Relational data model - Most widely used - all data are kept in tables or relations - Extensions (object orientation, XML) were assimilated Object-oriented data model - Only with embedded DB (CAD, CASE tools, ...) Deductive data model - Logical statements - mostly only in research PoC

Relational schema (Example)

A relation schema R, denoted by R(A1, A2, ... , An), is made up of a relation name R and a list of attributes, A1, A2, ... , An. Each attribute Ai is the name of a role played by some domain D in the relation schema R. D is called the domain of Ai and is denoted by dom(Ai). A relation schema is used to describe a relation; R is called the name of this relation. The degree of a relation is the number of attributes n of its relation schema. A relation of degree seven, which stores information about university students, would contain seven attributes describing each student as follows: STUDENT(Name, Ssn, Home_phone, Address, Office_phone, Age, Gpa) Using the data type of each attribute, the definition is sometimes written as: STUDENT(Name: string, Ssn: string, Home_phone: string, Address: string, Office_phone: string, Age: integer, Gpa: real) For this relation schema, STUDENT is the name of the relation, which has seven attributes. In the preceding definition, we showed assignment of generic types such as string or integer to the attributes. More precisely, we can specify the following previously defined domains for some of the attributes of the STUDENT relation: dom(Name) = Names; dom(Ssn) = Social_security_numbers; dom(HomePhone) = USA_phone_numbers3, dom(Office_phone) = USA_phone_numbers, and dom(Gpa) = Grade_point_averages. It is also possible to refer to attributes of a relation schema by their position within the relation; thus, the second attribute of the STUDENT relation is Ssn, whereas the fourth attribute is Address.

Data manipulation language (DML)

Access to DB allows the user to enter, to retrieve, and to update the data. Typical accesses: reading, inserting, changing and deleting data Procedural DML (CLI) vs. Non-procedural DML (Pre-Compiler) The actual population of the database with information is accomplished using the DML. INSERT, UPDATE, DELETE

Data Definition Language (DDL)

Allows the user to define the organization of the data at the logical level. Definition of the external, conceptual and internal schema DDL Compiler translates DDL statements into schema instances of the database catalog(mapping from the logical to the physical.) is used to create the tables and describe the fields within each table CREATE, ALTER, DROP, RENAME

Additional SQL functions

Arithmetic operators: +, -, *, / String operators: || Functions: to_number, to_char, to_date; Nvl, decode; sysdate

⊆ (subset)

B is a subset of A (written B ⊆ A) iff every member of B is a member of A.

CASE function

Case-by-case differ in MySQL Wider support than decode However, only exact comparisons. No intervals. CASE <attribute> WHEN <expression> THEN <result> [WHEN <expression> THEN <result> ...] [ELSE <result>] END

Aggregate functions SQL

Consolidate received tuples to a value Counting the tuples: count, Output of different values: distinct, Maximum value: max, Minimum value: min, Sum: sum, Average: avg

DELETE

DELETE FROM vorlesungen WHERE SWS > 2;

Required language options and interfaces for each user category

Data Definition Language (DDL) Data manipulation language (DML)

Data abstraction in DBS

Data Independence: • logical independence The logical organization is the conceptual data model that is being implemented. • physical data independence The physical organization is related to how the data are actually stored on the disk. Change in a level does not affect the levels above it the data should be kept separate from the DBMS. At the physical level, the data should be independent of the particular model or architecture. The schema at any of the three levels should be modifiable without interfering with the next higher level. For example, the physical storage of the database might need to be changed. However, this change should not affect either the conceptual view of what is stored or the user's ability to understand and access the data. The data should also be logically independent. Different users and application programs require different information via different logical views. A well-designed system will maintain data independence both physically and logically.

Data modeling

Detail of the real miniature world --(manual / intellectual modeling)->Conceptual schema (ER model) -(semi-automatic transformation according to defined rules)-> Relational scheme

Sequence example (rel alg)

For example, to retrieve the first name, last name, and salary of all employees who work in department number 5, we must apply a SELECT and a PROJECT operation. We can write a sin- gle relational algebra expression, also known as an in-line expression, as follows: πFname, Lname, Salary(σDno=5(EMPLOYEE)) SELECT E.Fname AS First_name, E.Lname AS Last_name, E.Salary AS Salary FROM EMPLOYEE AS E WHERE E.Dno=5,

3-layer architecture and data abstraction

Goal: Separation of user application and storage details External level - Provision of information views the one concerned with the users. Whether the users are application programmers or end users, they still have a view, or mental model, of the database and what it contains. Conceptual level - Database schema describes the data to be stored the logical definition of the database, is sometimes referred to as the community view. The data model and the schema diagram are both explanations of the database on the conceptual level. The DBA and assistants maintain the schema and usually are the ones who use the DDL to define the database. Internal level - Definition of the type of data storage concerns the way the data are physically stored on the hardware. The internal level is described using the actual bytes and machine-level terminology. Supported to a certain extent by the majority of DBMS

INSERT

INSERT INTO Professoren (PersNr, Name, Rang, Raum, AssiPersNr, AssiName, Fachgebiet) VALUES (2125, 'Sokrates', 'C4', 226, 3002, 'Platon', 'Ideenlehre'); or INSERT INTO Professoren VALUES (2125, 'Sokrates', 'C4', 226, 3002, 'Platon', 'Ideenlehre');

UPDATE (change)

UPDATE vorlesungen SET SWS = SWS * 2; single tuple: UPDATE vorlesungen SET SWS = SWS * 2 WHERE SWS < 4;

Domain

Let D be a new set defined as the union of all the attribute domains. D = D1 ∪ D2 . . . ∪ Dn(value range)/data type Example: Telephone book ⊆ string x string x integer For each column of a table there is a set of possible values called its domain. The domain contains all permissible values that can appear under that column. That is, the domain of any column can be viewed as a pool of values from which we can draw values for the column. We will denote the domain of any given column or attribute by Dom(column name). Observe that any value that appears under a column must belong to its domain. In a table, it is possible that two or more columns may have the same domain.

Explain the difference between logical and physical data independence.

Physical data independence is related to the actual storage of the data on the storage medium. The way the data are stored in bits and bytes should not affect users' access to that data. Logical data independence usually relates to the different logical views of the data available to different users. If one user sees customer name and address without all the billing information, that does not mean that the billing information is not being stored. Another user should be able to see and even change this information without corrupting the database as a whole.

Essential tasks of a DBMS

Providing the typical DBMS functionality for users and application programs - Database query language: declarative (vs. procedural) - Transaction Concept / Recovery: ACID property - Multi-user / network capability Data security and privacy - Protection against unauthorized access Technical efficiency - Response times and throughput Hide the computer architecture - Single / multi-processor systems - Multiple hard drives, RAID systems ...

SELECT

SELECT [ALL | DISTINCT] {(pseudo) column list | *} [Column alias] FROM table list [table alias] [WHERE conditions on rows] [GROUP BY groupings] [HAVING conditions on groups] [{UNION | INTERSECT | MINUS} query] [ORDER BYsorter columns] Eg. SELECT PersNr, Name FROM Professoren WHERE Rang= ́C4 ́; ORDER BY Rang DESC, Name ASC;

WHERE clause

Selection conditions on lines Comparison operators: =, <>,>, <,> =, <= LIKE BETWEEN AND [NOT] IN IS [NOT] NULL Examples: ... WHERE (FlzTypus = '747') OR (manufacturer LIKE 'B%') ... WHERE FlightGes IN ('LH', 'BA', 'TA');

HAVING

Similar to the WHERE clause: additional conditions can be set to the groups formed by GROUP BY using HAVING Examples: SELECT gelesenVon, SUM (SWS) FROM lectures GROUP BY gelesenVon HAVING AVG (SWS)> 3; SELECT Name, SUM (SWS) FROM Lectures, Professors WHERE gelesenVon = PersNr GROUP BY Name HAVING AVG (SWS)> 3;

Database languages as protocols

Stand-alone query languages for the participant and data-storage system dialog An Embedded query language in the programming language (host language) for the further processing of the selected data by application programs Database programming language for complete integration of the data model including DDL into a programming language Rules for the linguistic formulation of transaction procedures: - Start and finish of execution - Return to the state at the beginning of the execution

GROUP BY clause

Summary of tuples that have the same value in a selected attribute combination this value appears - in the base table in several lines, - in the grouped result table only once more Care: in SELECT clause may only be projected on - those attributes that are grouped - Group functions, applied to other attributes of the base table - constant Example: SELECT gelesenVon, SUM (SWS) FROM lectures GROUP BY gelesenVon;

Project (rel algebra)

The PROJECT operation, on the other hand, selects certain columns from the table and discards the other columns. (duplicate elimination) πSex, Salary(EMPLOYEE) SELECT DISTINCT Sex, Salary FROM EMPLOYEE

SELECT (rel algebra)

The SELECT operation is denoted by σ<selection condition>(R) σDno=4 AND Salary>25000 (EMPLOYEE) SELECT * FROM EMPLOYEE WHERE Dno=4 AND Salary>25000;

SET DIFFERENCE (or MINUS) (rel alg)

The result of this operation, denoted by R - S, is a relation that includes all tuples that are in R but not in S. (Only instructors who are not students)

INTERSECTION (rel alg)

The result of this operation, denoted by R ∩ S, is a relation that includes all tuples that are in both R and S. (Student who is also Instructor)

UNION (rel alg)

The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that are either in R or in S or in both R and S. Duplicate tuples are eliminated. (Adding everything to one table, leaving the same amount of columns). (Student and Instructor together without duplicates)

Database

a logically coherent collection of data with some inherent meaning

Relation

a relation is the definition of a table with columns (attributes) and rows (tuples). Relation: R ⊆ D1 x ... x Dn Example: Telephone book ⊆ string x string x integer A relation (or relation state) r of the relation schema R(A1, A2, ... , An), also denoted by r(R), is a set of n-tuples r = {t1, t2, ... , tm}. Each n-tuple t is an ordered list of n valuest=<v1,v2,...,vn>,whereeachvaluevi,1≤i≤n,isanelementofdom(Ai)oris a special NULL value. (NULL values are discussed further below and in Section 5.1.2.) The ith value in tuple t, which corresponds to the attribute Ai, is referred to as t[Ai] or t.Ai (or t[i] if we use the positional notation).

Relational schema

determines the structure of the stored data Ex .: Phonebook: {[Name: string, Address: string, Phone #: integer]} A named relation defined by a set of attribute and domain name pairs. Let A1, A2, . . . , An be attributes with domains D1, D2, . . . , Dn. Then the set {A1:D1, A2:D2, . . . , An:Dn} is a relation schema.

SQL Data types

integer (oder auch integer4) smallint (oder auch integer2) numeric(p,q) float(p) character(n), character varying(n) oder varchar(n) bit(n), bit varying(n) date, time, timestamp blob (raw), clob (binary oder character large object)

Data Control Language, DCL

is used to create privileges to allow users access to, and manipulation of, the database. There are two main commands: GRANT to grant a privilege to a user REVOKE to revoke (remove) a privilege from a user

Database Schema

is used to describe the conceptual organization of the database system. This organization is defined during the design process. Once the database has been created and has begun to be populated with data, it is sometimes difficult to change the schema.

JOIN (rel alg)

operation, denoted by A , is used to combine related tuples from two relations into single "longer" tuples. We want to retrieve the name of the manager of each department. To get the manager's name, we need to com- bine each department tuple with the employee tuple whose Ssn value matches the Mgr_ssn value in the department tuple. We do this by using the JOIN operation and then projecting the result over the necessary attributes, as follows: R A <join condition>S DEPARTMENT A Mgr_ssn=SsnEMPLOYEE Result-Columns: Dname, Dnumber, Mgr_ssn, Fname, Minit, Lname, Ssn (Mgr_ssn and SSN contain the same values)

ORDER BY clause

ordered display of query results Examples: SELECT PLZ, name, inhabitant FROM city ORDER BY PLZ ASC; SELECT semester, name, address FROM Student ORDERBY semester;

DBMS

software system that allows users to define, create and maintain a database and provides controlled access to the data. - Provide basic functionality through DBMS - Abstraction layer between DB and applications - Central interfaces

Tuple

t ∈ R (t is an element of R) Example: t = ("Mickey Mouse", "Main Street", 4711)


Kaugnay na mga set ng pag-aaral

Texas Principles of Real Estate 1 Chapter 11

View Set

Psychology 1101: Unit 2 - The Brain and Memory

View Set

Economics Ch 1 and 2 Test Review

View Set

Chapter 18 Endocrine and Reproductive Systems

View Set

MCSA Windows 10 70-698 Assessment Test

View Set

Chapter 3 - External Analysis: Industry Structure, Competitive Forces, and Strategic Groups

View Set

Unit 8 - Innovation and Commercialisation

View Set