csci330 test1
integrity constraints
things like foreign keys, primary keys, not null. OPTIONAL
SQL
a non proceduarl language
atomic
non changing
Atomicity
what SHOULD happen
DBMS or DBS
collection of interrelated data (database) and a set of programs that allows users to access and modify these data
Database
collection of structured and interrelated data
intersect
like "and"
Database administrator
Granting of authorization for data access. By granting different types of authorization, the database administrator can regulate which parts of the database various users can access. The authorization information is kept in a special system structure that the database system consults whenever someone attempts to access the data in the system.
Relation schema
consists of 1)name 2)list of attributes 3)their domains ex) Instructor(ID,name,dept_name,salary) refers to its logical design, while an instance of the relation refers to its contents at a point in time. The schema of a database and an instance of a database are similarly defined. The schema of a relation in- cludes its attributes, and optionally the types of the attributes and constraints on the relation such as primary and foreign key constraints.
centralized vs distributed
centralized( database located on single computer bad) distributed(data physically divided among several computers connected by a network)
Natural Join (∞)
circle everything the same EX) dr ∞ s, where r and s are different relations. MUST HAVE AT LEAST ONE COMMON ATTRIBUTE
Relational Model
collection of tables to represent both data and relationships. most current DBMS are based on this. in relational DB programmer free to work at logical level. almost all low level tasks are automatically done by DB The user of the database system may query these tables, insert new tuples, delete tuples, and update (modify) tuples. There are several languages for expressing these operations.
Referential-integrity constraints
constraints ensure that a value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation.
Disk storage(DBS bottom part)
data stored here. data stored on physical device has: 1)indices(makes searching faster/organizes data) 2)data dictionary/meta data(data abt data. who has acces etc 3)statistical data(DBS hides this. details of how data is stored/maintained. data abstruction/user dont need to know
relational query languages
define a set of operations that operate on tables, and output tables as their results. These operations can be combined to get expressions that express desired queries. define a set of operations that operate on tables, and output tables as their results. These operations can be combined to get expressions that express desired queries.
Query processor (DBS middle part)
processes the queries(statement requesting info). represented by database languages like SQL subsystem compiles and executes DDL and DML statements.
procedural lang vs nonprocedural lang
prodedural: user instructs the system to perform operations on DB using relational algebra nonprocedural: user describes desired info without giving procedure. ex. SQL
relational algebra
provides a set of operations that take one or more relations as input and return a relation as an output. Practical query languages such as SQL are based on the relational algebra, but add a number of useful syntactic features.
data definition language (DDL)
provides commands for defining relation schemas. allows specifications for set of relations adn specification of each relation: schema, data type of attributes, integrity constraints, set of indicies, security & auth info, physical storage structure on disk is a language for specifying the database schema and as well as other properties of the data.
primary keys in relations
it is customary to put primary key(attribute doesnt change and chosen) first and underline it
Concurrent Access anomalies
money transfer problem
drawbacks of file(simple)
more complexity, data isolation, data redundancy, daya inconsistency, difficulty accessing data, integrity problems, atomicity problems, concurrent access anomalies, secuirty problems
Cartesian product (x)
multiply relations together. ATTRIBUTES MUST BE UNIQUE!! if not use natural join EX) r x s, where r and s are different relations
schema
of a relation refers to its logical design, while an instance of the relation refers to its contents at a point in time. The schema of a database and an instance of a database are similarly defined. The schema of a relation in- cludes its attributes, and optionally the types of the attributes and constraints on the relation such as primary and foreign key constraints.
File Systems
organizations used to strore data in file systems. use DBS now less complications/errors. control how data is stored and retrieved
varchar(n)
variable length string, with user specified max length n
Selection (∂)
select all attributes that satisfy this condition from relation r ex) ∂A=B^D>5(Relation) ^ = and
domain fo attributes
set of allowed values for a column(ex. numbers, boolenas, char strings)
data model
Underlying the structure of a database is the data model: a collection of conceptual tools for describing data, data relationships, data semantics, and data constraints.
view relations
View relations can be defined as relations containing the result of queries. Views are useful for hiding unneeded information, and for collecting together information from more than one relation into a single view.
primary key
a candiate key chosen by DB designer. attribute values are never/rarely changed ex)SSN. A table does not need to have one
ACID
1)Atomicity(all or none transaction) 2)consistency(preserves consistency of dbs. value of the sum and balances of A and b preserved 3)Durability(After succesful funds transfer. new values A and B must persist even if system fails 4)Isolation(fixes concurrent access anomalies/students regestering at same time)
max number of superkey equation
(2^n) - 1 # of attributes(col titles)
DBMS
Database management system, a software suite designed to organize and search data.
data dictionary
which contains metadata — that is, data about data. The data dictionary is considered to be a special type of table that can only be accessed and updated by the database system itself (not a regular user). The database system consults the data dictionary before reading or modifying actual data.
Users and interfaces (DBS top part)
1. naive users(use,application interfaces) 2. application programers(use,application programs, write) 3.sophisticated users(use,annalysts, query tools) 4. database admin(use,admin tools)
2 tier vs 3 tier
2T:database system on users machine communicates directly with database on server 3T:database system on user machine doesnt direcly communicate with server. tree tier could be better for security reasons
Storage Manager(DBS middle part 2)
4 parts 1)file manager(manages space allocation on disk storage 2)buffer manager(fetches data from disk and stores to main memory. for faster fetches) 3)Authorization and integrity manager(checks if users have auth to access data, checks integrity constraints/identical id #s) 4)Transaction manager(a unit of program that accesses and updates data items. ensures ACID properties. Ensures that the database remains in a consistent state despite system failures subsystem provides the interface between the low- level data stored in the database and the application programs and queries submitted to the system.
purpose of database system
A major purpose of a database system is to provide users with an abstract view of the data. That is, the system hides certain details of how the data are stored and maintained.
Assertions
An assertion is any condition that the database must always satisfy. Domain constraints and referential-integrity constraints are special forms of assertions. When an assertion is created, the system tests it for validity. If the assertion is valid, then any future modification to the database is allowed only if it does not cause that assertion to be violated.
list six major steps that you would take in setting up a database
Define the high level requirements of the enterprise (this step generates a document known as the system requirements specification.) • Define a model containing all appropriate types of data and data relationships. • Define the integrity constraints on the data. • Define the physical level. • For each known problem to be solved on a regular basis (e.g., tasks to be carried out by clerks or Web users) define a user interface to carry out the task, and write the necessary application programs to implement the user interface. • Create/initialize the database.
Routine maintenance
Examples of the database administrator's routine maintenance activities are: 1. Periodically backing up the database, either onto tapes or onto remote servers, to prevent loss of data in case of disasters such as flooding. 2. Ensuring that enough free disk space is available for normal operations, and upgrading disk space as required. 3. Monitoring jobs running on the database and ensuring that performance is not degraded by very expensive tasks submitted by some users.
List at least 3 different types of information that a university would maintain, beyond those listed in Section 1.6.2
Information about people who are employees of the university but who are not instructors. • Library information, including books in the library, and who has issued books. • Accounting information including fee payment, scholarships, salaries, and all other kinds of receipts and payments of the university.
Drawbacks of file systems
More complex over time(adding things/updating things), data isolation(data scattered in diff files),data redundancy(repeated data), data inconsistency(diff data in diff files),difficulty accessing data, integrity problems($ transfers),security problems,
architecture of database system
The architecture of a database system is greatly influenced by the underlying computer system on which the database system runs. Database systems can be centralized, or client-server, where one server machine executes work on behalf of multiple client machines. Database systems can also be designed to exploit parallel computer architectures. Distributed databases span multiple geographically separated machines.
View level.
The highest level of abstraction describes only part of the entire database. Even though the logical level uses simpler structures, complexity remains because of the variety of information stored in a large database. Many users of the database system do not need all this information; instead, they need to access only a part of the database. The view level of abstraction exists to simplify their interaction with the system. The system may provide many views for the same database.
Physical level
The lowest level of abstraction describes how the data are ac- tually stored. The physical level describes complex low-level data structures in detail.
Logical level
The next-higher level of abstraction describes what data are stored in the database, and what relationships exist among those data. The logical level thus describes the entire database in terms of a small number of relatively simple structures. Although implementation of the simple struc- tures at the logical level may involve complex physical-level structures, the user of the logical level does not need to be aware of this complexity. This is referred to as physical data independence. Database administrators, who must decide what information to keep in the database, use the logical level of abstraction.
storage manager
The storage manager is the component of a database system that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
Referential Integrity
There are cases where we wish to ensure that a value that appears in one relation for a given set of attributes also appears in a cer- tain set of attributes in another relation (referential integrity). For example, the department listed for each course must be one that actually exists. More precisely, the deptattribute of some record of the department relation. Database modifications can cause violations of referential integrity. When a referential-integrity con- straint is violated, the normal procedure is to reject the action that caused the violation.
Authorization.
We may want to differentiate among the users as far as the type of access they are permitted on various data values in the database. These differentiations are expressed in terms of authorization, the most common being: read authorization, which allows reading, but not modification, of data; insert authorization, which allows insertion of new data, but not mod- ification of existing data; update authorization, which allows modification, but not deletion, of data; and delete authorization,
Foreign key
dept_name in Instructors table is a foreign key ex)dept_name in Department table is a primary key. dept_name in Instructor table is a foreign key for Instructor table is a set of attributes in a referencing relation, such that for each tuple in the referencing relation, the values of the foreign key attributes are guaranteed to occur as the primary key value of a tuple in the referenced relation.
Integrity constraint
ensure that changes made to the database by authorized users do not result in a loss of data consistency.
Transaction management
ensures that the database remains in a consistent (correct) state despite system failures. The transaction manager ensures that concurrent transaction executions proceed without conflicting.
ways type declaration system of a language such as java differs from the data definition language used in a database
executing an action in the DDL results in the creation of an object in the database. programming lang type declaration is simply an abstraction used database DDLs allows consistency constraints to be specified DDL allows consistency constraints to be specified which programming lang dont.(domain constraints) DDLs support authorization , giving different access rights to diff users programming language type systems are usually richer than the SQL type system. most database support only basic types such as diff types of numbers and strings
numeric(p,d)
fixed decimal number with user specified precision. number consists of p digits and d of p digits are to right of decimal point
char(n)
fixed length string, with user specified length n
float(n)
floating point number, with at least n digits
superset
if K is a super key. anything paird with k is a super key ex) K = ID num superkey of K = name {ID,name}
union, intersect, except/ set difference
in SQL automatically eliminate duplicates. NOTE! for all of these operations A and B MUST HAVE the SAME attributes(col names)
database architecture
influenced by computer system on which DB is running. 1.centralized vs distrubuted 2.client-server 3.parallel(multi-processor)
other data types for SQL
int, small int, real,double precision
data manipulation language (DML)
provides commands for modifying tuples s a language that enables users to access or manipulate data. Nonprocedural DMLs, which require a user to specify only what data are needed, without specifying exactly how to get those data, are widely used today.
Relation vs tuples vs attributes
relation = tables tuples = rows attributes = columns
projection (π)
select listed attributes and get rid of duplicates ex) πA,C(Relation)
Super key superkey
set of one or more attributes(columns) that uniquely identify a tuple(row)/whose values are guaranteed to identify tuples in the relation uniquely. A candidate key is a minimal superkey, that is, a set of attributes that forms a superkey, but none of whose subsets is a superkey. One of the candidate keys of a relation is chosen as its primary key. ex) ID number
disadvantages of a database system
setup of th databse system requires more knowledgs,money,skills adn time. the complexity of the database may result in poor performance
candidate key
special kind of superkey(minimal super key). any proper subset(similar values but not all the same) of a candidate key can't be a super key. CHEAT a superkey with only 1 attribute is a candidate key together they create a primary key, but seperately they cannot. and any proper subset of a candidate key cant be a primary key