IB Computer Science: Databases
What does a logical schema model include?
1) All entities and relationships (with cardinalities) among them. 2) All attributes of interest for each entity are specified. 3) The primary key for each entity is specified. 4) Foreign keys (keys identifying the relationship between different entities) are specified. 5) Normalization occurs if required
Functions of a DBMS
1) Controls definition, creation, maintenance, and use of a database 2) Mediates between data handling applications and the operating system 3) Provides multiple user interfaces, data storage, retrieval and update facilities, data dictionaries, backup and recovery services 4) Restricts unauthorized access 5) Defines and enforces integrity constraints
Features of physical schema model
1) Specification of all tables and columns. 2) Primary Keys are defined 3) Foreign keys are used to identify relationships between tables. 4) Denormalization may occur based on user requirements. 5) Physical considerations may cause the physical data model to be different from the logical data model. 6) Physical data model will be different for different RDBMS. For example, data type for a column may be different between MySQL and SQL Server.
How does data redundancy affect integrity of databases?
1) Wastes storage because of unnecessary repetition of data 2) Too many updates done to one occurrence of repeated data 3) Data inconsistency where multiple updates are not carried out successfully, leading to inconsistent data change. As a result, the integrity, reliability, and consistency of the database is impaired 4) Time consuming when there is multiple entry of the same data 5) Errors occur frequently
How to design physical schema
1. Convert entities into tables. 2. Convert relationships into foreign keys. 3. Convert attributes into columns. 4. Modify the physical data model based on physical constraints / requirements
Why are databases beneficial?
1. Find information quickly 2. Support multi-users simultaneous access 3. Data redundancy is avoided 4. Flexibility of use per user needs 5. Longevity
Database
A collection of organized data and information that can easily be accessed, managed, and updated.
Candidate key
A column, or set of columns, in a table that can uniquely identify a record in the table. Each table may have one or more in the design process. The best one is chosen to be the primary key.
Primary key (PK)
A field (or group of fields) that uniquely identifies a given entity in a table
Calculated field
A field that derives its data from the calculation of other fields. The data are not entered into a calculated field by the user.
Secondary key
A field/column that is used for data searches and retrieval. It is also known as an alternate key
Database transaction
A logical unit of work (multiple commands) performed by the DBMS on a database.
Relational DBMS
A software based on the relational model that helps to define and create a relational database; manages data storage and provides tools and facilities for the retrieval and modification of the data so stored.
DBMS
A software that helps to define and create a database; manages data storage and provides tools and facilities for the retrieval and modification of data.
How do transactions maintain data integrity and consistency?
All commands are executed as one unit of work in the context of a transaction. Only when all commands are completed the transaction is either made public to all (Commited) or reverted (Rolledback)
Data concurrency
Allow multiple users to carry out transactions on a database at the same time
Field
An attribute of an entity. It is equivalent to a column of a table.
Transaction Atomicity
An indivisible unit of work that is either performed in its entirety or is not performed at all. All tasks must succeed together or fail together.
Information System
An integrated set of components for collecting, storing, and processing data and for providing information, knowledge, and digital products
ACID properties of a transaction
Atomicity, Consistency, Isolation, Durability
Data dictionary
Central repository that describes the structure and attributes of "data items" in a database.
Updates/modifictations
Change the state (data) of the database through the following three operations: Insert, Update and delete.
Data
Collection of raw facts and figures
Record
Collection of related fields, each pertaining to the same entity instance. It is equivalent to tuple and row.
Referential integrity
Concept, which states that table relationships must always be consistent. In other words, any foreign key field must agree with the primary key that is referenced by the foreign key.
What are the 3 levels of schema's?
Conceptual, logical, physical
Data redundancy
Condition in a database which permits unnecessary duplication of data NOT common in relational databases. Affects integrity and reliability of databases
DDL basic commands
Create, alter, drop
Metadata
Data about data
DDL
Data definition language - creates, modifies or destroys the structure of database objects such as views, tables, indexes, constraints, and more
How will lack of data concurrency affect an application?
Data integrity will be hurt and performance will be badly impacted.
What are the differences between data and information?
Data: 1. Data is used as input for the computer system 2. Data is unprocessed facts figures 3. Data doesn't depend on information 4. Data is not specific 5. Data is a single unit 6. Data doesn't carry a meaning 7. Data is the raw material Information: 1. Information is the output of information system 2. Information is processed data 3. Information depends on data 4. Information is Specific 5. Information is usually a group of data 6. Information must carry a logical meaning 7. Information is the product
DML
Database Manipulation Language. In SQL, such statements as UPDATE, INSERT and DELETE are considered DML.
Foreign key (FK)
Defined on a field in one table that references/links to the primary key in another table - enforces data integrity (referential integrity) of the DB
Data modeling
Defines and analyses data requirements and creates a blueprint (design) for the construction and maintenance of a database.
Logical schema
Describes the data in as much detail as possible regardless of how they will be physically implemented in the database.
Two major types of data integrity
Entity Integrity maintained by PK and referential integrity maintained by FK
Transaction Isolation
Execute independently of one another. The partial effects of incomplete transactions should not be visible to other transactions.
Conceptual schema model
Identifies the main entities and the relationships between them (no attributes nor PK).
Why is data modeling via ERD important?
Important tool for communicating between user, designer and developer. 1)Ensures data will do what is required and has a clear purpose. 2) Identifies the entities and tables to support purpose of database 3) Only necessary attributes of tables are there to prevent inefficiencies 4) The keys for accessing the data are identified so that the user can access the data 5) The relationships among tables are identified so that complex queries are performed across several tables 5) Perform normalization that reduces data duplication
How can concurrency affect an application?
It can affect the integrity of the data and the overall performance
Examples of RDBMS?
MS access, MySQL, Oracle Enterprise DB
Data integrity
Maintenance and assurance of the accuracy and consistency of data over its entire life-cycle
DBMS tools
Means of querying data, forms for displaying data, reports for producing output, charts for displaying data, DDL for constructing/amending the schema
Update
Modification of the value of data in a database
What does Data Dictionary hold?
Names and descriptions of all the database objects like views, tables, and the fields contained in each object. It also documents information about the data type, field length and other constraints and validations.
Benefits of referential integrity
Overall integrity guaranteed as all references to a particular record are the same andconsistent.
Information
Processed data within a context.
Purposes of database transactions
Provide reliable units of works that allow correct recovery from failures and keep a database consistent even in cases of system failure.
What is the role of data validation?
Reducing errors in the data being entered into your system. Examples include range check, length check, format check.
Physical schema model
Represents how the model will be built in the database. Shows all table structures, including column name, column data type, column constraints, primary key, and foreign key.
Two major database operations are?
Retrievals (queries) and Updates (modification)
Retrievals/queries
Selecting fields and records that satisfy the needs of a particular user.
What are the 6 components of an information system?
Software, hardware, people, database (data), procedures and communication.
What are some processes that turn data to information?
Sorting, selection, arithmetic manipulations, interpretation, summarizing
Schema
Structure or plan of the database which defines the tables and relationships that exist in the database
Transaction Durability
The transaction successful completion must be permanently recorded and must not be lost because of a subsequent system failure.
Transaction Consistency
Transaction results must conform to existing constraints in the database. The database must always be left in a valid state after a transaction.
Join
Used to match records from two or more tables. The tables involved must be joined by at least one common field
How is a DBMS used to promote data security?
User authentication, setting and enforcing access rights, data validation, and data locking
SQL Query
Uses select statements to retrieve data from one or more tables
What is the role of data verification?
Verify entered data is identical to its original source. As example is the "double typing" of a password when creating a user.
Database view
Virtual table (not part of the physical schema and so it does not exist physically in the database) generated by the DBMS referencing the underlying base tables.
How is concurrency managed?
With locks and transactions
Data dictionary: data format
any special formatting applied to the field; e.g. number of decimal places in number fields, short, medium and long date and time formats.
Database state
data in a database at a particular time
Data dictionary: field size
number of characters allowed for a given field
Data dictionary: field name
the name of the field as it appears in the database.