IAB130
NoSQL
"Not only SQL" - a term used to address the class of databases that do not follow RDBMS principles
Relational Algebra
A (high-level) procedural language: it can be used to tell the DBMS how to build a new relation from one or more relations in the database
join
A _____ is used to combine rows from two or more tables, based on a common field between them
transitive dependency
A _____________ is an indirect functional dependency, one in which A→C only by virtue of A→B and B→C
proper subset
A ______________ of a set of attributes is group of attributes that is contained in the original set of attributes, excludes at least one of the original attributes
Database system
A collection of application programs that interact with the database along with the DBMS and database itself.
Data Model
A collection of concepts that can be used to describe a set of data, the operations to manipulate the data, and a set of integrity constraints for the data.
sub-query
A complete SELECT statement embedded in another query. Conceptually, a subquery produces a temporary table whose contents can be accessed by the outer query. A subquery can be embedded in another subquery.
Database application program
A computer program that interacts with the database by issuing an appropriate request (typically an SQL statement) to the DBMS.
Entity Relationship Diagram (ERD)
A conceptual data model for describing the data or information aspects of an organisation, which lends itself to ultimately being implemented in a database (such as a relational database)
Normal form
A defined structure for relational databases in which a relation may not be nested within another relation which have attributes with particular relationships.
Database Schema
A description of the database structure.
Unified Modeling Language (UML)
A general-purpose, developmental, modeling language in the field of software engineering
Entity Type
A group of objects with the same properties, which are identified by the enterprise as having an independent existence.
irreducible
A key is _________ if no proper subset of the key is unique
Candidate Key
A minimal superkey, such that no proper subset is a superkey within the relation
Attribute, column, field
A named column of a relation, which describes a property of the corresponding entity or a relationship type
Relation schema
A named relation defined by a set of attribute and domain name pairs
Attribute
A property of an entity or a relationship type.
Query
A question asked of a database, where a result is returned in the form of a temporary table. All queries begin with the declarative SELECT clause.
1NF
A relation in which the intersection of each row and column contains one and only one value
Degree
A relation is the number of attributes it contains. A unary relation has one attribute, a binary relation has two, a ternary relation has three, and an n-ary relation has n attributes.
0NF
A relation that contains one or more repeating groups (the un-normalised form)
3NF
A relation that is in 1NF and 2NF in which no non-primary-key attribute is transitively dependent on the primary key
2NF
A relation that is in 1NF and every non-primary-key attribute is fully functionally dependent on the primary key
BCNF
A relation that is in 3NF and every determinant is a candidate key
4NF
A relation that is in BCNF and does not contain nontrivial multi-valued dependencies
Recursive relationship
A relationship type where the same entity type participates more than once in different roles.
Tuple, row, record
A row of a relation
Repeating group
A set of logically related attributes that occur multiple times in one tuple
Relationship type
A set of meaningful associations among entity types. A relationship occurrence is a uniquely identifiable association, which includes one occurrence from each participating entity type.
Relational Schema
A set of relation schemas, each with a distinct name
Database
A shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization.
Database management system (DBMS)
A software system that enables users to define, create, maintain, and control access to the database.
Structured query language (SQL)
A special-purpose programming language designed for managing data held in a rDBMS. It is defined by an ISO standard, although implementations of it often have minor differences from the standard. SQL acts as both a Data Manipulation Language and Data Definition Language.
relation, table, file
A table with columns and rows. It has the following properties: The relation has a name that is distinct from all other relation names in the relational schema; Each cell of the relation contains exactly one atomic (single) value; Each attribute has a distinct name; The values of an attribute are all from the same domain; Each tuple is distinct; there are no duplicate tuples; The order of attributes has no significance; and The order of tuples has no significance, theoretically.
rDBMS
A term used to refer to a DBMS that uses the relational model.
Logical data independence
A type of data independence. _________ refers to the immunity of the external schemas to changes in the conceptual schema.
Physical data independence
A type of data independence. __________ refers to the immunity of the conceptual schema to changes in the internal schema.
Entity Occurence
A uniquely identifiable object of an entity type.
Transaction
A unit of work performed against a database, and treated in a coherent and reliable way independent of other transactions.
derived attribute
A value that is derivable from the value of a related attribute or set of attributes, not necessarily in the same entity.
view
A virtual or derived relation that is dynamically created from the under- lying base relation(s) when required.
ORDER BY
Allows the result table to be sorted on the values in one or more columns. Each column can be sorted in ascending or descending order. If specified, the ORDER BY clause must be the last clause in the SELECT statement.
Normilisation
Allows us to produce a set of suitable relations that support the data requirements of the use case and minimise redundancy.
Data definition language (DDL)
Allows users to define a database (e.g. create table, foreign key, constraints, create index).
Data manipulation language (DML)
Allows users to insert, update, delete, view and retrieve data from the database. The part of a DML that involves data retrieval is called a query language.
domain
An attribute ________ is the set of allowable values for one or more attributes.
Simple attribute
An attribute composed of a single component with an independent existence.
Composite Attribute
An attribute composed of multiple components, each with an independent existence.
partially dependent
An attribute is ________ if it is a non-key attribute and is functionally dependent on a proper subset of candidate or primary key
fully functionally dependent
An attribute is ___________ on a set of attributes A if it is both functionally dependent on A, and not functionally dependent on any proper subset of A
Single-valued attribute
An attribute that holds a single value for each occurrence of an entity type.
Multi-valued attribute
An attribute that holds multiple values for each occurrence of an entity type.
Foreign Key
An attribute within one relation that matches the candidate key of some (possibly the same) relation
Superkey
An attribute, or set of attributes, that uniquely identifies a tuple within a relation
Strong entity
An entity type Not existence-dependent on some other entity type. Uniquely identifiable using the primary key attribute.
Weak entity
An entity type existence-dependent on some other entity type. Can only identify attributes of weak entity types through relationships with strong entity types.
Inconsistent analysis
Arises where one transaction reads several values from a database, but another transaction updates some of the value while the first is still being performed. Essentially this means that the transactions overlap, and this affects the results of the read transaction.
Availability
CAP theorem (Brewer, 2000) formalizes the limitations of scalable, distributed databases: ________ the condition wherein a given resource can be accessed by its users, such that each transaction is always answered with "successful" or "failed" response
Consistency
CAP theorem (Brewer, 2000) formalizes the limitations of scalable, distributed databases: ________, when a first transaction changes some data on a first node, a subsequent transaction running on a second node shall see the changed data (and no earlier state).
Partition tolerance
CAP theorem (Brewer, 2000) formalizes the limitations of scalable, distributed databases: _________, the distributed database system must continue working (i.e., process transactions) even if single nodes drop out or communication to other nodes is interrupted (colloquially referred to as "brain split"), i.e., "node partitions" form
binary operations
Cartesian Product, Union, Intersection, Set Difference, Join and Division operations are ________, since they operate on more than one relation
Data type
Defines what kind of value a column can contain. You must know the following types: integer, float, string (fixed and variable length), text, date and timestamp.
Functional dependancy
Described where an attribute (A), or set of attributes, is reliant on another (B), and if and only if each A value is associated with precisely one B value. It is written as A → B.
Cardinality
Describes the maximum number of possible relationship occurrences for an entity participating in a given relationship type (e.g. Many to Many).
Referential integrity constraint
Ensures that if a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or the foreign key value must be wholly null.
Entity integrity constraint
Ensures that no attribute of a primary key can be null
Horizontal partitioning
It is sometimes argued that denormalisation needs to occur to improve the overall performance of a database even if flexibility and the speed of updates are sacrificed. __________ is an example application of this technique where tuples of a relation are spread across a number of smaller relations.
Data independence
Makes each level of a database schema immune to changes to lower levels.
Uncommitted dependency
Refers to a state where one transactions is allowed to see the intermediate result of another, even if the other transaction has not been committed
update anomalies
Relations with data redundancy suffer from __________, which can be classified as insertion, deletion, and modification anomalies.
Multi-valued dependency
Represents a dependency between attributes (for example, A, B, and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the set of values for B and C are independent of each other.
Null
Represents a value for an attribute that is unknown at the present time or is not applicable for this tuple
Aggregate functions
SQL supports five _________ (COUNT, SUM, AVG, MIN, and MAX) that are used to compute a value from an input column within a query
WHERE
Selects rows to be included in the result table by applying a search condition to the rows of the named table(s). It represents the relational algebra operation selection.
internal level
The ANSI-SPARC (3-level) architecture uses three levels of abstraction: external, conceptual, and internal. The ______ is the computer's view of the database. It specifies how data is represented, how records are sequenced, what indexes and pointers exist, and so on.
external level
The ANSI-SPARC (3-level) architecture uses three levels of abstraction: external, conceptual, and internal. The _______ consists of the users' views of the database.
conceptual level
The ANSI-SPARC (3-level) architecture uses three levels of abstraction: external, conceptual, and internal. The conceptual level is the community view of the database. It specifies the information content of the entire database, independent of storage considerations. The _____ represents all entities, their attributes, and their relationships, as well as the constraints on the data, and security and integrity information.
unary operations
The Selection and Projection operations are _________, since they operate on one relation
determinant
The ________ of a functional dependency refers to the attribute, or group of attributes, on the left-hand side of the arrow.
Concurrency
The ability to perform multiple operations on a database at the same time
Primary Key
The candidate key that is selected to identify tuples uniquely within the relation
Scalability
The capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged in order to accommodate that growth. Database scalability can relate to the number of transactions per second, or the size of the database.
Atomicity
The characteristics of traditional relational databases, which guarantee that transactions are processed reliably, can be remembered using the ACID acronym: A - requires that if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged
Consistency
The characteristics of traditional relational databases, which guarantee that transactions are processed reliably, can be remembered using the ACID acronym: C - ensures that any transaction will bring the database from one valid state to another, meeting all constraints
Durability
The characteristics of traditional relational databases, which guarantee that transactions are processed reliably, can be remembered using the ACID acronym: D - means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors (permeant storage, rather than in volatile memory)
Isolation
The characteristics of traditional relational databases, which guarantee that transactions are processed reliably, can be remembered using the ACID acronym: I - ensures one transaction cannot interfere with another
SELECT
The most important statement in the language and is used to express a query. It represents the relational algebra operation projection, because identifies the columns and/or calculated data to appear in the result table. All column names that appear in the SELECT clause must have their corresponding tables or views listed in the FROM clause.
Multiplicity
The number (or range) of possible occurrences of an entity type that may relate to a single occurrence of an associated entity type through a particular relationship (e.g. 1..n to n..n)
Conceptual modelling
The process of constructing a detailed architecture for a database that is independent of implementation details, such as the target DBMS, application programs, programming languages, or any other physical considerations. The design of the conceptual schema is critical to the overall success of the system. It is worth spending the time and energy necessary to produce the best possible conceptual design.
Domain
The set of allowable values for one or more attributes
uniqueness
The state of ________ is where the values of an attribute uniquely identify each tuple in a particular relation
Redundancy
The state of being not or no longer needed or useful (e.g. data is represented in the database more than once)
Integrity Constraints
Used to ensure accuracy and consistency of data in a relational database
Eventual consistency
When a database may be momentarily inconsistent, but will be consistent eventually.
key-value store
allows the storage of arbitrary data, however, a key must be defined for this data.
Projection
operations are used to return only specified columns in a relation
Selection
operations are used to return the tuples of a relation that meet a specific condition (called the predictate)
Nodes
represent entities such as people, businesses, accounts, or any other item you might want to keep track of
Column-oriented database
stores data in columns as opposed to rows (as in a DBMS)
document store database
stores records in separate documents, rather than rows
Soft state
systems where it's may change over time, sometimes without any input (for eventual consistency)
Graph databases
use graph structures for semantic queries with nodes, edges and properties to represent and store data