Topic A - Database Terms


Durability (ACID property)

The ability of the system to recover committed transaction updates if either the system or the storage media fails. Features to consider are: recovery to the most recent successful commit after a database software failure; recovery to the most recent successful commit after an application software failure; recovery to the most recent successful commit after a CPU failure; recovery to the most recent successful backup after a disk failure; recovery to the most recent successful commit after a data disk failure.

Database Segmentation

The act of defining portions or sections of a database to make keeping track of information easier.

Data Privacy

The aspect of information technology (IT) that deals with the ability an organization or individual has to determine what data in a computer system can be shared with third parties.

Reports

The formatted results of database queries, containing useful data for decision-making and analysis.

Data Interrogation

The manual process of querying the source data for such purposes as investigating anomalies or confirming relationships between source entities. It may well be the case that queries developed during the development phase will be used to monitor the quality of source data on an ongoing basis.

Data Integrity

The overall completeness, accuracy and consistency of data. This can be indicated by the absence of alteration between two instances or between two updates of a data record, meaning data is intact and unchanged.

Concurrency

The process in which several computations are executed simultaneously, possibly interacting with each other.

Data Validation

The process of ensuring that a program operates on clean, correct and useful data
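
As a rough illustration of validation, the sketch below applies a presence check, a range check, and a format check to one record before it is used. The record layout (`customer`, `quantity`, `order_date`) and the limits are assumptions made up for this example.

```python
from datetime import date

def validate_order(row: dict) -> list[str]:
    """Return a list of problems found in one hypothetical order record."""
    errors = []
    # Presence check: the customer name must not be empty or blank.
    if not str(row.get("customer", "")).strip():
        errors.append("customer is required")
    # Type and range check: quantity must be a whole number in an assumed range.
    qty = row.get("quantity")
    if not isinstance(qty, int) or isinstance(qty, bool) or not 1 <= qty <= 1000:
        errors.append("quantity must be an integer from 1 to 1000")
    # Format check: the date must parse as a real calendar date.
    try:
        date.fromisoformat(str(row.get("order_date", "")))
    except ValueError:
        errors.append("order_date must be a valid ISO date")
    return errors

good = {"customer": "Ada", "quantity": 3, "order_date": "2024-02-29"}
bad = {"customer": " ", "quantity": 0, "order_date": "2023-02-29"}
print(validate_order(good))
print(validate_order(bad))
```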

Normalization

The process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.

DW Classification Analysis

The process of organizing data into categories for its most effective and efficient use.

Rollback

The process of restoring a database or program to a previously defined state, typically to recover from an error.

Consistency (ACID Property)

The process that ensures that the database remains in a consistent state before the start of the transaction and after the transaction is over (whether successful or not).

Data

The quantities, characters, or symbols on which operations are performed by a computer, being stored and transmitted in the form of electrical signals and recorded on magnetic, optical, or mechanical recording media.

Data Consistency

The requirement that any given database transaction must change affected data only in allowed ways.

2NF

No column may have a partial dependency on the primary key. For a table with a concatenated (composite) primary key, every column that is not part of the key must depend on the entire concatenated key, not on just one part of it.
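
A minimal sketch of removing a partial dependency, using Python's built-in `sqlite3` module. The tables are hypothetical: in an order-items table keyed by (order_id, product_id), a product name would depend on product_id alone, so it is moved to its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- product_name depends only on product_id, so it lives in its own table.
CREATE TABLE products (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL
);
-- quantity depends on the whole composite key (order_id, product_id).
CREATE TABLE order_items (
    order_id   INTEGER,
    product_id INTEGER REFERENCES products(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
INSERT INTO products VALUES (10, 'Widget');
INSERT INTO order_items VALUES (1, 10, 2), (2, 10, 5);
""")

# Renaming the product touches one row, however many orders reference it.
conn.execute("UPDATE products SET product_name = 'Widget v2' WHERE product_id = 10")
names = conn.execute("""
    SELECT DISTINCT p.product_name
    FROM order_items AS oi
    JOIN products AS p ON p.product_id = oi.product_id
""").fetchall()
print(names)
```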

VARCHAR

Variable length character string

Many-to-Many

When one or more rows in a table are associated with one or more rows in another table.
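
Relational databases usually implement a many-to-many relationship with a third (junction) table. The sketch below assumes hypothetical `students`, `courses`, and `enrollments` tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title TEXT);
-- Junction table: one row per (student, course) pairing.
CREATE TABLE enrollments (
    student_id INTEGER REFERENCES students(student_id),
    course_id  INTEGER REFERENCES courses(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO students VALUES (1, 'Ada'), (2, 'Lin');
INSERT INTO courses  VALUES (1, 'SQL'), (2, 'Stats');
INSERT INTO enrollments VALUES (1, 1), (1, 2), (2, 1);
""")

# One student has many courses ...
ada_courses = [title for (title,) in conn.execute("""
    SELECT c.title FROM enrollments AS e
    JOIN courses AS c ON c.course_id = e.course_id
    WHERE e.student_id = 1 ORDER BY c.title
""")]
# ... and one course has many students.
sql_students = [name for (name,) in conn.execute("""
    SELECT s.name FROM enrollments AS e
    JOIN students AS s ON s.student_id = e.student_id
    WHERE e.course_id = 1 ORDER BY s.name
""")]
print(ada_courses, sql_students)
```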

DW time dependency

You can create dataflow dependencies if several population dataflows need to run in a specific order, or if the amount of time to run dataflows is unpredictable.

Join

A SQL statement that combines records from two or more tables in a relational database. It creates a set that can be saved as a table or used as it is. It is a means for combining fields from two tables (or more) by using values common to each.
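
A small sketch of a join on a common column, using `sqlite3` with hypothetical `customers` and `orders` tables. It also shows how an inner join differs from a left join when a customer has no orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Lin');
INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 10.0);
""")

# Inner join: only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

# Left join: every customer, with NULL (None) where no order matches.
left_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.name, o.order_id
""").fetchall()
print(inner_rows)
print(left_rows)
```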

Data Type

A classification identifying one of various types of data, such as real, integer or Boolean, that determines the possible values for that type

Dataset

A collection of data. Usually corresponds to the contents of a single database table, where every column of the table represents a particular variable, and each row corresponds to a given member of the data set in question.

Database

A collection of information that is organized so that it can easily be accessed, managed, and updated.

File

A computer file that contains any type of data, including a word processing document or spreadsheet. It may also refer to a database file that contains records, such as orders and customers.

Neural Network

A computer system modeled on how the human brain and nervous system function.

Information system

A computer system or set of components for collecting, creating, storing, processing, and distributing information, typically including hardware and software, system users, and the data itself.

Integer

A data type which represents some finite subset of the mathematical integers. Integral data types may be of different sizes and may or may not be allowed to contain negative values.

Link Analysis

A data-analysis technique used to evaluate relationships (connections) between nodes. Relationships may be identified among various types of nodes (objects), including organizations, people and transactions.

3NF

A database is in third normal form if it satisfies the following conditions:
● It is in second normal form
● There is no transitive functional dependency

Object-oriented Database

A database management system (DBMS) that supports the modelling and creation of data as objects. This includes some kind of support for classes of objects and the inheritance of class properties and methods by subclasses and their objects.

Network Database

A database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or lattice.

Spatial Database

A database that is optimized to store and query data that represents objects defined in a geometric space. Most such databases allow representing simple geometric objects such as points, lines and polygons.

Text

A datatype that can hold very large strings (e.g., articles).

Logical OR

A digital logic gate that implements logical disjunction. It is an electronic circuit that gives a high output if one or more of its inputs are high.

Foreign Key

A foreign key is a column or group of columns in a relational database table that provides a link between data in two tables.

Entity Relationship Diagram (ERD)

A graphical representation of an information system that shows the relationship between people, objects, places, concepts or events within a system.

Record

A group of fields within a table/file that are relevant to a specific entity.

Conceptual Data Model

A high level or coarse data model which is preliminary in structure, possibly abstract in content and sparse in attributes, that is intended to represent a business area.

Inner Join

A join in which the values in the columns being joined are compared using a comparison operator.

Composite Primary Key

A primary key consisting of more than one column.

Data Warehouse

A large store of data accumulated from a wide range of sources within a company and used to guide management decisions.

Data Locking

A lock is used when multiple users need to access a database concurrently. This prevents data from being corrupted or invalidated when multiple users try to read while others write to the database.

Derived Field

A numeric or date field that derives its data from the calculation of other fields. The data are not entered into a calculated field by the user.

Extract, Transform and Load (ETL)

A process in database usage and especially in data warehousing that: Extracts data from homogeneous or heterogeneous data sources. Transforms the data into the proper format or structure for querying and analysis. Loads the data into the final target, such as a data warehouse.
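
The three stages can be sketched as three small functions. Everything here is an assumption for illustration: the inline CSV text stands in for a source system, and the `sales` table stands in for a warehouse table.

```python
import csv
import io
import sqlite3

# Hypothetical raw source data with inconsistent formatting and one bad value.
RAW = "name,amount\n ada ,10.50\nLIN,n/a\nbo,4\n"

def extract(text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy names, convert amounts, and skip unparseable rows."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount is not a number
        clean.append((row["name"].strip().title(), amount))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))
loaded = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
print(loaded)
```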

Data Recovery

A process of recovering inaccessible data from corrupted or damaged secondary storage, removable media or files, when the data store cannot be accessed in a normal way.

Data Verification

A process in which different types of data are checked for accuracy and inconsistencies after a data migration. It helps to determine whether data was accurately translated when transferred from one source to another, is complete, and supports processes in the new system.

Atomicity

A property of database transactions which are guaranteed to either completely occur, or have no effects; it is a whole, complete unit which is executed as a whole or not at all.
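
A minimal sketch of all-or-nothing behaviour using `sqlite3`. The `accounts` table and the `transfer` helper are hypothetical. The transfer credits one account and then debits another; if the debit violates the CHECK constraint, the earlier credit is undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money as one unit of work; return True if it committed."""
    try:
        # 'with conn' commits if the block succeeds and rolls back on an exception.
        with conn:
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # debit fails, so the credit is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```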

Complex Query

A query that searches using more than one parameter value i.e. on two or more criteria.

Simple Query

A query that uses only one table and one parameter.

Normalized

A relational database that meets Third Normal Form (3NF).

Table

A set of elements using a model of vertical columns and horizontal rows, the cell being a unit where a row and column intersect.

Data Dictionary

A set of information describing the contents, format, and structure of a database and the relationship between its elements, used to control access to and manipulation of the database.

ACID

A set of properties, (Atomicity, Consistency, Isolation, Durability) that guarantee that database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction.

Data Definition Language

A standard for commands that define the different structures in a database.

Decision Tree Induction

A structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node holds a class label. The topmost node in the tree is the root node.
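
The structure described above can be sketched with nested dictionaries. The tree itself is a made-up example, not one induced from data: internal nodes name the attribute they test, branches are keyed by the test outcome, and leaves are class labels.

```python
# Hypothetical hand-written tree; a real one would be learned from training data.
tree = {
    "attribute": "outlook",              # root node: test on 'outlook'
    "branches": {
        "sunny": {
            "attribute": "humidity",     # internal node: test on 'humidity'
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",              # leaf node: class label
        "rainy": "stay in",
    },
}

def classify(node, sample):
    """Follow the branch matching each test outcome until a leaf is reached."""
    while isinstance(node, dict):
        outcome = sample[node["attribute"]]
        node = node["branches"][outcome]
    return node

print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))
print(classify(tree, {"outlook": "rainy", "humidity": "high"}))
```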

View

A table composed of the result set of a query in relational databases
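
A short sketch of a view in SQLite, with a hypothetical `orders` table. In SQLite (and most systems, for ordinary non-materialized views) the view stores the query rather than a copy of the rows, so it reflects later changes to the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, total REAL);
INSERT INTO orders VALUES (1, 'Ada', 20.0), (2, 'Ada', 5.0), (3, 'Lin', 7.5);
-- The view is defined by a query and can be selected from like a table.
CREATE VIEW customer_totals AS
    SELECT customer, SUM(total) AS spent FROM orders GROUP BY customer;
""")

query = "SELECT customer, spent FROM customer_totals ORDER BY customer"
before = conn.execute(query).fetchall()
conn.execute("INSERT INTO orders VALUES (4, 'Lin', 2.5)")
after = conn.execute(query).fetchall()
print(before)
print(after)
```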

DW Sequential Patterns

A topic of data mining concerned with finding statistically relevant patterns between data examples where the values are delivered in a sequence.

Transaction

A transaction is a sequence of operations performed as a single logical unit of work.

Logical Data Model

A type of data model showing a detailed representation of an organization's data, independent of any particular technology, that standardizes people, places, things and the rules, relationships and events between them.

Multi-Dimensional Database

A type of database that is optimized for data warehouse and online analytical processing (OLAP) applications. Such databases are frequently created using input from existing relational databases.

DW Cluster Analysis

A type of pattern recognition that is particularly useful for recognizing distinct clusters or sub-categories within the data. Without data mining, an analyst would have to look at the data and decide on a set of categories that they believe captures the relevant distinctions between apparent groups in the data. This would risk missing important categories. With data mining it is possible to let the data itself determine the groups.

Forms

A window or screen that contains numerous fields, or spaces to enter data.

Computer Misuse Act

An Act of the Parliament of the United Kingdom (1990) that recognized the following new offences: unauthorized access to computer material; unauthorized access with intent to commit or facilitate a crime; unauthorized modification of computer material; making, supplying or obtaining anything which can be used in computer misuse offences.

Transaction States

Active: the initial state of a transaction; a transaction remains here while it is executing. Partially Committed: the state after the final statement of the transaction is executed. Failed: the state following the realization that normal execution can no longer proceed. Aborted: The state after a transaction has been rolled back; the database has been restored to its original state. Committed: the state after successful completion of the transaction.

Database Functions

Activities to create and maintain database information, including: Insert: adding new rows or columns and therefore new data. Update: changing data that is already there. Select: choosing the data that you want from the database. Delete: removing rows from the table.
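
The four functions map onto the SQL statements INSERT, UPDATE, SELECT and DELETE. A minimal sketch with `sqlite3` and a hypothetical `pets` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (pet_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# Insert: add new rows.
conn.execute("INSERT INTO pets (name, age) VALUES (?, ?)", ("Rex", 3))
conn.execute("INSERT INTO pets (name, age) VALUES (?, ?)", ("Tom", 5))

# Update: change data that is already there.
conn.execute("UPDATE pets SET age = age + 1 WHERE name = ?", ("Rex",))

# Delete: remove rows from the table.
conn.execute("DELETE FROM pets WHERE name = ?", ("Tom",))

# Select: choose the data you want.
rows = conn.execute("SELECT name, age FROM pets").fetchall()
print(rows)
```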

Backpropagation

An abbreviation for "backward propagation of errors"; a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of a loss function with respect to all the weights in the network.
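
To make the gradient calculation concrete, here is a sketch for a deliberately tiny network chosen for this example: one input, one tanh hidden unit, one linear output, and a squared-error loss. The chain rule is applied from the output backwards, and the result is compared with a finite-difference estimate.

```python
import math

def forward(w1, w2, x):
    """Forward pass: hidden activation h and output y."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def loss(w1, w2, x, t):
    """Squared-error loss against target t."""
    return 0.5 * (forward(w1, w2, x)[1] - t) ** 2

def backprop(w1, w2, x, t):
    """Gradient of the loss with respect to both weights via the chain rule."""
    h, y = forward(w1, w2, x)
    dy = y - t                      # dL/dy
    grad_w2 = dy * h                # dL/dw2 = dL/dy * dy/dw2
    dh = dy * w2                    # dL/dh, propagated back through the output
    grad_w1 = dh * (1 - h * h) * x  # tanh'(z) = 1 - tanh(z)^2
    return grad_w1, grad_w2

def numeric_grads(w1, w2, x, t, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    g1 = (loss(w1 + eps, w2, x, t) - loss(w1 - eps, w2, x, t)) / (2 * eps)
    g2 = (loss(w1, w2 + eps, x, t) - loss(w1, w2 - eps, x, t)) / (2 * eps)
    return g1, g2

analytic = backprop(0.5, -1.2, 0.8, 0.3)
numeric = numeric_grads(0.5, -1.2, 0.8, 0.3)
print(analytic, numeric)
```

A gradient descent step would then subtract a small multiple of each gradient from the corresponding weight.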

Data Protection Act

An act of the United Kingdom Parliament defining the ways in which information about living people may be legally used and handled. The main intent is to protect individuals against misuse or abuse of information about them. (1998)

Secondary Key

An entity may have one or more choices for the primary key. Collectively these are known as candidate keys. One is selected as the primary key. Those not selected are known as secondary keys.

Natural Language Interfaces

An interface that enables people to interact with any connected device using normal, everyday language. It understands the meaning of conversational input, and reacts accordingly, creating value and enhancing the user experience.

Tuple

An ordered set of values. The separator for each value is often a comma (depending on the rules of the particular language). Common uses for the data type are (1) for passing a string of parameters from one program to another, and (2) representing a set of value attributes in a relational database.

Relation

Another name for a table.

Candidate key

Any column or combination of columns that can qualify as a unique key in a database table. A table can have more than one such key.

Field

Columns in database tables that identify the type of information that is stored

Visual Queries

Components of visual thinking algorithms where problem components are identified that have potential solutions based on a visual pattern search

Macros

Contain actions that perform different tasks

Data Model

Data modeling is often the first step in database design and object-oriented programming as the designers first create a conceptual model of how data items relate to each other. Data modeling involves a progression from conceptual model to logical model to physical schema.

Information

Data that has meaning in context for its receiver. When information is entered into and stored in a computer, it is generally referred to as data. After processing and organizing, output data can again be perceived as information.

DW Real Time Update

Data warehousing model useful in transactional data situations to keep data up to date all the time.

Integrated Database System

Database systems with integrated hardware and software.

Floating Point

Denoting a method of representing numbers as two sequences of bits, one representing the digits in the number and the other an exponent that determines the position of the radix point.
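
As one concrete case, the sketch below splits a Python float into its bit fields, assuming the IEEE 754 double-precision layout that Python floats use on common platforms (1 sign bit, 11 exponent bits, 52 fraction bits). The helper names are made up for this example.

```python
import struct

def float_parts(x: float):
    """Return (sign, biased exponent, fraction) bit fields of a 64-bit float."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction

def rebuild(sign, exponent, fraction):
    """Recombine the fields (normal numbers only; no zero, subnormal, inf, NaN)."""
    # The digits are 1.fraction in binary; the exponent (bias 1023) moves the radix point.
    return (-1) ** sign * (1 + fraction / 2**52) * 2.0 ** (exponent - 1023)

print(float_parts(1.0))
print(float_parts(-2.5))
print(rebuild(*float_parts(0.1)) == 0.1)
```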

1NF

First normal form is a property of a relation in a relational database. A relation is in first normal form if the domain of each attribute contains only atomic values, and the value of each attribute contains only a single value from that domain.

DW Association Analysis

If/then statements that help uncover relationships between seemingly unrelated data in a relational database or other information repository.
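
If/then rules of this kind are commonly scored with support and confidence. The sketch below computes both for the rule "if bread then butter" over a made-up list of shopping baskets.

```python
# Hypothetical transactions; each set is one basket.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk", "butter"},
]

def support(itemset):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)

rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(rule_support, rule_confidence)
```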

Logical AND

In Boolean algebra, a symbol used to indicate a conjunction between two statements. A conjunction is true if, but only if, both of its components are true.

Schema

In a relational database, it is the definition of the tables, the fields in each table, and the relationships between fields and tables.

Char

In computer and machine-based telecommunications terminology, a character is a unit of information that roughly corresponds to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language.

Attribute

In computing, an attribute is a specification that defines a property of an object, element, or file. It may also refer to or set the specific value for a given instance of such. For clarity, attributes should more correctly be considered metadata.

Column

In database management systems, another name for field.

Logical NOT

In logic, negation, also called logical complement, is an operation that takes a proposition p to another proposition "not p", written ¬p, which is interpreted intuitively as being true when p is false and false when p is true.

Row

In relational databases, a data record within a table.

Access Rights

Privileges given to a user or application to read, write, and erase files in the computer; privileges may depend on client or server.

Data Mining

Process by which large amounts of data are explored in search of patterns and relationships between variables.

Deviation Detection

Processes that reveal surprising facts hidden inside data. Such tools are used to detect deviations, anomalies, and outliers. Detection is needed for various reasons. Knowledge discovery: such information is often a vital part of important business decisions and scientific discovery. Auditing: examining such information can reveal problems and malpractice. Fraud detection: fraudulent claims often carry inconsistent information, which can reveal fraud cases. Data cleaning: such information can stem from mistakes in data entry, which should be corrected.

Referential Integrity

Property of data which, when satisfied, requires every value of one attribute (column) of a relation (table) to exist as a value of another attribute in a different (or the same) relation (table)
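
A minimal sketch of a database enforcing this property, using `sqlite3` with hypothetical `departments` and `employees` tables. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES departments(dept_id)
    )
""")
conn.execute("INSERT INTO departments VALUES (1, 'Finance')")
conn.execute("INSERT INTO employees VALUES (1, 'Ada', 1)")  # dept 1 exists
conn.commit()

try:
    # Department 99 does not exist, so this row would break referential integrity.
    conn.execute("INSERT INTO employees VALUES (2, 'Lin', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(rejected, employee_count)
```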

Data Security

Protecting data, such as a database, from destructive forces and from the unwanted actions of unauthorized users.

Recovery Techniques

1. Salvation program: Run after a crash to attempt to restore the system to a valid state. No recovery data used. Used when all other techniques fail or were not used. Good for cases where buffers were lost in a crash and one wants to reconstruct what was lost...
2. Incremental dumping: Modified files copied to archive after job completed or at intervals.
3. Audit trail: Sequences of actions on files are recorded. Optimal for "backing out" of transactions. (Ideal if trail is written out before changes).
4. Differential files: Separate file is maintained to keep track of changes, periodically merged with the main file.
5. Backup/current version: Present files form the current version of the database. Files containing previous values form a consistent backup version.
6. Multiple copies: Multiple active copies of each file are maintained during normal operation of the database. In cases of failure, comparison between the versions can be used to find a consistent version.
7. Careful replacement: Nothing is updated in place, with the original only being deleted after operation is complete.
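
Careful replacement can be sketched at the file level: the new version is written to a temporary file, and the original is replaced only once that write has completed. The function name `careful_replace` is made up for this example, and it relies on `os.replace` swapping the file in a single step when both paths are on the same filesystem.

```python
import os
import tempfile

def careful_replace(path, new_text):
    """Write new_text beside the original, then swap it in; never edit in place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp_file:
            tmp_file.write(new_text)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())  # make sure the new copy is on disk first
        os.replace(tmp_path, path)       # the original disappears only at this point
    except BaseException:
        # On any failure the original file is untouched; discard the partial copy.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```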

Redundancy

Repeated data that can cause two main problems: 1) inconsistency within the data, and 2) wasted storage space.

Boolean operators

Simple words (AND, OR, NOT or AND NOT) used as conjunctions to combine or exclude keywords in a search, resulting in more focused and productive results.

DBMS

Software for creating and managing databases. The system provides users and programmers with a systematic way to create, retrieve, update and manage data.

DW Forecasting

Sophisticated data mining analyses that include trend analysis and use existing data to forecast trends or predict the future.

Structured Query Language (SQL)

Standard interactive and programming language for getting information from and updating a database. Although SQL is both an ANSI and an ISO standard, many database products support SQL with proprietary extensions to the standard language.

Isolation (ACID property)

The ACID property needed when there are concurrent transactions. Concurrent transactions are transactions that occur at the same time, such as multiple users accessing shared objects. This situation is illustrated at the top of the figure as activities occurring over time. The safeguards a DBMS uses to prevent conflicts between concurrent transactions are collectively referred to as isolation.

