Database Design Chapter 2
Hadoop
A Java-based, open-source, high-speed, fault-tolerant distributed storage and computational framework. Hadoop uses low-cost hardware to create clusters of thousands of computer nodes to store and process data.
Attribute
A characteristic of an entity or object
relational database management system (RDBMS)
A collection of programs that manages a relational database. The RDBMS software translates a user's logical requests into commands that physically locate and retrieve the requested data.
Class
A collection of similar objects with shared structure (attributes) and behavior (methods).
hardware independence
A condition in which a model does not depend on the hardware used in the model's implementation. Therefore, changes in the hardware will have no effect on the database design at the conceptual level.
logical independence
A condition in which the internal model can be changed without affecting the conceptual model. (The internal model is hardware-independent because it is unaffected by the computer on which the software is installed. Therefore, a change in storage devices or operating systems will not affect the internal model.)
physical independence
A condition in which the physical model can be changed without affecting the internal model.
Entity Relationship Model
A data model that describes relationships among entities at the conceptual level with the help of ER diagrams.
Object-oriented Data Model (OODM)
A data model whose basic modeling structure is an object
Business Rule
A description of a policy, procedure, or principle within an organization. For example, a pilot cannot be on duty for more than 10 hours during a 24-hour period, or a professor may teach up to four classes during a semester.
Entity Relationship Diagram
A diagram that depicts an entity relationship model's entities, attributes, and relations.
Relational Diagram
A graphical representation of a relational database's entities, the attributes within those entities, and the relationships among the entities.
Hadoop Distributed File System (HDFS)
A highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds.
Table (relation)
A logical construct perceived to be a two-dimensional structure composed of intersecting rows (entities) and columns (attributes) that represents an entity set in the relational model.
Schema
A logical grouping of database objects, such as tables, indexes, views, and queries, that are related to each other.
Physical Model
A model in which physical characteristics such as location, path, and format are described for the data. The physical model is both hardware- and software-dependent.
Big Data
A movement to find new and better ways to manage large amounts of web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.
NoSQL
A new generation of database management systems that is not based on the traditional relational database model.
Entity
A person, place, thing, transaction, or event for which data can be stored
Software Independence
A property of any model or application that does not depend on the software used to implement it.
Internal Schema
A representation of an internal model using the database constructs supported by the chosen database.
Conceptual Schema
A representation of the conceptual model, usually expressed graphically. See also conceptual model.
Crow's Foot notation
A representation of the entity relationship diagram that uses a three-pronged symbol to represent the "many" sides of the relationship.
Data Model
A representation, usually graphic, of a complex "real-world" data structure. Data models are used in the database design phase of the Database Life Cycle.
Constraint
A restriction placed on data, usually expressed in the form of rules. For example, "A student's GPA must be between 0.00 and 4.00."
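As an illustration of the GPA rule above, here is a minimal sketch that enforces it with a CHECK constraint, using Python's built-in sqlite3 module. The table and column names (student, stu_id, stu_gpa) are assumptions made for this example only.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table; the CHECK clause encodes the GPA rule as a constraint.
conn.execute(
    """
    CREATE TABLE student (
        stu_id  INTEGER PRIMARY KEY,
        stu_gpa REAL NOT NULL CHECK (stu_gpa BETWEEN 0.00 AND 4.00)
    )
    """
)

# A GPA inside the allowed range is accepted.
conn.execute("INSERT INTO student (stu_id, stu_gpa) VALUES (1, 3.25)")

# A GPA outside the range violates the constraint and is rejected.
try:
    conn.execute("INSERT INTO student (stu_id, stu_gpa) VALUES (2, 4.75)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Only the first row is stored; the DBMS, not the application, refuses the invalid value.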
Entity Instance
A row in a relational table
Logical Design
A stage in the design phase that matches the conceptual design to the requirements of the selected DBMS and is therefore software-dependent. Logical design is used to translate the conceptual design into the internal model for a selected database management system, such as DB2, SQL Server, Oracle, IMS, Informix, Access, or Ingres.
Object
An abstract representation of a real-world entity that has a unique identity, embedded properties, and the ability to interact with other objects and itself.
Relationship
An association between entities
Network Model
An early data model that represented data as a collection of record types in 1:M relationships.
Hierarchical Model
An early database model whose basic concepts and characteristics formed the basis for subsequent database development. This model is based on an upside-down tree structure in which each record is called a segment. The top record is the root segment. Each segment has a 1:M relationship to the segment directly below it.
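The upside-down tree described above can be sketched as a small Python structure. This is only an illustration of the root segment and the 1:M parent-to-child links, not how a hierarchical DBMS stores data; the Segment class, the path_to helper, and the segment names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One record (segment) in the tree; it may own many child segments (1:M)."""
    name: str
    children: list = field(default_factory=list)

    def add_child(self, child):
        self.children.append(child)
        return child


def path_to(segment, target, path=()):
    """Walk down from a segment and return the path of names to the target."""
    path = path + (segment.name,)
    if segment.name == target:
        return path
    for child in segment.children:
        found = path_to(child, target, path)
        if found:
            return found
    return None


# Hypothetical segments: CUSTOMER is the root segment.
root = Segment("CUSTOMER")
order = root.add_child(Segment("ORDER"))
order.add_child(Segment("ORDER_LINE"))
root.add_child(Segment("PAYMENT"))

access_path = path_to(root, "ORDER_LINE")
```

Reaching a lower segment means starting at the root and following the parent-to-child links downward, which is why access_path lists every segment on the way.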
MapReduce
An open-source application programming interface (API) that provides fast data analytics services; one of the main Big Data technologies that allows organizations to process massive data stores.
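To give a feel for the programming model behind the name, here is a minimal single-process sketch of a word count split into map, shuffle, and reduce steps. It does not use Hadoop or any MapReduce library; the function names and sample documents are assumptions for illustration, and a real framework would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


# Hypothetical input: each string stands in for one block of a large data set.
documents = ["big data big insight", "data at scale"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
grouped = shuffle_phase(mapped)
word_counts = dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each document is mapped independently and each key is reduced independently, the work can in principle be spread over as many machines as there are blocks and keys.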
many-to-many (M:N) relationship
Association among two or more entities in which one occurrence of an entity is associated with many occurrences of a related entity, and one occurrence of the related entity is associated with many occurrences of the first entity.
one-to-many relationship (1:M)
Associations among two or more entities that are used by data models. In a 1:M relationship, one entity instance is associated with many instances of the related entity.
one-to-one (1:1) relationship
Associations among two or more entities that are used by data models. In a 1:1 relationship, one entity instance is associated with only one instance of the related entity.
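One common way to realize the three relationship types in a relational database is sketched below with Python's built-in sqlite3 module. All table and column names (professor, course, office, student, enroll) are hypothetical, and the design is one possible mapping, not the only one: a foreign key gives 1:M, a foreign key with UNIQUE gives 1:1, and a bridge table gives M:N.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE professor (prof_id INTEGER PRIMARY KEY);

    -- 1:M: one professor teaches many courses; each course has one professor.
    CREATE TABLE course (
        course_id INTEGER PRIMARY KEY,
        prof_id   INTEGER NOT NULL REFERENCES professor (prof_id)
    );

    -- 1:1: UNIQUE on the foreign key allows at most one office per professor.
    CREATE TABLE office (
        office_id INTEGER PRIMARY KEY,
        prof_id   INTEGER NOT NULL UNIQUE REFERENCES professor (prof_id)
    );

    -- M:N: a bridge table pairs students with courses in both directions.
    CREATE TABLE student (stu_id INTEGER PRIMARY KEY);
    CREATE TABLE enroll (
        stu_id    INTEGER NOT NULL REFERENCES student (stu_id),
        course_id INTEGER NOT NULL REFERENCES course (course_id),
        PRIMARY KEY (stu_id, course_id)
    );

    INSERT INTO professor VALUES (1);
    INSERT INTO course VALUES (10, 1), (11, 1);
    INSERT INTO office VALUES (100, 1);
    INSERT INTO student VALUES (1), (2);
    INSERT INTO enroll VALUES (1, 10), (1, 11), (2, 10);
    """
)


def count(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]


courses_for_prof_1 = count("SELECT COUNT(*) FROM course WHERE prof_id = 1")
courses_for_student_1 = count("SELECT COUNT(*) FROM enroll WHERE stu_id = 1")
students_in_course_10 = count("SELECT COUNT(*) FROM enroll WHERE course_id = 10")

# A second office for the same professor would break the 1:1 rule.
try:
    conn.execute("INSERT INTO office VALUES (101, 1)")
    second_office_rejected = False
except sqlite3.IntegrityError:
    second_office_rejected = True
```

The counts show "many" on one side for 1:M and on both sides for M:N, while the rejected insert shows the 1:1 limit being enforced.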
Object-Oriented Database Management System (OODBMS)
Data management software used to manage data in an object-oriented database model.
Internal Model
In database modeling, a level of data abstraction that adapts the conceptual model to a specific DBMS model for implementation. The internal model is the representation of a database as "seen" by the DBMS. In other words, the internal model requires a designer to match the conceptual model's characteristics and constraints to those of the selected implementation model.
Segment
In the hierarchical data model, the equivalent of a file system's record type.
Method
In the object-oriented data model, a named set of instructions to perform an action. Methods represent real-world actions, and are invoked through messages.
Inheritance
In the object-oriented data model, the ability of an object to inherit the data structure and methods of the classes above it in the class hierarchy.
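The object-oriented terms class, method, and inheritance can be illustrated in an object-oriented programming language. The sketch below uses Python rather than an OODBMS, and the Person and Student classes are hypothetical: Student sits below Person in the class hierarchy, so it inherits Person's attribute and method and adds its own.

```python
class Person:
    """A class: shared structure (name) and behavior (describe)."""

    def __init__(self, name):
        self.name = name  # attribute

    def describe(self):  # method
        return f"{self.name} is a person"


class Student(Person):
    """A subclass: inherits name and describe() from Person."""

    def __init__(self, name, gpa):
        super().__init__(name)  # reuse the inherited data structure
        self.gpa = gpa          # attribute added by the subclass

    def on_honor_roll(self):    # method added by the subclass
        return self.gpa >= 3.5


student = Student("Ana", 3.8)

# Calling describe() works even though Student never defines it.
description = student.describe()
```

Calling student.describe() corresponds to sending the object a message that invokes an inherited method.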
Tuple
In the relational model, a table row.
Client Node
One of the three types of nodes used in the Hadoop Distributed File System (HDFS). The client node acts as the interface between the user application and the HDFS.
Data Node
One of the three types of nodes used in the Hadoop Distributed File System (HDFS). The data node stores fixed-size data blocks.
Name Node
One of the three types of nodes used in the Hadoop Distributed File System (HDFS). The name node stores all the metadata about the file system.
External Model
The application programmer's view of the data environment. Given its business focus, an external model works with a data subset of the global database system.
Semantic Data Model
The first of a series of data models that more closely represented the real world, modeling both data and their relationships in a single structure known as an object. The SDM, published in 1981, was developed by M. Hammer and D. McLeod.
American National Standards Institute (ANSI)
The group that accepted the DBTG recommendations and augmented database standards in 1975 through its SPARC committee.
Data Definition Language (DDL)
The language that allows a database administrator to define the database structure, schema, and subschema.
Conceptual Model
The output of the conceptual design process. The conceptual model provides a global view of an entire database and describes the main data objects, avoiding details.
Subschema
The portion of the database that interacts with application programs.
Data Modeling
The process of creating a specific data model for a determined problem domain.
Data Manipulation Language (DML)
The set of commands that allows an end user to manipulate the data in the database. The commands include SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK.
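The commands listed above can be tried out with Python's built-in sqlite3 module, as in the following minimal sketch. The product table and its columns are hypothetical, and the CREATE TABLE statement is only setup for the example. In sqlite3, COMMIT and ROLLBACK are issued through conn.commit() and conn.rollback().

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Setup only: define a hypothetical table to manipulate.
conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL)")

# INSERT two rows, then COMMIT to make them permanent.
conn.execute("INSERT INTO product VALUES ('A1', 10.0)")
conn.execute("INSERT INTO product VALUES ('B2', 20.0)")
conn.commit()

# UPDATE and DELETE inside a new transaction, then ROLLBACK to undo both.
conn.execute("UPDATE product SET p_price = 12.5 WHERE p_code = 'A1'")
conn.execute("DELETE FROM product WHERE p_code = 'B2'")
conn.rollback()

# SELECT shows the data as it stood at the last COMMIT.
rows_after_rollback = conn.execute(
    "SELECT p_code, p_price FROM product ORDER BY p_code"
).fetchall()
```

After the rollback, both original rows are back with their committed prices, since the UPDATE and DELETE were never committed.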
class diagram notation
The set of symbols used in the creation of class diagrams
External Schema
The specific representation of an external view; the end user's view of the data environment.
Connectivity
The type of relationship between entities. Classifications include 1:1, 1:M, and M:N.
3 Vs
Three basic characteristics of Big Data databases: volume, velocity, and variety.
Relational Model
Developed by E. F. Codd of IBM in 1970, the relational model is based on mathematical set theory and represents data as independent relations. Each relation is conceptually represented as a two-dimensional structure of intersecting rows and columns.