IST 334 Chapter 2
Hadoop
A Java-based, open-source, high-speed, fault-tolerant distributed storage and computational framework. __________ uses *low-cost hardware* to create clusters of thousands of computer nodes to store and process data.
Attribute
A characteristic of an entity or object. An attribute has a name and a data type
Relational database management system (RDBMS)
A collection of programs that manages a relational database. The RDBMS software translates a user's logical requests (queries) into commands that physically locate and retrieve the requested data
hardware independence
A condition in which a model does not depend on the hardware used in the model's implementation. Therefore, changes in the hardware will have no effect on the database design at the conceptual level.
Entity relationship (ER) model (ERM)
A data model that describes relationships (1:1, 1:M and M:N) among entities at the conceptual level with the help of ER diagrams.
Object-oriented data model
A data model whose basic modeling structure is an object.
Business rule
A description of a policy, procedure, or principle within an organization.
Entity relationship diagram
A diagram that depicts an entity relationship model's entities, attributes, and relationships
Class Diagram
A diagram used to represent data and their relationships in UML object notation
Relational diagram
A graphical representation of a relational database's entities, the attributes within those entities, and relationships among the entities.
Hadoop Distributed File System (HDFS)
A highly distributed, fault-tolerant *file storage system* designed to manage large amounts of data at high speeds.
Unified Modeling Language (UML)
A language based on object-oriented concepts that provides tools such as diagrams and symbols to graphically model a system
Table
A logical construct perceived to be a two-dimensional structure composed of intersecting rows (entities) and columns (attributes) that represents an entity set in the relational model
Schema
A logical grouping of database objects, such as tables, indexes, views, and queries, that are related to each other
Extensible Markup Language (XML)
A metalanguage used to represent and manipulate data elements. Unlike other markup languages, XML permits the manipulation of a document's data elements. XML facilitates the exchange of structured documents such as orders and invoices over the Internet.
extended relational data model (ERDM)
A model that includes the object-oriented model's best features in an inherently simpler relational database structural environment
Big data
A movement to find new and better ways to manage large amounts of web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.
NoSQL
A new generation of database management systems that is not based on the traditional relational database model.
Entity
A person, place, thing, transaction, or event about which data can be stored
software independence
A property of any model or application that does not depend on the software used to implement it.
Internal Schema
A representation of an internal model using the database constructs supported by the chosen database.
Conceptual schema
A representation of the conceptual model, usually expressed graphically.
Crow's Foot notation
A representation of the entity relationship diagram that uses a three-pronged symbol to represent the "many" sides of the relationship
data model
A representation, usually graphic, of a complex "real-world" data structure. Data models are used in the database design phase of the Database Life Cycle.
Constraint
A restriction placed on data, usually expressed in the form of rules
Logical design
A stage in the design phase that matches the conceptual design to the requirements of the selected DBMS and is therefore software-dependent. ___________ is used to translate the conceptual design into the internal model for a selected database management system, such as DB2, SQL Server, Oracle, IMS, Informix, Access, or Ingres.
Object
An abstract representation of a real-world entity that has a unique identity, embedded properties, and the ability to interact with other objects and itself.
Relationship
An association between entities
Network model
An early data model that represented data as a collection of record types in 1:M relationships
Hierarchical model
An early database model whose basic concepts and characteristics formed the basis for subsequent database development. This model is based on an upside-down tree structure in which each record is called a segment. The top record is the root segment. Each segment has a 1:M relationship to the segment directly below it.
Object/relational database management system
A DBMS based on the extended relational data model (ERDM), which constitutes the relational model's response to the OODM. This model includes many of the object-oriented model's best features within an inherently simpler relational database structure
Object-oriented database management system
Data management software used to manage data in an object-oriented database model
Relational Model
Developed by E.F. Codd of IBM in 1970, the relational model is based on mathematical set theory and represents data as independent relations. Each relation (table) is conceptually represented as a two-dimensional structure of intersecting rows and columns. The relations are related to each other through the sharing of common entity characteristics (values in columns)
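As a rough illustration of relations being "related to each other through the sharing of common entity characteristics," the sketch below stores two relations as lists of tuples and matches rows on a shared column value. The AGENT and CUSTOMER relations, their columns, and the `join_on_agent` helper are hypothetical names made up for this example.

```python
# Each relation is a list of tuples (rows); column names are kept separately.
# All names and values here are illustrative assumptions.
AGENT_COLUMNS = ("agent_code", "agent_name")
agent = [(501, "Alby"), (502, "Hahn")]

CUSTOMER_COLUMNS = ("cus_code", "cus_name", "agent_code")
customer = [(10010, "Ramas", 502), (10011, "Dunne", 501), (10012, "Smith", 502)]


def join_on_agent(customers, agents):
    """Match each customer tuple to the agent tuple sharing its agent_code value."""
    agent_by_code = {code: name for code, name in agents}
    return [
        (cus_name, agent_by_code[agent_code])
        for _, cus_name, agent_code in customers
        if agent_code in agent_by_code
    ]


pairs = join_on_agent(customer, agent)
print(pairs)
```

The two relations stay independent; the only link between them is the agent_code value that appears in both.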
Class
Encapsulates an object's data representation and a method's implementation.
internal model
In database modeling, a level of data abstraction that adapts the conceptual model to a specific DBMS model for implementation. The internal model is the representation of a database as "seen" by the DBMS.
Method
In the object-oriented data model, a named set of instructions to perform an action. Methods represent real-world actions and are invoked through messages.
Inheritance
In the object-oriented data model, the ability of an object to inherit the data structure and methods of the classes above it in the class hierarchy. See also class hierarchy.
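A minimal sketch of class, method, and inheritance, using Python classes as a stand-in for the object-oriented data model. The `Person` and `Student` classes and their attributes are hypothetical examples, not taken from the chapter.

```python
class Person:
    """Superclass: defines shared data structure (attributes) and behavior (methods)."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is a person"


class Student(Person):
    """Subclass: inherits name and describe() from Person and adds its own attribute."""

    def __init__(self, name, major):
        super().__init__(name)
        self.major = major

    def declare(self):
        return f"{self.name} majors in {self.major}"


s = Student("Ana", "IST")
inherited = s.describe()  # method inherited from the superclass
own = s.declare()         # method defined on the subclass
print(inherited)
print(own)
```

Calling `s.describe()` plays the role of sending a message to the object: the method is found on the class above `Student` in the hierarchy.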
Tuple
In the relational model, a table row
Client Node
One of three types of nodes used in the Hadoop Distributed File System (HDFS). The client node acts as the interface between the user application and the HDFS. See also name node and data node.
Data Node
One of three types of nodes used in the Hadoop Distributed File System (HDFS). The data node stores fixed-size data blocks (that could be replicated to other data nodes).
Name node
One of three types of nodes used in the HDFS. The name node stores all the metadata about the file system.
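The division of labor among the three node types can be pictured with a toy model. This is not the HDFS API: plain dictionaries stand in for nodes, and the block size, replication factor, round-robin placement, and function names are all assumptions chosen for illustration.

```python
BLOCK_SIZE = 4    # bytes per block; a toy value, real blocks are far larger
REPLICATION = 2   # copies kept of each block; an assumption for this sketch


def store_file(name, data, data_nodes, name_node):
    """Split data into fixed-size blocks, place copies on data nodes, record metadata."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    locations = []
    for index, block in enumerate(blocks):
        # Round-robin placement: a simplification, not HDFS's actual placement policy.
        targets = [(index + r) % len(data_nodes) for r in range(REPLICATION)]
        for t in targets:
            data_nodes[t][(name, index)] = block
        locations.append(targets)
    name_node[name] = locations  # the name node holds metadata only, never file contents


def read_file(name, data_nodes, name_node):
    """Client-side read: look up block locations, then fetch each block from a data node."""
    return b"".join(
        data_nodes[targets[0]][(name, index)]
        for index, targets in enumerate(name_node[name])
    )


data_nodes = [{}, {}, {}]
name_node = {}
store_file("log.txt", b"hello hadoop", data_nodes, name_node)
restored = read_file("log.txt", data_nodes, name_node)
print(name_node["log.txt"])
print(restored)
```

The calling code acts as the client node: it never touches block contents until the name node's metadata tells it which data nodes to ask.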
External model
The application programmer's view of the data environment. Given its business focus, an external model works with a data subset of the global database schema
Semantic data model
The first of a series of data models that more closely represented the real world, modeling both data and their relationships in a single structure known as an object.
class hierarchy
The organization of classes in a hierarchical tree in which each parent class is a superclass and each child class is a subclass
conceptual model
The output of the conceptual design process. The _________ model provides a global view of an entire database and describes the main data objects, avoiding details.
Subschema
The portion of the database that interacts with application programs
Data Modeling
The process of creating a specific data model for a determined problem domain.
Data Manipulation Language (DML)
The set of commands that allows an end user to manipulate the data in the database. The commands include SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK.
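The commands listed above can be tried with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `customer` table, its columns, and the sample rows are invented for illustration, and COMMIT and ROLLBACK are issued through the connection's `commit()` and `rollback()` methods.

```python
import sqlite3

# In-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT, cus_balance REAL)"
)

# INSERT adds rows; COMMIT makes the changes permanent.
conn.execute("INSERT INTO customer VALUES (1, 'Ramas', 0.0)")
conn.execute("INSERT INTO customer VALUES (2, 'Dunne', 50.0)")
conn.commit()

# UPDATE and DELETE modify existing rows; ROLLBACK undoes uncommitted changes.
conn.execute("UPDATE customer SET cus_balance = 75.0 WHERE cus_id = 2")
conn.execute("DELETE FROM customer WHERE cus_id = 1")
conn.rollback()

# SELECT retrieves rows; both rows are back in their committed state.
rows = conn.execute(
    "SELECT cus_id, cus_name, cus_balance FROM customer ORDER BY cus_id"
).fetchall()
print(rows)
```

Because the UPDATE and DELETE were never committed, the rollback restores the table to the state saved by the earlier commit.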
Class diagram notation
The set of symbols used in the creation of class diagrams
External schema
The specific representation of an external view; the end user's view of the data environment.
Connectivity
The type of relationship between entities. Classifications include 1:1, 1:M, and M:N.
3 Vs
Three basic characteristics of Big Data databases: volume, velocity, and variety.