Chapter 2 - Data Models

अब Quizwiz के साथ अपने होमवर्क और परीक्षाओं को एस करें!

A implementation-ready data model should contain:

- A description of the data structure that will store the end-user data - A set of enforceable rules to guarantee the integrity of the data - A data manipulation methodology to support the real-world data transformations

Any SQL-based relational database application involves:

- A user interface - A set of tables stored in the database - The SQL "engine"

Entity Set

A collection of like entities.

Class

A collection of similar objects with shared structure (attributes) and behavior (methods). A class encapsulates an object's data representation and a method's implementation. Classes are organized in a class hierarchy.

Physical independence

A condition in which the physical model can be changed without affecting the internal model.

Class Diagram

A diagram used to represent data and their relationships in UML object notation.

Physical model

A model in which physical characteristics such as location, path, and format are described for the data. The physical model is both hardware- and software-dependent.

software independence

A property of any model or application that does not depend on the software used to implement it.

entity instance (entity occurrence)

A row in a relational table

Conceptual Schema

Basis for the identification and high-level description of the main data objects

Key-Value

Data model is based on a structure composed of two data elements: a key and a value, in which every key has a corresponding value of set of values. Aka attribute-value or associative data model

Data Definition Language (DDL)

Enables the database administrator to define the schema components. The language that allows a database admin to define the database structure, schema, and subschema.

End-user interface

Interface allows end user to interact with the data (by automatically generating SQL code). Each interface is a product of the software vendor's idea of meaningful interaction with the data.

Constraint

Restriction placed on the data - Important to help ensure data integrity EX: Student's GPA must be between 0.00 to 4.00 - Each class must have one and only one teacher.

SubSchema

The portion of the database that interacts with application programs

External Schema

The specific representation of an external view; the end user's view of the data environment.

hardware independence

A condition in which a model does not depend on the hardware used in the model's implementation. Therefore, changes in the hardware will have no effect on the database design at the conceptual level.

NoSQL

A new generation of database management systems that is not based on the traditional relational database model. Large-scale distributed database system that stores structured and unstructured data in efficient ways. Provides distributed, fault-tolerate databases for processing non-structured data.

Entity

A person, place, thing, concept, or event for which data can be stored. Represents a particular type of object in the real world, which means an entity is "distinguishable" Unique & distinct

internal schema

A representation of an internal model using the database constructs supported by the chosen database.

One-to-one (1:1 or 1..1) relationship

Associations among two or more entities that are used by data models. In a 1:1 relationship, one entity instance is associated with only one instance of the related entity. EX: A retail company's mgmt structure may require that each of its stores be managed by a single employee. [EMPLOYEE manges STORE 1:1]

Network Model

Created to represent complex data relationships more effectively than the hierarchical model, to improve database performance, and to impose a database standard. User perceives network database as a collection of records in 1:M relationships. Unlike the hierarchical model, network allows a record to have more than one parent.

Object-Oriented database management system (OODBMS)

Data management software used to manage data in an object-oriented database model.

Relational databases and Big Data

Does this mean that relational databases don't have a place in organizations with Big Data challenges? No, relational databases remain the preferred and dominant databases to sup- port most day-to-day transactions and structured data analytics needs. Each DBMS tech- nology has its areas of application, and the best approach is to use the best tool for the job. In perspective, object/relational databases serve 98 percent of operational market needs. For Big Data needs, Hadoop, MapReduce, and NoSQL databases are the options.

Basic building blocks of all data models

Entities, attributes, relationships, and contraints

Tuple

In the relational model, a table row. Each column represents an attribute.

SQL Engine

Largely hidden from the end user, the SQL engine executes all que- ries, or data requests. Keep in mind that the SQL engine is part of the DBMS software. The end user uses SQL to create table structures and to perform data access and table maintenance. The SQL engine processes all user requests—largely behind the scenes and without the end user's knowledge.

client node

One of three types of nodes used in the Hadoop Distributed File System (HDFS). The client node acts as the interface between the user application and the HDFS.

Big Data

Refers to a movement to find new and better ways to manage large amounts of web and sensor-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.

conceputal model

Represents a global view of the entire database by the entire organization. Integrates all external views into a single global view of the data in the enterprise

Chen notation

See entity relationship (ER) model. A data model that describes relationships (1:1, 1:M, and M:N) among entities at the conceptual level with the help of ER diagrams.

Logical design

Task of creating a conceptual data model A stage in the design phase that matches the conceptual design to the requirements of the selected DBMS and is therefore software dependent. Used to translate the conceptual design into the internal model for the selected database management system

Data Modeling

The process of creating a specific data model for a determined problem domain.

Connectivity

The type of relationship between entities. Classifications include 1:1, 1:M, and M:N. EX: PAINTER paints many PAINTINGS, an EMPLOYEE learns many SKILLS,

3 V's of Big Data:

Volume - amount of data being stored. Velocity - speed at which data grows and the need to process data quickly in order to generate information and insight Variety - fact that data being collected comes in multiple different data formats.

Business Rule

a brief, precise, and unambiguous description of a policy, procedure, or principle within a specific organization A description of a policy, percedure, or principle within the organization. - Must be rendered in writing and updated to reflect any change in the organization's operational environment Properly written bus rules are used to define entities, attributes, relationships, and constraights EX: A customer may generate many invoices An invoice is generated by only one customer. - A training session cannot be schedules for fewer than 10 employees or for more than 30 employees. (Two entities, employee and training, implies relationship between the two and shows a contraight (no <10 & no > 30)

Data Model

a relatively simple representation, usually graphical, of more complex real-world data structures

Map Reduce

an open-source application programming interface (API) that provides fast data analytics services; one of the main big data technologies that allows organizations to process massive data stores. 2 Main function: Map - takes a job and divides it into smaller units of work Reduce - collects all the output results generated from the nodes and integrates them into a single result set

Internal Model

representation of the database as "seen" by the DBMS In database modeling, a levle of data abstraction that adapts the conceptual model to a specific DBMS model for implementation. Internal model requires a requires a designer to match the conceptual model's characteristics and constraints to those of the selected implementation model.

Sparse Data

the number of attributes is very large but the number of actual data instances is low

Evolution of data models... Common characteristics

A data model must show some degree of conceptual simplicity without compro- mising the semantic completeness of the database. It does not make sense to have a data model that is more difficult to conceptualize than the real world. At the same time, the model should show clarity and relevance; that is, the data model should be unambiguous and applicable to the problem domain. A data model must repre- sent the real world as closely as possible.

Entity Relationship (ER) Model [ERM]

A data model that describes relationships (1:1, 1:M, and M:N) among entities at the conceptual level with the help of ER diagrams. The model was developed by P. Chen in 1975. BC it complemented the relational data model concepts

Hadoop (Big Data technologies)

A java based, open source, high speed, fault tolerant distributed storage and computational framework. Uses low-cost hardware to create clusters of 1000s of computer nodes to store and process data. Provide a framework for Big Data Analytics in which data (structured or unstructured) is distributed, replicated, and processed in parallel using a network of low-cost commodity hardware.

Schema

A logical grouping of database objects such as tables, indexes, views, and queries, that are related to each other.

Method

In the object-oriented data model, a named set of instructions to perform an action. Methods represent real-world actions, and are invoked through messages.

Attribute

a characteristic of an entity. EX: CUSTOMER Entity would be described by attributes such as customers last name, first name, phone number, address, and credit limit Equivalent of fields in a file system

Relationship

describes an association among entities. EX: Relationship exists between customers and agents that ca be descibed as: agent can serve many customers and each customer may be served by one agent.

Process of identifying and documenting business rules is essential to database design for several reasons:

- Helps to standardize the company's view of data - Can be a communication tool between users and designers - It allows the designer to understand the nature, role, and scope of data - Allows the designer to understand the business processes - Allows the designer to develop appropriate relationships participation rules and constraints and to create an accurate data model.

Relational Database Management System (RDBMS)

A collection of programs that manages a relational database. The RDBMS software translates a user's logical requests (queries) into commands that physically locate and retrieve the requested data. ADV: Ability to hide the complexities of a the relational model from the user. Manages all the physical details, while the user sees the relational database as a collection of tables in which data is stored. The user can manipulate and query the data in a way that seems intuitive and logical.

logical independence

A condition in which the internal model can be changed without affecting the conceptual model. (The internal model is hardware-independent because it is unaffected by the computer on which the software is installed. Therefore, a change in storage devices or operating systems will not affect the internal model.)

Summary

A data model is an abstraction of a complex real-world data environment. Database designers use data models to communicate with programmers and end users. The basic data-modeling components are entities, attributes, relationships, and con- straints. Business rules are used to identify and define the basic modeling components within a specific real-world environment. • The hierarchical and network data models were early models that are no longer used, but some of the concepts are found in current data models. • The relational model is the current database implementation standard. In the rela- tional model, the end user perceives the data as being stored in tables. Tables are related to each other by means of common values in common attributes. The entity relationship (ER) model is a popular graphical tool for data modeling that comple- ments the relational model. The ER model allows database designers to visually pres- ent different views of the data—as seen by database designers, programmers, and end users—and to integrate the data into a common framework. • The object-oriented data model (OODM) uses objects as the basic modeling struc- ture. Like the relational model's entity, an object is described by its factual content. Unlike an entity, however, the object also includes information about relationships between the facts, as well as relationships with other objects, thus giving its data more meaning. • The relational model has adopted many object-oriented (OO) extensions to become the extended relational data model (ERDM). Object/relational database management systems (O/R DBMS) were developed to implement the ERDM. At this point, the OODM is largely used in specialized engineering and scientific applications, while the ERDM is primarily geared to business applications. • Emerging Big Data technologies such as Hadoop, MapReduce, and NoSQL provide distributed, fault-tolerant, and cost-efficient support for Big Data analytics. NoSQL databases are a new generation of databases that do not use the relational model and are geared to support the very specific needs of Big Data organizations. NoSQL data- bases offer distributed data stores that provide high scalability, availability, and fault tolerance by sacrificing data consistency and shifting the burden of maintaining rela- tionships and data integrity to the program code. • Data-modeling requirements are a function of different data views (global versus local) and the level of data abstraction. The American National Standards Institute Standards Planning and Requirements Committee (ANSI/SPARC) describes three levels of data abstraction: external, conceptual, and internal. The fourth and lowest level of data abstraction, called the physical level, is concerned exclusively with phys- ical storage methods.

Object-Oriented Data Model (OODM)

A data model whose basic modeling structure is an object.

Entity Relationship Diagram (ERD)

A diagram that depicts an entity relationship model's entities, attributes, and relations.

Hadoop Distributed File System (HDFS)

A highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds. Write-once, read many model. - Once data is written, it cannot be modified.

Unified Modeling Language (UML)

A language based on object-oriented concepts that provides tools such as diagrams and symbols to graphically model a system.

Table (Relation)

A logical construct perceived to be a two dimensional structure composed of intersecting rows (entities) and columns (attributes) that represents an entity set in the relational model.

Crow's Foot Notation

A representation of the entity relationship diagram that uses a three-pronged symbol to represent the "many" sides of the relationship

Relational Diagram

A representation of the relational database's entities, the attributes within those entities, and the relationships between those entities.

External Views ADV:

ADV: - Easy to identify specific data required to support each business unit's operations. - Makes the designer's job easy by providing feedback about the model's adequacy. - Helps to ensure security contraints in the database design. - Makes application simplier

Object

An abstract representation of a real world entity that has a unique identity, embedded properties, and the ability to interact with other objects and itself.

Hierarchical Model

An early database model whose basic concepts and characteristics formed the basis for subsequent database development. This model is based on an upside-down tree structure in which each record is called a segment. The top record is the root segment. Each segment has a 1:M relationship to the segment directly below it.

Many-to-many (M:N or *..*) relationship

Associations among two or more entities in which one occurrence of an entity is associated with many occurrences of a related entity and one occurrence of the related entity is associated with many occurrences of the first entity. EX: Employee may learn many job skills and each job skill may be learned by many employees [EMPLOYEE learns SKILL] EX2: Students can take many classes and each class can be taken by many students [STUDENTS takes CLASS]

One-To-Many (1:M or 1..*) relationship

Associations among two or more entities that are used by data models. In a 1:M relationship, one entity instance is associated with many instances of the related entity. EX: A painter creates many different paintings, but each is painted by only ONE painter. Painter (one) to Paintings (Many). [Painter paints Painting 1:M] EX2: CUSTOMER generates INVOICE [Customer (one) to invoices (many) 1:M]

Relational Model

Developed by E. F. Codd of IBM in 1970, it represented a major breakthrough for users and designers because of its conceptual simplicity. The relational model is based on mathematical set theory and represents data as independent relations. Each relation (table) is conceptually represented as a matrix of intersecting rows and columns. The relations are related to each other through the sharing of common entity characteristics (values in columns).

Segement

In the hierarchical data model, the equivalent of a file system's record type.

Inheritance

In the object-oriented data model, the ability of an object to inherit the data structure and methods of the classes above it in the class hierarchy. See also class hierarchy.

NoSQL databases

Not based on the relational model - Support distributed database architectures, - Provide high scalability, high availability, - fault tolerance, - Support large amounts of sparse data, - Geared toward performance rather than transaction consistency, - Store data in key-value stores NOT BASED ON THE RELATIONAL MODEL

Data node

One of three types of nodes used in the Hadoop Distributed File System (HDFS). The data node stores fixed-size data blocks (that could be replicated to other data nodes). .

name node

One of three types of nodes used in the Hadoop Distributed File System (HDFS). The name node stores all the metadata about the file system.

Entity in ERD

Represented by a rectangle (aka entity box) Mapped to a relational table Each entity consists of a set of attributes that describes particular characteristics of the entity. [EX: EMPLOYEE will have ssn, last name, first name]

External Model

The application programmer's view of the data environment. Given its business focus, an external model works with a data subset of the global database schema.

Semantic Data Model

The first of a series of data models that more closely represented the real world, modeling both data and their relationships in a single structure known as an object. The SDM, published in 1981, was developed by M. Hammer and D. McLeod.

Class Hierarchy

The organization of classes in a hierarchical tree in which each parent class is a superclass and each child class is a subclass. See also inheritance.

Data Manipulation Language (DML)

The set of commands that allows an end user to manipulate the data in the database. The commands include SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK. Defines environment in whcih data can be managed and is used to work with teh data in the database.

Class Diagram Notation

The set of symbols used in the creation of class diagrams.

Eventual Consistency

updates to the database will propagate through the system and eventually all data copies will be consistent


संबंधित स्टडी सेट्स

Electric Veh Charg Sys - 2017 NEC - 2nd Ed.

View Set

Criminal Procedure 5th amendment

View Set

Nursing: A Concept-Based Approach to Learning, 2e (Pearson) Module 40 Professional Behaviors

View Set

More, Psych Ch 16, Psychology Chapter 15, Psychology

View Set

chapter 2: fluids, electrolytes, and acid-bases

View Set