Data Modeling 2
occurrence
analogous to a row in the relational table
Planning
defines the goals of the database, explains why the goals are important, and sets out the path by which the goals will be reached
Existence
denotes whether the existence of an entity instance depends on the existence of another, related entity instance; defined as either mandatory or optional
Attributes
describe the entity with which they are associated; classified as identifiers or descriptors
descriptor
describes a non-unique characteristic of an entity instance.
connectivity of a relationship
describes the mapping of associated entity instances in the relationship. The values of connectivity are "one" or "many".
Information needed for the requirements analysis
review of existing documents, interviews with end users, and review of existing automated systems
methods used to create a data model
the Entity-Relationship (ER) approach and the Object Model
Steps In Building the Data Model
1. Identifying data objects and relationships 2. Drafting the initial ER diagram with entities and relationships 3. Refining the ER diagram 4. Adding key attributes to the diagram 5. Adding non-key attributes 6. Diagramming generalization hierarchies 7. Validating the model through normalization 8. Adding business and integrity rules to the model
Three points to keep in mind during the requirements analysis
1. Talk to the end users about their data in "real-world" terms. 2. Take the time to learn the basics about the organization and its activities that you want to model. 3. End-users typically think about and view data in different ways according to their function within an organization.
ER diagram notations
Bachman, crow's foot, and IDEF1X represent entities as rectangular boxes and relationships as lines connecting boxes. Each style uses a special set of symbols to represent the cardinality of a connection.
generalization hierarchy
a form of abstraction that specifies that two or more entities that share common attributes can be generalized into a higher level entity type called a supertype or generic entity
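A generalization hierarchy can be sketched with class inheritance. This is only an illustration: the entity names (Employee, SalariedEmployee, HourlyEmployee) and their attributes are hypothetical and do not come from the cards. The supertype holds the shared attributes, and each subtype adds attributes that apply only to its subset.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Supertype (generic entity): attributes shared by every subtype."""
    emp_id: int
    name: str


@dataclass
class SalariedEmployee(Employee):
    """Subtype: adds an attribute that applies only to salaried staff."""
    annual_salary: float


@dataclass
class HourlyEmployee(Employee):
    """Subtype: adds an attribute that applies only to hourly staff."""
    hourly_rate: float


# Hypothetical instances; both can be treated as Employee occurrences.
staff = [SalariedEmployee(1, "Ana", 60000.0), HourlyEmployee(2, "Ben", 25.0)]
shared_names = [e.name for e in staff]
```

Here each instance belongs to exactly one subtype, so the categories in this sketch are mutually exclusive.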
Associative entities (also known as intersection entities)
are entities used to associate two or more entities in order to reconcile a many-to-many relationship.
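As a minimal sketch of how an associative entity reconciles a many-to-many relationship, the following uses Python's built-in sqlite3 module. The table and column names (employee, project, assignment) are assumptions chosen for illustration. The assignment table holds one row per employee-project pairing, which turns one M:N relationship into two 1:N relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE project (proj_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Associative (intersection) entity: one row per employee-project pairing.
CREATE TABLE assignment (
    emp_id  INTEGER NOT NULL REFERENCES employee(emp_id),
    proj_id INTEGER NOT NULL REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO project VALUES (?, ?)", [(10, "Payroll"), (20, "Website")])
conn.executemany("INSERT INTO assignment VALUES (?, ?)", [(1, 10), (1, 20), (2, 10)])

# One employee maps to many projects...
projects_for_ana = [row[0] for row in conn.execute(
    "SELECT p.title FROM project p JOIN assignment a ON a.proj_id = p.proj_id "
    "WHERE a.emp_id = 1 ORDER BY p.title")]
# ...and one project maps to many employees.
staff_on_payroll = [row[0] for row in conn.execute(
    "SELECT e.name FROM employee e JOIN assignment a ON a.emp_id = e.emp_id "
    "WHERE a.proj_id = 10 ORDER BY e.name")]
```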
Entities
are the principal data objects about which information is to be collected; usually recognizable concepts, either concrete or abstract, such as persons, places, things, or events that have relevance to the database; classified as independent or dependent
effective data model
completely and accurately represents the data requirements of the end users. It is simple enough to be understood by the end user yet detailed enough to be used by a database designer to build the database. The model eliminates redundant data, is independent of any hardware and software constraints, and can be adapted to changing requirements with a minimum of effort.
Data Model
conceptual representation of the data structures that are required by a database. The data structures include the data objects, the associations between data objects, and the rules which govern operations on the objects (equivalent to an architect's building plans.)
Components of A Data Model
The data model gets its inputs from the planning and analysis stage and has two outputs. The first is an entity-relationship diagram, which represents the data structures in pictorial form. The second is a data document, which describes in detail the data objects, relationships, and rules required by the database.
Database design
"design the logical and physical structure of one or more databases to accommodate the information needs of the users in an organization for a defined set of applications". The design process roughly follows five steps: 1. planning and analysis 2. conceptual design 3. logical design 4. physical design 5. implementation
Subtypes
either mutually exclusive (disjoint) or overlapping (inclusive)
ER Notation
* entities are represented by labeled rectangles. The label is the name of the entity. Entity names should be singular nouns. * relationships are represented by a solid line connecting two entities. The name of the relationship is written above the line. Relationship names should be verbs. * attributes, when included, are listed inside the entity rectangle. Attributes which are identifiers are underlined. Attribute names should be singular nouns. * cardinality of many is represented by a line ending in a crow's foot. If the crow's foot is omitted, the cardinality is one. * existence is represented by placing a circle or a perpendicular bar on the line. Mandatory existence is shown by the bar (looks like a 1) next to the entity for which an instance is required. Optional existence is shown by placing a circle next to the entity that is optional.
data model
focuses on what data should be stored in the database while the functional model deals with how the data is processed; used to design the relational tables
direction of a relationship
indicates the originating entity of a binary relationship. The entity from which a relationship originates is the parent entity; the entity where the relationship terminates is the child entity. Direction is determined by the relationship's connectivity.
Analysis
involves determining the requirements of the database. This is typically done by examining existing documentation and interviewing users.
ternary relationship
involves three entities and is used when a binary relationship is inadequate; decomposed into two or more binary relationships
entity occurrence (also called an instance)
is an individual occurrence of an entity
cardinality of a relationship
is the actual number of related occurrences for each of the two entities. The basic types of connectivity for relationships are: one-to-one, one-to-many, and many-to-many.
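To make the three basic connectivity types concrete, here is a small sketch that classifies a set of observed (A, B) pairings. The function name classify_connectivity and the sample data are hypothetical. Note the caveat: in a data model, connectivity is a rule about what is allowed, so sample data can show that "many" occurs but cannot prove that it never will.

```python
from collections import defaultdict


def classify_connectivity(pairs):
    """Classify observed (a, b) pairings as '1:1', '1:N', 'N:1', or 'M:N'."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # "Many" on a side means some instance is paired with more than one partner.
    a_has_many_b = any(len(bs) > 1 for bs in a_to_b.values())
    b_has_many_a = any(len(a_set) > 1 for a_set in b_to_a.values())
    if a_has_many_b and b_has_many_a:
        return "M:N"
    if a_has_many_b:
        return "1:N"
    if b_has_many_a:
        return "N:1"
    return "1:1"


# Hypothetical pairings: (employee, office), (department, employee), (employee, project).
offices = classify_connectivity([("Ana", "Office 1"), ("Ben", "Office 2")])
departments = classify_connectivity([("Sales", "Ana"), ("Sales", "Ben")])
projects = classify_connectivity([("Ana", "P1"), ("Ana", "P2"), ("Ben", "P1")])
```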
Degree of a Relationship
is the number of entities associated with the relationship
goal of the data model
is to make sure that all data objects required by the database are completely and accurately represented; detailed enough for the database developers to use as a "blueprint" for building the physical database
requirements analysis
is usually done at the same time as the data modeling
Identifiers
more commonly called keys, uniquely identify an instance of an entity
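A quick way to sanity-check a candidate identifier is to test whether its values are unique across a set of sample instances. The sketch below assumes instances are plain dictionaries; the helper name is_identifier and the sample rows are hypothetical. Passing the check on a sample is necessary but not sufficient, since uniqueness must hold for every possible instance.

```python
def is_identifier(rows, attrs):
    """Return True if the combination of attrs is unique across all rows."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False  # two instances share the same value: not an identifier
        seen.add(key)
    return True


# Hypothetical entity instances.
employees = [
    {"emp_id": 1, "name": "Ana", "dept": "Sales"},
    {"emp_id": 2, "name": "Ben", "dept": "Sales"},
    {"emp_id": 3, "name": "Ana", "dept": "IT"},
]

emp_id_is_key = is_identifier(employees, ["emp_id"])           # unique per instance
name_is_key = is_identifier(employees, ["name"])               # "Ana" repeats: a descriptor
name_dept_is_key = is_identifier(employees, ["name", "dept"])  # composite candidate
```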
Entity-Relationship Model (ER)
most common method used to build data models for relational databases; can easily be transformed into relational tables; simple and easy to understand; can be used as a design plan by the database developer to implement a data model in specific database management software.
Data modeling
must be preceded by planning and analysis
Generalization
occurs when two or more entities represent categories of the same real-world object
non-identifying relationship
one in which both entities are independent
identifying relationship
one in which one of the child entities is also a dependent entity.
independent entity
one that does not rely on another for identification
dependent entity
one that relies on another for identification.
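One common way to realize a dependent entity in relational tables is to include the parent's key in the child's primary key. The sketch below uses sqlite3 and assumes hypothetical project and subtask tables: a subtask is identified by its project plus a task number, so the same task number can recur under different projects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Independent entity: identified by its own key.
CREATE TABLE project (proj_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Dependent entity: its identifier includes the parent's key.
CREATE TABLE subtask (
    proj_id     INTEGER NOT NULL REFERENCES project(proj_id),
    task_no     INTEGER NOT NULL,
    description TEXT NOT NULL,
    PRIMARY KEY (proj_id, task_no)
);
""")

# Hypothetical sample data: task_no 1 appears under two different projects.
conn.executemany("INSERT INTO project VALUES (?, ?)", [(10, "Payroll"), (20, "Website")])
conn.executemany("INSERT INTO subtask VALUES (?, ?, ?)",
                 [(10, 1, "Gather requirements"), (20, 1, "Draft layout")])

# Repeating the full identifier (project + task number) is rejected.
try:
    conn.execute("INSERT INTO subtask VALUES (10, 1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

subtask_count = conn.execute("SELECT COUNT(*) FROM subtask").fetchone()[0]
```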
Relationships
represent an association between two or more entities. Examples of relationships: employees are assigned to projects; projects have subtasks; departments manage one or more projects. Classified by their degree, connectivity, cardinality, direction, type, and existence.
many-to-many (M:N) relationship
sometimes called non-specific, is when for one instance of entity A, there are zero, one, or many instances of entity B and for one instance of entity B there are zero, one, or many instances of entity A. An example is: employees can be assigned to no more than two projects at the same time; projects must have at least three employees assigned
Binary relationships
the association between two entities; the most common type in the real world. A recursive example, in which an entity is related to itself: "some employees are married to other employees".
goals of the requirements analysis
to determine the data requirements of the database in terms of primitive objects; to classify and describe the information about these objects; to identify and classify the relationships among the objects; to determine the types of transactions that will be executed on the database and the interactions between the data and the transactions; to identify rules governing the integrity of the data
Subtype entities
used in generalization hierarchies to represent a subset of instances of their parent entity, called the supertype, but which have attributes or relationships that apply only to the subset.
ER model
views the real world as a construct of entities and associations between entities.
mutually exclusive category
when an entity instance can be in only one category
overlapping category
when an entity instance may be in two or more subtypes
one-to-one (1:1) relationship
when at most one instance of entity A is associated with one instance of entity B. For example, "employees in the company are each assigned their own office. For each employee there exists a unique office and for each office there exists a unique employee."
one-to-many (1:N) relationships
when for one instance of entity A, there are zero, one, or many instances of entity B, but for one instance of entity B, there is only one instance of entity A. An example of a 1:N relationship: a department has many employees; each employee is assigned to one department
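The department and employee example can be sketched as two tables, with the "one" side referenced by a foreign key on the "many" side. This uses sqlite3; the table and column names are assumptions for illustration. Declaring dept_id as NOT NULL also makes the department mandatory for each employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- The "many" side carries the foreign key to the "one" side.
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
""")

# Hypothetical sample data: one department, two employees.
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Ana", 1), (2, "Ben", 1)])

headcount = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE dept_id = 1").fetchone()[0]

# An employee pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (3, 'Cy', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```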
information contained in the data model
will be used to define the relational tables, primary and foreign keys, stored procedures, and triggers