INFO 364 STUDY GUIDE
Other Rapid Application Approaches
Agile -emphasizes "individuals and interactions over processes and tools, working software over comprehensive documentation, customer collaboration over contract negotiation, and response to change over following a plan." (The Agile Manifesto)
Criteria for identifiers
Choose Identifiers that - Will not change in value - Will not be null Avoid intelligent identifiers (e.g., containing locations or people that might change) Substitute new, simple keys for long, composite keys
Evolution of Database Systems
Driven by four main objectives: Need for program-data independence reduced maintenance Desire to manage more complex data types and structures Ease of data access for less technical personnel Need for more powerful decision support platforms
Conceptual Schema
E-R models-covered in Chapters 2 and 3
Context
Helps users understand the data
Entity Integrity
No primary key attribute may be null. All primary key fields MUST contain data values.
Structured
numbers, text, dates
Solution: The Database Approach
-Central repository of shared data -Data is managed by a controlling agent -Stored in a standardized, convenient form -Requires a Database Management System (DBMS)
Elements of the Database Approach
1) data models 2) entities 3) relationships 4) relational databases
Data Models
-Graphical diagram capturing nature and relationship of data -Enterprise Data Model-high-level entities and relationships for the organization -Project Data Model-more detailed view, matching data structure in database or data warehouse
System Development Life Cycle (SDLC)
1. Planning 2. Analysis 3. Logical Design 4. Physical Design 5. Implementation 6. Maintenance
Business Rules
Are statements that define or constrain some aspect of the business Are derived from policies, procedures, events, functions Assert business structure Control/influence business behavior Are expressed in terms familiar to end users Are automated through DBMS software
integrity constraints
Domain constraints, entity integrity, and referential integrity
Key Fields
Keys are special fields that serve two main purposes: Primary keys are unique identifiers of the relation. Examples include employee numbers, social security numbers, etc. This guarantees that all rows are unique. Foreign keys is an attribute in a relation that serves as the primary key of another relation. Keys can be simple(a single field) or composite(more than one field). Keys usually are used as indexes to speed up the response to user queries (more on this in Chapter 5).
Data Structure
Tables (relations), rows, columns
External Schema
User Views Subsets of Conceptual Schema Can be determined from business-function/data entity matrices DBA determines schema for different users
Disjointness Constraint
Whether an instance of a supertype may simultaneously be a member of two (or more) subtypes Disjoint Rule: An instance of the supertype can be a member of only ONE of the subtypes Overlap Rule: An instance of the supertype could be a member of more than one of the subtypes
Completeness Constraint
Whether an instance of a supertype must also be a member of at least one subtype Total Specialization Rule: Yes (double line) Partial Specialization Rule: No (single line)
Relationships
-Between entities -Usually one-to-many (1:M) or many-to-many (M:N), but could also be one-to-one (1:1)
Enterprise database applications
-Enterprise Resource Planning (ERP) Integrate all enterprise functions (manufacturing, finance, sales, marketing, inventory, accounting, human resources) -Data Warehouse Integrated decision support system derived from various operational databases
subtype discriminator
An attribute of the supertype whose values determine the target subtype(s) Disjoint-a simple attribute with alternative values to indicate the possible subtypes Overlapping-a composite attribute whose subparts pertain to different subtypes. Each subpart contains a Boolean value to indicate whether or not the instance belongs to the associated subtype
Associative Entities
An entity-has attributes A relationship-links entities together When should a relationship with attributes instead be an associative entity? All relationships for the associative entity should be many The associative entity could have meaning independent of the other entities The associative entity preferably has a unique identifier, and should also have other attributes The associative entity may participate in other relationships other than the entities of the associated relationship Ternary relationships should be converted to associative entities
Attributes
Attribute-property or characteristic of an entity that is of interest Classifications of attributes: Required versus Optional Attributes Simple versus Composite Attribute Single-Valued versus Multivalued Attribute Stored versus Derived Attributes Identifier Attributes
Information
Data processed to increase knowledge in the person using the data
Metadata
Data that describes the properties and context of user data
Relational Databases
Database technology involving tables (relations) representing entities and primary/foreign keys representing relationships
Data Manipulation
Powerful SQL operations for retrieving and modifying data
Domain Integrity
Allowable values for an attribute
An entity:
Should be: An object that will have many instances in the database An object that will be composed of multiple attributes An object that we are trying to model Should not be: A user of the database system An output of the database system (e.g., a report)
Strong vs. Weak Entities, and Identifying Relationships
Strong Entity: exists independently of other entity types has its own unique identifier Weak Entity: dependent on a strong entity (identifying owner)...cannot exist on its own does not have a unique identifier (only a partial identifier) entity box and partial identifier have double lines Identifying relationship: links strong entities to weak entities
Generalization
The process of defining a more general entity type from a set of more specialized entity types. BOTTOM-UP
Specialization
The process of defining one or more subtypes of the supertype and forming supertype/subtype relationships. TOP-DOWN
Graphical Displays
Turns data into useful information that managers can use for decision making and interpretation
Internal Schema
Logical structures-covered in Chapter 4 Physical structures-covered in Chapter 5
Problems with data redundancy
Waste of space to have duplicate data Causes more maintenance headaches The biggest problem: Data changes in one file could cause inconsistencies Compromises in data integrity
Unstructured
images, video, documents
Database
organized collection of logically related data
Data
stored representations of meaningful objects and events
Planning
purpose-preliminary understanding Deliverable-request for study Database activity-enterprise modeling and early conceptual data modeling
Prototyping Database Methodology
Prototyping is a classical Rapid Application Development (RAD) approach
SDLC
-System Development Life Cycle -Detailed, well-planned development process -Time-consuming, but comprehensive -Long development cycle
Enterprise Data Model
- First step in the database development process -Specifies scope and general content -Overall picture of organizational data at high level of abstraction -Entity-relationship diagram -Descriptions of entity types -Relationships between entities -Business rules
costs and risk of database approach
-New, specialized personnel -Installation and management cost and complexity -Conversion costs -Need for explicit backup and recovery -Organizational conflict
Entities
-Noun form describing a person, place, object, event, or concept -Composed of attributes
Advantages of the Database Approach
-Program-data independence -Planned data redundancy -Improved data consistency -Improved data sharing -Increased application development productivity -Enforcement of standards -Improved data quality -Improved data accessibility and responsiveness -Reduced program maintenance -Improved decision support
Prototyping
-Rapid application development (RAD) -Cursory attempt at conceptual data modeling -Define database during development of initial prototype -Repeat implementation and maintenance activities with new prototype versions
Examples of agile programming methodologies
-eXtreme programming -Scrum -DSDM Consortium -Feature-driven development
Disadvantages of file processing
1. Program-Data Dependence -All programs maintain metadata for each file they use 2. Duplication of Data -Different systems/programs have separate copies of the same data 3. Limited Data Sharing -No centralized control of data 4.Lengthy Development Times -Programmers must design their own file formats 5. Excessive Program Maintenance -80% of information systems budget
Components of the database environment
1.Data Modeling Design Tools 2.Repository 3.DBMS 4.Database 5.Application Programs 6.User Interface 7.Data and database administrators 8.System developers 9.End Users
Problems with data dependency
1.Each application programmer must maintain his/her own data 2.Each application program needs to include code for the metadata of each file 3.Each application program must have its own processing routines for reading, inserting, updating, and deleting data 4.Lack of coordination and central control 5.Non-standard file formats
Supertype
A generic entity type that has a relationship with one or more subtypes
Relation
A relation is a named, two-dimensional table of data. A table consists of rows (records) and columns (attribute or field). Requirements for a table to qualify as a relation: It must have a unique name.Every attribute value must be atomic (not multivalued, not composite). Every row must be unique (can't have two rows with exactly the same values for all their fields). Attributes (columns) in tables must have unique names. The order of the columns must be irrelevant. The order of the rows must be irrelevant.
Database Management System (DBMS)
A software system that is used to create, maintain, and provide controlled access to user databases
Subtype
A subgrouping of the entities in an entity type that is meaningful to the organization and that shares common attributes or relationships distinct from other subgroupings.
Managing Projects: People involved
Business analysts Systems analysts Database analysts and data modelers Users Programmers Database architects Data administrators Project managers Other technical experts
Cardinality constraints
Cardinality Constraints—the number of instances of one entity that can or must be associated with each instance of another entity Minimum Cardinality - If zero, then optional - If one or more, then mandatory Maximum Cardinality - The maximum number
Simple Vs Composite Attributes
Composite attribute: An attribute that has meaningful component parts (attributes)
Degree of relationships
Degree of a relationship is the number of entity types that participate in it Unary Relationship Binary Relationship Ternary Relationship
entity cluster
EER diagrams are difficult to read when there are too many entities and relationships. Solution: Group entities and relationships into entity clusters. Entity cluster: Set of one or more entity types and associated relationships grouped into a single abstract entity type
E-R Model Constructs
Entities, Relationships, Attributes
entities
Entity instance-person, place, object, event, concept (often corresponds to a row in a table) Entity Type-collection of entities (often corresponds to a table)
Database Schema
External Schema Conceptual Schema Internal Schema
Identifiers
Identifier (Key)-an attribute (or combination of attributes) that uniquely identifies individual instances of an entity type Candidate Identifier-an attribute that could be an identifier...satisfies the requirements for being an identifier Simple versus Composite Identifier
Transforming EER Diagrams into Relations
Mapping Regular Entities to Relations Simple attributes: E-R attributes map directly onto the relation Composite attributes: Use only their simple, component attributes Multivalued Attribute: Becomes a separate relation with a foreign key taken from the superior entity Mapping Weak Entities Becomes a separate relation with a foreign key taken from the superior entity Primary key composed of: Partial identifier of weak entity Primary key of identifying relation (strong entity) Mapping Associative Entities Identifier Not Assigned Default primary key for the association relation is composed of the primary keys of the two entities (as in M:N relationship) Identifier Assigned It is natural and familiar to end-users Default identifier may not be unique Mapping Unary Relationships One-to-Many-Recursive foreign key in the same relation Many-to-Many-Two relations: One for the entity type One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity
Transforming EER Diagrams into Relations
Mapping Ternary (and n-ary) Relationships One relation for each entity and one for the associative entity Associative entity has foreign keys to each entity in the relationship Mapping Supertype/Subtype Relationships One relation for supertype and for each subtype Supertype attributes (including identifier and subtype discriminator) go into supertype relation Subtype attributes go into each subtype; primary key of supertype relation also becomes primary key of subtype relation 1:1 relationship established between supertype and each subtype, with supertype as primary table
Multi-Valued and derived attributes
Multivalued -may take on more than one value for a given entity (or relationship) instance Derived -values can be calculated from related attribute values (not physically stored in the database)
Range of Database Applications
Personal databases Two-tier and N-tier Client/Server databases Enterprise applications - Enterprise resource planning (ERP) systems - Data warehousing implementations
Managing people and projects
Project-a planned undertaking of related activities to reach an objective that has a beginning and an end Initiated and planned in planning stage of SDLC Executed during analysis, design, and implementation Closed at the end of implementation
Attributes
Properties or characteristics of an entity or relationship type (often corresponds to a field in a table)
Physical Design
Purpose-develop technology and organizational specifications Deliverable-program/data structures, technology purchases, organization redesigns Database activity-physical database design (define database to DBMS, physical data organization, database processing programs)
Logical Design
Purpose-information requirements elicitation and structure Deliverable-detailed design specifications Database activity-logical database design (transactions, forms, displays, views, data integrity and security)
Maintenance
Purpose-monitor, repair, enhance Deliverable-periodic audits Database activity-database maintenance, performance analysis and tuning, error corrections
implenmentation
Purpose-programming, testing, training, installation, documenting Deliverable-operational programs, documentation, training materials Database activity-database implementation, including coded programs, documentation, installation and conversion
Analysis
Purpose-thorough requirements analysis and structuring Deliverable-functional system specifications Database activity-thorough and integrated conceptual data modeling
A good data name is:
Related to business, not technical, characteristics Meaningful and self-documenting Unique Readable Composed of words from an approved list Repeatable Written in standard syntax
Correspondence with E-R model
Relations (tables) correspond with entity types and with many-to-many relationship types. Rows correspond with entity instances and with many-to-many relationship instances. Columns correspond with attributes
Relationships
Relationship instance-A relationship instance is an association between (or among) entity instances Relationship type-A relationship type is a meaningful association between (or among) entity types
Referential Integrity
Rules that maintain consistency between the rows of two related tables.
Two approaches to database and is development
SDLC and prototyping
Attribute Inheritance
Subtype entities inherit values of all attributes of the supertype An instance of a subtype is also an instance of the supertype
data integrity
The models include mechanisms for implementing business rules that maintain integrity of manipulated dat
Enhanced ER model
extends original ER model with new modeling constructs