Database Design Quiz 1

Pataasin ang iyong marka sa homework at exams ngayon gamit ang Quizwiz!

What should an entity type represent?

An item or noun

Logical Data Independence

The capacity to change the conceptual schema without having to change external schemas or application programs Change of a particular user's view will not affect views of other user groups. Transform the schema differences by various mappings Examples: Age, temperature, price

Physical Data Independence

The capacity to change the internal schema without having to change the conceptual schema. Hence, the external schemas need not be changed as well Examples: Learning function is programmed by humans

Can ER diagrams define the database scope?

Yes

Mappings

between levels support data independences

Is there a systematic process for conceptual data modeling?

yes

Conceptual Data Schema

Abstraction of the reality of data elements and their relationships Conceptual view of the data requirements Pictorial representation of the data requirements It is easier to communicate with end users by using conceptual schemas.

Bottom-Up

Analysis of existing files and applications Synthesis programs and algorithms

Fundamental Modeling structures:

Attributes Entities Relationships

Why Model Data?

Conduct a system development problem from different perspectives by using different tools Every modeling tool has its purpose: Based on certain assumptions about the aspect of the problem it is intended to solve. Help designers to collect input and seek feedbacks from users Data modeling is considered as the most important modeling technique and skill for information system development.

Conceptual Level

Core of the ANSI/SPARC framework Represents the global view of the structure of the entire database: the model of the whole database Describes all data items and data relationships together with data integrity constraints Separates data from the program (or views from the physical storage structure) Captures data specification (metadata) Technology independent Relatively stable over long period of time Visible to DBA and database designers

Database Environment

Database hardware Database Management Systems (DBMS): software package for managing databases Database users Database administrator Database: Schema (meta-data) and Instances (data contents) Database models: logical and physical Database languages (DDL and DML) Data dictionary and directory Database design tools (e.g., ERWin)

Notation of ER Diagrams

Entity types: rectangles Weak entity types: double rectangles Relationship types: diamonds Connecting lines: roles entities play in the relationships Cardinalities and connectivities

Database Design Problem

Given: a data management request and its data and application requirements Objective: develop a database system Such that: Adequately manage the data of application domains Effectively supports applications' information needs

Inaccurate data

Having $1000 instead of $10000 for customer's credit limit Having $1.05 instead of $1.50 for an item

Characteristics of Business Data

Inaccuracies: Data do not reflect the reality. E.g.: wrong address, wrong number of products on shelf Inconsistency: Over time duplicate records do not agree. E.g.: school recording two different grades for same student

Cardinalities

Indicates the number of each object type that may participate in an association.

1: Requirements Collection and Analysis

Input: Application knowledge and users' information needs Task: The database designers interview users to understand and document their data requirements. Output: concisely written set of users' data requirements

4: Physical Database Design

Input: the logical database schema Task: specify the internal storage structure indexes, access paths, and file organizations for database files Output: physical implementation structures It is fully dependent on the target DBMS.

Evolution of Data Management

Manual systems (before computer age) File-based processing: sequential, indexed and direct data access

Lost data

Missing the record of a product item for sales Losing an invoice transaction

Mappings from real world primitives to ER modeling constructs: See Table 2.1

Object -> Entity Association -> Relationship Property -> Attribute

Real World Primitives

Objects Object types Properties Fact Property value set Associations Object classes

For binary associations:

One-to-one (1:1) One-to-many (1:N) Many-to-many (N:M)

Database Development Life Cycle

Preliminary Planning Database Initial Study (Feasibility Study) Database Design

Values of Properties

Properties for object types and associations Property value set: a set of possible values for a particular property.

Entity-Relationship (ER) Model

Proposed by Peter Chen in 1976 The most widely accepted data model for conceptual design Easy to use and understand by end users Capture and represent data requirements Doesn't imply how data should be organized and implemented Chosen by ANSI in 1988 as the standard model for Information Resource Directory Systems (IRDSs)

Database Evolution

Redesign and Reengineering Database Migration Database Integration IS30

Database Design

Requirements collection and analysis Conceptual design using a conceptual model Logical design using an implementation model Physical design using different file structures

Data Independences

The capacity to change the schema at one level of a database system without having to change the schema at the next higher level Two types: Logical data independence Physical data independence

External Level

The external schemas (usually many) represent different user views, each describing portions of the database for a particular application. All external schemas are generated exclusively from the conceptual schema. Each external schema (also called view) describes the database of one application such as inventory control, customer relationship, etc. Visible to the applications and end users Technology independent

Focus on Processing Functions

The program logic and functions will be specified first. Data structure is quite simple.

Database Management System (DBMS)

A collection of general-purpose software that facilitates the process of defining, constructing, and manipulating a database for various applications E.g.,: Microsoft Access, ORACLE RDBMS, IBM DB2, Microsoft SQL Server

Databases

A single, integrated set of files

The Basic Structure: Relation

A two dimensional table Rows correspond to individual records and data relationships Rows are called "tuples" Columns correspond to attributes (properties) Each attribute has a data domain specified

Objects and Object Types

An object is a thing in the real world Objects may be tangible or intangible Object type: refers to a group of objects. E.g., Employee is an object type. Each instance of an object type has certain characteristics. E.g., Employees have name, titles, salary, etc.

Top -Down

Analysis of organizational functions Determine their information requirements Synthesis these requirements

Why should we capture and represent subtype relationships?

Because the account holder could possibly do more than one action (buy or sell)

Database Development Life Cycle

Construction and Loading Testing and Evaluation Implementation Performance Maintenance and Evaluation Database Evolution

How to make sure I capture and represent all the data requirements?

Consult a partner and by prototyping

Advantages of Database Systems

Control data redundancy Eliminate data inconsistencies Facilitate data sharing Facilitate better data management Increase application reliability Increase data accessibility Better data control and protection Improve end-user interfaces Reduce maintenance work Increase security, privacy and integrity control

Large volume of data

Customer data for a bank: e.g., 5/3 Items and customer orders: e.g., Amazon, eBay Personal data recorded by government: e.g., social security

Why maintain organizational memory

Daily transaction processing Decision making Competitive advantages

Why Database Approach?

Data is the valuable resource, not the hardware. Data should be shared and accessed as needed. Data should occupy the central position. One representation for each piece of data. Control data redundancy Separate data from programs Easy to establish relationships among data Provide data security and control Provide multiple interfaces Enforce integrity constraints The most popular tool for managing data.

Problems of File-based Data Processing

Data redundancy and inconsistency Lack of Data Integration: It is the data isolation problem (Islands of Data). Each application defines and processes its own universe of data. Difficult to locate and access data Difficult to ensure data consistency Lack of control over data Lack of flexibility Ad hoc data queries to the applications are impossible. Cost of software development is high: excessive programming effort. Dependence between programs and data Inconsistent naming of terms: Homonyms and Synonyms

Data should be shared.

Data should not be owned by one entity and used for one application. Existing and new applications share the same data.

Internal Level

Describes the physical structure of the stored data (how the data is actually laid out on storage devices) Describes the methods used to implement data access (e.g., indexes, hashed addresses, etc.) Concerns with the efficiency of physical data storage and access methods Technology dependent Visible to DBMS and particularly to DBA

Relational Data Model

Developed by E. F. Codd at IBM in 1970 Conceptual simplicity The database is perceived by users as tables. Basic structure: relation (logical table) Data are related through the logical relationships according to the data values. Users are isolated from complex implementation details inside the database systems.

Inconsistent data

Different inventory levels of an item Different grades for the same course taken

Physical Models (Physical data structures)

E.g.: Indexed files (ISAM, VSAM, etc.)

File-based Data Processing Issues

Each application has its own input and output files designed to meet its particular needs. Certain data items will appear in multiple files. Each application developed independently that will result in different representations and formats for the same data items.

'Account' & 'Category': attributes or entity types?

Entity

Is 'Bid" an attribute, an entity type or a relationship type?

Entity

Conceptual Models

Entity-Relationship (ER) model Object-oriented model such as UML Semantic data models

Three-schema architecture

External level Conceptual level Internal level

ANSI/SPARC Framework

Foundation for database development Proposal for DBMS architecture to support data independencies Very influential to DBMS's design by vendors, especially Relational DBMSs such as ORACLE

Logical Models (Implementation Models)

Hierarchical Network Relational

Databases

Hierarchical (legacy database systems) Network (legacy database systems) Relational (most popular commercial databases)

Extensions to ER Model

Incorporate a set of data abstractions to capture data requirements: Aggregation Generalization Specification Categorization

Value of Digital Data for Businesses

Increase production efficiency Enhance management control Support decision making Act as the competitive source Generate strategic value (e.g., Big Data)

3: Logical Design (Data Model Mapping)

Input: conceptual schema Task: Transform the conceptual schema according to an implementation data model Output: the logical database schema Relational data model is the most popular implementation data model of commercial DBMSs.

2: Conceptual Design

Input: users' data requirements Task: analyze users' requirements (business rules) and express them by using a conceptual data model Output: the conceptual schema The conceptual schema doesn't include implementation details. It is easier to understand and be used to communicate with nontechnical users. ER is the most popular model for this phase.

Associates

Interactions and relationships among objects. Associates can generate their own properties: E.g., A list of students register IS3030 taught by Professor Chiang in Fall 2017.

How to draw a 'good' and 'simple' ER diagram?

It can vary there is usually not one right answer

Is it necessary to design a database starting from the conceptual modeling?

It is up to the designer's preference

How detailed should an ER diagram be?

It should contain entities, attributes, and relationships

Human Information Processing

Memory is limited Ability to process data is restricted Use a variety of external tools

Evolution of Data Management

Multimedia database: organize a variety of information sources such as voice, audio, photos, video, etc. Object-oriented (OO) databases Client/Server databases Distributed databases (e.g., Blockchain) Data warehouses (Data Lake)

Can attributes have relationships with other attributes or entity types?

No

Can or should we combine account and customer as one entity type?

No

Can relationship types have directions?

No

Can we have an entity type without any attribute (e.g., Auction, Web site)?

No

Should a relationship type represent multiple associations (i.e., buy and sell)?

No

Should we consider implementation details during the conceptual modeling?

No

Should we model Buyer, Seller, and Bidder as separate entity types? If not, why not?

No

Should we model control flows such as "Customers create Accounts to offer items", and "Customers assign categories to items"?

No

Should we specify the 'primary keys' for the entity types?

No

Should we specify the sequence of relationship types (e.g., bid before buy)?

No

Conceptual Data Modeling Using Entity-Relationship Model

Objective: represent the data requirements using ER model

Why Conceptual Data Modeling for Database Design?

Opportunity for enhanced user participation in the design process Merits that accrue from DBMS-independent modeling Ease of understanding of the big picture and the consequent facilitation of maintenance of schemas and applications in the long run

Organizational Information Processing

Processors: humans and computers Use external memory Adopt external sources as processors

Common Problems of Organizational Memory

Redundancy Data control issues Poor user interface Poor data security Poor information quality Long delays in responding to information needs Unable to answer questions about the real world: Lack of reality Lack of memory (data) integration

Bid, Buy & Sell: Relationship or Entity?

Relationship

Conceptual Data Model:

Shows the data elements and how they are related without considering how they should be implemented.

Why Relational Databases?

Simple structure, easy to design Support data independences Powerful and flexible data manipulation capability 4GL query language (SQL) View data logically rather than physically Better control data redundancies

Organizational Memory

Storage medium: most data is stored in memory Storage structure: File and Database Data entry and information retrieval methods Trade-off between access speed and cost Essential component of modern organizations Key competitive resources

Needs of Data Independences

Support independent end users' and applications' views of data Support program-data independence Minimize the ripple effort of changes: E.g.: application programs immune to changes in storage structure and access methods But, how to achieve them?

Components of a DBMS

The major components of this include one or more query languages; tools for generating reports; facilities for providing security, integrity, backup and recovery; a data manipulation language for accessing the database; and a data definition language used to define the structure of data.

Integrated

The responsibility for 'integrating' data items as needed is assumed by the DBMS instead of the programmer.

Self-describing

The structure of the database (metadata) is recorded within the database system - not in the application programs.

File-based Data Processing Advantages

Very efficient use of machine resource, based on the view that computing power is the most valuable resource Relatively easy to design and implement for simple applications

Data Models

What is a model? A representation of something real for design purposes Examples: blueprint, prototype, simulation What is the purpose of models? So designers know what they need to change What are data models and their purposes? Abstraction of reality of data items A conceptual way in organizing data items

Can relationship types have their own attributes?

Yes

Can we have more than one relationship type between two entity types? (e.g., Customers and Items)

Yes

Should (can) I make assumptions for the conceptual data modeling?

Yes

Should we model the 'derived' attribute such as '# of bidder', 'the highest bid'?

Yes

Should we model the business rules such as "A bidder cannot bid his/her own items."

Yes

Can we model process requirements into the ER diagram? If we can, do we want to do so?

Yes, but we don't want to

Can attributes have their own attributes?

Yes, they are known as composite attributes

Information

a collective of data that holds meaning

What is the purpose of conceptual modeling?

comprehension, abstraction, representation

Data

material that can be accessed that has no specific meaning

What to do and how to start the conceptual data modeling process?

start with an ER diagram


Kaugnay na mga set ng pag-aaral

NetAcad Chapter 5 Exam Questions - 'Getting Help'

View Set

Business chapter 14 (final exam)

View Set