Database Test 1
What does database implementation involve?
Create database file/table structures; Create views (external schema); Establish access rights; Load test data; Write/test programs that process data; Install database (with production data) into production operations (outcomes are secured database tables loaded with data)
What are the disadvantages of index files?
IO (saves IO when searching for records); when making new records you have more IO (also updating and deleting)
What is the evolution of database technologies?
1960s - traditional files; 1970s - hierarchal and network; 1980s - relational; 1990s - object and object relational; 2000s - federated, MDDB, XML, and NoSQL
Discuss foreign keys in regards to 1:1, 1:M, and M:M.
1:1 - merge attributes into single table; OR create foreign key in either relation 1:M - create foreign key in relation on "many" side of relationship M:M - should've been eliminated on ERD; create new relation with PKs of related entities as (1) concatenated PK, and (2) FKs in new relation
What are the three different relationships? (1, M, etc.)
1:1; M:M; 1:M
When is a table in 2NF?
1NF + no partial functional dependencies;
When is a table in 3NF?
2NF + no transitive dependencies
What is data administration?
A high-level function that is responsible for the overall management of data resources in an organization, including maintaining corporate-wide definitions and standards
What is database administration?
A technical function that is responsible for physical database design and for dealing with technical issues such as security enforcement, database performance, and backup and recovery
What is enterprise data modeling?
Determine organizational data requirements; Build enterprise data model (outcome is a very high-level Entity-Relationship Diagram)
What is conceptual data modeling?
Determine user data requirements; Determine business rules; Build conceptual data model; outcome is an Entity-Relationship Diagram (conceptual schema)
What does database maintenance involve?
Maintain database structures (Storage/space management); Performance, tuning (I/O Contention, CPU usage, application tuning); Data availability; DBMS upgrades, "fixes" Backup (full, incremental, differential), recovery, business continuity (data replication - fallback)
What is a foreign key?
PK in another entity as an attribute
When is a table already in 2NF?
PK is not concatenated; relation contains no non-key attributes
What is a natural key?
PK that existed before the process i.e. SSN
What is an atomic key?
PK that is only one attribute
What are the disadvantages of a traditional file processing environment?
Program-data dependence = "structural" & "data" Limited data sharing = "islands of automation" Duplication of data = "redundancy" Lengthy development times Excessive program maintenance
What are the advantages of a database?
Program-data independence Improved data sharing Minimal data redundancy Improved data accessibility/responsiveness Improved data consistency Faster application development Enforcement of standards Improved data quality Reduced program maintenance
What are you options for deleting and updating PKs?
Restrict (will let you delete as long as there are no FKs with PK); Cascade (will do the same to all FKs); Set Null (any FK value will be empty; keep order line without FK value)
Talk about the SDLC with reference to DB activities.
SDLC DB Activities in SDLC Planning - Enterprise Modeling Analysis - DB Scope, Requirement (Conceptual data model) Design - DB Design (logical DB design & physical DB design) Implementation - DB implementation (load, test, eval, OP0 & DB maintenance
What is logical database design?
Select database model e.g., the Relational Model; Transform conceptual (ERD) into logical (relational) data model; Normalize and link data structures (Outcome is normalized, linked relational tables)
What is physical database design?
Select database product (e.g., SQL Server); Select storage device(s); Design fields, records, files (physical schema); outcomes are detailed, physical definitions for: fields (data dictionary) records (space requirements for physical structures)* files (access methods)
Talk about data types and lengths.
Text - Char vs. Varchar/NVarchar (char - fixed length; varchar - variable length; nvarchar - newer; variable; allows for different languages) Numbers - Int vs. Decimal (or Numeric) (int - don't give length specification; numeric - Precision-max number of digits in number, Scale-how many occur after decimal) Dates - Date (don't require length specification) vs. Time vs. Datetime Others (will not use in class)
What are transitive dependencies?
a functional dependency between 2 non-key attributes; when a non-key attribute is functionally dependent on another non-key attribute
What are the relational operators?
UNION (+); INTERSECTION; DIFFERENCE (-); PRODUCT (x); SELECT (tuples); PROJECT (attributes); JOIN (PRODUCT, SELECT, PROJECT)
What is an index?
a table or other data structure used to determine in a file the location of records that satisfy some condition
What are the advantages of a sequential file? Disadvantages?
advantages: simple; if you are going to process the majority of the records anyway i.e. payroll disadvantages: hard to retrieve a single record (you have to start at the beginning)
What are the advantages of a relational data model? Disadvantages?
advantages: structural independence (biggest); improved conceptual simplicity; easier database design, implementation, management, and use; ad hoc query capability; mathematical foundation disadvantages: too simple sometimes (non-IT people can design them) - can facilitate poor design and implementation; hardware and system software overhead; may promote "islands of information" problems
What are deletion anomalies?
are experienced when a value for one field we wish to keep is unexpectedly removed when a value for another field is deleted; e.g. cannot delete the sole order for a customer without deleting the only copy of the customer's information also
What are update anomalies?
are experienced when changes to multiple records of a table are needed to effect an update to a single value of a field; e.g. cannot completely update a customer's address without changing it for every order placed by that customer
What are insertion anomalies?
are experienced when we attempt to store a value for one field but cannot because the value of a noter field is unknown; e.g. cannot add a new customer's information until an order number is ready to be entered
Talk about converting entities to a relational table.
attributes become columns; primary key must be designated (regular entities have atomic keys; associative entities have composite key; subtype entities have same key as super type); example entity instances are rows of table
What is a candidate key?
attributes that could be a PK
Explain a multi-tier client/server database management system.
bc two-tier dtabase architecture requires powerful client computers to contain the user interface logic as well as the business logic, most companies use a multi-tier client/server dms; the user interface is accessible on any users computer through the use of either a web browser or written using programming languagaes such as VB.net, Visual C#, or Java; ease of separating the development of the ddatabase and modelus that maintain the data from the information systems modeules that focus on business logic and or presentation logic; three tiers (client, application/web tier, enterprise tier)
What is denormalization? When would you do this?
bringing tables back together; do this when there are performance concerns (every time you have to get info from a table you have to read and go to a secondary place (IO)); this is time consuming
What are some of the common data roles?
business analyst, system s analyst, database analyst & data modelers (concentrate on determining the requirements and design for the database component of the information system), users, programmers, database architects (establish standards for data in business units, striving to attain optimum data location, currency, and quality), data administrators (responsible for existing and future databases; ensure consistency and itegrity across databases, and as experts on database technlogy, they provide consulting and training to other project team members), project managers, and others...
What is an alternate (secondary) key?
candidates keys that didn't get chosen as the PK
What is an attribute?
characteristics that describe entities; common to each instance of entity; avoid MVAs; minimize derived attributes
What are the steps for physical design?
choose DBMS; detailed definition; physical file creation; query optimization
Describe the three parts for a multi-tier client/server database management system.
client tier (browser); application/web server (contains the business logic required to accomplish the business transactions requested by the users; talks to the database server); enterprise server with DBMS (transaction database containing all organizational data or summaries of data on department servers)
What is an entity type?
collection or class of entities sharing common properties i.e. Employee
What are some of the factors we use in choosing a DBMS?
compatibility with existing hardware, software, network, operating system; DBMS features meet requirements (needed functionality, structure of data, nature of workload); product reliability; vendor support; IT personnel expertise; pricing, licensing
What is the business model for cloud computing?
computing resources on demand; need-based architects; internet-based delivery; pay as you go
What are index files?
contain data records (physically ordered - sequential) and index records (give physical location of each data record); think of book example
What is normalization?
convert complex relations into simpler relations; reveals/corrects redundancies, errors, ambiguities in data model; only a simple check IF good data model exists
What are some common business entities?
customer, product, transaction, employee, supplier/vendor, physical assets/facilities, money
How to correct a table into 2NF?
decompose into 2 or more relations (if not already); one with original (concatenated) key + attributes; one (or more) with the "depended on" partial key as PK + attributes
How do you correct a table into 3NF?
decompose into 2 or more relations (if not already); one with original PK + attributes; one (or more) with "depended on" non-key attribute as PK + attributes
What are the different data integrity controls?
defaults (if a value is not supplied by user or program, you may choose to default to something), ranges (GPA >= 0; GPA between 0-4), nulls (not null - required; null - field can be empty), referential integrity
How to correct a table into 1NF?
define new (usually associative) entity
What is a conceptual data model?
describes the business in terms of "things" it deals with; basically draw a picture; logical (not tied to any technology; not technology dependent); business document (easily understandable to users)
What is a relational data model?
developed by Codd (IBM) in 1970; data represented as tables
What is master data management?
disciplines, technologies, and methods used to ensure the currency, meaning, and quality of refernce data within and across various subject areas; identify - common data subjects, common data elements, sources of the truth
Supertype must have ? Subtypes must have ?
discriminator; PK of the super type
Rules in physical database design?
doesn't really have all the rules compared to ERDs and normalization; the way we do it can vary a lot across organizations
Why do we normalize?
ensures relations conform to rules; ensures relation contains facts about one "theme"
What do you need detailed definitions for?
fields (data dictionary); records (physical record structure, quantity); files (access methods, space requirements)
What is a secondary index key?
fields that index is built on
What does we mean by maximum cardinality?
given one instance of an entity, what is the maximum number of instances of the other entity that it can be related to
What is a generalization hierarchy?
grouping common attributes into one entity; unique attributes/relationships into other entities; supertype and subtype; no cardinalities or relationship names to subtypes
What are some guidelines for primary keys? (NOT RULES)
helpful if easy to write, read, hear, remember; record time on 24-hour clock; avoid using data that changes; avoid letters; surrogate over natural key
What are the discovery methods used to uncover requirements?
interviews (identify who to interview - executives, managers, staff employees; conduct interviews - what do you do? business challenge? information to be more effective in job...); existing systems; existing documents (e.g., applications, reservations, time sheets, contracts, catalogs, etc.)
What is surrogate key?
key that has no business meaning; only exists to make a value unique i.e. Baylor ID
How do you represent relationships?
line connecting related entities; name of relationship on the line; name must be a verb (phase)
For referential integrity where does the line go from?
line goes from FK to PK (arrow on PK side)
How do you represent an attribute?
listed underneath the entity name; mixed case i.e. Employee Name; primary key typically underlined
What is the purpose of referential integrity?
maintains consistency between data in related tables; for every foreign key there must be an existing primary key with that value; creates rules/constraints for: insertion of foreign keys and update and deletion of primary keys
What do you do when you have a many-to-many relationship?
make an associative entity
What does the DBMS manage?
manages the interaction between the end user and the database
What type of attributes do we not want?
multivalued attributes
How do you avoid anomalies with insertion, update, and deletion?
normalization
When is a table already in 3NF?
only 0 or 1 non-key attributes in relation
What is a super key?
overkill key; concatenated extra things
What is the purpose of each of the four phases of the SDLC? (or 5..)
planning - to develop a preliminary understanding of a business situation and how information systems might help solve a problem or make an opportunity possible; analysis - to analyze the business situation thoroughly to determine requirements, to structure those requirements, and to select among competing system features; design - to elicit and structure all information requirements; to develop all technology and organizational specifications; implementation - to write programs, build databases, test and install the new system, train users, and finalize documentation; maintenance - to monitor the operation and usefulness of the system, and to repair and enhance the system
What is the conceptual data model foundation for?
promoting cross-functional "business understanding"; organizing data that supports a business process; defining the scope of database
What is the purpose and goal of physical database design?
purpose: determine physical specifications for data; goal: processing efficiency (performance, integrity, security, and recoverability)
What are data stewerds? Business Analytics Engineer?
quality, metadata, MDM; data analytics, statistics, mining
What are sequential files?
records are stored one after another (1st in 1st out); in whatever order it is presented to the database (often by PK)
How do you represent an entity?
rectangular box; name is singular noun, in CAPS
How do you avoid inconsistency with insertion, update, and deletion?
referential integrity
What are some rules of relations?
relation (tables) names must be unique; entries in columns are atomic (single valued); entries in column are from same domain; each row is unique; ordering is insignificant; data in tables should be added, updated, and deleted without errors
What are the three components of a relational data model?
relational database structure; relational rules (integrity); relational operations (manipulation)
Talk about the relational database structure?
relations, tuples, attributes (same as tables, rows, columns OR files, records, fields); primary key must be designated; foreign keys must be designated for relationships
What do you consider when choosing data types and lengths for fields?
represent all possible values; ensure data integrity; support data manipulations; minimize storage space
What are the steps in converting an ERD to a relational model?
represent entities as relations; represent relationships as either: foreign keys in relations or new relations; provide sample data; normalize relations
What are the three different file organizations?
sequential, hashed, indexed
What is an entity instance?
single occurrence of an entity i.e. Employee 1, Gina Green
What are normal forms?
state of a relation; rids relations of potential anomalies
When is a table in 1NF?
table is a relation; primary key; no repeating values of groups; all column values are from the same domain
Which phase is the design phase?
the "how" phase
Which phase is the analysis phase?
the "what" phase
What do you use to build a conceptual data model?
the ERD technique; represent entities with attributes; relationships with cardinalities
Every attribute is dependent on:
the key (1NF); the WHOLE key (2NF); and nothing but the key (3NF)
What are entities?
things we need to collect information about; people, events, objects, assets, and things
Explain a two-tier client/server database management system.
two-tier - each member of the workgroup has a computer, and the computers are linked by means of a network (wired or wireless local area network [LAN]); each computer has a copy of the a specialized application (client) that provides the user interface as well as the business logic through which the data are manipulated; the database itself and the DBMS are stored on a central device called the database server, which is also connected to the network; thus, each member of the workgroup has access to shared data (might have different views); super powerful computers; every time an update is made to the logic, each computer has to be updated
Tell me about relationships?
unary, binary, ternary; cardinality (minium and maximum)
What are the different type of relationships? How are they determined?
unary; binary; ternary; determined by the number of entities
What is the RI constraint?
whatever happens when we insert a FK value (has to have an existing primary key with that value)
What is partial functional dependency?
when a non-key attribute is functionally dependent on a part of the PK
What is full functional dependency?
when the value of one attribute can be determined base on the value of another attribute