C175 - Database Management - Foundations - Chapter Review Cards

¡Supera tus tareas y exámenes ahora con Quizwiz!

Give an example of each of the three types of relationships: 1:1, 1:M, M:N

*1:1*- An academic department is chaired by one professor; a professor may chair only one academic department. *1:M*- A customer may generate many invoices; each invoice is generated by one customer. *M:N*- An employee may have earned many degrees; a degree may have been earned by many employees.

What is a relationship, and what three types of relationships exist?

An association between 2 or more entities, 1:M, M:M, 1:1

What is a composite entity, and when is it used

An entity generally used to transform M:N relationships into 1:M relationships, A.K.A bridge entity, linking table. It has a PK composed of multiple elements, attributes inherited from the entities it is bridging

Why is a table whose primary key consists of a single attribute automatically in 2NF when it is in 1NF?

Because there isn't a composite PK which is required for partial dependencies which is what the 2NF process is designed to remove

Explain how the entity relationship (ER) model helped produce a more structured relational database design environment.

By graphically representing the db's main entities and their relationships it made it easier to understand their roles making it easy to map the ERM to the db model's tables and attributes

Discuss the importance of data modeling.

Db designers use them as a communication device to facilitate interaction among the designers, app programmers, and end users to help reduce, if not eliminate, discrepancies between the db design's components and the complex real world environment via a relatively simple (usually graphical) representation

T or F : The normalization process is used to eliminate data redundancy in a database

False (it is used to produce the carefully controlled redundancy required to properly link database tables)

What are the main components of a database system?

Hardware, software, people, procedures, and data

What are the potential costs of implementing a database system?

Increased acquisition and operating costs Management complexity Maintaining currency (keeping things current) Vendor dependence

What is a DBMS, and what are its functions?

It's a collection of programs that manage the db structure and control shared access to the data in the db. Its functions are: - Storing and updating, as needed, metadata in a data dictionary - Creates the complex structures for data storage - Transforms data to fit the created structures - Creates and enforces security - Creates complex structures to allow multiuser access - Handles backup and recovery - Promotes and enforces integrity rules - Provides access to the data through utility programs and APIs - Provides end-user access to data within a network environment

MOLAP

Multidimensional On-Line Analytical Processing (MOLAP) provides OLAP functionality using multidimensional databases (MDBMS) to store and analyze multidimensional data

MDBMS

Multidimensional database systems (MDBMS) use special proprietary techniques to store data in matrix-like arrays of n-dimensions.

Where are 3 main modules on an OLAP system: OLAP GUI, OLAP Analytical Processing Logic, and OLAP Data Processing Logic, and where are they typically located?

OLAP GUI: is the main point of contact between the end user and the system and is always placed on the end user's computer as it provides the interface through which the end user queries the data warehouse's contents OLAP ALP: performs the complex transformations required for business data analysis, such as multiple dimensions, aggregation, period comparison, and so on and can either be placed on the client (for speed) or on the server (for better administration and better throughput). OLAP DPL: maps the data analysis requests to the proper data objects in the Data Warehouse and is generally placed at the server level

OLAP

OLAP stands for On-Line Analytical Processing and uses multidimensional data analysis techniques. OLAP yields an advanced data analysis environment that provides the framework for decision making, business modeling, and operations research activities.

List and describe the different types of databases.

Single-user: sits on a user's desktop Multiuser: allows more than one user simultaneous access Workgroup: is a multiuser for 50 or fewer users, or a specific department in an organization Enterprise: is used by the entire organization and supports more than 50 users Centralized: supports data located at a single site Distributed: supports data distributed across several sites Cloud-based: created and maintained through cloud data services General-purpose: contain a wide variety of data across multiple disciplines Discipline-specific: contains data pertaining to specific subject areas Transactional: a.k.a. production, operational, or OLTP, support a company's day-to-day operations Analytical: allow advanced analysis of business data using sophisticated tools typically using another type of db: data warehouse Data Warehouse (OLAP): specialized db that stores data in a format optimized for decision support containing historical data from the operational db as well as data from other external sources Unstructured Data: storage of raw state data in the format it was collected Structured Data: storage of data formatted to facilitate storage, use, and generation of information

What is sparse data? Give an example.

Sparse data is when the number of entity attributes is large, but the number of distinct values is small A personnel entity could have attributes for first name, last name, middle name, degree, income, veteran status, etc. but it's possible that not all personnel have a middle name, or that they are veterans, etc.

data sparcity

The number of different values an attribute can be

What is data redundancy, and which characteristics of the file system can lead to it?

Unnecessarily duplicated data resulting from a file system's inability to represent and manage data relationships.

Use examples to compare and contrast unstructured and structured data. Which type is more prevalent in a typical business environment?

Unstructured data is raw, in the format originally collected, and not usable to extract data from, e.g. scans of invoices stored as images Structured data is the result of formatting unstructured data based on processing needs, e.g. storing the scanned invoice data in a structured spreadsheet

What three data anomalies are likely to be the result of data redundancy? How can such anomalies be eliminated?

Update, addition (insert), and delete. Normalization

In a SELECT query, what is the difference between a WHERE clause and a HAVING clause?

WHERE eliminates rows before any grouping occurs while the HAVING clause eliminates groups after the grouping occurs WHERE can't contain an aggregate function, but HAVING can

Explain why the following command would create an error, and what changes could be made to fix the error: SELECT V_CODE, SUM(P_QOH) FROM PRODUCT;

Your SELECT statement can't have attributes outside of an aggregate function if part of the statement includes one. If you need an attribute outside of the aggregate function as in this example you would have to put the value (V_CODE) in a GROUP BY clause

What is normalization?

a multistep process for assigning attributes to entities eliminating uncontrolled data redundancies

Explain the difference between an ORDER BY clause and a GROUP BY clause.

an ORDER BY clause simply changes how your results appear, but does not modify what appears whereas a GROUP BY clause can actually affect what is returned through aggregate functions

What is a surrogate key, and when should you use one?

artificial pk, to simplify pk selection

What is the difference between the COUNT aggregate function and the SUM aggregate function?

count just gives you a total # of rows that match regardless of data type, whereas SUM adds the values together requiring that they be numeric values

Explain the differences between data, information, and a database.

data are raw facts, information is data processed to provide meaning, and a database stores/shares data so that it can be transformed into information as needed

What is data independence, and why is it lacking in file systems?

data independence exists when changes to a file's data characteristics don't affect data access programs and is therefore lacking in file systems because file access is dependent on a file's data characteristics

What is structural independence, and why is it important?

it exists when changes to the file's structural characteristics (e.g. # of columns) has no impact on data access programs which substantially reduces programming effort and program maintenance costs

what's the difference between sparse data and data sparsity?

sparse data refers to essentially a large number of attributes with many NULL instances whereas data sparsity refers to the number of different values an attribute could have, e.g. gender only has 2 values, M or F so has low data sparsity, whereas name would have high data sparsity because the number of different names would be high

Suppose that someone tells you that an attribute that is part of a composite primary key is also a candidate key. How would you respond to that statement?

that's not possible because a primary key is a minimal candidate key which means it can't be broken down further all elements within the composite are required to uniquely identify

When is a table in BCNF?

when in 3NF and every determinant is a candidate key

When is a table in 1NF?

when there are no repeating groups (all key attributes have been defined) in the table and when all remaining attributes are dependent on the primary key

The proper OLAP/MOLAP selection criteria must include:

• purchase and installation price • supported hardware and software • compatibility with existing hardware, software, and DBMS • available programming interfaces • performance • availability, extent, and type of administrative tools • support for the database schema(s) • ability to handle current and projected database size • database architecture • available resources • flexibility • scalability • total cost of ownership

What 3 (often conflicting) db requirements must be addressed by the db designer?

1) Design elegance: adhering to design rules 2) Information requirements of the end users 3) Transaction speed required by the end users

OLAPs four main characteristics are

1. Multidimensional data analysis techniques 2. Advanced database support 3. Easy to use end user interfaces 4. Support for client/server architecture.

What is a business rule, and what is its purpose in data modeling?

A brief, precise, and unambiguous description of a policy, procedure, or principle within a specific organization's environment derived from a description of operations: a detailed description of the operational environment of the organization, used to define the entities, attributes, relationships, and constraints used in the db design


Conjuntos de estudio relacionados

Ch. 50 Prep U Patients With Hearing and Balance Disorders

View Set

Chp 12 - Employee Benefits: Group Life and Health Insurance

View Set

Unit 17 / Cambridge English Objective Advanced Wordlist

View Set

Chapter 21: PrepU - Complications Occurring Before Labor and Delivery

View Set

Marketing & Professional Sales Unit 3 Test Review

View Set

Introduction to project management week1

View Set