Database Management Final
The grain of a data warehouse indicates the size and depth of the records.
False
The target market for Hadoop is small to medium companies using local area networks.
False
Transient data are never changed.
False
When systems are automatically generated and maintained, quality is diminished.
False
Which of the following finds all groups meeting stated conditions?
Having
The Hadoop framework consists of the ________ algorithm to solve large scale problems
MapReduce
Why are packaged data models gaining popularity?
Most organizations can no longer afford to have data models developed in house, both in terms of labor costs as well as time. As such, the role of data modeler is evolving from artisan to that of engineer. A packaged data model allows one to customize an organization's database needs using a superset of the model needed by the organization.
In the figure below, which of the following is a subtype of patient?
Outpatient
OLAP tools that use the database as a traditional relational database are called:
ROLAP tools
What does the following SQL statement do? Select * From Customer Where Cust_Type = "Best"
Selects all the fields from the Customer table for each row with a customer labeled "Best"
Which of the following is a generic entity type that has a relationship with one or more subtypes?
Supertype
EXISTS will take a value of ________ if the subquery returns an intermediate results table which contains one or more rows.
TRUE
Provide a brief overview of the various components of the database environment.
The components of the database environment make up of: Data Modeling and Design - automated tools used to design databases / applications User Interface - is what the users views Applicaton Programs - interact with database Repository - data stored in central knowledge base Database Mgmt System - software to manage databases Users are database admins, system developers, and end users
A data model is a graphical system used to capture the nature and relationships among data.
True
A data quality audit helps an organization understand the extent and nature of data quality problems.
True
A database table is defined using the data definition language (DDL).
True
A relational database establishes the relationships between entities by means of a common field.
True
An event is a database action that results from a transaction.
True
An independent data mart is filled with data extracted from the operational environment without the benefit of a data warehouse.
True
An operational data store is typically a relational database and normalized, but it is tuned for decision-making applications.
True
Application integration is achieved by coordinating the flow of event information between business applications.
True
Big data databases tend to sacrifice consistency for availability.
True
Constraints are a special case of triggers.
True
Development starts from scratch with the traditional file processing approach because new file formats, descriptions, and file access logic must be designed for each new program.
True
ETL is short for Extract, Transform, Load.
True
Expressions are mathematical manipulations of data in a table that may be included as part of the SELECT statement.
True
File processing systems have been replaced by database systems in most critical business applications today.
True
Grain and duration have a direct impact on the size of fact tables.
True
Graph-oriented databases are designed to maintain information regarding the relationships between data items.
True
In an E-R diagram, an associative entity is represented by a rounded rectangle.
True
In order to find out what customers have not placed an order for a particular item, one might use the NOT qualifier along with the IN qualifier.
True
In the figure shown below, there could be an instance of a rental unit that is neither an apartment nor a house.
True
Informational systems are designed to support decision making based on historical point-in-time and prediction data.
True
It is easy to miss the opportunity to visualize future requirements shown in the full data model when using a packaged data model.
True
Loading data into the warehouse typically means appending new rows to tables in the warehouse as well as updating existing rows with new data.
True
Medical claims and pharmaceutical data would be an example of big data.
True
NoSQL databases DO NOT support ACID (atomicity, consistency, isolation, and durability).
True
One of the original purposes of the SQL standard was to provide a vehicle for portability of database definition and application modules between conforming DBMSs.
True
Prototyping is a type of rapid application development.
True
Reconciled data are data that have been selected, formatted, and aggregated for end-user decision support applications.
True
Redundancy increases the risk of inconsistent data
True
Separation of metadata from application programs that use the data is called data independence.
True
The 'schema on read' approach often incorporates JSON or XML.
True
The FROM clause is the first statement processed in an SQL command.
True
The data reconciliation process is responsible for transforming operational data to reconciled data.
True
The enterprise data model controls the phased evolution of the data warehouse.
True
The following query totals sales for each salesperson.
True
The major advantage of the data propagation approach to data integration is the near real-time cascading of data changes throughout the organization.
True
The three major types of analytics are: descriptive, predictive, and prescriptive.
True
When creating tables, it's important to decide which columns will allow null values before the table is created.
True
YARN, also called MapReduce 2.0, is like a traffic cop that controls the allocation of resources in a system.
True
________ includes NoSQL accommodation of various data types.
Variety
________ includes concern about data quality issues.
Veracity
In an SQL statement, which of the following parts states the conditions for row selection?
Where
________ are not used for querying and analyzing data stored in data warehouses.
Word processing programs
Given a table named store with 5 fields: store_id, address, city, state, zipcode, why would the following insert command not work? insert into store values ('234 Park Street')
You must specify the fields to insert if you are only inserting some of the fields.
An entity cluster is:
a set of one or more entity types and associated relationships grouped into a single abstract entity type.
The following figure shows an example of:
a strong entity and its associated weak entity.
The SDLC phase in which every data attribute is defined, every category of data is listed and every business relationship between data entities is defined is called the ________ phase.
analysis
At a basic level, analytics refers to:
analysis and interpretation of data.
A procedure is:
called by name.
A ________ specifies the number of instances of one entity that can be associated with each instance of another entity.
cardinality constraint
A join operation:
causes two tables with a common domain to be combined into a single table or view.
The UNION clause is used to:
combine the output from multiple queries into a single result table.
A rule that CANNOT be violated by database users is called a:
constraint
Including data capture controls (i.e., dropdown lists) helps reduce ________ deteriorated data problems.
data entry
All of the following are some beneficial applications for real-time data warehousing EXCEPT:
data entry.
When we consider data in the data warehouse to be time-variant, we mean:
data in the warehouse contain a time dimension so that they may be used to study trends and changes.
A data mart is a(n):
data warehouse that is limited in scope.
Allowing users to dive deeper into the view of data with online analytical processing (OLAP) is an important part of:
descriptive analytics
To eliminate duplicate rows in a query, the ________ qualifier is used in the SQL Select command.
distinct
Going from a summary view to progressively lower levels of detail is called data:
drill-down.
Any create command may be reversed by using a(n) ________ command.
drop
The following code is an example of a(n):
equi-join.
Using a packaged data model, projects take less time and cost because:
essential components and structures are already defined.
A researcher trying to explain why sales of garden supplies in Hawaii have decreased would be an example of ________ data mining.
explanatory
The goal of data mining related to analyzing data for unexpected relationships is:
exploratory
Using an outer join produces this information: rows that do not have matching values in common columns are not included in the result table.
false
Program-data dependence is caused by:
file descriptions being stored in each database application.
Embedded SQL consists of:
hard-coded SQL statements included in a program written in another language.
A dependent data mart:
is filled exclusively from the enterprise data warehouse with reconciled data
Dynamic SQL:
is used to generate appropriate SQL code on the fly as an application is processing.
Informational and operational data differ in all of the following ways EXCEPT:
level of detail.
A student can attend five classes, each with a different professor. Each professor has 30 students. The relationship of students to professors is a ________ relationship.
many-to-many
A join that is based upon equality between values in two common columns with the same name and where one duplicate column has been removed is called a(n):
natural join.
The use of a set of graphical tools that provides users with multidimensional views of their data is called:
on-line analytical processing (OLAP).
The following code would include:
only rows that match both Customer_T and Order_T Tables.
A join in which rows that do not have matching values in common columns are still included in the result table is called a(n):
outer join.
Rotating the view of a multidimensional database for a particular data point is called data:
pivoting
Application of statistical and computational methods to predict data events is:
predictive analytics.
Descriptive, predictive, and ________ are the three main types of analytics.
prescriptive
All of the following are properties of metadata EXCEPT:
processing logic.
All of the following are primary purposes of a database management system (DBMS) EXCEPT:
providing an integrated development environment.
Event-driven propagation:
pushes data to duplicate sites as an event occurs.
The major advantage of data propagation is:
real-time cascading of data changes throughout the organization.
NoSQL systems allow ________ by incorporating commodity servers that can be easily added to the architectural solution.
scaling out
When reporting and analysis organization of the data is determined when the data is used is called a:
schema on read.
NoSQL systems enable automated ________ to allow distribution of the data among multiple nodes to allow servers to operate independently on the data located on it.
sharding
An expanded version of a star schema in which all of the tables are fully normalized is called a(n):
snowflake schema.
A view may not be updated directly if it contains:
the HAVING clause.
One characteristic of independent data marts is complexity for end users when they need to access data in separate data marts. This complexity is caused by not only having to access data from separate databases, but also from:
the possibility of a new generation of inconsistent data systems, the data marts themselves.
External data sources present problems for data quality because:
there is a lack of control over data quality.
One characteristic of quality data which pertains to the expectation for the time between when data are expected and when they are available for use is:
timeliness
The ________ rule specifies that each entity instance of the supertype must be a member of some subtype in the relationship.
total specialization
In packaged data models, all subtype/supertype relationships follow the ________ and ________ rules.
total specialization; overlap
A named set of SQL statements that are considered when a data modification occurs are called:
triggers
Languages, menus, and other facilities by which users interact with the database are collectively called a(n):
user interface.
Although volume, variety, and velocity are considered the initial three v dimensions, two additional Vs of big data were added and include:
veracity and value
Factless fact tables may apply when:
we are deleting correlated data.
A relatively small team of people who collaborate on the same project is called a:
workgroup
What will be returned when the following SQL query is executed?
A listing of all drivers who made more than 2 deliveries as well as a count of the number of deliveries
Which of the following statements is true about the figure shown below?
A rental unit can be an apartment, house or just a rental unit; it may not be more than one at the same time.
In which of the following situations would one have to use an outer join in order to obtain the desired results?
A report is desired that lists all customers and the total of their orders during the most recent month, and includes customers who did not place an order during the month (their total will be zero).
Which of the following advances in information systems contributed to the emergence of data warehousing?
Advances in middleware products that enabled enterprise database connectivity across heterogeneous platforms.
Which of the following are key steps in a data quality program?
Apply TQM principles and practices.
All of the following are new data types added in SQL:2008 EXCEPT:
BIT
Which type of index is commonly used in data warehousing environments?
Bit-mapped index
Which of the following counts ONLY rows that contain a value?
Count
The ________ DBA view shows information about all users of the database in Oracle.
DBA_USERS
________ duplicates data across databases.
Data propagation
The main concept of relational databases was published in 1970 by:
E.F.Codd.
________ do NOT concentrate on determining the requirements for the database component of an information system
End Users
A business rule is a statement of how a policy is enforced or conducted.
False
A function has only input parameters but can return multiple values.
False
Adapting a packaged data model from your DBMS vendor makes it difficult for the application to work with other applications from the same vendor.
False
After the extract, transform, and load is done on data, the data warehouse is never fully normalized.
False
An operational data store (ODS) is not designed for use by operational users.
False
Characteristics of the structure of the database are generally changed during the implementation phase of the database development process.
False
Completeness means that all data that must have a value does not have a value.
False
Count( *) tallies only those rows that contain a value, while Count counts all rows.
False
DCL is used to update the database with new records.
False
Data transformation is not an important part of the data reconciliation process.
False
Database processing programs are coded and tested during the design stage of the systems development life cycle.
False
EXISTS takes a value of false if the subquery returns an intermediate result set.
False
In the figure below, Name would be an ideal identifier.
False
In the figure shown below, a rental unit can be both an apartment and a house but must be at least one.
False
NoSQL is a great technology for storing well-structured data.
False
Packaged data models use an entity type to store union data.
False
Repositories are always used in file processing systems.
False
Specifying the attribute names in the SELECT statement will make it easier to find errors in queries and also correct for problems that may occur in the base system.
False
The CREATE SCHEMA DDL command is used to create a table.
False
The data that you are interested in capturing about an entity is called an instance.
False
The following query will execute without errors.
False