Unit 3

What are the two main ways of storing PL/SQL code in the Oracle database?

CREATE PROCEDURE and CREATE TRIGGER

Which ACID property ensures that any transaction will bring the database from one valid state to another?

Consistency

How is a data warehouse organized?

around important subject areas.

What is detailed data in a fact table called?

atomic data.

What is an example of a post-relational database that is NOT a NoSQL database? a. BigTable. b. CouchDB. c. Hive. d. SQLite.

d

What is the purpose of the HAVING clause?

to filter rows resulting from the GROUP BY clause.
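
A minimal sketch of the difference HAVING makes, using Python's built-in sqlite3 module. The orders table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 50), ("ann", 70), ("bob", 20), ("cy", 200)],
)

# WHERE would filter individual rows before grouping;
# HAVING filters the groups produced by GROUP BY.
having_rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 100
    ORDER BY customer
    """
).fetchall()
print(having_rows)
```

Only the groups whose total exceeds 100 survive; bob's group is dropped after aggregation, not before.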

How does authentication differ from access control?

Authentication ensures that the individual is who he or she claims to be, but says nothing about access rights of the individual

The degree to which the administration of a database is _______________ dictates the skills and personnel required to manage databases.

Automated

What are the 7 types of information security control?

1. Access control 2. Auditing 3. Authentication of users' identity and rights management 4. Encryption 5. Integrity controls 6. Backups 7. Application security

What are the 3 BASE guarantees?

1. Basic Availability 2. Soft-state 3. Eventual consistency

What are different ways of organizing and/or grouping documents?

1. Collection 2. Directory hierarchies 3. Non-visible Metadata 4. Tags

What is an example of network security controls?

Firewalls

What is the relationship between operational data and a data warehouse?

The operational data are used as a source for the data warehouse

How many queries can be nested in the WHERE clause of an outer query?

Unlimited number

Which of the following is the preferred way to recover a database after a transaction in progress terminates abnormally? a. Rollback b. Rollforward c. Switch to a duplicate database d. Reprocess transactions

a

What is a data warehouse?

a database system used for reporting and data analysis

What does the consistency property ensure?

any transaction will bring the database from one valid state to another

If both data and database administration exist in an organization, typically the database administrator is responsible for which one of the following? a. Data modeling b. Database design c. Metadata d. Stewardship

b

What are DBMS policies?

general statements of action that communicate and support DBA goals

What is data scrubbing?

improve the quality of data before it is moved into a data warehouse

An _______________________ database is a database management system that primarily relies on main memory for computer data storage.

in-memory

Generally, a star schema is composed of __________ fact table(s)?

one

The simultaneous use of multiple computer resources to solve a computation problem

parallel computing

The data warehouse is __________.

read only.

What do dimension tables commonly describe in a star schema?

relevant facts

a dimension that stores and manages both current and historical data over time in a data warehouse. It is considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records

slowly changing dimension

a data mart design that consists of one or more fact tables that reference multiple dimension tables.

star schema

What is eventual consistency?

system will eventually become consistent once it stops receiving input

After the database designers complete the logical design, what database design does a DBA typically create?

Physical

given system continues to operate even when there is partial data loss or temporary system failure or interruption

Partition Tolerance

What does poor data administration often lead to or cause?

Performance problems

a data warehouse that integrated data from one or more disparate sources to create a central repository of data for the entire enterprise

enterprise data warehouse

A decision support application that emphasizes access to and manipulation of a time series of internal company data and, sometimes, external data

data-driven decision support system

What programming language is used to write custom functions to perform the map and reduce operations?

JavaScript

The _________ operation allows the combining of two relations by merging pairs of tuples, one from each relation, into a single tuple.

Join

What relational operation causes two or more tables with a common domain to be combined into a single table?

Join

What are the Set operators represented in the Venn diagrams?

Joins

Which of the following is the simplest type of NoSQL database?

Key-value

What is the input to each of the two phases of the MapReduce algorithm?

Key-value pairs.

In a distributed data store, what plays a role similar to that of a schema in a relational database?

Keyspace - an object that holds together all column families of a design

How does a Left Outer Join differ from a Right Outer Join?

Left outer join - returns all records from the left table and the matched records from the right table. Right outer join - returns all records from the right table and the matched records from the left table.
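
A small sketch of the two outer joins, using Python's sqlite3 module with made-up customer and orders tables. Because older SQLite builds lack RIGHT OUTER JOIN, the right outer join is written here as a left outer join with the tables swapped, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 'book'), (11, 3, 'lamp');
    """
)

# customer LEFT OUTER JOIN orders: every customer, matched orders or NULL.
left_rows = conn.execute(
    "SELECT c.name, o.item FROM customer c "
    "LEFT OUTER JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
).fetchall()

# Equivalent of customer RIGHT OUTER JOIN orders: every order, matched customer or NULL.
right_rows = conn.execute(
    "SELECT c.name, o.item FROM orders o "
    "LEFT OUTER JOIN customer c ON o.customer_id = c.id ORDER BY o.id"
).fetchall()

print(left_rows)
print(right_rows)
```

Bob has no order, so he appears only in the left result; the lamp order has no customer, so it appears only in the right result.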

responsible for overall physical aspects of database administration

Operations DBAs

In what type of join are the rows that do not have matching values in common columns nonetheless included in the result table?

Outer Join

What type of join is needed when you wish to include rows that do not have matching values?

Outer join

What is the normal form of a Fact table in a Star schema?

Partially denormalized to speed query performance

What are four of Bill Inmon's characteristics of a data warehouse?

Subject Oriented, Integrated, Nonvolatile, Time Variant

What are four characteristics or features of a data warehouse?

Subject-oriented, Non-volatile, Integrated, Time-variant

Queries can be nested so that the results of one query can be used in another query via a relational operator or aggregation function. A nested query is also known as a _____________.

Subquery

Queries can be nested so that the results of one query can be used in another query. A nested query is also known as a _____________.

Subquery

What type of data in a data warehouse is never found in the operational database environment?

Summary.

In 1983, what company introduced a database management system specifically designed for decision support?

Teradata

What is the dimensional approach to data warehousing?

The data warehouse should be modeled using a dimensional model

What is the normalized approach to data warehousing?

The data warehouse should be modeled using a normalized model

What is data granularity or data grain?

The level of detail of data in a data warehouse

What does database transaction durability ensure in the event of a system failure?

a transaction is not lost once it has been committed

A database management system that primarily relies on main memory for computer data storage.

in-memory database

What kind of database is designed for data whose elements are interconnected with an undetermined number of relations between them?

Graph database

What databases store data as a collection of nodes, connected by edges?

Graph database system

The SQL ______________ clause includes a predicate used to filter rows resulting from the GROUP BY clause.

HAVING

If we want to select output rows based on the results of the group function, what clause would you use?

HAVING clause

What does the CAP theorem assert about a distributed, networked system?

It is impossible for a distributed system to simultaneously provide consistency, availability, and partition tolerance guarantees

What is JSON?

JSON (JavaScript Object Notation) is a lightweight data-interchange format.

What are the 3 desirable system requirements in the CAP theorem?

Availability, Consistency, Partition Tolerance

Which of the following is a wide-column store? a. Cassandra b. Riak c. MongoDB d. Redis

A

Which of the following statements is incorrect? a. Non or Post Relational databases require that schemas be defined before you can add data b. NoSQL databases are built to allow the insertion of data without a predefined schema c. NewSQL databases are built to allow the insertion of data without a predefined schema

A

A data warehouse administrator is concerned with which of the following? a. The amount of time needed to make a decision but not the typical roles of a database administrator b. The time to make a decision and the typical roles of a database administrator c. The typical roles of a data administrator and redesigning existing applications d. The typical roles of a database administrator and redesigning existing applications

B

In what clause(s) can you place a subquery? a. A subquery can be nested inside the IN or WHERE clauses of an outer SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery. b. A subquery can be nested inside the WHERE or HAVING clause of an outer SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery. c. A subquery can be nested only inside the WHERE or HAVING clause of an outer or inner SELECT subquery.

B

Which of the following is a NoSQL Database Type? a. SQL b. Document databases c. JSON d. All of the mentioned

B

What is the name of a proprietary (closed-source) database created by Google that uses Columns?

BigTable

optional, used to handle any errors that occur during the execution of the statements and commands in the executable section

Exception

The _________________________ is the simplest style of data mart schema. It consists of one or more fact tables referencing any number of dimension tables.

Star schema

ability to define a data warehouse by subject matter

Subject Oriented

What is the purpose of a query tool?

data retrieval

What are the four steps in the proactive cycle to improve security?

1) prevention 2) detection 3) analysis 4) control

What are three features of key value stores?

(1) a scalable architecture (2) a schema-free data structure (3) minimal delay and high performance processing.

With a _________________ database, application owners do not have to install and maintain the database on their own. Instead, a provider takes responsibility for installing and maintaining the database, and application owners pay according to their usage.

Cloud

What are the four types of NoSQL data models?

Column, Document, Graph, Key Value

the process of ongoing risk assessment

Compliance program

written instructions that describe a series of steps to be followed during the performance of a given activity

DBMS procedures

In the bottom-up approach to data warehouse design, ___________________ are first created to provide reporting and analytical capabilities for specific business processes.

Data Marts

A central, integrated repository for historical, enterprise data from multiple database systems. Designed to support a broad range of decision tasks in a specific organization

Data Warehouse

__________ is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions.

Data Warehouse

What is a typical hierarchy for database administration?

Data analysts/query designers → Junior DBA → Mid-level DBAs → Senior DBAs → DBA consultants → Manager/Director of Database Administration/Information Technology

What types of data are stored in a data warehouse?

Historical, Derived, Metadata, Structured

What functions can return the results as a document, or may write the results to collections?

MapReduce

What technology created by Google is the foundation for Hadoop?

MapReduce.

What are two approaches to scaling up (out) a NoSQL database?

Master-slave and Sharding.

What term is used for descriptions of the data contained in the data warehouse?

Metadata

The concept of ___________ was introduced into SQL to handle "missing data" in the relational model.

NULL
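
A brief sketch of how NULL behaves, run through Python's sqlite3 module. The reading table is hypothetical; the point is that NULL is not equal to anything, including itself, and that aggregate functions skip it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE reading (sensor TEXT, value REAL);
    INSERT INTO reading VALUES ('a', 1.5), ('b', NULL);
    """
)

# NULL = NULL evaluates to NULL (unknown), so IS NULL is the test to use.
null_equals_null, null_is_null = conn.execute(
    "SELECT NULL = NULL, NULL IS NULL"
).fetchone()

# COUNT(column) ignores NULLs; COUNT(*) counts every row.
count_value, count_all = conn.execute(
    "SELECT COUNT(value), COUNT(*) FROM reading"
).fetchone()

print(null_equals_null, null_is_null, count_value, count_all)
```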

What relational algebra operator is called "bowtie" because of the symbol used to represent it in notation?

Natural Join

What databases provide eventual consistency?

NewSQL databases provide this feature

What type of database provides a mechanism for storage and retrieval of data that use less structured consistency models than traditional relational database?

NoSQL

What type of subquery is executed before the outer query and is executed only once?

Non-correlated type 1 subquery

once entered into the data warehouse, it should not change

Nonvolatile

Data is stored, retrieved and updated in a ____________ environment.

OLTP

What type of relationship exists between a dimension table and the fact table in a star schema?

One-to-many

What type of DBA focuses on physical aspects of database administration like installation and configuration?

Operational DBA

What do we call the technical function responsible for database design, security, and disaster recovery?

Operational Database Administrator

As database administration automation increases, what is the impact on personnel needs?

The personnel needs of the organization split into two categories: 1. Highly skilled workers to create and manage the automation 2. A group of less skilled "line" DBAs who execute and monitor the automation

What does soft state mean in a NoSQL database?

The state of the system could change over time

data warehouse's focus on change over time

Time Variant

Why is concurrency control important?

To ensure data integrity when updates occur to the database in a multiuser environment

Why does every data structure in the data warehouse contain the time element or dimension?

To give context to fact data

What is a discrete unit of work called that must be processed completely or not at all in an operational system?

Transaction

What is it called when changes to the database are recorded in a journal file to facilitate automatic recovery of an in-memory database?

Transaction logging

What is the purpose of backups? a. Provide a structure for the storage and recovery of data. b. Recover data after it is lost by data deletion or corruption. c. Recover data from an earlier time, according to a user-defined data retention policy. d. B and C.

d

What is {FirstName:"Bob", Address:"5 Oak St.", Hobby:"sailing"}?

document
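
A short sketch of treating such a document as data, using Python's json module. Strict JSON requires the keys to be quoted, so the card's text is rewritten that way here; the second document and the list standing in for a collection are illustrative assumptions.

```python
import json

# The card's document, written as strict JSON (quoted keys).
raw = '{"FirstName": "Bob", "Address": "5 Oak St.", "Hobby": "sailing"}'
bob = json.loads(raw)

# Documents in one collection need not share the same fields (no fixed schema).
people = [bob, {"FirstName": "Eve", "Email": "eve@example.com"}]

# Retrieval based on content: find documents whose Hobby is "sailing".
sailors = [d["FirstName"] for d in people if d.get("Hobby") == "sailing"]

# Serializing and parsing again yields an equal document.
round_trip = json.loads(json.dumps(bob))

print(sailors)
```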

In what environment can data be updated?

Operational.

What database offers an API or query language that allows a user to retrieve data based on content?

Document-oriented database

What are the four major post-relational data models?

1. Column store 2. Key-value stores 3. Document stores 4. Graph database systems

What are three benefits of a data warehouse?

1. Consolidate data from multiple sources 2. Maintain data history 3. Improve turnaround time for data access and reporting

What are 2 benefits of a data warehouse?

1. Consolidate data from multiple sources into a single database for querying 2. Maintain data history, even if source transaction systems do not

What are the disadvantages of using a Star schema?

1. Data integrity is not enforced 2. Data redundancy 3. Insert and update anomalies

What are the sub-parts of PL/SQL programs?

1. Header 2. Declaration 3. Executable 4. Exception

What are the four attributes of a data warehouse specified by Inmon?

1. Integrated 2. Nonvolatile 3. Subject-Oriented 4. Time Variant

What are three limitations of a data warehouse?

1. Lack of data 2. Poor data quality 3. Creates unrealistic expectations for users

What are the 3 types of DBAs?

1. Operational 2. Development 3. Application

What are phases in the database life cycle?

1. Requirements analysis 2. Logical and Physical Design 3. Operation 4. Maintenance

What are major advantages/benefits of creating a star schema for a data warehouse?

1. Simpler queries 2. Simplified business reporting logic 3. Query performance gains 4. Fast aggregations 5. Feeding cubes

What are 3 motivations for using the NoSQL approach to storing data?

1. Simplicity of design 2. Horizontal scaling 3. Finer control over availability

What are 2 distinct purposes of backups?

1. To recover data from an earlier time 2. Recover after a data loss event

What are two primary methods to run a database on the cloud?

1. Virtual machine Image 2. Database as a service

What makes databases especially challenging to implement and manage? (2)

1. You can't guarantee the data will be used efficiently 2. Work is complex

What are the main disadvantages of the dimensional approach of Ralph Kimball?

1. Loading the data warehouse with data from different operational systems is complicated 2. It is difficult to modify the data warehouse structure

What are major problems with NoSQL databases?

1. No transaction consistency 2. Not designed with security features similar to RDBMS 3. Must implement a security layer 4. Data access involves programming

What is a common time horizon for data stored in a Data warehouse?

6 or more years.

How are data administration and database administration different?

A data administrator focuses on high-level concerns and the standards implemented for the data, while a database administrator deals with the more physical characteristics and technical issues.

What is SQL?

A declarative or non procedural set-oriented query language.

How does a procedure differ from a function?

A function returns a value and a procedure only executes commands.

What is data administration?

A high-level function responsible for the overall management of data resources in an organization

What is On-line Analytical Processing (OLAP)?

Software for manipulating multidimensional data from a variety of sources

What test is used to assess an online transaction processing environment?

ACID.

selective restriction of access to data by users

Access control

used to regulate who or what can view or use resources in a computing environment

Access control

What is the UNION operator?

An SQL operator that allows you to stack datasets

An open-source software framework that supports data intensive distributed applications. It supports parallel running of applications on large clusters of commodity hardware. It derives from Google's MapReduce and Google File System (GFS) papers.

Apache Hadoop

manage third party application components that interact with the database

Application DBAs

What property requires that each transaction is "all or nothing"?

Atomicity

What are the four ACID guarantees for transactions in a database?

Atomicity, Consistency, Isolation, Durability

monitoring and recording of selected user database actions

Auditing

the process of verifying who you are

Authentication

What is the difference between an authentication and authorization?

Authentication - process of verifying who you are. Authorization - process of verifying that you have access to something.

Designers of distributed data stores have increased what feature at the expense of Consistency?

Availability

given system is available when needed

Availability

How does business intelligence differ from data warehousing?

BI - the tools and processes for retrieving and analyzing data to meet management information needs. Data warehousing - the design and maintenance of a historical database to meet the business intelligence requirements of a specific organization.

Who championed the Normalization approach?

Bill Inmon

What two processes ensure that distributed databases remain up-to-date and current?

Duplication and Replication.

Three desirable system requirements for the successful design, implementation, and deployment of applications in distributed computing systems. Attaining all three at once is, however, not possible.

CAP theorem

What SQL statement returns varying results based upon the evaluation of expressions?

CASE

________________ causes all data changes in a transaction to be made permanent.

COMMIT

What SQL statements are used in large, multiuser database systems to control transactions, i.e., sequences of changes to a database?

COMMIT and ROLLBACK
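
A minimal sketch of COMMIT and ROLLBACK controlling a two-step transfer, using Python's sqlite3 module. The account table, the transfer function, and the simulated failure are illustrative assumptions, not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(conn, amount, fail=False):
    """Move money from 'a' to 'b' as one all-or-nothing unit of work."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = 'a'", (amount,)
        )
        if fail:
            raise RuntimeError("simulated failure mid-transaction")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = 'b'", (amount,)
        )
        conn.commit()      # make both changes permanent
    except RuntimeError:
        conn.rollback()    # undo the partial change


def balances(conn):
    return conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall()


transfer(conn, 30, fail=True)
after_rollback = balances(conn)
transfer(conn, 30)
after_commit = balances(conn)
print(after_rollback, after_commit)
```

The failed transfer leaves both balances untouched, which is the atomicity ("all or nothing") behavior described elsewhere in this set.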

Computer aided software engineering

Case Tools

What types of data stores are used to store information about networks, such as social connections?

Graph

Authentication means ______________ of someone (a user, device, or other entity) who wants to use data, resources, or applications.

Confirming the identity

A ____________ dimension has the same values for all areas of a business; it is a dimension that has the same meaning to every fact with which it relates.

Conformed

predictability and reliability of data in a database across all nodes.

Consistency

What type of subquery is nested inside another outer query from which it uses values?

Correlated Type 2 subquery

A _____________________ is a subquery that uses values from the outer query. The subquery is evaluated once for each row processed by the outer query.

Correlated subquery - Type 2
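
A sketch contrasting the two subquery types, using Python's sqlite3 module and a made-up emp table. Conceptually, the non-correlated (Type 1) subquery can be evaluated once, while the correlated (Type 2) subquery refers to the outer row and is evaluated per row; how a particular engine actually executes either is up to its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO emp VALUES
        ('Ann', 'IT', 90), ('Bob', 'IT', 60),
        ('Cy', 'HR', 50), ('Di', 'HR', 64);
    """
)

# Type 1 (non-correlated): the inner query does not mention the outer row.
above_overall = conn.execute(
    "SELECT name FROM emp "
    "WHERE salary > (SELECT AVG(salary) FROM emp) ORDER BY name"
).fetchall()

# Type 2 (correlated): the inner query uses e.dept from the outer row.
above_dept = conn.execute(
    "SELECT name FROM emp e "
    "WHERE salary > (SELECT AVG(salary) FROM emp WHERE dept = e.dept) "
    "ORDER BY name"
).fetchall()

print(above_overall, above_dept)
```

Di is below the company-wide average of 66 but above the HR average of 57, so she appears only in the correlated result.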

The UNION compatible rule means the tables have same number of columns and _____________.

Corresponding columns have identical data types and lengths

What data from the operational environment is most commonly extracted and loaded into a data warehouse?

Current detail data

What is an advantage of using a subquery rather than a query with DISTINCT when both can yield the correct answer? a. A query will run more efficiently if you use the subquery to eliminate duplicates b. Use of subqueries reduces the hierarchy found in execution which can be useful c. Using subqueries makes it easier to read and understand the processing d. a and c e. All of the above.

D

Store the description of all objects that interact with the database

Data dictionary

What are common database administration tools? (2)

Data dictionary and Case tools

After a DBMS is purchased, who has primary responsibility for installation and maintenance?

Database Administrator

in general, who determines the access privileges for a user and enters the appropriate authorization rules in the DBMS catalog to ensure users only access a database in appropriate ways?

Database Administrator

What are controls called that are designed to restrict access and activities?

Database access control

What do we call the function of managing and maintaining database management systems (DBMS) software?

Database administration

What does DSS mean in a data warehouse context?

Decision Support System.

an optional section of the code block, contains the name of the local objects that will be used in the code block

Declaration

What are the two basic functions of a query optimizer?

Determine Join order and Join method.

What type of DBA is responsible for data model design and maintenance and DDL Generation?

Development DBA

focus on logical and development aspects of database administration

Development DBAs

Why are SQL databases not natively suited to a cloud environment?

Difficult to scale. Not built to service heavy read/write loads and not able to scale up and down easily

What is the difference between a dimension and a fact in a star schema?

Dimension - qualifying characteristics that provide additional perspectives to a given fact. Fact - numeric measurements (values) that represent a specific business aspect or activity.
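
A toy star schema, sketched with Python's sqlite3 module, to show how a central fact table of numeric measurements is qualified by dimension tables. The table names, columns, and rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes used to slice the facts.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);

    -- Fact table: numeric measurements plus a foreign key to each dimension
    -- (one dimension row relates to many fact rows).
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product (product_id),
        date_id INTEGER REFERENCES dim_date (date_id),
        units INTEGER,
        revenue REAL
    );

    INSERT INTO dim_product VALUES (1, 'toys'), (2, 'books');
    INSERT INTO dim_date VALUES (1, 2023), (2, 2024);
    INSERT INTO fact_sales VALUES (1, 1, 2, 20.0), (1, 2, 1, 10.0), (2, 2, 3, 45.0);
    """
)

sales_by_category = conn.execute(
    """
    SELECT p.category, d.year, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d ON d.date_id = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
    """
).fetchall()
print(sales_by_category)
```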

What are the two leading approaches to storing data in a data warehouse?

Dimensional approach and the Normalized approach

What is the intention of DBA automation?

Enables DBAs to focus on proactive activities like performance and service level management

conversion of data into a coded format

Encryption

An ___________________ is a unified database that holds all the business information of an organization.

Enterprise data warehouse

only mandatory section, contains the statements that will be executed

Executable

What does the acronym EIS commonly mean?

Executive information system

What is ETL?

Extract, Transform and Load actions to populate a data warehouse

What company originally developed the NoSQL database Apache Cassandra?

Facebook

A _________________ is a value or measurement about a specific event.

Fact

What is the term used in data warehousing for a value or measurement?

Fact

In data warehousing, a _____________ table consists of the measurements or metrics of a business process.

Fact table

What is an open-source DBMS?

Free or nearly free database software whose source code is publicly available

A join that combines the effect of applying both Left and Right Outer Joins.

Full outer join

What company originally developed the NoSQL database BigTable?

Google

Apache _____________________ is an open-source software framework that supports data intensive distributed applications. It supports parallel running of applications on large clusters of commodity hardware. It derives from Google's MapReduce and Google File System (GFS) papers.

Hadoop

the optional first section of the code block, used to identify the type of code block and its name

Header

What are the primary purposes of a database?

Help people record and keep track of facts, objects and things

What type of join query will return all of the records in the left table (table A) that have a matching record in the right table (table B)?

INNER JOIN

Which of the following is NOT a property of database transactions?

Identifiable

What is the UNION compatible rule?

If the relations have the same number of attributes and each attribute is from the same domain.

When are two relations union-compatible?

If they have the same number of attributes and each attribute is from the same domain

When are main memory databases often used?

In applications where response time is critical

putting data from disparate sources into a consistent format

Integrated

assuring the accuracy and consistency of data over its entire life-cycle

Integrity controls

The _______ operator takes the results of two queries and returns only rows that appear in both result sets.

Intersect

The _______________ operator takes the results of two queries and returns only rows that appear in both result sets.

Intersect

What is a DBA's Technical Role?

Physical database design and dealing with technical issues

What is the meaning of PL/SQL?

Procedural Language/Structured Query Language

What is the purpose of the MAP function in the MapReduce Framework?

Processes one or more "chunks" of data and produces the output result. Divides problems into subproblems.
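
A single-process sketch of the MapReduce idea in plain Python, counting words. The function names and the explicit shuffle step are illustrative assumptions; a real framework such as Hadoop distributes these phases across many machines.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) key-value pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all values emitted by the mappers under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


chunks = ["to be or not", "to be"]  # each chunk could go to a different worker
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(word_counts)
```

Both phases consume key-value pairs, which matches the card above about the input to each phase.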

What are the 3 fundamental Relational Algebra operators?

Projection (π) Selection (σ) Natural join (⋈)

Who championed the Dimensional approach?

Ralph Kimball

What is the purpose of backups?

Recover data after it is lost by data deletion or corruption and Recover data from an earlier time, according to a user-defined data retention policy.

Most NoSQL databases support automatic __________, meaning that you get high availability and disaster recovery

Replication

What join returns records from the right table that have no matching key in the left table in the result set?

Right outer join

How does rollback differ from roll forward in database recovery?

Rollback - undoes the changes made by a transaction, returning the database to its state at the beginning of that transaction. Rollforward - redoes the changes made by a transaction.

"Sharding" a database across many server instances can be achieved with:

SAN

In what clauses can you place a subquery? (4)

SELECT, FROM, WHERE, and HAVING

SELECT E.EmployeeID, E.EmployeeName, M.EmployeeName AS Manager FROM Employee_T E, Employee_T M WHERE E.EmployeeSupervisor = M.EmployeeID; What type of join is used in the above query?

SELF JOIN
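
The query on this card can be tried out with Python's sqlite3 module. The three sample rows are made up, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employee_T (
        EmployeeID INTEGER PRIMARY KEY,
        EmployeeName TEXT,
        EmployeeSupervisor INTEGER
    );
    INSERT INTO Employee_T VALUES (1, 'Rosa', NULL), (2, 'Sam', 1), (3, 'Tia', 1);
    """
)

# The same table appears twice under two aliases: E for employees, M for managers.
self_join_rows = conn.execute(
    """
    SELECT E.EmployeeID, E.EmployeeName, M.EmployeeName AS Manager
    FROM Employee_T E, Employee_T M
    WHERE E.EmployeeSupervisor = M.EmployeeID
    ORDER BY E.EmployeeID
    """
).fetchall()
print(self_join_rows)
```

Rosa has no supervisor (NULL), so she matches no manager row and is absent from the result.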

What are conditional CASE expressions?

SQL statement that handles If/Then logic.
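
A small sketch of a CASE expression doing If/Then logic inside a SELECT, run with Python's sqlite3 module. The product table and the price bands are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (name TEXT, price REAL);
    INSERT INTO product VALUES ('pen', 2), ('lamp', 40), ('desk', 300);
    """
)

# The WHEN branches are tested in order; the first true branch supplies the value.
case_rows = conn.execute(
    """
    SELECT name,
           CASE
               WHEN price < 10 THEN 'low'
               WHEN price < 100 THEN 'mid'
               ELSE 'high'
           END AS band
    FROM product
    ORDER BY name
    """
).fetchall()
print(case_rows)
```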

a test where testers attempt to find security vulnerabilities that could be used to defeat or bypass security controls, break into the database, compromise the system, etc.

Security vulnerability assessment

What is the Relational Algebra notation for UNION, INTERSECTION, and MINUS Set Operators?

Union: A ∪ B; Intersection: A ∩ B; Minus: A − B
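
A sketch of the three set operators in SQL, run through Python's sqlite3 module on two tiny made-up tables. SQLite spells the difference operator EXCEPT, as standard SQL does; Oracle calls it MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (3), (4);
    """
)


def run(operator):
    """Combine the two union-compatible queries with the given set operator."""
    sql = f"SELECT x FROM a {operator} SELECT x FROM b ORDER BY x"
    return [row[0] for row in conn.execute(sql)]


union_rows = run("UNION")          # A ∪ B
intersect_rows = run("INTERSECT")  # A ∩ B
minus_rows = run("EXCEPT")         # A − B
print(union_rows, intersect_rows, minus_rows)
```

Both SELECTs return one integer column, so they satisfy the union-compatibility rule covered elsewhere in this set.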

In what type of join are all columns from each table that is joined, and an instance for each row of each table?

Union Join

NoSQL databases are used mainly for handling large volumes of ______________ data.

Unstructured

What is it called when a SELECT query has been given a name and saved in the Database?

User view
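
A minimal sketch of a named, saved SELECT query, created with CREATE VIEW through Python's sqlite3 module. The emp table and the it_staff view name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (name TEXT, dept TEXT);
    INSERT INTO emp VALUES ('Ann', 'IT'), ('Cy', 'HR');

    -- The view stores the query, not a copy of the data.
    CREATE VIEW it_staff AS SELECT name FROM emp WHERE dept = 'IT';
    """
)


def view_names():
    return [row[0] for row in conn.execute("SELECT name FROM it_staff ORDER BY name")]


before = view_names()
conn.execute("INSERT INTO emp VALUES ('Bob', 'IT')")
after = view_names()  # the view reflects the new base-table row
print(before, after)
```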

How does a security vulnerability assessment differ from a compliance program?

Vulnerability assessment is a preliminary procedure to determine risk, whereas a compliance program is the process of ongoing risk assessment

One technique for evaluating database security involves performing _________________________________ or penetration tests against the database.

Vulnerability assessments

A set of theories, methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information for business purposes.

business intelligence

A user friendly Business Intelligence tool that helps a person write SQL queries.

ad hoc query tool

is the result of a computation made with an aggregate function. Aggregate functions compute a single result value from a set of input values.

aggregate

An interactive, methodical exploration of data with emphasis on statistical analysis

analytics

Which of the following statements is true of a data warehouse? a. Can be updated by end users. b. Contains numerous naming conventions and formats. c. Organized around important subject areas. d. Contains only current data.

c

Where the before-images are applied to the database

backward recovery

What does data transformation include?

change data from a detailed level to a summary level

How is isolation of transactions achieved?

concurrency control

a dimension that has a single meaning and content throughout a data warehouse. A conformed dimension can be used in any star schema.

conformed dimensions

Approximately how many steps are involved in the process of creating a new database?

consist of hundreds or thousands

The UNION compatible rule means the tables have same number of columns and ____________________?

corresponding columns have identical data types and lengths

What are characteristics of an active data warehouse architecture? a. at least one data mart. b. data that has been extracted from multiple internal and external sources. c. near real-time data updates. d. all of the above.

d

What is SQL? a. A declarative or nonprocedural query language. b. An imperative language that specifies an explicit sequences of steps to follow. c. A set-oriented language. d. A and C. e. All of the above.

d

What tasks are Business Intelligence and data warehousing systems used for in an organization? a. Forecasting. b. Reporting. c. Analysis of large volumes of product sales data. d. All of the above.

d

Which of the following statements is true? a. Documents can contain many different key-value pairs, or key-array pairs, or even nested documents b. MongoDB can link to only proprietary programming languages and development environments c. When compared to relational databases, NoSQL databases are more scalable and provide superior performance d. a and c e. All of the above.

d

A small database that serves a different user. A data warehouse can be divided into separate ____

data mart

What is the main organizational justification for implementing a data warehouse?

decision support

A set of related computer programs and the data required to assist with analysis and decision making within an organization

decision support system

If you don't have enough information available when you design a query to determine which rows you want, what would you do?

do any joins, use wild card * and do not use filters to examine the result set

What is an XML database? a. A category of NoSQL databases. b. A database that allows data to be specified in eXtensible Markup Language format. c. A type of document-oriented database. d. A and B. e. All of the above.

e

Which of the following topics are part of an administrative policy to secure a database? a. Authentication policies b. Limiting particular areas within a building to only authorized people c. Backup procedures d. A and C. e. All of the above.

e

Who is responsible for running queries and reports against data warehouse tables? a. DBA. b. Software applications. c. End users. d. Database analysts. e. C and D

e

capturing a subset of the data contained in various operational systems

extract process

What is the central table in a star schema?

fact table

Information and the complementary networks of hardware and software that people and organizations use to collect, filter, process, create and distribute data.

informational system

What is changing the database design to improve the performance called?

tuning the design

An operational system is _____________.

used to run the business in real time and is based on current data.

process of identifying, quantifying, and prioritizing the vulnerabilities in a system.

vulnerability assessment

