Database Final

¡Supera tus tareas y exámenes ahora con Quizwiz!

How many duplicate values are allowed in attribute which is a primary key?

0

XML can have this many root elements

1

max number of clustered indexes per table is...

1

Common HDD platter rotation speed is

5400, 7200, 10000 or 15000 RPM

Create View SQL creates

A virtual table without copying data

RDBMSs support the following property

ACID

INNER JOINs are used when we want to keep

All matching tuples of both tables

FULL JOINs are used when we want to keep

All tuples from both tables, regardless of matching tuples

LEFT JOINs are used when we want to keep

All tuples of table on the LEFT, regardless of matching tuples

RIGHT JOINs are used when we want to keep

All tuples of table on the RIGHT, regardless of matching tuples

ACID stands for

Atomicity, Consistency, Isolation, Durability

What is the correct ordering of record, disk block, attribute value?

Attribute value < record < disk block

NoSQL supports the following property

BASE

NoSQL BASE stands for

Basically available, soft state, eventual consistency

____ specify structure of XML document

Both DTD and XSD

SQL consists of

DDL, DML, DCL and TCL

What concept specifies the various data items, their characteristics, and relationships, constraints, storage details

Database model

DTD stands for

Document Type Definition

XML stands for

Extensible Markup Language

XSLT stands for

Extensible stylesheet language transformation

ER Diagrams are used for archiving purposes only (T/F)

False

Which of the following statements is correct? a. DataNodes in HDFS store a registry of metadata. b. The HDFS NameNode sends regular heartbeat messages to its DataNodes. c. HDFS is composed of a NameNode, DataNodes, and an optional SecondaryNameNode. d. Both the SecondaryNameNode and primary NameNode can simultaneously handle requests from clients.

HDFS is composed of a NameNode, DataNodes, and an optional SecondaryNameNode.

In SQL, subqueries may be used

In SELECT, FROM and WHERE

T1 calculates a summary of DB items. Other transactinos are updating some of these items. This is a

Incorrect summary

System database

Is NOT (none above) physical structure determined by DB designer, contain data that represents state of business process, or the logical struct determined by db designer

When an SQL view is created

It exists as a stored SQL query and can be queried

In practice, these OUTER joins are used most frequently

LEFT JOIN

XPath is

Language that refers to parts of an XML document

When item x has incorrect value because its update by T1 is overwritten by T2, this is a

Lost update

DBMS buffers are located in

Main Memory

Is NOT NULL constraint required for a column which is a PRIMARY KEY?

No

Does primary key allow NULLs?

No / False

NoSQL stands for

Not Only SQL

User database

Physical structure determined by db designer, contain data that represents the state of business process, and logical structure designed by DB designer (all above)

RDBMS stands for

Relational Database Management System

If sorted by speed, what is the correct order?

SSD > RAM > CPU Cache > CPU Registers

In the relational model

Specialization, Categorization, Aggregation is not supported (all above)

Attribute which a foreign key may contain NULLs (T/F)

True

ER Diagrams commonly use Chen or Crows foot may also be in UML (T/F)

True

Foreign key must be of the same data type as PK which FK refers to (T/F)

True

SQL is pretty much the same for all relational databases

True

T1 reads x twice, T2 changes X's value between reads, this is a

Unrepeatable read

XSD stands for

XML Schema Definition

Every logical table is assigned to

a tablespace

Which of the following statements is correct? a. HBase can be considered as a NoSQL database. b. HBase offers an SQL engine to query its data. c. MapReduce programs cannot be used with HBase. Data are accessed using simple put and get commands instead. d. HBase works well on large clusters as well as small ones having a few nodes

a. HBase can be considered as a NoSQL database.

________ consist of 2 or more adjacent sectors

allocation unit, cluster (page) and disk block (all above)

When using subqueries, nesting is supported

any level of nesting is fine

In HDD a cylinder is

any set of all tracks of equal diameter

In the embedded identification file organization

attributes are always preceded by attribute types

Which statement is correct? a. A star schema has one large central dimension table which is connected to various smaller fact tables. b. A star schema has one or more large fact tables which are connected to any number of smaller dimension tables. c. To speed up report generation and avoid time-consuming joins in a star schema, the dimension tables need to be normalized. d. The dimension tables in a star schema are frequently updated.

b. A star schema has one or more large fact tables which are connected to any number of smaller dimension tables.

In terms of data manipulation, a data warehouse focuses on... a. Insert/Update/Delete/Select statements. b. Insert/Select statements. c. Select/Update statements. d. Delete statements.

b. Insert/Select statements.

Which statement is not correct? a. A transaction is a set of database operations (e.g., a consecution of SQL statements in a relational database), induced by a single user or application, that should be considered as one undividable unit of work. b. Transactions typically exist in isolation, and cannot be executed concurrently with other transactions on the same data. c. It should not be possible to terminate a transaction in such a way that the database remains in an inconsistent state, because some operations of a single transaction were executed successfully and others were not. d. Recovery is the activity of ensuring that, whichever problem occurred, the database is returned to a consistent state afterwards, without any data loss.

b. Transactions typically exist in isolation, and cannot be executed concurrently with other transactions on the same data.

Pig is... a. a programming language that can be used to query HDFS data. b. a project offering a programming language to provide more user-friendliness compared to MapReduce programs. c. a database that runs on Hadoop. d. an SQL engine that runs on top of Hadoop

b. a project offering a programming language to provide more user-friendliness compared to MapReduce programs.

14.2. When a transaction is aborted, it is important that... a. all changes made by the individual operations belonging to that transaction should be made permanent. b. a rollback of the transaction is executed: all changes made by the transaction's respective operations should be undone

b. a rollback of the transaction is executed: all changes made by the transaction's respective operations should be undone

Which statement about ETL is not correct? a. Some estimates state that the ETL step can consume up to 80% of all efforts needed to set up a data warehouse. b. To decrease the burden on both the operational systems and the data warehouse itself, it is recommended to start the ETL process by dumping the data in a staging area where all the ETL activities can be executed. c. During the loading step, the data warehouse is populated by filling the fact and dimension tables, thereby also generating the necessary surrogate keys to link it all up. Fact rows should be inserted/ updated before the dimension rows. d. The extraction strategy can be either full or incremental. In the latter case, only the changes since the previous extraction are considered.

c. During the loading step, the data warehouse is populated by filling the fact and dimension tables, thereby also generating the necessary surrogate keys to link it all up. Fact rows should be inserted/ updated before the dimension rows.

Which of the following statements is not correct? a. A mapper in Hadoop maps each element in a collection to one or more output elements. b. A reducer in Hadoop reduces a collection of elements to one or more output elements. c. Reducer workers in Hadoop will start once all mapper workers have finished. d. A MapReduce pipeline in Hadoop can include an option

c. Reducer workers in Hadoop will start once all mapper workers have finished.

Configuration and tuning of DB parameters...

can be done by a DBA, autoconfigured by a DBA, and is a DBA's responsibility (all above)

Transactions...

can be either read-only, read-write, and can have begin and end transaction statements (all above)

In XML, tag sequences

cannot be overlapped

ER Diagrams are used for

communication with a client

the limitation of a relational model is

complex objects are difficult to handle

In NoSQL BASE property, Eventual consistency is

consistency at some point after transaction

Pointers and lists file organization

contains references to overflow locations

Which of the following is not one of the reasons why Spark programs are generally faster than MapReduce operations? a. Because Spark tries to keep its RDDs in memory as long as possible. b. Because Spark uses a directed acyclic graph instead of MapReduce. c. Because RDD transformations are "lazily" applied. d. Because Mesos can be used as a resource manager instead of YARN.

d. Because Mesos can be used as a resource manager instead of YARN.

Which of the following statements is not correct? a. Hive offers an SQL engine to query Hadoop data. b. Hive's query language is not as feature-complete as the full SQL standard. c. Hive offers a JDBC interface. d. Hive queries run much faster than hand-written MapReduce programs.

d. Hive queries run much faster than hand-written MapReduce programs.

Which of the following statements is not correct? a. Spark SQL exposes DataFrame and Dataset APIs which underlyingly use RDDs together with a performant SQL query engine. b. Spark SQL can be used from within Java, Python, Scala, and R. c. Spark SQL can be used through ODBC and JDBC interfaces. d. Spark SQL DataFrames need to be created by loading a file

d. Spark SQL DataFrames need to be created by loading a file

In relative location file organization

data items of the same entity are stored adjacently

physical order of table records in storage is...

determined by primary or clustered index

high-performance DBMSs communicate to storage devices...

directly

T1 updates x, T2 reads x, T1 fails and is rolled back, this is a

dirty read

DBMSs performance tuning utilities...

distribute data and tables over storage media, detect and resolve I/O problems of storage devices, and provide index-tuning wizards (all above)

Reasons for index creation...

efficient retrieval of rows, performance of join queries, and enforce uniqueness (all above)

Within a file, records are commonly organized using

embedded identification, pointers and lists, and relative location (all above)

indxes have a cost of...

extra storage and data overhead

For audio, video and text, the relational model

has poor support

NoSQL DBs support well

horizontal scaling

SQL queries with EXIST checks

if the result of a correlated nested query is empty or not

NoSQL databases support data

in non-tabular format

Transactions consist of 1 or many

insertions or deletions, updates, and retrievals (all above)

Tablespace

is a physical container of DB objects, may consist of 1 or more files, and may be distributed over several objects (all above)

NoSQL databases may support data in forms (there are 2 of this question)

key-value based, document based, or column based (all above) OR Graph based, XML and object based, or time series (all above)

max number of secondary indexes per table is...

many

Transactions submitted by various users

may execute concurrently, access the same DB items and update the same DB items (all above)

XML tags

must always be opened, closed, and correctly nested (all above)

Database is represented as a collection of

named data items

XSLT transforms XML documents to

other XML documents, HTML web pages, or plain text (all above)

Relational model

required tuples to be normalized

The smallest addressable unit of an HDD is

sector

HDD service time is

seek time + rotational delay + transfer time

HDD response time is

service time + queueing time

Random block access is _________ than sequential block access

slower

Granularity is

the size of a data item

Relational model has these constructors

tuple constructor and set constructor

The comparison condition v > ALL V returns TRUE if

value v is greater than all values in multiset V

The comparison condition v > ANY V returns TRUE if

value v is greater than at least one value in multiset V

Relational model is also known as

value-based model

RDBMSs support well

vertical scaling

SQL Views are

virtual tables without copying data

Primary storage is _____ while secondary storage is _______

volatile / persistent

When computer reboots the content of

volatile storage is erased and persistent storage is preserved


Conjuntos de estudio relacionados

Chapter 17: Evolution of Populations

View Set

intro to interpersonal communications 1-7

View Set

Management of Patients with Upper Respiratory Tract Disorders

View Set

ПРО ОПЕРАТИВНО-РОЗШУКОВУ ДІЯЛЬНІСТЬ

View Set

Suspense in Romeo and Juliet, Part 7

View Set

#1 PEDS Ch 46 - Nursing Care of the Child With an Alteration in Cellular Regulation / Hematologic or Neoplastic Disorder

View Set

Authoritarian leadership - a style of leadership in which the leader tells the employees what needs to be done and how to perform it without getting their advice or ideas.

View Set