DS Midterm 2

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

index

a stored, ordered list of key values that is part of the internal data model and that provides a fast access path to the physical data to speed up the execution time of a query

after trigger

a trigger that is activated after the triggering event took place.

active

a DBMS, such as a modern RDBMS, that autonomously takes the initiative for action if specific situations occur.

Authorization identifier

indicates the user, or user account, who owns the schema.

NOT NULL constraints

prohibits null values for a column.

replicas

the different virtual nodes that correspond to the same physical node in a consistent hashing ring.

passive

a DBMS, such as a traditional RDBMS, that only executes transactions explicitly invoked by users and/or applications.

memcached

a NoSQL database that implements a distributed memory-driven hash table that is placed in front of a traditional database to speed up queries by caching recently accessed objects in internal memory.

multiset

a collection type that can have duplicate elements without order.

Correlated nested queries

a condition in which the WHERE clause of a nested query references a column of a table declared in the outer query; the nested query is evaluated once for each tuple, or combination of tuples, in the outer query.

column-oriented DBMS

a database management system that stores data tables as sections of columns, rather than as rows of data.

key-value stores

a database storing data as (key, value) pairs.

hash function

a function that takes an arbitrary value of arbitrary size and maps it to a key with a fixed size, which is called the hash value, hash code, hash sum, or simply the hash.

Cypher

a graph-based query language introduced by Neo4j, one of the most popular graph databases.

SQL schema

a grouping of tables and other database objects such as views, constraints, and indexes that logically belong together.

nested query

a query that appears within, or "nested" inside, a different query. Also called a subquery or inner query.

SET NULL

a referential integrity constraint in which all foreign keys in the referring tuples are set to NULL.

SET DEFAULT

a referential integrity constraint in which the foreign keys in the referring tuples should be set to their default value.

vertical scaling

a way to increase data capacity by the extension of storage capacity and/or CPU power of a database server; also called scaling up.

join queries

allows the user to combine, or join, data from multiple tables.

GROUP BY

an SQL clause in which rows are grouped when they have the same value for one or more columns and the aggregation is applied to each group separately.

VARIANCE

an SQL operator used in aggregate functions that provides the variance of all values in the selected column.

COUNT

an SQL operator used in aggregate functions to count the number of tuples.

LIKE

an SQL operator used with the WHERE clause that uses wildcards to find patterns.

INTERSECT

an SQL set operation that returns a table that includes all tuples that are in both SELECT blocks.

UNION

an SQL set operation that returns a table that includes all tuples that are in one of the SELECT blocks, or both.

EXCEPT

an SQL set operation that returns a table that includes all tuples that are in the first SELECT block but not in the second.

SELECT

an SQL statement that retrieves data from a relational database.

shard

an individual partition

ALL

an operator that compares a single value to a multiset and returns TRUE only if all values in the subquery match.

What is memcached? a. is the only NoSQL database that doesn't implement a distributed memory-driven hash table b. a caching solution rather than a persistent database c.the only example of a NoSQL database known today d. a very recent example of a SQL database

b

Which statement is CORRECT? a. According to CAP, a DBMS can only achieve one of the following three properties: consistency, availability and partition tolerance. b. By using the membership protocol, the system can only achieve eventual consistency. c. The soft state property of the BASE principle indicates that a system can only change over time if it receives input. d. The BASE principle emphasizes continuous consistency.

b

According to the CAP theorem, a NoSQL database that has high availability & the ability to easily scale horizontally across many machines (partition), what will necessarily be lower? a. partitioning b. availability c. ability to scale d. consistency

d

In relational databases you have ACID, but in NoSQL databases, with eventual consistency, you have BASE. BASE stands for... a. basically applicable, severe state, eventually free b. basically available, severe state, eventually consistent c. base application, soft state, eventually consistent d. basically available, soft state, eventually consistent

d

Which of the following statements describes NoSQL databases best? a.NoSQL databases do not support joins. b.A NoSQL database offers no support for SQL. c.NoSQL databases are not capable to deal with large datasets. d. NoSQL databases are non-relational.

d

Which of the following statements describes NoSQL databases the best? a.NoSQL databases are always consistent. b.NoSQL databases never use hashing. c.NoSQL databases are not the desired kind of databases. d.NoSQL databases are non-relational

d

eventual consistency

data and respective replicas of the same data item become consistent over time after each transaction, but continuous consistency is not guaranteed.

binary large object (BLOB)

data collected and stored as a single large object in a DBMS; BLOBs are typically multimedia files (e.g., image, audio, video).

Privilige

defines a user's right to use certain SQL statements such as SELECT, INSERT, etc. on one or more database objects.

dissemination

mechanism used to specify how information flows between nodes in a network.

ring topology

network topology in which each node occupies a position in a closed range of numbers

edges

one of the primary components of graphs, consisting of arcs or lines that connect nodes.

membership protocol

set of rules to specify how nodes remain informed at all times of the other nodes in the network.

graph-based database

type of NoSQL database that applies the mathematical graph theory to the storage of records. Data are stored in graph structures using the elements of nodes and edges.

document stores

type of NoSQL database that stores a collection of attributes labeled and unordered, representing items that are semi-structured.

BASE principle

used by NoSQL this alternative to the traditional transactional model (ACID) enforces the following three principles: Basic Availability, Soft state, and Eventual consistency.

A key characteristic of NoSQL databases is: a. They are built to horizontally scale & easily distributed onto many computers b. They are made to be too complicated, therefore are not used much. c. They are built to vertically scale & easily distributed onto many computers d. They are built to replace all SQL databases.

a

Which statement is CORRECT? a. All statements are correct b. A distinct data type is a user-defined data type which specializes a standard, built-in SQL data type. c. An opaque data type is an entirely new, user-defined data type, which is not based upon any existing SQL data type. d. A named row type is a user-defined data type which groups a coherent set of data types into a new composite data type and assigns a meaningful name to it. d. An unnamed row type allows to include a composite data type in a table by using the keyword ROW.

a

Which statement is CORRECT? a. The fact that most NoSQL databases adopt an eventual consistency approach is due to the CAP theorem, which states that strong consistency cannot be obtained when availability and partitioning has to be ensured. b. Stabilization relates to the waiting time between the start-up of a NoSQL system and when the system becomes usable to receive user queries. c. Replicas in a distributed NoSQL environment relate to making periodic backups of the database to a second system. d. Some relational constructs, such as the many-to-many relationship, are harder to express using graph databases.

a

Free Form Language

a language in which no special indentation is required, in contrast to languages such as Python or COBOL.

trigger

a piece of SQL code consisting of declarative and/or procedural instructions and stored in the catalog of the RDBMS. It is automatically activated and run (fired) by the RDBMS whenever a specified event (e.g., insert, update, delete) occurs and a specific condition is evaluated as true.

stored procedure

a piece of SQL code consisting of declarative and/or procedural instructions and stored in the catalog of the RDBMS. It must be invoked explicitly by calling it from an application or command prompt.

MapReduce

a programming model, primarily seen in Hadoop, that is a highly scalable implementation of both a map function (the conversion of a dataset broken down into tuples of key-value pairs), and a reduce function, which reduces the output of the map into a smaller set of tuples. The third module of the Hadoop stack.

RESTRICT

a referential integrity constraint in which the update or removal is halted if referring tuples exist.

ON DELETE CASCADE

a referential integrity constraint that says a removal should be cascaded to all referring tuples.

ON UPDATE CASCADE

a referential integrity constraint that says an update should be cascaded to all referring tuples.

view

a subset of the data items in the logical model tailored toward the needs of a specific application or group of users. Often called a virtual table in a relational setting.

graph theory

a theory that uses the mathematical structures of graphs to model pairwise relations between objects.

horizontal scaling

a way to increase capacity by arranging multiple database servers in a cluster.

schema-level triggers

also called DDL triggers, RDBMS triggers that are fired after changes are made to the DBMS schema (such as creating, dropping, or altering tables, views, etc.).

sharding

also known as horizontal fragmentation or partitioning, where data are partitioned into separate sets, each of which are attributed to a different node in a distributed DBMS.

WHERE

an SQL clause that when added to a statement specifies selection conditions to indicate which table rows should be selected.

ALTER

an SQL command that modifies table column definitions.

DROP

an SQL command that removes database objects.

EXISTS

an SQL function that checks whether the result of a correlated nested query is empty or not, returning TRUE or FALSE.

NOT EXISTS

an SQL function that returns TRUE if there are no tuples in the result of the nested query, or otherwise returns FALSE.

left outer join

an SQL join query in which each row from the left table is kept in the result; if no match is found in the other table it will return NULL values for these columns.

right outer join

an SQL join query in which each row from the right table is kept in the result; if no match is found in the other table it will return NULL values for these columns.

full outer join

an SQL join query in which each row of both tables is kept in the result, if necessary completed with NULL values.

inner join

an SQL join query in which matching tuples from two different tables are joined. An exact match is a requirement; tuples that do not match any other tuple are not included.

DISTINCT

an SQL operator used in aggregate functions that filters out duplicates.

AVG

an SQL operator used in aggregate functions that provides the average of all values in the selected column.

SUM

an SQL operator used in aggregate functions that provides the sum of all values in the selected column.

IN

an SQL operator used with the WHERE clause that can specify multiple values, creating a set.

BETWEEN

an SQL operator used with the WHERE clause that specifies a range of values from which to return results.

INSERT

an SQL statement that adds data to a relational database.

UPDATE

an SQL statement that modifies data to a relational database.

ORDER BY

an SQL statement that orders the tuples in the result of a query by the values of one or more columns.

DELETE

an SQL statement that removes data from a relational database.

stabilization

an operation that repartitions hashes over nodes if nodes are added or removed.

Which of the following is not a property of a good hash function for use in key-value based storage structures? a. A hash function should return an output of fixed size. b. Two hashes from two inputs that differ little should also differ as little as possible. c. A good hash function should map its inputs as evenly as possible over the output range. d. A hash function should always return the same output for the same input.

b

Which of the following statements on column-oriented database is not correct? a. It yield advantages in applications such as marketing analytics or clinical data systems. b. We can easily retrieve all the attributes of one entity. c. The decision on being column-oriented stands quite orthogonal to the type of data being stored. d. It works well for sparse data, since all NULL values can be left out.

b

NoSQL databases use hashing to: a. Make MapReduce accessible to all users b. Differentiate between SQL and NoSQL databases c. Organize data on multiple nodes/servers d. Store data in JSON file format

c

Which statement is CORRECT? a. Document stores require users to define document schemas before data can be inserted. b. Document stores do not provide SQL-like capabilities. c. Document stores are built on the same ideas as key-value and tuple based database systems. d. Document stores require that you perform all filtering and aggregation logic in your application.

c

Which of the following statements on key-value stores is not correct? q. Horizontal scalability is supported. b. A hash function must have the following three properties: deterministic, uniform and defined size. c. Stabilization is the operation that repartitions hashes over nodes when nodes are added or removed. d. Usually elaborate querying facilities are offered in a key-value store.

d

Which statement is CORRECT? a. Document stores require that you perform all filtering and aggregation logic in your application. b. The BASE principle emphasizes continuous consistency. c. Document based databases are not an example of a NoSQL d. Redundancy is achieved through storing duplicate key-value pairs across multiple nodes.

d

Which statement is CORRECT? a. Graph based databases are not an example of a NoSQL database. b. Document stores require that you perform all filtering and aggregation logic in your application. c. NoSQL databases do not support joins. d. Column oriented databases are more efficient when aggregates need to be calculated over many or all rows in the dataset.

d

Which statement is CORRECT? a. In the relational model, the tuple and set constructor can be used in a nested way. b. In the relational model, the tuple constructor allows to define composite attribute types. c. In the relational model, the set constructor allows to define multivalued attribute types. d. In the relational model, the tuple constructor can only be used on atomic values and the set constructor can only be used on tuples.

d

Which statement is CORRECT? a. NoSQL databases focus heavily on consistency. b. Due to the new No-SQL technologies, relational databases will become obsolete. c. NoSQL databases do not include XML databases or object-oriented DBMSs. d. The meaning of NoSQL shifted from 'not SQL' to 'not only SQL' or 'not relational'.

d

UNIQUE constraint

defines an alternative key of a table.

CAP theorem

formulated by Eric Brewer, this theorem states a distributed computer system cannot guarantee the following three properties simultaneously: Consistency, Availability, and Partition tolerance.

aggregate functions

functions used to summarize information from database tuples. Popular examples are COUNT, SUM, AVG, VARIANCE, MIN/MAX, and STDEV.

consistent hashing

hashing schema that avoids having to remap each key to a new node when nodes are added or removed.

nodes

key component of a network which is connected to other nodes using edges or vertices.

Redis is a _____ data store and MangoDB is a _____ data store.

key-value, document

failure detection

mechanism used to find out if a node in a network has gone down.

JSONB

one of the two JSON data types, in which data are stored in a decomposed binary format which is slower to store but significantly faster to process in subsequent calls, as no reparsing is needed.

FROM

part of an SQL statement that specifies which table(s) are used for data retrieval.

tuple stores

similar to a key-value store, a tuple store stores a unique key with a vector of data instead of a pairwise combination of a key and a value.

data replication

the act of propagating data written to one device over a network onto another device. This can be done synchronously or semi-synchronously, or asynchronously.

request coordinator

the responsible party that routes requests to the appropriate destination node and relays back the result status of the operation in a distributed system.

before trigger

trigger that is activated before the triggering event takes place.

embedded documents

used in MongoDB to embed related data as subdocuments in a document.

CHECK constraint

used to define a constraint on the column values of a relational table.

data redundancy

when additional (redundant) copies of the files or data are stored on offline media (e.g., a tape vault) or online media (e.g., on an online backup hard disk drive, or even in a full-blown redundant database node), e.g., for recovery purposes in case of failures or crashes; see also replication.


Ensembles d'études connexes

Study.com Financial Management Chapter 7

View Set

Child Health Exam 1 Practice Questions

View Set

Unit 8-The Digestive System and Nutrition

View Set

Nationalism : expansionist nationalism

View Set

Chapter 27: Amine, amino acid and polymers

View Set

Measuring the Area of Regular Polygons: Formula & Examples Chapter 6

View Set