NoSQL
Where does _id come from for documents in MongoDB?
-It can be generated by the database system -It can be assigned by the user as long as it is unique
Which of the following are contributors to the rise of NoSQL? Choose all that apply.
-Need to cluster database servers. -The difference between data structures in memory and database structure on disk. -The rise of web services. -Growing numbers of users and amounts of data stored by organizations.
Select all of the following that are ways Riak deals with conflicts:
-Newest write wins -All values are returned
If I have a cluster of NoSQL nodes with N equal to 5 and R equal to 3, what is the smallest value of W I can use to ensure strong consistency? Use the number only, no words or punctuation.
3
If I have a cluster of NoSQL nodes with N equal to 5 and R equal to 5, what is the largest value of W I can use to ensure strong consistency? Use the number only, no words or punctuation.
5
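Both quorum answers above follow from the rule that reads are strongly consistent when R + W > N. A minimal Python sketch of that arithmetic, with function names made up for illustration:

```python
def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    """Every read quorum overlaps the latest write quorum when R + W > N."""
    return r + w > n


def smallest_w(n: int, r: int) -> int:
    """Smallest write quorum that still satisfies R + W > N (never below 1)."""
    return max(1, n - r + 1)


def largest_w(n: int) -> int:
    """A write cannot be acknowledged by more replicas than exist."""
    return n
```

With N = 5 and R = 3, smallest_w gives 3; with N = 5 and R = 5, any W from 1 to 5 satisfies the rule, so the largest usable value is 5.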
We get to choose whether or not to store a timestamp on key-value pairs in Cassandra.
FALSE
_id is a special field that may be found on some documents in Mongo.
FALSE
You must know how data will be used in an application before you can define the boundaries between aggregates.
TRUE
In a graph database, entities are also known as:
nodes
In a graph database, which of the following can have properties? Choose all that apply.
nodes/edges
From CAP theorem, P is for _____.
partition
____ approaches tend to severely degrade performance of systems.
pessimistic
Using different data stores depending on the needs and circumstances of an application, instead of always using the same kind of database, is called ________ ___________.
polyglot persistence
Data that is stale will be detected and future queries will get the latest data through a process called ______ _______.
read repair
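A rough sketch of the read repair idea, assuming each replica stores a (timestamp, value) pair per key and that the newest timestamp wins. The replica layout and the function name are hypothetical; real stores differ in how they detect and resolve staleness.

```python
def read_with_repair(replicas, key):
    """Read `key` from every replica, return the newest value,
    and overwrite any replica that holds a stale or missing copy.

    replicas: list of dicts mapping key -> (timestamp, value)
    """
    answers = [replica[key] for replica in replicas if key in replica]
    if not answers:
        return None
    newest = max(answers, key=lambda pair: pair[0])  # highest timestamp
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair the stale replica during the read
    return newest[1]
```

After one such read, later queries against any of the replicas see the latest data.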
A ____ function takes multiple outputs with the same key and combines their values.
reduce
Aggregate-oriented databases treat the aggregate as the unit of data-retrieval. Consequently, ______ is only supported within the contents of a single aggregate.
Atomicity
Which of the following is NOT generally true of NoSQL databases?
Built for 20th-century web estates
CQL stands for _______ _______ ________
Cassandra Query Language
Neo4J has the _____ query language, which is specific to just Neo4J.
Cypher
Which of the following is the second step of a write in Cassandra?
Data is written to a memtable
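The ordering behind this answer (commit log first, memtable second, a flush to an SSTable later) can be pictured with a toy model. This is only an illustrative sketch, not Cassandra's actual implementation; the class name and the flush threshold are assumptions.

```python
class ToyNode:
    """Toy model of a Cassandra-style write path."""

    def __init__(self, flush_threshold: int = 3):
        self.commit_log = []   # append-only record, written first for durability
        self.memtable = {}     # in-memory structure, written second
        self.sstables = []     # immutable tables produced by flushes
        self.flush_threshold = flush_threshold  # assumed, for illustration

    def write(self, key, value):
        self.commit_log.append((key, value))  # step 1: commit log
        self.memtable[key] = value            # step 2: memtable
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Later step: the memtable is written out as a sorted, immutable table.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
```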
Embedding child documents as subobjects is not allowed in document databases.
FALSE
In a graph database, all relationships are bi-directional.
FALSE
In a graph database, nodes do not know about the connections going out - they only know about the connections coming in.
FALSE
Map-reduce only applies to databases.
FALSE
NoSQL databases don't support ACID transactions and thus sacrifice consistency.
FALSE
Replication and sharding are orthogonal technologies, meaning they cannot be combined.
FALSE
Replication factor and number of nodes are always the same thing.
FALSE
Since key-value stores always use primary-key access, they generally have great performance but do not scale well.
FALSE
The whole point of the map-reduce pattern is to enable us to run calculations quickly on one server node.
FALSE
_id is a special field that will be found on most documents in Mongo.
FALSE
What does NoSQL stand for?
It was an accidental neologism (started as a hashtag)
If I use an aggregate-oriented database and have data that splits between aggregates, I might need to use a ________ to temporarily store complex query results.
Materialized view
Which of the following is not a common operation for a Key-Value Database?
Search for a value
Running separate servers for different sets of data is called __________. It must be controlled by the application, and has been called "unnatural acts" by those with experience using this method with relational databases.
Sharding
_______ rows have few columns with the same columns used across the many different rows.
Skinny
Cassandra is not designed to enforce strong consistency.
TRUE
Cassandra uses peer to peer replication.
TRUE
In a graph database, all relationships have one direction.
TRUE
We get to set different consistency guarantees based on a particular situation.
TRUE
A(n) _______ is a collection of data that we interact with as a unit.
aggregate
If I have data that will need to be combined in several different ways in different queries, I might need to use a(n) ______ database.
aggregate-ignorant
Transactions at the single-document level are known as ______ ______.
atomic transactions
If a database moves data around so that it's not on every server, you are probably using _____.
auto-sharding
CAP theorem implies that in a system that experiences partitions, you have to trade consistency versus _____.
availability
The ______ data model divides the aggregate into column families, allowing the database to treat them as units of data within the row aggregate.
column-family
A ________ _________ function needs a special shape, where its output must match the format of its input.
combinable reducer
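To illustrate the "special shape": a summing reducer consumes values for a key and emits a (key, number) pair, which is the same shape as the pairs it consumes, so it can be applied to partial results on each node and then again to those results. A small sketch with made-up data:

```python
def sum_reducer(key, values):
    """Takes a key and a list of numbers; returns (key, number).
    Output pairs look like input pairs, so the reducer is combinable."""
    return key, sum(values)


# Partial results computed separately, as if on two different nodes.
per_node = [sum_reducer("tea", [3, 4]), sum_reducer("tea", [5])]

# The same function reduces the partial results into the final answer.
final = sum_reducer("tea", [value for _, value in per_node])
```

A reducer that computed an average directly would not have this property, because an average of averages is not generally the overall average.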
A _____ function cuts data down by summarizing all of the data for a single key.
combiner
In Cassandra, space is reclaimed during the ______ phase.
compaction
If you have to change your aggregate boundaries, the process is ______ .
complex
_____ in an RDBMS is like ________ in Cassandra.
database; keyspace
The ______ data model makes the aggregate transparent to the database allowing you to do queries and partial retrievals.
document
The ______ data model is an example of an aggregate-ignorant NoSQL option.
graph
______ databases work well with small records with complex interactions.
graph
Of the different types of NoSQL databases, which ones usually support ACID transactions?
graph databases
Handing changes to a node recently brought back online uses a technique called:
hinted handoff
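A toy sketch of the idea, assuming every write goes to all nodes and that writes meant for an offline node are kept as hints until it returns. The class and method names are hypothetical and do not reflect any particular database's API.

```python
class ToyCluster:
    """Toy cluster that stores hints for nodes that are down."""

    def __init__(self, names):
        self.data = {name: {} for name in names}    # per-node storage
        self.up = {name: True for name in names}    # per-node liveness
        self.hints = {name: [] for name in names}   # writes awaiting handoff

    def write(self, key, value):
        for name in self.data:
            if self.up[name]:
                self.data[name][key] = value
            else:
                self.hints[name].append((key, value))  # remember for later

    def bring_online(self, name):
        self.up[name] = True
        for key, value in self.hints[name]:  # hand the missed writes over
            self.data[name][key] = value
        self.hints[name] = []
```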
The common problem where software data structures from memory do not persist easily into a database is called _________ _________.
impedance mismatch
The ____ is the format of data that application code expects to function correctly.
implicit schema
Where has sharding been accomplished historically?
in the application
Using a database to help multiple applications work together is called:
integration
The ______ data model treats the aggregate as an opaque whole
key-value
A ____ operation only operates on a single record and outputs a bunch of key-value pairs.
map
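A small in-memory sketch of how map and reduce fit together, assuming order records with a hypothetical "items" list. The map function sees one record at a time and emits key-value pairs; the pairs are then grouped by key and reduced.

```python
from collections import defaultdict


def map_order(order):
    """Map: looks at a single order and emits (product, quantity) pairs."""
    return [(item["product"], item["qty"]) for item in order["items"]]


def reduce_quantity(product, quantities):
    """Reduce: combines all values emitted for one key."""
    return product, sum(quantities)


def run(orders):
    grouped = defaultdict(list)
    for order in orders:                 # each map call is independent
        for key, value in map_order(order):
            grouped[key].append(value)   # group map outputs by key
    return dict(reduce_quantity(k, v) for k, v in grouped.items())
```

Because each map call touches only one record, the map step can be spread across many nodes of a cluster.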
The authoritative source for data in a cluster is the _____.
master
In a graph database, data about one thing is stored in a:
node
The ______ data model is the most commonly used aggregate-ignorant data model.
relational
Each document in a document database is like a ____ in a traditional RDBMS.
row
Ensuring that all nodes apply operations in the same order is _____ _______.
sequential consistency
According to the authors, which of the following is NOT one of the general ways to scale graph databases?
shard data using the built-in autosharding mechanism
When I put different data on different servers, I am using _____.
sharding
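One common way to decide which server holds which data is to hash the key. The sketch below assumes a fixed number of shards and uses a hypothetical shard_for helper; real systems often use more elaborate schemes so that adding a server does not move most of the data.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard index for a key using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def place(keys, num_shards):
    """Group keys by the shard that would store them."""
    shards = {index: [] for index in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards
```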
If you have a choice and any distribution strategy will do, which distribution scheme would you usually select?
single server (i.e. no distribution)
_____ _____ is a situation in which a cluster is broken such that multiple partitions cannot communicate with each other.
split brain or network partition
For column-family stores, when the columns in a column family are simple columns, the column family is known as a(n):
standard column family
The ________ mechanism has worked well to contain the complexity of concurrency with relational databases.
transaction
Finding all nodes with a particular relationship to another node is called _______ the graph.
traversing
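A minimal sketch of a one-hop traversal over directed relationships, using plain Python tuples rather than a real graph database. The node names and relationship types are made up for illustration.

```python
# Each edge is (source node, relationship type, target node).
edges = [
    ("Anna", "FRIEND", "Barbara"),
    ("Anna", "LIKES", "Databases"),
    ("Barbara", "FRIEND", "Carol"),
]


def related(edges, start, rel_type):
    """Return every node reached from `start` by an outgoing `rel_type` edge."""
    return [dst for src, rel, dst in edges if src == start and rel == rel_type]
```

Since relationships have a direction, related(edges, "Barbara", "FRIEND") finds Carol but not Anna.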
______ rows have many columns (perhaps thousands), with rows having very different columns.
wide
A(n) ____ conflict is when two people try to update the same data item at the same time.
write-write
In CAP theory, availability means:
you can always read and write to any reachable node