NoSQL Basics

अब Quizwiz के साथ अपने होमवर्क और परीक्षाओं को एस करें!

Column-Family Database

* Organize data into column families, which are composed of one or more columns * Each column is part of a single column family * Column families are named and specified in table definitions

What are the use cases for a column-family database?

1. Event logging 2. Blogging platforms 3. Geographic info systems

When using a MongoDB database, ___ allows applications to store related pieces of data in the same document.

Embedding

___ is the de facto standard for storing hierarchical documents in document databases.

JSON (JavaScript Object Notation)

___ can be image names, web page URLs, file paths and SQL queries that point to ___ like binary images, HTML web pages, PDF documents or SQL query results.

Keys... values

What are the drawbacks of XML as a storage format?

* XML is wasteful and computationally expensive to process due to the verbose and repetitive nature of XML tags * XML tags increase the amount of storage required and makes it expensive to parse

Key features of MongoDB

1. High performance 2. Rich query language 3. High availability 4. Horizontal scalability

What is a BSON document?

A binary representation of a JSON object

Document

A data structure composed of field and value pairs

What serves as the key for data lookup in a column-family database?

A key (unique address used to look up the value of a cell) is formed with the following: 1. Rowkey 2. Column family 3. Column name

Collection

A set of documents sharing some common purpose in a JSON document database.

XML

A standard format for document types and a basis for data exchange protocols.

MongoDB stores data records as ___.

BSON documents

MongoDB stores its data in ___.

Collections

Embedded Data Models

Data model in which parent and child documents are in the same collection

What data model is used by MongoDB, CouchDB, OrientDB and Terrastore?

Document

___ allow some form of structure without enforcing a schema, thus providing a balance between the rigid schema of the relational database and the completely schema-less key-value database.

Document databases

MongoDB is a ___ database.

Document-oriented

A MongoDB collection holds one or more ___.

Documents

Notes are connected to each other by ___.

Edges

What is the equivalent of a table join in MongoDB?

Embedded Documents and $lookup

What are the primary advantages of using an embedded data model with MongoDB?

1. Better performance as applications may need to issue fewer queries. 2. Like an embedded query in SQL, it is most useful if you need to view one or many data entities (child documents) in context of another entity (parent document).

What are the disadvantages of using the embedded data model in MongoDB?

1. Documents grow after creation and there is a physical limit to the size of a document. 2. You could end up storing repetitive data sets (especially in an one-to-many relationship)

What are the two ways to represent the relationships between data in a MongoDB database?

1. Normalized data models 2. Embedded data models

Common characteristics of NoSQL databases

1. Provide data models that better fit application needs, reducing mapping and data between applications and databases 2. Do not use SQL; some NoSQL databases have their own query language but there is no notion of a standard query language for NoSQL databases. 3. Generally open-source projects 4. Most are designed to run on clusters, making them a better fit for big data scenarios. 5. Operate without a schema, allowing the creation of fields to databqse records without having to define any change in structure first.

What is the maximum BSON document size?

16 megabytes

Document databases do not provide ___.

Join operations

Graph Database

* Allow us to store entities (nodes) and relationships (edges) between these entities * Edges have directional significance (like Instagram followers), allowing us to find patterns between notes

What is the primary advantage and disadvantage of using a normalized data model with MongoDB?

Normalize data models using references provide more flexibility than embedded data models. However, client-side applications must issue follow-up queries to resolve the references. In other words, normalized data models require more round trips to the database server.

What are the operations that application developers use to access and manipulate a key-value store?

PUT, GET and DELETE

Sharding

The process of slicing up large collections of data to store on multiple servers in a cluster. It basically distributes data across a cluster of machines, thus allowing you to horizontally scale your database.

Key-value databases

* Simplest NoSQL databases from an API perspective * Each entry has a key and a value * Keys and values are flexible and can be represented by any format

Document databases

* Store data in structured documents, usually XML, JSON and BSON formats * Documents are self-describing and have hierarchical tree data structures

What is the key decision in designing data models for MongoDB?

Choosing the structure of documents and how the applications represent the relationships between data.

Each MongoDB database consists of one or more ___, which hold one or more ___ documents.

Collections... BSON

How are MongoDB fields and values separated from one another in a document?

Colons

What data model is used by Amazon SimpleDB, Cassandra, HBase and Hypertable?

Column-Family

Unlike relational databases, you do not have to insert data in to all ___ for each ___ when using column-family databases.

Columns...row

How are MongoDB field-value pairs separated from one another?

Commas

XML databases are useful for ___, which organize and maintain collections of text files in XML format, such as academic papers, business documents, etc.

Content management systems

Relational databases focus on ___ through ___.

Data consistency... transactions

How is data for a specific column family stored in a specific column-family database?

Data is stored together on a disk to optimize disk input/output operations, with the assumption that data for a particular column family will be usually accessed together.

NoSQL databases are categorized according to their ___.

Data model

Normalized Data Models

Data model in which parent and child documents are in different collections.

T or F: Documents in a collection have to be of the same type.

FALSE; Documents in a collection don't have to be of the same type, though it is typical for documents in a collection to represent a common category of information.

T or F: Documents in a MongoDB database must have the same set of fields and datatypes must be the same across fields

FALSE; MongoDB documents do not need to have the same set of fields nor do the data types for a field need to be the same across documents.

T or F: Fields in MongoDB can be any data type.

FALSE; MongoDB fields must be strings

T or F: You must use the key to query the content in a document database.

FALSE; You can get almost any item out of a document database by querying the content of the document. Since documents are searchable, document databases can quickly extract subsections of a large number of documents without loading each document into memory.

T or F: In a JSON document database, arrays cannot contain documents.

FALSE; arrays may also contain documents allowing for a complex hierarchical relationship.

T or F: In a key-value database, you can have two rows with the same key.

FALSE; keys are unique. You can never have two rows with the same key.

T or F: Values of a cell cannot contain multiple versions of data.

FALSE; the value of a cell can contain multiple versions of data indexed by timestamps. For instance, if we are capturing the temperature readings in a city in different times, the value of a cell can contain multiple values.

MongoDB documents are composed of ___.

Field-value pairs

___ are similar to the columns in a relational database table.

Fields

NoSQL database

Generally refers to non-relational databases; databases that store and manipulate data in formats other than tabular relations

What data model is used by FlockDB, HyperGraphDB, Neo4J and OrientDB?

Graph

___ do not support distributed deployment and must run on a single server only.

Graph

Which NoSQL data model uses a distribution model similar to relational databases?

Graph databases

If the number of child documents for a parent document is small with limited growth, where do you store the reference to child documents?

Inside the parent document For instance, if the number of books per publisher is small with limited growth, storing the book references inside the publisher document using an array field is the best option. Here we have two book documents both published by the same publisher whose information is given in a separate document. The publisher document has references to these books using _id field values.

How does MongoDB ensure that data is always available?

It duplicates data on multiple database instances (a replica set) to ensure that data is always available. In other words, a group of servers maintain the same data set, providing redundancy but increasing data availability.

Why don't we use JSON data types in MongoDB?

JSON data types are somewhat limited. It only has one number type and doesn't not support dates. BSON extends JSON data types to include integers, doubles, dates and binary data.

A key-value database is indexed by the ___.

Key

In a JSON document database, document is a basic unit of storage, corresponding to a row in a relational database. A document comprises one or more ___, ___ and ___.

Key value pairs, nested documents... arrays

What data model is used by Redis, Riak, BerekleyDB and LevelDB?

Key-Value

T or F: In a column-family database, you do not need to fully design your data model before you begin inserting data.

TRUE; columns are dynamically created upon insertion of new data. In this way, column family databases are similar to key-value stores.

What is the advantage and disadvantage of storing data for a specific column family on a single disk?

The advantage is that we can easily access and retrieve all values of a certain column family. The disadvantage is that retrieving all attributes pertaining to a single record becomes more difficult because different column families can be stored in different machines.

How does MongoDB facilitate aggregation operations?

The aggregation pipeline, which is similar to Group By clause in SQL

What is the most important thing to keep in mind when designing a MongoDB database?

The application usage of the data (i.e. queries, updates and processing of the data)

What do we mean when we say MongoDB collections have dynamic schemas?

The documents within a single collection can have any number of different shapes. Said differently, documents can have different fields and different types for their values.

What allows column-family databases to contain tables that are sparse (i.e. do not contain values at the intersection of each row and column)?

The fact that, unlike relational databases, empty cells or null values in a column-family database do not consume any storage space.

XQuery

The query language for XML documents

How does MongoDB provide high performance to its users?

Through embedded documents and arrays, which reduce the need for expensive joins and thereby limit IO operations.

In a graph data model, what term is synonymous with querying the database?

Traversing the graph

Document databases have a ___ structure.

Tree-like; document trees have a single root element. Beneath the root element, there is a sequence of branches, sub branches and values. Each branch has a related path expression that shows you how to navigate from root of the tree to any given branch, sub branch and value.

Key-value databases do not care about the ___ that are stored. It is the responsibility of ___ to understand what is stored.

Values... application

In a key-value database, you can not perform queries on the ___.

Values; you cannot select a key-value pair using the value part.

Why are graph databases a poor choice when running on cluster?

When a node points to an adjacent node located on another machine, the overhead of routing the traversal across multiple machines eliminates the advantages of the graph database model. This is because inter-server communications is far more time-consuming than local access.

Cypher

Graph query language used by Neo4j that is particularly optimized for graph traversals.

When using referencing, what determines where we store references?

Growth of relationships

What makes the schema of MongoDB different from that of a relational database?

Unlike the fixed schema of relational databases, MongoDB has a dynamic schema.

T or F: The name of the field that serves as the primary key in a MongoDB database is the same across collections.

TRUE; This is different from relational databases in which we can choose the primary key field.

MongoDB has a rich query language to support ___ tasks.

CRUD (create, read, update, delete)

A normalized data model describes relationships using ___ between documents.

References

What are the basic queries that can be run against a Cassandra database?

SET, GET and DELETE

T or F: Both keys and values in a key-value store are flexible and can be represented by any format.

TRUE

T or F: The key in a document database may be a simple ID, which is never used or seen.

TRUE

T or F: The query language used by Cassandra databases to form SQL-like queries does not allow joins or subqueries.

TRUE

Neo4j

The most widely adopted graph database.

In a graph database, edges and nodes can both have ___.

Properties

Collections in MongoDB are analogous to ___ in relational databases.

Tables

What replaces the concept of a row/record in MongoDB?

A document

What datatypes can the value of a field in a MongoDB BSON document be?

Any of the BSON data types, including: * Other documents * Arrays * Arrays of documents * String * Number * Boolean * Objects

What is the key challenge in modeling a MongoDB database?

Balancing the needs of the application, the performance characteristics of the database engine and the data retrieval patterns.

Why are joins unnecessary in MongoDB?

Because MongoDB can represent complex data structures using embedded documents.

What is one reason why document databases naturally have less collections than the number of tables in a relational database?

Because the JSON structure maps closely to the object structure in applications.

In a MongoDB database using a normalized data model, how do references store the relationships between data?

By including links from one document to another; applications can resolve these references to access the relate data. In the example shown, if the application needs to obtain the phone number of a user with the user name 123xyz, it will follow the link between the user and contact documents for this user.

How can you decipher when a MongoDB document begins and ends?

By looking for the opening and closing curly brackets, which indicate the beginning and end of the document.

How do JSON document databases support modern web-based applications?

By storing and modifying dynamic content and transactional data.

What is the primary use case for a graph database?

Link analysis to look for patterns in social networks, telephone/email records, fund transfers, recommender systems, etc.

Documents in MongoDB are analogous to ___ in relational databases.

Records / rows

___ databases are a precursor to modern JSON document databases.

XML (extensible markup language)


संबंधित स्टडी सेट्स

Oxygenation, perfusion, fluid and electrolytes, sensation, and pain

View Set