MongoDB for Developers
ACID
"Atomicity, Consistency, Isolation, Durability". A set of properties that guarantee that database transactions are processed reliably.
CRUD
"Create, Read, Update, Delete", which map to MongoDB's "Insert, Find, Update, Remove". The four basic operations that can be performed on records.
What are the two data structures that live within JSON?
(1) Arrays - lists of things, (2) Dictionaries - associative maps
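As a small illustration of the two structures, the sketch below parses a JSON string with Python's standard json module. The document contents (name, hobbies, address) are made up for the example.

```python
import json

# A JSON document mixing both structures: a dictionary (object) whose
# "hobbies" value is an array and whose "address" value is a nested dictionary.
text = '{"name": "Ada", "hobbies": ["chess", "math"], "address": {"city": "London"}}'

doc = json.loads(text)

# JSON dictionaries become Python dicts, JSON arrays become Python lists.
print(type(doc).__name__)
print(type(doc["hobbies"]).__name__)
print(doc["address"]["city"])
```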
Which features did MongoDB omit to retain scalability (2)?
(1) Joins, (2) Transactions across multiple collections (multi-document transactions were later added in MongoDB 4.0)
What are the goals of normalization in relational databases? (3)
(1) free the database of modification anomalies, (2) minimize redesign when extending, and (3) avoid bias toward any particular access pattern
What are two reasons you might want to keep two documents that are related to each other one-to-one in separate collections?
(1) to reduce the working set size of your application, (2) because the combined size of the documents would be larger than 16MB.
What is MongoDB?
A non-relational, schemaless store of JSON documents
dynamic typing
A programming language is said to be dynamically typed, or just 'dynamic', when the majority of its type checking is performed at run-time as opposed to at compile-time. In dynamic typing, types are associated with values not variables.
static typing
A programming language is said to use static typing when type checking is performed during compile-time as opposed to run-time. In static typing, types are associated with variables not values.
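A short sketch of the dynamic side of this distinction, using Python (a dynamically typed language). The function name add_one is only illustrative.

```python
# The same variable name can be bound to values of different types;
# the type travels with the value, not with the name.
x = 42
first_type = type(x).__name__
x = "forty-two"
second_type = type(x).__name__


def add_one(value):
    # No type is declared for `value`; the check happens when this line runs.
    return value + 1


# The type error surfaces only at run-time, when the bad call is executed.
try:
    add_one("forty-two")
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(first_type, second_type, error_name)
```

In a statically typed language the call with a string argument would typically be rejected before the program runs.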
database transaction
A transaction is a unit of work performed within a DBMS against a database and treated in a coherent and reliable way, independent of other transactions.
Which optimization will typically have the greatest impact on the performance of a database?
Adding appropriate indexes on large collections so that only a small percentage of queries need to scan the collection.
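To make the difference concrete, here is a toy sketch that counts how many documents are examined with and without an index. It assumes an in-memory list as the "collection" and a plain Python dict as the "index"; MongoDB's real indexes are B-tree based, so this only illustrates the idea.

```python
# Hypothetical collection of 10,000 small documents.
docs = [{"_id": i, "user": f"user{i}"} for i in range(10_000)]


def scan(docs, user):
    """Collection scan: examine documents one by one until a match is found."""
    examined = 0
    for d in docs:
        examined += 1
        if d["user"] == user:
            return d, examined
    return None, examined


# Toy "index": maps the indexed field value to the document's position.
index = {d["user"]: pos for pos, d in enumerate(docs)}


def indexed_lookup(docs, index, user):
    """Index lookup: jump straight to the matching document."""
    pos = index.get(user)
    if pos is None:
        return None, 0
    return docs[pos], 1


_, scanned = scan(docs, "user9999")
_, looked_up = indexed_lookup(docs, index, "user9999")
print(scanned, looked_up)
```

The index has to be stored alongside the data and kept up to date on every write, which is the cost side of the trade-off.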
Access pattern
An access pattern is a specification of an access mode for every attribute of a relation schema, i.e., it is an indication of which attributes are used as input and which ones are used as output.
Atomic operations
An operation during which a processor can simultaneously read a location and write it in the same bus operation. This prevents any other processor or I/O device from writing or reading memory until the operation is complete.
BSON
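Binary JSON: the binary-encoded serialization of JSON-like documents that MongoDB uses for storage and data transfer. Supports the data types MongoDB provides, including some that JSON lacks, such as dates and binary data.
What can you do to make your MongoDB 'schema' more effective?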
Embed and pre-join anywhere you can.
consistency (ACID)
Guarantee that database constraints are not violated, particularly once a transaction commits.
durability (ACID)
Guarantee that transactions that have committed will survive permanently.
isolation (ACID)
Determines how and when the changes made by one transaction become visible to other users and systems.
What is the biggest advantage of embedding?
Improved read performance - only one round trip needed to the DB.
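A rough sketch of why embedding saves round trips, using plain Python dicts as stand-ins for collections. All names here (posts_embedded, posts_referenced, comments) are hypothetical, and each dictionary access stands in for one trip to the database.

```python
# Embedded design: the comments live inside the post document.
posts_embedded = {
    1: {"_id": 1, "title": "Hello", "comments": [{"author": "ann", "text": "Nice"}]},
}

# Referenced design: the post only stores the ids of its comments.
posts_referenced = {1: {"_id": 1, "title": "Hello", "comment_ids": [10]}}
comments = {10: {"_id": 10, "author": "ann", "text": "Nice"}}


def read_embedded(post_id):
    """One round trip returns the post together with its comments."""
    post = posts_embedded[post_id]
    return post["comments"], 1


def read_referenced(post_id):
    """One round trip for the post, then a second one to fetch its comments."""
    post = posts_referenced[post_id]
    fetched = [comments[cid] for cid in post["comment_ids"]]  # one batched fetch
    return fetched, 2


embedded_comments, embedded_trips = read_embedded(1)
referenced_comments, referenced_trips = read_referenced(1)
print(embedded_trips, referenced_trips)
```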
What's the primary difference between Python dicts and JSON dictionaries?
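JSON dictionaries as stored by MongoDB (BSON documents) preserve the order of the key-value pairs, whereas Python dicts did not guarantee any ordering before Python 3.7 (they now preserve insertion order).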
What is the interactive language for the MongoDB shell?
JavaScript - can type it right in the shell!
What's the single most important factor in designing your application schema with MongoDB?
Matching the data access patterns of your application
Rich documents
More than just tabular data. Includes arrays, key-values, etc. Allows for "pre-joins" or embedded data
Does MongoDB have its own query language?
No, Mongo's CRUD operations exist as methods/functions in programming language APIs, not as a separate language.
Are you allowed to build MultiKey Indexes when both keys are arrays?
No, the cartesian product of the two arrays would be too big. However, you can build one if one key is an array and the other isn't.
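The size problem can be sketched by enumerating the key pairs a compound index would need for a single document. This is an illustration of the combinatorics only, not MongoDB's actual index-building code; the helper name and the sample fields (tags, colors, size) are assumptions.

```python
from itertools import product


def compound_index_entries(doc, field_a, field_b):
    """List the (a, b) key pairs needed to index one document on two fields.

    Scalar values are treated as one-element lists.
    """
    def as_list(value):
        return value if isinstance(value, list) else [value]

    return list(product(as_list(doc[field_a]), as_list(doc[field_b])))


doc = {
    "tags": ["red", "cheap", "new"],
    "colors": ["a", "b", "c", "d"],
    "size": "L",
}

# Both fields are arrays: entries multiply (3 * 4).
both_arrays = compound_index_entries(doc, "tags", "colors")

# Only one field is an array: entries grow linearly (3 * 1).
one_array = compound_index_entries(doc, "tags", "size")

print(len(both_arrays), len(one_array))
```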
Does MongoDB have constraints? (ex: FK constraints)
Nope. Less of an issue because of embedding.
How do MongoDB's < / > query operators work for strings?
Compares based on ASCII character order, so matching is case sensitive. Ex: db.uses.find({name:{$gte :"F",$lte:"Q"}}) returns different results than db.uses.find({name:{$gte :"f",$lte:"Q"}})
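The effect of case on such a range can be mimicked in plain Python, which compares strings by code point (identical to ASCII order for ASCII text). The list of names and the in_range helper are made up for this sketch; it does not talk to a database.

```python
names = ["Alice", "Frank", "Greg", "Pat", "Q", "Quinn", "frank", "greg"]


def in_range(value, low, high):
    # Equivalent in spirit to {$gte: low, $lte: high} on a string field.
    return low <= value <= high


# Range "F".."Q": uppercase names starting with F through P, plus exactly "Q".
upper = [n for n in names if in_range(n, "F", "Q")]

# Range "f".."Q": lowercase "f" sorts after uppercase "Q", so nothing can match.
lower = [n for n in names if in_range(n, "f", "Q")]

print(upper)
print(lower)
```

Note that "Quinn" falls outside the first range because a longer string with the same prefix sorts after the shorter one.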
What does $type do when used with the "find" method?
Selects the documents in which the value of the specified field is of the given BSON type. $type values are numeric codes (for example, 2 for string) that correspond to the BSON types as specified in the BSON documentation.
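A minimal emulation of this kind of type filter over Python dicts, covering only a few BSON type codes (1 = double, 2 = string, 8 = boolean, 10 = null). The helper names bson_type_code and find_by_type are hypothetical; a real query would be sent to the server instead.

```python
def bson_type_code(value):
    """Map a Python value to its BSON type code (small subset only)."""
    if value is None:
        return 10  # null
    if isinstance(value, bool):
        return 8   # boolean
    if isinstance(value, float):
        return 1   # double
    if isinstance(value, str):
        return 2   # string
    return None    # other types are left out of this sketch


def find_by_type(docs, field, type_code):
    """Keep documents whose `field` exists and has the requested BSON type."""
    return [d for d in docs if field in d and bson_type_code(d[field]) == type_code]


docs = [
    {"_id": 1, "name": "Ann"},
    {"_id": 2, "name": 42.0},
    {"_id": 3, "name": None},
    {"_id": 4},
]

strings = find_by_type(docs, "name", 2)
nulls = find_by_type(docs, "name", 10)
print(strings, nulls)
```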
What are the costs of indexes?
They take up space on disk, and they must be updated whenever records are added, changed, or removed, which slows down writes.
Schemaless
Two documents do not have to have the same schema
atomicity (ACID)
When a series of database operations either all occur or nothing occurs. Prevents updates to the database occurring only partially.
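All-or-nothing behavior can be seen with any transactional database. The sketch below uses SQLite from Python's standard library (not MongoDB) purely because it runs without a server; the accounts table and the transfer function are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was rolled back too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 500)   # would overdraw alice
after_failed = balances(conn)
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
print(failed, after_failed, succeeded, after_success)
```

In the failed transfer the credit to bob runs first and succeeds on its own, yet it does not survive, because the transaction as a whole is rolled back.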
When is it recommended to represent a one to many relationship in multiple collections?
When the "many" is large.
GridFS
Breaks up a large file (one exceeding the 16 MB document size limit) into chunks, stores the chunks in one collection, AND stores metadata about the file and its chunks in another collection.
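The chunking idea can be sketched in pure Python. This is not the driver's GridFS API; the function names are hypothetical, the chunk size of 255 KB is an assumption matching the default of recent drivers, and the field names (files_id, n, data, length, chunkSize) follow the layout GridFS uses for its two collections.

```python
CHUNK_SIZE = 255 * 1024  # assumed chunk size in bytes


def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Return one metadata document and a list of chunk documents for `data`."""
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(chunk_docs):
    """Rebuild the original bytes by concatenating chunks in sequence order."""
    ordered = sorted(chunk_docs, key=lambda c: c["n"])
    return b"".join(c["data"] for c in ordered)


data = bytes(600 * 1024)  # a 600 KB payload of zero bytes
files_doc, chunk_docs = split_into_chunks("file-1", data)
print(files_doc["length"], len(chunk_docs))
```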
Python exception handling
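Wrap the risky code in a try block and handle the failure in an except block; sys.exc_info()[0] gives the type of the exception being handled. Pattern: import sys, then try: <code> followed by except Exception: print("exception", sys.exc_info()[0])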
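A runnable version of this pattern in Python 3 syntax might look like the following. The safe_divide function is just an example of code that can fail.

```python
import sys


def safe_divide(a, b):
    """Return a / b, or None if the division raises ZeroDivisionError."""
    try:
        return a / b
    except ZeroDivisionError:
        # sys.exc_info() returns (type, value, traceback) for the active exception.
        exc_type = sys.exc_info()[0]
        print("exception", exc_type)
        return None


ok = safe_divide(6, 3)
failed = safe_divide(1, 0)
print(ok, failed)
```

Catching a specific exception class rather than using a bare except keeps unrelated errors, such as KeyboardInterrupt, from being swallowed.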