Quiz 8
Cassandra writes commit logs to the file system as binary files. The commit log files are found under
the $CASSANDRA_HOME/data/commitlog directory
Rack:
A logical set of nodes in close proximity to each other.
Data Center:
A logical set of racks, which may be located in the same building and connected by a reliable network. Out of the box, Cassandra comes with a default configuration of a single data center.
memTable
After the write operation is written to the commit log, the value is written to a memory-resident data structure called
Importing data into a table column: Example - the cyclist first names
COPY cycling.cyclist_name (id,firstname) FROM 'C:/Temp/cyclist_firstname.csv' WITH HEADER = TRUE ;
Copying data to an external file
COPY cycling.cyclist_name (id,lastname) TO 'C:/Temp/cyclist_lastname.csv' WITH HEADER = TRUE ;
Importing data into a table column: Example - the cyclist last names
COPY cycling.cyclist_name (id,lastname) FROM 'C:/Temp/cyclist_lastname.csv' WITH HEADER = TRUE ;
Topology of Cluster
Cassandra provides two levels of grouping: Rack and Data Center.
cqlsh> SELECT * FROM user;
ERROR: InvalidRequest: Error from server: code=2200 [Invalid query] message="No keyspace has been specified. USE a keyspace, or explicitly specify keyspace.tablename"
The system.size_estimates table stores the estimated number of partitions per table, which is used for
Hadoop integration.
Step 2: Try to create a table where you can satisfy your query by reading (roughly) one partition
In practice, this generally means you will use roughly one table per query pattern. If you need to support multiple query patterns, you usually need more than one table. To put this another way, each table should pre-build the "answer" to a high-level query that you need to support. If you need different types of answers, you usually need different tables. This is how you optimize for reads. Remember, data duplication is okay. Many of your tables may repeat the same data.
The following relational database rules don't really apply to Cassandra:
Minimize the Number of Writes; Minimize Data Duplication
goals to keep in mind with Cassandra
Rule 1: Spread Data Evenly Around the Cluster. Rule 2: Minimize the Number of Partitions Read.
Remove all records from a table: TRUNCATE statement
TRUNCATE cycling.cyclist_name ;
coordinator node
The coordinator identifies which nodes are replicas for the data that is being written or read and forwards the queries to them.
Step 1: Determine What Queries to Support
Try to determine exactly what queries you need to support. This can include a lot of considerations that you may not think of at first. For example, you may need to think about: •Grouping by an attribute •Ordering by an attribute •Filtering based on some set of conditions •Enforcing uniqueness in the result set •etc ...
Time to live (TTL)
A feature that allows expiration of data that is no longer needed. This expiration is very flexible and works at the level of individual column values. The TTL is a value that Cassandra stores for each column value to indicate how long to keep the value.
A distributed system exhibits strong consistency when
a reader will always see the most recently written value.
Cassandra represents the data managed by a cluster
as a ring
The first replica will always
be the node that claims the range in which the token falls, but the remainder of the replicas are placed according to the replication strategy.
Writing data is very fast in Cassandra, because
its design does not require performing disk reads or seeks. The memtables and SSTables save Cassandra from having to perform these operations on writes, which slow down many databases.
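The write path described on these cards (commit log first, then memtable, then a flush to an SSTable) can be pictured with a toy model. This is only a sketch: the class name TinyWritePath and the flush_threshold parameter are hypothetical, and plain Python lists and dicts stand in for the real on-disk files.

```python
class TinyWritePath:
    """Toy model of a write: append to the commit log, then update the memtable."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only list standing in for the commit log file
        self.memtable = {}     # memory-resident structure for one table
        self.sstables = []     # immutable, sorted snapshots flushed from the memtable
        self.flush_threshold = flush_threshold  # hypothetical size limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. sequential append, no read or seek
        self.memtable[key] = value            # 2. in-memory update
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Freeze the memtable contents, sorted by key; the snapshot is never modified.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Flushed data no longer needs replay after a crash, so the log can be reclaimed.
        self.commit_log.clear()


wp = TinyWritePath(flush_threshold=2)
wp.write("a", 1)
wp.write("b", 2)
print(wp.sstables)  # [(('a', 1), ('b', 2))]
```

Nothing in write() reads existing data, which is the point the card makes about why writes are fast.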
Write operation is immediately written to a
commit log
For a write, the coordinator node
contacts all replicas, as determined by the consistency level and replication factor, and considers the write successful when a number of replicas commensurate with the consistency level acknowledge the write.
Syntax for creating Keyspace & Table
cqlsh> CREATE KEYSPACE my_keyspace WITH replication={'class':'SimpleStrategy','replication_factor':'1'} AND durable_writes=true;
Find list of Keyspaces
cqlsh> SELECT * FROM system_schema.keyspaces;
Querying Record (Limiting output)
cqlsh> SELECT * FROM cycling.cyclist_name LIMIT 3;
Basic cqlsh Commands
cqlsh> HELP cqlsh> DESCRIBE CLUSTER; cqlsh> DESCRIBE KEYSPACES; cqlsh> SHOW VERSION;
For a read, the coordinator contacts
enough replicas to ensure the required consistency level is met, and returns the data to the client
cqlsh:my_keyspace> INSERT INTO user(first_name,last_name) VALUES('Muhammad','Razi');
inserting a record
The map
is a collection type that is used to store key-value pairs. As its name implies, it maps one thing to another. For example, if you want to save a course name with the name of its prerequisite course, a map collection can be used.
Hinted Handoff
is an optional part of writes whose primary purpose is to provide extreme write availability when consistency is not required
Materialized view
is available in Cassandra 3.0 and later. Materialized View is a query only table and is created from a base table; when changes are made to the base table the materialized view is automatically updated. Materialized views are used to more efficiently query the same data in different ways.
Masterless system:
it is also possible to achieve strong consistency in a fully distributed, masterless system like Cassandra with quorum reads and writes.
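Why quorum reads and writes give strong consistency comes down to simple arithmetic: if the number of replicas read plus the number of replicas written exceeds the replication factor, every read overlaps at least one replica that holds the latest write. A minimal sketch, with hypothetical helper names:

```python
def quorum(replication_factor):
    """A majority of the replicas."""
    return replication_factor // 2 + 1


def overlaps(read_replicas, write_replicas, replication_factor):
    """True when every read set must share at least one replica with every write set."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)
print(q)                   # 2
print(overlaps(q, q, rf))  # True: quorum read + quorum write
print(overlaps(1, 1, rf))  # False: reading and writing one replica each
```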
The system_schema.keyspaces, system_schema.tables, and system_schema.columns store the definitions
of the keyspaces, tables, and indexes defined for the cluster.
You are not allowed to ask for the timestamp
on primary key (PK) columns.
durable_writes=true|false
Optionally bypasses the commit log when writing to the keyspace by disabling durable writes (durable_writes=false). Never disable durable writes when using SimpleStrategy replication.
The gossip protocol is primarily implemented by
the org.apache.cassandra.gms.Gossiper class, which is responsible for managing gossip for the local node. When a server node starts, it registers itself with the gossiper to receive endpoint state information.
A node claims
ownership of the range of values less than or equal to each token and greater than the token of the previous node
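The ownership rule above (a node owns the values greater than the previous node's token and less than or equal to its own) can be sketched as a lookup on a sorted ring. The ring contents and node names here are made up for illustration, and each node is assumed to hold a single token.

```python
from bisect import bisect_left


def owner_of(token, ring):
    """ring is a list of (token, node) pairs sorted by token.

    Returns the node whose range (previous_token, own_token] contains `token`.
    """
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, token)  # index of the first node token >= token
    return ring[i % len(ring)][1]   # past the largest token, wrap to the first node


ring = [(-100, "A"), (0, "B"), (100, "C")]
print(owner_of(0, ring))    # B  (0 is <= B's token and > A's token)
print(owner_of(1, ring))    # C
print(owner_of(101, ring))  # A  (wraps around the ring)
```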
bloom filters
probabilistic data structures that allow Cassandra to determine one of two possible states: the data definitely does not exist in the given file, or the data probably exists in the given file.
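The two possible answers ("definitely not" and "probably yes") follow from how a Bloom filter sets and checks bits. The sketch below is a generic Bloom filter, not Cassandra's implementation; the sizes and the use of SHA-256 to derive bit positions are assumptions chosen for simplicity.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means only "probably present".
        return all(self.bits[p] for p in self._positions(key))


bf = BloomFilter()
bf.add("cyclist-42")
print(bf.might_contain("cyclist-42"))  # True
```

A key that was added always answers True, so a False answer lets a read skip the file entirely; a True answer can still be a false positive.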
cqlsh> use my_keyspace; cqlsh:my_keyspace> SELECT * FROM user;
querying records in my_keyspace
Cassandra is a distributed database that avoids
reading before a write, so an INSERT or UPDATE sets the column values you specify regardless of whether the row already exists. As a result, every INSERT (and every UPDATE) in Cassandra is actually an upsert (inserts and updates are treated the same). Upsert refers to an operation that inserts rows into a database table if they do not already exist, or updates them if they do.
A Set
stores a group of elements and returns them in sorted order when queried.
The replication factor is
the number of nodes in the cluster that will receive copies of the same data. If the replication factor is 3, then three nodes in the ring will have copies of each row.
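To picture a replication factor of 3, the sketch below picks the node that owns the token and then keeps walking clockwise around the ring for the remaining copies. This is an assumption modeled on SimpleStrategy-style placement with one token per node; other replication strategies place the additional replicas differently, and the ring used here is invented.

```python
from bisect import bisect_left


def replicas_for(token, ring, replication_factor):
    """ring is a list of (token, node) pairs sorted by token."""
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)  # node that claims the token's range
    count = min(replication_factor, len(ring))      # cannot place more copies than nodes
    return [ring[(start + k) % len(ring)][1] for k in range(count)]


ring = [(-100, "A"), (0, "B"), (100, "C"), (200, "D")]
print(replicas_for(50, ring, 3))   # ['C', 'D', 'A']
print(replicas_for(250, ring, 2))  # ['A', 'B']
```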
Model table design around
your queries
TRACING
•Enables or disables request tracing.
CAPTURE
•Captures the output of a command and adds it to a file.
COPY
•Copies data to and from Cassandra.
DESCRIBE
•Describes the current cluster of Cassandra and its objects.
HELP
•Displays help topics for all cqlsh commands.
SHOW
•Displays the details of current cqlsh session such as Cassandra version, host, or data type assumptions.
Timestamps
•Each time you write data into Cassandra, a timestamp is generated for each column value that is updated. Internally, Cassandra uses these timestamps for resolving any conflicting changes that are made to the same value. Generally, the last timestamp wins. Not allowed to ask for the timestamp on primary key (PK) columns.
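The "last timestamp wins" rule can be sketched as a per-column merge of two versions of the same row. The function name and the dictionary layout are hypothetical, and ties are simply resolved in favor of the version seen first, which is a simplification made for this sketch.

```python
def merge_last_write_wins(*versions):
    """Each version maps column name -> (value, write_timestamp)."""
    merged = {}
    for version in versions:
        for column, (value, ts) in version.items():
            # Keep the value carrying the newest timestamp for each column.
            if column not in merged or ts > merged[column][1]:
                merged[column] = (value, ts)
    return merged


replica_1 = {"first_name": ("Muhammad", 100), "last_name": ("Razi", 100)}
replica_2 = {"last_name": ("Raza", 250)}
print(merge_last_write_wins(replica_1, replica_2))
# {'first_name': ('Muhammad', 100), 'last_name': ('Raza', 250)}
```

Because the comparison is per column, a newer write to one column does not discard older values in the other columns.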
PAGING
•Enables or disables query paging.
SOURCE
•Executes a file that contains CQL statements.
EXPAND
•Expands the output of a query vertically.
•CQL Data Manipulation Commands
•INSERT - Adds columns for a row in a table. •UPDATE - Updates a column of a row. •DELETE - Deletes data from a table. •BATCH - Executes multiple DML statements at once.
•there is no fundamental difference between the insert and update operations.
•If you insert a row that has the same primary key as an existing row, the row is replaced. •If you update a row and the primary key does not exist, Cassandra creates it.
Columns
•Is the most basic unit of data structure in the Cassandra data model. •Contains a name and a value. •Each of the values in the column has to be of a particular type (when we define the column). •Other attributes of a column: 1.Timestamp 2.Time to live (TTL)
How the Gossiper works:
•Once per second, the gossiper will choose a random node in the cluster and initiate a gossip session with it. Each round of gossip requires three messages. 1.The gossip initiator sends its chosen node a GossipDigestSynMessage. 2.When the chosen node receives this message, it returns a GossipDigestAckMessage. 3.When the initiator receives the ack message from the chosen node, it sends the node a GossipDigestAck2Message to complete the round of gossip. •When the gossiper determines that an endpoint is dead, it 'convicts' that endpoint by marking it as dead in its local list and logging that fact.
Rule 2: Minimize the Number of Partitions Read
•Partitions are groups of rows that share the same partition key. When you issue a read query, you want to read rows from as few partitions as possible. Why is this important? Each partition may reside on a different node. The coordinator will generally need to issue separate commands to separate nodes for each partition you request. This adds a lot of overhead and increases the variation in latency. Furthermore, even on a single node, it's more expensive to read from multiple partitions than from a single one due to the way rows are stored. •Therefore, if it is good to minimize the number of partitions that you read from, why not put everything in a single big partition?
•CQL Clauses
•SELECT - This clause reads data from a table. •WHERE - The WHERE clause is used along with SELECT to read specific data. •ORDER BY - The ORDER BY clause is used along with SELECT to read specific data in a specific order.
CONSISTENCY
•Shows the current consistency level, or sets a new consistency level.
Rule 1: Spread Data Evenly Around the Cluster
•The goal is for every node in the cluster to have roughly the same amount of data. Rows are spread around the cluster based on a hash of the partition key, which is the first element of the PRIMARY KEY. So, the key to spreading data evenly is this: pick a good primary key.
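The effect of key choice on data spread can be seen by hashing keys and counting how many land on each node. In this sketch MD5 and a simple modulo stand in for Cassandra's partitioner and token ranges (the default partitioner is Murmur3-based, which the Python standard library does not provide), so the numbers are only illustrative.

```python
import hashlib
from collections import Counter


def node_for(partition_key, num_nodes):
    # Hash the key and map it to a node; a stand-in for token range lookup.
    digest = hashlib.md5(str(partition_key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def distribution(keys, num_nodes):
    """Count how many rows land on each node."""
    return Counter(node_for(k, num_nodes) for k in keys)


# High-cardinality key (for example a unique id): rows spread across all nodes.
print(distribution(range(10_000), 4))

# Low-cardinality key (every row shares one value): all rows land on one node.
print(distribution(["US"] * 10_000, 4))
```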
The snitches can be found in the package
•org.apache.cassandra.locator
EXIT
•Using this command, you can terminate cqlsh.
Gossip protocols generally assume
•a faulty network. They take their name from the concept of human gossip, a form of communication in which peers can choose with whom they want to exchange information.
All writes in Cassandra are
•append-only.
lightweight transactions
•are limited to a single conditional statement, which allows an "atomic compare and set" operation. That is, it checks if a condition is true, and if so, it conducts the transaction. If the condition is not met, the transaction does not go through. •They are called "lightweight" because they do not truly lock the database for the transaction. Instead, they use a consensus protocol to ensure there is agreement between the nodes to commit the change.
writetime
•cqlsh:my_keyspace> select first_name, last_name, writetime(last_name) FROM user;
Each memtable contains
•data for a specific table.
Cassandra uses a gossip protocol that allows
•each node to keep track of state information about the other nodes in the cluster. The gossip runs every second on a timer.
For write operations, the consistency level specifies
•how many replica nodes must respond for the write to be reported as successful to the client.
For read queries, the consistency level specifies
•how many replica nodes must respond to a read request before returning the data.
The commit log
•is a crash-recovery mechanism that supports Cassandra's durability goals.
Master-based system
•maintains strong consistency because reads and writes are routed to a single master. However, this also has the unfortunate implication that the system is unavailable when the master fails until a new master can take over.
Consistency levels in Cassandra can be configured to manage
•availability versus data accuracy. Consistency can be configured for a session or per individual read or write operation. Within cqlsh, use CONSISTENCY to set the consistency level for all queries in the current cqlsh session. For programming client applications, set the consistency level using an appropriate driver. For example, using the Java driver, call QueryBuilder.insertInto with setConsistencyLevel to set a per-insert consistency level.
A higher consistency level means
•that more nodes need to respond to a read or write query, giving more assurance that the values present on each replica are the same.
Each node in the ring is assigned
•one or more ranges of data described by a token, which determines its position in the ring. A token is a 64-bit integer ID used to identify each partition. This gives a possible range for tokens from -2^63 to 2^63-1.
Information about the structure of the cluster communicated via gossip is stored in
•system.local and system.peers. These tables hold information about the local node and other nodes in the cluster including IP addresses, locations by data center and rack, CQL, and protocol versions.
The system.paxos table stores the status of transactions in progress, while the system.batchlog table stores
•the status of atomic batches.
The job of a snitch is
•to determine relative host proximity for each node in a cluster, which is used to determine which nodes to read and write from.
The system.range_xfers and system.available_ranges track
•token ranges managed by each node and any ranges needing allocation.
User-provided extensions such as system_schema.types for user-defined types, system_schema.triggers for
•triggers configured per table, system_schema.functions for user-defined functions, and system_schema.aggregates for user-defined aggregates.
A snitch determines
•which datacenters and racks nodes belong to. Snitches inform Cassandra about the network topology so that requests are routed efficiently, and they allow Cassandra to distribute replicas by grouping machines into datacenters and racks.
Cassandra's consistency level is tunable, which means
•you can specify in your queries how much consistency you require on reads and writes.
Cassandra provides a batch mechanism that allows
•you to group modifications to multiple partitions into a single statement. •Only modification statements (INSERT, UPDATE, or DELETE) may be included in a batch.
