Introduction to HDFS


________ is the slave/worker node and holds the user data in the form of Data Blocks. a) DataNode b) NameNode c) Data block d) Replication

a) DataNode A DataNode stores data in the Hadoop file system. A functional file system has more than one DataNode, with data replicated across them.

Point out the correct statement: a) DataNode is the slave/worker node and holds the user data in the form of Data Blocks b) Each incoming file is broken into 32 MB by default c) Data blocks are replicated across different nodes in the cluster to ensure a low degree of fault tolerance d) None of the mentioned

a) DataNode is the slave/worker node and holds the user data in the form of Data Blocks There can be any number of DataNodes in a Hadoop Cluster.
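
Option b) is wrong because of the size it quotes, not because files are left unsplit. As a rough sketch of how a file maps onto Data Blocks, the Python below assumes a 128 MB block size (the `dfs.blocksize` default in Hadoop 2.x and later; older releases used 64 MB). The function name `split_into_blocks` is hypothetical and only illustrates the arithmetic; it is not an HDFS API.

```python
# Assumed default block size: 128 MB (dfs.blocksize in Hadoop 2.x and later).
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Return the sizes in bytes of the blocks a file of file_size bytes occupies."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only holds the bytes that are left over.
        sizes.append(remainder)
    return sizes


sizes = split_into_blocks(300 * MB)
print(len(sizes), [s // MB for s in sizes])  # 3 [128, 128, 44]
```

A 300 MB file therefore occupies three blocks, and the final block is smaller than the configured block size.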

HDFS provides a command line interface called __________ used to interact with HDFS. a) "HDFS Shell" b) "FS Shell" c) "DFS Shell" d) None of the mentioned

b) "FS Shell" The File System (FS) shell includes various shell-like commands that directly interact with the Hadoop Distributed File System (HDFS).
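
To make the FS shell concrete, here is a minimal sketch that only builds a few `hadoop fs` command lines in Python and prints them. The helper `fs_shell_command` and the paths under `/user/alice` are hypothetical; `-mkdir`, `-put`, `-ls`, and `-cat` are standard FS shell subcommands. Nothing is executed, so the sketch runs without a Hadoop installation.

```python
import shlex


def fs_shell_command(action, *args):
    """Build an argument list for the FS shell, e.g. ['hadoop', 'fs', '-ls', '/data']."""
    return ["hadoop", "fs", f"-{action}", *args]


commands = [
    fs_shell_command("mkdir", "-p", "/user/alice/input"),       # create a directory
    fs_shell_command("put", "local.txt", "/user/alice/input"),  # copy a local file into HDFS
    fs_shell_command("ls", "/user/alice/input"),                # list the directory
    fs_shell_command("cat", "/user/alice/input/local.txt"),     # print the file contents
]

for cmd in commands:
    # shlex.join renders the list as a shell-safe command line (Python 3.8+).
    print(shlex.join(cmd))
```

On a machine where Hadoop is installed and configured, each list could be passed to `subprocess.run(cmd, check=True)` to run the command for real.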

HDFS is implemented in _____________ programming language. a) C++ b) Java c) Scala d) None of the mentioned

b) Java HDFS is implemented in Java and any computer which can run Java can host a NameNode/DataNode on it.

A ________ serves as the master and there is only one NameNode per cluster. a) Data Node b) NameNode c) Data block d) Replication

b) NameNode All the metadata related to HDFS, including information about the DataNodes, the files stored on HDFS, and replication, is stored and maintained on the NameNode.

________ NameNode is used when the Primary NameNode goes down. a) Rack b) Data c) Secondary d) None of the mentioned

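c) Secondary This is the answer the question expects. Strictly speaking, the Secondary NameNode is a checkpointing helper: it periodically merges the edit log into the fsimage and does not take over automatically when the primary NameNode fails. Automatic failover requires a Standby NameNode in a high-availability configuration.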

The need for data replication can arise in various scenarios like: a) Replication Factor is changed b) DataNode goes down c) Data Blocks get corrupted d) All of the mentioned

d) All of the mentioned Data is replicated across different DataNodes to ensure a high degree of fault-tolerance.
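
The scenarios in this question can be illustrated with a toy model of the bookkeeping involved. This is a sketch under stated assumptions, not HDFS code: `under_replicated`, the block IDs, and the node names are all hypothetical, and a corrupted replica is modelled simply by leaving its node out of the block's location list.

```python
def under_replicated(block_locations, live_nodes, replication_factor=3):
    """Return {block_id: missing_replicas} for blocks with too few live replicas."""
    live_nodes = set(live_nodes)
    missing = {}
    for block_id, nodes in block_locations.items():
        live_replicas = len(set(nodes) & live_nodes)
        if live_replicas < replication_factor:
            missing[block_id] = replication_factor - live_replicas
    return missing


block_locations = {
    "blk_1": ["dn1", "dn2", "dn3"],
    "blk_2": ["dn2", "dn3", "dn4"],
}
all_nodes = {"dn1", "dn2", "dn3", "dn4"}

# Healthy cluster: every block has 3 live replicas.
print(under_replicated(block_locations, all_nodes))            # {}
# DataNode dn1 goes down: blk_1 loses a replica.
print(under_replicated(block_locations, all_nodes - {"dn1"}))  # {'blk_1': 1}
# Replication factor raised to 4: every block needs one more copy.
print(under_replicated(block_locations, all_nodes, 4))         # {'blk_1': 1, 'blk_2': 1}
```

In each non-empty case the missing replicas would have to be copied to other DataNodes to restore the target replication factor.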

Point out the wrong statement: a) Replication Factor can be configured at a cluster level (Default is set to 3) and also at a file level b) Block Report from each DataNode contains a list of all the blocks that are stored on that DataNode c) User data is stored on the local file system of DataNodes d) DataNode is aware of the files to which the blocks stored on it belong to

d) DataNode is aware of the files to which the blocks stored on it belong to This statement is wrong: a DataNode only knows which blocks it stores. The NameNode holds the mapping from files to blocks.
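
The division of knowledge between NameNode and DataNodes can be sketched as a small in-memory model. The class `NameNode` and its methods here are hypothetical teaching names, not the real Hadoop classes: the file-to-block mapping lives only on the NameNode, while block reports tell it which DataNode holds which block.

```python
from collections import defaultdict


class NameNode:
    """Toy model: file -> blocks is namespace metadata; block -> nodes comes from reports."""

    def __init__(self):
        self.file_to_blocks = {}                # file path -> ordered list of block IDs
        self.block_to_nodes = defaultdict(set)  # block ID -> DataNodes reporting it

    def add_file(self, path, block_ids):
        self.file_to_blocks[path] = list(block_ids)

    def receive_block_report(self, datanode_id, block_ids):
        # A block report lists every block on that DataNode, so it replaces
        # whatever was previously recorded for the node.
        for nodes in self.block_to_nodes.values():
            nodes.discard(datanode_id)
        for block_id in block_ids:
            self.block_to_nodes[block_id].add(datanode_id)

    def locate(self, path):
        """Return [(block_id, [datanodes])] for a file, in block order."""
        return [(b, sorted(self.block_to_nodes[b])) for b in self.file_to_blocks[path]]


nn = NameNode()
nn.add_file("/logs/app.log", ["blk_1", "blk_2"])
# The reports carry block IDs only; the DataNodes never mention file names.
nn.receive_block_report("dn1", ["blk_1"])
nn.receive_block_report("dn2", ["blk_1", "blk_2"])
print(nn.locate("/logs/app.log"))
# [('blk_1', ['dn1', 'dn2']), ('blk_2', ['dn2'])]
```

Only the NameNode can answer "where is this file?", because only it combines the namespace with the block reports.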

Which of the following scenarios may not be a good fit for HDFS? a) HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file b) HDFS is suitable for storing data related to applications requiring low latency data access c) HDFS is suitable for storing data related to applications requiring low latency data access d) None of the mentioned

a) HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file HDFS can be used for storing archive data since it is cheaper as HDFS allows storing the data on low cost commodity hardware while ensuring a high degree of fault-tolerance.

HDFS works in a __________ fashion. a) master-worker b) master-slave c) worker/slave d) all of the mentioned

a) master-worker The NameNode serves as the master and each DataNode serves as a worker/slave.

