Question 1

Q1 What are the different types of tombstone markers in HBase for deletion?

Accepted Answer

Answer:

HBase has three main types of tombstone markers for deletion:

Family Delete Marker: marks all columns of a column family.
Version Delete Marker: marks a single version of a column.
Column Delete Marker: marks all versions of a column.
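
The scope of each marker type can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the HBase client API: the `Cell`, `Tombstone`, `is_masked`, and `visible` names are hypothetical, and the model assumes that family and column markers hide cells with a timestamp at or before the marker's timestamp, while a version marker hides exactly one timestamp.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    family: str
    qualifier: str
    timestamp: int
    value: str


@dataclass(frozen=True)
class Tombstone:
    kind: str                 # "family", "column", or "version" (toy labels)
    family: str
    qualifier: Optional[str]  # None for a family marker
    timestamp: int


def is_masked(cell: Cell, marker: Tombstone) -> bool:
    """Return True if the marker hides the cell from reads."""
    if cell.family != marker.family:
        return False
    if marker.kind == "family":
        # Hides every column of the family at or before the marker timestamp.
        return cell.timestamp <= marker.timestamp
    if cell.qualifier != marker.qualifier:
        return False
    if marker.kind == "column":
        # Hides all versions of one column at or before the marker timestamp.
        return cell.timestamp <= marker.timestamp
    if marker.kind == "version":
        # Hides exactly one version of one column.
        return cell.timestamp == marker.timestamp
    raise ValueError(f"unknown marker kind: {marker.kind}")


def visible(cells, markers):
    """Return the cells that no marker hides, in their original order."""
    return [c for c in cells if not any(is_masked(c, m) for m in markers)]


cells = [
    Cell("cf", "a", 1, "v1"),
    Cell("cf", "a", 2, "v2"),
    Cell("cf", "b", 1, "w1"),
]
# A version marker on cf:a at timestamp 2 hides only the second cell.
print(visible(cells, [Tombstone("version", "cf", "a", 2)]))
```

Swapping the marker for `Tombstone("column", "cf", "a", 2)` leaves only the `cf:b` cell visible, and `Tombstone("family", "cf", None, 2)` hides all three.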

Question 2

Q2 When should you use HBase and what are the key components of HBase?

Accepted Answer

Answer: HBase is a good fit when the big data application:

Has a variable schema.
Stores data in the form of collections.
Needs key-based access when retrieving data.

The key components of HBase are:

Region: holds the in-memory store (MemStore) and the HFiles for a range of rows.
Region Server: hosts and serves a set of regions.
HBase Master: monitors the region servers and assigns regions to them.
ZooKeeper: coordinates the HBase Master, the region servers, and the clients.
Catalog Tables: the two catalog tables are ROOT and META. The ROOT table tracks where the META table is, and the META table lists all the regions in the system. Since HBase 0.96 the ROOT table no longer exists, and the location of the META table (now named hbase:meta) is kept in ZooKeeper.
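
To make the role of the catalog concrete, here is a minimal sketch of how a client could use a region list to find the region server responsible for a row key. It is a toy model, not HBase code: the `META` list, the host names, and the `locate_region` function are hypothetical, and the real META table stores considerably more information per region.

```python
import bisect

# Hypothetical catalog: (region start key, region server), sorted by start key.
# The first region starts at the empty key, so every row key falls in some region.
META = [
    (b"", "rs1.example.com"),
    (b"g", "rs2.example.com"),
    (b"p", "rs3.example.com"),
]


def locate_region(row_key: bytes, meta=META):
    """Return the entry whose start key is the greatest one <= row_key."""
    start_keys = [start for start, _ in meta]
    index = bisect.bisect_right(start_keys, row_key) - 1
    return meta[index]


print(locate_region(b"apple"))  # (b'', 'rs1.example.com')
print(locate_region(b"grape"))  # (b'g', 'rs2.example.com')
print(locate_region(b"zebra"))  # (b'p', 'rs3.example.com')
```

Because regions cover contiguous, sorted key ranges, a binary search over the start keys is enough to route a request.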

Question 3

Q3 Explain the difference between HBase and Hive.

Accepted Answer

Answer:

HBase and Hive are two quite different Hadoop-based technologies. Hive is a data warehouse infrastructure on top of Hadoop, whereas HBase is a NoSQL key-value store that runs on top of Hadoop. Hive lets SQL-savvy users express queries that run as MapReduce jobs, whereas HBase supports four primary operations: put, get, scan, and delete. HBase is suited to real-time querying of big data, while Hive is suited to analytical querying of data collected over a period of time.

Question 4

Q4 What is Row Key?

Accepted Answer

Answer:

Every row in an HBase table has a unique identifier known as the RowKey. It groups cells logically, and all cells with the same RowKey are co-located on the same server. Internally, a RowKey is treated as a byte array.
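
Because a RowKey is a byte array, keys compare byte by byte rather than numerically. The sketch below illustrates the consequence using plain Python `bytes`, which compare lexicographically in the same way. The helper names `text_key`, `padded_key`, and `binary_key` are made up for this example, and the padding width of five digits is an arbitrary assumption.

```python
import struct


def text_key(n: int) -> bytes:
    """Naive key: the number is embedded as variable-width text."""
    return f"row{n}".encode("utf-8")


def padded_key(n: int) -> bytes:
    """Zero-padded text key, so byte order matches numeric order."""
    return f"row{n:05d}".encode("utf-8")


def binary_key(n: int) -> bytes:
    """Fixed-width big-endian key; byte order matches numeric order for n >= 0."""
    return struct.pack(">Q", n)


ids = [1, 2, 10]
print(sorted(ids, key=text_key))    # [1, 10, 2]
print(sorted(ids, key=padded_key))  # [1, 2, 10]
print(sorted(ids, key=binary_key))  # [1, 2, 10]
```

With the naive text key, row 10 sorts before row 2, which is why fixed-width or zero-padded encodings are a common choice when the key order matters.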

Question 5

Q5 Explain the difference between RDBMS data model and HBase data model.

Accepted Answer

Answer:

An RDBMS is schema-based, whereas the HBase data model is schema-less: only the column families are fixed up front, and columns can be added at write time.
An RDBMS typically does not partition data automatically, whereas HBase partitions tables automatically.
An RDBMS stores normalized data, whereas HBase stores denormalized data.

Question 6

Q6 What are the different operational commands in HBase at record level and table level?

Accepted Answer

Answer:

The record-level operational commands in HBase are put, get, increment, scan, and delete.

The table-level operational commands in HBase are describe, list, drop, disable, and scan.

Question 7

Q7 Explain the different catalog tables in HBase.

Accepted Answer

Answer:

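The two catalog tables in HBase are ROOT and META. The ROOT table tracks where the META table is, and the META table lists all the regions in the system. Since HBase 0.96 the ROOT table has been removed, and the location of the META table (now named hbase:meta) is stored in ZooKeeper.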

Question 8

Q8 Explain the process of row deletion in HBase.

Accepted Answer

Answer:

When a delete command is issued through the HBase client, the data is not removed from the cells immediately. Instead, the cells are made invisible by writing a tombstone marker. The deleted cells and their markers are physically removed later, during major compaction.
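
The lifecycle described here (write a marker, hide the data on reads, purge on compaction) can be sketched with a toy in-memory store. Everything in this example is hypothetical: `ToyStore` and its methods are not HBase classes, and the model only handles a single kind of marker that hides all versions of a column at or before the marker's timestamp.

```python
class ToyStore:
    """Append-only toy store: deletes add markers instead of removing data."""

    def __init__(self):
        # Each entry is (row, column, timestamp, kind, value).
        self.entries = []

    def put(self, row, column, ts, value):
        self.entries.append((row, column, ts, "put", value))

    def delete(self, row, column, ts):
        # Nothing is removed here; a tombstone marker is appended.
        self.entries.append((row, column, ts, "delete", None))

    def _is_deleted(self, row, column, ts):
        return any(
            kind == "delete" and r == row and c == column and ts <= marker_ts
            for r, c, marker_ts, kind, _ in self.entries
        )

    def get(self, row, column):
        """Return the newest value not hidden by a marker, or None."""
        live = [
            (ts, value)
            for r, c, ts, kind, value in self.entries
            if kind == "put" and r == row and c == column
            and not self._is_deleted(r, c, ts)
        ]
        return max(live)[1] if live else None

    def major_compact(self):
        """Rewrite the store, dropping the markers and the cells they hide."""
        self.entries = [
            e for e in self.entries
            if e[3] == "put" and not self._is_deleted(e[0], e[1], e[2])
        ]


store = ToyStore()
store.put("r1", "cf:a", 1, "x")
store.delete("r1", "cf:a", 2)
print(len(store.entries), store.get("r1", "cf:a"))  # 2 None
store.major_compact()
print(len(store.entries))  # 0
```

After the delete, both the original cell and the marker are still stored, yet reads no longer return the value. Only the compaction step actually shrinks the store.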

Question 9

Q9 What are column families? What happens if you alter the block size of a column family on an already populated database?

Accepted Answer

Answer:

A column family is a logical division of the data in a table, identified by a key. Column families are the basic unit of physical storage, and features such as compression are applied per family. When the block size of a column family is altered on an already populated database, the old data keeps the old block size while newly written data uses the new one. When compaction takes place, the old data is rewritten with the new block size. Existing data remains readable throughout.
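
The block size behaviour can be pictured with a small toy model in which every store file remembers the block size it was written with. This is a sketch under stated assumptions, not HBase code: `ToyColumnFamily` and its methods are hypothetical, and the sizes used (65536 and 131072 bytes) are illustrative values only.

```python
class ToyColumnFamily:
    """Toy column family whose files record the block size used to write them."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.store_files = []

    def flush(self, name):
        # A newly written file uses the block size configured right now.
        self.store_files.append({"name": name, "block_size": self.block_size})

    def alter_block_size(self, new_size):
        # Only the setting changes; files already on disk are untouched.
        self.block_size = new_size

    def compact(self):
        # Compaction rewrites all data into one file with the current setting.
        self.store_files = [{"name": "compacted", "block_size": self.block_size}]


cf = ToyColumnFamily(block_size=65536)
cf.flush("file-1")
cf.alter_block_size(131072)
cf.flush("file-2")
print([f["block_size"] for f in cf.store_files])  # [65536, 131072]
cf.compact()
print([f["block_size"] for f in cf.store_files])  # [131072]
```

Until compaction runs, files with both block sizes coexist, which matches the description above.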

Question 10

Q10 Explain HLog and WAL in HBase.

Accepted Answer

Answer:

All edits to the HStore are recorded in the HLog. Every region server has one HLog, which contains entries for the edits to all regions hosted by that region server. WAL stands for Write Ahead Log, and HLog is its implementation: edits are written to it immediately, before they are applied. With deferred log flush enabled, WAL edits remain in memory until the flush period elapses.
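
The write-ahead idea can be sketched in a few lines: every edit is appended to a log shared by all regions of the server before it is applied in memory, so the in-memory state can be rebuilt by replaying the log. The `ToyRegionServer` class below is a hypothetical illustration, not the HBase implementation, and a plain Python list stands in for the durable log file.

```python
class ToyRegionServer:
    """Toy region server with one shared log for all of its regions."""

    def __init__(self, wal=None):
        self.wal = wal if wal is not None else []  # stands in for the durable log
        self.memstore = {}                         # volatile in-memory state

    def put(self, region, row, value):
        # Log first, then apply: the order is what makes recovery possible.
        self.wal.append((region, row, value))
        self.memstore[(region, row)] = value

    @classmethod
    def recover(cls, wal):
        """Rebuild the in-memory state by replaying a surviving log in order."""
        server = cls(wal=list(wal))
        for region, row, value in wal:
            server.memstore[(region, row)] = value
        return server


rs = ToyRegionServer()
rs.put("region-1", "a", "1")
rs.put("region-2", "b", "2")
rs.put("region-1", "a", "3")

# Simulate a crash: only the log survives, the memstore is lost.
recovered = ToyRegionServer.recover(rs.wal)
print(recovered.memstore == rs.memstore)  # True
```

Note that the single log interleaves edits from both regions, mirroring the one-log-per-region-server arrangement described above.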

Question 11

Q11 What is NoSQL?

Accepted Answer

Answer:

Apache HBase is a type of “NoSQL” database. “NoSQL” is a general term meaning that the database isn’t an RDBMS which supports SQL as its primary access language, but there are many types of NoSQL databases: BerkeleyDB is an example of a local NoSQL database, whereas HBase is very much a distributed database. Technically speaking, HBase is really more a “Data Store” than “Data Base” because it lacks many of the features you find in an RDBMS, such as typed columns, secondary indexes, triggers, and advanced query languages, etc.

Question 12

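Q12 What is a region server?

Accepted Answer

Answer: A region server is the HBase process that hosts and serves a set of regions. The conf/regionservers file is a plain-text file that lists the known region server names.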

Question 13

Q13 Give the names of the key components of HBase.

Accepted Answer

Answer: The key components of HBase are ZooKeeper, RegionServer, Region, Catalog Tables, and HBase Master.

Question 14

Q14 What is the reason for using HBase?

Accepted Answer

Answer: HBase is used because it provides random read and write access and can perform a large number of operations per second on large data sets.

Question 15

Q15 Define standalone mode in HBase.

Accepted Answer

Answer: Standalone is the default mode of HBase. In standalone mode, HBase does not use HDFS (it uses the local filesystem instead), and it runs all HBase daemons and a local ZooKeeper in the same JVM process.

Question 16

Q16 Which operating systems are supported by HBase?

Accepted Answer

Answer: HBase runs on operating systems that support Java, such as Linux and Windows.

Question 17

Q17 What are the main features of Apache HBase?

Accepted Answer

Answer: Apache HBase supports both linear and modular scaling. Its main features include:

Automatic sharding: HBase tables are distributed across the cluster in regions, and regions are automatically split and redistributed as the data grows.
Block Cache and Bloom Filters: HBase supports a block cache and Bloom filters for high-volume query optimization.
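
To show why a Bloom filter helps with reads, here is a minimal generic Bloom filter. It is not HBase's implementation: the `ToyBloomFilter` class, the bit array size, the number of hash functions, and the use of SHA-256 are all assumptions chosen for readability. The key property is that a negative answer is always correct, so a store can skip files that definitely do not contain a row.

```python
import hashlib


class ToyBloomFilter:
    """Generic Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key: bytes):
        # Derive several bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


bloom = ToyBloomFilter()
bloom.add(b"row-1")
bloom.add(b"row-2")
print(bloom.might_contain(b"row-1"))  # True
```

A key that was added always yields `True`. A key that was never added usually yields `False`, but may occasionally yield `True`, in which case the file is simply read and found not to contain the row.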

Question 18

Q18 What is the difference between HDFS/Hadoop and HBase?

Accepted Answer

Answer: HDFS does not provide fast lookup of individual records in a file, whereas HBase provides fast record lookups on large tables.
