
Top Big Data Technologies and Tools – Hadoop and NoSQL Ecosystem

In our previous blogs, we defined Big Data and Big Data Analytics. In this post we discuss the tools used to tackle big data from a technology standpoint – Hadoop (HDFS, MapReduce), an open source computing framework, and NoSQL, a family of non-relational databases.

Big Data Technologies and Tools


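Hadoop is an open source software framework that stores and analyses structured and unstructured data by distributing both the data and the applications that process it across many servers. The name is sometimes expanded as “High-availability distributed object-oriented platform”, but that is a backronym: the project was named after a toy elephant. Below is an overall Hadoop architecture –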

Source: Cisco

Basic Application of Hadoop

Hadoop is used to store, scale and secure large volumes of data, and to handle errors and recover from failures automatically. The data can be structured or unstructured. When data grows too large for traditional systems to handle, Hadoop comes into the picture. Below are some basic features of Hadoop -[subscribelocker]

  • Hadoop protects data by storing several replicas of it on different machines.
  • It scales out by adding servers as data volume and usage grow.
  • It detects failed tasks and failed data transfers and re-runs them instead of failing the whole job.
  • It not only recovers lost data from the surviving replicas but also automatically restores the missing copies.
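
The replication and self-healing behaviour listed above can be illustrated with a small simulation. This is only a sketch in plain Python, not Hadoop code: the node names, the block ids and both function names are made up for illustration, and the replication factor of 3 is assumed because it is the usual HDFS default.

```python
import random

REPLICATION_FACTOR = 3  # assumption: the usual HDFS default


def place_replicas(block_id, nodes, factor=REPLICATION_FACTOR):
    """Choose `factor` distinct nodes to hold copies of one block."""
    rng = random.Random(block_id)  # seeded so the sketch is repeatable
    return set(rng.sample(sorted(nodes), factor))


def handle_node_failure(placement, failed, live_nodes, factor=REPLICATION_FACTOR):
    """Forget the failed node, then copy each under-replicated block elsewhere."""
    for holders in placement.values():
        holders.discard(failed)
        candidates = sorted(live_nodes - holders)
        while len(holders) < factor and candidates:
            holders.add(candidates.pop(0))
    return placement


# Hypothetical five-node cluster holding two blocks.
nodes = {"node1", "node2", "node3", "node4", "node5"}
placement = {b: place_replicas(b, nodes) for b in ("blk_1", "blk_2")}

# node2 dies: every block is brought back to three live replicas.
placement = handle_node_failure(placement, "node2", nodes - {"node2"})
print(placement)
```

After the simulated failure, no block lists node2 as a holder and each block again has three copies, which is the "recover and restore" idea in miniature.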

Typical Hadoop Platform Stack – HDFS + Hive + HBase + Pig

HDFS (Hadoop Distributed File System) – the part of Hadoop that handles the distribution and storage of large data sets. HDFS stores each file as a sequence of blocks; all blocks in a file are the same size except the last one. It is also designed to tolerate hardware failure, so data stays available when individual machines fail.
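
To make the block layout concrete, here is a minimal sketch of how a file's size maps to block sizes. The function name is hypothetical, and the 128 MB block size is an assumption chosen for illustration (block size is configurable in HDFS).

```python
def split_into_blocks(file_size, block_size):
    """Return the size of each block for a file of `file_size` bytes."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # only the last block may be smaller
    return sizes


MB = 1024 * 1024
# A 300 MB file with an assumed 128 MB block size: two full blocks plus 44 MB.
sizes = split_into_blocks(300 * MB, 128 * MB)
print([s // MB for s in sizes])
```

Every block except the final one comes out at the full block size, matching the description above.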

Hive – Hive was initiated by Facebook. It is a data warehouse tool built on Hadoop that converts queries into MapReduce jobs. It handles the storage, analysis and querying of large data sets. Queries are written in Hive Query Language (HQL), which is similar to standard SQL.
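
As a rough illustration of what "converting a query into MapReduce" means, the sketch below runs the map, shuffle and reduce steps that a query like SELECT page, COUNT(*) FROM visits GROUP BY page boils down to. The `visits` table and its rows are invented for the example, and this is plain Python, not Hive's actual query planner.

```python
from collections import defaultdict


def map_phase(rows):
    """Map: emit a (key, 1) pair for every row, keyed by the GROUP BY column."""
    for row in rows:
        yield row["page"], 1


def shuffle(pairs):
    """Shuffle: collect all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: aggregate each group, here COUNT(*) as a sum of ones."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical rows of a `visits` table.
visits = [{"page": "/home"}, {"page": "/about"}, {"page": "/home"}]
counts = reduce_phase(shuffle(map_phase(visits)))
print(counts)
```

On a real cluster the map and reduce steps run in parallel on many machines; the three functions here only show the shape of the computation.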

HBase – HBase is a Hadoop application that runs on top of HDFS. It presents data as a set of tables, but it is a column-oriented database management system, unlike row-oriented systems. When we talk about databases we usually think of relational systems, but HBase is not a relational database and does not support a structured query language such as SQL. Java is the preferred language for HBase applications. One of the most important features of HBase is real-time read and write access to large data sets.
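
The column-oriented model is easier to picture as nested maps: row key, then column family, then column qualifier, then value. The sketch below mimics that shape with Python dictionaries. The `put` and `get` helpers and the sample rows are hypothetical stand-ins, not the HBase client API, and the sketch ignores that real HBase stores bytes and keeps timestamped versions of each cell.

```python
def put(table, row_key, family, qualifier, value):
    """Write one cell; rows and columns are created on demand."""
    table.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value


def get(table, row_key, family, qualifier):
    """Read one cell, or None if the row or column does not exist."""
    return table.get(row_key, {}).get(family, {}).get(qualifier)


table = {}
put(table, "user1", "info", "name", "Asha")
put(table, "user1", "info", "city", "Pune")
put(table, "user2", "info", "name", "Ravi")
put(table, "user2", "stats", "logins", 7)  # user1 simply has no such column

print(get(table, "user2", "stats", "logins"))
print(get(table, "user1", "stats", "logins"))
```

Rows do not need to share the same columns, which is one practical difference from a relational table with a fixed schema.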

Pig – initiated by Yahoo, became open source in 2007. Do you know why it is named Pig? Because, like the animal, it can handle any type of data! Strange but true. Pig is a high-level procedural programming platform developed to simplify querying large data sets in Hadoop and MapReduce. Pig has two components: Pig Latin, the programming language, and the runtime environment in which Pig Latin programs are executed.

Advantage and Disadvantage of Hadoop:


As the name suggests, NoSQL refers to non-relational (non-SQL) databases such as HBase, Cassandra, MongoDB, Riak and CouchDB. They are not based on the relational table format, which is why SQL is generally not used for data access. A traditional relational database deals with structured data laid out in fixed rows and columns, while NoSQL deals with unstructured, unpredictable kinds of data, shaped by the requirements of the system.

NoSQL Technologies – HBase (part of the Hadoop ecosystem), Cassandra, MongoDB, Riak, CouchDB.

Cassandra is used to handle large data sets when the database must scale while keeping performance high. It provides fault tolerance and replication of data. Its data model lets us go deeper into columns, super columns and more. It is not a relational database: it offers table-like structure and good query capability but has no joins. It follows the column family model, a map with two or three dimensions. The 2D model is a column family containing columns, while the 3D model is created by adding super columns to a column family.
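
The two-dimensional and three-dimensional maps can be sketched with nested Python dictionaries. The column family names, row keys and values below are invented for illustration, and `depth` is a hypothetical helper that only counts nesting levels under each row key; none of this is Cassandra's API.

```python
def depth(mapping):
    """Count how many levels of nested dicts a structure has."""
    if not isinstance(mapping, dict) or not mapping:
        return 0
    return 1 + max(depth(value) for value in mapping.values())


# 2D: column family -> row key -> column name -> value
users = {
    "alice": {"email": "alice@example.com", "city": "Pune"},
    "bob": {"email": "bob@example.com"},
}

# 3D: super column family -> row key -> super column -> column name -> value
user_addresses = {
    "alice": {
        "home": {"city": "Pune", "zip": "411001"},
        "work": {"city": "Mumbai", "zip": "400001"},
    },
}

print(depth(users), depth(user_addresses))
```

The extra level in `user_addresses` is the super column, which groups related columns under one name within a row.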

MongoDB is an agile NoSQL document database. Unlike a traditional database, which stores data in rows and columns, MongoDB stores documents in BSON, a binary form of JSON. It is used for high scalability, availability and performance. Documents, which have dynamic schemas, are the basic unit of data: a set of documents forms a collection, and a set of collections makes up a database.
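
The document, collection and database hierarchy can be mocked up with ordinary Python structures. This is a sketch only: the `find` helper is a hypothetical stand-in for a query, not the MongoDB driver API, and the documents are kept as JSON-style dictionaries rather than real BSON.

```python
import json

# A database is a set of collections; a collection is a set of documents.
database = {
    "users": [
        {"_id": 1, "name": "Asha", "email": "asha@example.com"},
        {"_id": 2, "name": "Ravi", "tags": ["admin", "ops"]},  # different fields
    ],
}


def find(collection, query):
    """Return the documents whose fields equal every key/value in `query`."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in query.items())
    ]


matches = find(database["users"], {"name": "Ravi"})
print(json.dumps(matches[0]))
```

The two documents in `users` carry different fields, which is what "dynamic schema" means in practice: the collection does not force one fixed set of columns.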

Riak is an open source NoSQL database designed for availability, fault tolerance, scalability and high performance. It is described as a key/value store, a document-oriented store and a “web-shaped” store. It can also store documents in JSON format. In terms of data modeling there is no ‘master’, only nodes. All nodes are equal and share the same responsibilities.
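
One way to see how a cluster can work without a master is that every node can compute, on its own, which nodes are responsible for a key. The sketch below uses a simple hash-and-modulo placement to show the idea. It is a deliberate simplification and not Riak's actual partitioning scheme; the node names, the `owners` function and the replica count of 3 are all assumptions.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster


def owners(key, nodes=NODES, replicas=3):
    """Return the nodes that hold `key`; any node can run this calculation."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


# Any node receiving a request for this key arrives at the same answer,
# so no central coordinator is needed to route it.
print(owners("user:42"))
```

Because the answer depends only on the key and the node list, a client can send a request to any node and that node can forward it correctly.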

CouchDB is an open source, distributed, schemaless NoSQL database. It stores document data in JSON format. It also provides web-related features, such as accessing documents from a web browser over HTTP. JavaScript can also be used to modify documents. In CouchDB a document is a combination of string “keys” and their “values”.
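
Because CouchDB documents are addressed by URL, storing one amounts to an HTTP PUT with a JSON body. The sketch below only builds such a request with the Python standard library and does not send it. The server address assumes a local CouchDB on its default port 5984, and the database name, document id and `build_put_request` helper are invented for the example.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5984"  # assumption: local CouchDB, default port


def build_put_request(db, doc_id, doc):
    """Build (but do not send) the HTTP request that stores a JSON document."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        f"{BASE_URL}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# A document is just string keys mapped to values.
req = build_put_request("blog", "post-1", {"title": "Hello", "tags": ["nosql"]})
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running server; reading the same URL with a GET, or opening it in a browser, would return the document as JSON.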

Advantage and Disadvantage of NoSQL:


[/subscribelocker]Tools from Companies – Cassandra, Riak, Redis, HBase, Oracle, Membase, MongoDB.

Now, super fun geek time – here’s a funny parody on Hadoop and NoSQL. Enjoy and have a great weekend!




