Have a look please!
You may be surprised to learn that we create 2.5 quintillion bytes of data every day, and the growth curve is so steep that 90% of the data in the world today was created in the last two years alone. Every day millions of users log on to Facebook and change their profile pictures; you may well have changed yours in the last week, or even more recently. Gmail, Yahoo, search engines, and many other websites add to the pile of data that is dumped every day. I say “dumped” because in the past even the top companies did not take the trouble to use this data to their advantage. About 80% of the data captured today is unstructured: it comes from sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few.
All of this data, much of it unstructured, is what we call Big Data. Big Data refers to high-volume, high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. If you are interested in learning more about Big Data, check our previous blog post, What is Big Data and How Fast is it Growing?
Imagine if you could “afford” to keep all the data generated by your business. What if you had some magical tool to analyze that Big Data? “What would I want to analyze?” you may ask. How about customer click and buying patterns? How about buying recommendations? How about personalized ad targeting, or more efficient use of marketing dollars?
Rack your brains no longer, because the wait is over! The Sirius in this cluster of a million stars is “Hadoop”.
With Hadoop, no data is too big. Some big success stories: The New York Times used Hadoop to convert about 4 million entities to PDF in just under 36 hours, and, in a now-infamous story, Pete Warden used it to analyze 220 million Facebook profiles in just under 11 hours for a total cost of $100. In the hands of a business-savvy technologist, Hadoop makes the impossible look trivial.
So the big question is: how is Hadoop helpful to you? How does it make Big Data look as small as an ant?
Instead of processing data serially and wasting time, Hadoop processes huge amounts of data in a distributed, parallel fashion. Because the work is spread across inexpensive servers that both store and process the data, it is called distributed processing; because many of those servers work on different parts of the data at the same time, it is also called parallel processing.
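To make the idea concrete, here is a minimal sketch in plain Python of splitting work into chunks, processing the chunks side by side, and merging the partial results. It does not use Hadoop or its API: the function names (`split_into_chunks`, `map_chunk`, `word_count`) and the sample lines are hypothetical, and threads merely stand in for the separate servers of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(lines, n_chunks):
    """Deal the lines out into n_chunks parts (the 'distributed' step)."""
    return [lines[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Map step: each worker counts the words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def word_count(lines, n_workers=3):
    """Count words by processing chunks side by side, then merging the results."""
    chunks = split_into_chunks(lines, n_workers)
    # Threads stand in for cluster nodes here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(map_chunk, chunks))
    # Reduce step: combine the per-chunk counts into one total.
    return reduce(lambda a, b: a + b, partial_counts, Counter())


# Hypothetical sample input.
lines = ["big data is big", "hadoop handles big data", "data data data"]
totals = word_count(lines)
```

Each worker sees only its own chunk, so adding more workers (or, in a real cluster, more servers) lets more of the data be processed at once without changing the counting logic.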
- Scalable – [subscribelocker]New nodes can be added to the cluster whenever needed, without changing the data formats, the loading process, etc.
- Cost effective – Hadoop can store an enormous amount of data on a large cluster of inexpensive computers and lets you operate on that data in parallel. The result is a sizeable decrease in the cost per terabyte of storage, which in turn makes it affordable to model all your data.
- Flexible – Data from multiple sources, in any format and type, can be joined and aggregated on Hadoop, which enables deeper analysis than any one system provides. It does not require a predefined schema for analyzing the data.
- Fault tolerant – When you lose a node, the system redirects work to another location of the data and continues processing without missing a beat.
These features set Hadoop apart from many other data processing systems.
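The fault tolerance point above can be illustrated with a small sketch in plain Python. It is not Hadoop code: the `read_block` helper, the node names, and the block layout are all hypothetical, and the only assumption carried over from the text is that each piece of data lives in more than one location.

```python
def read_block(block_id, replicas, live_nodes):
    """Return (node, data) from the first live node that holds a copy of the block."""
    for node, data in replicas[block_id]:
        if node in live_nodes:
            return node, data
    raise RuntimeError(f"no live replica for block {block_id}")


# Hypothetical layout: each block is stored on three different nodes.
replicas = {
    "block-1": [("node-a", "alpha"), ("node-b", "alpha"), ("node-c", "alpha")],
    "block-2": [("node-b", "beta"), ("node-c", "beta"), ("node-a", "beta")],
}

live_nodes = {"node-a", "node-b", "node-c"}
first_choice = read_block("block-1", replicas, live_nodes)

live_nodes.discard("node-a")  # simulate losing a node
after_failure = read_block("block-1", replicas, live_nodes)
```

With all nodes up, the block is read from `node-a`; once that node is lost, the same call is served from `node-b` and the caller never needs to change.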
Source: Brad Hedlund Blog
Structured and unstructured data, communications records, emails, log files, pictures, audio files, and almost anything else you can think of, in any format, can be managed with ease using Hadoop. Even when different types of data have been stored in unrelated systems, you don’t need to know how you will query your data before you store it; Hadoop lets you decide that later. You just tell Hadoop what function to perform on the data, and the same function is performed on every piece of it, irrespective of its size or format. Over time, it can reveal questions you never even thought of asking.
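The “store first, decide how to query later” idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Hadoop’s API: `raw_store`, `store`, `run_on_all`, and `failed_requests` are hypothetical names, and the three sample records are made up.

```python
raw_store = []  # stands in for files kept in their original form


def store(record):
    """Keep the record exactly as it arrived; no schema is enforced at write time."""
    raw_store.append(record)


def run_on_all(func):
    """Apply the same function to every stored record, dropping records it skips."""
    results = (func(record) for record in raw_store)
    return [r for r in results if r is not None]


# Hypothetical records of different kinds, stored side by side.
store("2013-05-01 GET /home 200")
store("from: ann@example.com subject: overdue invoice")
store("2013-05-01 GET /cart 500")


def failed_requests(record):
    """A question decided after storage: which log lines ended in a 5xx status?"""
    parts = record.split()
    if len(parts) == 4 and parts[3].startswith("5"):
        return parts[2]
    return None


errors = run_on_all(failed_requests)
```

Nothing about the question had to be known when the records were stored; a different function could be run over the same raw data tomorrow.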
Hadoop has various applications. It lets you frame questions you could not previously have thought of asking, and it reveals answers to typical problems by making all the available data usable. Complete data sets, not just samples, are now available to you for better analysis.
From a business perspective, Hadoop allows businesses to find answers to questions they didn’t even know how to ask, and it helps with research and development work. It is useful for building deeper relationships with customers by providing them with valuable features like recommendations, fraud detection, and social graph analysis.
- Provides insights into daily operations
- Drives new product ideas
- Supports research and development and marketing analysis
- Image and text processing
- Analyzes huge amounts of data in comparatively little time
- Network monitoring
- Log and/or clickstream analysis of various kinds
Advantages of Hadoop:
- MapReduce programs can be written in Java, which is easy to learn and use.[/subscribelocker]
- It runs on cheap hardware and does not require specialized parallel processing hardware, which keeps the system inexpensive.
- It can process huge amounts of data in parallel.
In today’s hyper-connected world, where more and more data is created every day, Hadoop’s numerous advantages are a breakthrough for businesses and organizations, which can now find value in data that until recently was considered useless.
In fact, the need for Hadoop is no longer a question. The only question now is how to take advantage of it in the best way.
So, do you still wonder whether Hadoop is right for you?