Monday, June 4, 2012

A lot of Junk to get a tiny HUNK

"Data mining" was the term used only a few years ago for searching through large amounts of data to find its important aspects or slices. Now the term "Big Data" has become very popular.

You may look at the "bytology" in Wikipedia (http://en.wikipedia.org/wiki/Zettabyte) to get familiar with the different powers of 10: kilo for 10^3, mega for 10^6, giga for 10^9, tera for 10^12, peta for 10^15, exa for 10^18, zetta for 10^21 and yotta for 10^24.
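To make the "bytology" concrete, here is a minimal Python sketch of the prefix table. The names SI_PREFIXES, to_bytes and ratio are made up for this illustration; only the prefixes and their powers of 10 come from the list above.

```python
# Decimal (SI) prefixes and their powers of 10, as listed in the paragraph above.
SI_PREFIXES = {
    "kilo": 3, "mega": 6, "giga": 9, "tera": 12,
    "peta": 15, "exa": 18, "zetta": 21, "yotta": 24,
}


def to_bytes(value, prefix):
    """Convert a count with a prefix (e.g. 2 'peta') into plain bytes."""
    return value * 10 ** SI_PREFIXES[prefix.lower()]


def ratio(big, small):
    """Return how many `small` units fit into one `big` unit."""
    return 10 ** (SI_PREFIXES[big.lower()] - SI_PREFIXES[small.lower()])


if __name__ == "__main__":
    print(to_bytes(1, "peta"))      # bytes in one petabyte
    print(ratio("zetta", "giga"))   # gigabytes in one zettabyte
```

Python integers have arbitrary precision, so even the yotta values come out exact.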

The Big Data platforms can and do search petabytes of data, most of which is "junk", using hundreds to thousands of clustered nodes and virtual machines. A lot of junk may give a tiny HUNK.

What is HUNK?
  • High Value
  • Utilitarian
  • Niche
  • Knowledge (or Information)
The value of the hunk depends on the "surgical" processes, methods, algorithms and statistical analysis employed in combing through the structured, semi-structured and unstructured data, with the search and query criteria defined by the data science and business intelligence team. Before these steps, the data logs, collected routinely (daily, hourly, or up to the minute or second), are processed through ETL (Extract, Transform and Load) and then go through MapReduce-style massively parallel processing on hundreds to thousands of cluster nodes, many of which are replicated for redundancy.
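The ETL-then-MapReduce flow described above can be sketched on a single machine in a few lines of Python. This is only a toy illustration, not how any particular Big Data platform does it: the log format, the sample lines and all function names (extract_transform, map_phase, reduce_phase, run_pipeline) are assumptions invented for the example, and a real cluster would spread the map and reduce steps across many nodes.

```python
from collections import defaultdict

# Hypothetical raw log lines in the form "timestamp,user,action".
RAW_LOGS = [
    "2012-06-04T10:00:01,alice,view",
    "2012-06-04T10:00:02,bob,purchase",
    "malformed line",  # the "junk" that ETL discards
    "2012-06-04T10:00:05,alice,purchase",
]


def extract_transform(lines):
    """Extract and Transform: parse each line, dropping malformed ones."""
    records = []
    for line in lines:
        parts = line.split(",")
        if len(parts) != 3:
            continue  # skip junk
        timestamp, user, action = parts
        records.append({"ts": timestamp, "user": user, "action": action})
    return records


def map_phase(records):
    """Map: emit one (key, 1) pair per record, keyed by action."""
    return [(record["action"], 1) for record in records]


def reduce_phase(pairs):
    """Reduce: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_pipeline(lines):
    """Chain the steps: ETL, then map, then reduce."""
    return reduce_phase(map_phase(extract_transform(lines)))


if __name__ == "__main__":
    print(run_pipeline(RAW_LOGS))
```

Here the "hunk" is just a count of actions; in practice the reduce step would carry whatever statistic the data science team defines.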

Big Data processing, analysis and visualization require a lot of memory, many processors and other computing hardware and software resources. "Volume", one of the Vs of Big Data, is the BIG part of the term. Achieving another V, "Velocity", requires many parallel processors and memory resources.

When you churn oceans of milk, large waves are created, with the cream rising to the top; a lot of junk for a tiny HUNK; a lot of peta, exa, zetta and yotta to get a tiny "sweeta"!

This tiny Sweeta is as valuable as gold. At least, this is the premise and promise of Big Data.

Happy churning!
