Volume : III, Issue : V, May - 2014
Extending Hadoop to Improve Support for Multiple-input Applications
Sarath C, Mrs Usha K
Abstract :
Hadoop is an open-source implementation of the MapReduce programming model that provides a cost-effective solution for many data-intensive applications. It stores data in a distributed fashion and exploits data locality by assigning tasks to the nodes where their data is stored. Many data-intensive applications, however, require two or more input data sets for each task. Such applications pose significant challenges for Hadoop: the inputs to a single task may reside on multiple nodes, and Hadoop cannot discover data locality in this scenario. This often leads to excessive data transfers and significant degradation in application performance. Bi-Hadoop was introduced as an efficient extension of Hadoop to better support binary-input applications. It integrates an easy-to-use user interface, a binary-input-aware task scheduler, and a caching subsystem. Experiments show that Bi-Hadoop can significantly improve the execution of binary-input applications by reducing data transfer overhead, and that it outperforms existing Hadoop by more than 3x. In this paper, we introduce a further enhancement of Bi-Hadoop that adds support for multiple-input applications, that is, applications in which the inputs to a task may reside on more than two nodes.
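The locality problem described in the abstract can be illustrated with a small sketch. It is not the Bi-Hadoop scheduler itself, only a minimal greedy placement rule written under stated assumptions: each input block has a known size and a known set of nodes holding a replica, and a task is placed on the node that would have to fetch the fewest bytes remotely. The names `remote_bytes`, `pick_node`, and the sample block and node identifiers are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set


def remote_bytes(task_inputs: List[str], node: str,
                 locations: Dict[str, Set[str]],
                 sizes: Dict[str, int]) -> int:
    """Bytes the task must fetch over the network if it runs on `node`."""
    return sum(sizes[block] for block in task_inputs
               if node not in locations[block])


def pick_node(task_inputs: List[str], nodes: List[str],
              locations: Dict[str, Set[str]],
              sizes: Dict[str, int]) -> str:
    """Greedy choice: the node with the least remote data for this task.

    Ties are broken by node name so the result is deterministic.
    """
    return min(nodes,
               key=lambda n: (remote_bytes(task_inputs, n, locations, sizes), n))


# Hypothetical cluster state: which nodes hold a replica of each block,
# and each block's size in megabytes.
locations = {
    "A1": {"node1", "node2"},
    "B1": {"node2", "node3"},
    "C1": {"node3"},
}
sizes = {"A1": 64, "B1": 64, "C1": 128}
nodes = ["node1", "node2", "node3"]

# A multiple-input task that needs all three blocks at once.
task = ["A1", "B1", "C1"]
best = pick_node(task, nodes, locations, sizes)
print(best, remote_bytes(task, best, locations, sizes))  # prints "node3 64"
```

In this example no node holds all three inputs, so some transfer is unavoidable; the rule only limits it. A scheduler that considers a single input per task would see no difference between node1 and node2 for block A1, although node1 would need 192 MB of remote data and node2 would need 128 MB.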
DOI : 10.36106/ijsr
Cite This Article:
Sarath C, Mrs Usha K, "Extending Hadoop to Improve Support for Multiple-input Applications," International Journal of Scientific Research, Vol. III, Issue V, May 2014