Kong Xiangsheng
Department of Computer and Information, Xinxiang University, Xinxiang, China
ABSTRACT
This study focuses on scientific data processing and transmission in map tasks. Building on Hadoop MapReduce in a cloud computing environment, we propose a scientific data prefetching mechanism whose basic idea is to overlap data transmission with data processing. We also present the detailed procedure of a scientific data processing algorithm that can improve overall performance in a shared environment while retaining compatibility with native Hadoop MapReduce.
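The overlap of transmission and processing described above can be illustrated with a minimal sketch. It is not the paper's algorithm and does not use any Hadoop API: the names `prefetching_reader`, `run_map_task`, `fetch_block`, `map_fn` and the `depth` parameter are hypothetical, and a background thread with a bounded queue stands in for the prefetching component.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the input


def prefetching_reader(fetch_block, block_ids, depth=2):
    """Yield blocks in order while a background thread fetches ahead.

    fetch_block is a hypothetical callable standing in for the data
    transfer step; depth bounds how many blocks are held in memory.
    """
    buf = queue.Queue(maxsize=depth)

    def producer():
        try:
            for block_id in block_ids:
                # Blocks when the buffer is full, so the fetcher never
                # runs more than `depth` blocks ahead of the consumer.
                buf.put(fetch_block(block_id))
        finally:
            buf.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        block = buf.get()
        if block is _DONE:
            return
        yield block


def run_map_task(fetch_block, block_ids, map_fn):
    """Apply map_fn to each block while later blocks are being fetched."""
    results = []
    for block in prefetching_reader(fetch_block, block_ids):
        results.append(map_fn(block))
    return results


if __name__ == "__main__":
    # Toy stand-ins: "fetching" builds a small list, "mapping" sums it.
    print(run_map_task(lambda i: list(range(i, i + 3)), [0, 3, 6], sum))
```

In this sketch the map function works on block k while the producer thread is already fetching block k+1, so transfer time is hidden behind computation whenever the two take comparable time. The bounded queue is a design choice that keeps memory use fixed; a real map task would also need to propagate fetch errors to the consumer, which this sketch omits.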
How to cite this article
Kong Xiangsheng, 2013. Scientific Data Processing Using Mapreduce in Cloud Environments. Information Technology Journal, 12: 7869-7873.
DOI: 10.3923/itj.2013.7869.7873
URL: https://scialert.net/abstract/?doi=itj.2013.7869.7873