Workload Characterization Of Hadoop Cluster

Research Article
C. M. Arun, D. Sathya and V. Monica
DOI: 
http://dx.doi.org/10.24327/ijrsr.2019.1005.3453
Subject: 
science
KeyWords: 
Hadoop; HDFS; MapReduce; YARN scheduler; job priority levels
Abstract: 

As organizations start to use data-intensive cluster computing systems such as Hadoop for more applications, there is a growing need to share clusters between users. Various algorithms have been proposed to address the conflict between data locality and fairness. MapReduce has become the dominant computing paradigm for processing large-scale datasets on clusters with a large number of nodes, and it has proved useful in applications such as e-commerce, Web search, social networks, and scientific computation. Understanding the characteristics of MapReduce workloads is the key to making better configuration decisions and improving system throughput. To achieve better performance, a MapReduce scheduler must avoid unnecessary data transmission by enhancing data locality. MapReduce, an offline (batch) computing engine, addresses the problem of data that is too large to fit on a single machine. The framework comprises a JobTracker and TaskTrackers, where the JobTracker divides the given input dataset into chunks and sends them to the individual nodes. MapReduce is a programming model that supports distributed and parallel computing for data-intensive applications, and HDFS (the Hadoop Distributed File System) provides the underlying storage. The default scheduler is the YARN scheduler, which supports five job priority levels. Improving the performance of MapReduce involves improvements in system latency, memory settings, input/output bandwidth, and job parallelization. The factors that affect Hadoop performance are hardware, MapReduce configuration, HDFS, and shuffle tweaks.
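
For readers unfamiliar with the programming model summarized above, the following word-count job is a minimal sketch of a Hadoop MapReduce program, assuming the standard org.apache.hadoop.mapreduce API; it is an illustration, not code from the article. The map phase processes each input split in parallel and emits (word, 1) pairs, the shuffle groups pairs by key, and the reduce phase sums the counts. The setPriority call illustrates one of the five job priority levels mentioned in the abstract (VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW).

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobPriority;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: each input split is processed in parallel; one (word, 1)
  // pair is emitted per token.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: all counts for the same word arrive together after the
  // shuffle and are summed into a single total.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // One of the five job priority levels mentioned in the abstract.
    job.setPriority(JobPriority.NORMAL);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Because the JobTracker (or, under YARN, the ApplicationMaster) schedules map tasks on nodes that already hold the relevant HDFS blocks, a job written this way benefits from the data locality that the abstract identifies as central to scheduler performance.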