Analysis And Processing Of Streaming Data Using Spark

Research Article
Amratansh Sharma and Poovammal E
DOI: 
http://dx.doi.org/10.24327/ijrsr.2017.0808.0701
Subject: 
Engineering
KeyWords: 
Spark, Streaming Data, Big Data Processing Engines, Pipelines, real time processing, Big Data Analytics.
Abstract: 

Data flow rates are rising steadily along with data volume. Batch processing, which requires separate programs and computations for input, processing, and output, lowers the efficiency of big data management systems in terms of cost and speed. Hence it is important to handle streaming data, that is, to process data in real time. The rate at which data is ingested and the variety in which it occurs create the need for big data processing engines and software that bring computation to the data and analyze it efficiently. One such tool, described in detail here, is Spark. Besides explaining various concepts and tools for data processing, the paper analyzes a streaming weather data set containing different attributes and values at different time stamps. The analysis considers different window sizes, the time interval taken to extract the data, and different sliding sizes.
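
To make the roles of window size and sliding size concrete, the following is a minimal sketch in plain Python; it is not the authors' code and does not use Spark. The function name `sliding_window_means`, the sample temperature readings, and the choice to average each window are all assumptions made for illustration. In Spark Streaming, the corresponding parameters are the window duration and slide duration passed to a DStream's `window` operation.

```python
from statistics import mean


def sliding_window_means(readings, window_size, slide):
    """Average the values falling in each window [start, start + window_size).

    readings    -- list of (timestamp_seconds, value), sorted by timestamp
    window_size -- window length in seconds
    slide       -- distance in seconds between consecutive window starts
    Trailing windows that are only partly filled are still reported.
    """
    if not readings:
        return []
    results = []
    start = readings[0][0]
    last_timestamp = readings[-1][0]
    while start <= last_timestamp:
        end = start + window_size
        in_window = [value for ts, value in readings if start <= ts < end]
        if in_window:
            results.append((start, end, mean(in_window)))
        start += slide
    return results


# Hypothetical temperature readings, one every 10 seconds.
sample = [(0, 20.0), (10, 21.0), (20, 23.0), (30, 22.0), (40, 24.0), (50, 26.0)]

# Overlapping windows: 30-second window, advancing every 10 seconds.
overlapping = sliding_window_means(sample, window_size=30, slide=10)

# Non-overlapping (tumbling) windows: slide equal to the window size.
tumbling = sliding_window_means(sample, window_size=30, slide=30)
```

With a slide smaller than the window, each reading contributes to several windows, which smooths the output at the cost of more computation; with a slide equal to the window, each reading is counted exactly once.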