Frequent Subgraph Extraction Based On Map Reduce

Research Article
CH Sudhakar
Subject: 
science
KeyWords: 
MapReduce, frequent subgraph, feature extraction.
Abstract: 

Frequent subgraph extraction from a large number of small graphs is a primitive operation for many data mining applications. To extract frequent subgraphs, existing techniques need to enumerate a number of subgraphs that grows superlinearly with the cardinality of the dataset. Given the rapidly growing volume of graph data, it is difficult to perform frequent subgraph extraction efficiently on a single centralized machine. There is therefore a need to investigate how to perform this extraction efficiently over very large datasets using MapReduce. Parallelizing existing techniques directly with MapReduce does not yield good performance, because it is difficult to balance the workload among the compute nodes. This framework adopts the MRFSE strategy to extract frequent subgraphs iteratively, i.e., at the ith iteration all frequent size-(i+1) subgraphs are generated from the frequent size-i subgraphs using a single MapReduce job. To extract frequent subgraphs efficiently, the framework uses a preparation phase and a mining phase, with isomorphism testing to eliminate duplicate patterns. Frequent subgraph extraction can thus be performed efficiently in a distributed environment using the Hadoop MapReduce framework.
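The iterative scheme described in the abstract can be illustrated with a minimal, single-process sketch in Python. It is not the MRFSE implementation and does not use Hadoop: the map, shuffle, and reduce steps are simulated in memory, and the names `canonical_code`, `map_extend`, and `mine`, the graph representation (a label dictionary plus a set of undirected edges), and the brute-force canonical form used for isomorphism testing are all assumptions made for illustration. Subgraph size is taken here to mean the number of edges, and support is the number of graphs that contain a pattern.

```python
from collections import defaultdict
from itertools import permutations


def canonical_code(labels, edges):
    """Brute-force canonical form (assumption: patterns are tiny).

    Isomorphic labeled subgraphs get the same code, so duplicates collapse.
    """
    verts = sorted({v for e in edges for v in e})
    best = None
    for perm in permutations(verts):
        index = {v: i for i, v in enumerate(perm)}
        code = (
            tuple(labels[v] for v in perm),
            tuple(sorted(tuple(sorted((index[u], index[v]))) for u, v in edges)),
        )
        if best is None or code < best:
            best = code
    return best


def map_extend(gid, graph, embeddings):
    """Map step: grow each frequent size-i embedding by one adjacent edge."""
    labels, edges = graph
    seen = set()
    for emb in embeddings:
        covered = {v for e in emb for v in e}
        for e in edges:
            # An empty embedding (first iteration) may take any edge.
            if e not in emb and (not emb or covered & set(e)):
                new = emb | {e}
                if new not in seen:
                    seen.add(new)
                    yield canonical_code(labels, new), (gid, new)


def mine(graphs, min_support, max_size):
    """One simulated MapReduce job per iteration, as in a level-wise search."""
    embeddings = {gid: {frozenset()} for gid in graphs}
    frequent = {}
    for _ in range(max_size):
        grouped = defaultdict(list)  # stands in for the shuffle phase
        for gid, graph in graphs.items():
            for code, value in map_extend(gid, graph, embeddings[gid]):
                grouped[code].append(value)
        embeddings = {gid: set() for gid in graphs}
        found = False
        for code, values in grouped.items():  # reduce step: count support
            support = len({gid for gid, _ in values})
            if support >= min_support:
                frequent[code] = support
                found = True
                for gid, emb in values:
                    embeddings[gid].add(emb)
        if not found:
            break
    return frequent


# Hypothetical toy dataset: three small vertex-labeled graphs.
graphs = {
    "g1": ({0: "A", 1: "B", 2: "C"}, {(0, 1), (1, 2)}),
    "g2": ({0: "B", 1: "A", 2: "C"}, {(0, 1), (0, 2)}),
    "g3": ({0: "A", 1: "B"}, {(0, 1)}),
}
result = mine(graphs, min_support=2, max_size=3)
for code, support in sorted(result.items()):
    print(code, support)
```

In this toy run, `g1` and `g2` contain the same A-B-C path under different vertex numberings, and the canonical code is what lets the reduce step count them as one pattern. The brute-force permutation search is only workable for very small patterns, and the sketch does nothing about the workload balancing that the abstract identifies as the main difficulty of a real distributed implementation.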