Finding The Co-Occurrence Of Text Entities From The Corpus

Research Article
A.Muthusamy
DOI: 
http://dx.doi.org/10.24327/ijrsr.2018.0901.1401
Subject: 
science
KeyWords: 
Information Extraction, Retrieval; Text Mining, Text Analytics, Statistical Computing
Abstract: 

Web text documents available on the internet come in structured, unstructured, or semi-structured formats. Multiple web text documents are downloaded and loaded into a text-mining framework that aims to extract the co-occurrence of entities from textual information. To this end, text-mining scripts written in R that process multiple text documents efficiently are presented. In these scripts, a term-document matrix is built and the association between terms is computed statistically. For statistical computing, R provides a class for term-document matrices, which are constructed from a corpus using the bag-of-words mechanism: all occurrences of words within the corpus are listed, and the result is represented in matrix format.
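
The bag-of-words construction and the term association described in the abstract can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the author's R script. The function names (`term_document_matrix`, `pearson`, `associations`), the whitespace tokenizer, the three sample documents, and the choice of Pearson correlation as the association measure are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def term_document_matrix(docs):
    """Build a bag-of-words matrix: one row per term, one column per document."""
    # Assumption: lower-casing and whitespace splitting stand in for real tokenization.
    counts = [Counter(doc.lower().split()) for doc in docs]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for c in counts] for t in terms]
    return terms, matrix


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def associations(terms, matrix, term, min_corr):
    """Return terms whose rows correlate with `term` at or above `min_corr`."""
    row = matrix[terms.index(term)]
    result = {}
    for other_term, other_row in zip(terms, matrix):
        if other_term == term:
            continue
        r = pearson(row, other_row)
        if r >= min_corr:
            result[other_term] = round(r, 2)
    return result


# Hypothetical three-document corpus, for illustration only.
docs = [
    "text mining in r",
    "text mining corpus",
    "corpus of web documents",
]
terms, tdm = term_document_matrix(docs)
print(associations(terms, tdm, "mining", 0.9))  # {'text': 1.0}
```

Here "text" and "mining" appear in exactly the same documents, so their rows of the matrix are identical and the correlation is 1. Terms that occur in different documents fall below the threshold and are omitted.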