Topic-Sensitive Multi-document Summarization Algorithm


Abstract:

Latent Dirichlet Allocation (LDA) has recently been used to automatically generate topics from text corpora and has been applied to sentence-extraction-based multi-document summarization algorithms. However, not all of the estimated topics are of equal importance or correspond to genuine themes of the domain. Some topics can be a collection of irrelevant or background words, or represent insignificant themes. This paper proposes a topic-sensitive algorithm for multi-document summarization. Our approach is distinguished from existing approaches in that we use the LDA model to identify and distinguish significant topics, which are then used in the sentence weight calculation. Moreover, besides topic characteristics, this approach also considers statistical characteristics such as term frequency, sentence position, sentence length, etc. The approach thus combines the advantages of statistical characteristics with the LDA topic model. Experiments show that the proposed algorithm achieves better performance than other state-of-the-art algorithms on the DUC2002 corpus.
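The abstract does not give the scoring formula, so the following is only a minimal sketch of how a topic feature might be combined with statistical features into one sentence weight. The function name `score_sentences`, the linear combination, the default `weights`, the `ideal_len` parameter, and the toy topic data are all assumptions made for illustration; they are not taken from the paper. Per-topic word probabilities and topic significance values are supplied as plain inputs here rather than estimated with LDA.

```python
from collections import Counter


def score_sentences(sentences, topic_word_probs, topic_significance,
                    weights=(0.4, 0.3, 0.2, 0.1), ideal_len=20):
    """Score tokenized sentences with a hypothetical linear feature mix.

    sentences: list of token lists, in document order.
    topic_word_probs: one dict per topic mapping word -> probability.
    topic_significance: one value per topic (assumed to sum to at most 1).
    """
    tf = Counter(w for sent in sentences for w in sent)
    n = len(sentences)
    if not tf:
        return [0.0] * n
    max_tf = max(tf.values())
    w_topic, w_freq, w_pos, w_len = weights

    scores = []
    for i, sent in enumerate(sentences):
        if not sent:
            scores.append(0.0)
            continue
        # Topic feature: mean significance-weighted topic probability of the words.
        topic = sum(
            sig * probs.get(w, 0.0)
            for w in sent
            for sig, probs in zip(topic_significance, topic_word_probs)
        ) / len(sent)
        # Term-frequency feature: mean corpus frequency, normalized to [0, 1].
        freq = sum(tf[w] for w in sent) / (len(sent) * max_tf)
        # Position feature: earlier sentences receive a higher value.
        position = 1.0 - i / n
        # Length feature: closeness to an assumed ideal length.
        length = min(len(sent), ideal_len) / max(len(sent), ideal_len)
        scores.append(w_topic * topic + w_freq * freq
                      + w_pos * position + w_len * length)
    return scores


# Toy input: the second topic stands in for a low-significance background topic.
sentences = [
    ["lda", "finds", "topics"],
    ["topics", "guide", "summarization"],
    ["the", "of", "and"],
]
topic_word_probs = [
    {"lda": 0.3, "topics": 0.4, "summarization": 0.3},
    {"the": 0.4, "of": 0.3, "and": 0.3},
]
topic_significance = [0.9, 0.1]

scores = score_sentences(sentences, topic_word_probs, topic_significance)
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy input the sentence made up of background words ranks last, because its words carry probability only under the low-significance topic. An extractive summarizer would then pick sentences in `ranked` order until a length budget is reached.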
Date of Conference: 13-15 July 2014
Date Added to IEEE Xplore: 07 October 2014
Conference Location: Beijing, China
