Abstract:
This research presents preliminary results generated from the semantic retrieval research component of the Illinois Digital Library Initiative (DLI) project. Using a vari...Show MoreMetadata
Abstract:
This research presents preliminary results generated from the semantic retrieval research component of the Illinois Digital Library Initiative (DLI) project. Using a variation of the automatic thesaurus generation techniques, to which we refer to as the concept space approach, we aimed to create graphs of domain-specific concepts (terms) and their weighted co-occurrence relationships for all major engineering domains. Merging these concept spaces and providing traversal paths across different concept spaces could potentially help alleviate the vocabulary (difference) problem evident in large-scale information retrieval. In order to address the scalability issue related to large-scale information retrieval and analysis for the current Illinois DLI project, we conducted experiments using the concept space approach on parallel supercomputers. Our test collection included computer science and electrical engineering abstracts extracted from the INSPEC database. The concept space approach called for extensive textual and statistical analysis (a form of knowledge discovery) based on automatic indexing and co-occurrence analysis algorithms, both previously tested in the biology domain. Initial testing results using a 512-node CM-5 and a 16-processor SGI Power Challenge were promising.
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence ( Volume: 18, Issue: 8, August 1996)
DOI: 10.1109/34.531798