Guoren Wang
Information Science and Engineering, Northeastern University, Shenyang 110004, People's Republic of China
Bin Wang
Information Science and Engineering, Northeastern University, Shenyang 110004, People's Republic of China
Donghong Han
Information Science and Engineering, Northeastern University, Shenyang 110004, People's Republic of China
Baiyou Qiao
Information Science and Engineering, Northeastern University, Shenyang 110004, People's Republic of China
ABSTRACT
Easily accessible information on the World Wide Web (WWW) and affordable large-capacity secondary storage make it easy to build very large document collections, even on personal computers. However, the way files are organized on computers has changed little for decades. Finding a particular document or file in a gigabyte-scale collection organized as a traditional tree-structured file directory is no easy task. This study presents a system in which documents are no longer identified by their file names. Instead, a document is represented by its semantics, in terms of a descriptor and a content vector. The descriptor of a document consists of a set of attributes, such as its date of creation, type, size and annotations. The content vector of a document consists of a set of terms extracted from the document. Such semantic information gives the user an associative searching capability: documents can be retrieved by specifying the required properties. The representation of document semantics, document organization and keyword-based indexing techniques are discussed. Furthermore, because XML is widely used for representing and exchanging data on the Web, structure-based querying techniques are proposed in this study, namely structural indexes and path expression optimization principles. A prototype visual explorer that makes use of document semantics is also described.
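As a rough illustration of the representation described in the abstract, the sketch below models a document as a descriptor (a set of attributes) plus a content vector (a set of extracted terms), and answers associative queries through a keyword index. It is only a minimal sketch under stated assumptions: the names `Document`, `SemanticStore`, `extract_terms` and `search` are hypothetical, the tokenizer is deliberately naive, and none of this is taken from the system the paper describes.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Document:
    descriptor: dict      # attributes such as type, size, date of creation
    content_vector: set   # terms extracted from the document text


def extract_terms(text):
    """Lowercase the text and split it into alphanumeric terms (naive tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class SemanticStore:
    """Hypothetical store: documents are found by properties, not by file name."""

    def __init__(self):
        self.docs = {}                 # document id -> Document
        self.index = defaultdict(set)  # term -> ids of documents containing it

    def add(self, doc_id, descriptor, text):
        doc = Document(descriptor, extract_terms(text))
        self.docs[doc_id] = doc
        for term in doc.content_vector:
            self.index[term].add(doc_id)

    def search(self, terms=(), **attributes):
        """Return sorted ids of documents containing all terms and matching all attributes."""
        ids = set(self.docs)
        for term in terms:
            # .get avoids creating empty index entries for unknown terms
            ids &= self.index.get(term.lower(), set())
        return sorted(
            i for i in ids
            if all(self.docs[i].descriptor.get(k) == v for k, v in attributes.items())
        )


store = SemanticStore()
store.add(1, {"type": "pdf", "year": 2004}, "XML path expression optimization")
store.add(2, {"type": "html", "year": 2004}, "Web document clustering with XML")
print(store.search(terms=["xml"], type="pdf"))
```

A query here combines content terms with descriptor attributes, which is one simple reading of "documents can be retrieved by specifying the required properties"; the structural indexes and path expression optimization for XML mentioned in the abstract are not covered by this sketch.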
How to cite this article
Guoren Wang, Bin Wang, Donghong Han and Baiyou Qiao, 2005. Design and Implementation of a Semantic Document Management System. Information Technology Journal, 4: 21-31.
DOI: 10.3923/itj.2005.21.31
URL: https://scialert.net/abstract/?doi=itj.2005.21.31