Bureaucratic domains, and the legal domain in particular, are characterized by a huge amount of information. To manage the knowledge embedded in documents effectively for structuring, indexing and retrieval purposes, a suitable statistical-lexical approach is required that quickly identifies relevant and peculiar information. The main goal of this study is to describe two integrated strategies for the semi-automatic extraction of significant and peculiar terms from a corpus of documents belonging to the legal domain. The extracted lexicon will provide a basis for constructing a conceptual system to be used as a knowledge base supporting the semantic processing of documents.
F. Amato, R. Canonico, A. Mazzeo and A. Picariello, 2011. Statistical and Lexical Analysis for Semi-automatic Extraction of Relevant Information from Legal Documents. Journal of Applied Sciences, 11: 639-646.