Abstract
Background: The number of pedagogic web pages created each year, which corresponds to the number of courses and exercises available on the web, exceeds the number of books published each year. These web documents are often too long to read easily, especially when important information is dispersed across various parts and is defined in a more or less formal way. They must therefore be described with machine-readable data; otherwise they become unusable and impossible to find. The main objectives of this paper are to enhance information sharing, improve trade, and increase interoperability on the web. Recently, few patents on semantic annotation have been published. With the great mass of data managed throughout the world, and especially with the evolution of the web towards the Semantic Web, where annotations are associated with all types of web documents, the selection of annotations has become an important criterion in the search step. In this article, we focus on the annotation of web documents and on the validation of this annotation.
Methods: The keywords representing the page are identified and tagged with the concepts of the ontology. The words that make up the annotation are determined by a mixed analysis that combines the degree of similarity with the frequency. When inconsistencies are detected, the annotation is corrected in a revision module.
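As an illustration only, the mixed analysis described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the system evaluated in this paper: the function names, the weight `alpha`, the use of `difflib.SequenceMatcher` as the similarity measure, and the weighted-sum combination are all hypothetical choices made for the example.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def term_frequencies(text):
    """Return the relative frequency of each lowercase word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()} if total else {}


def best_concept(word, concepts):
    """Return (concept, similarity) for the ontology label closest to the word.

    Assumption: plain string similarity stands in for the paper's similarity degree.
    """
    scored = [(c, SequenceMatcher(None, word, c.lower()).ratio()) for c in concepts]
    return max(scored, key=lambda pair: pair[1])


def annotate(text, concepts, alpha=0.5, top_k=3):
    """Rank words by a weighted sum of concept similarity and normalized frequency.

    Assumption: alpha and the weighted sum are illustrative, not taken from the paper.
    """
    freqs = term_frequencies(text)
    if not freqs or not concepts:
        return []
    max_freq = max(freqs.values())
    ranked = []
    for word, freq in freqs.items():
        concept, similarity = best_concept(word, concepts)
        score = alpha * similarity + (1 - alpha) * (freq / max_freq)
        ranked.append((word, concept, score))
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    page = "exercise exercise exercise course lecture the"
    for word, concept, score in annotate(page, ["Exercise", "Course"], top_k=2):
        print(word, concept, round(score, 3))
```

In this sketch, a word ranks highly when it both occurs often in the page and closely matches an ontology concept label; a real system would likely replace the string comparison with a semantic similarity measure over the ontology.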
Results: The results obtained are very encouraging and show the importance of our validation module, which is applied after merging the two annotation techniques used for keyword extraction.
This validation establishes trust between the annotation systems and the search engines that consume the annotations they create.
Conclusion: The extraction of the words used in the annotation is a very important factor, because it determines how fairly the documents in question are represented. Once the annotation is made, the tests of the validation stage make the annotations consistent and ready to be consumed by search engines.
Keywords: Annotation, semantic web, ontology, similarity degree, frequency calculation, consistency.