Abstract
With the development of new information and communication technologies, more and more data sources are becoming available. The diversity, distribution, and heterogeneity of these sources make accessing them a real challenge. Data integration systems provide unified and transparent access to heterogeneous data sources, giving users the illusion of querying a single source. A query over the global schema of a data integration system is rewritten into a union of sub-queries over the sources. However, not all sources contain answers to the posed query, and querying such sources wastes time. To optimize queries on an integration system, we propose in this paper to annotate each numeric or enumerated property with the domain of values that the property takes in each source. We group and merge these domains into sub-domains and assign a set of sources to each sub-domain. The sub-domains and their linked sources (the annotations) are stored at the mediator. Consequently, a query with a predicate on one of these annotated properties is checked against the annotations before it is sent to a source: if the source contains no instances that satisfy the predicate, it is removed from the set of sources to be queried. Our proposal is evaluated on a set of generated databases, and the experimental results show a clear improvement in query execution time.
Keywords: Annotation, data integration, ontologies, query optimization.
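As an illustration of the annotation idea summarized in the abstract, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes a single numeric property whose domain in each source is a closed interval with low < high. The function names (build_subdomains, relevant_sources) and the source names S1 to S3 are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[float, float]  # closed interval [low, high], assumed low < high


def build_subdomains(
    source_domains: Dict[str, Interval],
) -> List[Tuple[Interval, Set[str]]]:
    """Split the source domains at every endpoint and attach to each
    resulting sub-domain the set of sources whose domain covers it."""
    points = sorted({p for lo, hi in source_domains.values() for p in (lo, hi)})
    subdomains = []
    for lo, hi in zip(points, points[1:]):
        sources = {
            name
            for name, (s_lo, s_hi) in source_domains.items()
            if s_lo <= lo and hi <= s_hi
        }
        if sources:  # drop gaps that no source covers
            subdomains.append(((lo, hi), sources))
    return subdomains


def relevant_sources(
    subdomains: List[Tuple[Interval, Set[str]]], q_lo: float, q_hi: float
) -> Set[str]:
    """Return the sources that may hold instances satisfying the
    predicate q_lo <= property <= q_hi; all others can be skipped."""
    result: Set[str] = set()
    for (lo, hi), sources in subdomains:
        if lo <= q_hi and q_lo <= hi:  # closed intervals overlap
            result |= sources
    return result


# Hypothetical annotations for a numeric property such as "price".
annotations = build_subdomains(
    {"S1": (0, 100), "S2": (50, 200), "S3": (300, 400)}
)
print(relevant_sources(annotations, 120, 150))
print(relevant_sources(annotations, 250, 260))
```

In this sketch the mediator would send the first query only to S2 and would send the second query to no source at all, since no annotated sub-domain overlaps the predicate range. An enumerated property could be handled analogously by mapping each value to its set of sources.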