From the perspective of data integration, QUEST takes two ontologies from different sources as input. Before continuing, we define some notation.
1. A and B are the two ontologies (or schemas).
2. map(A, B): a set of mapping rules between A and B, which describe how elements or structures in A relate to those in B. These rules also record the conflicts that arise during matching or merging.
3. comb(A, B): the integrated view of A and B. Mapping rules may be merged into this integrated representation.
4. tran(A): converts the ontology/schema A into the graphical representation of QUEST. Below, T(A) is used as shorthand for tran(A).
5. tran(R): integrates the mapping rule R into the graphical representation of QUEST.
Three cases are considered for the data integration of A and B:
1) A + B + map(A, B) --> comb(A, B) --> tran(comb(A, B))
A and B are matched by some algorithm, which also generates the mapping rules between them. All of these are merged into a single integrated view, comb(A, B).
To translate comb(A, B) into the QUEST model of our paper, we would probably have to adapt comb(A, B) to fit our existing model.
The papers "A graph theoretical foundation for integrating RDF ontologies", "A graph-oriented model for articulation of ontology interdependencies", and "Semantic data integration in hierarchical domains" are good examples that could provide an intermediate, integrated form.
Open question: could we use it? The model of comb(A, B) would require modifications.
2) T(A) + T(B) + T(map(A, B))
Many papers on schema matching do not resort to an integrated view of schemas from different sources. Instead, they focus on constructing the mapping rules that link those schemas. Converting a schema such as A into the QUEST model, T(A), appears to be easy with the algorithm in our paper. To handle the mapping rules map(A, B), however, our model would probably have to be extended.
3) A --> T(A) + B --> T(B) + map(T(A), T(B))
In this case, matching is performed on the QUEST representations of A and B, that is, T(A) and T(B). The user is usually asked for input during the matching process. Because the QUEST model must represent null values, it is somewhat more complex, so the matching work is not straightforward for a user who is not an expert.
The conclusion is that case 2) is probably preferable, given the difficulties we could face in modifying the existing algorithm that generates the QUEST model.
Thus, in the following we focus on the case T(A) + T(B) + T(map(A, B)).
If the ontology A is a fairly standard input, it resembles a hierarchical structure and should be easy to convert into QUEST form with our algorithm.
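To make the idea of converting a hierarchical ontology concrete, here is a minimal sketch that flattens a nested hierarchy into nodes and parent-to-child edges. It is not the QUEST conversion algorithm, whose details are not given in these notes. The nested-dict input format, the function name to_graph, and the sample concept names are all assumptions made for illustration.

```python
def to_graph(tree, parent=None):
    """Flatten a nested hierarchy into (nodes, edges).

    `tree` maps each concept name to a dict of its sub-concepts.
    This is a stand-in for T(A), not the actual QUEST algorithm.
    """
    nodes, edges = [], []
    for name, children in tree.items():
        nodes.append(name)
        if parent is not None:
            edges.append((parent, name))  # parent -> child edge
        sub_nodes, sub_edges = to_graph(children, name)
        nodes.extend(sub_nodes)
        edges.extend(sub_edges)
    return nodes, edges


# Hypothetical ontology A, written as a nested hierarchy.
ontology_a = {"Person": {"Name": {}, "Address": {"City": {}, "Street": {}}}}
nodes, edges = to_graph(ontology_a)
```

Any extra structure that the QUEST model needs, such as its treatment of null values, would have to be added on top of such a plain graph.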
map(A, B) yields a set of mapping rules describing the relationships between schemas from different sources. We divide these rules into two categories:
(1) mapping rules
This category covers three situations:
a. node - node
The two nodes from different schemas are equivalent with respect to some matching concept (that is, both represent the same concept).
One method is to merge the two nodes into one in the final QUEST model.
The other is to introduce a new node that represents the shared concept and takes the two nodes as children.
b. node - group of nodes
A group consists of more than one node, and we distinguish two subcategories based on the structure of the group.
If the group has no structure, or its structure can be ignored, it is a simple set of nodes.
* Merging, or introducing a new concept node, is the basic method for these cases.
Otherwise, the nodes in the group are organized in a structural form, such as a tree, a graph, or some pattern.
* This case is somewhat more complex.
c. group of nodes - group of nodes
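The two methods described for the node - node situation (merging the nodes, or introducing a new concept node above them) can be sketched on a plain adjacency map. This is only an illustration under stated assumptions: the graph is a dict from node name to a set of child names, and the function names and the "A."/"B." prefixed node names are hypothetical, not part of the QUEST model.

```python
def merge_nodes(graph, a, b, merged):
    """Method 1: replace the equivalent nodes a and b by one node `merged`."""
    result = {}
    for node, children in graph.items():
        new_node = merged if node in (a, b) else node
        new_children = {merged if c in (a, b) else c for c in children}
        result.setdefault(new_node, set()).update(new_children)
    result.setdefault(merged, set())
    result[merged].discard(merged)  # avoid a self-loop if a was a child of b
    return result


def add_concept_node(graph, a, b, concept):
    """Method 2: keep a and b, and add a new node `concept` with both as children."""
    result = {node: set(children) for node, children in graph.items()}
    result.setdefault(a, set())
    result.setdefault(b, set())
    result[concept] = {a, b}
    return result


# Hypothetical fragments of two schemas with one node - node rule:
# A.Person is equivalent to B.Human.
graph = {
    "A.Person": {"A.Name"},
    "A.Name": set(),
    "B.Human": {"B.Age"},
    "B.Age": set(),
}
merged_view = merge_nodes(graph, "A.Person", "B.Human", "Person")
concept_view = add_concept_node(graph, "A.Person", "B.Human", "PersonConcept")
```

Merging gives a smaller graph but loses the origin of each node, whereas the concept-node variant keeps both source nodes intact at the cost of an extra level. The same two operations extend to an unstructured group of nodes by applying them to more than two names.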
(2) constraints (see paper "A graph theoretical foundation for integrating RDF ontologies")
a. Horn constraints: r1 ^ r2 ^ ... ^ rn -> t
b. Negative constraints: <> etc.
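A Horn constraint of the form r1 ^ r2 ^ ... ^ rn -> t can be evaluated by forward chaining: whenever all premises hold, the conclusion is added, and this repeats until nothing new is derived. The sketch below treats each r_i and t as an opaque string, which is a simplifying assumption; the cited paper may give them more structure. The function name forward_chain and the sample rules are hypothetical, and negative constraints are not covered here.

```python
def forward_chain(facts, rules):
    """Apply Horn rules (body -> head) until no new fact can be derived.

    `facts` is a set of atoms known to hold.
    `rules` is a list of (body, head) pairs, where body is a tuple of atoms.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and set(body) <= derived:
                derived.add(head)
                changed = True
    return derived


# Hypothetical rules: r1 ^ r2 -> t, and t ^ r3 -> u.
rules = [
    (("r1", "r2"), "t"),
    (("t", "r3"), "u"),
]
closure = forward_chain({"r1", "r2", "r3"}, rules)
```

In an integration setting, the initial facts would be the relationships asserted in the individual sources, and the closure would contain the relationships that also hold in the combined view.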