Saturday, December 02, 2006

A Survey on Information Systems Interoperability
This survey presents some basic concepts of information systems interoperability, and the paper is a good place to review them. Its structure is clear and most of the content is easy to understand. The author has rich experience applying information systems techniques to agriculture. As I understand it, the target is a system that manages multiple databases containing agricultural information, so it could be quite helpful for our work, since our goal is to build a data-integration system that manages multiple archeological databases.

Now, let's take a look at the paper section by section, the easiest way to review :)
  1. The first section discusses the motivation and the basic method used.
    1. The goal is to construct a data warehouse (or materialized view) that integrates several kinds of data sources, particularly for scientific applications in agriculture. This is interesting because it is similar to our goal in KADIS; the only difference is that our application is about archeology.
    2. The background of the application is that distinct data sources may be maintained independently. The paper focuses on semantic data heterogeneity and suggests an incremental, modular approach to the data integration issues.
  2. Information System Interoperability
    1. The paper suggests that the only way to reach interoperability is by publishing the interfaces, schemas and formats used for information exchange, making their semantics as explicit as possible, so that they can be properly handled by the cooperative systems. 
    2. Three viewpoints must be considered for information systems interoperability: application domain, conceptual design, and software system technology. Interoperability should be achieved at each viewpoint.
  3. Data System Interoperability
    1. This section first presents the definitions of a centralized database system and a heterogeneous database system.
    2. Two categories of approaches enable integrated access to multiple physical databases: schema integration and the federated approach.
    3. Web databases are also mentioned in this section; the challenge in research on querying Web databases is constructing a unified and simple interface.
  4. Data Integration
    1. The basic procedure of data integration considered here includes resolving heterogeneity conflicts and transforming source data to accommodate it in the integrated view.
    2. The kinds of data to be integrated and the heterogeneity conflicts should first be categorized.
    3. The structure of data is discussed: structured data and semi-structured data.
    4. Data heterogeneity, or conflict, is summarized. Data conflicts are classified in two ways:
      1. as representational conflicts or semantic conflicts;
      2. by level of abstraction (instance, schema, data model), which gives data conflicts, schema conflicts, data versus schema conflicts, and data model conflicts.
    5. Some proposals are presented for solving these conflicts in relational databases (RDB) and semi-structured data. Some surveys still need reading here.
    6. Another way to solve the conflicts is to construct a standard for describing the semantics (common semantics).
    7. The paper suggests a series of procedures for generating the unified view of heterogeneous data. I think it's inspiring for our work.
  5. Building blocks to integrate data in cooperative systems
    1. In this section, the author describes the software framework, modules, and techniques that could be used to contribute to the integrated data views.
  6. The semantic web
    1. A brief digest is presented here, giving a general idea of the semantic Web standards and technologies:
      1. Character Encoding + URI 
      2. XML + Schema
      3. RDF + RDFS
      4. Ontology
  7. Web services
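
To make the conflict categories from the Data Integration section a bit more concrete for myself, here is a minimal sketch in Python. It is not taken from the paper: the two sources, their field names, the unit difference, and the synonym table are all hypothetical, chosen only to show a schema conflict (different attribute names), a representational conflict (centimeters versus meters), and a simple semantic conflict (two terms for one concept) being resolved into a unified view.

```python
# Hypothetical records from two independently maintained sources.
# Every field name and value here is invented for illustration.
SOURCE_A = [{"site_id": "A-1", "depth_cm": 120, "material": "pottery"}]
SOURCE_B = [{"id": "B-7", "depth": 2.5, "kind": "Ceramic"}]  # depth in meters

# Schema conflict: the same attributes carry different names in source B.
FIELD_MAP_B = {"id": "site_id", "depth": "depth_m", "kind": "material"}

# Semantic conflict: different terms used for the same concept.
SYNONYMS = {"ceramic": "pottery"}


def normalize_material(value):
    """Map a material term onto the shared (assumed) vocabulary."""
    term = value.strip().lower()
    return SYNONYMS.get(term, term)


def from_source_a(record):
    """Resolve the representational conflict: centimeters to meters."""
    return {
        "site_id": record["site_id"],
        "depth_m": record["depth_cm"] / 100,
        "material": normalize_material(record["material"]),
    }


def from_source_b(record):
    """Rename source B fields to the unified schema, then normalize terms."""
    renamed = {FIELD_MAP_B[key]: value for key, value in record.items()}
    renamed["material"] = normalize_material(renamed["material"])
    return renamed


def unified_view():
    """Build the integrated view over both hypothetical sources."""
    return [from_source_a(r) for r in SOURCE_A] + [from_source_b(r) for r in SOURCE_B]


if __name__ == "__main__":
    for row in unified_view():
        print(row)
```

The per-source transformation functions are just one possible design; a real system along the lines of the paper would presumably drive such mappings from published schemas and shared semantics rather than hard-coded tables.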
