Tuesday, January 17, 2006

Definition of Null Values in XML Documents

According to Dr. Candan's report, "A Unified Treatment of Null Values using Constraints", there are at least 5 types of null values in a relational database: existential null (ex_mar), maybe null (ma_mar), place holder null (pl_mar), partial null (pa_mar), and partial maybe null (pm_mar). All of these null values are tag-related; that is, they concern individual values and do not deal with the potential relationships among those values.
For an XML document, null values fall into two high-level categories: tag-related null values and structure-related null values. For the tag-related kind, we can directly reuse the results obtained in the report. (Is that true? I need more consideration and discussion!) The structure-related kind is still undecided, so the first step is to define it.
Usually there is a DTD file associated with a set of XML documents, defining the structure of those documents. We can therefore treat the DTD as a structural description of a class of XML documents. All possible non-leaf nodes should be declared in the DTD file, along with the basic components that make up the tree.
From this point of view, nodes in an XML document can be divided into two categories: non-leaf nodes and leaf nodes. Leaf nodes are value nodes, which change constantly from one document to another. Tag-related null values can occur at leaf nodes. Non-leaf nodes are function nodes, which are almost constant. The relationships among these nodes give a general idea of the structure of the XML document. As a result, structure-related null values should occur at these nodes. But if we treat non-leaf and leaf nodes equally, simply as nodes in the XML document, we can also apply the methods for tag-related null values to this kind of structure-related null value. (I hope this works as expected.)
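To make the leaf / non-leaf split concrete, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. The sample document, the function name classify, and the rule "a leaf element with no text is a candidate tag-related null" are all my own assumptions for illustration; none of them come from the report.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this illustration.
SAMPLE = """<library>
  <book>
    <title>XML Basics</title>
    <author></author>
  </book>
  <book>
    <title>Null Values</title>
    <author>Smith</author>
  </book>
</library>"""


def classify(root):
    """Split elements into non-leaf and leaf nodes, in document order.

    Assumption: a leaf element whose text is missing or only whitespace
    is treated as a candidate tag-related null value.
    """
    non_leaf, leaf, empty_leaf = [], [], []
    for elem in root.iter():
        if len(elem) > 0:
            # Element has child elements: a "function" (non-leaf) node.
            non_leaf.append(elem.tag)
        else:
            # Element has no child elements: a "value" (leaf) node.
            leaf.append(elem.tag)
            if elem.text is None or not elem.text.strip():
                empty_leaf.append(elem.tag)
    return non_leaf, leaf, empty_leaf


root = ET.fromstring(SAMPLE)
non_leaf, leaf, empty_leaf = classify(root)
print("non-leaf:", non_leaf)
print("leaf:", leaf)
print("candidate tag-related nulls:", empty_leaf)
```

This only detects that a value is absent. It says nothing about which of the five null types applies, which would need the constraint information described in the report.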
On the other hand, we have to consider the relationships among different non-leaf nodes (such as parent-child, ancestor-descendant, etc.). Another category of structure-related null values arises here, and it has little in common with tag-related null values. Some questions follow at once: how do we define these null values, and how do they influence the common operations on XML documents?
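As a starting point for that discussion, here is a rough Python sketch of one possible reading of this second category: a parent-child relationship that the structural description expects but that is absent from the document. The dictionary EXPECTED_CHILDREN is a hypothetical hand-written stand-in for a DTD content model (the standard library does not parse DTDs into such a form), and the function name and sample document are also made up. This is not a settled definition, only something to argue about.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a DTD content model:
# each parent tag maps to the child tags it is expected to contain.
EXPECTED_CHILDREN = {
    "library": {"book"},
    "book": {"title", "author"},
}

# Hypothetical sample document; the first book has no author element.
SAMPLE = (
    "<library>"
    "<book><title>XML Basics</title></book>"
    "<book><title>Null Values</title><author>Smith</author></book>"
    "</library>"
)


def missing_children(root, expected):
    """Return (parent_tag, missing_child_tag) pairs in document order.

    Assumption: an expected parent-child edge that is absent is counted
    as a candidate structure-related null value.
    """
    missing = []
    for parent in root.iter():
        required = expected.get(parent.tag, set())
        present = {child.tag for child in parent}
        for tag in sorted(required - present):
            missing.append((parent.tag, tag))
    return missing


root = ET.fromstring(SAMPLE)
gaps = missing_children(root, EXPECTED_CHILDREN)
print(gaps)
```

Note the difference from the leaf case: an empty author element is a value that is missing, while an absent author element is a relationship that is missing. Ancestor-descendant relationships would need a separate check, which this sketch does not attempt.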
