
Friday, December 14, 2007

Validation by Projection

Many architectures and strategies for validation apply validity
checking to a particular document and produce a pass or fail result for
that document. This assumes that the schemas used in validation are
expressive enough for all the potential versions of documents, including
any extensions. We've regularly seen that the XML Schema 1.0 wildcard
limits the ability to fully describe documents. For example, it is
impossible to have a content model that has optional elements in
multiple namespaces with a wildcard at the end, because the optional
elements and the wildcard then compete for the same content (a violation
of the Unique Particle Attribution constraint). The choice is to either
have the wildcard or the elements.

There is another approach to validation, called validation by
projection, which effectively removes any unknown content prior to
validation. It is validation of a projection of the XML document, where
the projection is a subset of the XML document with no other
modifications to the contents, including order.

Part of validation by projection is determining what to project. The
simplest rule is: starting at the root element, project any attributes
and any elements that match declarations in the current complexType,
then recurse into each projected element.
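
The projection rule can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not a real schema
processor: the nested dictionary PERSON_TYPE is a hypothetical stand-in
for a compiled complexType (declared attribute names plus known child
elements), the sample document is invented, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a compiled schema: each "complexType" lists the
# attribute names it declares and maps known child element names to their
# own types. A real implementation would derive this from an XSD.
LEAF_TYPE = {"attributes": set(), "children": {}}
PERSON_TYPE = {
    "attributes": {"id"},
    "children": {"given": LEAF_TYPE, "family": LEAF_TYPE},
}


def project(element, complex_type):
    """Return a copy of element that keeps only declared attributes and
    declared child elements, in their original order."""
    projected = ET.Element(element.tag)
    projected.text = element.text
    # Keep only the attributes the current type declares.
    for name, value in element.attrib.items():
        if name in complex_type["attributes"]:
            projected.set(name, value)
    # Keep only known children, recursing with each child's own type.
    for child in element:
        child_type = complex_type["children"].get(child.tag)
        if child_type is not None:
            projected.append(project(child, child_type))
    return projected


# Invented sample: "nickname" and <middle> are extensions unknown to the type.
doc = ET.fromstring(
    '<person id="7" nickname="al">'
    "<given>Ada</given><middle>B</middle><family>Lovelace</family>"
    "</person>"
)
result = project(doc, PERSON_TYPE)
print(ET.tostring(result, encoding="unicode"))
```

In this sketch the unknown nickname attribute and middle element are
dropped, while the known content keeps its original order. The
projected tree, not the original document, is what would then be handed
to an ordinary validator; that validation step is not shown here.
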

[Author's note to W3C TAG: I wrote up a couple of personal blog entries
on validation by projection. This seems to be a useful way of achieving
forwards and backwards compatibility without relying upon schemas that
have wildcards or open content models. From the TAG's definitional
perspective, I'd characterize validation by projection as an architecture
where the schema(s) define a Defined Text Set and an Accept Text Set
that is equal to the Defined Text Set. The process of projection then
amounts to validating the text against a generated Accept Text Set: the
original Accept Text Set plus all possible extra undefined elements and
attributes.]
