Sunday, February 17, 2008

XML 1.0 (Fifth Edition)

The fifth edition of XML 1.0 is now a 'Proposed Edited Recommendation'
(PER). New editions usually do little more than incorporate errata, which
is hardly newsworthy. This one is different. The Fifth Edition is now out for review.
The review period is long, lasting until 16-May-2008, because one of
the proposed changes is significant. Before the fifth edition, XML 1.0
was explicitly based on Unicode 2.0. As of the fifth edition, it is
based on Unicode 5.0.0 or later. This effectively allows not only
characters used today, but also characters that will be used tomorrow.
One of the real strengths of XML from the very beginning was that it
required processors to support Unicode. This made XML, and all XML
processors, international. But as Unicode has been extended to support
languages written in Cherokee, Ethiopic, Khmer, Mongolian, Canadian
Syllabics, and other scripts, XML 1.0's explicit use of Unicode 2.0 has
prevented it from growing as well. That's a problem that XML must fix
if it wants to continue to be regarded as a universal text format...
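
To make the change concrete, here is a rough sketch of what a fifth edition name check might look like. The code point ranges are transcribed from my reading of the NameStartChar and NameChar productions in the fifth edition draft, so verify them against the spec before relying on them. The function names (`is_name_start_char`, `is_name_char`, `is_fifth_edition_name`) are made up for this illustration and do not come from any real parser.

```python
# Sketch only: ranges transcribed from the NameStartChar / NameChar
# productions of XML 1.0 Fifth Edition. Check them against the spec.
NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # A-Z
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Characters allowed after the first position, in addition to the above.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),        # "-" and "."
    (0x30, 0x39),        # 0-9
    (0xB7, 0xB7),        # middle dot
    (0x300, 0x36F),      # combining diacritical marks
    (0x203F, 0x2040),
]


def _in_ranges(ch, ranges):
    """Return True if the code point of ch falls in any (lo, hi) range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name_start_char(ch):
    return _in_ranges(ch, NAME_START_RANGES)


def is_name_char(ch):
    return is_name_start_char(ch) or _in_ranges(ch, NAME_EXTRA_RANGES)


def is_fifth_edition_name(name):
    """Hypothetical helper: does name match the fifth edition Name rule?"""
    if not name:
        return False
    return is_name_start_char(name[0]) and all(is_name_char(c) for c in name[1:])


# A Cherokee element name (U+13E3 U+13B3 U+13A9). Cherokee was added to
# Unicode after 2.0, so the broad range 0x37F-0x1FFF is what admits it.
cherokee_name = "\u13e3\u13b3\u13a9"
print(is_fifth_edition_name(cherokee_name))
print(is_fifth_edition_name("1abc"))  # a digit cannot start a name
```

Note how short the tables are. Instead of enumerating every letter known to one Unicode version, the ranges are deliberately broad, which is why characters assigned in later Unicode versions are accepted without touching the parser.
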
The fifth edition does not change the status of any existing XML 1.0
document with respect to well-formedness or validity. Nor does it
introduce any of the backwards-incompatible changes introduced in XML 1.1.
It isn't entirely without pain, unfortunately. Even if we imagine that
all parsers will eventually be updated to reflect the fifth edition (and
it's possible to be optimistic on this point, as the change actually makes
parsers smaller and simpler), there will be some period of time in
which your (fourth edition) parser might reject my (fifth edition)
document. The XML Core WG is taking the position that the benefits of
extending XML 1.0 in this way outweigh the costs imposed by the change.
It remains to be seen if the community will agree. Bear in mind that
this sort of change isn't entirely unprecedented: we previously
decoupled 'xml:lang' attributes from the relevant RFCs and we tinkered
with the specific version of Unicode 3 referenced. That said, this is
still a much more substantial change.

More Information: See also the XML-DEV discussion thread.
