Sunday, October 14, 2007

Bytes not Infosets

Security is the one area where the WS-* world has developed a set of
standards that provide significantly more functionality than has so
far been standardized in the REST world. I don't believe that this
is an inherent limitation of REST; I'm convinced there's an opportunity
to standardize better security for the REST world. So I've been giving
quite a lot of thought to the issue of what the REST world can learn
from WS-Security (and its numerous related standards).

Peter Gutmann has a thought-provoking piece on his web site in which he
argues that XML security (i.e. XML-DSig and XML Encryption) is
fundamentally broken... I would suggest that there are two different
ways to view XML:

(1) the concrete view: in this view, interchanging XML is all about
interchanging sequences of bytes in the concrete syntax defined by XML
1.0;

(2) the infoset view: in this view, interchanging XML is all about
interchanging abstract structures representing XML infosets; the syntax
used to represent the infoset is just a detail to be specified by a
binding; the infoset view tends to lead to bindings up the wazoo.

I think each of these views has its place. The infoset is an invaluable
conceptual tool for thinking about XML processing. However, I think
there's been an unfortunate tendency in the XML world (and the WS-* world)
to overemphasize the infoset view at the expense of the concrete view.
I believe this tendency underlies a lot of the problems that Gutmann
complains of. There's nothing unstable or unsignable about an XML
document under the concrete view. It's just a blob of bytes that you
can hash and sign as easily as anything else (putting external entities
on one side for the moment).

The infoset view makes it hard to accommodate non-XML formats as
first-class citizens. If your central data model is the XML infoset,
then everything that isn't XML has to get mapped into XML in order to
be accommodated. For example, the WS-* world has MTOM. This tends to
lead to reinventing XML versions of things just so they can be
first-class citizens in an infoset-oriented world...

Although infoset-level APIs are needed for processing XML,
when you use infoset-level APIs for interchanging XML between separate
components, I believe you pay a significant price in terms of
flexibility and generality. In particular, using infoset-level APIs
at trust boundaries seems like a bad idea.

My conclusion is this: one
aspect of the WS-* approach that should not be carried over to the REST
world is the emphasis on XML infosets.
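
To make the contrast between the two views concrete, here is a minimal Python sketch. It is my own illustration, not something taken from Gutmann's piece or from any WS-* specification. The two sample documents, the shared key, and the helper names (`sign_bytes`, `verify_bytes`, `canonical`) are all hypothetical, and I use an HMAC as a simple stand-in for a real digital signature. Under the concrete view the bytes are authenticated exactly as they are; under the infoset view two different byte sequences have to be normalized (here with Canonical XML 2.0 from the standard library, Python 3.8 or later) before they can be treated as the same thing.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

# Two serializations of the same infoset: only attribute order and
# quote style differ. Both documents are made up for illustration.
DOC_A = b'<order id="42" currency="USD"><item>widget</item></order>'
DOC_B = b"<order currency='USD' id='42'><item>widget</item></order>"

# Assumption: a pre-shared key; a real deployment would more likely
# use a public-key signature rather than an HMAC.
KEY = b"hypothetical-shared-secret"


def sign_bytes(payload: bytes, key: bytes = KEY) -> str:
    """Concrete view: authenticate the exact bytes of the document."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_bytes(payload: bytes, tag: str, key: bytes = KEY) -> bool:
    """Check a tag against the bytes received, in constant time."""
    return hmac.compare_digest(sign_bytes(payload, key), tag)


def canonical(payload: bytes) -> str:
    """Infoset-style view: normalize to C14N 2.0 before comparing."""
    return ET.canonicalize(payload.decode("utf-8"))


if __name__ == "__main__":
    tag = sign_bytes(DOC_A)
    print("A verifies against its own tag:", verify_bytes(DOC_A, tag))
    print("B verifies against A's tag:", verify_bytes(DOC_B, tag))
    print("A and B canonicalize identically:",
          canonical(DOC_A) == canonical(DOC_B))
```

Nothing in the byte-level functions knows or cares that the payload is XML, so the same code would work unchanged for JSON, an image, or any other format. The canonicalization step is what the infoset view adds, and it is only needed if some intermediary is allowed to re-serialize the document between signing and verification.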
