
Tuesday, January 8, 2008

The Design Goals of XML

Recently, I have several times heard people quote the XML goals to
support various opinions on what makes a good or bad markup language
(schema). In particular, [the XML Specification] goal #10 ("Terseness
in XML markup is of minimal importance") gets used to claim that
abbreviated element names go against the spirit of XML -- a blithe
spirit indeed. But if we look
at the XML Spec, we see that these are not general goals for XML
documents to follow, but goals for the committee designing XML the
technology: they are explicitly design goals... Looking at the goals
[and see Tim Bray's comments in the Annotated Version of the XML spec]
you can see that most of the goals are specific responses to problems
either with SGML or with the SGML process at ISO then. ISO standards
were supposed to have 10 year reviews which would be an opportunity for
changes to be addressed, outside the ordinary maintenance process. But
some influential and vital members of the ISO group had been committed
to keeping SGML unchanged for as long as possible, and many of the
other members who wanted change wanted changes that would support
technologies such as ISO HyTime better: these would be changes that
made SGML more complicated and variegated rather than simpler, to the
frustration of all... [John Cowan replies:] Two failures of the XML
process: It would have been better to require that public identifiers
be URIs, thus allowing system identifiers to continue their historic
function of being local addresses for things. Instead, we have public
ids limited to the charset of formal public ids, but nobody uses (or
knows how to use) FPIs. The XML Recommendation should have made clear
that the purpose of attributes of type NOTATION was to specify the
data type of the content of an element, rather than allowing this
function to be reinvented as 'xsi:type'... More Information

Web Applications Format and WS-ResourceTransfer Both Overload HTTP GET

I've been reviewing the Access Control for Cross-site Requests document.
One interesting aspect of the document is that it specifies how a web
site can authorize other web sites to do non-GET operations such as
PUT or DELETE. The client makes an authorization request by creating
an HTTP GET with the http header Method-Check. The server then responds
with an HTTP Response containing Access-Control HTTP Headers or even
an XML document with Processing Instructions. The part that I found
very interesting is that it seems that the client's authorization
request isn't really for the resource identified by the URI, because
the goal is to actually get the authorization information. Thus, an
HTTP GET has been overloaded to be a GET of metadata about a resource.
Also interesting: if the server behind the URI for some reason doesn't
know about the Method-Check header, then it will return the "wrong"
representation, that is, the actual representation. There is no way of
requiring that the server understand the Method-Check request. Over in WS-* land,
WS-ResourceTransfer is a specification that uses a SOAP header
wsrt:ResourceTransfer to indicate that there may be RT specific
extensions to the WS-Transfer operation, such as GET. Because it uses
a SOAP header, it can use the soap:mustUnderstand attribute to require
that the server understand it. This seems to me an interesting
case where SOAP solves a problem that Access Control for
Cross-site Requests has, namely the ability to mark a header as
mustUnderstand. This isn't surprising, given that SOAP was
created precisely to solve problems with HTTP headers. More Information
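
Below is a minimal sketch, in plain Java, of the exchange described above:
a client sends an ordinary GET carrying a Method-Check header and inspects
the response for Access-Control headers. The target URL is a placeholder,
and the exact header-value syntax is only loosely based on the draft; treat
this as an illustration of the interaction, not a conforming implementation.

    import java.io.IOException;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class MethodCheckProbe {
        public static void main(String[] args) throws IOException {
            // Placeholder resource; the draft has the client ask, via GET,
            // whether a non-GET method such as PUT would be authorized.
            URL url = new URL("http://example.org/resource");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Method-Check", "PUT"); // value syntax assumed

            int status = conn.getResponseCode();
            // A server that implements the draft answers with Access-Control
            // headers; one that does not simply returns the representation.
            String accessControl = conn.getHeaderField("Access-Control");
            if (accessControl != null) {
                System.out.println("Authorization policy: " + accessControl);
            } else {
                System.out.println("No Access-Control header; status " + status);
            }
            conn.disconnect();
        }
    }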

XML-based Information Delivery Framework on the Desktop

JustSystems is offering xfy Technology as a fundamentally disruptive
technology, promoted as "the world's first XML-based information delivery
framework". Xfy is a data processor not a word processor and can create
all sorts of standards-based documentation. It could be described as an
XML browser, rather than a web browser. It deals with XML from scratch
and makes use of XML Schema, Extensible Stylesheet Language
Transformations (XSLT) and the extensibility of XML. xfy is a new
approach to building RIA, which produces composite documents based on
standard XML and integration via XSLT. It is a framework for enterprise
mashups with role-based visualisation and translation -- which "enables"
the addition of semantics (meaning) to data (its roadmap seems to support
semantic web in the future) and manages complexity. It features fast
deployment, and there is no expensive XML parsing before data can be
used. It is centralised and server-based. Xfy competes with Silverlight
and Flex, but it is just about XML, not code - it is "just" an RIA-style
framework for near-real-time applications. It is most effective in
(but not limited to) Java environments, as xfy has a Java-based client
that can combine/unify/normalise and visualise information from multiple
sources on the client side without the need for server-based scripting.
Debuggers are available but everything is built from reusable XML
components, so xfy users should not get into deep, complicated
programming-style logic...
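
xfy's own APIs are not shown in the article, so the sketch below uses only
the standard JAXP classes to illustrate the kind of XSLT-driven integration
it describes: one XML vocabulary transformed into another for presentation.
The file names are placeholders.

    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class XsltCompose {
        public static void main(String[] args) throws Exception {
            // Compile a (placeholder) stylesheet and apply it to a source
            // document, producing an XHTML view of the data.
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource("orders.xsl"));
            t.transform(new StreamSource("orders.xml"),
                        new StreamResult("orders.xhtml"));
        }
    }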

Tech Giants Woo Developers Seeking JavaScript Turf

It's been a pretty busy fall and winter in software tools. In early
December 2007, Google hosted its Google Web Toolkit conference. Probably
not coincidentally that same week, Microsoft took the wraps off its
nascent Volta toolset. Both products allow developers to use their
existing expertise in Java (GWT) or in .NET-supported languages (Volta)
to write applications that will run on any device supporting JavaScript.
And thus the battle for developers' hearts and minds continues along
the Java vs. .NET front. "Microsoft has saturated the enterprise market
with Visual Studio and the other part of that market is owned by Eclipse,"
says Dave Thomas, founder and chairman of Bedarra Research Labs, a
long-time programming expert. He is taking a look at Volta, which will
probably end up as a Visual Studio add-on. So now Microsoft is striking
out beyond the enterprise into the more consumer-oriented Web
application development currently dominated by Adobe/Macromedia toolsets.
The common denominator here for both the Microsoft and Google tools
is down-and-dirty JavaScript... Adobe this fall also took a tiny
open-source step, offering up its ActionScript Virtual Machine to
Mozilla.org. The resulting Tamarin Project hopes to bring ActionScript's
flashiness into the Firefox realm. ActionScript is the scripting
language embedded in Adobe's ubiquitous Flash player. Adobe, with
its Macromedia muscle, is still pretty much in the proprietary software
camp but some viewed this as a step forward.
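
To make the Java-to-JavaScript point concrete, here is a minimal GWT entry
point of the sort described above, written against the GWT 1.x widget and
listener API; the class name and text are made up. The GWT compiler turns
this Java class into JavaScript that runs in the browser.

    import com.google.gwt.core.client.EntryPoint;
    import com.google.gwt.user.client.Window;
    import com.google.gwt.user.client.ui.Button;
    import com.google.gwt.user.client.ui.ClickListener;
    import com.google.gwt.user.client.ui.RootPanel;
    import com.google.gwt.user.client.ui.Widget;

    public class HelloModule implements EntryPoint {
        // Called once the module is loaded in the browser.
        public void onModuleLoad() {
            Button button = new Button("Say hello");
            button.addClickListener(new ClickListener() {
                public void onClick(Widget sender) {
                    Window.alert("Hello from Java, compiled to JavaScript");
                }
            });
            RootPanel.get().add(button);
        }
    }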

Ten Mistakes Companies Make When Implementing SOA Projects

A growing number of companies are implementing SOA projects to lower
IT integration costs, while improving the time it takes to make changes
to business units. However, according to Paul Callahan, manager of
technical services at NetManage, of those companies that have begun
deployment, many are held up in the early implementation phase. This
has resulted in a number of organizations either scaling down or
abandoning their SOA deployment plans. "There are fairly common mistakes
to avoid when implementing an SOA project, and best practices are
starting to emerge based on successful, enterprise-wide deployments."
To follow are the 10 most common mistakes companies make when
implementing an SOA project. Recognize and avoid these potential
pitfalls to successfully get your SOA initiative off the ground: (1)
Taking a Shotgun Approach; (2) Failing to Involve Business Analysts;
(3) Spending More Time on SOA Products than SOA Planning; (4) Tackling
the Largest Projects First; (5) Forgetting that SOA is a Business
Problem; (6) Treating Identity as an Afterthought; (7) Buying New
Products When Existing Investments Suffice; (8) Misunderstanding
Company Key Players; (9) Expecting the SOA Project to Spread Quickly;
(10) Lacking Necessary Elements.

XHTML Access Module: Module to Enable Generic Document Accessibility

W3C invites public comment on the First Public Working Draft of
"XHTML Access Module: Module to Enable Generic Document Accessibility."
The document has been produced by members of the XHTML2 Working Group;
it has been developed in conjunction with the W3C's Web Accessibility
Initiative and other interested parties. The specification is intended
to help make XHTML-family markup languages more effective at supporting
the needs of the accessibility community. It does so by providing a
generic mechanism for defining the relationship between document
components and well-known accessibility taxonomies. XHTML Access is not
a stand-alone document type. It is intended to be integrated into other
XHTML Family Markup Languages. A conforming XHTML Access document is a
document that requires only the facilities described as mandatory in
this specification and the facilities described as mandatory in its host
language. More Information

Friday, January 4, 2008

OASIS XLIFF Version 1.2 to be Considered for Standardization

Members of the OASIS XML Localization Interchange File Format (XLIFF)
Technical Committee have submitted an approved Committee Specification
document set for XLIFF 1.2 to be considered as an OASIS Standard. The
XLIFF 1.2 Specification defines the XML Localization Interchange File
Format (XLIFF), designed by a group of software providers, localization
service providers, and localization tools providers. The purpose of
this vocabulary is to store localizable data and carry it from one
step of the localization process to the other, while allowing
interoperability between tools. It is intended to give any software
provider a single interchange file format that can be understood by
any localization provider. The specification is tool-neutral, supports
the entire localization process, and supports common software, document
data formats, and markup languages. The specification provides an
extensibility mechanism to allow the development of tools compatible
with an implementer's data formats and workflow requirements. The
extensibility mechanism provides controlled inclusion of information
not defined in the specification. XLIFF is loosely based on the OpenTag
version 1.2 specification and borrows from the TMX 1.2 specification.
However, it is different enough from either one to be its own format.
The Version 1.2 specification set includes a Core prose document, XML
schemas, Representation Guide for HTML, Representation Guide for Java Resource
Bundles, and Representation Guide for Gettext PO (defines a guide for mapping
the GNU Gettext Portable Object file format to XLIFF).
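
As a rough illustration of the vocabulary, the sketch below embeds a minimal
XLIFF 1.2 document and parses it with the standard JAXP parser. The file
name, language pair, and translation unit are invented for the example; only
the element names and namespace follow the 1.2 specification.

    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    public class XliffSketch {
        // A minimal XLIFF 1.2 document: one file element holding one
        // translation unit with its source and target text.
        static final String XLIFF =
            "<xliff version='1.2' xmlns='urn:oasis:names:tc:xliff:document:1.2'>"
          + " <file original='ui.properties' source-language='en'"
          + "       target-language='fr' datatype='plaintext'>"
          + "  <body>"
          + "   <trans-unit id='greeting'>"
          + "    <source>Hello</source>"
          + "    <target>Bonjour</target>"
          + "   </trans-unit>"
          + "  </body>"
          + " </file>"
          + "</xliff>";

        public static void main(String[] args) throws Exception {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Document doc = f.newDocumentBuilder()
                            .parse(new InputSource(new StringReader(XLIFF)));
            // Any XLIFF-aware tool can pick up the same unit, whatever
            // tool originally produced it.
            System.out.println(doc.getDocumentElement().getAttribute("version"));
        }
    }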

Introduction to the Eclipse Business Intelligence and Reporting Tools

Eclipse's Business Intelligence and Reporting Tools (BIRT) project is
an open source project based on the popular Eclipse IDE and is used to
build and deploy reports in a Java/J2EE environment. Some of the key
downloads available with the project include: (1) BIRT Designer: Used
to construct reports. At the center of BIRT is the report designer, which
is a set of Eclipse plug-ins that make up the designer perspective
providing drag-and-drop capabilities to quickly design reports. The
report designs are created and stored in an XML format. (2) Report
Editor: The Report Editor is used to construct the report and acts as
a canvas for positioning and formatting report elements. Within this
View, there are tabs for Layout, Master Page, Script, XML Source, and
Preview. The XML Source tab displays the XML source code for the report
design. It is possible to edit the XML within this tab, although it is
generally best to use the Layout View. (3) Web Viewer: An example J2EE
application used to deploy reports, containing a JSP tag library to ease
the integration with existing web applications. Once report development
is complete, the reports can be deployed using the BIRT example Web
Viewer. The viewer has been improved for BIRT 2.2 and is an AJAX-based
J2EE application that illustrates using the BIRT engine to generate
and render report content. (4) BIRT Charting package: Supports building
sophisticated actionable charts. The BIRT project had its first major
release in the summer of 2005 and has garnered over a million downloads
since its inception. The BIRT project web site includes an introduction,
tutorials, downloads, and examples of using BIRT. In this article, we
will begin by first describing the BIRT designer, which is used to build
report designs, and conclude by discussing the example BIRT Viewer,
which is used to deploy the designs and generate the completed reports.
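
For readers who want to see how a deployed design is turned into output,
here is a sketch using the BIRT report engine API as commonly documented for
the 2.x releases; the design and output file names are placeholders, and the
method names should be checked against the BIRT version actually in use.

    import org.eclipse.birt.core.framework.Platform;
    import org.eclipse.birt.report.engine.api.EngineConfig;
    import org.eclipse.birt.report.engine.api.HTMLRenderOption;
    import org.eclipse.birt.report.engine.api.IReportEngine;
    import org.eclipse.birt.report.engine.api.IReportEngineFactory;
    import org.eclipse.birt.report.engine.api.IReportRunnable;
    import org.eclipse.birt.report.engine.api.IRunAndRenderTask;

    public class RunReport {
        public static void main(String[] args) throws Exception {
            EngineConfig config = new EngineConfig();
            Platform.startup(config);
            IReportEngineFactory factory = (IReportEngineFactory) Platform
                .createFactoryObject(IReportEngineFactory.EXTENSION_REPORT_ENGINE_FACTORY);
            IReportEngine engine = factory.createReportEngine(config);

            // "sample.rptdesign" stands in for an XML report design built
            // with the BIRT designer.
            IReportRunnable design = engine.openReportDesign("sample.rptdesign");
            IRunAndRenderTask task = engine.createRunAndRenderTask(design);

            HTMLRenderOption options = new HTMLRenderOption();
            options.setOutputFileName("sample.html");
            options.setOutputFormat("html");
            task.setRenderOption(options);

            task.run();       // run the report and render it in one step
            task.close();
            engine.destroy();
            Platform.shutdown();
        }
    }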

Apache Wicket 1.3 Set for Java Web Development

Looking to make Web development easier for Java developers, the
Apache Software Foundation began offering this week Apache Wicket 1.3,
an open source component-based Web framework. According to the web site:
"Apache Wicket is a component oriented Java web application framework.
With proper mark-up/logic separation, a POJO data model, and a
refreshing lack of XML, Apache Wicket makes developing web-apps simple
and enjoyable again. Swap the boilerplate, complex debugging and
brittle code for powerful, reusable components written with plain Java
and HTML." Formerly housed at SourceForge, the Wicket project moved
over to Apache last year; version 1.3 is the first release bearing the
Apache nameplate, according to Martijn Dashorst, chairman of the project
and a senior software developer at Web application development firm
Topicus. In version 1.3, enhancements have been made in areas such as
AJAX (Asynchronous JavaScript and XML) and portal support. A key
improvement in version 1.3 is enhanced AJAX support; pages can be a
lot more dynamic than previously. Google Guice capability has been
added as an alternative to using the Spring Framework with Wicket.
Developers also can use Wicket pages directly in a portal without
changing a line of code. Also new is the switch to the Apache license
from the Lesser GPL (GNU Lesser General Public License). It's more
business friendly, "because it allows companies to create closed-source
products from Wicket."
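
A minimal sketch of the mark-up/logic separation described above, assuming
Wicket 1.3's package layout: a plain Java page class binds a component to a
wicket:id in an otherwise ordinary HTML template. The page, component id,
and text are invented for the example.

    import org.apache.wicket.markup.html.WebPage;
    import org.apache.wicket.markup.html.basic.Label;

    // Pairs with a plain HTML template, HelloPage.html, containing:
    //   <span wicket:id="message">placeholder text</span>
    public class HelloPage extends WebPage {
        public HelloPage() {
            // The Java component tree attaches to the template by id,
            // so no markup lives in the Java code and no XML is needed.
            add(new Label("message", "Hello from a POJO component"));
        }
    }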

IESG Approves GEOPRIV Revised Civic Location Format for PIDF-LO

The Internet Engineering Steering Group (IESG) announced that the
"Revised Civic Location Format for PIDF-LO" specification has been
approved as an IETF Proposed Standard. The document defines an XML
format for the representation of civic location. This format is
designed for use with PIDF Location Object (PIDF-LO) documents and
replaces the civic location format in RFC 4119 ("A Presence-based
GEOPRIV Location Object Format"). The format is based on the civic
address definition in PIDF-LO, but adds several new elements based
on the civic types defined for DHCP, and adds a hierarchy to address
complex road identity schemes. The format also includes support for
the "xml:lang" language tag and restricts the types of elements
where appropriate. The approved version -07 Internet Draft document
was reviewed by the GEOPRIV working group, where it has reached
consensus for publication as an IETF RFC. Document Quality: The XML
Schema contained within this document has been checked against
Xerces-J 2.6.2. In addition to updating RFC 4119, the document is
also a normative reference in IETF 'draft-ietf-ecrit-lost'
("LoST: A Location-to-Service Translation Protocol"). There are
three known implementations of this specification. The IETF GEOPRIV
Working Group was chartered to assess the authorization, integrity
and privacy requirements that must be met in order to transfer
location information, or authorize the release or representation of
such location information through an agent.

An IPFIX-Based File Format

Members of the IP Flow Information Export (IPFIX) Working Group have
released an initial -00 Internet Draft for "An IPFIX-Based File Format."
The IPFIX WG has developed a MIB module for monitoring IPFIX
implementations. Means for configuring these devices have not been
standardized yet. Per its charter, the WG is developing an XML-based
configuration data model that can be used for configuring IPFIX devices
and for storing, modifying, and managing IPFIX configuration parameter
sets; this work is performed in close collaboration with the NETCONF WG.
The IETF Proposed Standard "Information Model for IP Flow Information
Export" defines an XML-based specification of template, abstract data
types and IPFIX Information Elements can be used for automatically
checking syntactical correctness of the specification of IPFIX
Information Elements. The new "IPFIX-Based File Format" document
describes a file format for the storage of flow data based upon the
IPFIX Message format. It proposes a set of requirements for flat-file,
binary flow data file formats, then applies the IPFIX message format
to these requirements to build a new file format. This IPFIX-based file
format is designed to facilitate interoperability and reusability
among a wide variety of flow storage, processing, and analysis tools...

[Note, in relation to W3C's Efficient XML Interchange (EXI) Working Group
Charter and Deliverables:] Over the past decade, XML markup has emerged
as a new 'universal' representation format for structured data. It is
intended to be human-readable; indeed, that is one reason for its rapid
adoption. However, XML has limited usefulness for representing network
flow data. Network flow data has a simple, repetitive, non-hierarchical
structure that does not benefit much from XML. An XML representation of
flow data would be an essentially flat list of the attributes and their
values for each flow record. The XML approach to data encoding is very
heavyweight when compared to binary flow encoding. XML's use of start-
and end-tags, and plain-text encoding of the actual values, leads to
significant inefficiency in encoding size. Typical network flow datasets
can contain millions or billions of flows per hour of traffic represented.
Any increase in storage size per record can have dramatic impact on flow
data storage and transfer sizes. While data compression algorithms can
partially remove the redundancy introduced by XML encoding, they
introduce additional overhead of their own. A further problem is that
XML processing tools require a full XML parser... This leads us to
propose the IPFIX Message format as the basis for a new flow data file
format.
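
The size argument is easy to see with a toy comparison. The sketch below
encodes one flow record as a fixed binary layout, in the spirit of an IPFIX
data record, and as a naive element-per-field XML rendering; the field set,
values, and XML shape are invented for the illustration, though the element
names are borrowed from the IPFIX information model.

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    public class FlowEncodingSize {
        public static void main(String[] args) {
            // Binary layout: source/destination IPv4, ports, octet count.
            ByteBuffer binary = ByteBuffer.allocate(4 + 4 + 2 + 2 + 8);
            binary.putInt(0xC0A80001);     // 192.168.0.1
            binary.putInt(0x0A000002);     // 10.0.0.2
            binary.putShort((short) 443);
            binary.putShort((short) 52311);
            binary.putLong(123456L);

            // The same record as tag-per-field, plain-text XML.
            String xml =
                "<flow>"
              + "<sourceIPv4Address>192.168.0.1</sourceIPv4Address>"
              + "<destinationIPv4Address>10.0.0.2</destinationIPv4Address>"
              + "<sourceTransportPort>443</sourceTransportPort>"
              + "<destinationTransportPort>52311</destinationTransportPort>"
              + "<octetDeltaCount>123456</octetDeltaCount>"
              + "</flow>";

            System.out.println("binary record: " + binary.capacity() + " bytes");
            System.out.println("xml record:    "
                + xml.getBytes(StandardCharsets.UTF_8).length + " bytes");
        }
    }

Multiplied across millions or billions of flows per hour, that order-of-magnitude
difference in record size is exactly the storage and transfer cost the draft
is trying to avoid.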

Spolsky (and Usdin and Piez) on Specs

Joel Spolsky [...] has an interesting riff on specifications and their
discontents, which feels relevant to the perennial topics of improving
the quality of W3C (and other) specs, and of the possible uses of
formalization in that endeavor. Excerpt (from the Talk at Yale): "the
hard-core geeks tend to give up on all kinds of useful measures of
quality, and basically they get left with the only one they can prove
mechanically, which is, does the program behave according to specification.
And so we get a very narrow, geeky definition of quality: how closely
does the program correspond to the spec. Does it produce the defined
outputs given the defined inputs. The problem, here, is very fundamental.
In order to mechanically prove that a program corresponds to some spec,
the spec itself needs to be extremely detailed. In fact the spec has
to define everything about the program, otherwise, nothing can be proven
automatically and mechanically. Now, if the spec does define everything
about how the program is going to behave, then, lo and behold, it
contains all the information necessary to generate the program! And now
certain geeks go off to a very dark place where they start thinking about
automatically compiling specs into programs, and they start to think that
they've just invented a way to program computers without programming.
Now, this is the software engineering equivalent of a perpetual motion
machine." [... Sperberg-McQueen:] "In their XML 2007 talk on 'Separating
Mapping from Coding in Transformation Tasks', Tommie Usdin and Wendell
Piez talk about the utility of separating the specification of an
XML-to-XML transform ('mapping') from its implementation ('coding'),
and provide a lapidary argument against one common way of trying to
make a specification more precise: 'Code-like prose is hard to read.'
Has there ever been a more concise diagnosis of many readers' problems
with the XML Schema spec? I am torn between the pleasure of insight and
the feeling that my knuckles have just been rapped, really hard.
[Deep breath.] Thank you, ma'am, may I have another?"

JSF Testing Tools

Unit testing JSF based web applications has been considered difficult
because of the constraints of testing JSF components outside the
container. Most of the web-tier testing frameworks follow black-box
testing approach where developers write test classes using the web
components to verify the rendered HTML output is what is expected.
Frameworks such as HtmlUnit, HttpUnit, Canoo WebTest, and Selenium
fall into this category. The limitation of these frameworks is that
they only test the client side of a web application. But this trend
is changing with the recently released JSFUnit and other JSF testing
frameworks such as Shale Test and JSF Extensions that support white-box
testing to test both client and server components of the web application.
And projects like Eclipse Web Tools Platform (WTP) and JXInsight are
also helping in the development and testing of JSF applications...
JSFUnit, which is built on HttpUnit and Apache Cactus, allows integration
testing and debugging of JSF applications and JSF AJAX components. It
can be used for testing both client and server side JSF artifacts in
the same test class. With the JSFUnit API, test class methods can submit
data on a form and verify that managed beans are properly updated.
JSFUnit includes support for RichFaces and Ajax4jsf components. Beta 1
version of this framework was released last month and the second Beta
Version release is scheduled for the end of next month.
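
As a rough sketch of the white-box style described above, the test below
follows the JSFSession/JSFClientSession/JSFServerSession pattern used in
JSFUnit; note that the package and method names here follow later JSFUnit
releases and may differ from the Beta 1 API, and the page path, component
ids, and managed bean are entirely hypothetical.

    import org.apache.cactus.ServletTestCase;
    import org.jboss.jsfunit.jsfsession.JSFClientSession;
    import org.jboss.jsfunit.jsfsession.JSFServerSession;
    import org.jboss.jsfunit.jsfsession.JSFSession;

    public class GreetingPageTest extends ServletTestCase {
        public void testSubmitUpdatesManagedBean() throws Exception {
            // Run a JSF request/response cycle for a hypothetical page.
            JSFSession jsfSession = new JSFSession("/greeting.faces");
            JSFClientSession client = jsfSession.getJSFClientSession();
            JSFServerSession server = jsfSession.getJSFServerSession();

            // Client side: fill in a form field and submit, as a browser would.
            client.setValue("nameInput", "Ada");
            client.click("submitButton");

            // Server side: the managed bean behind the form should be updated.
            assertEquals("Ada", server.getManagedBeanValue("#{greetingBean.name}"));
        }
    }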

Thursday, January 3, 2008

XML Outlook for 2008

"I've made it a habit over the last several years to put together a
list of both my own forecasts for the upcoming year in the realm of XML
technologies and technology in general and, with some misgivings, to
see how far off the mark I was from the LAST time that I made such a
list. The exercise is useful, and should be a part of any analyst's
toolkit, because it forces you to look both at the trends and at potential
disruptors to trends -- which I've come to realize are also trends,
albeit considerably more difficult to spot... As to open standards -- I
think we're going to see a period of consolidation and integration within
the W3C after a few very productive years. The XPath family of languages
is largely complete and solid (though I see the rumblings of an XForms
2.0 in the near future), the Semantic languages are close to complete,
the initial burst of activity to align HTML with AJAX will continue but
within a clearly defined timeline, and the mobile initiative set out a
couple of years ago will likely run its course in 2008 or early 2009.
My anticipation is that there will likely be a drawdown of activity by
the W3C over the course of the next couple of years, at least until the
next major wave... Overall, 2008 should prove to be an interesting, if
somewhat nail-biting, year. There are signs that XML is maturing at the
enterprise level and is beginning to make its presence felt within web
browsers and web interfaces (especially beyond simply being local data
islands or data stores), and XML is also beginning to become a solid
data technology in its own right, rather than simply a messaging or
'document' format. In general, the coming year should prove not to have
too many huge disruptions, but it will see a number of standards that
have been in the works for several years now start to become widely
deployed, including in the Semantic Web, XPath-family, and compound
document arenas. I'm less optimistic about proprietary XML client
frameworks -- they will continue achieving some market penetration,
but likely not as much as their marketing departments would like to
project. Beyond that, macro-economic trends will begin to have an
impact upon XML and IT in general, especially towards the latter half
of 2008 and early into 2009, though probably not as dramatically as
in years past." More Information

Microsoft Office 2008 for Mac: The Complete Package

"The revamped Word, Excel, PowerPoint, and Entourage offer expanded
tools for image-conscious users and businesses. After a series of
delays, Microsoft plans to release Office for Mac 2008 to
brick-and-mortar and online stores on January 15, 2008, making this
the first update in about four years. We've tested beta versions of
the new applications over the last month without running into glitches.
Office for Mac includes Word, Excel for spreadsheets, PowerPoint for
presentations, and Entourage for e-mail and time management. There's
no Microsoft Access database app for the Mac, although Filemaker's
upcoming release of Bento offers Mac users a new choice. Unlike
in Microsoft Office 2007, the interface changes don't look radically
foreign next to the 2004 edition. That's good news for anyone who
doesn't want to relearn the locations of common functions. The 2007
applications for Windows arrange functions within tabs, while the 2008
Mac software largely clusters functions within the same drop-down menus
including File, Edit, and View... Office for Mac saves work in the same
new Open XML formats used by Office 2007 for Windows. We're not thrilled
about this being the default option, even though you can save your
work in the older DOC, XLS, and PPT formats. Free file conversion
tools won't be available until up to 10 weeks from now, or eight weeks
after the applications are available in stores. That means that for now,
should you save work in a new Open XML format in a hurry, someone with
the older software won't be able to open it. Although we're glad that
Microsoft offers free converters, we find the forced extra steps
annoying in Office 2007. That said, the new document types are smaller
and purportedly more secure than their predecessors."

Can IBM Bring the Semantic Web to Notes and Outlook?

OmniFind Personal Email Search tries to extract useful information like
addresses or phone numbers from inboxes, and lets organizations
customize semantic tagging to avoid irrelevant results. While email
search itself isn't new (Google's Desktop Search will happily index your
inbox along with the rest of your hard drive), the IBM software is
slightly different. Rather than finding a specific email message or
thread, OmniFind is aimed at searching for unstructured data: the
information buried within an inbox. And it looks like one of the first
genuinely useful desktop applications based on the Semantic Web -- an
idea that has been somewhat eclipsed by Web services and Web 2.0, but
which could eventually unite them with SOA. The key technology in the
tool is UIMA (Unstructured Information Management Architecture), an IBM-led open-source
framework for analyzing text and other unstructured data. This is
essentially pattern recognition: a series of ten digits with hyphens,
brackets or spaces in the right places is a phone number, two letters
followed by five numbers is a zip code, etc. The tool uses this to
generate semantic XML tags automatically, overcoming what has been the
biggest barrier to the Semantic Web: that people don't have the time
or inclination to add metadata to documents manually... OmniFind also
lets users edit the default tags or create their own, using regular
expressions to represent search patterns. IBM suggests that these be
used to customize the search to a specific organization, finding
information like employee IDs or package tracking numbers. It could
also be used to weed out irrelevant search results, most of which
are caused by the one-size-fits-all approach that public search engines
must take. The long-term goal of UIMA is to apply the same automated
pattern recognition to other kinds of data, which will likely be harder.
Email is in some senses the low-hanging fruit, as it isn't entirely
unstructured: There are the formal fields like "To", plus the informal
structure of salutations and signatures that it inherited from regular
mail.
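
To make the pattern-recognition idea concrete, here is the kind of rule the
article describes expressed as an ordinary Java regular expression; the
pattern and sample text are invented for illustration and are not OmniFind's
actual rule syntax.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class PhonePatternDemo {
        // Ten digits with hyphens, brackets or spaces in the right places.
        private static final Pattern US_PHONE =
            Pattern.compile("\\(?\\d{3}\\)?[-. ]\\d{3}[-. ]\\d{4}");

        public static void main(String[] args) {
            String email = "Call me at (914) 555-0137 or 914-555-0188 by Friday.";
            Matcher m = US_PHONE.matcher(email);
            while (m.find()) {
                // Each match would become a semantic tag on the message.
                System.out.println("phone: " + m.group());
            }
        }
    }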

openLiberty ID-WSF ClientLib Project Releases Alpha Code

Asa Hardcastle, Technical Lead for the openLiberty ID-WSF ClientLib
Project, announced that the ClientLib Alpha is now available online.
The ClientLib uses OpenSAML's Java XML Tooling, SOAP, and SAML2
Libraries. The Identity Web Services Framework (ID-WSF) is a set of
open specifications for interoperable, secure, identity-enabled Web
services. OpenLiberty.org, a domain name donated by HP, has been
launched with the aid of the Liberty Alliance as a resource for all
those looking to deploy open source solutions for securing
identity-based Web services at the relying party. The participants
are initially focusing on ways to provide the open source community
with ID-WSF Web Services Consumer (WSC) libraries so that developers
can incorporate SAML 2.0 functionality into a variety of Web services
and client-based applications... SAML 2.0 is the leading standard for
federated identity and is now widely adopted. Liberty Federation
(SAML 2.0 + Liberty Alliance policy best practices) is a key enabler
for securing Web services across domains, protecting user privacy and
enabling appropriate user control over the use of identity information.
While SAML alone can secure access to Web-based applications, the
client technologies of ID-WSF are required to allow applications to
invoke services across the network. By focusing initially on WSC
Libraries that take advantage of SAML 2.0, we will have new tools for
building more functional, secure and privacy respecting Web services,
especially at the relying party.

2008 Predictions: SOA, Grid, SCA, Web 2.0, REST

"(1) Grid computing will grip the attention of enterprise IT leaders,
although given the various concepts of hardware grids, compute grids,
and data grids, and different approaches taken by vendors, the
definition of grid will be as fuzzy as ESB. This is likely to happen
at the end of 2008. (2) At least one application in the area of what
Gartner calls 'eXtreme Transaction Processing' (XTP) will become the
poster child for grid computing. This killer app for grid computing
will most likely be in the financial services industry or the travel
industry. Scalable, fault tolerant, grid enabled middle tier caching
will be a key component of such applications. (3) Event-Driven
Architectures (EDA) will finally become a well understood ingredient
for achieving realtime insight into business process, business metrics,
and business exceptions. New offerings from platform vendors and
startups will begin to feverishly compete in this area. (4) Service
Component Architecture (SCA) will become the new way for SOA applications
to be defined as support from all the major platform vendors (sans
Microsoft) will be rolled out... (9) By end of year it will be clear
that an understanding of infrastructure requirements for common
problems such as predictable scalability, reliability, security,
(*-ilities) will be necessary in order to support any combination of
SOA, REST, or Web 2.0 style applications. However the exact architecture
or even the list of requirements in support of such infrastructure
will not be well understood or agreed upon. Such a common understanding
will not come to bear until at least 2010. This will be the new
frontier to explore in the coming years..."

Tuesday, January 1, 2008

Does XML Have a Future on the Web?

Earlier this month, the opening session of the XML 2007 conference was
devoted to a panel session on the topic 'Does XML have a future on the
Web?' [...] A lot depends on what we mean by 'the Web'. If we mean
Web 2.0 Ajax applications, we may get one answer. If we mean the
universe of data publicly accessible through HTTP, the answer might be
different. But neither of these, in reality, is 'the Web'. If there is
a single central idea of the Web, it's that of a single connected
information space that contains all the information we might want to
link to -- that means, in practice, all the information we care about
(or might come to care about in future): not just publicly available
resources, but also resources behind my enterprise firewall, or on my
personal hard disk. If there is a single technical idea at the center
of the Web, it's not HTTP (important though it is) but the idea of the
Uniform Resource Identifier, a single identifier space with distributed
responsibility and authority, in which anyone can name things they care
about, and use their own names or names provided by others, without
fear of name collisions. Looked at in this way, 'the Web' becomes a
rough synonym for 'data we care about', or 'the data we process, store,
or manage using information technology'... There were something like
two hundred people actively involved in the original design of XML, and
among us I wouldn't be surprised to learn that we had a few hundred, or
a few thousand, different goals for XML... One reason to think that XML
has found broad uptake is the sheer variety of people complaining about
XML and the contradictory nature of the problems they see and would like
to fix. For some, XML is too complicated and they seek something simpler;
for others, XML is too simple, and they want something that supports
more complex structures than trees. Some would like less draconian error
handling; others would like more restrictive schema languages. Any
language that can accumulate so many different enemies, with such widely
different complaints, must be doing something right. Long life to
descriptive markup! Long life to XML! More Information

XML 2007: Year in Review

2007 was a productive year for XML. The most sound and fury focused
around the standardization of office document formats, a fight that even
spilled over into the popular press. Who ever thought you'd be reading
about ISO standards for XML formats in the Wall Street Journal? But if
I had to pick the most important story of the year, I'd be hard pressed
to choose between the continuing slow growth of XQuery, APP (Atom
Publishing Protocol), and XForms. All have the potential to radically
alter the software infrastructure that underlies the Web. XForms is a
radically new client-development platform, XQuery is a radically new
server-development platform, and APP connects them together. Of the
three, XQuery is ready for serious production use today, and APP is
gearing up. Atom Publishing Protocol (APP) began its life as a simple
format for uploading blog entries to replace custom APIs like the
MetaWeblog and WordPress APIs. But along the way, it turned into
something much, much more. APP is nothing less than a RESTful, scalable,
extensible, secure system for publishing content to HTTP servers. On one
hand, it's a pure protocol, completely independent of any particular
server or client. On the other hand, because it's nothing more than
HTTP, it's easy to implement in existing clients and servers. The Web
was originally intended to be a read-write medium. But for the first
15 years, most energy went into the reading half of that equation.
Browsers got all the attention, while authoring tools withered on the
vine. Page editors were generally poor and forced to tunnel through FTP
to file systems. Only now, with APP, is the field opening up to editors
that are as rich, powerful, and easy to use as the browsers. Some good
server software, such as the eXist native XML database, has already
started to take advantage of APP, and several clients are working on
it. More will do so over the coming year. Publishing on the Web will
finally become as straightforward as browsing it... XForms is running
behind and may be a little late to the party, but I hope it gets there
before the doors close. Either way, the future for XML on the Web looks
brighter than ever. More Information
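
Because APP really is "nothing more than HTTP," its core interaction fits in
a few lines. The sketch below POSTs an Atom entry to a collection URI and
reads back the Location of the newly created member; the collection URL and
entry content are hypothetical, while the media type and the POST/201/Location
pattern follow the protocol.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class AppPostSketch {
        public static void main(String[] args) throws Exception {
            String entry =
                "<entry xmlns='http://www.w3.org/2005/Atom'>"
              + " <title>First post</title>"
              + " <content type='text'>Published over plain HTTP.</content>"
              + "</entry>";

            // Placeholder collection URI, normally advertised by the
            // server's service document.
            URL collection = new URL("http://blog.example.org/collection");
            HttpURLConnection conn = (HttpURLConnection) collection.openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type",
                                    "application/atom+xml;type=entry");
            OutputStream out = conn.getOutputStream();
            out.write(entry.getBytes("UTF-8"));
            out.close();

            // Expect 201 Created plus a Location header for the new member.
            System.out.println(conn.getResponseCode() + " "
                + conn.getHeaderField("Location"));
        }
    }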

XML Schemas: Guaranteed Non-Interoperability as a Design Methodology?

The vogue quip that 'a camel is a horse designed by committee' probably
makes more sense to people who don't live in a desert country. From
here in Australia, camels seem to be a very plausible design. It is the
speaker, actually, who is wrong: what you need is a camel when you are
in the desert, a horse on the plains, a yak in the mountains, perhaps a
porpoise in the sea, and an elephant in the jungle. The ongoing XML
Schemas trainwreck shows little sign of improvement; that users have
so repeatedly stated their problem and received no satisfaction from
the W3C shows how disenfranchised they are. I am thinking about these
things again this week for three reasons... One positive thing that has
come out has been the W3C Basic XML Schemas Databinding Patterns which
lists various XPath patterns that databinding tools can support. It mentions how
to use these in Schematron, which is good too... But it doesn't come up
to the level of a profile. And, to be fair, the W3C Schema WG has also
upgraded XSD to reduce some gotchas that have been reported, such as
allowing unbounded on all groups.

Information Model and XML Data Model for Traceroute Measurements

The IESG has received a request from the IP Performance Metrics WG
(IPPM) to consider "Information Model and XML Data Model for Traceroute
Measurements" as a Proposed Standard. The IESG plans to make a decision
in the next few weeks, and solicits final comments on this action.
Substantive comments should be sent to the IETF mailing lists by
2008-01-15. Traceroutes are being used by lots of measurement efforts,
either as an independent measurement or to get path information to
support other measurement efforts. That is why there is the need to
standardize the way the configuration and the results of traceroute
measurements are stored. The standard metrics defined by the IPPM working
group for delay, connectivity, and loss do not apply to the
metrics returned by the traceroute tool; therefore, in order to compare
results of traceroute measurements, the only possibility is to add to
the stored results a specification of the operating system and version
for the traceroute tool used. This document, in order to store results
of traceroute measurements and allow comparison of them, defines a
standard way to store them using an XML schema. Section 7 contains the
XML schema to be used as a template for storing and/or exchanging
traceroute measurements information. The schema was designed in order
to use an extensible approach based on templates (similar to how IPFIX
protocol is designed) where the traceroute configuration elements (both
the requested parameters, Request, and the actual parameters used,
MeasurementMetadata) are metadata to be referenced by results
information elements (data) by means of the TestName element (used as
unique identifier). Currently Open Grid Forum (OGF) is also using this
approach and cross-requirements have been analyzed. As a result of
this analysis, the XML schema is compatible with the OGF schema, since it
was designed in a way that both limits unnecessary redundancy and
allows a simple one-to-one transformation between the two. More Information

First Public Working Draft for Delivery Context Ontology Specification

Members of the W3C Ubiquitous Web Applications Working Group (UWA) have
published a First Public Working Draft for "Delivery Context Ontology"
and a Candidate Recommendation for "Delivery Context: Client Interfaces
(DCCI) 1.0 -- Accessing Static and Dynamic Delivery Context Properties."
The UWA Working Group focuses on extending the Web to enable distributed
applications involving many kinds of devices, including sensors and effectors.
Application areas include home monitoring and control, home entertainment,
office equipment, mobile and automotive. (1) The new "Delivery Context
Ontology" document provides a formal model of the characteristics of the
environment in which devices interact with the Web. The delivery context
includes the characteristics of the device, the software used to access
the Web and the network providing the connection among others. The
delivery context is an important source of information that can be used
to adapt materials from the Web to make them useable on a wide range of
different devices with different capabilities. The ontology is formally
specified in the Web Ontology Language [OWL]. This document describes the
ontology and gives details of each property that it contains. The core,
normative sections of this document are generated automatically from the
ontology itself. (2) Implementations are now invited in connection with
the "Delivery Context: Client Interfaces (DCCI) 1.0" Candidate
Recommendation specification. This document defines platform and language
neutral programming interfaces that provide Web applications access to
a hierarchy of dynamic properties representing device capabilities,
configurations, user preferences and environmental conditions. The key
uses for DCCI are related to adaptation. One major use is in supporting
devices that are capable of interaction with users in a variety of
modalities. For example, a device may be able to interact visually, or
using voice, depending on the user's current context. Another major use
for DCCI relates to content adaptation for device independence. Materials
to be used on a particular device may need to be tailored to take account
of that device's particular capabilities.