Wednesday, February 27, 2008

Web Services: RPC, REST, and Messaging

How to choose a model for interoperable communication in the enterprise?
For the implementation of Web Services in the enterprise environment,
I've seen many different technologies used. Recently, in my spare moments,
I've reflected on this and have come to the conclusion that all these
technologies tend to fit one of three models (or hybrids of these models).
I would summarise these three models as: (1) Remote Procedure Calls (RPC).
A client-server based remotable pattern where a subset of an existing
system's local functions is exposed pretty much 'as-is' over the wire
to client programs. (2) Resource-oriented Create-Read-Update-Delete
(CRUD). A client-server based resource-oriented pattern where the
server-side provides a representation of a set of resources (often
hierarchical) and exposes Create, Read, Update and Delete capabilities
for these resources to client programs. (3) Messaging (e.g., as commonly
seen with Message Oriented Middleware and B2B). Messages or documents are
passed asynchronously between peer systems in either, but not always both,
directions. Sometimes it's hard to distinguish between these models and
where the boundaries lie. In fact, I don't think there are boundaries,
only grey areas and all three models lie in the same spectrum. In the
Web Services world, we may typically implement these three models using
one of the following three approaches: (1') Remote Procedure Calls: SOAP
using a synchronous RPC programming approach and, typically, generated
'skeletons/stubs' and some sort of Object-to-XML marshalling technology.
(2') Resource-oriented Create-Read-Update-Delete: REST or 'RESTful Web
Services' or ROA, re-using World-Wide-Web based approaches and standards
like HTTP and URIs. (3') Messaging: SOAP using an asynchronous
Message/Document passing approach where invariably the documents are
defined by schemas and, often, the use of message-level (rather than
transport-level) security elements is required... When faced with the
REST zealot or the WS-* zealot, we probably need to bear this spectrum
in mind. For the Web Services paradigm, there is not a 'one-size fits all'
and specific requirements for a given situation should dictate which
position in this spectrum best lends itself to satisfying the requirements.
Also, the overlap between the models may be greater [than shown in the
diagram]. For example, some would argue that REST can happily and more
appropriately be used to fulfil what would otherwise be RPC oriented
problems, in addition to solving Resource-oriented CRUD style problems.
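
As a rough illustration of how the three models differ in shape, the
sketch below describes the same hypothetical "order status" interaction
three ways. The names used (getOrderStatus, /orders/42, OrderStatusRequest,
the queue name) and the Interaction record itself are assumptions made
for illustration; they are not taken from the original post or from any
particular toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class Interaction:
    """A transport-neutral description of one exchange between systems."""
    model: str          # which of the three models this illustrates
    target: str         # what the caller addresses
    action: str         # what the caller asks for
    synchronous: bool   # does the caller block waiting for a reply?
    payload: dict = field(default_factory=dict)


def rpc_style(order_id: int) -> Interaction:
    # RPC: address one service endpoint, name a procedure, pass arguments.
    return Interaction("rpc", "/OrderService", "getOrderStatus", True,
                       {"orderId": order_id})


def crud_style(order_id: int) -> Interaction:
    # Resource-oriented CRUD: address the resource itself; the action comes
    # from a small fixed set (here HTTP GET, i.e. Read).
    return Interaction("crud", f"/orders/{order_id}", "GET", True)


def messaging_style(order_id: int) -> Interaction:
    # Messaging: hand a self-describing document to a peer and carry on;
    # any reply would arrive later as a separate message.
    return Interaction("messaging", "queue:orders.inbound", "send", False,
                       {"document": "OrderStatusRequest", "orderId": order_id})
```

Seen this way, the grey areas are easy to spot: the RPC and CRUD variants
differ mainly in whether the verb or the noun carries the meaning, while
the messaging variant differs mainly in timing.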

Holder-of-Key Web Browser SSO Profile

"As part of my work for the National Institute of Informatics and the
UPKI initiative, I've been working on a modified Web Browser SSO profile
for SAML 2.0 that uses holder-of-key confirmation for the client rather
than bearer authentication. The keys for this confirmation are supplied
through TLS using client certificates. This results in a more secure
sign-on process and, particularly, a more secure resulting session at
the SP. There is no need for the SP to do PKI validation or know
anything about the client certificate itself. The specification
supplies an alternative to 'Profiles for the OASIS Security Assertion
Markup Language (SAML) V2.0.'" Excerpt: "The profile allows for transport
and validation of holder-of-key assertions by standard HTTP user agents
with no modification of client software and maximum compatibility with
existing deployments. Most of the flows are as in standard Web Browser
SSO, but an x.509 certificate presented by the user agent supplies a
valid keypair through client TLS authentication for HTTP transactions.
Cryptographic data resulting from TLS authentication is used for
holder-of-key validation of a SAML assertion. This strengthens the
assurance of the resulting authentication context and protects against
credential theft, giving the service provider fresh authentication and
attribute information without requiring it to perform successful
validation of the certificate... A principal uses an HTTP user agent
to either access a web-based resource at a service provider or access
an identity provider such that the service provider and desired resource
are understood or implicit. In either case, the user agent needs to
acquire a SAML assertion from the identity provider. The user agent
makes a request to the identity provider using client TLS authentication.
The X.509 certificate supplied in this transaction is used primarily to
supply a public key that is associated with the principal. The identity
provider authenticates the principal by way of this TLS authentication
or any other method of its choice. The identity provider then produces
a response containing at least an assertion with holder-of-key subject
confirmation and an authentication statement for the user agent to
transport to the service provider. This assertion is presented by the
user agent to the service provider over client TLS authentication to
prove possession of the private key matching the holder-of-key
confirmation in the assertion. The service provider should rely on no
information from the certificate beyond the key; instead, it consumes
the assertion to create a security context. The TLS key may then be
used to persist the security context rather than a cookie or other
application-layer session. To implement this scenario, a profile of
the SAML Authentication Request protocol is used in conjunction with
the HTTP Redirect, HTTP POST and HTTP Artifact bindings. It is assumed
that the user is using an HTTP user agent capable of presenting client
certificates during TLS session establishment, such as a standard web
browser..."
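
The core check described above, where the service provider uses only the
key from the client certificate and compares it with the key named in the
assertion's holder-of-key confirmation, can be sketched as follows. This
is a minimal sketch under stated assumptions: both keys are assumed to
have been extracted already as DER-encoded bytes, the function names are
hypothetical, and everything else a real deployment needs (the TLS
handshake, parsing the assertion, verifying the assertion's signature) is
out of scope.

```python
import hashlib
import hmac


def key_fingerprint(public_key_der: bytes) -> bytes:
    """Return the SHA-256 digest of a DER-encoded public key."""
    return hashlib.sha256(public_key_der).digest()


def holder_of_key_confirmed(tls_client_key_der: bytes,
                            assertion_key_der: bytes) -> bool:
    """True when the key presented over TLS is the key bound into the assertion.

    Only the key is compared. Nothing else from the certificate (issuer,
    subject, validity period) is consulted, mirroring the statement that
    the service provider relies on no certificate information beyond the key.
    """
    return hmac.compare_digest(key_fingerprint(tls_client_key_der),
                               key_fingerprint(assertion_key_der))
```

The constant-time comparison is a defensive habit rather than something
the profile text quoted here calls for.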

Liberty Alliance Announces Health Identity Management SIG

Liberty Alliance, the global identity consortium working to build a
more trusted internet for consumers, governments and businesses worldwide,
has announced the launch of a global public forum formed to develop an
interoperable, secure and privacy-respecting information exchange system
for the healthcare sector. The Liberty Alliance Health Identity Management
Special Interest Group (HIM SIG) is leveraging the Liberty Alliance model
of addressing the technology, business and privacy aspects of digital
identity management to meet the unique identity management and regulatory
challenges facing the international healthcare industry today. The
Health Identity Management SIG offers members an opportunity to join
with other Liberty Alliance Members (regardless of membership level)
to recommend standards to enable an internationally interoperable health
care identity management and information exchange system. This may
include standard directory (LDAP) models, health care roles,
implementation guides, and similar recommendations. The SIG will review
existing standards, and recommend new standards for an interoperable
health care identity management system using Security Assertion Markup
Language (SAML) and Liberty Specifications. Co-chaired by John Fraser,
CEO, MEDNET USA and Pete Palmer, Security and Cryptography Architect,
Wells Fargo, the HIM SIG currently includes over 30 members from around
the world representing the education, government, healthcare and technology
sectors. Members are working to address how the healthcare industry will
deliver secure identity management solutions that meet global regulatory
mandates and ensure patient privacy. The public group is working closely
with the Liberty Identity Assurance Expert Group to ensure requirements
for standardized and certified identity assurance levels in the
healthcare sector meet criteria established in the policy-based Liberty
Identity Assurance Framework.

PRESTO: A WWW Information Architecture for Legislation and Public

PRESTO (P - Public, Permanent URLs; REST - Representation, State
Transfer; O - Object-oriented) is not something new: its basic ideas are
presupposed in a lot of people's thinking about the web, and many people
have given names to various parts. The elevator pitch for PRESTO is this:
"All documents, views and metadata at all significant levels of
granularity and composition should be available in the best formats
practical from their own permanent hierarchical URIs." I would see PRESTO
as the kind of methodology that a government could adopt as a
whole-of-government approach, in particular for public documents and
of these in particular for legislation and regulations. The problem is
not 'what is the optimal format for our documents?' The question is 'How
can we link to the important grains of information in a robust,
technology-neutral way that only needs today's COTS tools?' The format
wars, in this area, are asking exactly the wrong question: they focus
us on the details of format A rather than format B, when we need to be
able to name and link to information regardless of its format:
supra-notational data addressing. If you are wanting to build a large
information system for these kinds of documents, and you want to be truly
vendor neutral (which is not the same thing as saying that preferences
and delivery-capabilities will not still play their part), and you want
to encourage incremental, decentralized ad hoc and planned developments
in particular mash-ups, then you need Permanent URLs (to prevent link rot),
you need REST (for scale etc) and you need object-oriented (in the sense
of bundling the methods for an object with the object itself, rather than
having separate verb-based web services which implement a functional
programming approach: OO here also including introspection so that when
you have a resource you can query it to find the various operations
available). A rule of thumb for a document system that conformed to this
PRESTO approach would be that none of the URLs use '#' (which indicates
that you are groping for information inside a system-dependent level of
granularity rather than being system-neutral) or '?' (which indicates that
you are not treating every object you can think about as a resource in
its own right that may itself have metadata and children.)
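
The rule of thumb is simple enough to check mechanically. The sketch
below is one literal reading of it: a URL conforms if it contains neither
'#' nor '?'. The function names and the example URLs in the usage note
are hypothetical.

```python
def follows_presto_rule(url: str) -> bool:
    """Rule of thumb from the text: no '#' and no '?' anywhere in the URL."""
    return "#" not in url and "?" not in url


def non_conforming(urls):
    """Return the URLs that break the rule, preserving input order."""
    return [u for u in urls if not follows_presto_rule(u)]
```

For example, a hierarchical URL such as
http://example.gov/legislation/act/2008/12/section/4 passes, while
http://example.gov/legislation/act?id=12 and
http://example.gov/legislation/act/2008/12#section4 do not.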

Why Liberty's Identity Governance Framework is So Important

In late 2006, several companies got together and created the Identity
Governance Framework (IGF), an initiative of the Liberty Alliance.
The purpose of the IGF is to provide an open architecture that
addresses governance of identity related information. This architecture
is meant to bridge the gap between regulatory requirements and the
lower-level protocols and architecture. How can the inherent risks
associated with the creation, copying, maintenance and use of identity
data be mitigated? Who has access to what data for which purpose, and
under what conditions? Ideally, policies on data usage are created by
sources (attribute authorities) and consumers (applications)
of identity data. These policies can then be used for the
implementation and auditing of governance. In other words: if you
know what the rules are, express them in a policy, and make sure your
policy is watertight when the next audit comes. Exactly this is what
the IGF attempts to create: a standardised mechanism for expression
and implementation of these policies. The IGF is working on several
standards and components to make this happen. One of them is the CARML
(Client Attribute Request Markup Language) protocol. It defines
application identity requirements, in other words what type of identity
information an application needs, and what that application will do
with that information. On the other side of the spectrum there is AAPML
('Attribute Authority Policy Markup Language') that describes the
constraints on the use of the provided identity information -- under
what conditions specific pieces of identity data are made available to
applications, and how this data may be used, and possibly modified.
For example: what part of the user's data can be modified by the user
directly at a self-service portal? Or: under which condition may a
marketing application use a user's data, and what type of explicit consent
needs to be given by the user? AAPML is proposed as a profile of XACML,
so that AAPML policies can be consumed directly by a policy enforcement
point (PEP) to enforce access over the requests for identity data...
CARML and AAPML bridge a very important gap that is not addressed
anywhere else: not how to request and receive attributes, but to
express the need and purpose of identity data, and on the other side
the allowed use and conditions for its consumption. IGF's framework
conceptually fits seamlessly into architectures harnessing today's
frameworks and picks up where CardSpace, Higgins, Bandit and WS-Trust
leave off.

Beta Release: ID-WSF 2.0 Web Services Client Library (ClientLib)

Asa Hardcastle, OpenLiberty Technical Lead, has announced the beta
release of the ID-WSF 2.0 ClientLib application. openLiberty.org was
established to provide easy access to tools and information to jump
start the development of more secure and privacy-respecting
identity-based applications based on Liberty Alliance standards. The
first project at openLiberty.org is the ID-WSF WSC Client Library
("ClientLib") that will help you more easily build and deploy a wide
range of new relying party (identity-consuming) applications. The
ClientLib uses OpenSAML's Java XML Tooling, SOAP, and SAML2 Libraries.
As announced: "As of February 25th 2008 the ClientLib is officially
released as BETA code. Over the next few months we'll be writing more
code and doing some interoperability testing. The ClientLib includes
support for ID-WSF Authentication Service (PLAIN and CRAM-MD5),
Discovery Service, a non-standard Profile Service, and Directory Access
Protocol Service (ID-DAP). Both signed and unsigned messaging is
supported. The Data Services Template (DST 2.1) is mostly complete.
The DST 2.1 reference implementation is mostly complete. People Service
is partially complete." From Asa's blog entry: "This release marks
excellent progress, but there is still a lot of work to do. The beta
is not bug free nor is it thoroughly tested. It is ready for other
people to sink their teeth into and give feedback, make requests, or
write some code. For development purposes we are currently testing
against two ID-WSF WSPs and have access to a third (HP Select Federation)
which we hope to have working with the library before Version 1 release
planned later this year."

New Book: Understanding Windows CardSpace

"There is a really wonderful new book out on digital identity and
Information Cards called "Understanding Windows CardSpace". Written
by Vittorio Bertocci, Garrett Serack and Caleb Baker, all of whom were
part of the original CardSpace project, the book is deeply grounded in
the theory and technology that came out of it... The presentation begins
with a problem statement: 'The Advent of Profitable Digital Crime'.
There is a systematic introduction to the full panoply of attack vectors
we need to withstand, and the book convincingly explains why we need an
in-depth solution, not another band-aid leading to some new vulnerability.
For those unskilled in the art, there is an introduction to relevant
cryptographic concepts, and an explanation of how both certificates and
HTTPS work. These will be helpful to many who would otherwise find parts
of the book out of reach. Next comes an intelligent discussion of the
Laws of Identity, the multi-centered world and the identity metasystem.
The book is laid out to include clever sidebars and commentaries, and
becomes progressively more McLuhanesque. On to SOAP and Web Services
protocols -- even an introduction to SAML and WS-Trust, always with
plenty of diagrams and explanations of the threats. Then we are introduced
to the concept of an identity selector and the model of user-centric
interaction. Part two deals specifically with CardSpace, starting with
walk-throughs, and leading to implementation. This includes 'Guidance for
a Relying Party', an in-depth look at the features of CardSpace, and a
discussion of using CardSpace in the browser. The authors move on to
Using CardSpace for Federation, and explore how CardSpace works with
the Windows Communication Foundation. Even here, we're brought back to
the issues involved in relying on an Identity Provider, and a discussion
of potential business models for various metasystem actors..."

W3C Offices Program Celebrates Ten Years of International Outreach

W3C announced that representatives from W3C Offices -- regional branches
that promote W3C and interact with participants in local languages --
now celebrate ten years of the Offices program. Offices currently
represent seventeen (17) regions around the globe, helping to organize
meetings, recruit Members, translate materials, and find creative ways
to encourage international participation in W3C work. Offices staff
gather for a face-to-face meeting in Sophia-Antipolis, France to review
ten years of experience and to forge improvements to the program. At
this occasion, W3C thanks the Offices staff past and present for all
of their work. W3C Offices are located in Australia, Brazil, Benelux,
China, Finland, Germany & Austria, Greece, Hungary, India, Israel,
Italy, Korea, Morocco, Southern Africa, Spain, Sweden, United Kingdom
and Ireland.

SOA Spending Up Despite Unclear Benefits

The number of companies investing in service-oriented architecture (SOA)
has doubled over the past year in every part of the world, with a
typical annual spend of nearly $1.4 million, according to a new research
report from the analyst firm AMR Research that surveyed 405 companies in
the U.S., Germany, and China. Now the bad news: "Hundreds of millions of
dollars will be invested pursuing these markets in 2008, much of it
wasted," said AMR analyst Ian Finley. The AMR survey found that most
companies don't really know why they are investing in SOA, which Finley
said makes long-term commitment iffy. Often, there are multiple reasons
cited within any organization, letting SOA appear as a buzzword
justification for unrelated individual priorities. "People more easily
rally around a thing rather than five things... that lack of a rallying
purpose for SOA calls its momentum into question." Finley is concerned
that SOA may not get picked up much beyond the early adopters -- mainly
financial services, telecommunications, and government organizations
that are more often than not predisposed to the value of architecture
and thus more willing to pursue SOA for less-quantifiable benefits --
unless a coherent set of benefits is made clear. Another danger seen
from the SOA survey is that the main benefit that the vendors sell around
SOA (code reuse) is not the real benefit that early SOA adopters have
gotten. Often the code from project A is irrelevant to project B, he
noted. That focus on reuse can cause organizations to dismiss SOA's
benefits because they're looking at the wrong metric.

IPTC Announces NewsML-G2 and EventsML-G2 as G2-Standards

Misha Wolf (Reuters) posted an IPTC announcement about the launch of
NewsML-G2 and EventsML-G2 as the first parts of a new framework of
XML-based news exchange formats from the International Press
Telecommunications Council (IPTC). NewsML-G2 defines a string-derived
datatype called QCode (Qualified Code), which looks like this:
"CodingSchemeAlias:Code." The CodingSchemeAlias maps to an IRI
representing the CodingScheme. The IRI obtained by appending the Code
to this IRI represents the Code. The Code can contain (and start with)
most characters. The main exception is white space, and the Code can
be entirely numeric. QCodes are used as attribute values. Such
attributes accept QCodes only, so there is no conflict with IRIs/URIs.
The next steps include the creation of an OWL representation of the
NewsML-G2 Schema and Semantics, the translation into SKOS of NewsML-G2
KnowledgeItems, and the updating of our GRDDL transform to reflect the
released version of NewsML-G2. According to the announcement: "NewsML-G2
allows the bundling of multiple news items -- articles, photos, videos
or whatever -- and a detailed description of their content and how the
items relate to each other. Whether populating a web site with complex
news packages or building bundles of news items for resale or archiving,
NewsML-G2 provides an easy way to package and exchange news... The
G2-Standards also fit into the Semantic Web initiatives of the World
Wide Web Consortium, enriching content so that computers can more easily
search the huge universe of news. The goal is to better help news
agencies manage and distribute their massive libraries of current and
archived news content, and to help customer search engines find content
quickly and accurately. G2-Standards can be easily combined with IPTC's
groundbreaking NewsCodes, which provide a rich suite of standard terms
for describing news, to give news agencies amazing flexibility in how
news can be bundled for downstream users. With widely available digital
news archives now dating back to 1850 or earlier, news agencies,
librarians and archivists have a special interest in the rapid searching
and retrieval of news, which NewsCodes can accelerate to help drive
revenue growth." IPTC is a consortium of the world's major news
agencies, news publishers and news industry vendors. It develops and
maintains technical standards for improved news exchange that are used
by virtually every major news organization in the world.
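
The QCode resolution rule described above (the alias maps to a scheme
IRI, and appending the Code to that IRI gives the IRI of the Code) can be
sketched as below. This is a minimal sketch, not the IPTC's own
processing model: the function name, the alias "subj", and the example
scheme IRI are hypothetical, and splitting on the first colon assumes
that the alias itself never contains one.

```python
from typing import Mapping


def resolve_qcode(qcode: str, scheme_aliases: Mapping[str, str]) -> str:
    """Expand 'CodingSchemeAlias:Code' into the IRI that represents the Code."""
    # Split on the first colon only, so the Code may itself contain colons.
    alias, separator, code = qcode.partition(":")
    if not separator or not alias or not code:
        raise ValueError(f"not a QCode: {qcode!r}")
    # White space is the main character the Code may not contain.
    if any(ch.isspace() for ch in code):
        raise ValueError(f"white space in Code: {qcode!r}")
    if alias not in scheme_aliases:
        raise KeyError(f"unknown CodingSchemeAlias: {alias!r}")
    # Appending the Code to the scheme IRI yields the IRI of the Code.
    return scheme_aliases[alias] + code
```

With the hypothetical mapping {"subj": "http://example.org/codes/subject/"},
the entirely numeric QCode "subj:15000000" resolves to
http://example.org/codes/subject/15000000.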

Sunday, February 24, 2008

vCard Extensions to WebDAV (CardDAV)

Members of the IETF vCard and CardDAV (VCARDDAV) Working Group have
released an updated version of the specification "vCard Extensions to
WebDAV (CardDAV)." This IETF WG was chartered to produce: (1) A revision
of the vCard specification (RFC 2426) at proposed standard status; this
revision will include other vCard standardized extensions (RFC 2739,
4770) and extensions assisting synchronization technologies -- for
example, a per-entry UUID or per-attribute sequence number; other
extensions shall be considered either in the base specification or in
additional documents; (2) An address book access protocol leveraging
the vCard data format, for which the 'draft-daboo-carddav' I-D is the
starting point; (3) An XML schema which is semantically identical to
vCard in all ways and can be mechanically translated to and from vCard
format without loss of data. While vCard has deployed successfully and
will remain the preferred interchange format, a standard XML schema
which preserves vCard semantics might make vCard data more accessible
to XML-centric technologies such as AJAX and XSLT. Such a standard format
would be preferable to multiple proprietary XML schemas, particularly if
vCard semantics were lost by some of them and a lossy gateway problem
resulted. The draft "vCard Extensions to WebDAV (CardDAV)" defines
extensions to the Web Distributed Authoring and Versioning (WebDAV)
protocol to specify a standard way of accessing, managing, and sharing
contact information based on the vCard format. Address books containing
contact information are a key component of personal information management
tools, such as email, calendaring and scheduling, and instant messaging
clients. To date several protocols have been used for remote access to
contact data, including Lightweight Directory Access Protocol (LDAP),
Internet Message Support Protocol (IMSP) and Application Configuration
Access Protocol (ACAP - RFC 2244), together with SyncML used for
synchronization of such data. Each has key disadvantages... The proposed
CardDAV address book is modeled as a WebDAV collection with a well
defined structure; each of these address book collections contain a
number of resources representing address objects as their direct child
resources. Each resource representing an address object is called an
"address object resource". Each address object resource and each address
book collection can be individually locked and have individual WebDAV
properties. Definitions of XML elements in this document use XML element
type declarations (as found in XML Document Type Declarations), described
in Section 3.2 of the XML 1.0 Recommendation.

Last Call Working Draft for RDFa in XHTML: Syntax and Processing

Members of W3C's Semantic Web Deployment Working Group and the XHTML 2
Working Group have published a Last Call Working Draft for the
specification "RDFa in XHTML: Syntax and Processing. A collection of
attributes and processing rules for extending XHTML to support RDF."
RDFa is a specification for attributes to be used with languages such
as HTML and XHTML to express structured data. The rendered, hypertext
data of XHTML is reused by the RDFa markup, so that publishers don't
need to repeat significant data in the document content. This document
only specifies the use of the RDFa attributes with XHTML. The underlying
abstract representation is RDF, which lets publishers build their own
vocabulary, extend others, and evolve their vocabulary with maximal
interoperability over time. The expressed structure is closely tied to
the data, so that rendered data can be copied and pasted along with its
relevant structure. The rules for interpreting the data are generic, so
that there is no need for different rules for different formats; this
allows authors and publishers of data to define their own formats without
having to update software, register formats via a central authority, or
worry that two formats may interfere with each other. RDFa shares some
use cases with microformats. Whereas microformats specify both a syntax
for embedding structured data into HTML documents and a vocabulary of
specific terms for each microformat, RDFa specifies only a syntax and
relies on independent specification of terms (often called vocabularies
or taxonomies) by others. RDFa allows terms from multiple independently
developed vocabularies to be freely intermixed and is designed such that
the language can be parsed without knowledge of the specific term
vocabulary being used. Motivation: RDF/XML (Syntax) provides sufficient
flexibility to represent all of the abstract concepts in RDF. However,
it presents a number of challenges; first it is difficult or impossible
to validate documents that contain RDF/XML using XML Schemas or DTDs,
which therefore makes it difficult to import RDF/XML into other markup
languages. Whilst newer schema languages such as RELAX NG do provide a
way to validate documents that contain arbitrary RDF/XML, it will be a
while before they gain wide support. Second, even if one could add RDF/XML
directly into an XML dialect like XHTML, there would be significant data
duplication between the rendered data and the RDF/XML structured data.
It would be far better to add RDF to a document without repeating the
document's existing data. One motivation for RDFa has been to devise a
means by which documents can be augmented with metadata in a general
rather than hard-wired manner. This has been achieved by creating a
fixed set of attributes and parsing rules, but allowing those attributes
to contain properties from any of a number of the growing range of
available RDF vocabularies. The values of those properties are in most
cases the information that is already in an author's XHTML document.
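
The point that property values are usually the text already present in
the author's XHTML can be shown with a deliberately tiny sketch. It is
not an RDFa processor: it ignores subject resolution, prefix expansion,
and nearly all of the draft's processing rules, and simply pairs each
property attribute with the rendered text of its element (or with the
content attribute when one is given). The sample markup and the function
name are invented for illustration.

```python
import xml.etree.ElementTree as ET

XHTML = """<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <h2 property="dc:title">Weekend Plans</h2>
  <p>Posted by <span property="dc:creator">Alice</span></p>
</div>"""


def property_values(xhtml: str):
    """Return (property, value) pairs in document order."""
    root = ET.fromstring(xhtml)
    pairs = []
    for element in root.iter():
        name = element.get("property")
        if name is None:
            continue
        # An explicit content attribute wins; otherwise reuse the rendered text.
        value = element.get("content")
        if value is None:
            value = "".join(element.itertext()).strip()
        pairs.append((name, value))
    return pairs
```

Nothing in the markup is stated twice: the heading a reader sees and the
title a program extracts are the same characters.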

Microsoft Readies Silverlight 2 Beta

Microsoft's Scott Guthrie, general manager in the Microsoft Developer
Division, has provided a list of features planned for Silverlight 2 and
the beta, planned for release during the first quarter of 2008.
Silverlight 2 will be a major update of Silverlight that focuses on
enabling Rich Internet Application (RIA) development. Silverlight 2
includes a cross-platform, cross-browser version of the .NET Framework,
and enables a rich .NET development platform that runs in the browser.
Once Silverlight 2 is installed, you can browse the Web and automatically
run rich Silverlight applications within your browser of choice; this
includes such browsers as Internet Explorer, Firefox, Safari, and others.
For networking, Silverlight 2 backs REST (Representational State Transfer),
WS-*, and SOAP as well as RSS, POX, and HTTP services. Cross-domain network
access in Silverlight 2 enables Silverlight clients to directly access
resources and data from resources on the Web. Built-in sockets networking
also is included in the beta release. Silverlight 2 features a rich .NET
base class library of functionality, such as collections, generics,
threading, globalization, XML, and local storage. Rich APIs in the
product enable HTML DOM/JavaScript integration with .NET code. Also featured
is Microsoft's LINQ (Language Integrated Query) technology, which provides
native query syntax for C# and Visual Basic, and LINQ to XML library
support. This enables easy transformation and querying of data; local
data caching and storage support are highlighted as well in Silverlight 2.
Developers can write Silverlight applications using a .NET language, such
as Visual Basic, C#, JavaScript, IronPython, or IronRuby. Microsoft plans
to ship support for developer/designer workflow and integration for
Silverlight in its Visual Studio 2008 and Expression Studio tools.

Addressing Fragments in REST

REST offers a great way to build simple applications that Create, Read,
Update, and Delete resources. But what if you want to get at part of a
resource? A database row is a simple thing, even if it enables immense
complexity. It contains named fields - no more than one to a given name,
usually conforming to a rather predictable schema. There's nothing
floating between the fields, and every row contains the same set of
fields. (They may be empty, of course, but they're clearly defined.) An
XML document - or even an XML document fragment - is potentially incredibly
complicated. While many XML documents are regular and relatively simple,
the ones that aren't simply holding data as it moves between databases
are often very complicated. XML elements are kind of like fields, sure,
but [not quite...] Nonetheless, it seems like the basic operations most
people would like to perform on these documents (and other loosely-structured
resources) are the same operations people want to perform on database
records: Create, Read, Update, Delete. CRUD is everywhere, and CRUD is
good. Typically, though, an XML document is treated as a single resource.
A book might assemble itself using entities or XIncludes that pull in
chapters, of course, and those chapters could be individually addressed
as resources, but that has limits. Though it's possible, I don't think
anyone wants to write paragraphs in which each sentence is included from
a separate file using one of those mechanisms. As soon as you hit mixed
content, the entity approach breaks down anyway. (Other formats, like JSON,
don't have entities but share a few of the same problems.) So how can
developers build RESTful applications that address parts of documents?
One approach that's getting a lot of discussion in the last few days is
to add a new verb, PATCH, to HTTP... It seems to me that the problem is
not that developers want to do something that can't be expressed with a
RESTful verb - in this case, probably UPDATE. The problem is that developers
can't address the resource on which they want to work with sufficient
granularity given their current set of tools and agreements. Though I've
inveighed against the many many sins of XPointer for years, that
incredibly broken process was at least working to solve the problem of
addressing XML documents at a very fine granularity, extending the tool
most commonly used on the client side for this: fragment identifiers...
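
To make the granularity argument concrete, the sketch below reads and
updates one paragraph inside a larger XML document by path, using the
limited XPath subset that Python's ElementTree supports. It is an
illustration only, not the article's proposal: the sample document, the
path, and the function names are assumptions, and how such a path would
be carried in a URL is left open.

```python
import xml.etree.ElementTree as ET

BOOK = ("<book><chapter id='c1'>"
        "<para>First.</para><para>Second.</para>"
        "</chapter></book>")


def read_fragment(doc: ET.Element, path: str) -> str:
    """Read: return the text of the single element addressed by path."""
    node = doc.find(path)
    if node is None:
        raise KeyError(path)
    return node.text or ""


def update_fragment(doc: ET.Element, path: str, new_text: str) -> None:
    """Update: replace the text of the addressed element in place."""
    node = doc.find(path)
    if node is None:
        raise KeyError(path)
    node.text = new_text
```

Given doc = ET.fromstring(BOOK), the path "chapter[@id='c1']/para[2]"
addresses the second paragraph of the chapter. Note that node.text covers
only the text before an element's first child, so this simple version
runs into the same mixed-content difficulty the article mentions.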

OpenGIS Web Processing Service (WPS) Interface Standard Version 1.0

OGC announced that members of the Open Geospatial Consortium have
approved version 1.0 of the OpenGIS Web Processing Service (WPS)
Interface Standard. The WPS standard defines an interface that
facilitates the publishing of geospatial processes and makes it easier
to write software clients that can discover and bind to those processes.
Processes include any algorithm, calculation or model that operates on
spatially referenced raster or vector data. Publishing means making
available machine-readable binding information as well as human-readable
metadata that allows service discovery and use. A WPS can be used to
define calculations as simple as subtracting one set of spatially
referenced data from another (e.g., determining the difference in
influenza cases between two different seasons), or as complicated as a
hydrological model. The data required by the WPS can be delivered across
a network or it can be made available at the server. This interface
specification provides mechanisms to identify the spatially referenced
data required by the calculation, initiate the calculation, and manage
the output from the calculation so that the client can access it. The
OGC's WPS standard will play an important role in automating workflows
that involve geospatial data and geoprocessing services. The specification
identifies a generic mechanism to describe the data inputs required and
the outputs produced by a process. This data can be delivered across the
network or made available at the server. It can include image data formats such
as GeoTIFF, or data exchange standards such as Geography Markup Language
(GML). Data inputs can be legitimate calls to OGC web services. For
example, a data input for an intersection operation could be a polygon
delivered in response to a WFS request, in which case the WPS data input
would be the WFS query string.
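
The "subtracting one set of spatially referenced data from another" case mentioned above can be pictured with a minimal sketch. This is not the WPS interface itself, only the kind of calculation a WPS might publish. It assumes both rasters are plain nested lists that already share the same extent and cell size, and the name subtract_grids is hypothetical.

```python
def subtract_grids(a, b):
    """Cell-by-cell difference a - b of two rasters given as nested lists.

    ASSUMPTION: both grids cover the same area at the same resolution,
    so cell (row, col) in one corresponds to cell (row, col) in the other.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same shape")
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical case counts per cell for two seasons.
season_1 = [[5, 3], [2, 0]]
season_2 = [[1, 1], [2, 4]]
difference = subtract_grids(season_1, season_2)
```

A real service would also have to reconcile projections and resolutions before subtracting, which this sketch leaves out.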

HTTP-based IETF Namespace URIs at IANA

The draft I-D "HTTP-based IETF Namespace URIs at IANA" creates a registry
and defines a procedure to allow IETF specifications to register XML
Namespace Names with IANA which are HTTP URIs and thus potentially useful
for looking up information about the namespace. Many IETF specifications
use Extensible Markup Language (XML) 1.0 (Fourth Edition) with Namespaces
in XML 1.0 (Second Edition). XML Namespace Names are URIs, and there are
many options for constructing them. One of the options is the use of HTTP
URIs -- those whose scheme is 'http:'. IETF RFC 3688 (The IETF XML
Registry) created an IANA registry for XML namespaces based on URNs,
which take on the form 'urn:ietf:params:xml:ns:foo'. RFC 3470 observes
that in the case of namespaces in the IETF standards-track documents,
it would be useful if there were some permanent part of the IETF's own
web space that could be used to mint HTTP URIs. However, it seems to
be more appropriate and in line with IETF practice to delegate such a
registry function to IANA... IANA maintains a registry page listing the
registered XML namespaces which use HTTP URIs. For each registered
namespace, the registry page includes a human-readable name for the
namespace, a link to the namespace document, and the actual namespace
URI. Namespaces created by IANA upon registration have the following form.
There is a common prefix, "http://www.iana.org/xmlns/" [...] followed
by a unique identifier. The unique identifier should be a relatively
short string of US-ASCII letters, digits, and hyphens, where a digit
cannot appear in first position and a hyphen cannot appear in first or
last position or in successive positions. In addition, the unique
identifier can end in a single '/' or '#'. XML namespaces are
case-sensitive, but all registrations are required to mutually differ
even under case-insensitive comparison. For uniformity, only lowercase
letters should be used. A unique identifier is proposed by the requester,
but IANA may change it as they see fit, in consultation with the
responsible Expert Reviewer. For each namespace registered, there must
be a namespace document in either HTML or XHTML which may be retrieved
using the HTTP URI which is the registered namespace name. It contains
the template information with appropriate markup. The request for creation
and registration of an HTTP XML Namespace URI is made by including a
completed registration template in the IANA Considerations section of
an Internet-Draft.
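
The syntax rules for the unique identifier described above can be sketched as a small validator. This is one reading of the prose, not text from the draft: the function names are hypothetical, the "lowercase only" recommendation is treated as a hard rule, and no maximum length is enforced because the summary only says "relatively short".

```python
import re
from typing import Iterable

# Common prefix quoted in the draft; the unique identifier follows it.
IANA_PREFIX = "http://www.iana.org/xmlns/"

# First character: a lowercase letter (no digit or hyphen in first position).
# Then letters/digits, or a hyphen immediately followed by a letter/digit
# (so no successive hyphens and no hyphen in last position).
# Finally an optional single '/' or '#'.
_IDENTIFIER = re.compile(r"[a-z](?:[a-z0-9]|-(?=[a-z0-9]))*[/#]?")


def is_valid_identifier(identifier: str) -> bool:
    """Return True if the identifier satisfies the rules sketched above."""
    return _IDENTIFIER.fullmatch(identifier) is not None


def conflicts(candidate: str, registered: Iterable[str]) -> bool:
    """Registrations must differ even under case-insensitive comparison."""
    return candidate.lower() in {name.lower() for name in registered}
```

Under this reading, "foo-bar2/" is accepted, while "2foo", "foo-", and "foo--bar" are rejected.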

Thursday, February 21, 2008

Lessig Considers Running for Congress

Lawrence Lessig, the cyberlaw author and advocate for free software and
online civil liberties, is considering a run for the U.S. Congress, he
announced on his blog Wednesday. Lessig, author of books such as "Free
Culture" and "Code 2.0," would run for the open House of Representatives
seat in California created by the death of Representative Tom Lantos,
a Democrat, earlier this month. A "draft Lessig" movement has popped
up online since Lantos died. Lessig said he plans to make the decision
about whether to run by about March 1, 2008. "This is a very difficult
decision," he wrote on his blog. "Thank you to everyone who has tried
to help -- both through very strong words of encouragement and very,
very strong words to dissuade." Lessig, a self-described progressive,
would run as part of his Change Congress campaign. The Stanford University
law professor announced in January that he would shift his focus to
political corruption and away from free software and free culture. He
called on lawmakers to stop accepting money from political action
committees and lobbyists, and to stop adding so-called earmarks for
special projects in appropriation legislation. Politicians need to
change "how Washington works" and to end a culture of corruption that's
based on political contributions, he said in a video at Lessig08.org.
"You know about this corruption in Washington, a corruption that doesn't
come from evil people, a corruption that comes from good people working
in a bad system," he said in the video. Progressives should work to
change the way money influences decisions in Washington, he said, "not
because this is, in some sense, the most important problem, but because
it is the first problem that has to be solved if we're going to address
these more fundamental problems later." Lessig is the founder of the
Creative Commons, which attempts to give copyright holders additional
options for licensing their work beyond all rights reserved. Lessig
has served on the boards of the Free Software Foundation, the Electronic
Frontier Foundation, the Public Library of Science, and Public Knowledge.

XML 2.0? No, Seriously.

Maybe it's madness to consider XML 2.0 seriously. The cost of deployment
would be significant. Simultaneously convincing a critical mass of
users to switch without turning the design process into a farce would
be very difficult. And yet, the alternatives look a little like madness
too. I found three topics on my desk simultaneously last week: (1) The
proposal to amend the character set of XML 1.0 identifiers by erratum.
(2) The proposal to deploy CURIEs, an awkward, confusing extension of
the QName concept. (3) A thread of discussion suggesting that we consider
allowing prefix undeclaration in Namespaces in XML 1.0. That's right 1.0.
We're in an odd place. XML has been more successful, and in more and
more different arenas, than could have been imagined. But... XML 1.0
is seriously broken in the area of internationalization, one of its
key strengths, because it hasn't kept pace with changes to Unicode.
QNames, originally designed as a way of creating qualified element
and attribute names have also been used in more and more different
arenas than could have been imagined. Unfortunately, the constraints
that make sense for XML element and attribute names, don't make sense,
are unacceptable, in many of the other arenas. And in XML, we learned
that it is sometimes useful to be able to take a namespace binding out
of scope. XML 1.1 addressed some of these concerns, but also introduced
backwards incompatibilities. Those incompatibilities seemed justified
at the time, although they seem so obviously unnecessary and foolish
now. In short, we botched our opportunity to fix the problem 'right'.
What to do? ... Perhaps, dare I say it, it is time to consider XML 2.0
instead. Trouble is, if XML 2.0 gets spun up as an open-ended design
exercise, it'll be crushed by the second-system effect. And if XML 2.0
gets spun up as 'only' a simplification of XML 1.0, it won't get any
traction. If XML 2.0 is to be a success, it has to offer enough in the
way of new functionality to convince people with successful XML 1.0
deployments (that's everyone, right?) that it's worth switching. At the
same time, it has to be about the same size and shape as XML 1.0 when
it's done or it'll be perceived as too big, too complicated, too much
work. With that in mind, here are some candidate requirements for XML
2.0...

Infiniflow: Distributed Application Server Based on OSGi and SCA

Paremus recently released version 1.2 of Infiniflow, a next-generation
distributed application server based on OSGi and SCA. Paremus Marketing
Manager Andrew Rowney explained that Infiniflow was based upon OSGi and
SCA, and that it follows an application server paradigm -- a component
is written as a series of OSGi modules, it is linked to external
services through the SCA bindings, and Infiniflow provides life-cycle
management, monitoring, scaling and fault-recovery for any application
deployed on it. Rowney also described some best practices for application
development with Infiniflow: To take advantage of the full capabilities
of Infiniflow, an application needs to be presented as a composite
application rather than a single runtime entity, with different parts
of the processing requirements being handled in separate components
(OSGi bundles). A good example is where a part of the composite
application contains an intensive calculation that can be run in parallel
to reduce the overall processing time. For this type of application, the
developer is able to specify that Infiniflow should duplicate the bundle
that runs the calculation, instantiating as many copies as possible in
order to calculate the final result as quickly as possible... Infiniflow
itself is built using OSGi, and wired together using SCA System
descriptions. It has a Model-Driven Architecture: to reduce operational
complexity, application runtimes can only be modified through their SCA
System descriptor, and all interactions with the descriptor are secured
and audited. An Infiniflow Service Fabric consists of a number of
Infiniflow containers -- OSGi-enabled JVMs -- which are able to
dynamically install/start/stop/uninstall code packaged in the OSGi
bundles referenced from the SCA System document...

XML Daily Newslink. Wednesday, 20 February 2008

Extreme transaction processing (XTP) is being added to complex event
processing (CEP) in service-oriented architecture (SOA) implementations
for the financial services industry, explains David Chappell, vice
president and chief technologist for SOA at Oracle Corp. Chappell:
"What we're seeing is that SOA coupled with a class of applications
coined as extreme transaction processing or XTP is the future for
financial services infrastructure. So IT continues to be seen as the
enabler. We've seen some supporting data from Gartner/DataQuest that
IT spending in financial services is going to reach $566 billion by
2010. Where SOA comes into the picture is that it enables IT to deliver
new business services faster, while leveraging existing systems. At
the same time the financial institutions are pushing limits that require
more processing capability yet at the same time they don't want to see
an exponential rise in their investment in hardware. So the extreme
transaction processing class of applications has been most notably seen
in areas such as fraud detection, risk computation and stock trade
resolution... What XTP does is allow transactions to occur in memory
and not against the backend systems directly due to the need for extremely
fast response rates, but still including transactional integrity. So
think of classes of applications that need to handle large volumes of
data that need to be absorbed, correlated and acted upon. Typically
that data processed by XTP applications comes in the form of large
numbers of events and usually represents data that changes frequently...
once the pattern matching engine, whether it's built directly into the
XTP application itself or is identified by the complex event processing
engine, identifies an event of significance such as ATM withdrawal
fraud. Say for example your ATM card is used in different ATM machines
or is used to make purchases in three or four states or even different
countries within a matter of minutes, that's usually a flag that some
kind of fraud is going on. Once that kind of a situation is detected
then an SOA process in BPEL can be kicked off to make the proper
notification, send alerts to Business Activity Monitoring dashboards..."
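
The ATM example in the quote above amounts to an in-memory sliding-window check per card. The following is a minimal sketch of that idea, not Oracle's implementation: the class name, the ten-minute window, and the threshold of more than two distinct regions are all assumptions chosen for illustration, and events are assumed to arrive in time order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class CardLocationMonitor:
    """Flags a card seen in several distinct regions within a short window."""

    def __init__(self, window=timedelta(minutes=10), max_regions=2):
        self.window = window            # ASSUMPTION: illustrative value
        self.max_regions = max_regions  # ASSUMPTION: illustrative value
        # card id -> deque of (timestamp, region), oldest first, held in memory
        self._recent = defaultdict(deque)

    def observe(self, card_id, when, region):
        """Record one event; return True if the card now looks suspicious."""
        events = self._recent[card_id]
        events.append((when, region))
        # Drop events that have fallen out of the time window.
        while events and when - events[0][0] > self.window:
            events.popleft()
        regions = {r for _, r in events}
        return len(regions) > self.max_regions


monitor = CardLocationMonitor()
start = datetime(2008, 2, 20, 12, 0)
monitor.observe("card-1", start, "CA")
monitor.observe("card-1", start + timedelta(minutes=2), "NV")
suspicious = monitor.observe("card-1", start + timedelta(minutes=4), "NY")
```

In the architecture described, a True result is the point at which a follow-up process would be started to send notifications.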

ODF Standard Editor Calls for Cooperation with OOXML

The teams developing ODF (OpenDocument Format) and OOXML (Office Open
XML) standards should work together, evolving the two in parallel, the
editor of the ODF standard said Tuesday in an open letter to the
standards-setting community. The Microsoft-sponsored OOXML document
format is just days away from a critical meeting that will influence
whether the International Organization for Standardization (ISO) will
adopt it as a standard, as its rival ODF was in May 2006.
Relations between supporters of the two formats are, for the most part,
combative rather than cordial. Patrick Durusau, ISO project editor for
ODF, or ISO/IEC 26300 as it is known there, thinks supporters of the
two formats would be more productive if they allowed the formats to
co-evolve, he wrote in his open letter. Durusau thoughtfully avoided
the ODF and OOXML formats for his letter, choosing instead PDF, itself
adopted as an ISO standard in December. From Durusau's "Co-Evolving
OpenXML And OpenDocument Format": "If we had a co-evolutionary environment,
one where the proponents of OpenXML and OpenDocument, their respective
organizations, national bodies and other interested groups could meet
to discuss the future of those proposals, the future revisions of both
would likely be quite different. Co-evolution means that the standards
will evolve based on the influence of each other and their respective
user communities. Both remain completely independent and neither is
subordinate to the other. What is currently lacking is a neutral forum
in which proponents can meet and learn from each other. Creating such
an environment is going to take time and effort so I would like to
suggest a first step towards fostering co-evolution between OpenXML
and OpenDocument..."

ISO News: Ballot Resolution Meeting on ISO/IEC DIS 29500 Standard

National delegations from thirty-seven (37) countries will be
participating in a ballot resolution meeting in Geneva, Switzerland,
on 25-29 February 2008 on the draft international standard "ISO/IEC
DIS 29500, Information Technology -- Office Open XML File Formats."
ISO/IEC DIS 29500 is a proposed standard for word-processing documents,
presentations and spreadsheets that is intended to be implemented by
multiple applications on multiple platforms. According to the
submitters of the document, one of its objectives is to ensure the
long-term preservation of documents created over the last two decades
using programmes that are becoming incompatible with continuing
advances in the field of information technology. The objective of the
ballot resolution meeting (BRM) will be to review and seek consensus
on possible modifications to the document in light of the comments
received along with votes cast during a five-month ballot on the draft
which ended on 2 September 2007... No decision on publication will be
taken at the meeting itself. Following the BRM, the 87 national member
bodies that voted in the 2 September ballot will have 30 days (until
29 March 2008) to examine the actions taken in response to the comments
and to reconsider their vote if they wish. If the modifications proposed
are such that national bodies then wish to withdraw their negative
votes, or turn abstentions into positive votes, and the acceptance
criteria are then met, the standard may proceed to publication. The
BRM is being organized by subcommittee SC 34, Document description and
processing languages, of the joint technical committee JTC 1, Information
technology. JTC 1 is one of the most experienced and productive of ISO
and IEC technical committees, having developed some 2 150 widely and
globally used international standards and related documents.
Approximately 4 200 comments were received during last year's ballot.
By grouping and by eliminating redundancies, these have been edited by
SC 34 experts down to 1 100 comments for processing during the five
days of the BRM. The task will be carried out by 120 participants who
have registered for the meeting. They comprise members of the 37
national delegations, plus representatives of Ecma International, the
computer manufacturers' association that submitted ISO/IEC DIS 29500
for adoption by JTC 1, plus officers of the ISO/IEC Information
Technology Task Force (ITTF) which is responsible for the planning and
coordination of JTC 1 work.

Wednesday, February 20, 2008

GRDDL: Gleaning Information From Embedded Metadata

This article explains how to put GRDDL-enabled agents to the task of
extracting valuable information from machine-processable metadata
embedded in documents -- courtesy of prevailing semantic web standards.
HTML and XHTML traditionally have had only modest support for metadata
tags. The World Wide Web Consortium (W3C) is working on including richer
metadata support in HTML/XHTML with emerging standards such as RDF with
attributes (RDFa), embedded RDF (eRDF), and so on. These standards allow
more specific metadata to be attached to different structural and
presentation elements, which provides a unified information resource.
Gleaning Resource Descriptions from Dialects of Languages (GRDDL,
pronounced griddle) offers a solution to the embedded metadata problem
in a flexible, inclusive, and forward-compatible way. It allows the
extraction of standard forms of metadata (RDF) from a variety of sources
within a document. People usually associate XHTML with GRDDL, but it is
worth noting that GRDDL is useful for extracting standardized RDF
metadata from other XML structures as well. GRDDL theoretically supports
a series of naming conventions and standard transformations, but it does
not require everyone to agree to particular markup strategies. It allows
you to normalize metadata extraction from documents using RDFa,
microformats, eRDF, or even custom mark-up schemes. The trick is to
identify the document as a GRDDL-aware source by specifying an HTML
metadata profile. The profile indicates to any GRDDL-aware agents that
the standard GRDDL profile applies. Anyone wishing to extract metadata
from the document should identify any relevant 'link' tags with a 'rel'
attribute of transformation and apply it to the document itself. This
approach avoids the conventional problem of screen scraping, where the
client has to figure out how to extract information. With GRDDL, the
publisher indicates a simple, reusable mechanism to extract relevant
information. While there is currently no direct support for GRDDL in
any major browser, that situation is likely to change in the near future.
Until then, it is not at all difficult to put a GRDDL-aware proxy in
between your browser and GRDDL-enabled pages, which the Piggy Bank
Firefox extension from MIT's SIMILE Project does.
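
The discovery step described above (check the profile, then collect the transformation links) can be sketched with the Python standard library. The function name and the sample document are hypothetical. The sketch only finds the transformation URIs; actually applying them would need an XSLT processor, which is outside the standard library.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
# Profile URI defined by the GRDDL specification.
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def grddl_transformations(xhtml_text):
    """Return the href of each transformation link, or [] if the page
    does not declare the GRDDL profile."""
    root = ET.fromstring(xhtml_text)
    head = root.find(XHTML_NS + "head")
    if head is None:
        return []
    # The profile attribute is a whitespace-separated list of URIs.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    hrefs = []
    for link in head.iter(XHTML_NS + "link"):
        # rel is also a whitespace-separated list of tokens.
        if "transformation" in link.get("rel", "").split() and link.get("href"):
            hrefs.append(link.get("href"))
    return hrefs


# Hypothetical page used only for illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Example</title>
    <link rel="transformation" href="http://example.org/extract.xsl"/>
    <link rel="stylesheet" href="style.css"/>
  </head>
  <body/>
</html>"""

transformations = grddl_transformations(SAMPLE)
```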

Access Control for Cross-site Requests

W3C announced that the Web Application Formats (WAF) Working Group has
released a new snapshot of the editor's draft of "Access Control for
Cross-site Requests." The WAF Working Group is part of the Rich Web
Clients Activity in the W3C Interaction Domain. The new draft includes recent
HTTP header name changes and incorporates a new proposal for limiting
the number of requests in case of non-GET methods to various different
URIs which share the same origin. In addition to those technical
changes it also makes the (until now) implicit requirements and use
cases explicit by listing them in an appendix and contains a short
FAQ on design decisions. Summary: "In Web application technologies
that follow this pattern, network requests typically use ambient
authentication and session management information, including HTTP
authentication and cookie information. This specification extends
this model in several ways: (1) Web applications are enabled to
annotate the data that is returned in response to an HTTP request with
a set of origins that should be permitted to read that information by
way of the user's Web browser. The policy expressed through this set
of origins is enforced on the client. (2) Web browsers are enabled to
discover whether a target resource is prepared to accept cross-site
HTTP requests using non-GET methods from a set of origins. The policy
expressed through this set of origins is enforced on the client. (3)
Server side applications are enabled to discover that an HTTP request
was deemed a cross-site request by the client Web browser, through
the Access-Control-Origin HTTP header. This extension enables server
side applications to enforce limitations on the cross-site requests
that they are willing to service. This specification is a building
block for other specifications, so-called hosting specifications,
which will define the precise model by which this specification is
used. Among others, such specifications are likely to include
XMLHttpRequest Level 2, XBL 2.0, and HTML 5 (for its server-sent
events feature)." According to the editor's note: "We expect the next
draft to go to Last Call so hereby we're soliciting input, once again,
from the Forms WG, HTML WG, HTTP WG, TAG, Web API WG, and Web Security
Context WG..."
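
The server-side half of the model summarized above can be sketched as a small pure function. Two things here are assumptions rather than facts about this draft: the request header name is the one mentioned in the summary, and the response header name is borrowed from the later published CORS specification, since the draft's own header names were still changing. The function name is hypothetical.

```python
# Request header named in the summary above.
REQUEST_ORIGIN_HEADER = "Access-Control-Origin"
# ASSUMPTION: response header name from the later CORS specification,
# not necessarily the name used in this snapshot of the draft.
RESPONSE_ALLOW_HEADER = "Access-Control-Allow-Origin"


def cross_site_response_headers(request_headers, allowed_origins):
    """Return extra response headers, given the set of origins the server
    is willing to let read the response.

    ASSUMPTION: request_headers is a plain dict keyed by the exact header
    name; a real server would compare header names case-insensitively.
    """
    origin = request_headers.get(REQUEST_ORIGIN_HEADER)
    if origin is None:
        # No header: the browser did not flag this as a cross-site request.
        return {}
    if origin in allowed_origins:
        # Annotate the response; the browser enforces the policy.
        return {RESPONSE_ALLOW_HEADER: origin}
    # Unknown origin: add nothing, so the browser withholds the response.
    return {}
```

The design point the summary stresses is visible here: the server only annotates the response, and enforcement stays in the user's Web browser.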

Web Services Connector for JMX Enters Public Review

JSR 262 has now entered the Public Review phase. New JMX types
supported for MBean operations: NotificationResult,
NotificationFilterSupport, AttributeChangeNotificationFilter,
MBeanServerNotificationFilter. This allows the JSR 262 connector to
support the new Event Service being defined by JSR 255, which has
MBean operations that use those types. JSR 262 defines a way to use
Web Services to access JMX instrumentation remotely. It provides a
way to use the server part of the JMX Remote API to create a Web Services
agent exposing JMX instrumentation, and a way to use the client part
of the API to access the instrumentation remotely from a Java
application. It also specifies the WSDL definitions used so that the
instrumentation will be available from clients that are not based on
the Java platform, or from Java platform clients accessing the
instrumentation directly using the JAX-RPC API. The Web Services
Connector for Java Management Extensions (JMX) Agents Reference
Implementation Project develops and evolves the reference
implementation of the JSR 262 specification. JSR 262 defines a
connector for JMX that uses Web Services to make JMX instrumentation
available remotely. JMX Connector semantics are preserved when
connecting from a JMX Client. WS-Management standard from the DMTF
is the protocol in use in the connector. This Connector allows WS-Man
native clients to interoperate with a JMX agent. Such clients can be
written in the Java language or not (C, C#, JavaScript, Perl, ...). The
JMX technology was developed through the Java Community Process (JCP)
program, and was one of the earliest JSRs (JSR 3). It was subsequently
extended by the JMX Remote API (JSR 160). The future evolutions of
both JSRs have now been merged into a single JSR to define version
2.0 of the JMX specification (JSR 255). A management interface, as
defined by the JMX specification, is composed of named objects called
Management Beans, or MBeans. MBeans are registered with an ObjectName
in an MBean server. To manage a resource or resources in your
application, you create an MBean that defines its management
interface, and then register that MBean in your MBean server. The
content of the MBean server can then be exposed through various
protocols, implemented by protocol connectors, or by protocol adaptors.

RESTful SOA Using XML

Service Oriented Architecture (SOA) is used in companies that have
large numbers of applications for employees in different departments
with varying responsibilities. Many of these applications share
functionalities, but the combinations of functionalities, user-interface
specifics, and usability requirements differ. Like many enterprise
architectures, SOA follows a multitier model, but it doesn't stop there.
Within the server, functionalities are divided over separate services.
A client can consume one or more of the services, and one service can
be consumed by many clients. The result is a loosely coupled architecture
that propagates the reusability of existing software. SOA fits
particularly well in large companies that have several hundred poorly
integrated applications and that need to clean up their IT
infrastructures. SOA is a proven practice, capable of working effectively
in large environments. Adapters can translate legacy applications to
services that integrate as backends to modern applications. Middleware
technology is available to orchestrate services and control access to
specific functionalities in the service. Because the need for SOAs is
highest in this area, vendors of middleware technology typically focus
their products toward large and heavyweight solutions. Usually, SOA is
implemented with the SOAP protocol, described by a Web Services
Description Language (WSDL) document. Although many developer tools make
it relatively easy to work with SOAP and WSDL, I consider them heavyweight
technology, because they're hard to work with if you don't use those
tools. You can implement SOA just as well by sending simple messages
over Hypertext Transfer Protocol (HTTP). Basically, this is what RESTful
Web services do. Representational State Transfer (REST; the name was
coined by Roy Fielding) isn't a protocol or technology: It's an
architectural style. REST, a lightweight alternative to SOAP, is resource
oriented rather than action oriented. It's often summarized as reducing
remote procedure calls to GET, POST, PUT, and DELETE requests
over HTTP. In my opinion, this is the second important step.
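
The resource-oriented style described above can be illustrated with a minimal in-memory sketch that dispatches on the four HTTP methods. It is not taken from the article: the class name, the paths, and the choice of status codes are assumptions, and the mapping of POST to create and PUT to update is a common convention rather than a rule.

```python
class ResourceStore:
    """In-memory resources addressed by path, manipulated via HTTP methods."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Return (status_code, payload) for one request."""
        if method == "POST":      # create a new resource under a collection
            new_path = f"{path.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._items[new_path] = body
            return 201, new_path
        if method == "GET":       # read
            if path in self._items:
                return 200, self._items[path]
            return 404, None
        if method == "PUT":       # update, or create at a known path
            status = 200 if path in self._items else 201
            self._items[path] = body
            return status, None
        if method == "DELETE":    # delete
            if path in self._items:
                del self._items[path]
                return 204, None
            return 404, None
        return 405, None          # method not supported by this sketch


store = ResourceStore()
status, location = store.handle("POST", "/orders", "<order/>")
```

Every operation names a resource by its path; nothing in the interface names an action, which is the contrast with SOAP-style calls drawn above.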

Codecs, Metadata, and Addressing: Video on the Web Workshop Report

W3C announced that the Report on the W3C Video on the Web
Workshop is now available. Thirty-seven organizations discussed video
and audio codecs, spatial and temporal addressing, metadata, digital
rights management, accessibility, and other topics related to ensuring
the success of video as a "first class citizen" of the Web. W3C thanks
Cisco for hosting the Workshop, which took place 12-13 December 2007
simultaneously in San Jose, California and Brussels, Belgium. Five
major areas of possible work emerged from the Workshop: video codecs,
metadata, addressing, cross-group coordination and best practices for
video content. The W3C team will work with interested parties to
evaluate the situation with regard to video codecs, and what, if
anything, W3C can do to ensure that codecs, containers, etc. for the
Web encourage the broadest possible adoption and interoperability. As
for metadata, one direction would be to create a Working Group tasked
to come up with a simple common ontology between the existing standards
which defines a mapping between this ontology and existing standards
and defines a roadmap for extending the ontology, including information
related to copyright and licensing rights. W3C should also consider
creating a Group to investigate the important issue of addressing. The
goal would be to: (1) provide a URI syntax for temporal and spatial
addressing; (2) investigate how to attach metadata information to
spatial and temporal regions when using RDF or other existing
specifications, such as SMIL or SVG. A Group working on guidelines and
best practices for effective video and audio content on the Web could
be useful, and would look at the entire existing delivery chain from
producers to end-users, from content delivery, to metadata management,
accessibility or device independence. Also available online: forty-two
position papers and Workshop minutes.

Universal Middleware: What's Happening With OSGi and Why You Should Care

The Open Services Gateway Initiative (OSGi) Alliance is working to
realize the vision of a "universal middleware" that will address issues
such as application packaging, versioning, deployment, publication,
and discovery. In this article we'll examine the need for the kind of
container model provided by the OSGi, outline the capabilities it would
provide, and discuss its relationship to complementary technologies
such as SOA, SCA, and Spring. Enterprise software is often composed of
large amounts of complex interdependent logic that makes it hard to
adapt readily to changes in requirements from the business. You can
enable this kind of agility by following a Service Oriented Architecture
(SOA) pattern that refactors a system into application modules grouped
by business functions that expose their public functionality as services
(interfaces)... we'll explain how an Open Services Gateway initiative
(OSGi) container would solve them. We'll begin with an introduction
to the OSGi's solution to the problem, concepts, and platform, and
then we'll delve into the evolution of the OSGi from its past in the
world of embedded devices to its future in enterprise systems. We'll
also explain the relationship between the OSGi and other initiatives,
containers, and technologies to provide a comprehensive picture of
the OSGi from the perspective of software development... Conceptually
both SCA and OSGi provide a composite model for assembling a
services-based composite application that can expose some services
to the external world as well as invoke external services. In OSGi
R4, declarative services define a model to declare a component in
XML, capturing its implementation and references. Besides SCA-like
component-level information, the OSGi model captures additional
information to control runtime behavior. For example, R4 provides
bind/unbind methods to track the lifecycle or manage target services
dynamically. SCA metadata defines wires between components or from a
component to a reference in its composite model...

Conference Event Package Data Format Extension for Centralized Conferencing (XCON)

Members of the IETF Centralized Conferencing (XCON) Working Group have
published an initial Internet Draft for "Conference Event Package Data
Format Extension for Centralized Conferencing (XCON)." The XCON
framework defines a notification service that provides updates about
a conference instance's state to authorized parties using a notification
protocol. The "Data Format Extension" memo specifies a notification
mechanism for centralized conferencing which reuses the SIP (Session
Initiation Protocol) event package for conference state. Additionally,
the notification protocol specified in this document supports all the
data defined in the XCON data model (i.e., data model as originally
defined in RFC 4575) plus all the extensions, plus a partial notification
mechanism based on XML patch operations. Section 5.4 provides an XML
Schema for Partial Notifications. Generating large notifications to
report small changes does not meet the efficiency requirements of some
bandwidth-constrained environments. The partial notifications mechanism
specified in this section is a more efficient way to report changes in
the conference state. In order to obtain notifications from a conference
server's notification service, a client subscribes to the 'conference'
event package at the server as specified in RFC 4575. The NOTIFY
requests within this event package can carry an XML document in the
"application/conference-info+xml" format. Additionally, per this
specification, NOTIFY requests can also carry XML documents in the
"application/xcon-conference-info+xml" and the
"application/xcon-conference-info-diff+xml" formats. A document in the
"application/xcon-conference-info+xml" format provides the user agent
with the whole state of a conference instance. A document in the
"application/xcon-conference-info-diff+xml" format provides the user
agent with the changes the state of the conference instance has
undergone since the last notification sent to the user agent.
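
As a rough illustration of the partial notification idea, the sketch below applies a small list of add/replace/remove operations to a cached copy of the conference state instead of replacing the whole document. The element names, the un-namespaced documents, and the relative 'sel' paths are simplifications assumed for this example; they are not taken from the draft's schema, which should be consulted for the real format.

```python
import xml.etree.ElementTree as ET

# Full state, as a client might cache it after the first notification.
# (Simplified and un-namespaced; an assumption for this sketch.)
FULL_STATE = """\
<conference-info entity="sip:conf1@example.com" version="1">
  <conference-description>
    <subject>Weekly sync</subject>
  </conference-description>
  <users>
    <user entity="sip:alice@example.com"/>
  </users>
</conference-info>"""

# A hypothetical partial notification: three patch-style operations.
PARTIAL = """\
<conference-info-diff entity="sip:conf1@example.com">
  <replace sel="conference-description/subject">Monthly sync</replace>
  <add sel="users"><user entity="sip:bob@example.com"/></add>
  <remove sel="users/user[@entity='sip:alice@example.com']"/>
</conference-info-diff>"""


def apply_partial(state, diff):
    """Apply each operation in diff to state, in document order."""
    for op in diff:
        target = state.find(op.get("sel"))
        if target is None:
            raise ValueError(f"selector matched nothing: {op.get('sel')}")
        if op.tag == "replace":
            target.text = op.text            # swap the text content
        elif op.tag == "add":
            target.extend(list(op))          # append the new child elements
        elif op.tag == "remove":
            # ElementTree keeps no parent pointers, so search for the parent.
            parent = next(p for p in state.iter() if target in list(p))
            parent.remove(target)
    return state


state = apply_partial(ET.fromstring(FULL_STATE), ET.fromstring(PARTIAL))
subject = state.findtext("conference-description/subject")
users = [u.get("entity") for u in state.iter("user")]
print(subject, users)
```

The point of the design is visible in the sizes involved: the client keeps the full document and the server only ships the operations that changed it.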

Protect Your Project Zero Applications with OpenID

Access control-based security of application resources is one of the
core features of Project Zero. The OpenID Foundation describes OpenID
as an open, decentralized, free framework for user-centric digital
identity. OpenID takes advantage of already existing Internet technology
(URI, HTTP, SSL, Diffie-Hellman) and realizes that people are already
creating identities for themselves whether it be at their blog,
photostream, profile page, and so on. With OpenID you can easily
transform one of these existing URIs into an account you can use at
sites which support OpenID logins. Project Zero adopted the OpenID
technology as part of its security offering. In this article, the third
and final part of the series, you learn about Project Zero Security and
how to leverage OpenID authentication, define security rules for the
application, and extend a user registry... OpenID provides increased
flexibility for application deployment by enabling applications to
leverage third-party authentication providers for handling authentication.
Providers such as OpenID have become very common as more users want
a single user profile across multiple sites for blogs, wikis, and other
social networking activities. Additionally, many Web sites do not want
to maintain, or require users to continually provide, the same
profile-related information just to ensure that the user credentials
are valid. We hope this final article in the series has helped you learn
how to use the OpenID technology in the Project Zero platform to achieve
this decentralized authentication, and that the entire series has helped
you understand best practices for building the all-important security
features into your Zero applications. As a developer of fast-paced,
user-driven Web 2.0 applications, you know how vital security is to
both your customers and your business.

Sunday, February 17, 2008

SourceForge: Office Binary (doc, xls, ppt) Translator to Open XML Project

"As promised last month, the binary documentation (.doc, .xls, .ppt)
is now live. In addition to this, the project to create an open source
translator (binary to Open XML) has now been formed on SourceForge,
and the development roadmap has been published. While the project is
still in its infancy, you can see what the planned project roadmap is,
as well as an early draft of a mapping table between the Word binary
format (.doc) and the Open XML format (.docx). The binary documentation
itself is also available; it's all covered under the Open Specification
Promise. Another great surprise in all of this is that we've made the
documentation for a few other supporting technologies available as it
may be of use to folks implementing the binary formats: (1) Windows
Compound Binary File Format Specification; (2) Windows Metafile Format
(.wmf) Specification; (3) Ink Serialized Format (ISF) Specification..."
From the Overview: "The main goal of the Office Binary (doc, xls, ppt)
Translator to Open XML project is to create software tools, plus
guidance, showing how a document written using the Binary Formats
(doc, xls, ppt) can be translated into the Office Open XML format. As
a result customers can use these tools to migrate from the binary formats
to Office Open XML Format thus enabling them to more easily access their
existing content in the new world of XML. The Translator will be
available under the open source Berkeley Software Distribution (BSD)
license, which allows that anyone can use the mapping, submit bugs and
feedback, or contribute to the project. On February 15th 2008, Microsoft
has made it even easier to get access to the binary formats documentation
from [the Microsoft Office Binary File Formats web site], and the binary
formats have also been made available under the Microsoft Open
Specification Promise. The Office Open XML file format has been approved
as an Ecma standard and is available [online]. We have chosen to use an
Open Source development model that allows developers from all around the
world to participate and contribute to the project.

OASIS Members Submit Charter for ebXML Core (ebCore) Technical Committee

OASIS announced the submission of a draft charter for a new ebXML Core
(ebCore) Technical Committee. The ebXML Core TC is to be the maintenance
group for ebXML TC specifications as these specifications are completed
or transitioned to the ebXML Core TC. The OASIS ebXML Joint Committee
is disbanding, so ebXML Core TC will take on the roles and the work of
the ebXML JC, and in addition will do maintenance on the standards that
have been produced by the ebXML TCs: ebXML Messaging, ebXML CPPA, ebXML
ebBP, ebXML IIC, and ebXML RegRep. Companies represented by the TC
proposers include Axway Software, The Boeing Company, British
Telecommunications plc, Fujitsu Limited, and Sonnenglanz Consulting.
The ebXML Core TC will provide the means to manage clarifications,
modifications, and enhancements for the specifications that ebXML TCs
have either completed and/or turned over as work in progress for
completion through the OASIS standards process. The ebXML Core TC may
issue errata for specifications that they maintain and may complete
reviews and changes required by editors on committee drafts received
by the TC for completion. The ebXML Core TC may form subcommittees to
provide focus for specific specification tasks as they arise. The ebXML
Core TC may lead the formation of charters for new ebXML TCs for major
new versions of specifications. The TC may also produce new conformance
profiles and adjunct documents complementing existing specifications.
The ebXML Core TC will solicit new end user requirements as well as
implementation enhancements and change requests. The TC will also
explore synergies with UN/CEFACT, WS-* specifications and SOA best
practices. The ebXML Core TC may update schemas, examples,
specifications and other products of ebXML TC activities.

XML 1.0 (Fifth Edition)

The fifth edition of XML 1.0 is now a 'Proposed Edited Recommendation'
(PER). New editions do little more than incorporate errata, hardly
newsworthy. This one is different. Fifth Edition is now out for review.
The review period is long, lasting until 16-May-2008, because one of
the proposed changes is significant. Before the fifth edition, XML 1.0
was explicitly based on Unicode 2.0. As of the fifth edition, it is
based on Unicode 5.0.0 or later. This effectively allows not only
characters used today, but also characters that will be used tomorrow.
One of the real strengths of XML from the very beginning was that it
required processors to support Unicode. This made XML, and all XML
processors, international. But as Unicode has been extended to support
languages written in Cherokee, Ethiopic, Khmer, Mongolian, Canadian
Syllabics, and other scripts, XML 1.0's explicit use of Unicode 2.0 has
prevented it from growing as well. That's a problem that XML must fix
if it wants to continue to be regarded as a universal text format...
The fifth edition does not change the status of any existing XML 1.0
document with respect to well-formedness or validity. Nor does it
introduce any of the backwards-incompatible changes introduced in XML 1.1.
It isn't entirely without pain, unfortunately. Even if we imagine that
all parsers will be updated to reflect the fifth edition (and it's
possible to be optimistic on this point as it actually makes parsers
smaller and simpler) eventually, there will be some period of time in
which your (fourth edition) parser might reject my (fifth edition)
document. The XML Core WG is taking the position that the benefits of
extending XML 1.0 in this way outweigh the costs imposed by the change.
It remains to be seen if the community will agree. Bear in mind that
this sort of change isn't entirely unprecedented: we previously
decoupled 'xml:lang' attributes from the relevant RFCs and we tinkered
with the specific version of Unicode 3 referenced. That said, this is
still a much more substantial change.
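
To make the effect on names concrete, here is a minimal sketch of a name check built on broad code point ranges rather than an enumerated Unicode 2.0 character list. The ranges are written from memory of the NameStartChar and NameChar productions and should be verified against the published edition; the function name 'is_xml_name' is invented for this example.

```python
# Code point ranges allowed at the start of a name (assumed to match the
# fifth edition's NameStartChar production; verify against the spec).
NAME_START_RANGES = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Additional code points allowed after the first character:
# '-', '.', digits, middle dot, combining marks, and two connectors.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(ch, ranges):
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_xml_name(text):
    """Return True if text is a name under the range-based rules above."""
    if not text:
        return False
    if not _in_ranges(text[0], NAME_START_RANGES):
        return False
    return all(
        _in_ranges(ch, NAME_START_RANGES) or _in_ranges(ch, NAME_EXTRA_RANGES)
        for ch in text[1:]
    )


cherokee = "\u13a0\u13b9"   # two Cherokee letters, a script added after Unicode 2.0
print(is_xml_name(cherokee), is_xml_name("a-b.c"), is_xml_name("1st"))
```

Because the check is a handful of range comparisons, any script assigned inside those ranges by a later Unicode version is accepted without touching the parser.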

OASIS TC Publishes Code List Representation (Genericode) Version 1.0

G. Ken Holman announced that "Code List Representation (Genericode)
Version 1.0" (Committee Specification 01) has been published, and is
available online. Edited by Anthony B. Coates on behalf of the OASIS
Code List Representation TC, this document describes the OASIS Code
List Representation model and W3C XML Schema, known collectively as
'genericode'. Code lists, or enumerated values, have been with us since
long before computers. Most people would agree that the following is
a code list: {'SUN', 'MON', 'TUE', 'WED', 'THU', 'FRI', 'SAT'}. Code
lists should be well understood and easily dealt with by now.
Unfortunately, they are not. As is often the case, if you take a
fundamentally simple concept, you find that everyone professes to
understand it with complete clarity. When you look more closely, you
find that everybody has their own unique view of what the problem is
and how it should be solved. If code lists were really so simple and
obvious, there would already be a single, well-known and accepted way
of handling them in XML. There is no such agreed solution, though.
The problem is that while code lists are a well understood concept,
people don't actually agree exactly on what code lists are, and how
they should be used. The OASIS Code List Representation format,
'genericode', is a single model and XML format (with a W3C XML Schema)
that can encode a broad range of code list information. The XML format
is designed to support interchange or distribution of machine-readable
code list information between systems. Note that genericode is not
designed as a run-time format for accessing code list information, and
is not optimized for such usage. Rather, it is designed as an
interchange format that can be transformed into formats suitable for
run-time usage, or loaded into systems that perform run-time processing
using code list information. There are 3 kinds of genericode documents,
all supported by the one W3C XML Schema: (1) Column Set documents
(contain definitions of genericode columns or keys that can be imported
into code list documents or into other column set documents); (2) Code
List documents (contain metadata describing the code list as a whole,
as well as explicit code list data -- codes and associated values); (3)
Code List Set documents (contain references to particular versions of
code lists, and can also contain version-independent references to code
lists; a code list set document can be used to define a particular
configuration of versions of code lists that are used by a project,
application, standard, etc.). Work on the corresponding CVA formats is
still underway.
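
The "interchange format, not run-time format" distinction can be sketched as a one-off load step that turns a code list document into an in-memory lookup table. The markup below is a cut-down, un-namespaced approximation written for this example; it is not validated against the genericode schema, and the column identifiers 'code' and 'name' are assumptions.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a code list document (assumed structure).
DAYS = """\
<CodeList>
  <SimpleCodeList>
    <Row>
      <Value ColumnRef="code"><SimpleValue>SUN</SimpleValue></Value>
      <Value ColumnRef="name"><SimpleValue>Sunday</SimpleValue></Value>
    </Row>
    <Row>
      <Value ColumnRef="code"><SimpleValue>MON</SimpleValue></Value>
      <Value ColumnRef="name"><SimpleValue>Monday</SimpleValue></Value>
    </Row>
  </SimpleCodeList>
</CodeList>"""


def load_lookup(xml_text, key_column, value_column):
    """Transform the interchange document into a plain dict for run-time use."""
    table = {}
    for row in ET.fromstring(xml_text).iter("Row"):
        # Map each column reference in this row to its simple value.
        cells = {
            value.get("ColumnRef"): value.findtext("SimpleValue")
            for value in row.iter("Value")
        }
        table[cells[key_column]] = cells[value_column]
    return table


day_names = load_lookup(DAYS, "code", "name")
print(day_names["MON"])
```

Application code then works against 'day_names' and never parses the XML on a hot path.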

W3C Last Call Working Draft: CSS Namespaces Module

W3C announced the release of an updated, Last Call Working Draft for
the "CSS Namespaces Module" specification, updating the previous WD
published 2006-08-28. The previous draft was edited by Peter Linss and
Chris Lilley. The deadline for comments is 7-March-2008. This CSS
Namespaces Module defines syntax for using namespaces in CSS. It defines
the '@namespace' rule for declaring a default namespace and for binding
namespaces to namespace prefixes. It also defines a syntax for using
those prefixes to represent namespace-qualified names. It does not
define where such names are valid or what they mean: that depends on
their context and is defined by a host language, such as Selectors,
that references the syntax defined in the CSS Namespaces module. Note
that a CSS client that does not support this module will, if it properly
conforms to CSS's forward-compatible parsing rules, ignore all
'@namespace' rules, as well as all style rules that make use of namespace
qualified names. The syntax of delimiting namespace prefixes in CSS was
deliberately chosen so that these CSS clients would ignore the style
rules rather than possibly match them incorrectly. A document or
implementation cannot conform to CSS Namespaces alone, but can claim
conformance to CSS Namespaces if it satisfies the conformance requirements
in this specification when implementing CSS or another host language that
normatively references this specification. Conformance to CSS Namespaces
is defined for two classes: (1) style sheet: a CSS style sheet or a
complete unit of another host language that normatively references CSS
Namespaces; (2) interpreter: someone or something that interprets the
semantics of a style sheet, where CSS user agents fall under this
category. CSS is the Web's primary style sheet language for specifying
the rendering of text documents, in particular those expressed in HTML
and XML-based formats. It can also be used to specify portions of the
rendering of certain non-text formats, such as SMIL (multimedia) and SVG
(vector graphics). The model of text-flow and the set of properties of
CSS are also shared with XSL, W3C's style language for complex formatting
of XML-based document formats, though XSL is developed by a separate WG.
In addition to visual output (screen, print), CSS also contains styling
properties for speech output. The CSS WG develops and maintains the CSS
language and related technologies. CSS allows both authors and readers
to specify the display or other rendering of documents, such as those in
HTML or SVG. CSS has several levels, from simple (level 1) to complex
(level 3) and several 'profiles,' which describe how CSS applies on
different media (TV, handheld, etc.). Level 1 is a Recommendation, level
2 is in maintenance, level 3 is currently being developed.
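
The two pieces of syntax described above, the '@namespace' rule and the 'prefix|name' form, can be illustrated with a small sketch that collects the declarations from a style sheet and expands a qualified name. This is a regular-expression approximation for illustration only, not a conforming CSS parser: it ignores comments and escapes, leaves out the '*|name' and '|name' forms, and applies the default namespace to unprefixed names the way a host language such as Selectors does for type selectors.

```python
import re

CSS = """
@namespace url(http://www.w3.org/1999/xhtml);
@namespace svg url(http://www.w3.org/2000/svg);
svg|circle { fill: red }
p { color: black }
"""

NAMESPACE_RULE = re.compile(
    r"@namespace\s+"
    r"(?:(?P<prefix>[A-Za-z_][\w-]*)\s+)?"               # optional prefix
    r"(?:url\(\s*['\"]?(?P<url>[^'\")\s]+)['\"]?\s*\)"   # url(...) form
    r"|['\"](?P<string>[^'\"]*)['\"])"                   # or a quoted string
    r"\s*;"
)


def declared_namespaces(css):
    """Map each declared prefix (None for the default) to its namespace name."""
    return {
        m.group("prefix"): m.group("url") or m.group("string")
        for m in NAMESPACE_RULE.finditer(css)
    }


def expand(qname, namespaces):
    """Turn 'prefix|local' or 'local' into a (namespace, local) pair."""
    prefix, sep, local = qname.rpartition("|")
    if sep:
        return namespaces[prefix], local      # KeyError for undeclared prefixes
    return namespaces.get(None), local        # fall back to the default


namespaces = declared_namespaces(CSS)
print(expand("svg|circle", namespaces), expand("p", namespaces))
```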

SEC Financial Explorer Supports XBRL Interactive Data

The U.S. Securities and Exchange Commission (SEC) has announced the
launch of the "Financial Explorer" on the SEC Web site to help
investors quickly and easily analyze the financial results of public
companies. XBRL is a member of the family of languages based on XML
(Extensible Markup Language), which is a standard for the electronic
exchange of data between businesses and on the internet. Under XML,
identifying tags are applied to items of data so that they can be
processed efficiently by computer software. Financial Explorer paints
the picture of corporate financial performance with diagrams and charts,
using financial information provided to the SEC as "interactive data"
in Extensible Business Reporting Language (XBRL). At the click of a
mouse, Financial Explorer lets investors automatically generate
financial ratios, graphs, and charts depicting important information
from financial statements. Information including earnings, expenses,
cash flows, assets, and liabilities can be analyzed and compared across
competing public companies. The software takes the work out of
manipulating the data by entirely eliminating tasks such as copying and
pasting rows of revenues and expenses into a spreadsheet. That frees
investors to focus on their investments' financial results through
visual representations that make the numbers easier to understand.
Financial Explorer is open source software, meaning that its source
code is free to the public, and technology and financial experts can
update and enhance the software. As interactive data becomes more
commonplace, investors, analysts, and others working in the financial
industry may develop hundreds of Web-based applications that help
investors garner insights about financial results through creative
ways of analyzing and presenting the information. In addition to
Financial Explorer, the SEC currently offers investors two other
online viewers -- the Executive Compensation viewer and the Interactive
Financial Report viewer, also available online. The Executive
Compensation viewer enables investors to instantly compare what 500
of the largest U.S. companies are paying their top executives. The
Interactive Financial Report viewer also helps investors gather,
analyze, and compare key financial disclosures filed voluntarily by
public companies using XBRL. To date, there have been 307 such filings
from 74 companies. Under the SEC's interactive data filing program,
companies may continue to file XBRL data voluntarily, pending
anticipated Commission rulemaking.

WLS 10.3 Tech Preview Supports Service Component Architecture (SCA)

The WebLogic 10.3 Tech Preview now supports a Service Component Architecture
(SCA) runtime. The SCA specification has two main parts: implementation
of service components (which can be done in any language) and the
assembly model which is the linking of components through wiring (which
is done through XML files). Every component technology (Spring, POJO,
EJB etc) that wants to participate in the SCA framework should support
SCA metadata. The SCA specification defines language bindings for each
of the technologies. In my opinion, SCA is the next evolution of building
interoperable distributed systems. The claim to fame for SOAP based web
services is that it provided a programming model where clients do not
care as to which programming language the service is implemented. The
client can be written in any language that has a SOAP binding. The only
restriction on the client is that it has to use the SOAP API. Thus by
service enabling your existing business services and modifying your
legacy clients to speak SOAP, web services have made enterprise
integration easier compared to yesteryear. Now SCA takes this
interoperability to the next level, where now your clients can stay as
is and do not have to use the same transport as the service. All the
client knows is that it is a remotable service. Let us say you have a
Java client that was talking to an EJB. Now this EJB has been converted
into a web service. In this case, the Java client does not have to
change to use SOAP API. Instead it can still use its EJB client code
because the web service (which is an SCA component) can be decorated
with an EJB binding. Thus the service can be implemented in one technology
such as Spring, POJO, EJB, web service or BPEL and it can be decorated
with a different binding (Spring, POJO, EJB, web services etc) to support
different clients. By including the SCA runtime on WLS, customers can
take advantage of the RASP functionality provided by WLS for the deployed
SCA components. The infrastructural capabilities such as security,
transactions, reliable messaging that are to be handled declaratively
through policies under the SCA specification can all be provided by WLS.
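
The idea of one implementation "decorated" with different bindings can be sketched in a few lines. This is a conceptual illustration only: it does not use any SCA runtime or WLS API, and the names 'QuoteService', 'local_binding', and 'json_binding' are invented for the example, with a JSON string standing in for a wire format.

```python
import json

PRICES = {"ACME": 12.5}   # made-up data for the example


class QuoteService:
    """The component implementation: business logic only, no transport code."""

    def quote(self, symbol):
        return {"symbol": symbol, "price": PRICES[symbol]}


def local_binding(component):
    """Expose the component to in-process clients as a plain callable."""
    return component.quote


def json_binding(component):
    """Expose the same component to clients that speak JSON strings."""
    def call(payload):
        request = json.loads(payload)
        return json.dumps(component.quote(request["symbol"]))
    return call


service = QuoteService()
in_process = local_binding(service)
over_the_wire = json_binding(service)

print(in_process("ACME"))
print(over_the_wire('{"symbol": "ACME"}'))
```

Each client keeps the calling convention it already uses, and the component itself is unchanged when a binding is added.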

Friday, February 15, 2008

W3C Publishes Best Practices for XML Internationalization

W3C's Internationalization Tag Set (ITS) Working Group has published a
Group Note for "Best Practices for XML Internationalization." The
specification provides a set of guidelines for developing XML documents
and schemas that are internationalized properly. Following the best
practices described here allows both the developers of XML applications
and the authors of XML content to create material in different
languages. This document and "Internationalization Tag Set (ITS)
Version 1.0" implement requirements formulated in "Internationalization
and Localization Markup Requirements." This note is intended to
complement the W3C ITS Recommendation, since not all
internationalization-related issues can be resolved by the special
markup described in ITS. The best practices in this document therefore
go beyond application of ITS markup to address a number of problems
that can be avoided by correctly designing the XML format, and by
applying additional guidelines when developing content. Guidelines
for designers and developers of XML applications are presented in
three sections. Section 2 "When Designing an XML Application" provides
a list of some of the important design choices you should make in
order to ensure the internationalization of your format. Section 4
"Generic Techniques" provides additional generic techniques such as
writing ITS rules or adding an attribute to a schema; such techniques
apply to many of the best practices. Section 5 "ITS Applied to
Existing Formats" provides a set of concrete examples on how to apply
ITS to existing XML based formats; this section illustrates many of
the guidelines in this document. Guidelines for users and authors of
XML content are outlined in other document sections. Section 3 "When
Authoring XML Content" provides a number of guidelines on how to create
content with internationalization in mind. Many of these best practices
are relevant regardless of whether or not your XML format was developed
especially for internationalization. Section 4.1 "Writing ITS Rules"
provides practical guidelines on how to write ITS rules. Such
techniques may be useful when applying some of the more advanced
authoring best practices.
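
As a small illustration of what an ITS rule does, the sketch below reads a translate rule embedded in a document and works out which elements a localization tool should leave alone. It handles only 'translate="no"' rules with selectors of the form '//name', because Python's ElementTree supports just a subset of XPath; a real ITS processor would need a full XPath engine and the precedence rules from the ITS Recommendation. The sample document and the function name 'untranslatable' are made up for the example.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

DOC = """\
<doc xmlns:its="http://www.w3.org/2005/11/its">
  <its:rules version="1.0">
    <its:translateRule selector="//code" translate="no"/>
  </its:rules>
  <p>Press <code>Ctrl+S</code> to save.</p>
  <p>Done.</p>
</doc>"""


def untranslatable(root):
    """Collect elements selected by translate="no" rules, plus descendants."""
    blocked = set()
    for rule in root.iter(f"{{{ITS_NS}}}translateRule"):
        selector = rule.get("selector")
        if rule.get("translate") != "no" or not selector.startswith("//"):
            continue   # outside the subset this sketch handles
        # ElementTree needs a relative path, so '//code' becomes './/code'.
        for element in root.findall("." + selector):
            blocked.update(element.iter())
    return blocked


root = ET.fromstring(DOC)
protected = sorted(element.text for element in untranslatable(root))
print(protected)
```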

Google Code Project Provides an Enterprise Java XACML Implementation

The 'enterprise-java-xacml' Google Code Project provides a high
performance XACML 2.0 implementation that can be used in the enterprise
environment. A first release has been announced; the software is
made available under the Apache License 2.0. Enterprise Java XACML
intends to fully implement OASIS XACML 2.0 and will support XACML
3.0 in the future. It is a totally independent implementation. It
fully implements XACML 2.0 core standard and has passed all
conformance tests. It provides PDP that can accept XACML requests
and returns XACML responses. The software is said to offer a highly
effective target indexing mechanism that greatly speeds up policy
searching: completely cached decisions that can speed up the
evaluation, and completely cached policies that can speed up the
evaluation. It supports a pluggable data store mechanism: users can
implement their own data store by implementing only a few interfaces;
a file data store implementation is provided. It features a pluggable
context factory: users can implement their own context factory that
wraps request/response in a specific format, and a default
implementation is supplied. A pluggable logger mechanism means users
can implement their own logger mechanism: "I've provided 2 types of
logger, one is log4j, the other is a default logger; if log4j
conflicts with user's system, they may want to use this default one."
The tool supports an extensible XACML function registering mechanism;
users can write their own functions and register them to PDP and then
use in policies. The extensible attribute retriever mechanism means
that users can write their own attribute retriever to retrieve
attributes from external systems. It provides simple PAP APIs that
can be used to produce XACML policy files; users who want to write an
XACML policy administrative UI can also rely on these APIs. Both
XACML APIs and an application framework are supported, which means
users can incorporate this implementation by calling XACML APIs from
their own applications. The implementation also provides a standalone
application framework that users can start and then send XACML
requests to directly for evaluation. The software is distributed with unit
tests and conformance tests against XACML 2.0.
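
The decision-caching idea mentioned above can be sketched with a toy policy decision point. This is not the enterprise-java-xacml API and it does not parse XACML: the 'Rule' class, the attribute names, and the first-applicable evaluation order are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Rule:
    role: str        # subject attribute the rule targets
    resource: str    # resource the rule targets
    action: str      # action the rule targets
    effect: str      # "Permit" or "Deny"


# A hypothetical policy, evaluated in order (first applicable rule wins).
POLICY = (
    Rule("doctor", "record", "read", "Permit"),
    Rule("nurse", "record", "write", "Deny"),
)


@lru_cache(maxsize=1024)
def evaluate(role, resource, action):
    """Return the decision for a request; repeated requests hit the cache."""
    for rule in POLICY:
        if (rule.role, rule.resource, rule.action) == (role, resource, action):
            return rule.effect
    return "NotApplicable"


first = evaluate("doctor", "record", "read")
second = evaluate("doctor", "record", "read")    # served from the cache
other = evaluate("guest", "record", "read")
print(first, second, other, evaluate.cache_info().hits)
```

A cache like this is only safe while the policy set is unchanged; a real PDP would need to invalidate cached decisions whenever policies are updated.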

Yet Another Computer Language

"...Microsoft is designing yet another computer language... it's a
declarative language [but] 'Declarative' is an awfully broad term, with
multiple meanings. Standard ML is considered declarative, and so are
its derivatives OCAML and F#. Prolog and rule-based AI systems are
considered declarative. You declare the rules: the logic engine decides
how to run them. SQL queries are declarative: you describe the data you
want to see, and the query optimizer figures out how to get it out of
the database. Haskell is considered declarative as well as functional,
not to mention that it has monads... XAML is a declarative language for
the domain of graphics. It was designed as an extension of XML. It's
such an expressive language that Charlie Petzold, arguably one of XAML's
most vocal proponents, built himself an alternative to Microsoft's XAMLPad
called XAML Cruncher, so that he could 'interactively type XAML code and
see the object it creates.' In Visual Studio 2008, Microsoft included a
bidirectional, split-screen XAML designer, so that you can create XAML
by dragging and dropping objects and by typing XAML code, with the ability
to freely switch back and forth between the two methods. I freely admit
to needing these tools; I can almost never write XAML that will display
correctly on the first try. Watching the Connected Systems Division (CSD)
at Microsoft over the years, it has been clear that they have been on a
code-reduction path. Why? SOAP was invented by Don Box and others to be
an XML-based lingua franca for communication among disparate computer
applications and systems. The functional deficiencies of SOAP were
addressed by the WS-* series of standards, to give it security,
authentication, reliability, and so on. All of those standards made it
harder to write conformant client and server code, raising the complexity
by orders of magnitude..." From Microsoft's "XAML Overview" document:
"XAML simplifies creating a UI for the .NET Framework programming model.
You can create visible UI elements in the declarative XAML markup, and
then separate the UI definition from the run-time logic by using
code-behind files, joined to the markup through partial class definitions.
The ability to mix code with markup in XAML is important because XML by
itself is declarative, and does not really suggest a model for flow
control. An XML based declarative language is very intuitive for creating
interfaces ranging from prototype to production, especially for people
with a background in web design and technologies. Unlike most other
markup languages, XAML directly represents the instantiation of managed
objects. This general design principle enables simplified code and
debugging access for objects that are created in XAML."

Layer 7 Announces XML Firewall and XML Networking Gateway Products

Layer 7 Technologies has announced its XML Firewall and XML Networking
Gateway software products support the Solaris 10 Operating System (OS)
running on SPARC platforms from Sun Microsystems. Layer 7 is the only
XML security and networking vendor to offer server software for Solaris
10 OS running on SPARC and x86 platforms through an upgradeable family
of XML appliances for Service Oriented Architectures (SOA) and Web 2.0
applications. For customers with processor-intensive SOA and Web 2.0
applications, Layer 7 introduced support for Solaris 10 OS on SPARC to
further the scalability, density and performance offerings by Sun's
SPARC Enterprise Servers. Many of Layer 7's customers use SPARC-based
platforms for high-volume data center applications making SPARC
technology their first choice for SOA and XML applications. The
SecureSpan XML Firewall combines the capabilities of the SecureSpan XML
Accelerator and Data Screen with advanced identity and message level
security to address the broadest range of behind the firewall, portal
and B2B SOA security challenges. The SecureSpan XML Firewall includes
support for all leading directory, identity, access control, Single
Sign-On (SSO) and Federation services. This provides SOA and security
architects unparalleled flexibility in defining and enforcing
identity-driven SOA security policies leveraging SSO session cookies,
Kerberos tickets, SAML assertions and PKI. The SecureSpan XML Firewall
also provides architects with advanced policy controls for specifying
message and element security rules including the ability to branch
policy based on any message context. Key storage, encryption and signing
operations can be handled in FIPS 140-2 certified acceleration hardware
onboard the appliance or centrally through Safenet's Luna HSM. The
SecureSpan XML Firewall has demonstrated compliance with all major WS-*
and WS-I security protocols including WS-Security, WS-SecureConversation,
WS-SecurityPolicy, WS-Trust, WS-Secure Exchange, WS-Policy and WS-I
Basic Security Profile. The SecureSpan Firewall also supports SAML 1.1
and 2.0 both in sender vouches and holder of key models.

XML at 10

Ten years ago today XML was born. That's when it was first published as
a Recommendation. XML goes back a little further than that: it
gestated, to stick to the metaphor, for almost two years at the W3C:
Dan Connolly announced the creation of the SGML Working Group mailing
list on 28-August-1996. It predates even that, of course, in the vision
of Yuri Rubinsky, Jon Bosak, and many others who imagined bringing the
full richness of generalized markup vocabularies to the then nascent
World Wide Web. My personal, professional career goes back to the fall
of 1993, so I came onto the scene only late in the development of 'SGML
on the Web' as an idea. Its earliest history is lost in the blur of
fear, excitement, and delight that I felt as I was thrust by circumstance
into the SGML community. I joined O'Reilly on the very first day of an
unprecedented two-week period during which the production department,
the folks who actually turn finished manuscripts into books, was closed.
The department was undergoing a two-week training period during which
they would learn SGML and, henceforth, all books would be done in SGML.
The day was a Monday in November, 1993; I know this for sure because I
still have the T-Shirt... Despite an inauspicious start, I have
essentially made my career out of it. I learned SGML at O'Reilly and
began working on DocBook, I worked in SGML professional services at
Arbortext, and I joined Sun to work in the XML Technology Center. XML
has been good to me. Things have not turned out as planned. The economic
forces that took over when the web became 'the next big thing' are
more interested in pixel-perfect rendering, animation, entertainment,
and advertising than in richly structured technical content. HTML 5
may be the last nail in the 'SGML on the Web' coffin, but few would
deny that XML has been a huge success. [Note: The DocBook Version 5.0
release is a complete rewrite of DocBook in RELAX NG. The intent of
this rewrite is to produce a schema that is true to the spirit of
DocBook while simultaneously removing inconsistencies that have arisen
as a natural consequence of DocBook's long, slow evolution. The
OASIS Technical Committee has taken this opportunity to simplify a
number of content models and tighten constraints where RELAX NG makes
that possible.]

W3C XML 10 Years

Ten years ago, on 10 February 1998, W3C published the "Extensible Markup
Language (XML) 1.0" specification as a W3C Recommendation. W3C is marking
the ten-year anniversary of XML by celebrating "XML10" and extending
thanks to the dedicated communities -- including people who have
participated in W3C's XML groups and mailing lists, the SGML community,
and xml-dev -- whose efforts have created a successful family of
technologies based on the solid XML 1.0 foundation. The success of XML
is a strong indicator of how dedicated individuals, working within the
W3C Process, can engage with a larger community to produce industry-changing
results. "Today we celebrate the success of open standards in preserving
Web data from proprietary ownership," said Jon Bosak, who led the W3C
Working Group that produced XML 1.0. Tim Bray of Sun Microsystems:
"There is essentially no computer in the world, desk-top, hand-held, or
back-room, that doesn't process XML sometimes. This is a good thing,
because it shows that information can be packaged and transmitted and used
in a way that's independent of the kinds of computer and software that
are involved. XML won't be the last neutral information-wrapping system;
but as the first, it's done very well." Indeed, one can hardly get through
the day without using technology that is based on XML in some fashion.
When you fill your auto tank with gas, XML often flows from pump to
station. When you configure your digital camera, on some models you do
so via XML-based graphical controls. When you plug it into a computer,
the camera and the operating system communicate with each other in XML.
When you download digital music, the software you use to organize it is
likely to store information about songs as XML. And when you explore
the planet Mars, XML goes with you... W3C would like to extend
congratulations to the participants of the XML Working Group that created
the standard: Jon Bosak, Paula Angerstein, Tim Bray (co-Editor), James
Clark, Dan Connolly, Steve DeRose, Dave Hollander, Eliot Kimber, Tom
Magliery, Eve Maler, Murray Maloney, Makoto Murata, Joel Nava, Conleth
O'Connell, Jean Paoli (co-Editor), Peter Sharpe, C. M. Sperberg-McQueen
(co-Editor), and John Tigue.

Expressing SNMP SMI Datatypes in XML Schema Definition Language

Members of the IETF Operations and Management Area Working Group have
published a new Internet Draft in the online directories:
"Expressing SNMP SMI Datatypes in XML Schema Definition Language." The
memo defines the IETF standard expression of Simple Network Management
Protocol (SNMP) Structure of Management Information (SMI) datatypes in
Extensible Markup Language (XML) Schema Definition (XSD) language. The
primary objective of this memo is to enable production of XML documents
that are as faithful to the SMI as possible, using XSD as the validation
mechanism. This memo is the first in a set of three related and
(logically) ordered specifications: (1) SNMP SMI Datatypes (RFC 2578)
in XSD; (2) SNMP MIB Structure (RFC 2578) in XSD; and (3) SNMP Textual
Conventions (RFC 2579) in XSD. As a set, these documents define the XSD
equivalent of SMIv2 to encourage XML-based protocols to carry, and
XML-based applications to use, the information modeled in the
SMIv2-compliant Management Information Base ("The MIB"). Various
independent schemes have been devised for expressing the SMI datatypes
and textual conventions in W3C Schema (XSD). These schemes have exhibited
a degree of commonality (especially concerning the numeric SMI datatypes),
but also sufficient differences (especially concerning the non-numeric
SMI datatypes) to preclude general interoperability. The primary purpose
of this memo is to define a standard expression of SMI datatypes in XSD
to ensure uniformity and general interoperability in this respect.
Internet operators, management tool developers, and users will benefit
from the wider selection of management tools and the greater degree of
unified management -- with attendant improvements in timeliness and
accuracy of management information -- which such a standard will
facilitate.
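
A minimal sketch of the idea, in Python: each numeric SMIv2 base type
becomes an XSD simple type restricted to the value range that RFC 2578
gives for it. The choice of 'xs:integer' as the base type and the
function name build_schema are assumptions made for illustration; the
Internet Draft itself may express these types differently.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Value ranges of the numeric SMIv2 base types, as defined in RFC 2578.
SMI_NUMERIC_RANGES = {
    "Integer32": (-2147483648, 2147483647),
    "Unsigned32": (0, 4294967295),
    "Counter32": (0, 4294967295),
    "Gauge32": (0, 4294967295),
    "TimeTicks": (0, 4294967295),
    "Counter64": (0, 18446744073709551615),
}


def build_schema() -> ET.Element:
    """Build an xs:schema element with one restricted simple type per SMI type.

    Illustrative mapping only: every type restricts xs:integer with
    explicit minInclusive/maxInclusive facets.
    """
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    for name, (low, high) in SMI_NUMERIC_RANGES.items():
        simple = ET.SubElement(schema, f"{{{XS}}}simpleType", name=name)
        restriction = ET.SubElement(
            simple, f"{{{XS}}}restriction", base="xs:integer"
        )
        ET.SubElement(restriction, f"{{{XS}}}minInclusive", value=str(low))
        ET.SubElement(restriction, f"{{{XS}}}maxInclusive", value=str(high))
    return schema


if __name__ == "__main__":
    print(ET.tostring(build_schema(), encoding="unicode"))
```

The numeric types are the easy case; as the memo notes, the independent
schemes diverge mostly on the non-numeric types, which this sketch does
not attempt.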

Relax-WS: Trying To Make WSDL Easier To Use?

As with many things concerning Web Services, there are vociferous
arguments for and against WSDL -- even before WSDL 2.0 poured oil on
the fires. One of the main arguments against WSDL is the verbosity and
complexity of what's involved in writing a WSDL for a service. However,
the Relax-WS project is attempting to provide a solution there. The
idea is to extend Relax-NG Compact Syntax by adding support for
services, ports, operations and messages. The project aims to encourage
developers to think about the WSDL from the start, as part of the
service contract and not as an afterthought. From the Google Code
project page 'relax-ws: A relaxing way to create web service
definitions': "WSDL is a key technology for SOA, and yet creating and
editing these files is about as much fun as straightening all the
noodles in a bowl of spaghetti with a pair of tweezers. Relax-ws
provides a simple, compact syntax for generating WSDL's. It does this
by extending RelaxNG Compact syntax with support for services, ports,
operations, and messages. Some teams use code-driven development,
whereby they write Java or C# interfaces and let their framework
generate the WSDL. This is fast for development, but can easily result
in platform-specific features sneaking in, which renders the interface
unusable for cross-platform clients. An even greater problem with
code-driven development is the evaporation of interface metadata that
occurs during translation into WSDL. Comments are not converted, nor
are any but the most simple type declarations (i.e., the length of an
'xsd:string' field, or the number of digits in a decimal field, etc).
These are important attributes for the consumer of the service to know
about. The opposite approach is WSDL-driven development. The programmer
begins with a WSDL file, and as part of the build generates the service
interface that is then implemented by one or more classes. The challenge
here lies in creating the WSDL! Relax-WS aims to provide a simple,
programmer-friendly syntax, without losing any of the metadata..."

New Draft for W3C Architectural Recommendation "The Self-Describing Web"

A new draft of the TAG Finding on "The Self-Describing Web" is now
available as an editor's draft. "Significant changes in this rewrite
include: (1) A major new section introduces and highlights the standard
HTTP-based retrieval "algorithm" that user agents employ to access
self-describing resource representations. (2) There is now extensive
discussion of URI-based extensibility, and of the ability of user agents
to dynamically acquire rules for interpreting new sorts of content by
retrieving OWL ontologies, namespace documents (RDDL), etc. Such dynamic
discovery using URIs is now to some degree a unifying theme for the
latter half of the finding. Examples are also given of using URIs as
the basis for extensible attribute values (e.g. link relationships) and
other similar data fields in Web representations. (3) The discussion of
RDF and RDFa has been updated and clarified and the section on GRDDL has
been added. (4) Numerous examples have been added." Document Abstract:
"The Web is designed to support flexible exploration of information, by
human users and by automated agents. For such exploration to be productive,
information published by many different sources and for a variety of
purposes must be comprehensible to a wide range of Web client software.
HTTP and other Web technologies can be used to deploy resources that
are self-describing, in the sense that only widely available information
is necessary for understanding them. Starting with a URI, there is a
standard algorithm that a user agent can apply to retrieve and interpret
a representation of such resources. Furthermore, when such self-describing
resources are linked together, the Web as a whole can support reliable,
ad hoc discovery of information. This finding describes how document
formats, markup conventions, attribute values, and other data formats
can be designed to facilitate the deployment of self-describing Web
content."
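
The interpretation step of that retrieval can be sketched roughly as
follows, in Python: the user agent reads the media type from the
Content-Type header of the response and hands the body to whatever
handler is registered for that type. The HANDLERS table and the
function names are hypothetical, and the network fetch is left out so
that the example stays self-contained.

```python
from email.message import Message
from typing import Callable, Dict, Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type header into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


# Hypothetical handlers keyed by media type; a real user agent would
# parse the representation rather than just describe it.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "text/html": lambda text: f"HTML document, {len(text)} characters",
    "application/rdf+xml": lambda text: f"RDF/XML graph, {len(text)} characters",
}


def interpret(content_type_header: str, body: bytes) -> str:
    """Dispatch a retrieved representation on its declared media type."""
    media_type, charset = parse_content_type(content_type_header)
    handler = HANDLERS.get(media_type)
    if handler is None:
        return f"no handler registered for {media_type}"
    # Assumption: fall back to UTF-8 when no charset parameter is given.
    return handler(body.decode(charset or "utf-8"))


if __name__ == "__main__":
    print(interpret("text/html; charset=UTF-8", b"<p>hi</p>"))
```

Nothing in the dispatch depends on prior arrangement between publisher
and client: the media type alone selects the interpretation, which is
the sense of "self-describing" used in the finding.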

Tuesday, February 12, 2008

XML at X; Film at XI

The original XML Recommendation is 10 years old today. Happy XML Day!
These anniversaries feel a little artificial to me; my first clear
memory of the XML work was a teleconference Jon Bosak had arranged among
the "SGML on the Web Editorial Review Board" members in June (?) 1996,
so for me XML is twelve and a half years old. As something of a birthday
present, today I'm publishing something SGML-flavored that I hope may
still be of use, or at least morbid interest, to modern XML
practitioners. You see, I cowrote a book in the just-prior-to-XML era
with another of my lifelong friends, Jeanne El Andaloussi, about SGML,
in SGML. In DocBook, as a matter of fact. That methodology I mentioned
above, with design principles and stuff? That came from this book. Now
that the book is out of print, she and I discussed the matter, and we
agreed to publish it here... You'll have to be the judge of how well
the content has stood the test of time, but I can tell you the markup
did beautifully. With a huge dollop of help from Norm Walsh (both his
DocBook stylesheets and his mad skillz), the SGML-to-XML-to-HTML
processing pipeline was downright trivial. Voila! We present to you
(online): "Developing SGML DTDs: From Text to Model to Markup."

New OASIS Standard: XML Localization Interchange File Format (XLIFF) v1.2

OASIS has announced the approval of the XML Localization Interchange
File Format (XLIFF) specification Version 1.2 as an OASIS Standard. The
specification was produced by members of the OASIS XML Localisation
Interchange File Format (XLIFF) Technical Committee. The purpose of the
XLIFF vocabulary is to store localizable data and carry it from one step
of the localization process to the other, while allowing interoperability
between tools. The specification is tool-neutral, supports the entire
localization process, and supports common software, document data formats,
and markup languages. The specification provides an extensibility
mechanism to allow the development of tools compatible with an
implementer's data formats and workflow requirements. The extensibility
mechanism provides controlled inclusion of information not defined in
the specification. The XLIFF file format serves as a container for
externalized data to be interchanged between software publishers,
documentation writers (including, but not limited to documents written
in DITA, Docbook, HTML, and other XML document formats), localization
tools, and software services providers in order to facilitate all the
phases of the localization process.
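
As a rough illustration of the container role, the Python sketch below
reads a small XLIFF 1.2 document and lists its translation units. The
sample file content and the function name translation_units are
invented for the example; only the namespace and the
file/body/trans-unit/source/target structure come from XLIFF 1.2.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample: two strings extracted from a properties file,
# one already translated and one still waiting for a translator.
SAMPLE = """\
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="hello.properties" source-language="en"
        target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="greeting">
        <source>Hello</source>
        <target>Bonjour</target>
      </trans-unit>
      <trans-unit id="farewell">
        <source>Goodbye</source>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def translation_units(xliff_text):
    """Return (id, source, target) tuples; target is None if untranslated."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        source = unit.findtext("x:source", default="", namespaces=XLIFF_NS)
        target = unit.findtext("x:target", default=None, namespaces=XLIFF_NS)
        units.append((unit.get("id"), source, target))
    return units


if __name__ == "__main__":
    for unit_id, source, target in translation_units(SAMPLE):
        print(unit_id, source, target)
```

A tool at any step of the process can work from the same file: an
extractor writes the source elements, a translation tool fills in the
targets, and a merger writes them back into the original format.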

The State of BPM: Top-Five Trends

Speaking at this week's Gartner BPM Summit in Las Vegas, Jay Simons,
VP of Marketing for BEA, presented the company's recent research results
on the state of the BPM market, including a survey of 200-plus BEA
customers, mostly IT people but spread across vertical markets and
geographies. They've also gathered information through their online BPM
Lifecycle Assessment. The results show a number of interesting trends
indicating that CIOs and business leaders are focused on improving their
processes. Existing customers described how they expect to get their
ROI from their BPM implementations, and most expect to see ROI over the
next three years. (1) IT embraces BPM enterprisewide, which broadens the
scope for BPM beyond the existing departmental systems, and centralizes
the practices around BPM. In general, this is occurring because of the
ability of BPM to connect applications into improved business processes;
more than half already are or will be connecting BPM and SOA in their
environment. (2) BPM is becoming event-driven, in order to support the
event-driven nature of business today. This will result in much more
agile processes that can respond to both expected and unexpected events.
(3) Increased focus on knowledge-intensive processes, and using
collaborative BPM to enable ad hoc processes, either on their own or as an
offshoot from a structured process... (4) Enterprise social computing:
introducing tagging, wiki, social connectedness and the like with more
traditional process management in order to add context and more easily
collaborate. (5) Moving towards dynamic business applications; Yvonne
Genovese spoke in this keynote about the move towards dynamic/composite
applications in order to free organizations from the pre-canned logic
in packaged enterprise applications; BPM, together with services
exposed in an SOA layer, allows for the fast assembly of applications
that are more suited to current business needs.

W3C's Excessive DTD Traffic

If you view the source code of a typical web page, you are likely to
see a DOCTYPE declaration and an XML namespace attribute near the top.
These [statements] refer to HTML DTDs and namespace documents hosted on W3C's
site. Note that these are not hyperlinks; these URIs are used for
identification. This is a machine-readable way to say "this is HTML".
In particular, software does not usually need to fetch these resources,
and certainly does not need to fetch the same one over and over! Yet we
receive a surprisingly large number of requests for such resources: up
to 130 million requests per day, with periods of sustained bandwidth
usage of 350Mbps, for resources that haven't changed in years. The vast
majority of these requests are from systems that are processing various
types of markup (HTML, XML, XSLT, SVG) and in the process doing something
like validating against a DTD or schema. Handling all these requests
costs us considerably: servers, bandwidth and human time spent analyzing
traffic patterns and devising methods to limit or block excessive new
request patterns. We would much rather use these assets elsewhere, for
example improving the software and services needed by W3C and the Web
Community. You might think something like "don't request the same
resource thousands of times a day, especially when it explicitly tells
you it should be considered fresh for 90 days" would be obvious, but
unfortunately it seems not. At the W3C Systems Team's request the W3C
TAG has agreed to take up the issue of "Scalability of URI Access to
Resources."
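
What the W3C is asking of client software can be sketched in a few
lines of Python: remember each resource together with the freshness
lifetime the server declared, and go back to the network only once
that lifetime has passed. The class name FreshnessCache and the shape
of the fetch callable are assumptions for the example; the 90-day
figure from the article corresponds to max-age=7776000.

```python
import re
import time


def max_age_seconds(cache_control):
    """Extract max-age from a Cache-Control header value; 0 if absent."""
    match = re.search(r"max-age=(\d+)", cache_control or "")
    return int(match.group(1)) if match else 0


class FreshnessCache:
    """Tiny in-memory cache that honors a server-declared max-age.

    `fetch` is a hypothetical callable: url -> (body, cache_control_header).
    `clock` is injectable so the behavior can be exercised without waiting.
    """

    def __init__(self, fetch, clock=time.time):
        self._fetch = fetch
        self._clock = clock
        self._entries = {}  # url -> (body, expiry timestamp)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: no request is made
        body, cache_control = self._fetch(url)
        self._entries[url] = (body, now + max_age_seconds(cache_control))
        return body
```

With a cache like this in front of a validating parser, a process that
handles thousands of documents a day would request a given DTD once
per freshness period instead of once per document. Shipping local
copies of the common DTDs with the software avoids even that request.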

WSO2 Joining Open-Source SOA Registry Field

Featuring SOA governance and Web 2.0 collaboration capabilities, WSO2
Registry offers a repository for storing information and a registry for
locating it. A Web-based interface is included along with Web 2.0 features
like tags, ratings, and comments systems. Users can store and manage
enterprise metadata in a wiki-style model. WSO2's structured repository
supports XML and SOA metadata formats along with arbitrary data, such
as Microsoft Office documents, images, files, and text formats. A
catalog of enterprise information can be built that includes services,
service descriptions, employee data, and ongoing projects. "It's a
registry and repository product, so it basically organizes and lets you
store in a versioned and REST-compatible way all the SOA metadata that
happens to be in your enterprise," said Glen Daniels, director of Java
platforms at WSO2. Metadata can include service descriptions, XML schemas,
and configuration data. The company follows MuleSource in announcing an
open-source registry and repository for SOA. According to the announcement:
"The WSO2 Registry supports several usage scenarios. First, it can be
embedded in any Java application that needs a registry and repository
to store resources with Web 2.0-style features for commenting, tagging
and rating those resources. Second, it offers a REST-style Web API that
allows the WSO2 Registry to be used remotely from any application from
any language, including Java, PHP, C++, and Javascript. The Web API is
built on the popular Atom and AtomPub protocols, allowing any feed reader
(such as Google Reader or Bloglines reader) to browse the contents of
the Registry. Finally, the WSO2 Registry comes with an attractive
AJAX-powered user interface, which allows it to be used as a Web
application by both business and technical users alike. The product can
be used with the embedded database or be configured to use an existing
database such as MySQL, Oracle or SQLServer..."

Do We Really Need Structured Document Formats?

Do we really need structured document formats? Structured document
formats like DITA, DocBook, and Solbook are characterized by deeply
nested tags and a multitude of schema constraints. Unstructured
tagging languages like HTML, on the other hand, are wide open. In one
meeting, every reason we came up with that made them seem necessary
was answered by a convincing counter argument. "Reuse" would seem to
be the most important reason. And maybe there are some compelling
cases. But maybe all-out reuse isn't needed. Maybe we really only
need a very restricted form that solves those cases. In at least the
case of version-dimension reuse, variable substitution and conditional
metadata seem to be a darn good idea. And in at least the case of
table and list tags, nesting seems to be a requirement. So it's clearly
not the case that we can completely do without such capabilities. On
the other hand, the counter arguments against other forms of variable
substitution and conditional metadata remain intact -- at times, it is
just too costly to keep them working, especially in an environment that
changes frequently. And nesting everything may well be overkill, when
so few forms of nesting are actually indispensable. This post summarizes
the arguments we considered. Do they demolish the case for structured
documents in a highly fluid setting like the software industry? Do they
demolish the case for structured documents and reuse? Are they wrong
in some important respect? Or do they overlook some vitally important
point that makes structured document formats irreplaceable?
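
For readers unfamiliar with the two mechanisms under discussion, here
is a minimal Python sketch of variable substitution combined with
conditional metadata. The block list, the condition names, and the
render function are all made up for illustration and do not reflect
how DITA, DocBook, or Solbook implement these features.

```python
from string import Template

# Each block is (text, conditions). An empty condition tuple means the
# block appears in every output; otherwise at least one of its
# conditions must be active for the block to be kept.
BLOCKS = [
    ("Install $product $version.", ()),
    ("Run setup.exe.", ("windows",)),
    ("Run ./setup.sh.", ("linux", "solaris")),
]


def render(blocks, variables, active_conditions):
    """Filter blocks by condition, then substitute variables in the rest."""
    active = set(active_conditions)
    lines = []
    for text, conditions in blocks:
        if conditions and not (set(conditions) & active):
            continue  # conditional block not selected for this output
        lines.append(Template(text).substitute(variables))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(BLOCKS, {"product": "Widget", "version": "2.0"}, {"linux"}))
```

The same source yields a different document for each version and
platform, which is the version-dimension reuse the post finds hard to
give up. The maintenance cost it worries about grows with the number
of variables and conditions that have to be kept consistent.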

Open-Source Movement Turns 10

The past decade has been marked by enormous achievements and some serious
setbacks, says Bruce Perens, co-founder of the Open Source Initiative.
This weekend marks the 10th anniversary of the publication of the "Open
Source Definition" and the public announcement of the formation of the
Open Source Initiative. The decade has been marked both by enormous
achievements and serious setbacks. "This was the first time that the
general public heard what open source was about. Friday, February 8 is
the last day of Decade Zero of open source, while Saturday, February 9
is the anniversary of open source and the start of Decade One. It's a
computer scientist thing. We always start counting from zero," said
Bruce Perens, creator of the Open Source Definition and co-founder of
the Open Source Initiative. While acknowledging the trailblazing role
of Richard Stallman, founder of the Free Software Foundation, Perens
also acknowledged the conflict that has existed between open-source and
free-software evangelism. "I always intended to have open source be
another way of talking about free software, tailored to the ears of
business people, that would eventually lead them to a greater
appreciation of Richard Stallman's arguments on that front. This has
come to pass, and I hope you'll continue to make it so," Perens said
in a blog posting. From the blog: "We have actually changed the way
that innovation happens. Innovation has gone public. Many companies,
institutions, and individuals share innovation on a daily basis,
entirely in the open, through Free Software development communities.
The products they produce are the leaders in their field. Public
innovation eliminates the high transaction costs of lawyers, lawsuits
and licensing. It focuses on building a fertile community across the
market for idea creation and utilization rather than dividing the
market for the direct monetization of ideas as property. This is the
economically most efficient approach for most companies."

Put to the Test: Nexaweb Enterprise Web Suite 2.0

When you investigate Nexaweb Enterprise Web Suite 2.0, you get a sense
that it was created de novo by a group of smart people who studied the
requirements for building robust, rich-Internet-application-style
enterprise applications (like plenty of scalability, security, and data
access), and who considered the available standards and commonly used
tools (Java, JavaScript, Ajax, XML, SOA, etc.). They then set about piecing
together what they viewed as a simpler, consistent, mostly familiar, and
efficient whole. I tested the Nexaweb Platform in three server/hardware
configurations and found the installations to be smooth, requiring
surprisingly little post-install tweaking. Some of this is probably the
result of using standards, coupled with relatively tight control of
client, communication, and server. An interesting innovation is the use
of Nexaweb XML to first create the data presentation UI and then, through
a data framework plug-in, asynchronously handle the data going to or from
the client. Ajax works this way and users find the approach more
responsive. The data framework approach supports a wide variety of
external data handlers in JSP, JSTL, Struts, XSLT, or MVC. Pre-built
components are geared to Web 2.0, rich Internet application, SOA and
mobile applications. Nexaweb includes the Internet Messaging Bus (IMB) as
its way of providing these features and guaranteeing reliable messaging.
Nexaweb keeps it simple, for example riding the HTTP channel through port
80 so that it can be instantly compatible with most firewalls. Nexaweb
supports a Universal Client Framework -- which, despite the name, isn't
all things to all developers, but it does make it possible for developers
who prefer Java, JavaScript, or Ajax to work on Nexaweb apps. The hitch
is that they must learn Nexaweb's declarative language, NXML (Nexaweb XML),
to produce the UI and wrap the other code. In Nexaweb's case the DOM
(Document Object Model) houses the NXML and provides the commands for
the local browser.

Scenes from a Recommendation 1: Chicago, Cafe des Artistes

The XML spec became a W3C Recommendation ten years ago this week. Tim
Bray has posted some character sketches from the period; Eve Maler has
followed suit with some recollections (and an online version of
Maler/El Andaloussi! Woo hoo!); this has inspired me to think about
doing the same. What follows is the first in (what I hope will be) a
series of moments I remember from the creation of XML. If you look,
you can find a lot of stories about the beginning of XML. It surprised
me, at first, that they all seem to be different; it surprised me even
more to find some told in the first person by people whom I had not
suspected of being involved with XML at all. But I shouldn't have been
surprised. Scores or hundreds of people were involved in the development
of XML, thousands in its spread and uptake. In some sense, then, XML
will have had scores, or hundreds, or thousands of beginnings. Why
should I think I know about them all? Questions like 'How did X start?'
often mean not 'How did X start?' but 'How did you come to be involved
in X?' -- or, at least, that's how we answer them. The beginnings of XML?
I don't know. But I'll tell you what I do know; I know when I first
heard about it. The second WWW conference was in Chicago, in October
1994. With Bob Goldstein, one of my colleagues at the University of
Illinois at Chicago computer center, I had submitted a paper on how
the Web would achieve its true potential only once it had SGML
awareness ('HTML to the Max')...

Tibco Adds Eclipse, ESB to SOA Platform

Tibco Software is now shipping its ActiveMatrix 2.0 platform for
managing SOA, featuring an Eclipse-based development environment and
an enterprise service bus. The platform offers expanded capabilities
for integration, composite application development, and governance,
Tibco said. Users can build and manage SOA applications, supporting
technologies such as Java, .Net, and service mediation. Applications
can be built that combine services developed with disparate technologies
and manage them via a single infrastructure. The platform features
several components, including ActiveMatrix BusinessWorks, for
integration, and ActiveMatrix Service Grid, for assembling services.
New to the platform is ActiveMatrix Service Bus, for on-ramping
services and implementing content or context-based routing. A common
environment based on the Eclipse platform is provided for business
analysts, architects, and developers for development and management.
ActiveMatrix covers a spectrum of capabilities ranging from service
virtualization, with developers able to build services in a tool of
their choice, to governance and integration. With ActiveMatrix 2.0,
Tibco seeks to help users manage large-scale SOA rollouts with a single
platform. Much of the technical coding involved in service creation
and deployment is replaced with configuration. Governance and management
capabilities enable administrators to deploy applications and apply
policies from a single console. Also featured is expanded support of
the Service Component Architecture (SCA) specification to improve
interoperability in deploying SOA.

Friday, February 8, 2008

2007 Turing Award Winners Announced for Work on Model Checking

Edmund M. Clarke, E. Allen Emerson, and Joseph Sifakis are the
recipients of the 2007 A.M. Turing Award for their work on an automated
method for finding design errors in computer hardware and software. The
method, called Model Checking, is the most widely used technique for
detecting and diagnosing errors in complex hardware and software design.
It has helped to improve the reliability of complex computer chips,
systems and networks. According to the ACM announcement, "Model Checking
started as an academic research idea. The continuing research of Clarke,
Emerson, and Sifakis as well as others in the international research
community over the last 27 years led to the creation of new logics, as
well as new algorithms and surprising theoretical results. This in turn
has stimulated the creation of many Model Checking tools by both academic
and industrial teams, resulting in the widespread industrial use of Model
Checking... Among the beneficiaries of Model Checking are personal
computer users, medical device makers, and nuclear power plant operators.
As computerized systems pervade daily life, consumers rely on digital
controllers to supervise critical functions of cars, airplanes, and
industrial plants. Digital switching technology has replaced analog
components in the telecommunications industry, and security protocols
enable e-commerce applications and privacy. Wherever significant
investments or human lives are at risk, quality assurance for the
underlying hardware and software components becomes paramount. The
Turing Award, named for British mathematician Alan M. Turing, carries
a $250,000 prize, with financial support provided by Intel Corporation
and Google Inc.
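
A toy illustration of the idea, in Python: explicit-state checking of
a safety property by breadth-first search over every reachable state,
returning a shortest trace to a violation if one exists. This is only
the simplest form of the technique, not the symbolic or temporal-logic
algorithms the award recognizes, and the two-process lock model is
invented for the example.

```python
from collections import deque


def find_counterexample(initial, successors, invariant):
    """Return a shortest trace to a state violating `invariant`, or None."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def successors(state):
    """Two processes sharing a lock that is tested and set in separate steps.

    State is (pc0, pc1, lock): pc 0 = idle, 1 = saw the lock free,
    2 = in the critical section.
    """
    pcs, lock = state[:2], state[2]
    for i in (0, 1):
        if pcs[i] == 0 and lock == 0:
            new_pc, new_lock = 1, lock  # observe that the lock is free
        elif pcs[i] == 1:
            new_pc, new_lock = 2, 1     # take the lock and enter
        elif pcs[i] == 2:
            new_pc, new_lock = 0, 0     # leave and release the lock
        else:
            continue                    # idle but lock is held: blocked
        new_pcs = list(pcs)
        new_pcs[i] = new_pc
        yield (new_pcs[0], new_pcs[1], new_lock)


def mutual_exclusion(state):
    """Safety property: both processes are never critical at once."""
    return not (state[0] == 2 and state[1] == 2)


if __name__ == "__main__":
    print(find_counterexample((0, 0, 0), successors, mutual_exclusion))
```

The search finds an interleaving in which both processes see the lock
free before either takes it, the kind of design error that is easy to
miss in testing and that exhaustive exploration exposes mechanically.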

LoST: A Location-to-Service Translation Protocol

Members of the IETF Emergency Context Resolution with Internet
Technologies (ECRIT) Working Group have published an updated Internet
Draft for the "LoST: A Location-to-Service Translation Protocol"
specification. The 77-page specification describes an XML-based protocol
for mapping service identifiers and geodetic or civic location
information to service contact URIs. In particular, it can be used to
determine the location-appropriate Public Safety Answering Point (PSAP)
for emergency services. Appendix A supplies a corresponding
non-normative RELAX NG Schema in XML Syntax. Protocols such as NAPTR
records and the Service Location Protocol (SLP) can be used to discover
servers offering a particular service. However, for an important class
of services the appropriate specific service instance depends both on
the identity of the service and the geographic location of the entity
that needs to reach it. Emergency telecommunications services are an
important example; here, the service instance is a Public Safety
Answering Point that has jurisdiction over the location of the user
making the call. The document describes a protocol for mapping a
service identifier and location information compatible with PIDF-LO
to one or more service URIs. Service identifiers take the form of the
service URNs; location information here includes revised civic location
information and a subset of the PIDF-LO profile which consequently
includes the Geo-Shapes defined for Geography Markup Language (GML).
Example service URI schemes include SIP, XMPP, and TEL. While the
initial focus is on providing mapping functions for emergency services,
it is likely that the protocol is applicable to other service URNs. For
example, in the United States, the "2-1-1" and "3-1-1" service numbers
follow a location-to-service behavior similar to that of emergency services.
LoST satisfies the requirements for mapping protocols, providing a
number of operations, centered around mapping locations and service
URNs to service URLs and associated information. For civic addresses,
LoST can indicate which parts of the civic address are known to be
valid or invalid, thus providing address validation. LoST indicates
errors in the location data to facilitate debugging and proper user
feedback, but also provides best-effort answers. LoST queries can be
resolved recursively or iteratively.
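
The core mapping can be pictured with a small Python sketch: given a
service URN and a geodetic position, return the contact URIs of the
service instance whose boundary contains that position. The mapping
table, the rectangular boundaries, and the example.com URIs are
hypothetical; the actual protocol exchanges XML messages and supports
richer shapes, civic addresses, validation, and recursion.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Mapping:
    """One service instance with a rectangular service boundary."""
    service_urn: str
    south: float
    west: float
    north: float
    east: float
    uris: Tuple[str, ...]


# Hypothetical data: two answering points serving different regions.
MAPPINGS = [
    Mapping("urn:service:sos", 40.0, -75.0, 41.0, -73.0,
            ("sip:psap-east@example.com",)),
    Mapping("urn:service:sos", 37.0, -123.0, 38.0, -121.0,
            ("sip:psap-west@example.com", "xmpp:psap-west@example.com")),
]


def find_service(service_urn: str, lat: float, lon: float,
                 mappings=MAPPINGS) -> List[str]:
    """Return contact URIs for the service at (lat, lon), or [] if none."""
    for m in mappings:
        in_bounds = m.south <= lat <= m.north and m.west <= lon <= m.east
        if m.service_urn == service_urn and in_bounds:
            return list(m.uris)
    return []


if __name__ == "__main__":
    print(find_service("urn:service:sos", 37.775, -122.422))
```

The result depends on both inputs, which is the point made above: the
service identifier alone cannot select the right instance, because the
appropriate answering point changes with the caller's location.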

XML Daily Newslink. Thursday, 07 February 2008

The Distributed Management Task Force, Inc. (DMTF) has announced a plan
to work with The Green Grid to develop standards designed to improve
interoperability of technology solutions within the data center. DMTF
and The Green Grid plan to collaborate to develop an interface for
heterogeneous management, across data centers, and for IT and non-IT
equipment. The Green Grid is a global consortium chartered to develop
energy efficiency standards, processes, measurements and technologies
for global data centers and business computing ecosystems. As DMTF is
an industry organization leading the development, adoption and promotion
of interoperable management initiatives and standards, DMTF will
support The Green Grid in reaching its mission. In order to support
its goals, The Green Grid will actively pursue the DMTF's Web-Based
Enterprise Management (WBEM), a suite of management and Internet
standard technologies developed to unify the management of distributed
computing environments. WBEM will form the basis of the management
interfaces The Green Grid defines. As a DMTF collaborator, The Green
Grid will be able to leverage and extend the DMTF technologies and
apply them to help improve energy efficiency in the data center and
business computing ecosystems. In addition, the partnership will benefit
The Green Grid by providing access to the expertise and broad membership
of DMTF. As the newest member of the DMTF Alliance Partner program,
which defines formalized liaison relationships between the DMTF and
other key standards bodies, The Green Grid anticipates producing
interface specifications based upon WBEM technologies in approximately
12-18 months. DMTF WBEM Protocols include CIM-XML (a WBEM protocol that
uses XML over HTTP to exchange Common Information Model [CIM] information)
and WS-Management (a specification which promotes interoperability
between management applications and managed resources by identifying a
core set of Web service specifications and usage requirements to expose
a common set of operations).

Observatory Service Broker (OSB) Contributed to CECID

The University of Hong Kong Center for E-Commerce Infrastructure
Development (CECID) announced that the Observatory Service Broker
(OSB) has been contributed to the open-source community website. The
Observatory Service Broker (OSB) is one of the deliverables of
Project Plumber. The OSB is an enhanced version of a process-driven
enterprise service bus, which supports a choreography-based business
process. It is based on a published Java standard JSR 208: Java
Business Integration (JBI). JBI is a Java-based standard addressing the
EAI and B2B issues based on the paradigms and principles advocated by
SOA. It defines the standards for composite plug-ins in SOA architecture,
as well as how the plug-ins can communicate with each other. Along with
the OSB, there is a Business Process State Tracking (BPST) engine
registering the rule sets to guide the execution of business processes
and keep track of the state of each rule set. The business integration
SPIs enable the creation of a Java business integration environment
for specifications such as WSCI, BPEL4WS, and the W3C Choreography
Working Group. OSB, BPST, and their source code have been released to
CECID open-source community website under GNU General Public License
Version 2. Later on, useful information and articles ranging from
technical issues to general usage of OSB will be available at the
community website. Begun in March 2006, Project Plumber aims
to research and develop a software platform that helps enterprises
design and implement service-oriented applications to transact with
business partners based on key business-to-business (B2B) standards.
The objectives of this project are to extend Service-Oriented
Architecture (SOA) to support processing of electronic transactions
between enterprises, to develop Web Service components to support
reliable and secure B2B applications based on open standards like
Universal Business Language (UBL), and to develop service modeling
methodology and software to facilitate design of electronic transaction
services.

OpenXML: A Poster Child for Open Standards Development?

"I have seen some attacks on OpenXML saying it is not an 'open' standard.
I am quite puzzled by those attacks and think that OpenXML makes the
case for open development of standards. Understand that as the Project
Editor for ISO/IEC 26300 and the OpenDocument Format TC editor in OASIS,
I carry no brief for OpenXML. However, a well defined and publicly
controlled OpenXML would be a great benefit for future work on the
OpenDocument Format standard so I have no reason to wish it ill. OpenXML
has progressed from being developed in a closed environment to being
handed over to approximately 70% of the world's population for future
development so I am missing the 'not open' aspect of OpenXML. If
anything, the improvements made to OpenXML during that process make it
a poster child for the open standards development process... Ecma TC
45 is composed of a wide variety of users, developers and other
interests who wanted to see Microsoft Corporation adopt an XML based
format for its office software. Over the course of a year with a very
aggressive meeting schedule, TC 45 produced a document that was
approximately three (3) times longer than the original submission and
that was in many ways a better proposal. That to me illustrates the
difference between talking to yourself and opening up the development
process to a larger group of people. After being revised by TC 45,
OpenXML was submitted via fast-track for approval as an ISO/IEC standard
(DIS 29500) in JTC 1/SC 34. If you look at the roster for SC 34,
approximately 70% of the world's population has a seat at the table
via their national bodies to discuss the final version of OpenXML. The
fast track process created unnaturally short deadlines but the national
bodies labored very hard to produce over 3,000 comments on OpenXML and
TC 45 labored just as hard to produce answers for those comments.
Answers that often offered substantial changes, I think for the better,
to OpenXML."

Apache Tuscany Java 1.1 Released: SCA Meets Web 2.0

The Apache Tuscany team announced today the Version 1.1 release of the
Java SCA project. Apache Tuscany is a runtime environment based on the
Service Component Architecture (SCA). SCA is a new component model that
facilitates the construction of composite applications. SCA is a set of
specifications initially developed by IBM and BEA which are now being
standardized by OASIS as part of the Open Composite Services Architecture
(Open CSA). The Tuscany SCA Java 1.1 release adds a number of features
including: (1) a JMS binding; (2) improved policy support; (3) an
implementation extension for representing client side Javascript
applications as SCA components. Jean-Sebastien Delfino and Luciano
Resende (IBM) commented in an InfoQ interview. JS, on the "widget"
implementation: "you can now include, in an SCA composition, client
components implemented as HTML + Javascript the AJAX way, running in
your Web browser, wired to server-side components using Tuscany's
JSON-RPC and ATOM bindings for example. Basically it is about embracing
Web 2.0 client components in a distributed SCA composition... We generate
some additional JavaScript after introspection of the references that
implements all the plumbing code to support JSON-RPC and ATOM and
the Reference class wrapping the references that you can use in your
business logic. The Tuscany community will have to decide what's coming
ahead, as we're just getting 1.1 out, but I envision progress in the
following areas: simpler and more complete SCA policy support; more
policies -- making progress with the transaction policy; improved
end-to-end SCA contribution / deployment / distribution story; an SCA
domain administration application; integration with Geronimo;
improvements of the Web 2.0 bindings -- perhaps using Apache Abdera
for ATOM and adding cross-domain support to the JSON-RPC binding;
optimizations of the Tuscany databinding support; more platform
integration testing on Tomcat, Geronimo, etc." Luciano: "BPEL support
is not complete yet: services are supported, but references are not;
properties are not supported either, but they will require an extension
to the BPEL language. This may come next if it is requested by the
community. I have just updated the BPEL implementation guide."

OpenID Foundation Scores Top-Shelf Board Members

If the OpenID Foundation were a liquor cabinet, it just got stocked with
some Grey Goose, Rhum Clement, and Gran Patron. The foundation, which
is pushing for a universal Internet login standard, announced on
Thursday that representatives from Google, Microsoft, Yahoo, IBM, and
VeriSign have become its first corporate board members. They join
existing board members Scott Kveton (Vidoop), David Recordon (Six Apart),
Dick Hardt (Sxip Identity), Martin Atkins (independent), Artur Bergman
(Wikia), Johannes Ernst (NetMesh), Drummond Reed (Parity Communications),
and executive director Bill Washburn. According to the announcement:
"Last year, OpenID grew by leaps and bounds both as a technology and as
a community. At the beginning of 2006, there were fewer than 20-million
OpenID enabled URLs and less than 500 websites where they could be used.
Today there are over a quarter of a billion OpenIDs and well over 10,000
websites to accept them. OpenID has grown to be implemented by major
open source projects such as Drupal, cornerstone Web 2.0 services such
as those by 37signals and Six Apart, as well as a mix of large companies
including Apple, Google, and Yahoo!. Today is about truly recognizing
the accomplishments of the entire OpenID community which has certainly
grown beyond the small grassroots community where it started in late 2005.
One of the other accomplishments of the Foundation in 2007 was working
with AOL, Microsoft, VeriSign, Sun, Symantec, and Yahoo! to develop an
intellectual property rights policy and process for technical OpenID
specification work which was finalized in December 2007. While all of
these community accomplishments have been great, each was made possible
by the community's willingness to include the resources of companies
alongside the efforts of individual contributors. By bringing on these
companies and their resources, the OpenID Foundation will now be able to
better serve the needs of the entire OpenID community."

Aggregate RSS and Atom information using XQuery

The Really Simple Syndication (RSS) and Atom standards provide XML
structures of items for a variety of different uses. The most common
use for both RSS and Atom feeds is as the data dissemination format to
promote Weblogs and news sites. The RSS and Atom feeds contain relatively
small amounts of information. Thus, you can easily download the files
and reduce the load on the Web servers rather than supply all of the
information normally distributed when the user views a full page of blog
posts. In addition, the RSS and Atom files also contain more detailed
classification information such as author, title, subject and keyword
tagging information to help identify and organize the data within the
feeds. In this article we look at the basics of XQuery processing of
RSS and Atom feeds to turn a single feed into an HTML document. We then
produce a more complete solution for outputting the information in a
format that suits your needs, including sorting, merging multiple feeds
and even handling different feed and source information types. XQuery
offers a flexible method to process XML files. Some find this method is
easier to follow syntactically. Certainly some XQuery abilities, such
as the flexibility to create a single intermediary XML document that
you can reparse to handle different sources and input formats, help
solve some issues experienced when you process XML files.
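
The "single intermediary document" idea can be illustrated outside XQuery
as well. The sketch below is not the article's XQuery code; it is a minimal
Python stand-in, using only the standard library, that maps RSS 2.0 items
and Atom entries onto one common record shape, merges them, sorts by title,
and emits an HTML list. The two inline feeds are made-up sample data, and
the function names (normalize, merge_feeds, to_html) are hypothetical.

```python
import html
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Made-up sample feeds, used only for illustration.
RSS_SAMPLE = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>Zebra post</title><link>http://example.org/a/1</link></item>
<item><title>Apple post</title><link>http://example.org/a/2</link></item>
</channel></rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"><title>Feed B</title>
<entry><title>Mango post</title><link href="http://example.org/b/1"/></entry>
</feed>"""


def normalize(xml_text):
    """Map an RSS 2.0 or Atom feed to a list of common records."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        channel = root.find("channel")
        source = channel.findtext("title")
        return [
            {"source": source,
             "title": item.findtext("title"),
             "link": item.findtext("link")}
            for item in channel.findall("item")
        ]
    if root.tag == ATOM_NS + "feed":
        source = root.findtext(ATOM_NS + "title")
        return [
            {"source": source,
             "title": entry.findtext(ATOM_NS + "title"),
             "link": entry.find(ATOM_NS + "link").get("href")}
            for entry in root.findall(ATOM_NS + "entry")
        ]
    raise ValueError("unrecognized feed format: " + root.tag)


def merge_feeds(*feeds):
    """Normalize every feed, then merge and sort the records by title."""
    records = [record for feed in feeds for record in normalize(feed)]
    return sorted(records, key=lambda record: record["title"])


def to_html(records):
    """Render the merged records as a simple HTML list."""
    rows = [
        '<li><a href="%s">%s</a> (%s)</li>' % (
            html.escape(r["link"], quote=True),
            html.escape(r["title"]),
            html.escape(r["source"]))
        for r in records
    ]
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(to_html(merge_feeds(RSS_SAMPLE, ATOM_SAMPLE)))
```

Because both formats are reduced to the same record shape before sorting,
adding another source format only requires another branch in normalize.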

Grails 1.0 Web Framework Ready

The Grails 1.0 open source Web application development framework was
announced this week by G2One, which specializes in Groovy and Grails
technology, and the Grails development team. Grails is built on Java
and the Groovy language. It leverages APIs from the Java enterprise
sphere including Spring, Hibernate, and SiteMesh, G2One and the
development team said. With Grails, Java and Ruby developers get
convention-based rapid development while leveraging existing knowledge
and capitalizing on APIs Java developers have used for years. Plug-ins
enable Grails to work with technologies such as Adobe Flex, Google Web
Toolkit, and the Yahoo UI library. The 1.0 version has been in the
making for two years and eight months. New features include an ORM
DSL (Object Relational Mapping Domain Specific Language) for advanced
mappings, support for easy-to-use filters, and content negotiation.
REST (Representational State Transfer) also is leveraged, as is JNDI
(Java Naming and Directory Interface). ORM DSL allows Grails to support
legacy databases in applications. Filters apply cross-cutting behaviors
to Web applications to apply capabilities such as security, tracing,
and logging. With REST support, Grails allows for existing Web objects
to be converted to XML or JSON (JavaScript Object Notation), with
tasks being automated. With JNDI, Grails provides the ability through
Spring to look up existing programming objects such as a data source.

Real Web 2.0: Linking Open Data

Throughout this column I've placed strong emphasis on the aspects of
Web 2.0 that concern open, shared data rather than flashy effects.
Certainly Ajax is important because when used well it can enhance the
usability of Web sites. But Web feeds, open, Web-friendly APIs, and
third-party plug-in and mashup capabilities are the real substance of
Web 2.0. One community closely associated with the Web's original
stewards, the W3C, is committed to a particular, coherent set of
practices along these lines. The Linking Open Data (LOD) community
combines the vision of the W3C for using semantic features to enhance
the Web with the pragmatism that characterizes mainstream Web 2.0. The
[stated] goal of the W3C SWEO Linking Open Data community project is
'to extend the Web with a data commons by publishing various open
datasets as RDF on the Web and by setting RDF links between data items
from different data sources.' The emphasis on RDF is natural for the
W3C, which has been pushing the technology for a decade, but one
development that gives LOD extra legs is the emergence of influential
voices realizing that insistence on strict RDF format across the board
is probably not the best present strategy for winning over Web
developers. LOD supports RDF as a conceptual model, but the new
emphasis is more on linking and openness than on any one syntax. After
all, RDF is merely URIs, links, and labels, so any model that includes
these three can readily work with RDF systems. The full LOD community
is a penumbra around the W3C-led core who support all the advantages of
opening up data that I've discussed so far in this column, and who see
RDF, Atom, JSON, and so on as merely tools for Web developers to open
up their data. LOD means making it easier for people to discover
important things you place on the Web, and making it easier for them
to do unexpected, fruitful things with them. The next time you have a
Web project, start by thinking of it in terms of what information and
non-information resources are represented in the Web app, and do
everything you can to give each one a well-designed HTTP URI and a
semantically rich data format, and create links, links, and more links.

Microsoft Declares Its Modeling Love With a New Language, 'D'

A handful of Microsoft's top developers are working to create a new
programming language, code-named 'D,' which will be at the heart of
Microsoft's push toward more intuitive software modeling. D is a
key component of Microsoft's Oslo service-oriented architecture (SOA)
technology and strategy. Microsoft outlined in vague terms its plans
and goals for Oslo in late fall 2007, hinting that the company had a
new modeling language in the works, but offering no details on what
it was or when the final version would be delivered. D will be a
declarative language aimed at non-developers, and will be based on
eXtensible Application Markup Language (XAML), sources, who asked not
to be named, said. Sources close to Microsoft confirmed the existence
of D, which they described as a forthcoming 'textual modeling language.'
In addition to D, sources said, Microsoft also is readying a
complementary editing tool, code-named 'Intellipad,' that will allow
developers to create content for the Oslo repository under development
by Microsoft. Intellipad is the Emacs.Net text editor for which
Microsoft has been seeking developers over the past couple of months... At
last week's Lang.Net 2008 conference -- a meeting of programming gurus
from Microsoft and other vendors held on the Redmond campus --
Microsoft's Chief Modeling Officer Don Box provided some more clues
about where Microsoft is going on the tool and platform front with
Oslo. Box said Microsoft wasn't interested in creating some grandiose
1980s' style computer-aided-software-engineering (CASE) tool; it was
thinking more along the lines of providing a class designer. The goal,
according to Box: "putting more and more of your application into data
and putting less in code."

Oracle Launches Data Integration Suite

Oracle has launched the Oracle Data Integration Suite, which combines
traditional data integration capabilities with an array of middleware
and tooling for constructing a service oriented architecture. Data
Integration Suite costs $60,000 per CPU for a package that bundles
Oracle Data Integrator and Oracle/Hyperion Data Relationship Manager
with the company's BPEL Process Manager, enterprise service bus,
application server, business-to-business engine, and business rules
engine. Oracle's suite aligns its data-integration offerings with its
Fusion Middleware line for SOA. Additional options in the suite
include a new pair of data quality tools, Oracle Data Quality for
Data Integrator and Oracle Data Profiling, which the company developed
with Harte-Hanks Trillium Software. Oracle is also offering as options
its Coherence Data Grid, technology acquired through Oracle's purchase
of Tangosol last year, and a number of adapters, including ones for
applications and unstructured content... Marketing materials
announcing Oracle's release stress the suite's applicability to
heterogeneous environments, noting its support for a broad array of
databases, including IBM DB2, MySQL, Microsoft SQL Server, Teradata,
and Oracle.

W3C Releases Extensible Markup Language (XML) 1.0 (Fifth Edition)

A Fifth Edition of "Extensible Markup Language (XML) 1.0" has been
published by members of the W3C XML Core Working Group. This fifth
edition is not a new version of XML. As a convenience to readers, it
incorporates the changes dictated by the accumulated errata to the
Fourth Edition of XML 1.0, dated 16 August 2006. In particular, erratum
'E09' relaxes the restrictions on element and attribute names, thereby
providing in XML 1.0 the major end user benefit currently achievable
only by using XML 1.1. A preliminary implementation report is available,
together with a Test Suite designed to help assess conformance to
this specification. The XML Core WG wishes to ensure continued
universal interoperability for XML 1.0. To this end, the WG will not
request that this Fifth Edition of XML 1.0 become a Recommendation
until the following criteria are satisfied: (1) At least three months
have passed since the publication of this PER; (2) There are at least
three implementations that pass the test suite for each of the errata
that have been newly applied to the Fifth Edition. Rationale for
Primary Change: "... The proposed change to XML 1.0 will relax the
restrictions on names, used not only for element and attribute names
but also identifiers and enumerated attribute values. Those who prefer
to retain the constraints on names from the previous version of XML
1.0 in their documents will be free to do so, but those who wish to
use names that incorporate these additional characters will be able
to do so."
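
To make the relaxed naming rule concrete, here is a minimal Python sketch
of a name checker. The character ranges are transcribed from the Name
production as it appears in the Fifth Edition text and should be checked
against the specification before being relied upon; the function name
is_xml_name is hypothetical, and this is not a conformance test.

```python
import re

# NameStartChar ranges, transcribed from the Fifth Edition Name production.
NAME_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)

# NameChar adds "-", ".", digits, the middle dot, and two combining ranges.
NAME_CHAR = NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

NAME_RE = re.compile("[%s][%s]*" % (NAME_START, NAME_CHAR))


def is_xml_name(text):
    """Return True if text matches the sketched Name production."""
    return NAME_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["order-item", "1st", "\u1200\u1201"]:
        print(repr(candidate), is_xml_name(candidate))
```

The last candidate uses two Ethiopic syllables, which fall inside the
#x37F-#x1FFF range of the production above.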

The Future of XML: How Will You Use XML in Years to Come?

XML's future lies with the Web, and more specifically with Web publishing.
It seems a little funny to have to say that. After all, isn't publishing
what the Web is about? The Web was designed first and foremost as a
mechanism to publish information. What else can it do? Quite a lot. The
last three years have seen an explosion of interest in Web applications
that go far beyond traditional Web sites. Word processors, spreadsheets,
games, diagramming tools, and more are all migrating into the browser.
This trend will only accelerate in the coming year as local storage in
Web browsers makes it increasingly possible to work offline. But XML is
still firmly grounded in Web 1.0 publishing, and that's still very
important... So now you know how you'll write XML in 2008 (Word or
OpenOffice), and you know how you'll send it to the server (APP - Atom
Publishing Protocol). The last question is where to put all this wonderful
XML. Traditionally, this question has had two answers. The first is to
save the XML in a file system. The second is to stuff it in a Binary
Large Object (BLOB) in a relational database. Both are kludges, and
neither performs very well for Web sites. What we need is a database
designed to work with the hierarchical structures of typical Web
documents rather than cutting across them. For the first time, such
databases now exist at multiple scales, they're stable, and they're
ready to use. On the low end, eXist and Berkeley DBXML are looking
better and better. On the high end, expensive big-iron XML databases
like Mark Logic will continue to convert big publishers who can afford
the cost of entry. Hybrid solutions like IBM DB2 9 pureXML will drive
XQuery adoption among customers who need to mix documents with tabular
data. Compared to earlier products like these, the new breed are more
stable, more scalable, and more reliable. Most important, they now
share a standard language, XQuery 1.0, finally released after years
of development.

Wednesday, February 6, 2008

Proposed Recharter of IETF Public-Key Infrastructure (X.509 PKIX) WG

The IESG Secretary announced the availability of a proposed modified
charter submitted for the Public-Key Infrastructure (X.509) PKIX
working group in the Security Area of the IETF. The IESG has not made
any determination as yet. As proposed: "The PKIX Working Group was
established in the fall of 1995 with the goal of developing Internet
standards to support X.509-based Public Key Infrastructures (PKIs).
Initially PKIX pursued this goal by profiling X.509 standards developed
by the CCITT (later the ITU-T). Later, PKIX initiated the development
of standards that are not profiles of ITU-T work, but rather are
independent initiatives designed to address X.509-based PKI needs in
the Internet. Over time this latter category of work has become the
major focus of PKIX work, i.e., most PKIX-generated RFCs are no longer
profiles of ITU-T X.509 documents. PKIX has produced a number of
standards track and informational RFCs... PKIX will continue to track
the evolution of ITU-T X.509 documents, and will maintain compatibility
between these documents and IETF PKI standards, since the profiling of
X.509 standards for use in the Internet remains an important topic for
the working group... PKIX will pursue new work items in the PKI arena
if working group members express sufficient interest, and if approved
by the cognizant Security Area director. For example, certificate
validation under X.509 and PKIX standards calls for a relying party
to use a trust anchor as the start of a certificate path. Neither X.509
nor extant PKIX standards define protocols for the management of trust
anchors. Existing mechanisms for managing trust anchors, e.g., in
browsers, are limited in functionality and non-standard. There is
considerable interest in the PKI community to define a standard model
for trust anchor management, and standard protocols to allow remote
management. Thus a future work item for PKIX is the definition of such
protocols and associated data models."

A Look at the First HTML 5 Working Draft

The World Wide Web Consortium (W3C) has published a draft of the
HTML 5 specification, the first major revision to the language since
HTML 4 was released more than ten years ago. In the intervening time
the web has gone from being primarily a static medium to being about
interactive applications and media-rich content, with developers
increasingly moving their applications to the web. HTML 5 is intended
to reflect that change. Amongst the new features squarely targeted at
application developers, HTML 5 introduces a number of new Javascript
APIs. These can be used in conjunction with corresponding HTML elements.
A number of new presentation elements have also been introduced with
support for familiar page components such as headers, footers, figures,
dialog (used to mark-up a conversation), and navigation. There is a
new datagrid element which will support interactive tables and trees,
a datalist element for combo boxes, and a progress element which
represents the completion of a long running task. Support for RSS feeds
within the page markup has also been added. For forms the input
element's type attribute has new support for dates, times, emails and
URLs, so that the browser can provide the user interface elements, for
example a calendar date picker or integration with the user's address
book, and submit the data in a defined format to the server. HTML 5
also drops support for some well-known features. The most notable is
support for frames, which have long been considered detrimental to
accessibility and usability. It should be noted that dropped features
will continue to be supported by browsers that also fully support the
HTML 5 standard, since support for legacy versions of HTML will remain
for many years.

The Ranvier URL Mapper: Letting URL Structure Invoke Application Work

The responsibility of a Uniform Resource Identifier (URI) is to uniquely
name a resource in the world. The most familiar subset of URIs is
Uniform Resource Locators (URLs), which take on the additional
responsibility of providing a description of how to obtain the named
resource (in other words, a network location and protocol to use in
fetching an electronic document or stream). Some URIs are Uniform
Resource Names (URNs) without being URLs, that is, they name a resource,
but do not provide specific details on how to obtain it... The oldest
and most popular Web servers have generated URIs whose form directly
mirrors the file system structure of the machine that hosts resources.
The URI schema itself specifies a hierarchical "path" component of URIs
(though a path is potentially empty), but does not require any literal
mapping between a URI path structure and a file system. Ranvier is a
Python package you can integrate into Web application frameworks to map
incoming URL requests to source code. It does this by a mechanism of
delegation-and-consumption, which differs from more common regular
expression-based URL rewriting. Ranvier also serves as a central
registry of all the URLs in a Web application and can itself generate
the URLs necessary for cross-linking pages. The registry function
allows Ranvier to assure the integrity of links and automate coverage
analysis. Ranvier is pure Python code and does not have any third-party
dependencies; it should be usable (with a bit of adaptor code) in any
Python-based Web application framework... There are certainly many
cases where domain resources are semantically hierarchical, and not
merely in a way that mirrors peculiarities of the development framework
and tools used in implementation. Ranvier provides a flexible way of
organizing dispatch of functional aspects of URI processing into
multiple reusable blocks of code.
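
The delegation-and-consumption idea can be sketched in a few lines of
Python. This is not Ranvier's API; every class and function name below is
hypothetical. Each handler consumes the path components it understands and
delegates whatever remains to the next handler, in contrast to matching
the whole URL against a list of regular expressions.

```python
class Folder:
    """Consumes one fixed path component and delegates the rest."""

    def __init__(self, **children):
        self.children = children

    def handle(self, components, context):
        if not components:
            return "index of /" + "/".join(context["consumed"])
        head, rest = components[0], components[1:]
        child = self.children.get(head)
        if child is None:
            return "404 not found: " + head
        context["consumed"].append(head)
        return child.handle(rest, context)


class VarComponent:
    """Consumes one component as a named variable, then delegates."""

    def __init__(self, name, next_handler):
        self.name = name
        self.next_handler = next_handler

    def handle(self, components, context):
        if not components:
            return "404 missing " + self.name
        context[self.name] = components[0]
        context["consumed"].append(components[0])
        return self.next_handler.handle(components[1:], context)


class Leaf:
    """End of the chain: runs application code once the path is consumed."""

    def __init__(self, func):
        self.func = func

    def handle(self, components, context):
        if components:
            return "404 not found: " + "/".join(components)
        return self.func(context)


def dispatch(root, path):
    """Split the URL path into components and hand them to the root."""
    components = [c for c in path.split("/") if c]
    return root.handle(components, {"consumed": []})


def show_user(context):
    return "profile of user " + context["username"]


ROOT = Folder(
    users=VarComponent("username", Leaf(show_user)),
    about=Leaf(lambda context: "about page"),
)

if __name__ == "__main__":
    print(dispatch(ROOT, "/users/alice"))
```

Because the tree of handlers is itself a data structure, it could also be
walked to enumerate every URL the application serves, which is the kind of
registry role the article attributes to Ranvier.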

Strategic Security: Get a Handle on Authentication

It's a common dilemma: You host multiple Web-accessible applications,
for both internal customers and external users. A few of your developers
are keeping up on the latest programming trends and security models, while
some of your highest-seniority employees are stuck in programming models
outdated a decade ago. You've got a hodgepodge of access and
authentication methods, along with a lot of client-server interaction,
and a little bit of Web services and SOA, as well as Citrix or Terminal
Services thrown in. There are even a few people still dialing in on phone
lines to access dumb terminal-based applications. Truth be told, if
someone asked what you thought of the situation, you'd reply it's a deck
of cards just waiting to be pushed over by the right inquisitive hacker.
You've got to get control of your applications and authentication models,
so where do you start and what do you do? There are six broad areas that
you'll need to address: education, strategy, standardization, policies,
remediation, and retirement. Education: educate people about the various
authentication components. Essentially, you want to explain identity,
authentication, authorization, and access control (and accounting/auditing),
or simply AAA, as parts of a systematic process, each of which can be
accomplished using various methods. And you want to push for more maturity
on each of those concepts. If single users end up with multiple identities,
you need an identity management system (or maybe federated identities,
if multiple companies are involved). You want to move authentication from
passwords to something more sophisticated, such as two-factor
authentication. You want to move access control from Discretionary Access
Controls (DAC) to client-server impersonation and eventually Role-Based
Access Control (RBAC). Finally, the data you protect must be categorized
according to sensitivity and protected accordingly...

DITA: Reusable XML

IBM and software vendor JustSystems announce the availability of a
methodology that allows organizations to break up huge Extensible
Markup Language (XML) documents into reusable pieces. "The Darwin
Information Typing Architecture (DITA) Maturity Model," co-authored by
IBM and JustSystems, is the first step-by-step process for implementing
DITA, officials from the companies said. DITA can be applied to content
that is highly branded or regulated and broadly leveraged, including
technical documents, marketing materials and regulatory filings,
according to Paul Wlodarczyk, vice president of solutions consulting at
JustSystems: "The DITA Maturity Model recognizes that each organization
is adopting DITA at its own pace. So, the model starts from square one,
laying out the key steps that any organization can take to successfully
adopt DITA." One of DITA's most attractive features is its support for
incremental adoption. However, organizations at different stages of
adoption claim radically different numbers for cost of migration and
return on investment. To address these issues, the DITA Maturity Model
divides DITA adoption into six levels, each with its own required
investment and associated return on investment. As a result, users can
assess their own capabilities and goals relative to the model and
choose the initial adoption level appropriate for their needs and
schedule.

Employ Metadata to Enhance Search Filters

In this article the author shows how to use metadata for pooling
information already resident in an application to create a flexible
search interface that reduces complexity and increases users'
productivity. Easily customizable and configurable software is becoming
increasingly important, and a flexible interface for searching is one
way in which software is becoming more configurable. The key to
achieving flexibility is through using metadata. Consider an application
that stores customer, item, and order information in a database. The
interface for searching through orders could apply any number of filters,
but presenting all possible combinations together can very quickly
become overwhelming for users. It is often beneficial to allow some
customization or configuration for choosing the appropriate filters
based on several factors including the business process, the role of
the individual, or those that are specific to the user's needs. With
traditional query templates, the complexity grows quickly with every
new search filter that is added. However, by using metadata to model a
query and its filters, you can reduce the complexity of the software,
while creating a more flexible solution... Metadata can be loosely
defined as data about data, or in this case search-filter data about
order data. The World Wide Web Consortium (W3C), the group responsible
for XML standards, recommends using Resource Description Framework
(RDF) for representing metadata (in XML or other formats). You can
store RDF in a variety of formats, but the example discussed here will
use an RDF/XML file because it has the widest support. XML tags are
used to structure the file format of RDF/XML. The outer tags represent
the resources, their nested tags represent properties, and inside the
property tags is a property value, which may be text or another resource
tag.
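
As a rough illustration of that layout, the Python sketch below reads a
small RDF/XML fragment in which each outer tag is a filter resource and
each nested tag is a property, then uses the properties to assemble a
parameterized WHERE clause. It is not the article's code: the f: namespace,
the property names (label, column, operator), the filter URIs, and the
column names are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
F = "{http://example.org/filters#}"  # hypothetical vocabulary

FILTER_METADATA = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:f="http://example.org/filters#">
  <f:Filter rdf:about="http://example.org/filters/byCustomer">
    <f:label>Customer name</f:label>
    <f:column>customer_name</f:column>
    <f:operator>=</f:operator>
  </f:Filter>
  <f:Filter rdf:about="http://example.org/filters/sinceDate">
    <f:label>Ordered on or after</f:label>
    <f:column>order_date</f:column>
    <f:operator>&gt;=</f:operator>
  </f:Filter>
</rdf:RDF>
"""


def load_filters(rdf_xml):
    """Return {resource URI: {property name: value}} from the metadata."""
    root = ET.fromstring(rdf_xml)
    filters = {}
    for resource in root.findall(F + "Filter"):      # outer tag: a resource
        props = {child.tag.replace(F, ""): child.text  # nested tag: a property
                 for child in resource}
        filters[resource.get(RDF + "about")] = props
    return filters


def build_where(filters, values):
    """Build a parameterized WHERE clause for the filters the user filled in."""
    clauses, params = [], []
    for uri, value in values.items():
        meta = filters[uri]
        clauses.append("%s %s ?" % (meta["column"], meta["operator"]))
        params.append(value)
    return " AND ".join(clauses), params


if __name__ == "__main__":
    available = load_filters(FILTER_METADATA)
    print(build_where(available, {
        "http://example.org/filters/byCustomer": "Acme",
        "http://example.org/filters/sinceDate": "2008-01-01",
    }))
```

Adding a new search filter then means adding one more resource to the
metadata file rather than another hand-written query template.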

Achieving Separation of Concerns Using BPEL

The vast majority of software producers focus exclusively on
domain-specific solutions. In this way, software is becoming more
customized and, correspondingly, less generic. While some end users
(particularly large corporate customers) may be able to request
features that closely fit their business processes, it's likely that
most of us end up with a poor fit between our deployed software and
our business process needs. The end result is massive cross-vendor
duplication of software development that tries to implement code as
well as business process logic. An interesting separation of concerns
is becoming possible by the use of BPEL (Business Process Execution
Language), which allows for business process logic to be expressed in
a specific language and to be tied into external software. This reduces
(and potentially eliminates) the need to code business process logic
in a traditional programming language (such as Java or C++/C). In
turn, this provides a clear separation between software features and
business processes. By taking the business process logic (e.g.,
workflow management) out of the application code, the latter becomes
simpler and more focused. In this article, I'll review the idea and
merits of separating software features from business processes in the
context of BPEL. Along the way, we'll see how this leads neatly to
the need for highly generic software. The latter is (in my opinion)
a pressing concern for all software developers... I think that IT should
endeavor to become as streamlined as possible and BPEL/web services
suggests itself as a possible path to take. By removing business
process logic from code, we would see the potential emergence of
generic software for web services use. Business process logic would
then reside in a BPEL layer that would orchestrate the required
service calls. This would help to reduce the growing complexity of
software and systems.
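
The separation described above can be sketched in a few lines. This is
not BPEL (which is an XML language run by a process engine); it is only
a minimal Python analogy, with invented service names and an invented
order-handling process, showing generic services on one side and the
business process expressed as data on the other.

```python
# Generic, reusable services: they know nothing about the business process.
def check_stock(order):
    order["in_stock"] = order["quantity"] <= 5   # hypothetical stock rule
    return order


def bill_customer(order):
    order["billed"] = True
    return order


def ship_order(order):
    order["shipped"] = True
    return order


SERVICES = {
    "check_stock": check_stock,
    "bill_customer": bill_customer,
    "ship_order": ship_order,
}

# The business process is data, kept outside the service code.
# Changing the workflow means editing this list, not the services.
ORDER_PROCESS = [
    {"call": "check_stock"},
    {"call": "bill_customer", "when": "in_stock"},
    {"call": "ship_order", "when": "billed"},
]


def run_process(process, order):
    """Orchestrate service calls in order, skipping steps whose condition is unmet."""
    order = dict(order)                # work on a copy of the input
    for step in process:
        condition = step.get("when")
        if condition is None or order.get(condition):
            order = SERVICES[step["call"]](order)
    return order


if __name__ == "__main__":
    print(run_process(ORDER_PROCESS, {"quantity": 2}))
    print(run_process(ORDER_PROCESS, {"quantity": 9}))
```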

Process Component Models: The Next Generation in Workflow?

This article argues that the gap between the analysis and the
implementation of business processes is far bigger than the marketing
of today's workflow tools might suggest. It also proposes a much
more realistic way of dealing with this situation. The current standards
and initiatives will be explained with enough depth so that you can see
how they relate to the movements and why. In the discussions, I'll
identify the strengths and weaknesses of each discussed technology and
describe the proper and improper ways of using them. At the end, a new
type of workflow technology is introduced called process component model.
This type of framework can handle multiple process languages and it
can support process languages that better support the transition from
analysis process diagrams to executable processes. BPEL is an executable
process language, which is good for integration purposes, but it's not
suited for supporting Business Process Management because of its tight
coupling with technical service invocations. BPMN serves the analysts
in drawing analysis diagrams, but it's not executable. XPDL is a less
adopted file format, which might be superseded by BPDM. The gap between
analysis languages and executable languages still remains too big to be
practical. In order to create a more realistic approach to BPM for
widespread adoption, we need to start by making a better distinction
between analysis process models and executable process models. Once we
abandon the idea that non-technical business analysts can draw
production-ready software in diagrams, we can come to a much more
realistic and practical approach to business process management. When
linking an analysis process model with an executable process
implementation, the clue is not to include too many of the sophisticated
details of the analysis process notation in the diagram. By using only
the intersection of what the analysis language and the executable
process language offers, a common language can be created for the
business analyst and the developers, based on one single diagram.
Different environments and different functional requirements require
different executable process languages. The current idea that one
process language would be able to cover all forms of BPM, workflow
and orchestration is just too ambitious. And if such an effort would
succeed, the resulting process language would be way too complex for
practical usage...

OGC Approves Sensor Web Observations and Measurements Encoding Standard

The Open Geospatial Consortium (OGC) announced that its members have
approved version 1.0 of the Observations and Measurements Encoding
specification as a final OpenGIS Implementation Standard. The two-part
Observations and Measurements Encoding specification "defines an abstract
model and an XML schema encoding for observations and measurements. This
framework is required for use by other OGC Sensor Web Enablement (SWE)
standards as well as for general support for OGC compliant systems
dealing in technical measurements in science and engineering. As a new
international consensus standard in an era of increasing scientific
cooperation, O&M promises to play an important role in Web-based
publishing of real-time and archived scientific data across research
disciplines and application domains." An 'Observation' is an action with
a result which has a value describing some phenomenon. The observation
is modelled as a Feature within the context of the General Feature Model.
An observation feature binds a result to a feature of interest, upon
which the observation was made. The aim of the OpenGIS O&M Standard
is to "define terms used for measurements and the relationships between
them, mainly to improve the ability of software systems to discover and
use live and archived digital data produced by measuring systems. When
scientists and engineers encode data in O&M, they can easily publish
the data (or live data feeds) in catalogs and registries so others can
efficiently discover, access and use the data, using relatively simple
software. The scope of the specification covers observations and
measurements whose results may be quantities, categories, temporal and
geometry values, coverages, and composites and arrays of any of these."

Russia is Open to Open Protocols

Golovin (Executive Director BIG-RU -- BACnet Interest Group Russia):
"I started with BACnet in 2005 when the BIG-RU (BACnet Interest Group
Russia) association was founded. Before then I was familiar with the
BACnet movement in the US. In my opinion, the Russian market is more
similar to the USA market, than the European. The main idea was to
bring the latest achievements in open standards to the Russian market
and BACnet was the best choice. During the last 2 years the BIG-RU has
familiarized an estimated 80% of building owners in Russia with BACnet
benefits. In 2007 the KNX Association (ex-EIB protocol) invited me to
manage their association in Russia while working with BIG-RU. It is a
great experience, because we can say that BACnet grew from HVAC systems,
and EIB/KNX grew from lighting and low-voltage systems. Both protocols
are related and part of ISO standard 16484-5. We can now support our
are relative and part of ISO standard 16484-5. We can now support our
customers with a wider field of open building standards applications.
Let's say that the joining of BACnet and KNX promotion work is a process
of globalization of the market of open standards in Russia... Nobody
wants to be locked into a single brand system. The situation is that
every open protocol should be implemented where it is practical.
Traditionally BACnet is more suitable for integration of large systems,
multi-vendor projects and systems which could be expanded in the future.
LonWorks is more often applicable in the field level (for products
integration) and in middle sized projects. Very often both protocols
exist in the same project: BACnet at the management level and automation
level, and LonWorks on the field level. A good example of such a system
is the 'Federation' Tower project in Moscow..."

WS-Are-We-There-Yet

We now have web services to almost any conceivable control system. We
have BACnet-XML and BACREST. We have TAC Web Services. We have LON XML.
We have oBIX. So...are we there yet? [...] One of the reasons that I
am watching NBIMS so closely from the oBIX vantage point is that the
higher semantics will need to be there before oBIX has an enterprise
interface. I have a hard time imagining what applying Policy to point
services even means. As Enterprise programmers will never be control
engineers, we are going to have to wrap up the standard functions
into business services. In oBIX if I define a set of functionalities
for a given period of time, it is called a contract. For these to be
useful, we will need to pre-define several contracts and make each of
them discoverable. Discovery will mean that we need to describe each
one in terms of the service it provides. The Description/Catalogue will
need to be machine readable rather than human readable, which means it
will be based on Semantics... I believe that as we all become more
familiar with NBIMS, we will be able to discover the Semantics needed
to do this. Somewhere in the check list of Services to be Commissioned,
in the Systems in the Energy Model, and even in the Function analyzed
by the Code Compliance checker are the Semantics needed to create
discoverable abstract services. And that XML will be an order of
magnitude more useful because it is semantically laden.

Google, NTT and the US GSA Deploy SAML 2.0 for Digital Identity Management

Liberty Alliance, the global identity consortium working to build a more
trusted internet for consumers, governments and businesses worldwide,
today released highlights of SAML 2.0-based digital identity management
applications that are delivering real world value to users and
organizations around the globe. These applications are among the many
public and private sector deployments helping to drive a more secure and
privacy-respecting internet identity layer across applications, sectors
and regions based on SAML 2.0 standards. With government organizations in
The Americas, Asia, Australia and Europe building and deploying SAML
2.0-based identity applications, SAML 2.0 has become the standard of
choice in the global eGovernment and public sectors. These governments
are relying on SAML 2.0 to deliver a wide variety of new online services
to citizens, help meet compliance mandates and to provide business and
trading partners with a secure and trusted platform for conducting
identity related transactions. A digital map and description of global
eGovernment deployments based on SAML 2.0 Liberty Federation is available...
Using SAML 2.0 allows Google's customers to treat web-based authentication
to Google Apps the same way they treat authentication to their other
services... NTT has developed SASSO, a personal Identity Provider that
enables users to single-sign-on to a PC and leverage the strong
authentication capabilities of the mobile phone to conduct a wide range
of secure identity-based transactions. SASSO uses the increasingly
ubiquitous mobile phone as an Identity Provider (IdP) to allow users to
access a Service Provider (SP). Once authenticated by their own mobile
phone, the IdP on the mobile phone issues a SAML assertion signed by a
private key and sends that assertion to SPs.

Rogue Wave Accelerates SOA Data Services Creation in C++

Rogue Wave has announced the release of Rogue Wave HydraSDO for XML 2.2
and the next edition of HydraSDO for Databases. Rogue Wave HydraSDO
data components automate the creation of high-performance,
service-oriented data services in Java and C++. The components enable
developers to expose any data source as lightweight, independent, and
decentralized data services through the Service Data Object (SDO) API,
the industry standard data access in SOA. Rogue Wave HydraSDO for
XML 2.2 enables XML documents to be read and updated using the SDO API.
HydraSDO for XML provides a data access service (DAS) for parsing XML
data and populating a DataGraph consisting of DataObjects and a Change
Summary. HydraSDO for Databases enables developers to use the SDO API
to access relational data in both loosely coupled and traditional
tightly coupled application architectures. The component provides
read/write capability for relational databases using the SDO API without
the need to write SQL statements. HydraSDO for Databases includes support
for leading databases including Oracle, SQL Server, MySQL and Sybase
databases. Both XML and relational database data sources are made
available through the simple XML-style SDO interface, which can be used
by multiple applications as a real-time SOA data service. HydraSDO data
components work stand-alone or can be seamlessly integrated with Rogue
Wave HydraSCA, the first product available for deploying high-performance
SOA applications based on the Service Component Architecture (SCA)
specification.

Atomojo Atom Publishing Protocol (APP) Server 0.7.0 Release Available

This atomojo Google Code project contains both an Atom Publishing
Protocol (APP) server and client. While both are intended to be used
together, as they implement a standard protocol, they can be used with
other APP-enabled applications. The client is a Firefox plugin that
contains an XPCOM component for interacting with the Atom Publishing
Protocol (APP). Atomojo's APP implementation has been coming along
quite nicely and release 0.7.0 is quite stable. Here's a short list
of some features: full APP implementation; hierarchical feeds; XQuery
support; large binary support; full REST interfaces for administration;
integration with external authentication services; indexing of Atom
category elements; retrieval of entries and feeds via categorization;
metadata services for context and query by term and term value. The
Atomojo server provides a uniform way to store multiple feeds and
manipulate them with the Atom Publishing Protocol (APP). Feeds are
organized hierarchically and indexed by their categorization. It
provides both metadata feeds for getting context information about
feeds as well as pulling a feed for each categorization stored in the
database. The server runs on top of a restlet.org engine and uses a
REST-style URI architecture for its feeds. Feeds are organized
hierarchically just like a file system but they are stored in an XML
database called eXist. In addition to eXist, there is a metadata
index that is stored in an embedded Derby database. This index stores
information about what feeds contain what entries as well as
information about categorization. Once the server is started, you can
get the service introspection document by a GET on the server root...
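
A client that has fetched such a service document might list the
available collections along the following lines. This is only a sketch:
the sample document and its example.org URLs are invented rather than
taken from Atomojo, and the network GET is left out because the server
address is not given here. The two XML namespaces are the standard APP
and Atom ones.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for APP service documents and Atom elements.
NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
}

# Hypothetical service document; a real one would come from a GET on the server root.
SAMPLE_SERVICE_DOC = """\
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Example Workspace</atom:title>
    <collection href="http://example.org/feeds/news/">
      <atom:title>News</atom:title>
    </collection>
    <collection href="http://example.org/feeds/news/archive/">
      <atom:title>News Archive</atom:title>
    </collection>
  </workspace>
</service>
"""


def list_collections(service_xml):
    """Return (workspace title, collection title, collection href) tuples."""
    root = ET.fromstring(service_xml)
    found = []
    for workspace in root.findall("app:workspace", NS):
        ws_title = workspace.findtext("atom:title", default="", namespaces=NS)
        for collection in workspace.findall("app:collection", NS):
            title = collection.findtext("atom:title", default="", namespaces=NS)
            found.append((ws_title, title, collection.get("href")))
    return found


if __name__ == "__main__":
    for entry in list_collections(SAMPLE_SERVICE_DOC):
        print(entry)
```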

Sanjiva Weerawarana on Open Source SOA Middleware

In this interview, Stefan Tilkov talks to Sanjiva Weerawarana about web
services and REST, about core standards that are essential for web
services standards, open source SOA tooling, scripting languages and
web services, and the strategy of WSO2 in providing open source
middleware. [As to key WS-* specifications:] The most important one
from a service oriented perspective, is the thing that is used to
describe what the service does. WSDL is the one that people are using
for that. There is a version of the spec that is widely deployed,
WSDL 1.1. But WSDL 2.0 is a hugely improved spec; it's really not even
a WSDL, it's a completely different language in many, many ways.
Unfortunately adoption has been slow so far; it just came out this year,
and it is going to take some time, because major vendors have to go
through revisions to get to that point; if they want to get there, they
have to have a much better way to describe services. The other key spec
is of course the
base wire protocol that everybody uses which is SOAP, and there is a
series of specs that extend SOAP with security, with reliability,
transactions, and so there is WS-Security, WS-SecurityPolicy,
WS-SecureConversation and WS-Trust; those are the four security specs
that matter. On reliable messaging there is something called
WS-ReliableMessaging, in transactions there is something called
WS-AtomicTransaction, WS-BusinessActivity and WS-Coordination. There
are many applications
where you never touched reliable messaging or transactions. So a large
percentage of the people who built web services would never actually
end up invoking these things, and in fact most people who actually use
web services shouldn't be knowing about these specs. This is the
underneath infrastructure that people like me who build web services
should know about...

eGovernment and the Web Workshop Report: Toward More Transparent Government

W3C has published a Workshop Report: eGovernment and the Web Workshop:
"Toward More Transparent Government". On 18-19 June 2007, the World
Wide Web Consortium (W3C) and the Web Science Research Initiative held
a workshop entitled "Toward More Transparent Government" at the US National
Academy of Sciences in Washington, DC. The goal of the workshop was to
find ways to facilitate the deployment of Web standards across eGovernment
sites and help shape the ongoing research agenda in the development of
Web technology and public policy in order to realize the potential of
the Web for access to and use of government information. The Call for
Participation had required participants to submit position papers.
Twenty-two (22) position papers were received. The workshop was chaired
by Daniel J. Weitzner (MIT/W3C), Ari Schwartz (Center for Democracy and
Technology) and Nigel Shadbolt (University of Southampton, UK). The
final workshop session considered key lessons learned and identified
possible next steps that W3C and WSRI could take together with the
eGovernment user, vendor and research communities around the world. After
two days of policy and technology presentations, the over-arching theme
heard over and over again is the need to take steps, institutional,
legal and technical, toward publishing data with re-use in mind. The
participants identified several steps to meet this goal...

Building Asynchronous Services using Service Component Architecture

Mike Edwards of IBM discusses the need for asynchronous services when
you build an application using a service-oriented architecture.
Building asynchronous services can get complicated, but is made
straightforward using Service Component Architecture (SCA). The steps
involved in using SCA to create an asynchronous service and asynchronous
service client are described in this article. "Asynchronous services,
where responses are returned a long time after an initial request is
made, are simply a fact of life -- not everything happens immediately!
[...] Facilities for writing clients to synchronous services are well
provided by most programming models and frameworks. The same can be
said for writing synchronous services. The synchronous call-and-return
model is the standard form for writing code in most programming languages.
As a result, writing a synchronous service or a synchronous service
client is usually not much more than a simple extension of the regular
programming model. In Java, for example, this usually means that a
service is implemented as a method in a class, while the service client
is the invocation of a method on a class... Providing Asynchronous
service implementations and also providing clients to asynchronous
services is made simpler with Service Component Architecture. SCA
provides a request followed by callback response model for asynchronous
services, allied with the use of callbackID to tie the original request
to the callback response. SCA enables this to be done for a variety of
underlying communication methods between the client and the service,
eliminating the need for the code to be dependent on complex middleware
APIs.
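
The request-then-callback pattern can be illustrated without any
middleware at all. The sketch below is plain Python, not the SCA API:
the class names, the quote service and the pricing rule are invented,
and process_pending() merely stands in for work that would finish long
after the request was made. What it shows is the role of the callback
ID in tying each late response back to its original request.

```python
import uuid


class AsyncQuoteService:
    """Accepts requests immediately and answers later through a callback."""

    def __init__(self):
        self._pending = []

    def request_quote(self, item, callback, callback_id):
        # Returns at once; the response is delivered later.
        self._pending.append((item, callback, callback_id))

    def process_pending(self):
        # Stand-in for slow back-end work completing some time afterwards.
        pending, self._pending = self._pending, []
        for item, callback, callback_id in pending:
            callback(callback_id, {"price": len(item) * 10})  # made-up pricing


class Client:
    """Sends requests and matches callback responses using the callback ID."""

    def __init__(self, service):
        self._service = service
        self._outstanding = {}   # callback ID -> item originally requested
        self.completed = {}      # item -> quoted price

    def ask(self, item):
        callback_id = str(uuid.uuid4())
        self._outstanding[callback_id] = item
        self._service.request_quote(item, self.on_quote, callback_id)
        return callback_id

    def on_quote(self, callback_id, response):
        # The ID tells the client which request this response belongs to.
        item = self._outstanding.pop(callback_id)
        self.completed[item] = response["price"]


if __name__ == "__main__":
    service = AsyncQuoteService()
    client = Client(service)
    client.ask("bolt")
    client.ask("washer")
    service.process_pending()
    print(client.completed)
```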

MarkLogic Server Used in Princeton Theological Seminary Digital Library

Mark Logic Corporation announced that Princeton Theological Seminary has
implemented MarkLogic Server as the basis for the library's new
digital collection. The library has launched a system for publishing
digital content to give users better access to and navigation through
more than 100,000 digital objects, including digitized representations
of historic photographs, portraits, artifacts, and journals. This
provides library members -- both seminary students pursuing advanced
degrees in divinity or theology, as well as the general public -- with
new levels of access and interactivity with historical and modern
theological works. The Seminary Library implemented MarkLogic Server to
enhance the library's existing browsing services with search and faceted
navigation including the Web 2.0 concept of user-tagging. Based on a
model of tag clouds, users apply key words to items in the digital archive,
which are then able to be used as search tools or for browsing via faceted
navigation. The digital collection is divided into visual collections
and textual collections. Mark Logic replaced the existing digital
collections infrastructure and provided a fixed, index-based navigation
of individual and multipart digital works which have been scanned from
a wealth of content related to the history of Princeton Seminary. The
index and holding metadata are stored in the metadata encoding and
transcription standard (METS), a library metadata XML standard developed
by the Library of Congress. The METS XML files describe books and journals
consisting of 100s of pages of content. Previously, access to the
collection would result in interminable wait times, often with the
browser simply timing out. Using MarkLogic, the Seminary Library can
run these queries in XQuery, returning results on these large XML files
often in less than a second and never more than three seconds.

ILOG Updates Rules; Now Supports .NET 3.0 and WCF

ILOG has announced its ILOG Rules for .NET 3.0, a tool that lets you
edit and manage business rules in Microsoft Word and Excel through a
new "Rules" tab in the Office 2007 toolbar or "ribbon". ILOG Rules for
.NET 3.0 also provides support for Windows Communication Foundation
(WCF), Microsoft's unified programming model for building service-oriented
(SOA) applications. Debuting with this release, ILOG Rules for .NET 3.0
has a modular architecture similar to its sister product, ILOG JRules,
for Java. These modules include a dedicated rule authoring environment
for business users called "Rule Solutions for Office." Other features
include: (1) Portable Rule Editing Environment for Business users. Rule
Solutions for Office and Rule Team Server for SharePoint Services
combine to promote a document-centric approach to rule management,
leveraging new features in Office 2007. Users collaborate easily, taking
rules where Word documents go -- disconnected, routed, attached, and
printed. (2) Out-of-the-box SOA deployment using Rule Execution Server
for .NET. Traditionally, SOA investments have been expensive. (3) Enhanced
Rule Management for Business Analysts. Business analysts can more easily
collaborate with developers working on a common set of rules in their
familiar environment. (4) Enhanced performance with RetePlus and
FastPath available for the first time on .NET Framework. In this
version, ILOG is introducing RetePlus and sequential execution with
FastPath for .NET platform. The combination of RetePlus and FastPath
in ILOG Rules for .NET 3.0 eliminates the need for a customer to
compromise when selecting an algorithm.

WSO2 Injects Mashups Into The Enterprise

WSO2's Mashup Server draws together a variety of enterprise information
or services and allows them to be combined into a new application or
"mashup." Mashups are typically associated with end user applications
that make use of information or services readily available on the Web,
such as an apartment-hunting application that taps into Google Maps.
WSO2 is extending the mashup idea to inside the enterprise. Any internal
information or service that can be presented as a Web service may be
found by Mashup Server and utilized as part of a new application, said
Jonathan Marsh, WSO2 director of mashup technologies. Mashup Server
relies on the popular Javascript language to define what resources are
going to be tapped for a new application. Mashup Server includes an
administrative user interface where Javascript can be used to identify
and sequence services. But its Javascript may also be written wherever
the developer prefers to compose it, which could include a simple text
editor or an established integrated development environment, then
imported into the Mashup Server, said Marsh. The WSO2 Enterprise Service
Bus is an integration broker meant to serve as a basis for building out
a service-oriented architecture. It's built on top of Synapse, an Apache
incubator project that translates between applications and provides
automatic routing of XML messages. Mashup Server can discover new
services and capture information about them, which it stores for use
in future mashups. WSO2 Mashup Server is released under the Apache
License Version 2.0. It features: (1) support for consuming and deploying
services using dynamic scripting languages; (2) trivial deployment and
redeployment; (3) automatic and UI-based generation of Web services
artifacts -- e.g. wsdl, schema, policy; (4) a set of gateways into a
variety of information sources, including SOAP and POX/REST Web services,
as well as plain old Web pages; (5) human-consumable results through a
variety of user interfaces including Web pages, portals, e-mail, Instant
Messenger service, Short Message Service (SMS), etc.

Battle on Microsoft Standard Push

A global war has broken out over Microsoft's bid to make the XML document
format used in Office 2007 an international standard. Rivals and the
open-source community fear a yes vote for Office Open XML will stymie
the existing ISO file standard, OpenDocument Format, and give Microsoft
an ongoing commercial advantage. The file format standard is a key
factor in ensuring present and future access to digital documents used
by business or held in archives. The aim is to ensure
backward-compatibility despite changes in software and publishing
technologies. The software giant has been trying to secure national
votes for a coming ISO ballot on OOXML, after the first ballot failed
last September. IBM has been particularly outspoken about the issue, but
local government programs executive Kaaren Koomen insists it's not
simply a battle between two multinationals. "ISO has a policy that,
wherever possible, there should only be one standard to maximise
interoperability and functionality. We have an international standard
for digital documentation, ODF, which was developed by Microsoft, IBM,
Sun, Oracle and the open-source community some years back. Microsoft
pulled out of that process and decided to develop its own standard, OOXML.
Now Microsoft is trying to convince the rest of the ICT community to
adopt its standard." Microsoft Australia chief technology officer Greg
Stone says the company was simply responding to repeated requests to
make its specification available. Rick Jelliffe, a developer of XML-based
desktop tools and a long-standing participant in standards work, says
the bottom line for Microsoft is keeping in the game. "This is really
important for them. My take is that over the past 10 years Microsoft
has lost its bread-and-butter systems integrator market. It had a
thriving sector that was devoted and tied to it, but the advent of web
technologies meant the old advantages of lock-in to proprietary formats
suddenly became disadvantages, because you can't integrate with other
systems."

Validator for XML Schema: XSV Version 3.1.1

Henry S. Thompson (W3C Technical Advisory Group; HCRC Language Technology
Group, University of Edinburgh) announced the release of a new version
of the XML Schema Validator (XSV). XSV is an open source (GPLed)
work-in-progress attempt at a conformant schema-aware processor, as
defined by XML Schema Part 1: Structures, Second Edition of
28-October-2004. The simplest way to use XSV is via a form-based
interface on the web. There is a Win32 one-click installation, and source
distributions are available for the more adventurous. All installers are
now up-to-date: Windoz executable, .deb, .rpm and source versions. The
major changes since the last public release are: (1) corner cases for
nested numeric exponents are handled correctly; (2) XSV no longer requires
PyLTXML (our fast Python/C XML parser), will run without it, provided
you have PyXML installed. XSV can be run with various flags to control
the kind and level of validation. If you enter more than one URI, the
second etc. will be used to schema-validate the document at the first
URI. "Show Warnings" will display warning messages, e.g. about use of
wildcards. "Keep Going" enables continuation of schema-validation after
finding errors. Check as complete schema: Normally XSV interprets its
first input as a document to be validated, and the remaining inputs,
if any, as schema documents for use in that validation. This means that
if the only input is a schema document, XSV normally just validates that
document against the Schema for Schema Documents (XMLSchema.xsd), but
does not also check the Schema REC's constraints on the corresponding
schema. Ticking the "Check as complete schema" box causes XSV to treat
all its inputs as schema documents, check them against the Schema for
Schema Documents and check the Schema REC's constraints on the
corresponding schema.

Will HP's Extended GIF Partnership Help With SOA Interoperability?

HP today launched new versions of Systinet and SOA Manager, adding new
features aimed at run-time governance -- enforcing the policies that
traditional design-time registries and repositories define. It is also
announcing a large expansion of its Governance Interoperability Framework
(GIF), a set of specifications that smooth links between the registry
or repository and other components. The expansion of GIF is largely
through ten new partners, all of whom will support the spec. The single
most important new partner is Oracle, which is already a major player
in SOA and set to become more so with its acquisition of BEA, already a
GIF member. The most interesting looks like LogicLibrary, a specialist
registry/repository vendor and Systinet competitor. Its support turns
GIF from what had been a Systinet-centric program into something more
vendor neutral. Most of the XML security gateway industry has also
joined the GIF program: HP is announcing support from Cisco Systems,
Alcatel-Lucent, Vordel and Layer 7 Technologies, which leaves IBM
DataPower the only major player in XML security that's not involved.
The other four new members are orchestration specialist Active Endpoints
and Web 2.0 development tool vendors JackBe, Nexaweb and Sonoa Systems,
which could help to bridge the gap between rich Internet applications
and SOA back end systems. But not everyone has embraced GIF. The main
competitor is SOA Link, a similar program started by Infravio before
it was acquired by webMethods and then Software AG. HP's new software
(Systinet 2.52 and SOA Manager 2.5) tries to close the loop between
design-time governance and run-time management, something several other
vendors are working on.

Progress Software Adds Cross-Process Visibility with Actional 7.1

Progress Software has beefed up its Actional SOA management offerings
with the release of Progress Actional 7.1, which provides unified
visibility into business processes, and connects those business processes
to the underlying SOA infrastructure. Key features of the latest release
include an automatic discovery feature that keeps information accurate,
allowing users to compare how processes change from day to day. Users can
also set thresholds for alerts about behavior and performance, and policy
enforcement will automatically adjust when services or processes change.
Progress, Bedford, Mass., added the Actional product line to its SOA
arsenal just a little over two years ago with the acquisition of Actional
Corporation in a $32-million deal. Progress said that Actional 7.1 will
integrate with Lombardi TeamWorks, and the company plans to provide native
support for other business-process management (BPM) solutions, including
offerings from Software AG and Fujitsu. Actional also includes a software
development kit (SDK) that allows third parties to add support for other
BPM and SOA infrastructure products. The new version also includes support
for non-XML payload data, which is designed to allow users to inspect
and analyze message content in such existing services as Remote Method
Invocation (RMI) and Enterprise JavaBean (EJB).

IBM Makes SOA Play with AptSoft Buy

IBM is looking to beef up its service-oriented architecture portfolio
by buying AptSoft, a private company that develops infrastructure
software to help companies determine cause-and-effect relationships
between business events. AptSoft, based in Burlington, Mass., develops
software that falls into the complex event processing category of
software that works within a SOA (service-oriented architecture)
framework to help trigger BPM (business process management) events.
AptSoft's software spans a number of areas in IBM's vast product
portfolio groups, including WebSphere, Information Management and
Tivoli. IBM also plans to fold AptSoft into newer product categories
that include RFID and Web 2.0 capabilities, areas where IBM has ramped
up investment over the past couple of years. Complex event processing
software helps companies identify patterns and establish connections
between events. Once the software determines a trend in events --
whether they occur over a millisecond or over hours or days -- it
initiates a business process trigger. AptSoft's namesake Director
for CEP is a platform that runs on a company's network, where it
monitors and correlates activities across applications, Web services,
databases and devices, according to AptSoft. Based on user-defined rules,
the software detects events or patterns -- a new customer is added,
a product is sold but a shipment isn't scheduled, a prospect researches
the same topic on a company's Web site for a week -- and then
orchestrates the disparate elements of the infrastructure to execute
a process. Where AptSoft fits into IBM's SOA strategy is in its ability
to enable a SOA framework that supports what AptSoft refers to as a
new class of composite applications (those applications that are made
up of Web services, events from services and events from an event
cloud of people, devices, applications, databases and networks).
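
The "product is sold but a shipment isn't scheduled" pattern mentioned
above can be illustrated with a minimal sketch. This is not AptSoft's
engine or rule syntax; the Event class, the event names and the
unshipped_sales function are hypothetical, chosen only to show how a
rule might correlate events over a time window.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    kind: str         # hypothetical event names: "sale", "shipment_scheduled"
    order_id: str
    timestamp: float  # seconds on an arbitrary clock


def unshipped_sales(events: Iterable[Event], window: float, now: float) -> List[str]:
    """Return ids of orders sold more than `window` seconds ago
    for which no shipment was scheduled inside that window."""
    sale_time = {}
    ship_times = {}
    for e in events:
        if e.kind == "sale":
            sale_time.setdefault(e.order_id, e.timestamp)
        elif e.kind == "shipment_scheduled":
            ship_times.setdefault(e.order_id, []).append(e.timestamp)

    flagged = []
    for order_id, sold_at in sale_time.items():
        deadline = sold_at + window
        if now < deadline:
            continue  # the window is still open, so it is too early to judge
        on_time = any(sold_at <= t <= deadline for t in ship_times.get(order_id, []))
        if not on_time:
            flagged.append(order_id)
    return sorted(flagged)


events = [
    Event("sale", "A", 0),
    Event("shipment_scheduled", "A", 10),
    Event("sale", "B", 5),
    Event("sale", "C", 95),
]
late_orders = unshipped_sales(events, window=60, now=100)
```

In a real deployment the flagged order ids would be handed to whatever
process trigger the infrastructure provides; here they are simply
returned as a list.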

Web 2.0 Security

Web 2.0 is an umbrella term coined to include technologies used for
providing user-centric web based services. Here, the services are
architected and programmed so that they can be personalized and used
dynamically. The architectural philosophy is called Service Oriented
Architecture (SOA). This document provides security aspects for Web 2.0
based Services. It provides a comprehensive list of threats that need
to be considered for mitigation when deploying Web 2.0 services. It
also provides ideas on mitigating the described threats. The document
is intended for CIOs and Enterprise IT Professionals (e.g.,
Administrators, Directors) who are planning or implementing or deploying
Web 2.0 Services, and for Network & System Architects. The paper
discusses several security threats, including Feed Injection;
Authentication; Validation; Client Side Attacks: Cross-Site Scripting &
Forgery; Client Side Attacks: Command Execution and Zones; Client Side
Attacks: Generic.

Watching WADL

I'm following the discussion of RESTful Web description in general, and
WADL in particular, with both difficulty and interest. The first hurdle
that a RESTful description format faces is probably the biggest; how
it's used by clients. My experience is that WADL provides most of its
value on the server-side (e.g., for in-development service modelling,
documentation, and review, as well as limited code generation), but
much discussion keeps on circling around to the client side, perhaps
because of the well-worn footpath off the cliff that WSDL provided. If
clients use a WADL file to generate static code that calls the described
service without checking to see if the WADL has changed, they're going
to be tightly coupled to the WADL definition, and therefore no better
than WSDL or any other interface description. Blech. On the other hand,
if you use the WADL at runtime to dynamically create the URLs and
representations you send and parse the ones you receive, it's all good,
and in this way WADL is acting like a Web format should -- using
hypertext (in this case, a generic XML format) as the engine of
application state. In this way, it's no different than the APP service
document format or HTML forms, except that WADL is less
application-specific in the first case, and more flexible in the latter...
I think there will be two ways to get a (somewhat) RESTful Web service
into the world, for the foreseeable future. One will be to work with a
group of people to identify a broad problem space, standardise one
(or a few) media types, defining their semantics, and an
application-specific format that glues them together into an interface.
Atom Publishing Protocol is a great example of this, and it certainly
has legs. The other will be to skip the huddle, define your own formats
and semantics, and throw it over the wall, knowing that you'll get your
problem solved in the short term, but without the considerable leverage
of a widely adopted and understood format and interface...
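
The runtime use of WADL described above, where the client reads the
description each time instead of baking it into generated code, might
look roughly like the sketch below. The sample document, the
example.org base URL and the build_get_url helper are invented for
illustration, and the namespace is assumed to be the one used by the
2006 WADL draft.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urljoin

# Namespace assumed for the WADL description (2006 draft).
WADL_NS = {"w": "http://research.sun.com/wadl/2006/10"}

# Hypothetical description; a real client would fetch it at runtime.
SAMPLE_WADL = """
<application xmlns="http://research.sun.com/wadl/2006/10"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <resources base="http://example.org/api/">
    <resource path="search">
      <method name="GET">
        <request>
          <param name="q" style="query" type="xsd:string" required="true"/>
          <param name="page" style="query" type="xsd:int"/>
        </request>
      </method>
    </resource>
  </resources>
</application>
"""


def build_get_url(wadl_text, resource_path, args):
    """Build a GET URL using only the query parameters the WADL declares."""
    root = ET.fromstring(wadl_text)
    resources = root.find("w:resources", WADL_NS)
    if resources is None:
        raise LookupError("no <resources> element")
    for resource in resources.findall("w:resource", WADL_NS):
        if resource.get("path") != resource_path:
            continue
        method = resource.find("w:method[@name='GET']", WADL_NS)
        if method is None:
            break
        query = []
        for param in method.findall("w:request/w:param[@style='query']", WADL_NS):
            name = param.get("name")
            if name in args:
                query.append((name, args[name]))
            elif param.get("required") == "true":
                raise ValueError(f"missing required parameter: {name}")
        url = urljoin(resources.get("base"), resource_path)
        return f"{url}?{urlencode(query)}" if query else url
    raise LookupError(f"no GET method described for {resource_path!r}")


url = build_get_url(SAMPLE_WADL, "search", {"q": "wadl rest", "page": "2"})
```

Because the URL is derived from the description on every call, a
change to the base URL or parameter list in the WADL is picked up
without regenerating client code, which is the loose coupling the
passage argues for.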

XmlTransform: A General-Purpose XSLT Pre-Processor

The general-purpose XML transformer and/or validator discussed in this
article, named "XmlTransform", operates on an arbitrarily deep
directory tree containing files you want to transform. As output it
optionally generates multi-level indices and can even add navigational
linkages. XmlTransform's validation capability is reasonably
straightforward; it lets you ensure that the set of XML files used for
a transformation are valid according to a specified XML schema. You may
elect to validate input files, output files (after transformation), or
both. The program's transformation capability is more interesting. One
common application of a transformation engine is as a pre-processor, a
very handy thing indeed when designing web pages. Overall, because of
the number of options and their effects on the output, XmlTransform does
have a fairly steep learning curve, but if you have a problem to tackle
that it can handle, it can be quite a time saver. In the real world,
XmlTransform originally served to generate static pages on my open source
web site. Rather than write in HTML, I can write pages in a shorthand
custom XML dialect and let XmlTransform automatically take care of the
fancy headers, footers, page linkages, copyright date, and so forth. But
XmlTransform is useful in other situations as well. For example, it can
act as a SQL documentation generator akin to Ndoc (for C#) or JavaDoc
(for Java).

Perspective: Acid2, Acid3, and the Power of Default

"Acid2 is a complex Web browser test page that shows a smiley face
when rendered correctly. The test, published by the Web Standards
Project, has been a tremendous success in weeding out browser bugs
that stop Web designers from reaching pixel perfection in their pages.
Safari and Opera ship Acid2-compliant versions, and the upcoming
Firefox 3 will also pass the test. Recently, Microsoft announced that
Internet Explorer version 8 can render Acid2, and it showed a screenshot
to back the claim. The news was received with joy and excitement in
the Web-authoring community. Finally, it seems, Microsoft has decided
to take Web standards seriously. Designers will no longer have to spend
countless hours trying to get their pages to look right in Internet
Explorer while adhering to standards. Unfortunately, I think that the
celebration is premature. I predict that IE 8 will not pass Acid2,
after all... Acid3 will follow in the footsteps of Acid1 and Acid2;
it's a tough one-page test that displays a quirky graphic when rendered
correctly. No browser will pass the test at the time of its release.
All vendors are equally challenged. Whereas Acid2 was a static Web
page, Acid3 will be a dynamic Web application. When browsers are
improved to pass Acid3, it will become easier to write Web applications
that work interoperably across browsers... The IE 8 team has shown
that it can render Acid2 correctly. Now it's time for Microsoft to put
its code to good use."

Is Tomcat an Application Server?

"In this article I tackle the question of whether Tomcat is an
application server. I start by explaining the distinctions between app
servers, Web servers, and Java EE containers, and then look at some
scenarios where a Web server like Tomcat could be used appropriately as
an app server. I show a scaled architecture, starting with the sort of
lightweight implementation where Tomcat shines, and concluding with a
complex service-oriented architecture, where you would be better off
with a full-fledged Java EE application server... Apache Tomcat can be
used as an application server, especially for less complex Java EE Web
applications. According to some figures, Tomcat is the Web/application
server environment most used by Java developers. Tomcat's popularity is
due to its ease of use and support for many features considered to be
standard in a Java Web application environment, including WAR file
deployment, JNDI resources, JDBC data sources, JSP support, session
replication, virtual hosting support, clustering support, and JMX-based
management and monitoring. Tomcat is also a favorite for Java enterprise
development due to the fact that its runtime performance as a standalone
server is very competitive. With Tomcat version 6, some new features
have been added including asynchronous HTTP request handling via Comet,
thread pool sharing, non-blocking connectors, enhanced JMX management
and monitoring, Servlet 2.5, and JSP 2.1. Even with these new features,
however, Tomcat does not support the entire Java EE stack. Where Tomcat
and other Web servers fall short is in the area of features such as
distributed transactions, EJBs, and JMS. Applications requiring support
for these components are usually more at home with a Java EE
application server such as JBoss, Geronimo, WebLogic, WebSphere, or
Glassfish. Many Java EE application servers actually use Tomcat as
their Web container."

Rails, REST, and Anarchist XML

I've been digging deeper and deeper into Rails, concluding that after
many years of frameworks offering me more headaches than benefits,
Rails finally provides enough good for me to think it worth using a
potentially constraining framework... Today, by default, Rails
scaffolding does something really simple for XML: it lets you avoid
thinking very hard about the XML coming in and going out. Now, I know,
you could if you wanted create Builder templates that let you ensure
that all of the XML going out was perfectly structured according to
this or that specification, but you don't have to... Rails' RESTful
scaffolding also accepts XML documents coming in through PUT or POST.
Again, it's not advertising a formal schema (by default, anyway).
Instead it seems to be doing something like Examplotron -- look at
a sample document, see what's there, and imitate it. Rails can get
away with this for a couple of reasons. First, the type system within
Rails is extremely simple, and not that hard to specify within documents.
Second, Rails has a pretty thorough understanding within the framework
of how data flows - all that ActiveRecord goodness ensures that Rails
knows what's supposed to be in the data, and makes it more comfortable
sending and receiving without an external set of checks. Perhaps my
favorite part of this approach is that a lot of developers are
fixated on the HTML side of things, but the scaffolding generates
the XML side too. It comes for free. Left as it arrives, it's an
opportunity to open a much wider set of services to XML manipulation,
at zero cost to developers. (Well, except for some potential surprise
if and when people start using their services that way.) It's taken
a decade, and the Rails models are pretty much data-centric rather
than the documents I love working with - but we may finally be reaching
the point where XML is starting to behave the way Walter Perry said
it should, unconstrained by data structures negotiated in advance...

Public Working Draft for HTML 5: A Vocabulary and Associated APIs

The World Wide Web Consortium (W3C) has announced the publication of
a First Public Working Draft of HTML 5: A Vocabulary and Associated
APIs for HTML and XHTML. The specification is intended to replace, viz.,
become the new version of, what was previously defined in the HTML4,
XHTML 1.x, and DOM2 HTML specifications. The HTML 5 specification defines
the fifth major revision of the core language of the World Wide Web: HTML.
In this version: (1) new features are introduced to help Web application
authors, (2) new elements are introduced based on research into
prevailing authoring practices, and (3) special attention has been given
to defining clear conformance criteria for user agents in an effort to
improve interoperability. The new features are presented in the companion
Working Draft HTML 5 Differences from HTML 4. According to the W3C
announcement, the HTML 5 specification "helps to improve interoperability
and reduce software costs by giving precise rules not only about how to
handle all correct HTML documents but also how to recover from errors.
Ajax and related innovations have propelled demands for a new standard
that allows people to create Web applications that interoperate across
desktop and mobile platforms. Some of the most interesting new features
for authors are APIs for drawing two-dimensional graphics, embedding and
controlling audio and video content, maintaining persistent client-side
data storage, and for enabling users to edit documents and parts of
documents interactively." The new specification differs from previous
versions of "HTML" in that it defines an abstract language for describing
documents and applications, as well as some APIs for interacting with
in-memory representations of resources that use this language.

Ajax and XML: Use Ajax Techniques to Create Input Forms

Augmenting your HTML forms with Asynchronous JavaScript + XML (Ajax)
callbacks to the server is a practical way to add Web 2.0 functionality
to your application. When you think about Web 2.0 applications, often
the most glamorous of them come to mind: the video of YouTube, the
über-cool scrolling map of Google Maps, the geo-location functionality
in Flickr. Often overlooked in such sites, however, is the humble HTML
form that has undergone a big transformation with the popularization of
Ajax technology. In this article, I show you how to use the Prototype.js
JavaScript library to solve common user experience problems as you
augment forms with Ajax code. You will discover a variety of techniques
to add Ajax code and enhance the user experience for PHP applications.

IE Struggles to Be Compatible

There's a new browser war brewing, and it's not between Microsoft and
Mozilla. Internet Explorer is in a state of conflict with itself and
Web standards. The conflict will expand next month, when Microsoft
sends enterprises an Internet Explorer 7 Valentine. On February 12, IE
7 will dispatch through WSUS (Windows Server Update Services). The
days of enterprises blocking the browser will end. Desktops running
Windows XP and IE 6 will get the update. Those running Windows Vista
already have it. IE 7 is notorious for breaking applications and some
Web sites, and the reasons for both calamities are somewhat different.
Security architectural changes, mainly around ActiveX controls, are
the compatibility killer for many homegrown applications and for some
Web sites. Microsoft's efforts to make Internet Explorer a
standards-based browser have caused Web site compatibility problems...
In a long blog posted overnight, Chris Wilson, Microsoft's IE platform
architect, comes clean about efforts to achieve some kind of balance
between standards compliance and backwards compatibility. It's an ugly
story that he tells. But they say that confession is good for the soul --
or perhaps software development. To me, the scariest part of Wilson's
story is what's not yet written: IE 8. Microsoft's solution: Put more
onus on Web developers, who must insert a tag so that content renders
in the most standards-compliant way. IE 8 will keep the same quirks
and standards modes as IE 7. What he's really saying is this: IE 7 broke
the Web once, and Microsoft doesn't want IE 8 to do the same. So for
the mess of DOCTYPE rendering modes everywhere, IE 8 will hold to the
IE 7 status quo. But to get the benefits of the new IE 8 rendering
engine, Web developers will have to tag their sites to support the new
browser. I wouldn't exactly call that a formula for mass adoption.

Thinking About HTML5

HTML 5 is big. Big in a lot of different ways. I'm trying to understand
some of them. Let the random mutterings begin... The genesis of this
essay was some thinking about validity, well-formedness, markup
minimization, and parsing. The design space for markup, especially
markup that will be authored by hand (directly or indirectly), is pretty
big. It's interesting to compare how SGML, XML, and HTML 5 fit in that
space. SGML was designed with ease of authoring in mind, at least to
the extent that minimizing how much markup one had to type was an ease
to authoring. Because SGML required (pre-corrigendum[1]) all documents
to be valid, this flexibility came at a terrible price. SGML parsers
were fiendishly hard to implement correctly. In the SGML world, those
typing conveniences go hand-in-hand with validity. XML was designed
with ease of parsing in mind. In particular, it relaxed the validity
constraint and obviated the need for a DTD. Without a DTD, it's
impossible to know where implied markup boundaries should go, so you
can't have any. Because you don't know the vocabulary. SGML and XML
are both 'meta markup languages': they have no defined vocabulary. SGML
includes a mechanism that allows users to invent their own tag
vocabularies; XML has several such mechanisms. HTML 5, in contrast, is
explicitly a single vocabulary (or perhaps a small family of vocabularies).
As such, it would be much less interesting were it not for two facts:
first, it is a revision of the single most important vocabulary on the
planet and second, it is neither SGML nor XML. One of the two 'authoring
formats' described by the HTML 5 specification is a custom one. The other
is XML, but in fact both are described as just concrete syntaxes for
'an abstract language for describing documents and applications' which
is what is really being defined. The goal of the custom parser, as I
understand it, is that it imposes an unambiguous HTML 5 interpretation
on any random stream of characters... While that offers some apparent
benefits to end users (they don't, for example, have to remember to type
quotes around their attribute values), I harbor some reservations about
whether or not this strategy will be a good thing for the broader markup
community in the long run.
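
The contrast between a forgiving parser and XML's strictness can be
shown in a few lines. This is only an illustration: Python's standard
html.parser module is a tolerant tag scanner, not an implementation of
the HTML 5 parsing algorithm, and the helper names below are made up
for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class AttrCollector(HTMLParser):
    """Record every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs)))


def lenient_attrs(markup):
    """Scan markup tolerantly and return the (tag, attributes) pairs found."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


def is_well_formed_xml(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


SNIPPET = "<p class=note>Hello</p>"  # attribute value typed without quotes

tolerant_result = lenient_attrs(SNIPPET)
strict_result = is_well_formed_xml(SNIPPET)
```

The tolerant scanner reports a p element with class "note", while the
XML parser rejects the same input outright, which is the trade-off
between typing convenience and ease of parsing that the essay
describes.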

SAML: The Master Key?

Imagine a day when instead of setting up an account with each
organization you do business with, you set up a single account, which
all the parties can consult. Such a setup could be useful for federal
agencies for a number of reasons. For one, federal employees often
need to access systems and data held by agencies other than their own.
For another, e-government initiatives involve people who often hold
no government-recognized credentials. How does the government
authenticate their identities? The General Services Administration's
E-Authentication Identity Federation initiative can meet these needs,
said David Temoshok, director of identity policy and management at
GSA's Office of Governmentwide Policy. The program is a central hub
for facilitating interactions among different organizations. And one
of the ways E-Authentication can offer this service is through an
emerging Extensible Markup Language-based standard, called the Security
Assertion Markup Language (SAML), which was first developed by OASIS
and later adopted by the Liberty Alliance as the backbone for its
efforts to offer tools for federated network identity... Through the
Liberty Alliance, GSA also maintains a list of SAML-based products
that are interoperable. Like the common terminology, this streamlines
the process of setting up an authenticating relationship with another
party. In September [2007], GSA mandated that all products undergoing
SAML interoperability testing be certified to be interoperable with
Version 2.0 of SAML.

Lombardi Teamworks Conquers BPM with Superb Tools, Flexible Execution

The most well-rounded business process management system (BPMS) we've
tested to date, Lombardi Software's Teamworks combines an execution
and events monitoring engine with a close-knit IDE and tools for
modeling and simulation analysis. With the inclusion of human-centric,
collaborative workflow and services-based integration hooks, Teamworks
can deliver near-seamless mapping, testing, and deployment to execute
most any enterprise workflow. The Activity Wizard made creating rules,
and defining human- and system-side interactions, much easier tasks.
Solid introspection across Java and Web services -- including a new
UDDI tool -- helped hasten discovery and development. Transports are
well represented with SOAP and HTTP/REST-style invocations, as well as
JMS and others. Support for BPMN intermediate events helps you flag
exceptions and initiate compensation rollback procedures in the absence
of more ACID-grade transaction management... Teamworks is rich in
features and strong on tools, with additional perks such as a SharePoint
add-on to build Web parts portlets, good subprocess exposure via Web
services, a connector for Progress Sonic ESB (with hooks to Teamworks
from Progress Actional in the works), and SAML support (one of the few
BPM solutions to make the claim)... One may use popular browsers to
complete process work or integrate with existing portals using JSR-168
or WSRP. On the downside, although Teamworks uses standard BPMN
(Business Process Modeling Notation) for designs, its runtime engine
is proprietary. This could limit execution portability compared to
engines such as BEA/Fuego or Fiorano that handle BPEL natively.

XBRL Versioning Specification Release 1.0

Hugh Wallis (XBRL International, Director of Technical Standards)
announced that the XBRL Versioning Working Group (VWG) has issued the
first Public Working Draft for the Versioning Specification. The
corresponding requirements document, conformance suite, and associated
XBRL Infoset specification are also available online. The XBRL 2.1
specification defines the syntax and semantics of that syntax in order
to define concepts, resources and relationships between those concepts
and concepts and resources in a DTS (Discoverable Taxonomy Set). One of
the most important benefits of the XBRL technology is that XBRL is a
standard about how to define the report content. This means that new
reporting requirements from regulators to regulated entities can always
be incorporated using the same XBRL 2.1 technology without having to
change the XBRL technology. In fact, when a new version of a DTS is
released, the regulated entities will have to adapt existing mappings
between internal data and the old concept definitions to the new concept
definitions. It is expected that for all concepts with no significant
changes the mapping rules can be adapted automatically... This
specification defines the syntax and semantics of the versioning report.
The versioning report represents the changes made to the concept
definitions and resources that exist in two different DTSs. In the
most common use case, the two DTSs will represent two consecutive
versions of the same taxonomy. The most important motivation for the
standardisation of a versioning report is the capacity to communicate
changes in a DTS to taxonomy users. This capacity allows taxonomy users
to identify and apply changes to internal systems. Some of the changes
may be performed automatically by software using the versioning report
and without human intervention and some other changes will always
require human intervention... Because the versioning report is a
communication tool, different taxonomy authors may want to communicate
the same things in different ways. In this sense, the set of possible
differences found between two DTSs can be ignored, considered separately
or grouped together and documented separately in the versioning report.
This specification does not impose any obligation to taxonomy authors
to document changes in a specific way but rather provides a framework
to standardize the way the changes are communicated so applications
can consume that information for their purposes.
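
The idea of a report that lets software separate automatic updates
from those needing human review can be sketched as a plain dictionary
comparison. This does not use the syntax defined by the Versioning
Specification; the concept names, attributes and the diff_concepts
function are hypothetical stand-ins for two versions of a DTS.

```python
def diff_concepts(old, new):
    """Compare two {concept name: definition} mappings and group the differences."""
    old_names, new_names = set(old), set(new)
    shared = old_names & new_names
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        "changed": sorted(n for n in shared if old[n] != new[n]),
        "unchanged": sorted(n for n in shared if old[n] == new[n]),
    }


# Hypothetical concept definitions from two consecutive taxonomy versions.
dts_v1 = {
    "Assets": {"type": "monetary", "period": "instant"},
    "Revenue": {"type": "monetary", "period": "duration"},
    "Goodwill": {"type": "monetary", "period": "instant"},
}
dts_v2 = {
    "Assets": {"type": "monetary", "period": "instant"},
    "Revenue": {"type": "monetary", "period": "instant"},
    "Inventory": {"type": "monetary", "period": "instant"},
}

report = diff_concepts(dts_v1, dts_v2)
```

In this toy model, mappings for "unchanged" concepts could be carried
over automatically, while "changed" and "removed" concepts are the
ones most likely to need a person to look at them.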

Editor's Disposition of Comments Out (OOXML)

The Editor's Disposition of Comments is quite an important document in
the standards development process at ISO. After National Bodies submit
their initial positions and comments on a late draft standard, the
editor of the standard puts together a document to try to satisfy the
various comments. Even though the Disposition of Comments document is
not official, in the sense that anything in it is automatically accepted,
it is usually the starting point for comment resolution, and, given
that most comments are uncontroversial, is often the end-point too.
Monday 14th Jan was the self-imposed deadline for the circulation of
the DIS 29500 Editor's Disposition of Comments. The comments and
disposition documents have been leaked to the web, with no tears from
anyone. Here [chart] is my rough characterization of them... The Editor
(Rex Jaeschke on behalf of ECMA TC45) has accepted the lion's share.
There is a small chunk of comments that are out of scope. There is a
small chunk which the Editor has decided are issues for the maintenance
phase, not the fast-track process: these are typically comments
like 'ODF has feature X, why doesn't OOXML support it?' There is another
chunk of issues where the Editor disagrees with the substance of the
comment, but wants to address the issue by adding clarifying or helpful
text to the specification: for example, the issue of bitmasks is handled
by giving exemplars of how to handle them in XSD, RELAX NG, Schematron,
DTLL and XSLT. And finally, another chunk where the Editor disagrees,
and gives the rationale for the disagreement. Suggested resolutions for
[four issues]: (1) Spreadsheet dates to go back before 1900, and can use
ISO 8601 date format; (2) DEVMODE concerns printer-dependent data which
may be binary: the editor suggests some minimal changes to say
'information' rather than 'data structure' and to show how the system
would work with some future XML-based print structure; (3) VML is being
withdrawn from the places it is used in the specification, which now
use DrawingML; (4) For Maths, the Editor recommends allowing alternative
formats in particular recommending MathML: this is not to replace the
OOXML Maths, but in the context of 'rehydration' which is where you want
to round-trip through systems that don't support your full language...

Dispelling Myths Around ODF

The 'In-Depth Research Overview' published by Burton Group on January 11,
2008 ("What's Up, .DOC? ODF, OOXML, and the Revolutionary Implications
of XML in Productivity Applications") generated a large volume of
response from commentators, including a response from the ODF Alliance
("Open Document Format Alliance Response to the Burton Group's Report").
In this connection, Erwin Tenhumberg's blog has been cited frequently.
He writes: "Recent articles, reports and documents show that there are
still a lot of misperceptions regarding ODF in the market. Apparently,
many people are still not well informed about ODF even though they
choose to write about ODF. Therefore, I thought it can't hurt trying to
dispel a couple of myths around ODF... I hope the information [above]
has helped to dispel a few myths around ODF and helps people reading
articles, blog entries and analyst reports about ODF. From my point of
view, ODF provides all an office suite user needs, and I'm sure that we
will read a lot more great and encouraging news about ODF in 2008.
Some of the "myths" discussed: ODF is controlled by Sun; ODF equals Open
Source; ODF is very closely related to OpenOffice.org; Customers care
about features, not formats; ODF is not being adopted; ODF has a very
limited feature set; The Web makes ODF irrelevant; ODF is incompatible
with Microsoft Office." Also in this connection, Carol Geyer (OASIS)
reported that: "OASIS has been in contact with Burton Group concerning
their report, "What's Up Doc?" -- as have IBM, ODF Alliance, Sun, and
others. As a result, Burton analyst, Guy Creese, has published a blog
post entitled, 'Some Counterpoint to Our ODF/OOXML Report'... Creese
notes that Burton plans to update the report in the next quarter '...
with an eye to (1) accepting or rebutting arguments made by others,
(2) taking into account the result of the ISO vote, and (3) clarifying
points that people misunderstood or misinterpreted.'"

Designing a Service Science Discipline with Discipline

This article has been published in the IBM Systems Journal special
issue on "Service Science, Management, and Engineering" (Volume 47/
Number 1, 2008). The paper "relates our experiences at the University
of California, Berkeley (UC Berkeley), designing a service science
discipline. We wanted to design a discipline of service science in
a principled and theoretically motivated way. We began our work by
asking, 'What questions would a service science have to answer?' and
from that we developed a new framework for understanding service
science. This framework can be visualized as a matrix whose rows are
stages in a service life cycle and whose columns are disciplines that
can provide answers to the questions that span the life cycle. This
matrix systematically organizes the issues and challenges of service
science and enables us to compare our model of a service science
discipline with other definitions and curricula. This analysis
identified gaps, overlaps, and opportunities that shaped the design
of our curriculum and in particular a new survey course that serves
as the cornerstone of service science education at UC Berkeley... In
contrast to service operations, the service-oriented architecture
(SOA) perspective that underlies the design and deployment of Web-based
services views the service life cycle in a nearly opposite way. SOA
methodologies emphasize service design because precise, modular,
specification-of-service interfaces and outputs are essential for
reuse and interoperability. Instead of the highly variable experience
of person-to-person services, service delivery in an SOA context is
efficient and scalable. Service quality is objectively measured and
often governed by service-level agreements that emphasize activities
and measurements of the service provider. Our evaluation of the
service life cycle from different perspectives forced us to confront
the semantic challenge of harmonizing the conceptual and linguistic
categories of different disciplines so that we could frame questions
in ways that all of us could accept and understand..." [Note: Bob
Glushko (Center for Document Engineering, UC Berkeley) is a member
of the OASIS Board of Directors.]

The Current State-of-art in Newspaper Digitization: A Market Perspective

In the last few years, as digitization has gradually moved from an
experimental and temporal activity towards one that is structural and
continuous, mass digitization projects have been gaining ground. Almost
simultaneously with the 'coming-of-age' of digitization, an increasing
number of large-scale newspaper digitization projects (Austria, Australia,
Belgium, Finland, Chile, Sweden, New Zealand, USA) have emerged. From
2007 to 2011, within the framework of the project Databank of Digital
Daily Newspapers (DDD), the Koninklijke Bibliotheek (KB, the National
Library of the Netherlands) will digitize and put online 8 million pages
from a selection of national, regional, local and colonial Dutch daily
newspapers. Focal points in this survey of current practices included:
digital imaging technology, OCR, zoning and segmentation, metadata
extraction, searchability and web delivery systems. Many of the surveyed
companies are involved in developing zoning and segmentation techniques.
Some offer the whole process from digitization to segmentation and
presentation as a package deal. Other companies have a modular approach;
they deliver XML-based, segmented newspaper pages and offer the use of
their presentation and search systems as options... For zoning and
segmentation about half of all survey respondents use the ALTO-format.
ALTO (Analyzed Layout and Text Object) is a standardized XML format for
storing layout and content information. Some advanced segmentation
techniques can automatically recognize and capture article headlines,
page numbers and publication dates. The initial results after automated
segmentation are largely determined by the level of irregularity in the
layout. Nearly all respondents are able to provide basic metadata such
as newspaper title, issue, page, article headline, etc. They support
export of these elements to Dublin Core, METS, NewsML and custom-made
schemas.
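
To make the ALTO idea concrete, here is a minimal sketch of reading text back out of a segmented page. The fragment below is a hypothetical, heavily simplified sample: it uses the TextBlock, TextLine and String element names and the CONTENT attribute from ALTO, but omits the namespace and most attributes, and the helper name block_texts is an assumption made for illustration, not part of any surveyed product.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified ALTO-like fragment (namespace and most
# attributes omitted; real files carry far more layout detail).
SAMPLE = """
<alto>
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="City" HPOS="100" VPOS="200"/>
            <String CONTENT="Council" HPOS="160" VPOS="200"/>
          </TextLine>
          <TextLine>
            <String CONTENT="Meets" HPOS="100" VPOS="230"/>
          </TextLine>
        </TextBlock>
        <TextBlock ID="TB2">
          <TextLine>
            <String CONTENT="Weather" HPOS="400" VPOS="200"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def block_texts(alto_xml):
    """Return {TextBlock ID: text}, with words joined by spaces
    within a line and lines joined by newlines within a block."""
    root = ET.fromstring(alto_xml)
    result = {}
    for block in root.iter("TextBlock"):
        lines = []
        for line in block.iter("TextLine"):
            words = [s.get("CONTENT", "") for s in line.iter("String")]
            lines.append(" ".join(words))
        result[block.get("ID")] = "\n".join(lines)
    return result


texts = block_texts(SAMPLE)
```

Because each word keeps its own position attributes (HPOS, VPOS), a delivery system could in principle highlight search hits on the page image; this sketch only recovers the text per block.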

A WebDAV Search Grammar for XML Properties

This version -00 Internet Draft specifies XS:xml-search, an optional
search grammar for use with the Web Distributed Authoring and
Versioning (WebDAV) SEARCH protocol. The full expressive power of
XPath may exceed what simple use cases require; therefore, some
provisions are made in order to reduce the cost of implementing this
specification, as well as the computational cost of evaluating allowed
queries. The intent of the document is to extend the 'DAV:basicsearch'
grammar for dealing with properties whose values are XML fragments.
Since the WebDAV property namespace is flat, and resources may have at
most one value for a property of a given name, XML documents allowing
repeatable elements cannot be expressed as a set of independent WebDAV
properties (i.e., by mapping some elements to properties), and the
'DAV:basicsearch' schema cannot be applied to such XML content because
it deals with property values as a whole. 'XS:xml-search' is proposed
as a different search grammar because it defines a new element
(namely XS:filter) that modifies the query semantics. XML Extensibility:
The extensibility mechanism from Section 17 of RFC 4918 (i.e., to
process received XML documents as if unexpected elements and attributes,
and all children of unrecognized elements, were not present) may be
inappropriate when dealing with queries because they would not be
evaluated as specified by the client (e.g., the query criteria may be
loosened or the result record may be incomplete). The omission of
unexpected content might not be realized by the client. The security
considerations of WebDAV SEARCH and WebDAV (RFC 4918), as well as
those of HTTP/1.1 and XML are applicable to the WebDAV extension
described in this document.
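
The problem of repeatable elements inside a single property value can be illustrated with a small sketch. This is not the grammar defined by the draft: the resource paths, the "authors" fragment and the search helper are all hypothetical, and the limited XPath subset built into Python's ElementTree merely stands in for the idea of a restricted path language.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: each resource has ONE property value, and that
# value is an XML fragment that may repeat the <author> element.
RESOURCES = {
    "/docs/a.txt": "<authors><author>Ann</author><author>Bob</author></authors>",
    "/docs/b.txt": "<authors><author>Cy</author></authors>",
}


def search(resources, path, text):
    """Return the hrefs whose property fragment contains an element
    at `path` (a restricted, ElementTree-style path) whose text is
    exactly `text`."""
    hits = []
    for href, fragment in sorted(resources.items()):
        root = ET.fromstring(fragment)
        # Look inside the fragment instead of comparing the
        # property value as a whole.
        if any(el.text == text for el in root.findall(path)):
            hits.append(href)
    return hits
```

Here search(RESOURCES, "author", "Bob") returns ["/docs/a.txt"], even though "Bob" is only the second of two repeated elements. A comparison against the whole property value could not express that match.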

Generate Ajax J2EE Web Applications with jpa2web

This article presents a new open source tool (jpa2web) which generates
J2EE Ajax-based Web applications from JPA-annotated beans. Using the
ZK framework, the applications generated by this tool allow your users
to add, delete, search, modify, and interconnect instances of
database-synchronized objects in a friendly, Ajax-based Web user
interface. ZK is an open source, Ajax Web framework used to create a
rich user interface for Web applications, with little programming and
no JavaScript code necessary. With ZK, you can design Web applications
much as if they were desktop applications. ZK takes care of the client
and server Ajax processing. All that needs to be done is to specify
the user interface by creating simple XML files (called zul files) and
scripting the event handlers in the language of choice: Java code
(compiled), Bean Shell (interpreted Java), Groovy, Ruby, JavaScript,
and some others. Readily available tools have dramatically reduced
the impedance created between Java objects and their database storage;
especially, the ease with which Java classes can now be annotated to
specify the way objects should be persisted. Developers are freed from
the onerous task of writing database integration code. Hibernate
solves the persistence issue; however, Web pages need to be created to
handle these elements. A typical scenario for a medium-sized Web
application can proceed something like this: The developer starts by
coding the Plain Old Java Objects (POJOs) that represent a particular
domain model, and then proceeds to create the different transactions
and the Web user interface. A subset of the elements of the model will
frequently involve non-transactional data. Customers, clients,
countries, locations, employees, and companies are typical elements
of a business model that are maintained by a few operators. Why not
generate the Web presentation for these elements directly from the
annotated beans? And why not make this presentation a friendly Ajax
experience?
Regardless of its limitations, jpa2web is a useful tool in many
scenarios. You can use it to quickly generate the Web interfaces for
non-transactional elements just by having the annotated beans. You can
also use it for testing by creating the necessary instances of objects
in a database, thus avoiding verbose entity creation scripts.

Create Rich Applications with JavaFX Script

JavaFX Script, which made its debut last spring, is a scripting
language designed to facilitate the creation of rich client and
Internet applications. It runs on
top of Java Platform, Standard Edition 6 (Java SE) and makes it easy
to code sophisticated user interfaces. The language is highly portable
and can run on any system that supports Java technology, without local
installation. It uses underlying Java technologies to let you create
GUIs of any size or complexity easily. This article walks you through
the basics of the JavaFX Script language and uses a sample application
to introduce some UI components. JavaFX Script is a statically typed
language, which means that the data type of every variable, parameter,
and method return value is known at compile time. JavaFX Script is
also a declarative programming language: it describes what the
application is like rather than how to create it. The algorithm that
determines how to display the application on the screen is left to the
support software (Swing's Java 2D APIs). Because of these traits,
JavaFX Script is well suited for GUI creation. JavaFX Script
licensing: JavaFX Script, as of this writing, is one of a family of
JavaFX products from Sun Microsystems. (The only other family member
currently is JavaFX Mobile, an operating and application environment
for Java technology-enabled devices.) Sun has announced that JavaFX
Script will be licensed under the GNU General Public License v2 in the future.
Meanwhile, the JavaFX Community is built around sharing early versions
of the JavaFX Script language and collaborating on its development.

New Metro Policy Project

Metro (a high-performance, extensible, easy-to-use web service stack)
has a new Policy Project. Description: The policy project has two
distinct goals: (1) In the short term, make JAX-WS WS-Policy-aware by
moving out the generic policy code from project Tango. (2) In the
long term, provide a common, abstract policy API layer. The API design
should be independent of any particular policy expression language.
Instead it should be use case driven and ease-of-use oriented. The
project is inspired by the effort and experience gained with WS-Policy
and other policy languages for web services in project Tango. Unlike
the WS-Policy implementation in project Tango, this project is meant
to approach the policy domain in a much more general sense and to
enable policy features in a wide spectrum of Java SE and Java EE
applications. Fabian Ritzmann writes: "We are currently working on
moving the existing policy code out of Project Tango and into a
separate workspace. This open source project is hosted as a part of
the Metro community within GlassFish on java.net... Essentially, we
are moving up the policy code in the Metro stack from Tango to a base
library..."

Tracking XML Data Changes Easily with SDO

Tracking data changes is an essential requirement in many software,
application, and business-integration scenarios. Rigorously implementing
this requirement is relatively difficult because modeling and working
with the delta for typical changes is generally very involved. On the
other hand, repetitively implementing it in all the applicable
situations is wasteful, since a single proper model for the delta is
suitable for many situations and, in most cases, the requirements
are similar. Service Data Object (SDO), a BEA Systems and IBM-led JSR
defining a generic solution for heterogeneous data access, provides
developers with an easy-to-implement mechanism for tracking data history
at the system level. This article shows an example of processing XML
data with SDO using version 1.0 of Apache Tuscany, a Java implementation
of SDO. Since SDO is not (yet) the standard solution for XML processing,
the article also covers basic XML data operations in SDO to provide
context. The XML data-processing example in this article assumes the
following three phases with a separate party responsible for each phase:
(1) Create, (2) Process, and (3) Review. The XML data are transported
between these phases (and parties) through a file system. The central
scenario of this example is as follows: the second party needs to record
changes he/she makes to an XML file created by the first party, and
when the third party reviews the XML data, he/she hopes to know of such
changes. If you use the Track Changes feature of Microsoft Word, you
will recognize the value of these requirements immediately. Many
applications have these requirements, including optimistic
concurrency-control implementation, synchronization of offline
application data with an active database, and business process management
(BPM) systems. We now demonstrate how SDO helps to i