Tuesday, February 12, 2008

W3C's Excessive DTD Traffic

If you view the source code of a typical web page, you are likely to
see a DOCTYPE declaration and an XML namespace attribute near the top.
These statements refer to HTML DTDs and namespace documents hosted on W3C's
site. Note that these are not hyperlinks; these URIs are used for
identification. This is a machine-readable way to say "this is HTML".
In particular, software does not usually need to fetch these resources,
and certainly does not need to fetch the same one over and over! Yet we
receive a surprisingly large number of requests for such resources: up
to 130 million requests per day, with periods of sustained bandwidth
usage of 350 Mbps, for resources that haven't changed in years.

The vast majority of these requests are from systems that are processing
various types of markup (HTML, XML, XSLT, SVG) and in the process doing
something like validating against a DTD or schema.

Handling all these requests costs us considerably: servers, bandwidth
and human time spent analyzing traffic patterns and devising methods to
limit or block excessive new request patterns. We would much rather use
these assets elsewhere, for example improving the software and services
needed by W3C and the Web Community.

You might think something like "don't request the same resource
thousands of times a day, especially when it explicitly tells you it
should be considered fresh for 90 days" would be obvious, but
unfortunately it seems not. At the W3C Systems Team's request the W3C
TAG has agreed to take up the issue of "Scalability of URI Access to
Resources."
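
For software that really does need the contents of a DTD or schema, the
advice above amounts to: fetch it once, keep it, and reuse the copy
while it is still fresh. The following is a minimal sketch of that idea
in Python, not a description of any particular tool. The names DtdCache
and fetch_over_http, the User-Agent string, and the fixed 90-day
lifetime are assumptions made for illustration; a more careful client
would take the lifetime from the caching headers of the HTTP response
rather than hard-coding it, and would probably store copies on disk.

```python
import time
import urllib.request
from typing import Callable, Dict, Tuple


def fetch_over_http(uri: str) -> bytes:
    """Fetch a resource over HTTP with an identifiable User-Agent."""
    request = urllib.request.Request(
        uri,
        headers={"User-Agent": "example-dtd-cache/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


class DtdCache:
    """Keeps fetched DTDs and schemas in memory until they go stale."""

    def __init__(
        self,
        fetch: Callable[[str], bytes] = fetch_over_http,
        # Assumption: 90 days, the freshness period mentioned in the post.
        max_age_seconds: float = 90 * 24 * 60 * 60,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        # Maps URI -> (time the copy was fetched, body of the resource).
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get(self, uri: str) -> bytes:
        """Return the body for uri, fetching only if missing or stale."""
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]  # Still fresh: no network request at all.
        body = self._fetch(uri)
        self._entries[uri] = (now, body)
        return body
```

A long-running service would create one DtdCache at startup and call
get() wherever it previously fetched the DTD directly, so that
thousands of documents processed per day cost at most one request per
URI per freshness period. The fetch and clock parameters exist only so
the cache can be exercised without touching the network.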

1 comment:

Edson said...

I get the error below in many of our company's web services:

System.Web.HttpUnhandledException: System.Xml.Schema.XmlSchemaException: The global attribute ‘http://www.w3.org/XML/1998/namespace:lang’ has already been declared. at

As these web services are called widely by our applications, I suspect that the errors (which are many and frequent) occur because of excessive calls to W3C. How can I find out whether this is the reason and, if so, how can I fix it? Is there any document or article that I can consult?
Thank you