Xponent's Mostly XML Blog


Http Overload

An article (http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic) posted on the W3C team blog by Ted Guild on February 8, 2008 describes a significant problem that is easily preventable if developers exercise due diligence. Guild explains that software applications needlessly fetch the HTTP URIs that appear in XML documents, which places excessive and unnecessary load on W3C servers. Guild gives the following two examples:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" ...>

Guild points out that "these are not hyperlinks...." and further that "software does not usually need to fetch these resources." He offers several suggestions to developers on how to prevent this problem, but does not give any code examples.

Microsoft developers using the XmlReader class to parse XML simply have to add the following two lines to their code to prevent the XmlReader from accessing URIs referenced in a DOCTYPE declaration:

settings.ProhibitDtd = false;
settings.XmlResolver = null;

where settings is an instance of the XmlReaderSettings class. Setting ProhibitDtd to false prevents the XmlReader from throwing an exception when it encounters a DTD reference; leaving it at true (the default) makes the reader throw and abort further parsing. Setting the XmlResolver to null causes the reader to ignore the externally referenced DTD. Together, the two settings allow the reader to parse the entire XML document without ever requesting the externally referenced DTD.
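To put the two lines in context, here is a minimal sketch of a complete program that reads a document carrying the XHTML DOCTYPE from Guild's example. The class name DtdSafeReader, the method ReadRootElementName, and the sample document are illustrative assumptions, not code from Guild's article. Note also that on .NET Framework 4.0 and later ProhibitDtd is marked obsolete in favor of the DtdProcessing property, so the compiler will likely warn there.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, named here only for illustration.
static class DtdSafeReader
{
    // Parses the XML text and returns the name of its root element.
    // The DOCTYPE declaration is accepted, but the external DTD is never fetched.
    public static string ReadRootElementName(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ProhibitDtd = false;  // do not throw when a DOCTYPE is present
        settings.XmlResolver = null;   // do not resolve (download) external resources

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // MoveToContent skips the DOCTYPE node and stops on the root element.
            reader.MoveToContent();
            return reader.Name;
        }
    }

    static void Main()
    {
        // Sample document (an assumption) using the DOCTYPE from Guild's example.
        string xml =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" " +
            "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">" +
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body/></html>";

        Console.WriteLine(ReadRootElementName(xml));
    }
}
```

One caveat with this approach: because the DTD is never loaded, entities that it declares (for example &nbsp; in XHTML) are unknown to the reader, so a document that uses them will fail to parse unless the DTD is supplied from a local copy.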

Guild states "Yet we receive a surprisingly large number of requests for such resources: up to 130 million requests per day, with periods of sustained bandwidth usage of 350Mbps, for resources that haven't changed in years." In a follow-up comment on his own article, dated June 15, 2009, Guild adds "Java based applications and libraries are presently accounting for nearly 1/4th of our DTD traffic (in the hundred of millions a day). There is also another more substantial source of traffic which the vendor is working to correct in the hopefully near future."

Visit the W3C blog post referenced above for the latest developments on this issue.

Submitted by Bill Conniff, Founder of Xponent, on October 23, 2009



