5 Replies Latest reply on Dec 31, 2007 8:55 AM by ericjava

    Don't use DTDs in your .xml files

      Here's a quick blog entry on some of the fun that can arise when using DTD declarations in your .xml files within Seam:

      http://chiralsoftware.com/blog/No-route-to-host-while-parsing-b6b5f0cbb79e5cd8.html

      Short conclusion: don't.

        • 1. Re: Don't use DTDs in your .xml files
          yilmaz_

          That is not true. A schema provides validation templates for your XML file, which gives you platform independence. These files are bundled with the jar files. If dom4j cannot find one, it downloads it from the internet. I think this guy either has no knowledge of this or has some serious configuration issues.

          • 2. Re: Don't use DTDs in your .xml files

            I think the problem is not that you were using a DTD but that you were using the wrong DTD. There is a Seam 2 pages DTD. If you use that instead of the 1.2 DTD you should be OK.

            • 3. Re: Don't use DTDs in your .xml files

               

              "yilmaz_" wrote:
              That is not true. A schema provides validation templates for your XML file.


              Right, but the point is, if the parser can't find the schema file, it doesn't try to fetch it over the net.

              "yilmaz_" wrote:
              If dom4j cannot find it, it downloads it from the internet.


              And I'm making the point that that is a bad thing.

              "yilmaz_" wrote:
              I think this guy has no knowledge about this or he has some serious configuration issues.


              Well, obviously the configuration issue is that there is an error in a pages.xml file. What's bad is how this system responded to the error.

              A good response: "In the file pages.xml, you refer to DTD: http://... which isn't in the classpath."

              A bad response: silently making an outgoing network connection, and then failing with a "no route to host" error without even telling me which file it's trying to get.

              And then I go on to make the point that if any website is using dom4j to parse user-supplied XML documents, it's possible to create a document which contains a line with a malicious DTD URL, and that could in fact be exploitable.
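To make the user-supplied-XML concern concrete, here is a minimal sketch of one way to refuse documents that carry a DOCTYPE at all. It uses the JDK's built-in JAXP parser rather than dom4j, and the class name `NoDoctypeDemo` and the method `accepts` are made up for illustration. The feature URI is the Xerces `disallow-doctype-decl` feature, which the JDK's bundled parser recognises; other parser implementations may not.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

public class NoDoctypeDemo {

    /** Returns true if the document was parsed, false if the parser rejected it. */
    public static boolean accepts(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Xerces feature: treat any DOCTYPE declaration as a fatal error,
            // so no DTD URL in the document is ever dereferenced.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.newDocumentBuilder()
                   .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Malformed XML or a forbidden DOCTYPE both end up here.
            return false;
        } catch (Exception e) {
            // Configuration or I/O problems are not expected in this sketch.
            throw new IllegalStateException(e);
        }
    }
}
```

With this in place, a document whose DOCTYPE points at an attacker-chosen URL should be rejected before any connection is attempted. That is too strict for configuration files such as pages.xml, but it may be a reasonable default for untrusted input.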

              I understand DTDs perfectly well, but in a large application you can expect that some pages.xml file somewhere will still reference an old DTD after a switch to a newer version of the JSF jar or whatever. The result is one behaviour with a network connection and a different behaviour without one, which is really bad.

              dom4j shouldn't be doing this kind of thing.

              • 4. Re: Don't use DTDs in your .xml files
                pmuir

                Please file a JIRA issue for a better error message from Seam if the DTD/XSD can't be found.

                • 5. Re: Don't use DTDs in your .xml files

                  Done:

                  http://jira.jboss.org/jira/browse/JBSEAM-2441

                  But this really is a long-standing bug / deficiency in dom4j. It should have configuration options for what to do if a DTD can't be found:

                  1. Silently ignore the problem and continue parsing
                  2. Throw an exception that says, "I don't have that DTD in my classpath." This should be the default behaviour.
                  3. Attempt to fetch the resource over the net. This should need to be explicitly configured, because it's almost always the wrong thing to do, and is probably going to fail anyway.
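The three options above can be approximated today with a SAX `EntityResolver`. What follows is only a sketch, not how dom4j or Seam actually behave: it uses the JDK's JAXP `DocumentBuilder` so that it runs without extra jars (dom4j's `SAXReader` also accepts a resolver through `setEntityResolver`), and the names `DtdPolicyDemo`, `MissingDtdPolicy`, `resolver` and `tryParse` are hypothetical. The map of bundled DTDs stands in for a classpath lookup.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DtdPolicyDemo {

    /** Hypothetical names for the three options in the list above. */
    public enum MissingDtdPolicy { IGNORE, FAIL, FETCH }

    /** Serves known DTDs from memory and applies the policy to unknown ones. */
    public static EntityResolver resolver(Map<String, String> bundledDtds,
                                          MissingDtdPolicy policy) {
        return (publicId, systemId) -> {
            String dtd = bundledDtds.get(systemId);
            if (dtd != null) {
                InputSource source = new InputSource(new StringReader(dtd));
                source.setSystemId(systemId);
                return source;
            }
            switch (policy) {
                case IGNORE:
                    // Option 1: substitute an empty DTD and keep parsing offline.
                    return new InputSource(new StringReader(""));
                case FAIL:
                    // Option 2: fail fast and name the DTD that is missing.
                    throw new SAXException(
                        "DTD is not bundled with the application: " + systemId);
                default:
                    // Option 3: null lets the parser open the system ID itself,
                    // which may mean an outgoing network connection.
                    return null;
            }
        };
    }

    /** Parses the document and reports either the root element or the failure. */
    public static String tryParse(String xml, EntityResolver resolver) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(resolver);
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return "parsed <" + doc.getDocumentElement().getTagName() + ">";
        } catch (Exception e) {
            return "failed: " + e.getMessage();
        }
    }
}
```

Under the `FAIL` policy, a pages.xml that references a DTD missing from the map should produce a message naming that DTD's URL instead of a bare "no route to host". `FETCH` is included only for completeness and is the behaviour argued against above.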