Suggested patches for resolver: Windows drive-letter paths and resolveSystem() and <uri>

While working with xml-commons-resolver, I discovered that the code does not handle pathnames that use Windows drive letters. The code appears to lose the "absoluteness" of the path, so resolution of other entities/files that use it as a base fails.

Also, and probably a little more controversial, is the resolution of system IDs. I noticed that resolveSystem() does not do a resolveURI() if no system mapping exists. This is a problem when using the <schemavalidate> task in Ant. I'm working with catalogs that contain numerous <uri> entries to remap http URLs to local file URLs.

Unfortunately, Ant/Xerces fails to resolve to the local URLs because resolveSystem() is used (because the URLs appear in SYSTEM identifiers in the documents).

The XML resolution spec is not clear on whether the resolver should also check <uri> entries or whether the XML parser should do a URI lookup if a SYSTEM lookup fails (it appears Saxon may actually do this, since it does not have the problem that Ant/Xerces does).

To address the immediate problem, I changed resolveSystem() to call resolveURI() if it fails to find anything. The change is in the attached Catalog.java.patch (the patch also includes the Windows pathname fix).

Are any of these changes worth including in the resolver code base?

--ewh
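
For readers following the drive-letter problem, here is a minimal Java sketch of one way to turn a Windows drive-letter path into an absolute file: URI so that relative references still resolve against it. It is not the attached patch and not resolver code; the class DrivePaths, its method names, and the sample paths are all hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper for illustration; not part of xml-commons-resolver. */
final class DrivePaths {
    private DrivePaths() {}

    /** Returns true if the path starts with a drive letter such as "C:\" or "C:/". */
    static boolean hasDriveLetter(String path) {
        return path.length() >= 3
                && Character.isLetter(path.charAt(0))
                && path.charAt(1) == ':'
                && (path.charAt(2) == '\\' || path.charAt(2) == '/');
    }

    /** Converts a Windows drive-letter path to an absolute file: URI. */
    static URI toFileUri(String path) {
        if (!hasDriveLetter(path)) {
            throw new IllegalArgumentException("Not a drive-letter path: " + path);
        }
        // Use forward slashes and add a leading "/" so the drive letter becomes
        // the first path segment instead of being mistaken for a URI scheme.
        String normalized = "/" + path.replace('\\', '/');
        try {
            // The multi-argument constructor percent-encodes characters such as spaces.
            return new URI("file", null, normalized, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot convert path: " + path, e);
        }
    }

    public static void main(String[] args) {
        URI base = toFileUri("C:\\catalogs\\main\\catalog.xml");
        System.out.println(base);
        // A relative reference resolves against the absolute base.
        System.out.println(base.resolve("dtd/book.dtd"));
    }
}
```

The point of the leading "/" is that a string like "C:/catalogs/..." would otherwise parse as a URI whose scheme is "C", which is one plausible way the absoluteness of a base can get lost.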

Re: Suggested patches for resolver: Windows drive-letter paths and resolveSystem() and <uri>

Hi Earl,

The change you're suggesting to resolveSystem() would break compatibility. It also doesn't fit with the semantics of the method, though I'm sure you knew that before you suggested it. Any user of the resolver already has the power to call resolveURI() after resolveSystem() if they choose to, or to make any other sequence of calls they want on the Catalog. Have you considered asking the Ant developers to modify the behaviour of the <schemavalidate> task or provide some way to tune it?

As for your Windows drive letter patch, you should attach that to a Bugzilla issue [1]. A warning though: no one has been maintaining the codebase these days, so I can't say when that would get reviewed or committed. It's probably going to take a developer with an itch to scratch to get things moving again.

Thanks.

[1] https://issues.apache.org/bugzilla/

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@...

earlhood@... wrote on 04/14/2008 07:00:01 PM:

> While working with xml-commons-resolver, I discovered that the code
> does not handle pathnames that use Windows drive letters. The code
> appears to lose the "absoluteness" of the path, so resolution of
> other entities/files that use it as a base fails.
>
> Also, and probably a little more controversial, is the resolution of
> system IDs.
> I noticed that resolveSystem() does not do a resolveURI() if no system
> mapping exists. This is a problem when using the <schemavalidate>
> task in Ant. I'm working with catalogs that contain numerous <uri> entries
> to remap http URLs to local file URLs.
>
> Unfortunately, Ant/Xerces fails to resolve to the local URLs because
> resolveSystem() is used (because the URLs appear in SYSTEM identifiers
> in the documents).
>
> The XML resolution spec is not clear on whether the resolver should also
> check <uri> entries or whether the XML parser should do a URI lookup if a
> SYSTEM lookup fails (it appears Saxon may actually do this, since it does
> not have the problem that Ant/Xerces does).
>
> To address the immediate problem, I changed resolveSystem() to call
> resolveURI() if it fails to find anything. The change is in the attached
> Catalog.java.patch (the patch also includes the Windows pathname fix).
>
> Are any of these changes worth including in the resolver code base?
>
> --ewh
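
The caller-side sequence Michael describes (resolveSystem() first, then resolveURI() only if nothing was found) might look roughly like the sketch below. To keep it self-contained it uses a hypothetical stand-in, TinyCatalog, with two plain maps instead of the real Catalog class; the assumption that a failed lookup returns null, the helper name resolveSystemThenUri, and the example.org URLs are all illustrative.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical caller-side helper; the catalog's own semantics stay untouched. */
final class FallbackLookup {
    private FallbackLookup() {}

    /** Tries system entries first and falls back to uri entries only on a miss. */
    static String resolveSystemThenUri(TinyCatalog catalog, String systemId) {
        String resolved = catalog.resolveSystem(systemId);
        return (resolved != null) ? resolved : catalog.resolveURI(systemId);
    }

    public static void main(String[] args) {
        TinyCatalog catalog = new TinyCatalog();
        catalog.addUri("http://example.org/schemas/book.xsd", "file:/C:/schemas/book.xsd");

        String id = "http://example.org/schemas/book.xsd";
        System.out.println("system only:   " + catalog.resolveSystem(id));
        System.out.println("with fallback: " + resolveSystemThenUri(catalog, id));
    }
}

/** Hypothetical stand-in for a catalog that keeps system and uri entries apart. */
final class TinyCatalog {
    private final Map<String, String> systemEntries = new HashMap<>();
    private final Map<String, String> uriEntries = new HashMap<>();

    void addSystem(String systemId, String target) { systemEntries.put(systemId, target); }

    void addUri(String name, String target) { uriEntries.put(name, target); }

    /** Consults only system entries; returns null when nothing matches. */
    String resolveSystem(String systemId) { return systemEntries.get(systemId); }

    /** Consults only uri entries; returns null when nothing matches. */
    String resolveURI(String uri) { return uriEntries.get(uri); }
}
```

Keeping the fallback in the caller means code that relies on resolveSystem() returning nothing for unmapped system IDs keeps working, which is the compatibility concern raised above.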

Re: Suggested patches for resolver: Windows drive-letter paths and resolveSystem() and <uri>

On April 17, 2008 at 00:49, Michael Glavassevich wrote:

> The change you're suggesting to resolveSystem() would break compatibility.
> It also doesn't fit with the semantics of the method though I'm sure you
> knew that before you suggested it. Any user of the resolver already has the
> power to call resolveURI() after resolveSystem() if they choose to or make
> any other sequence of calls they want on the Catalog. Have you considered
> asking the Ant developers to modify the behaviour of the <schemavalidate>
> task or provide some way to tune it?

I think it may be Xerces, since that is what Ant uses by default. When I get time, I can examine the Xerces code to see if I can provide a patch for it.

A concern I have about the resolving algorithm, as it is described in the W3C doc, is that it does not appear to say how <uri> entries are to be handled. It seems that to properly resolve something, a <uri> entry check should always be done, probably at the end of resolving a public ID, system ID, or entity.

If the resolver code does not do this, at least the entity manager (which in essence is a "resolver") should. If it is something that all entity managers should do, why not encapsulate it in the resolver?

> As for your Windows drive letter patch you should attach that to a Bugzilla
> issue [1]. A warning though ... No one has been maintaining the codebase
> these days so can't say when that would get reviewed or committed. Probably
> going to take a developer with an itch to scratch to get things moving
> again.
> ...
> [1] https://issues.apache.org/bugzilla/

Thanks for the pointer.

--ewh

--
Earl Hood, <earl@...>
Web: <http://www.earlhood.com/>
PGP Public Key: <http://www.earlhood.com/gpgpubkey.txt>

Re: Suggested patches for resolver: Windows drive-letter paths and resolveSystem() and <uri>

Earl Hood <earl@...> wrote on 04/17/2008 10:32:43 AM:

> On April 17, 2008 at 00:49, Michael Glavassevich wrote:
>
> > The change you're suggesting to resolveSystem() would break compatibility.
> > It also doesn't fit with the semantics of the method though I'm sure you
> > knew that before you suggested it. Any user of the resolver already has the
> > power to call resolveURI() after resolveSystem() if they choose to or make
> > any other sequence of calls they want on the Catalog. Have you considered
> > asking the Ant developers to modify the behaviour of the <schemavalidate>
> > task or provide some way to tune it?
>
> I think it may be Xerces since that is what Ant uses by default.
> When I get time, I can examine the Xerces code to see if I can
> provide a patch for it.

It is Ant, or whatever application uses the parser, that is in control of resource resolution. EntityResolver, LSResourceResolver and their friends are just interfaces. Xerces calls whatever implementation has been registered with it. It's the application's responsibility to choose or write an implementation which does what it needs.

> A concern I have about the resolving algorithm, as it is described
> in the W3C doc, is that it appears to lack how <uri> entries are to
> be handled. It seems to properly resolve something, a <uri> entry
> check should always be done, probably at the end of resolving
> a public ID, system ID, or entity.
>
> If the resolver code does not do this, at least the entity manager
> (which in essence is a "resolver") should. If it is something
> that all entity managers should do, why not encapsulate it in the
> resolver?
>
> > As for your Windows drive letter patch you should attach that to a Bugzilla
> > issue [1]. A warning though ... No one has been maintaining the codebase
> > these days so can't say when that would get reviewed or committed. Probably
> > going to take a developer with an itch to scratch to get things moving
> > again.
> ...
> > [1] https://issues.apache.org/bugzilla/
>
> Thanks for the pointer.
>
> --ewh
> --
> Earl Hood, <earl@...>
> Web: <http://www.earlhood.com/>
> PGP Public Key: <http://www.earlhood.com/gpgpubkey.txt>

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@...

Re: Suggested patches for resolver: Windows drive-letter paths and resolveSystem() and <uri>

Earl Hood wrote:

> While working with xml-commons-resolver, I discovered that the code
> does not handle pathnames that utilize window's driver letters. The code
> appears to lose the "absoluteness" of the path, causing resolution of
> other entities/files to fail that have it for a base.

Earl, I don't know if this is related, but at Apache Forrest we have trouble in a certain situation. The newest Resolver works fine on Windows when we use it with Xerces via Apache Cocoon, but fails when we use it via Apache Ant. Last time I looked, none of our Windows developers had yet found time to take it up with the Ant project.

Here is my reply to Norman Walsh on this commons-dev list:

http://markmail.org/message/4vozuvf2gwwuk33k
Re: File URLs will be the death of me
Date: 03 Jul 2006

which also links to an issue report.

When I investigated that issue, I found that Ant had some strange code for Windows path handling in the "xmlcatalog" task. It seemed to be a workaround for problems with Resolver. After Norm fixed it here at Apache XML Commons, perhaps the workaround in Ant now fails. Dunno.

-David

Re: Suggested patches for resolver: Windows drive-letter paths and resolveSystem() and <uri>

On April 17, 2008 at 00:49, Michael Glavassevich wrote:

> The change you're suggesting to resolveSystem() would break compatibility.
> It also doesn't fit with the semantics of the method though I'm sure you
> knew that before you suggested it. Any user of the resolver already has the
> power to call resolveURI() after resolveSystem() if they choose to or make
> any other sequence of calls they want on the Catalog. Have you considered
> asking the Ant developers to modify the behaviour of the <schemavalidate>
> task or provide some way to tune it?

Maybe the patch should go into the classes in the tools area rather than into Catalog itself. That is, the classes that actually implement the EntityResolver and URIResolver interfaces should be updated to do a URI-map lookup. This way, Catalog semantics are preserved, but the resolving classes are fixed.

Thoughts?

--ewh
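
As a rough illustration of that idea, the sketch below puts the uri fallback inside an org.xml.sax.EntityResolver implementation rather than in the catalog. It is not the resolver's actual tools code: the class name UriFallbackEntityResolver is hypothetical, two plain maps stand in for the catalog's system and uri entries, and the example.org URLs are made up.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver class: system lookup first, then a uri-map lookup. */
final class UriFallbackEntityResolver implements EntityResolver {
    private final Map<String, String> systemEntries;
    private final Map<String, String> uriEntries;

    UriFallbackEntityResolver(Map<String, String> systemEntries,
                              Map<String, String> uriEntries) {
        this.systemEntries = new HashMap<>(systemEntries);
        this.uriEntries = new HashMap<>(uriEntries);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId == null) {
            return null; // nothing to look up; let the parser use its default
        }
        String target = systemEntries.get(systemId);
        if (target == null) {
            // The fallback lives here, so the catalog lookups keep their semantics.
            target = uriEntries.get(systemId);
        }
        // Returning null tells the parser to open the original system ID itself.
        return (target == null) ? null : new InputSource(target);
    }

    public static void main(String[] args) {
        Map<String, String> system = new HashMap<>();
        Map<String, String> uri = new HashMap<>();
        uri.put("http://example.org/schemas/book.xsd", "file:/C:/schemas/book.xsd");

        UriFallbackEntityResolver resolver = new UriFallbackEntityResolver(system, uri);
        InputSource source = resolver.resolveEntity(null, "http://example.org/schemas/book.xsd");
        System.out.println(source.getSystemId());
    }
}
```

An application would register such a class through the standard hooks, for example DocumentBuilder.setEntityResolver() or XMLReader.setEntityResolver(), which matches the earlier point that the application, not the parser, decides how resources are resolved.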