[Proposal] fix http.getFileList()

by Joakim Erdfelt-2

I'd like to fix a few faults in ...

https://svn.apache.org/repos/asf/maven/wagon/trunk/wagon-providers/wagon-http-shared/src/main/java/org/apache/maven/wagon/shared/http/HtmlFileListParser.java

... with regard to detecting links.

JTidy is the main culprit. The other fault is the use of various plexus
string utility functions to determine whether a link belongs to the page
itself rather than to the parent folder or a downstream folder, and to
detect whether it is an absolute file or a dynamic file. (This is not a
problem with plexus StringUtils, just a bad use of it, and a bad
assumption that all we needed was string manipulation.)

This was fixed in an alternative implementation found at ...

http://svn.apache.org/repos/asf/maven/wagon/branches/wagon-http-with-webdav/src/main/java/org/apache/maven/wagon/providers/http/links/LinkParser.java

... using NekoHTML with the java.net.URI class routines and straight
JAXP parsing.
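
To illustrate why URI resolution suits this job better than string manipulation, here is a minimal sketch of one way to classify links using java.net.URI alone. It is not the code from LinkParser.java. The class name LinkFilterSketch, the method directChildren, and the example.org URLs are made up for this example, and it assumes the href values have already been pulled out of the page by an HTML parser.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

class LinkFilterSketch {

    /**
     * Hypothetical helper: keeps only the hrefs that name a direct child
     * (file or folder) of the listing page at baseUri.
     * Assumes baseUri is an absolute, hierarchical URI whose path ends with '/'.
     */
    static List<String> directChildren(URI baseUri, List<String> hrefs) {
        URI base = baseUri.normalize();
        String basePath = base.getPath();
        List<String> names = new ArrayList<>();

        for (String href : hrefs) {
            URI link;
            try {
                // Resolve relative hrefs ("../", "pom.xml") against the page URI.
                link = base.resolve(new URI(href)).normalize();
            } catch (URISyntaxException e) {
                continue; // not a parseable URI reference, ignore it
            }

            if (link.getQuery() != null) {
                continue; // treated here as a dynamic link, e.g. a column-sort link
            }
            if (!sameServer(base, link)) {
                continue; // different scheme, host, or port (also drops mailto:)
            }

            String path = link.getPath(); // null for opaque URIs
            if (path == null || !path.startsWith(basePath)) {
                continue; // parent folder or an unrelated location
            }

            String rest = path.substring(basePath.length());
            if (rest.isEmpty()) {
                continue; // the page itself
            }

            int slash = rest.indexOf('/');
            if (slash >= 0 && slash != rest.length() - 1) {
                continue; // something inside a downstream folder
            }

            names.add(rest); // "pom.xml" or "commons/"
        }
        return names;
    }

    // Simplification: an explicit default port (":80") is not matched to an implicit one.
    private static boolean sameServer(URI a, URI b) {
        return eq(a.getScheme(), b.getScheme())
            && eq(a.getHost(), b.getHost())
            && a.getPort() == b.getPort();
    }

    private static boolean eq(String x, String y) {
        return x == null ? y == null : x.equalsIgnoreCase(y);
    }
}
```

With a base of http://example.org/repo/org/, hrefs such as "../", "?C=N;O=D", and "/repo/org/apache/maven/" are dropped, while "commons/" and "pom.xml" are kept.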

For the record, this is not a proposal for the other functionality
within the wagon-http-with-webdav proof of concept, just the link parsing
and detection needed by the Wagon.getFileList() method with straight
http wagons.

- Joakim

---------------------------------------------------------------------
To unsubscribe, e-mail: wagon-dev-unsubscribe@...
For additional commands, e-mail: wagon-dev-help@...


Re: [Proposal] fix http.getFileList()

by brettporter

+1. Neko is smaller than jtidy.

I wonder if it actually makes sense to bring in just a subset of neko  
to suit this purpose, either in source or binary form, to make it even  
smaller and to shade the classes?

- Brett

On 12/04/2008, at 1:19 PM, Joakim Erdfelt wrote:

> I'd like to fix a few faults in ...
>
> https://svn.apache.org/repos/asf/maven/wagon/trunk/wagon-providers/wagon-http-shared/src/main/java/org/apache/maven/wagon/shared/http/HtmlFileListParser.java
>
> ... with regard to detecting links.
>
> JTidy is the main culprit. The other fault is the use of various plexus
> string utility functions to determine whether a link belongs to the page
> itself rather than to the parent folder or a downstream folder, and to
> detect whether it is an absolute file or a dynamic file. (This is not a
> problem with plexus StringUtils, just a bad use of it, and a bad
> assumption that all we needed was string manipulation.)
>
> This was fixed in an alternative implementation found at ...
>
> http://svn.apache.org/repos/asf/maven/wagon/branches/wagon-http-with-webdav/src/main/java/org/apache/maven/wagon/providers/http/links/LinkParser.java
>
> ... using NekoHTML with the java.net.URI class routines and straight
> JAXP parsing.
>
> For the record, this is not a proposal for the other functionality
> within the wagon-http-with-webdav proof of concept, just the link
> parsing and detection needed by the Wagon.getFileList() method with
> straight http wagons.
>
> - Joakim

--
Brett Porter
brett@...
http://blogs.exist.com/bporter/




Re: [Proposal] fix http.getFileList()

by brettporter

Any thoughts on moving forward with this, Joakim?

- Brett

On 14/04/2008, at 9:42 AM, Brett Porter wrote:

> +1. Neko is smaller than jtidy.
>
> I wonder if it actually makes sense to bring in just a subset of  
> neko to suit this purpose, either in source or binary form, to make  
> it even smaller and to shade the classes?
>
> - Brett


Re: [Proposal] fix http.getFileList()

by brettporter

Joakim? Down to just 5 open issues (and I'm considering moving some of  
them out), so if you want to get this in, now would be the time :)

- Brett

On 20/05/2008, at 8:30 AM, Brett Porter wrote:

> Any thoughts on moving forward with this, Joakim?
>
> - Brett


Re: [Proposal] fix http.getFileList()

by Joakim Erdfelt-2

I'll roll this into wagon-http-shared this weekend.

- Joakim

Brett Porter wrote:

> Joakim? Down to just 5 open issues (and I'm considering moving some of
> them out), so if you want to get this in, now would be the time :)
>
> - Brett


--
- Joakim Erdfelt
  joakim@...
  Open Source Software (OSS) Developer




Re: [Proposal] fix http.getFileList()

by Joakim Erdfelt-2

This change has been rolled into wagon-http-shared.

- Joakim

Joakim Erdfelt wrote:
> I'll roll this into wagon-http-shared this weekend.
>
> - Joakim
>
> Brett Porter wrote:
>> Joakim? Down to just 5 open issues (and I'm considering moving some
>> of them out), so if you want to get this in, now would be the time :)
>>
>> - Brett

