Re: HttpClientFeedFetcher

Nick,
Ok. I assume that you will not post it then.
Rae

* Rae Egli
* (760) 684-4650 - regli@...

----- Original Message ----
From: Nick Lothian <nlothian@...>
To: "dev@..." <dev@...>
Sent: Sunday, June 29, 2008 4:07:14 PM
Subject: FW: HttpClientFeedFetcher

Please use the mailing list
From: Rae Egli [mailto:regli@...]
Nick,
As I mentioned, I have now looked into HttpClientFeedFetcher. I looked at the timeout issues and am sending you the source with the new timeout handling incorporated. There are a few comments I'd like to make first, though:
1. I am using the same method names as those used in URLConnection (setConnectTimeout, setReadTimeout) because I'd like to continue your approach to keep these classes as interchangeable as possible.
2. However, it also makes sense to allow setting and getting ClientParams for additional needs which is why there are setter and getter methods for these.
3. While testing some of my sample feeds using HttpClient, I came across some unpleasant surprises related to "useragent". In fact, I think that the statement System.setProperty("httpclient.useragent", getUserAgent()); comes far too late to make any difference and should potentially be deleted. At this point in the code, any changes that need to be made here should use addRequestHeader or similar anyway, I think.
4. I discovered that the feed: returned a 403: ERROR: Authentication required for that resource. HTTP Response code was:403
In order to fix it, I had to impersonate a browser (Opera in this case) via method.addRequestHeader. Once I did that, everything was fine. You can duplicate the problem if you use the enclosed, heavily (and not prettily) modified FeedReaderUsingFetcherWithHttpClient. Once you uncomment the addRequestHeader code, you'll see that after potentially a few timeout loops, the feed can be read.
As a result, I added the instance variable "method" with a getter so that these modifications can be applied by the requester. I know it is not very elegant, but it does the job, at least for now.
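The browser impersonation described above amounts to overriding the User-Agent request header before the request is sent. A minimal sketch of the same idea follows, using only the JDK's java.net.HttpURLConnection rather than Commons HttpClient. The names UserAgentSketch and openWithUserAgent are hypothetical, and the feed URL is a placeholder because the original one is not given in the message.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

class UserAgentSketch {
    /** Creates an unconnected HTTP connection whose User-Agent header is overridden. */
    static HttpURLConnection openWithUserAgent(String url, String userAgent) {
        try {
            // openConnection() only builds the connection object; no network I/O happens yet.
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            // Replaces the JDK's default "Java/<version>" User-Agent for this request only.
            conn.setRequestProperty("User-Agent", userAgent);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder URL (assumption); the user agent string is the one from the attached sample.
        HttpURLConnection conn = openWithUserAgent(
                "http://example.com/feed.xml", "Opera/9.25 (Windows NT 5.1; U; en)");
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```

Setting the header per request, instead of through a system property, keeps the change local to the one feed that rejects the default user agent.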
A few more things. In contrast to URLConnection, setConnectTimeout (which calls setConnectionManagerTimeout) failed only once in all my tests, even when set to 1 millisecond. HttpClient clearly recognizes the setting, as it is reported in its log, but its real effect appears to be minimal. This is clearly not true in the case of setConnectTimeout, which allows the full control I was looking for.
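For comparison, here is a minimal sketch of the URLConnection-style timeout handling that the new setter names are modeled on, with a retry loop that widens both timeouts after each SocketTimeoutException. It uses only JDK classes. The class and method names are hypothetical, and the base values and increments (100/200 ms, +50/+230 ms) are taken from the attached sample. One point worth verifying against the Commons HttpClient 3.x Javadoc: setConnectionManagerTimeout appears to limit the wait for a connection from the connection manager, not the socket connect itself, which would be consistent with the minimal effect reported above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.SocketTimeoutException;
import java.net.URI;

class TimeoutRetrySketch {
    /** Timeouts in milliseconds for a given attempt; attempt 0 uses the base values. */
    static int[] timeoutsForAttempt(int attempt) {
        return new int[] { 100 + 50 * attempt, 200 + 230 * attempt };
    }

    /** Creates an unconnected connection with connect and read timeouts applied. */
    static HttpURLConnection configure(String url, int connectTimeout, int readTimeout) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(connectTimeout); // 0 would mean "wait forever"
            conn.setReadTimeout(readTimeout);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the HTTP status code, retrying with longer timeouts on socket timeouts. */
    static int fetchStatus(String url, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SocketTimeoutException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int[] t = timeoutsForAttempt(attempt);
            HttpURLConnection conn = configure(url, t[0], t[1]);
            try {
                return conn.getResponseCode(); // connects and reads the status line
            } catch (SocketTimeoutException e) {
                last = e; // remember the failure and retry with wider timeouts
            } finally {
                conn.disconnect();
            }
        }
        throw last;
    }
}
```

A caller would invoke something like TimeoutRetrySketch.fetchStatus(feedUrl, 6) to match the six tries the sample allows before giving up.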
Thanks for your help. I'd appreciate any feedback and discussion items.
Rae
IMPORTANT: This e-mail, including any attachments, may contain private or confidential information. If you think you may not be the intended recipient, or if you have received this e-mail in error, please contact the sender immediately and delete all copies of this e-mail. If you are not the intended recipient, you must not reproduce any part of this e-mail or disclose its contents to any other party. This email represents the views of the individual sender, which do not necessarily reflect those of education.au limited except where the sender expressly states otherwise. It is your responsibility to scan this email and any files transmitted with it for viruses or any other defects. education.au limited will not be liable for any loss, damage or consequence caused directly or indirectly by this email.

-----Inline Attachment Follows-----

/*
 * Copyright 2004 Sun Microsystems, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */
package com.sun.syndication.fetcher.impl;

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.zip.GZIPInputStream;

import org.apache.commons.httpclient.Credentials;
import org.apache.commons.httpclient.Header;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.URI;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpClientParams;

import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.fetcher.FetcherEvent;
import com.sun.syndication.fetcher.FetcherException;
import com.sun.syndication.io.FeedException;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;

/**
 * @author Nick Lothian
 */
public class HttpClientFeedFetcher extends AbstractFeedFetcher {
    private FeedFetcherCache feedInfoCache;
    private CredentialSupplier credentialSupplier;
    private HttpClientParams httpClientParams;
    private HttpMethod method;

    public HttpClientFeedFetcher() {
        super();
        // set default parameters
        setHttpClientParams(new HttpClientParams());
        method = new GetMethod();
    }

    /**
     * @param cache
     */
    public HttpClientFeedFetcher(FeedFetcherCache cache) {
        this();
        setFeedInfoCache(cache);
    }

    public HttpClientFeedFetcher(FeedFetcherCache cache, CredentialSupplier credentialSupplier) {
        this(cache);
        setCredentialSupplier(credentialSupplier);
    }

    /**
     * @return the feedInfoCache.
     */
    public synchronized FeedFetcherCache getFeedInfoCache() {
        return feedInfoCache;
    }

    /**
     * @param feedInfoCache the feedInfoCache to set
     */
    public synchronized void setFeedInfoCache(FeedFetcherCache feedInfoCache) {
        this.feedInfoCache = feedInfoCache;
    }

    /**
     * @return Returns the credentialSupplier.
     */
    public synchronized CredentialSupplier getCredentialSupplier() {
        return credentialSupplier;
    }

    /**
     * @param credentialSupplier The credentialSupplier to set.
     */
    public synchronized void setCredentialSupplier(CredentialSupplier credentialSupplier) {
        this.credentialSupplier = credentialSupplier;
    }

    /**
     * @return The currently used HttpClient GetMethod object.
     *
     */
    public synchronized HttpMethod getHttpMethod() {
        return method;
    }

    /**
     * @return Returns the httpClientParams.
     */
    public synchronized HttpClientParams getHttpClientParams() {
        return this.httpClientParams;
    }

    /**
     * @param httpClientParams The httpClientParams to set.
     */
    public synchronized void setHttpClientParams(HttpClientParams httpClientParams) {
        this.httpClientParams = httpClientParams;
    }

    /**
     * @param timeout Sets the connect timeout for the HttpClient but using the URLConnection method name.
     * Uses the HttpClientParams method setConnectionManagerTimeout instead of setConnectTimeout
     *
     */
    public synchronized void setConnectTimeout(int timeout) {
        httpClientParams.setConnectionManagerTimeout(timeout);
    }

    /**
     * @return The currently used connect timeout for the HttpClient but using the URLConnection method name.
     * Uses the HttpClientParams method getConnectionManagerTimeout instead of getConnectTimeout
     *
     */
    public int getConnectTimeout() {
        return (int) this.getHttpClientParams().getConnectionManagerTimeout();
    }

    /**
     * @param timeout Sets the read timeout for the URLConnection to a specified timeout, in milliseconds.
     */
    public synchronized void setReadTimeout(int timeout) {
        httpClientParams.setSoTimeout(timeout);
    }

    /**
     * @return The currently used read timeout for the URLConnection, 0 is unlimited, i.e. no timeout
     */
    public int getReadTimeout() {
        return (int) this.getHttpClientParams().getSoTimeout();
    }

    /**
     * @see com.sun.syndication.fetcher.FeedFetcher#retrieveFeed(java.net.URL)
     */
    public SyndFeed retrieveFeed(URL feedUrl) throws IllegalArgumentException, IOException, FeedException, FetcherException {
        if (feedUrl == null) {
            throw new IllegalArgumentException("null is not a valid URL");
        }
        // TODO Fix this
        //System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.SimpleLog");
        HttpClient client = new HttpClient(httpClientParams);

        if (getCredentialSupplier() != null) {
            client.getState().setAuthenticationPreemptive(true);
            // TODO what should realm be here?
            Credentials credentials = getCredentialSupplier().getCredentials(null, feedUrl.getHost());
            if (credentials != null) {
                client.getState().setCredentials(null, feedUrl.getHost(), credentials);
            }
        }

        System.setProperty("httpclient.useragent", getUserAgent());
        String urlStr = feedUrl.toString();
        FeedFetcherCache cache = getFeedInfoCache();
        if (cache != null) {
            // retrieve feed
            method.setURI(new URI(urlStr,true));
            method.addRequestHeader("Accept-Encoding", "gzip");
            try {
                if (isUsingDeltaEncoding()) {
                    method.setRequestHeader("A-IM", "feed");
                }

                // get the feed info from the cache
                // Note that syndFeedInfo will be null if it is not in the cache
                SyndFeedInfo syndFeedInfo = cache.getFeedInfo(feedUrl);
                if (syndFeedInfo != null) {
                    method.setRequestHeader("If-None-Match", syndFeedInfo.getETag());

                    if (syndFeedInfo.getLastModified() instanceof String) {
                        method.setRequestHeader("If-Modified-Since", (String)syndFeedInfo.getLastModified());
                    }
                }

                method.setFollowRedirects(true);

                int statusCode = client.executeMethod(method);
                fireEvent(FetcherEvent.EVENT_TYPE_FEED_POLLED, urlStr);
                handleErrorCodes(statusCode);

                SyndFeed feed = getFeed(syndFeedInfo, urlStr, method, statusCode);

                syndFeedInfo = buildSyndFeedInfo(feedUrl, urlStr, method, feed, statusCode);

                cache.setFeedInfo(new URL(urlStr), syndFeedInfo);

                // the feed may have been modified to pick up cached values
                // (eg - for delta encoding)
                feed = syndFeedInfo.getSyndFeed();

                return feed;
            } finally {
                method.releaseConnection();
            }
        } else {
            // cache is not in use
            HttpMethod method = new GetMethod(urlStr);
            try {
                method.setFollowRedirects(true);

                int statusCode = client.executeMethod(method);
                fireEvent(FetcherEvent.EVENT_TYPE_FEED_POLLED, urlStr);
                handleErrorCodes(statusCode);

                return getFeed(null, urlStr, method, statusCode);
            } finally {
                method.releaseConnection();
            }
        }
    }

    /**
     * @param feedUrl
     * @param urlStr
     * @param method
     * @param feed
     * @return
     * @throws MalformedURLException
     */
    private SyndFeedInfo buildSyndFeedInfo(URL feedUrl, String urlStr, HttpMethod method, SyndFeed feed, int statusCode) throws MalformedURLException {
        SyndFeedInfo syndFeedInfo;
        syndFeedInfo = new SyndFeedInfo();

        // this may be different to feedURL because of 3XX redirects
        syndFeedInfo.setUrl(new URL(urlStr));
        syndFeedInfo.setId(feedUrl.toString());

        Header imHeader = method.getResponseHeader("IM");
        if (imHeader != null && imHeader.getValue().indexOf("feed") >= 0 && isUsingDeltaEncoding()) {
            FeedFetcherCache cache = getFeedInfoCache();
            if (cache != null && statusCode == 226) {
                // client is setup to use http delta encoding and the server supports it and has returned a delta encoded response
                // This response only includes new items
                SyndFeedInfo cachedInfo = cache.getFeedInfo(feedUrl);
                if (cachedInfo != null) {
                    SyndFeed cachedFeed = cachedInfo.getSyndFeed();

                    // set the new feed to be the original feed plus the new items
                    feed = combineFeeds(cachedFeed, feed);
                }
            }
        }

        Header lastModifiedHeader = method.getResponseHeader("Last-Modified");
        if (lastModifiedHeader != null) {
            syndFeedInfo.setLastModified(lastModifiedHeader.getValue());
        }

        Header eTagHeader = method.getResponseHeader("ETag");
        if (eTagHeader != null) {
            syndFeedInfo.setETag(eTagHeader.getValue());
        }

        syndFeedInfo.setSyndFeed(feed);

        return syndFeedInfo;
    }

    /**
     * @param client
     * @param urlStr
     * @param method
     * @return
     * @throws IOException
     * @throws HttpException
     * @throws FetcherException
     * @throws FeedException
     */
    private static SyndFeed retrieveFeed(String urlStr, HttpMethod method) throws IOException, HttpException, FetcherException, FeedException {
        InputStream stream = null;
        if ((method.getResponseHeader("Content-Encoding") != null) && ("gzip".equalsIgnoreCase(method.getResponseHeader("Content-Encoding").getValue()))) {
            stream = new GZIPInputStream(method.getResponseBodyAsStream());
        } else {
            stream = method.getResponseBodyAsStream();
        }
        try {
            XmlReader reader = null;
            if (method.getResponseHeader("Content-Type") != null) {
                reader = new XmlReader(stream, method.getResponseHeader("Content-Type").getValue(), true);
            } else {
                reader = new XmlReader(stream, true);
            }
            return new SyndFeedInput().build(reader);
        } finally {
            if (stream != null) {
                stream.close();
            }
        }
    }

    private SyndFeed getFeed(SyndFeedInfo syndFeedInfo, String urlStr, HttpMethod method, int statusCode) throws IOException, HttpException, FetcherException, FeedException {
        if (statusCode == HttpURLConnection.HTTP_NOT_MODIFIED && syndFeedInfo != null) {
            fireEvent(FetcherEvent.EVENT_TYPE_FEED_UNCHANGED, urlStr);
            return syndFeedInfo.getSyndFeed();
        }

        SyndFeed feed = retrieveFeed(urlStr, method);
        fireEvent(FetcherEvent.EVENT_TYPE_FEED_RETRIEVED, urlStr, feed);
        return feed;
    }

    public interface CredentialSupplier {
        public Credentials getCredentials(String realm, String host);
    }
}

-----Inline Attachment Follows-----

/*
 * Copyright 2004 Sun Microsystems, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */
package collab.feed.fetcher.samples;

import java.net.SocketTimeoutException;
import java.net.URL;

import org.apache.commons.httpclient.params.HttpClientParams;
import org.apache.commons.httpclient.HttpMethod;

import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.fetcher.FeedFetcher;
import com.sun.syndication.fetcher.FetcherEvent;
import com.sun.syndication.fetcher.FetcherListener;
import com.sun.syndication.fetcher.impl.FeedFetcherCache;
import com.sun.syndication.fetcher.impl.HashMapFeedInfoCache;
import com.sun.syndication.fetcher.impl.HttpClientFeedFetcher;

/**
 * Reads and prints any RSS/Atom feed type. Converted from the
 * original Rome sample FeedReader
 * <p>
 * @author Alejandro Abdelnur
 * @author Nick Lothian
 *
 */
public class FeedReaderUsingFetcherWithHttpClient {
    public static void main(String[] args) {
        boolean ok = false;
        int connectTimeout = 100;
        int readTimeout = 200;
        HttpClientParams httpClientParams = new HttpClientParams();
        HttpClientParams httpAltClientParams = new HttpClientParams();

        System.setProperty("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.SimpleLog");
        System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
        System.setProperty("org.apache.commons.logging.simplelog.log.httpclient.wire.header", "debug");
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.commons.httpclient", "debug");

        if (args.length==1) {
            try {
                URL feedUrl = new URL(args[0]);
                FeedFetcherCache feedInfoCache = HashMapFeedInfoCache.getInstance();
                // FeedFetcher fetcher = new HttpURLFeedFetcher(feedInfoCache);
                HttpClientFeedFetcher conn = new HttpClientFeedFetcher(feedInfoCache);
                httpClientParams.setConnectionManagerTimeout(connectTimeout);
                httpClientParams.setSoTimeout(readTimeout);
                httpAltClientParams = httpClientParams;
                conn.setHttpClientParams(httpClientParams);
                // conn.setConnectTimeout(connectTimeout);
                // conn.setReadTimeout(readTimeout);

                HttpMethod method = conn.getHttpMethod();
                method.addRequestHeader("User-Agent", "Opera/9.25 (Windows NT 5.1; U; en)");
                method.addRequestHeader("Accept" ,"text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1");
                method.addRequestHeader("Accept-Language", "fr-LU,fr;q=0.9,en;q=0.8");
                method.addRequestHeader("Accept-Charset", "iso-8859-1, utf-8, utf-16, *;q=0.1");
                method.addRequestHeader("Accept-Encoding", "deflate, gzip, x-gzip, identity, *;q=0");
                method.addRequestHeader("Connection", "Keep-Alive, TE");
                method.addRequestHeader("TE", "deflate, gzip, chunked, identity, trailers");

                FeedFetcher fetcher = conn;

                FetcherEventListenerImpl listener = new FetcherEventListenerImpl();

                fetcher.addFetcherEventListener(listener);

                System.err.println("Retrieving feed " + feedUrl);
                // Retrieve the feed.
                // We will get a Feed Polled Event and then a
                // Feed Retrieved event (assuming the feed is valid)
                int i = 0;
                while (true) {
                    try {
                        SyndFeed feed = fetcher.retrieveFeed(feedUrl);
                        System.err.println(feedUrl + " retrieved");
                        System.err.println(feedUrl + " has a title: " + feed.getTitle() + " and contains " + feed.getEntries().size() + " entries.");
                        break;
                    } catch (SocketTimeoutException e){
                        i++;
                        if (i > 5) {
                            System.err.println(feedUrl + " tried more than 5 times to read but unable to complete");
                            return;
                        }
                        System.err.println(feedUrl + " retrieve failed - " + e.getMessage());
                        System.err.println(feedUrl + " now extending timeout");
                        connectTimeout = connectTimeout + 50;
                        readTimeout = readTimeout + 230;
                        // httpClientParams.setConnectionManagerTimeout(connectTimeout);
                        // httpClientParams.setSoTimeout(readTimeout);
                        httpAltClientParams.setConnectionManagerTimeout(connectTimeout);
                        httpAltClientParams.setSoTimeout(readTimeout);
                        conn.setHttpClientParams(httpAltClientParams);
                        // conn.setConnectTimeout(connectTimeout);
                        // conn.setReadTimeout(readTimeout);
                    }
                }
                // We will now retrieve the feed again. If the feed is unmodified
                // and the server supports conditional gets, we will get a "Feed
                // Unchanged" event after the Feed Polled event
                System.err.println("Polling " + feedUrl + " again to test conditional get support.");
                SyndFeed feed2 = fetcher.retrieveFeed(feedUrl);
                System.err.println("If a \"Feed Unchanged\" event fired then the server supports conditional gets.");

                ok = true;
            } catch (Exception ex) {
                System.out.println("ERROR: "+ex.getMessage());
                ex.printStackTrace();
            }
        }

        if (!ok) {
            System.out.println();
            System.out.println("FeedReader reads and prints any RSS/Atom feed type.");
            System.out.println("The first parameter must be the URL of the feed to read.");
            System.out.println();
        }
    }

    static class FetcherEventListenerImpl implements FetcherListener {
        /**
         * @see com.sun.syndication.fetcher.FetcherListener#fetcherEvent(com.sun.syndication.fetcher.FetcherEvent)
         */
        public void fetcherEvent(FetcherEvent event) {
            String eventType = event.getEventType();
            if (FetcherEvent.EVENT_TYPE_FEED_POLLED.equals(eventType)) {
                System.err.println("\tEVENT: Feed Polled. URL = " + event.getUrlString());
            } else if (FetcherEvent.EVENT_TYPE_FEED_RETRIEVED.equals(eventType)) {
                System.err.println("\tEVENT: Feed Retrieved. URL = " + event.getUrlString());
            } else if (FetcherEvent.EVENT_TYPE_FEED_UNCHANGED.equals(eventType)) {
                System.err.println("\tEVENT: Feed Unchanged. URL = " + event.getUrlString());
            }
        }
    }
}

To unsubscribe, e-mail: dev-unsubscribe@...
For additional commands, e-mail: dev-help@...
RE: HttpClientFeedFetcher

I'm just reviewing the code now

Nick,
Ok. I assume that you will not post it then.
Rae
----- Original Message ----
Please use the mailing list
[snip]