Re: Timeout errors using Net::HTTP on Windows
- From: Steve Midgley <public@xxxxxxxxxx>
- Date: Sun, 1 Jul 2007 00:56:39 +0900
Hi Raghu,
Since this weird Net timeout issue has a solution now, I'd recommend
that you spin out a series of threads to scrape your websites. That way,
even if each thread spends a long time waiting on Net::HTTP, you still
have lots of other threads working on other sites. Aggregate wait time
goes way down if you have 100 simultaneous HTTP requests all waiting on
different sites. Be careful not to have too many threads requesting from
the same site at the same time: not only is this bad manners, but some
systems will block you instantly if you do this.
The Pragprog guys have a good little tutorial on threading which even
includes some info on net requests (though be sure to make the changes
described in this thread for the "execution expired" issues first):
http://www.rubycentral.com/book/tut_threads.html
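As a rough illustration of the threading advice above, here is a minimal sketch written for a current Ruby. The method name fetch_all and the optional fetcher block are made up for this example and are not part of Net::HTTP. It starts one thread per host, so different sites are fetched in parallel while requests to the same site run one after another. It does not cap the total number of threads, which you would probably want to add for a long list of sites.

```ruby
require 'net/http'
require 'uri'
require 'timeout'

# Hypothetical helper: fetch every URL, using one thread per host.
# An optional block replaces the default Net::HTTP fetcher (handy for testing).
def fetch_all(urls, &fetcher)
  fetcher ||= lambda { |uri| Net::HTTP.get_response(uri).body }
  results = {}
  lock = Mutex.new

  # Group by host so that a single site never sees two requests at once.
  threads = urls.group_by { |u| URI.parse(u).host }.map do |_host, host_urls|
    Thread.new do
      host_urls.each do |url|
        body = begin
          fetcher.call(URI.parse(url))
        rescue StandardError, Timeout::Error => e
          e # store the error so one bad site does not stop the others
        end
        lock.synchronize { results[url] = body }
      end
    end
  end

  threads.each(&:join) # wait for every host's thread to finish
  results
end
```

Calling fetch_all with a list of URL strings returns a hash from each URL to either the response body or the exception raised while fetching it.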
Steve
Raghu Kumar wrote:
Hi,
Thanks a lot... I was stuck on this problem ... now I have no
problem.
But in my project I have a lot of different web pages to scrape. Is
there anything I can tweak so that responses are faster?
Raghu
Ben Brightwell wrote:
Interesting results!!! What an understatement!! How on earth did you
figure that out? It worked like a charm. Initially my problem was just
that the HTTP request took FOREVER... then I started randomly getting
the timeout errors mentioned above. I went into the net/protocol.rb file
and made this change and voila! Not only did the errors go away, but
the request takes literally 1/30th of the time it did before, when it
eventually got the data or timed out. Marcin Coles, you are the
genius of the day! Thank you!
Marcin Coles wrote:
Well, I'm no expert, and I certainly wasn't going to learn another
language, so I decided to do some tests because this was a problem for
me too.
I went into protocol.rb, to the rbuf_fill method (where it actually
starts the timeout thread).
The code there was:
    def rbuf_fill
      timeout(@read_timeout) {
        @rbuf << @io.sysread(1024)
      }
    end
Now, timeout takes two parameters: a time in seconds and an exception
class to raise (defaults to Timeout::Error).
When I changed the code to the following, it started to work for me.
New code:
    def rbuf_fill
      timeout(@read_timeout, ProtocolError) {
        @rbuf << @io.sysread(1024)
      }
    end
Obviously this has not been extensively tested, but it is an
interesting result.
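To see the two-argument form of timeout on its own, without editing net/protocol.rb, here is a small sketch for a current Ruby. DemoProtocolError and timed_out_with are names made up for this example, with DemoProtocolError standing in for ProtocolError. The sketch only shows that the second argument decides which exception class comes out of the block when the time limit is hit.

```ruby
require 'timeout'

# Hypothetical stand-in for Net::ProtocolError, used only in this demo.
class DemoProtocolError < StandardError; end

# Run a block that is too slow and report which exception class was raised.
def timed_out_with(seconds, error_class = nil)
  if error_class
    Timeout.timeout(seconds, error_class) { sleep(5) }
  else
    Timeout.timeout(seconds) { sleep(5) }
  end
  nil # not reached unless the block finishes in time
rescue StandardError => e
  e.class
end

default_class = timed_out_with(0.05)                    # Timeout::Error
custom_class  = timed_out_with(0.05, DemoProtocolError) # DemoProtocolError
```

With no second argument the caller has to rescue Timeout::Error, and with one it rescues whatever class was passed in.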
Marcin
--
Posted via http://www.ruby-forum.com/.