Re: Timeout errors using Net::HTTP on Windows



Hi Raghu,

Since this weird Net timeout issue has a solution now, I'd recommend
that you spin off a set of threads to scrape your websites. That way,
even if each thread spends a long time waiting on Net::HTTP, you still
have lots of other threads working on other sites. Aggregate wait time
goes way down if you have 100 simultaneous HTTP requests all waiting on
different sites. Be careful not to have too many threads request from
the same site at the same time: not only is this bad manners, but some
systems will block you instantly if you do this.
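
To make the idea concrete, here is a minimal sketch of a small worker pool that fetches many URLs in parallel while allowing only one request per host at a time. The helper name fetch_all, the workers: and fetcher: parameters, and the one-request-per-host rule are assumptions made for illustration; they are not taken from the tutorial or from Net::HTTP itself.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: fetch every URL with a pool of worker threads.
# `fetcher` is injectable so the sketch can be tried without network access;
# by default it does a plain Net::HTTP GET.
def fetch_all(urls, workers: 10, fetcher: ->(url) { Net::HTTP.get(URI(url)) })
  queue = Queue.new
  urls.each { |url| queue << url }

  # One lock per host, built up front, so two threads never hit the
  # same site simultaneously (assumed politeness rule).
  host_locks = urls.each_with_object({}) do |url, locks|
    locks[URI(url).host] ||= Mutex.new
  end

  results = {}
  results_lock = Mutex.new

  threads = Array.new([workers, urls.size].min) do
    Thread.new do
      loop do
        url = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break           # no work left, let this worker finish
        end

        # Store either the response body or the exception that was raised,
        # so one bad site does not kill the whole run.
        outcome = begin
          host_locks[URI(url).host].synchronize { fetcher.call(url) }
        rescue StandardError => e
          e
        end

        results_lock.synchronize { results[url] = outcome }
      end
    end
  end

  threads.each(&:join)
  results
end

# Example call (commented out because it needs network access):
# pages = fetch_all(%w[http://example.com/ http://example.org/], workers: 5)
```

The result is a hash keyed by URL, so failed sites can be picked out afterwards by checking which values are exceptions. A worker that is waiting on a busy host sits idle in this sketch; a per-host queue would use the threads better, at the cost of more code.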

The Pragprog guys have a good little tutorial on threading which even
includes some info on net requests (though be sure to make the changes
described earlier in this thread for the "execution expired" issue first):

http://www.rubycentral.com/book/tut_threads.html

Steve

Raghu Kumar wrote:
Hi,

Thanks a lot... I was stuck on this problem, and now it is gone.

But in my project I have a lot of different web pages to scrape. Is
there anything I can tweak so that responses are faster?

Raghu

Ben Brightwell wrote:
Interesting results!!! What an understatement!! How on earth did you
figure that out? It worked like a charm. Initially my problem was just
that the HTTP request took FOREVER... then I started randomly getting
the timeout errors mentioned above. I went into the net/protocol.rb file
and made this change and voila! Not only did the errors go away, but
the request now takes literally 1/30th of the time it used to take to
eventually return the data or time out. Marcin Coles, you are the
genius of the day! Thank you!

Marcin Coles wrote:
Well, I'm no expert, and I certainly wasn't going to learn another
language, so I decided to do some tests because this was a problem for
me too.

I went into protocol.rb, to the rbuf_fill method (where it actually
starts the timeout thread).

The code there was:

    def rbuf_fill
      timeout(@read_timeout) {
        @rbuf << @io.sysread(1024)
      }
    end

Now, timeout takes two parameters: a time in seconds and an exception
class to raise (defaults to Timeout::Error).
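
As a standalone illustration of that two-argument form, here is a small sketch using the standard timeout library. The SlowReadError class and the read_with_timeout wrapper are hypothetical names invented for this example; they only stand in for ProtocolError and rbuf_fill and are not part of net/protocol.rb.

```ruby
require 'timeout'

# Hypothetical error class, standing in for ProtocolError in this sketch.
class SlowReadError < StandardError; end

# Hypothetical wrapper: run the block, raising `error_class` if it takes
# longer than `seconds`. With no class given, Timeout::Error is raised.
def read_with_timeout(seconds, error_class = Timeout::Error)
  Timeout.timeout(seconds, error_class) { yield }
end

begin
  # The block sleeps longer than the limit, so the custom class is raised.
  read_with_timeout(0.1, SlowReadError) { sleep 1 }
rescue SlowReadError => e
  puts "rescued #{e.class}"
end
```

Passing your own exception class means the caller can rescue the timeout with an ordinary rescue clause for that class, instead of having to know about the timeout library's default error.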

When I changed the code to the following, it started to work for me.
New code:

    def rbuf_fill
      timeout(@read_timeout, ProtocolError) {
        @rbuf << @io.sysread(1024)
      }
    end

Obviously this is not exactly extensively tested, but it is an
interesting result.

Marcin


--
Posted via http://www.ruby-forum.com/.
