Re: WWW::Mechanize with frames



AlexG wrote:
Hi,

I'm trying to do some screen scraping from a site using frames. Using
WWW::Mechanize gives back an 'error' page from the site rather than the
data I wanted:


This is the content of the frame page. It, in turn, fetches other pages and loads them into its frames. Browsers that do not support frames see the content in the noframes element.


If you want to snarf a framed page, you'll need to treat each framed item as the separate HTML page that it is.

Here the framed pages appear to be flat_navigation.php4?ecno=1.2.1.12, flat_head.php4?ecno=1.2.1.12&organism= and flat_result.php4?ecno=1.2.1.12&organism%5B%5D=.

You'll need to supply the complete URL, of course, since the src values above are relative to the frameset page.

I do not think that Mechanize handles frames by default, but you could teach it to grab the frame elements and parse the src attribute, then construct the full URL.
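Something along these lines might do as a starting point. It is only a rough sketch using the standard library, not anything built into Mechanize: the frame_urls helper, the sample frameset markup, and the example.org base URL are all made up for illustration, and the regex is a shortcut that a real HTML parser would handle more robustly.

```ruby
require 'uri'

# Hypothetical helper (not part of Mechanize): pull the src attribute out of
# each <frame>/<iframe> tag and resolve it against the frameset page's URL.
def frame_urls(html, base_url)
  html.scan(/<i?frame\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/i).flatten.map do |src|
    # Attribute values may carry HTML-escaped ampersands; unescape them
    # before building the absolute URL.
    URI.join(base_url, src.gsub('&amp;', '&')).to_s
  end
end

# Made-up frameset markup, modelled on the page names mentioned above.
html = <<~HTML
  <frameset rows="20%,80%">
    <frame name="nav" src="flat_navigation.php4?ecno=1.2.1.12">
    <frame name="result" src="flat_result.php4?ecno=1.2.1.12&amp;organism%5B%5D=">
    <noframes>Your browser does not support frames.</noframes>
  </frameset>
HTML

# Placeholder base URL; substitute the real address of the frameset page.
base = 'http://www.example.org/some/dir/frameset.php4'

puts frame_urls(html, base)
```

Each URL that comes back can then be fetched as an ordinary page, with Mechanize or anything else, and scraped on its own.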

James
--

http://www.ruby-doc.org       - Ruby Help & Documentation
http://www.artima.com/rubycs/ - Ruby Code & Style: Writers wanted
http://www.rubystuff.com      - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com     - Playing with Better Toys
http://www.30secondrule.com   - Building Better Tools

