Re: How soon will Google index 25.000 pages?



On Fri, 30 Sep 2005 13:45:47 GMT, David wrote:

> On Fri, 30 Sep 2005 14:56:25 +0200, Jan Paul van de Berg
> <janp@xxxxxxxxxxxx> wrote:
>
>>I'm about to go live with a portal site that has 25,000+ pages. Every page
>>has the Google search tool. My worry is that Google won't index all the
>>pages at once, so the search tool won't give accurate results in the
>>beginning. How soon will Google index all my pages?
>
> Depends on how many links the site has (how popular it is). If it's
> a lot, it can be well under a month for the majority of a site of
> that size. If you have few links, it can be never.
>
> This section http://www.classic-literature.co.uk/poetry/ was added
> this month (on the 19th, so 11 days ago). So far 54,000 pages are
> indexed:
>
> http://www.google.com/search?num=100&hl=en&lr=&safe=off&c2coff=1&rls=GGLD%2CGGLD%3A2005-16%2CGGLD%3Aen&q=site%3Ahttp%3A%2F%2Fwww.classic-literature.co.uk%2Fpoetry%2F
>
> It got that many indexed so quickly because the home page of this
> site is PR7 and always has spiders on it, so anything added will be
> found quickly (same day).
>
> I have no idea how many pages are under that section, so there could
> be many more next week. The count tends to increase slowly once a lot
> of pages are indexed: the odds of a spider finding one missing page
> out of 25,000+ are low, so an individual page can be missed
> indefinitely.
>
> It also depends on the link structure. The section above has pages
> like this:
> http://www.classic-literature.co.uk/poetry/Books/browse-10248.html
> At the bottom are links to the page you are on and the next page. It
> says there are 9719 pages in that section (so 9719 x 10 products,
> 97,000+ pages), but it stops at page 250 (an Amazon thing). So each
> page like the one above can have up to 250 pages with 10 products on
> each (so 2,750 pages under that section: 250 listing pages plus 2,500
> product pages). To get to the 250th page a spider has to follow page
> 1 to page 2, to page 3, and so on to page 249, then to page 250.
>
> The likelihood of a spider getting to page 250 quickly is low, so it
> could be a long, long time before I see this page
> http://www.classic-literature.co.uk/poetry/Books/browse-10248--250.html
> indexed (it will be quicker now because of this link :-)).
>
> You can see it with this search
> http://www.google.com/search?num=100&hl=en&lr=&safe=off&c2coff=1&rls=GGLD%2CGGLD%3A2005-16%2CGGLD%3Aen&q=site%3Ahttp%3A%2F%2Fwww.classic-literature.co.uk%2Fpoetry%2FBooks%2Fbrowse-10248&btnG=Search
> only 7 pages are indexed.
>
> I've had this page
> http://www.free-recipes.co.uk/gourmet-food-store/GourmetFood/browse-14015391-salesrank-1.html
> up for, I think, 5 months now. So far 44 pages are indexed out of
> over 1,000:
> http://www.google.com/search?num=100&hl=en&lr=&safe=off&c2coff=1&rls=GGLD%2CGGLD%3A2005-16%2CGGLD%3Aen&q=site%3Ahttp%3A%2F%2Fwww.free-recipes.co.uk%2Fgourmet-food-store%2FGourmetFood%2Fbrowse-14015391&btnG=Search
>
> There's a failing in the link structure of these sites IF I wanted
> every possible page indexed. If you want every page indexed, arrange
> the link structure so it takes as few clicks as possible to get from
> the home page to the deepest pages. If you can keep it to 4 or fewer
> and you have a fair number of links to the site, you should get the
> whole site indexed eventually.
>
> David

Thanks a lot for the info! My site has static HTML pages and the deepest
page is 4 clicks from the home page. All pages have a "you are here: home
-> categories -> subjects -> item" navigation, so whenever somebody links
to a deep page, the higher pages should be found as well.
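
For anyone who wants to check click depth on their own site, here is a
minimal Python sketch of the idea discussed above. It assumes you already
have the site's internal links as a dictionary mapping each page to the
pages it links to; the page names, the two example link maps, and the
function name click_depths are all made up for illustration and are not
taken from either site mentioned in this thread. It uses a breadth-first
search to find the fewest clicks from the home page to every reachable
page, and compares a shallow hierarchy with a 250-page "next page" chain.

```python
from collections import deque


def click_depths(links, home):
    """Return the fewest clicks needed to reach each page from `home`.

    `links` maps a page name to the list of pages it links to.
    Pages that cannot be reached from `home` are left out of the result.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical hierarchy: home -> category -> subject -> item
hierarchy = {
    "home": ["cat-1", "cat-2"],
    "cat-1": ["subj-1"],
    "cat-2": ["subj-2"],
    "subj-1": ["item-1", "item-2"],
    "subj-2": ["item-3"],
}

# Hypothetical chain where each listing page links only to the next one:
# home -> page-1 -> page-2 -> ... -> page-250
chain = {"home": ["page-1"]}
for n in range(1, 250):
    chain[f"page-{n}"] = [f"page-{n + 1}"]

hierarchy_depths = click_depths(hierarchy, "home")
chain_depths = click_depths(chain, "home")

print("deepest page in hierarchy:", max(hierarchy_depths.values()), "clicks")
print("page-250 in chain:", chain_depths["page-250"], "clicks")
```

In the hierarchy every page is within a few clicks of the home page,
while the last page of the chain is only reachable after following every
"next" link in turn, which is the situation David describes for the
browse pages.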
--
De Antwoordman


