I am thinking about it; any ideas?
Google can crawl all sites and accounts for the vast majority of traffic from search engines.
Directories like Yahoo and DMOZ (which powers AOL) hold by far the largest market share.
I think it would be a cool feature for some of the other search engines, but it shouldn't be a priority.
I am mostly referring to Google. Google can't crawl very far into most CMSes, and judging by how this looks on the JBoss site, I can probably safely assume the same holds here.
It's because of URLs like index.html?modules=blah&blah... — crawlers tend not to follow query-string URLs very deep.
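To make the problem concrete, here is a minimal sketch of the kind of rewrite that fixes it: turning a query-string URL into a path-style URL a crawler will follow. The `module`/`op` parameter names are hypothetical, just to mirror the example above.

```python
from urllib.parse import urlparse, parse_qs

def friendly_url(url):
    """Rewrite an index.html?module=X&op=Y style URL into /X/Y.

    Parameter names are illustrative; a real CMS would map its own
    query parameters into path segments.
    """
    query = parse_qs(urlparse(url).query)
    return "/" + query["module"][0] + "/" + query["op"][0]
```

With a mapping like this in front of the CMS, the dynamic pages look like ordinary static URLs to a robot.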
Zope maps URL path segments to objects, so you get path-style URLs instead of query strings.
The idea is that each module is a Zope (Python) object and you call methods on it. It might be worth having a look at how they do it.
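The object-publishing idea above can be sketched in a few lines. This is not Zope's actual API, just an illustration of the principle: the path names an object, the last segment names a method on it. The `News` module and its registry are made up for the example.

```python
class News:
    """Hypothetical module object published at /news."""
    def latest(self):
        return "latest news items"

# registry of published module objects (illustrative)
MODULES = {"news": News()}

def publish(path):
    """Resolve a '/module/method' path to a method call on the module object."""
    module_name, method_name = path.strip("/").split("/")
    obj = MODULES[module_name]
    return getattr(obj, method_name)()
```

A crawler sees `/news/latest` as a plain static-looking URL, while the server still dispatches it dynamically.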
That is very similar to Nukes: each op is an MBean operation called on the MBean.
There is another, simpler way to achieve the same result:
create some "dummy pages" presenting a flat view of the database content, organised with "clean" (friendly) URL links.
Robots can easily follow those links to index the full content.
A batch process could compute those pages each night, for example...
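The nightly batch described above could look something like this. It's only a sketch under the assumption that the CMS content can be pulled as (slug, title, body) rows; the file layout and names are invented for the example.

```python
import os

def write_flat_pages(rows, out_dir):
    """Dump database content into flat HTML files robots can follow.

    rows: iterable of (slug, title, body) tuples pulled from the CMS
    database (shape assumed for illustration).
    """
    os.makedirs(out_dir, exist_ok=True)
    index_links = []
    for slug, title, body in rows:
        path = os.path.join(out_dir, slug + ".html")
        with open(path, "w") as f:
            f.write("<html><head><title>%s</title></head>"
                    "<body>%s</body></html>" % (title, body))
        index_links.append('<a href="%s.html">%s</a>' % (slug, title))
    # a single index page gives crawlers an entry point to every article
    with open(os.path.join(out_dir, "index.html"), "w") as f:
        f.write("<html><body>" + "<br>".join(index_links) + "</body></html>")
```

Run from cron each night, this keeps a fully crawlable mirror of the dynamic content with clean URLs.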
Another way: detect bot crawls and present custom pages to them; those pages contain only friendly URLs.
At the same time, we could log the crawls for future analysis...
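A rough sketch of that idea: spot bots by their User-Agent header (a common heuristic; the substring list below is illustrative, not exhaustive), serve them the flat view, and record the visit. A real Nukes module would hook this into the servlet layer; here it is a plain function for clarity.

```python
# Illustrative bot User-Agent substrings; Slurp is Yahoo's crawler.
KNOWN_BOTS = ("googlebot", "slurp", "msnbot")

def is_bot(user_agent):
    """Heuristic check: does the User-Agent look like a known crawler?"""
    ua = user_agent.lower()
    return any(bot in ua for bot in KNOWN_BOTS)

def handle_request(user_agent, crawl_log):
    """Serve the flat crawler view to bots, the normal page to everyone else."""
    if is_bot(user_agent):
        crawl_log.append(user_agent)  # log the crawl for later analysis
        return "flat page with friendly URLs only"
    return "normal dynamic page"
```

One caveat with this approach: serving crawlers different content than users is something search engines may penalise, so the flat pages should stay faithful to the real content.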
I'm working on it.