Wednesday, 18 September 2013

Get all the URLs under a domain (YQL?)

I want to get all the URLs under a domain.
Their robots.txt clearly states that some of the folders are off-limits to
robots, but I am wondering whether there is a way to get all the URLs that
are open to robots. The robots.txt does not list a sitemap.
Any ideas would be helpful. I am also curious whether there is a Yahoo
Query Language (YQL) solution for this, because this work has
probably already been done by Yahoo.
Thanks!
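
To make the robots.txt part of the question concrete, here is a rough sketch of how candidate URLs could be checked against the rules using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs, and the filter_allowed helper are all made up for illustration; this is not the real site's file, and it is only one possible approach.

```python
from urllib.robotparser import RobotFileParser


def filter_allowed(robots_txt, urls, user_agent="*"):
    """Return only the URLs that robots_txt permits user_agent to fetch.

    filter_allowed is a hypothetical helper name used for this sketch.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Made-up robots.txt and URLs (assumptions, not taken from any real site).
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""

EXAMPLE_URLS = [
    "http://example.com/index.html",
    "http://example.com/private/report.html",
    "http://example.com/blog/post-1.html",
    "http://example.com/tmp/cache.html",
]

if __name__ == "__main__":
    for url in filter_allowed(EXAMPLE_ROBOTS, EXAMPLE_URLS):
        print(url)
```

This only filters URLs that are already known. Without a sitemap, the candidate list would still have to come from somewhere else, for example by following links from the home page.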
