
It can scrape linked pages too if you define a depth, but make sure the depth parameter is not too high, or it will consume too much memory and time.



Playing around with the UI, I cannot see where that depth would be set. Is it not a per-datasource variable?

Is "scrape linked pages" sandboxed within a URL hierarchy (so adding example.com/foo/ would only add linked pages that are also under example.com/foo/), or not (so it would also include linked pages on other domains or in other subfolders)?
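
To make the question concrete, here is a rough sketch of the two behaviours I mean. I have no idea how the tool actually implements this; `crawl`, `in_scope`, `max_depth`, `sandboxed` and the `LINKS` sample graph are all hypothetical names I made up for illustration, and `get_links` stands in for whatever fetches a page and extracts its links.

```python
from collections import deque
from urllib.parse import urlsplit


def in_scope(url, root):
    """True if url is on root's host and at or below root's path (assumed rule)."""
    u, r = urlsplit(url), urlsplit(root)
    root_path = r.path if r.path.endswith("/") else r.path + "/"
    return u.netloc == r.netloc and (
        u.path == r.path.rstrip("/") or u.path.startswith(root_path)
    )


def crawl(root, get_links, max_depth=1, sandboxed=True):
    """Breadth-first crawl from root, following links at most max_depth hops."""
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit reached, do not expand this page
        for link in get_links(url):
            if link in seen:
                continue
            if sandboxed and not in_scope(link, root):
                continue  # skip other domains and sibling folders
            seen.add(link)
            queue.append((link, depth + 1))
    return seen


# Hypothetical link graph standing in for real HTTP fetches.
LINKS = {
    "https://example.com/foo/": [
        "https://example.com/foo/a",
        "https://example.com/bar",
        "https://other.org/",
    ],
    "https://example.com/foo/a": ["https://example.com/foo/a/b"],
}

if __name__ == "__main__":
    fetch = lambda u: LINKS.get(u, [])
    print(sorted(crawl("https://example.com/foo/", fetch, max_depth=2)))
    print(sorted(crawl("https://example.com/foo/", fetch, max_depth=1, sandboxed=False)))
```

With `sandboxed=True` only pages under example.com/foo/ are collected; with `sandboxed=False` the same depth also pulls in example.com/bar and other.org. Either way the `seen` set grows with every extra level of depth, which is presumably where the memory and time cost comes from.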



