
Threading for a crawler is just a dirty way of avoiding distribution. When you need more than one server, your threads won't save you. It has nothing to do with Node.js and thread support.


I wasn't creating a new search engine, I was doing a one-off scraping job in my spare time. Creating a fully distributed solution would have been total overkill. But threading could and would have helped.
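For a one-off job like this, a thread pool is only a few lines. A minimal sketch of the idea, using Python's standard-library `ThreadPoolExecutor` (the `fetch` stub and example URLs are placeholders, not from the original thread):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Placeholder so the sketch is self-contained; a real crawl would do
    # something like urllib.request.urlopen(url).read() here.
    return f"<html for {url}>"

urls = [f"https://example.com/page/{i}" for i in range(20)]

# Threads overlap the I/O waits: while one request blocks on the network,
# the others keep going. No distribution layer needed for a single machine.
with ThreadPoolExecutor(max_workers=8) as pool:
    pages = list(pool.map(fetch, urls))
```

Since scraping is I/O-bound, even Python's GIL doesn't get in the way; the threads spend most of their time waiting on sockets, not holding the interpreter.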

Honestly, stupidly hostile and ignorant comments like this are the absolute worst thing about Hacker News.



