Category: page scraping
-
Google+ Scraper – retrieve data from Google+ profiles with Node.js and CoffeeScript
fhemberger/googleplus-scraper – GitHub
A lot of JavaScript, CoffeeScript, Node.js, etc.
-
Firefox Add-on “Dafizilla Table2Clipboard”
Dafizilla Table2Clipboard :: Add-ons for Firefox
sources on Sourceforge.net
If you want to paste table data into Microsoft Excel or OpenOffice Calc with the correct layout, simply use Table2Clipboard.
-
does harvesting HTML-obfuscated websites look like a nightmare to you?
I just completed two tasks in which I faced obfuscated CGI forms. It was quite a challenge, and at the outset I did not expect to succeed. But it’s done.
Now I am rather eager to apply my technique to interesting and lucrative tasks.
-
rather satisfied with today’s page scraping work
I did not run into much trouble; everything works just as expected. There could be more days like this one.
-
another page scraping task for the same client
It’s getting more fun again, now that I have become familiar again with my “old” tool set.
- First I take care of the forward navigation.
- Got the loop operating.
- But will the loop also stop?
- Yes, the loop stops successfully.
- Now for the content.
- No, reworking the loop first.
- Alright, the navigational part works fine.
- Now for the content.
- Content matched and split.
- CSV output is fine.
- TBD: RSS and Atom output.
- …
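
The steps in the log above (forward navigation, a loop that is guaranteed to stop, matching and splitting the content, CSV output) can be sketched roughly as follows. This is only an illustration in Python, not the tool set actually used for the client: the `PAGES` dictionary stands in for a real site, and the `rel="next"` link, the `<li>` items and the `|` separator are assumptions made up for the example.

```python
import csv
import io
import re

# Hypothetical stand-in for a web site (URL -> HTML).
# A real run would fetch each page over HTTP instead.
PAGES = {
    "/list?page=1": '<li>Alice | Berlin</li><li>Bob | Hamburg</li>'
                    '<a rel="next" href="/list?page=2">next</a>',
    "/list?page=2": '<li>Carol | Munich</li>',
}

# Assumed markup: a rel="next" link for navigation, <li> elements for content.
NEXT_LINK = re.compile(r'<a rel="next" href="([^"]+)"')
ITEM = re.compile(r"<li>(.*?)</li>")


def fetch(url):
    """Return the HTML for a URL (here: a lookup in the fake site)."""
    return PAGES[url]


def scrape(start_url, max_pages=100):
    """Follow 'next' links from start_url and collect the split rows."""
    rows = []
    seen = set()
    url = start_url
    # The loop stops when there is no next link, when a page repeats,
    # or when the page limit is reached.
    while url is not None and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        # Content: match each item, then split it into fields.
        for item in ITEM.findall(html):
            rows.append([field.strip() for field in item.split("|")])
        # Forward navigation: look for the link to the next page.
        match = NEXT_LINK.search(html)
        url = match.group(1) if match else None
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(scrape("/list?page=1")), end="")
```

Tracking the visited URLs in `seen` is what answers the "will the loop also stop?" question even if a site links a page back to itself; `max_pages` is an extra safety net.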