Google images scraper is a Datacol-based module that scrapes images from Google Images for the keywords you specify. In our example, downloaded files are saved to the Pictures folder on the local computer. Extra information (image height, width, and query) is exported to an xlsx file. You can also adjust Datacol settings to upload images via FTP and to publish content to popular CMSs such as WordPress, DLE, and Joomla.
1. Install the Datacol trial version.
2. Choose content-parsers/google-images-extractor.par in the campaign tree and click the Start button to launch the Google images extractor campaign.
Before launching content-parsers/google-images-extractor.par, you can adjust the Input data. To do so, select the campaign in the campaign tree. This is where you set up the keywords to extract Google images for.
Please contact us if the Google images scraper does not collect data after you have made changes to the Starting URL list.
3. Wait for data extraction results to appear. Once you see the first results, you can stop the running campaign by clicking the Stop button.
4. After the campaign has finished or been stopped, you can find the google-images-extractor.xlsx file in the Documents folder.
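The keyword list entered as Input data can be prepared ahead of time. The following is a minimal Python sketch, not part of Datacol, that removes blank lines and duplicate keywords and shows the public Google Images search URL each keyword corresponds to. The helper names `clean_keywords` and `image_search_url` are hypothetical, and the `tbm=isch` URL form is an assumption about Google's public search interface, not a description of how Datacol builds its requests.

```python
from urllib.parse import urlencode


def clean_keywords(raw_lines):
    """Collapse whitespace, drop blanks and case-insensitive duplicates, keep order."""
    seen = set()
    result = []
    for line in raw_lines:
        keyword = " ".join(line.split())  # trim and collapse inner whitespace
        key = keyword.lower()
        if keyword and key not in seen:
            seen.add(key)
            result.append(keyword)
    return result


def image_search_url(keyword):
    # Assumption: the public Google Images search URL form (tbm=isch).
    query = urlencode({"q": keyword, "tbm": "isch"})
    return "https://www.google.com/search?" + query


keywords = clean_keywords(["red  sports car", "", "Red sports car ", "mountain lake"])
for kw in keywords:
    print(kw, "->", image_search_url(kw))
```

The cleaned keywords can then be pasted, one per line, into the campaign's Input data.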
[spoiler show=”What if the Google images extractor is blocked (banned) by the source website?” hide=”What if the Google images extractor is blocked (banned) by the source website?”]
If the source website blocks your IP address (once blocked, you will get no more extraction results), use a proxy.
[/spoiler]
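For readers unfamiliar with what using a proxy means in practice, here is a minimal Python sketch of routing HTTP requests through a proxy with the standard library. It only illustrates the general idea and does not show how Datacol configures proxies. The proxy address `http://127.0.0.1:8080` and the helper name `build_proxy_opener` are hypothetical placeholders.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical proxy address; replace it with a proxy you are allowed to use.
opener = build_proxy_opener("http://127.0.0.1:8080")

# Uncomment to send a real request through the proxy:
# with opener.open("https://example.com", timeout=10) as response:
#     print(response.status)
```

With a proxy in place, the target site sees the proxy's IP address instead of yours, which is why switching proxies can restore access after a block.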
[/tab]
[tab name=”Data processing and Export”] Data processing options for data harvested by the Google images extractor: