Last Updated on 2021-06-07 by Clay
Using Google Images to search for pictures is an everyday thing. Whether you are looking for cute cats, a homework cover page, or even lovely beauties, Google Search is a great tool.
Sometimes a project requires us to download many pictures from Google Search, and doing that by hand is slow.
So we consider writing a crawler to do it. But another problem appears: crawling Google Search is more difficult than expected. (Of course you can do it, I'm sure, but it's not a fast way.)
Everyone runs into the same problem, so a master implemented a Python package for it: "google_images_download". You can use it to download the pictures you want, in no more than ten lines of code!
Here is the GitHub link: https://github.com/hardikvasa/google-images-download
The project's documentation has a more detailed guide.
google_images_download
First, install the package with this command:
sudo pip3 install google_images_download
Now let's take a look at the sample code:
# -*- coding: utf-8 -*-
from google_images_download import google_images_download

# Create the downloader object
response = google_images_download.googleimagesdownload()

# Search for "dog", download at most 20 pictures, and print each URL
arguments = {"keywords": "dog", "limit": 20, "print_urls": True}
paths = response.download(arguments)
print(paths)
- keywords: the item we want to search for
- limit: the maximum number of pictures to download
- print_urls: whether to print the URL of each picture
Then we call the "download()" function to start. The program's output tells you, for each picture, whether the download succeeded or failed.
The downloaded pictures are put in a new folder named after your "keyword", which the package creates automatically (by default inside a "downloads" directory under the current folder).
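To check what actually landed on disk after a run, a short sketch like the following can list the image files in a folder. The helper name `list_images` and the set of file suffixes are my own assumptions for illustration; they are not part of the package.

```python
from pathlib import Path

# Assumed set of image suffixes; extend it if your downloads use other formats
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def list_images(folder):
    """Hypothetical helper: return the image files in `folder`, sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        # Nothing was downloaded yet, or the folder path is wrong
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

For the sample above you could call `len(list_images("downloads/dog"))`, assuming the default output location, and compare the count with the "limit" you asked for.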
Issues
I ran into a problem when I wanted to download more pictures: there is a limit on the number of pictures. For example, I set the "limit" parameter to 1000:
# -*- coding: utf-8 -*-
from google_images_download import google_images_download

response = google_images_download.googleimagesdownload()

# Same as before, but ask for 1000 pictures
arguments = {"keywords": "dog", "limit": 1000, "print_urls": True}
paths = response.download(arguments)
print(paths)
Output:
Looks like we cannot locate the path the 'chromedriver' (use the '--chromedriver' argument to specify the path to the executable.) or google chrome browser is not installed on your machine.
As you can see, the download fails when we ask for more than 100 pictures. According to the error message, we probably need to set the path to "chromedriver".
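If you do want to go the chromedriver route, the arguments might be assembled as in the sketch below. It assumes the arguments dictionary accepts a "chromedriver" key that mirrors the '--chromedriver' flag named in the error message, and that the cutoff is 100 pictures; `build_arguments` is a hypothetical helper of mine, not part of the package.

```python
def build_arguments(keyword, limit, chromedriver_path=None):
    """Hypothetical helper: assemble the arguments dictionary for download()."""
    arguments = {"keywords": keyword, "limit": limit, "print_urls": True}
    if limit > 100:
        if chromedriver_path is None:
            # Fail early instead of waiting for the package's error message
            raise ValueError("limits above 100 need a chromedriver path")
        # Assumption: this key mirrors the '--chromedriver' command-line flag
        arguments["chromedriver"] = chromedriver_path
    return arguments
```

The returned dictionary would then be passed to "download()" exactly like the hand-written one in the sample code. The path you pass in has to point at a chromedriver executable on your own machine.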
But that is a lot of trouble. Patching the source code may be a faster way.
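To find out where the installed source file lives, one option is to ask Python itself. This is a minimal sketch using only the standard library; `module_source_path` is a hypothetical helper name, and the dotted module name in the note after the code is assumed from the import line in the sample code.

```python
import importlib.util


def module_source_path(module_name):
    """Hypothetical helper: return the file path of an importable module, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when the parent package of a dotted name is not installed
        return None
    if spec is None:
        # A top-level name that cannot be found
        return None
    return spec.origin
```

With the package installed, `print(module_source_path("google_images_download.google_images_download"))` should show the location of the file to open in your editor.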
Open "google_images_download.py" in an editor and look for this code:
if limit < 101:
    raw_html = self.download_page(url)  # download page
Then raise the limit in the condition:
if limit < 100000:
    raw_html = self.download_page(url)  # download page
After this change, we can download more than 100 pictures.