[Python] Use googletrans Python Package to use Google Translate

When we want to translate text from one language to another, many of us open the Google Translate website.

Google Translate has a strong reputation, and as NLP-related deep learning technology has matured in recent years, its translation quality has improved year by year.

So, if we need a program to translate an article or a piece of text, connecting it to Google Translate is a natural choice.

When it comes to using Python with Google Translate, I can think of three approaches:

  • Use the official Google Translate API (this should be the most stable method)
  • Write your own crawler
  • Use a package written by someone else

Naturally, using an existing package is the fastest. What I want to record today is how to use "googletrans", a Python package that calls Google Translate directly. You can install it with pip and start using it right away, which is very convenient.


Preparation

Install the googletrans Python package with the following command.

sudo pip3 install googletrans

By the way, this package has some limitations.

  • A single text must not exceed 15,000 characters
  • The package depends on the Google Translate web page, so when that page changes you need to wait for the package to be updated before it works properly again
  • An HTTP 5xx error probably means that Google has blocked your IP address

Still, this is a great package because it is so easy to use.
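
The 15,000-character limit listed above means a longer document has to be split before it is translated. Here is a minimal sketch of one way to do that. Note that `split_text` and `MAX_CHARS` are hypothetical names of my own, not part of googletrans, and the sketch assumes the limit is counted in Python string characters.

```python
MAX_CHARS = 15000  # limit quoted above; an assumption for whichever version you install


def split_text(text, limit=MAX_CHARS):
    """Split text into pieces of at most `limit` characters.

    Prefers to break right after the last newline or space inside the
    window, and falls back to a hard cut when there is no whitespace.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    chunks = []
    while len(text) > limit:
        window = text[:limit]
        cut = max(window.rfind("\n"), window.rfind(" "))
        if cut <= 0:
            cut = limit   # no usable whitespace: hard cut at the limit
        else:
            cut += 1      # keep the whitespace at the end of the left chunk
        chunks.append(text[:cut])
        text = text[cut:]
    if text:
        chunks.append(text)
    return chunks
```

Each chunk could then be passed to the translator in turn and the results joined. Breaking on whitespace is only a rough heuristic, so a sentence may still be split across two chunks.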


Translate

Since we are connecting to Google Translate, the most important feature is, of course, translation.

import googletrans


# Create the translator
translator = googletrans.Translator()


# Basic translation: the source language is detected automatically
results = translator.translate('我覺得今天天氣不好。')
print(results)       # the full Translated object
print(results.text)  # only the translated text



Output:

Translated(src=zh-CN, dest=en, text=I think the weather is bad today., pronunciation=None, extra_data="{'translat…")
I think the weather is bad today.

As you can see, without any settings the language of the input string is detected automatically, and the output language defaults to English.

Of course, we can also specify the output language. Let's look at an example:

print('English:', translator.translate('我覺得今天天氣不好', dest='en').text)
print('Japanese:', translator.translate('我覺得今天天氣不好', dest='ja').text)
print('Korean:', translator.translate('我覺得今天天氣不好', dest='ko').text)



Output:

English: I think the weather is bad today
Japanese: 私は、今日の天気は悪いだと思います
Korean: 나는 날씨가 오늘 나쁜 생각

As we can see, the "dest" parameter selects the language to translate into.
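
If you need the same text in several languages, the three calls above can be wrapped in a small helper. This is only a sketch: `translate_to_many` is a hypothetical name of my own, and it assumes a translator whose `translate(text, dest=...)` call returns the result directly with a `.text` attribute, as in the examples above.

```python
def translate_to_many(translator, text, dests):
    """Return {language_code: translated_text} for each code in `dests`.

    `translator` can be any object with a translate(text, dest=...) method
    whose result has a `.text` attribute.
    """
    return {dest: translator.translate(text, dest=dest).text for dest in dests}
```

With the translator created earlier, calling `translate_to_many(translator, '我覺得今天天氣不好', ['en', 'ja', 'ko'])` would return a dictionary keyed by those three codes. Passing the translator in as an argument also makes the helper easy to test without a network connection.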


Detect language

Interestingly, if we come across text in a language we don't recognize, we can use this package to detect which language it is.

For example, take the following Japanese phrase (yes, I know it is Japanese).

# Detect
unknown_sentence = 'おはよう'
results = translator.detect(unknown_sentence)
print(results)
print(results.lang)



Output:

Detected(lang=ja, confidence=1.0)
ja

The result confirms that the text is Japanese, with a confidence of 1.0.
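
Since the result carries a confidence value as well as a language code, you may want to ignore uncertain detections. Here is a minimal sketch of that idea. `confident_lang` and the 0.9 threshold are my own assumptions, not part of googletrans, and the function only relies on the `lang` and `confidence` attributes shown in the output above.

```python
def confident_lang(detected, threshold=0.9):
    """Return detected.lang if the confidence reaches `threshold`, else None."""
    confidence = detected.confidence
    if confidence is None or confidence < threshold:
        return None
    return detected.lang
```

For the example above, `confident_lang(results)` would give back 'ja', because the reported confidence of 1.0 is above the threshold.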


Get Language Codes

Both specifying the target language and reading the detection result rely on knowing the language codes.

For example, everyone knows that "en" means English. But what about "af"?

I checked: "af" stands for Afrikaans. Yes, exactly as I expected!

In fact, we can also use googletrans itself to look up the language codes.

import googletrans
from pprint import pprint

# Print the mapping from language names to language codes
pprint(googletrans.LANGCODES)



Output:

{'Filipino': 'fil',
'Hebrew': 'he',
'afrikaans': 'af',
'albanian': 'sq',
'amharic': 'am',
'arabic': 'ar',
'armenian': 'hy',
'azerbaijani': 'az',
'basque': 'eu',
'belarusian': 'be',
'bengali': 'bn',
'bosnian': 'bs',
'bulgarian': 'bg',
'catalan': 'ca',
'cebuano': 'ceb',
'chichewa': 'ny',
'chinese (simplified)': 'zh-cn',
'chinese (traditional)': 'zh-tw',
'corsican': 'co',
'croatian': 'hr',
'czech': 'cs',
'danish': 'da',
'dutch': 'nl',
'english': 'en',
'esperanto': 'eo',
'estonian': 'et',
'filipino': 'tl',
'finnish': 'fi',
'french': 'fr',
'frisian': 'fy',
'galician': 'gl',
'georgian': 'ka',
'german': 'de',
'greek': 'el',
'gujarati': 'gu',
'haitian creole': 'ht',
'hausa': 'ha',
'hawaiian': 'haw',
'hebrew': 'iw',
'hindi': 'hi',
'hmong': 'hmn',
'hungarian': 'hu',
'icelandic': 'is',
'igbo': 'ig',
'indonesian': 'id',
'irish': 'ga',
'italian': 'it',
'japanese': 'ja',
'javanese': 'jw',
'kannada': 'kn',
'kazakh': 'kk',
'khmer': 'km',
'korean': 'ko',
'kurdish (kurmanji)': 'ku',
'kyrgyz': 'ky',
'lao': 'lo',
'latin': 'la',
'latvian': 'lv',
'lithuanian': 'lt',
'luxembourgish': 'lb',
'macedonian': 'mk',
'malagasy': 'mg',
'malay': 'ms',
'malayalam': 'ml',
'maltese': 'mt',
'maori': 'mi',
'marathi': 'mr',
'mongolian': 'mn',
'myanmar (burmese)': 'my',
'nepali': 'ne',
'norwegian': 'no',
'pashto': 'ps',
'persian': 'fa',
'polish': 'pl',
'portuguese': 'pt',
'punjabi': 'pa',
'romanian': 'ro',
'russian': 'ru',
'samoan': 'sm',
'scots gaelic': 'gd',
'serbian': 'sr',
'sesotho': 'st',
'shona': 'sn',
'sindhi': 'sd',
'sinhala': 'si',
'slovak': 'sk',
'slovenian': 'sl',
'somali': 'so',
'spanish': 'es',
'sundanese': 'su',
'swahili': 'sw',
'swedish': 'sv',
'tajik': 'tg',
'tamil': 'ta',
'telugu': 'te',
'thai': 'th',
'turkish': 'tr',
'ukrainian': 'uk',
'urdu': 'ur',
'uzbek': 'uz',
'vietnamese': 'vi',
'welsh': 'cy',
'xhosa': 'xh',
'yiddish': 'yi',
'yoruba': 'yo',
'zulu': 'zu'}
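
The dictionary above maps names to codes, but the earlier question ("what is af?") goes the other way. Here is a minimal sketch of a reverse lookup. `code_to_name` and `LANGCODES_SAMPLE` are hypothetical names of my own, and the sample is just a few entries copied from the output above so that the example runs without the package installed.

```python
# A few entries copied from the name -> code mapping printed above.
LANGCODES_SAMPLE = {
    "afrikaans": "af",
    "english": "en",
    "japanese": "ja",
    "chinese (traditional)": "zh-tw",
}


def code_to_name(code, langcodes=LANGCODES_SAMPLE):
    """Return the language name for `code`, or None if the code is unknown."""
    # Invert the mapping so that codes become the keys.
    reverse = {value: name for name, value in langcodes.items()}
    return reverse.get(code.lower())
```

In real use you would pass `googletrans.LANGCODES` as the second argument. As far as I know the package also provides `googletrans.LANGUAGES`, which is already keyed by code, so check that first before writing your own lookup.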
