Quickly Generate Word Clouds with the Python Module wordcloud

Introduction

The word cloud is a well-known visualization in the natural language processing domain. At first I thought it simply counted word frequencies and drew the high-frequency words larger.

But there is more to it: the shape of the cloud and the styling of the text also have to be learned, and it is not as simple as I thought.

I like trying new things, so today I studied the wordcloud module in Python and am recording my experience here.

wordcloud example

Before the first use, install the module with the following command:

pip3 install wordcloud

Then we need some text before we can start “counting word frequency”. The text I selected is a set of tutorial articles on “Word Vector” that I had saved earlier. To test the word cloud, I picked ten of them, combined them, and tokenized the result.

Since it is an English corpus, the tokenizer I chose is NLTK. If you are interested, you can refer to what I wrote before: NLTK Tutorial —— A Python package

The following is a simple sample code:

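# -*- coding: utf-8 -*-
import nltk
from wordcloud import WordCloud

# word_tokenize needs NLTK's Punkt tokenizer data, e.g. nltk.download('punkt')
# (newer NLTK releases may ask for 'punkt_tab' instead)

# Read the whole corpus and close the file afterwards
with open('data.txt', 'r', encoding='utf-8') as f:
    text = f.read()

# Tokenize with NLTK, then rejoin the tokens with single spaces
text = ' '.join(nltk.word_tokenize(text))

# Count the words and lay out the cloud with default settings
cloud = WordCloud().generate(text)

# Save the rendered cloud as an image
cloud.to_file('output.png')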

Output: (word cloud image, saved as output.png)

The generate() method counts the words and builds the word cloud, and the to_file() method saves it as an image.
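To make the “counting word frequency” step visible, here is a minimal sketch that counts the words by hand and then passes the counts to wordcloud. The helper names count_words and build_cloud, the sample sentence, the regular-expression tokenizer (a simplified stand-in for NLTK), and the size and color settings are all my own assumptions for illustration. generate_from_frequencies() and the width, height, background_color and max_words parameters are part of the wordcloud module.

```python
import re
from collections import Counter


def count_words(text, min_length=3):
    """Hypothetical helper: count lowercase words of at least min_length letters."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if len(w) >= min_length)


def build_cloud(frequencies, max_words=100):
    """Hypothetical helper: build a WordCloud from a word -> count mapping."""
    # Imported here so count_words() can be tried without wordcloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(
        width=800,                 # canvas width in pixels
        height=400,                # canvas height in pixels
        background_color="white",  # the default background is black
        max_words=max_words,       # keep only the most frequent words
    )
    return cloud.generate_from_frequencies(frequencies)


# Made-up sample text, only for demonstration.
sample = "Word vectors map words to vectors; similar words get similar vectors."
frequencies = count_words(sample)
print(frequencies.most_common(3))

# To render the image:
# build_cloud(frequencies).to_file("output_custom.png")
```

Counting the words yourself is useful when you want to filter or reweight them before drawing; for the plain case, generate() on the raw text is enough.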
