Skip to content

[Python] Use Python to read and write csv file

Last Updated on 2021-08-02 by Clay

The csv file (Comma-Separated Values) is a very convenient format for displaying data. It is a plain text format for storing table data. This article mainly introduces how to read and write by Python.

The famous word processing software Excel has two most common file extensions, one is xlsx and the other is csv. The meaning of "Comma-Separated Values" may not be easy to understand just by looking at the explanation, let's look at an example directly.

csv file example in Excel
csv file in Excel

The picture above is a csv file I created and opened with Excel. Now I use Notepad++ to open it.

csv file example in Notepad++
csv file in Notepad++

We can see that there is only one "," between the text and the number to separate them as data for different lines. In this way, do you have a little understanding of what is meant by "Comma-Separated values" ?

Let's enter today's topic. I will record how to use Python to read and write csv files. This is one of the indispensable skills in various data analysis fields.


Read and write csv file

We will use the IMDB data set as an example. IMDB is a classic movie review data set, which is often used to test the classification model (because there are two labels "positive" and "negative")

We can download the IMDB data set here: https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews/download

After downloading and decompressing, we should get the IMDB csv file. I renamed it to imdb.csv here.

The reading method is very simple:

# -*- coding: utf-8 -*-
import csv

with open('imdb.csv', newline='', encoding='utf-8') as csvfile:
    rows = csv.reader(csvfile)
    for row in rows:
        print(row)



There are too many output items, so I won't list them here. You can look at the printed items. Basically, there are two elements in a row, the former is a movie review, and the latter is a positive or negative label.

In addition to this reading method, in fact, we can also use the name at the beginning of each line to print out the data.

IMDB data in csv format
IMDB csv file

Here is a brief demonstration, basically the two categories are review and sentiment.

# -*- coding: utf-8 -*-
import csv

with open('imdb.csv', newline='', encoding='utf-8') as csvfile:
    rows = csv.DictReader(csvfile)
    for row in rows:
        print(row['sentiment'])



Here we only choose to print out the sentiment part, so our output results will only see positive and negative two types.

After reading, let's see how to write.

# -*- coding: utf-8 -*-
import csv

with open('test.csv', 'w', newline='', encoding='utf-8') as csvfile:
    rows = csv.writer(csvfile)
    rows.writerow(['Today', 'is', 'a', 'nice', 'day', '.'])
    rows.writerow(['Today', 'is', 'a', 'bad', 'day', '.'])



Output:

how to storage the data in csv file
The csv file storage method

Just as easy!


Supplement

If you want to organize the data into a csv file, as mentioned earlier, csv file is actually just "Comma-Separated Values". Think about it this way, in fact, we can directly use Python strings to archive csv file.

# -*- coding: utf-8 -*-
a = ['Today', 'is', 'a', 'nice', 'day', '.']
b = ['Today', 'is', 'a', 'bad', 'day', '.']

csvFile = ''
ab = [a, b]
for line in ab:
    for word in line:
        csvFile += '{},'.format(word)

    csvFile += '\n'

open('test2.csv', 'w', encoding='utf-8').write(csvFile)



Output:

violence way of saving csv file
display csv file with we save
The csv file storage method 2

This storage is quite intuitive. The above is the experience notes on how to read and write csv file in Python.


References


Read More

Leave a Reply