[Python] Use PyYAML Package To Load YAML Format File

I used to think JSON files could cover every need I had, but in fact they don't. Strictly speaking, JSON files are not very comfortable for humans to read.

I finally understand why I see so many YAML files in other people’s projects: the YAML format is easier for humans to read.

So I wrote this article to record how to use the PyYAML package in Python to read and write YAML files.


What is YAML format?

According to Wikipedia, YAML is a data serialization format. Serialization means converting a data structure or object into a format that can be stored or transmitted and later restored.

Put simply, YAML is a convenient format that can be stored in plain text files, quickly loaded by programs, and converted into specific data types.

By the way, converting data into YAML format is called “serialization”, and conversely, loading YAML into objects in a program is called “deserialization”.
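
Here is a minimal sketch of that round trip, assuming PyYAML is already installed. The `article` dict and its values are made up for illustration; `yaml.safe_dump` and `yaml.safe_load` are the PyYAML helpers that work on plain Python types.

```python
import yaml

# A plain Python object (sample values are made up for illustration)
article = {'name': 'Example post', 'tags': ['python', 'yaml'], 'views': 42}

# Serialization: Python object -> YAML text
text = yaml.safe_dump(article)

# Deserialization: YAML text -> Python object
restored = yaml.safe_load(text)

print(text)
print(restored == article)
```

If the round trip works, the restored object compares equal to the original one.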

Let’s compare the YAML and JSON formats below.

Suppose I have the following data (stored in a Python dict):

{
'2020-10-30': {'name': 'Use Python to copy pictures to the clipboard',
               'update_date': '2020-11-01',
               'url': 'https://clay-atlas.com/us/blog/2020/10/30/python-en-pillow-screenshot-copy-clipboard/'},

 '2020-11-04': {'name': 'Use Python PyInstaller to make an executable file with picture',
                'update_date': '2020-11-7',
                'url': 'https://clay-atlas.com/us/blog/2020/11/04/python-en-package-pyinstaller-picture/'}
}

As you can see, this is a record of my blog articles. The writing date is the key, and each entry stores three attributes: the URL, the article title, and the last update date.

Stored in YAML format, the data looks very clean:

'2020-10-30':
  name: Use Python to copy pictures to the clipboard
  update_date: '2020-11-01'
  url: https://clay-atlas.com/us/blog/2020/10/30/python-en-pillow-screenshot-copy-clipboard/
'2020-11-04':
  name: Use Python PyInstaller to make an executable file with picture
  update_date: '2020-11-7'
  url: https://clay-atlas.com/us/blog/2020/11/04/python-en-package-pyinstaller-picture/



By contrast, what does the same data look like when stored as JSON? Here is the saved file:

{"2020-11-04": {"url": "https://clay-atlas.com/us/blog/2020/11/04/python-en-package-pyinstaller-picture/", "name": "Use Python PyInstaller to make an executable file with picture", "update_date": "2020-11-7"}, "2020-10-30": {"url": "https://clay-atlas.com/us/blog/2020/10/30/python-en-pillow-screenshot-copy-clipboard/", "name": "Use Python to copy pictures to the clipboard", "update_date": "2020-11-01"}}



Yes, I haven’t changed anything! By default, the output is written as a single line.

JSON output can be indented to make it more readable, but I still think the YAML format is much cleaner. It is also convenient to edit the parameters by hand.
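
To be fair to JSON, the standard `json` module can indent its output too. A small sketch of the difference, using a shortened copy of the record above (the `records` name is only for illustration):

```python
import json

# Shortened copy of the blog-record example
records = {
    '2020-10-30': {'name': 'Use Python to copy pictures to the clipboard',
                   'update_date': '2020-11-01'},
}

compact = json.dumps(records)            # default: everything on one line
pretty = json.dumps(records, indent=2)   # indent=2 spreads it over several lines

print(compact)
print(pretty)
```

Even when indented, the JSON version still carries the braces, quotes, and commas that YAML leaves out.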


Use PyYAML to load and write files in YAML format

Now we come to the focus of this article: how to read and write YAML files. There are several Python packages that can do this; here I recommend PyYAML. (You could also write a parser yourself.)

First, we need to install the PyYAML package.

pip3 install pyyaml
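
A quick way to check the installation, as a small sketch: print the package version and whether PyYAML was built with LibYAML. As far as I know, the C-based classes used later in this article (`yaml.CDumper` and `yaml.CLoader`) only exist when the second value is `True`.

```python
import yaml

# Version of the installed PyYAML package
print(yaml.__version__)

# True when PyYAML was built against LibYAML, which is what makes
# the C-based classes such as yaml.CLoader and yaml.CDumper available
print(yaml.__with_libyaml__)
```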

After installation, let’s first look at how to store data with PyYAML.


Write YAML format file

The following is a simple example. The data stored are the pets kept at “home” and in the “office”, along with how many there are.

I have one dog and two cats at home, and the office is full of seafood.

# coding: utf-8
import yaml


def main():
    # Data
    data = {
        'home': {
            'pets': ['dog', 'cat', 'cat'],
            'numbers': 3,
        },
        'office': {
            'pets': ['fish', 'crab', 'shrimp'],
            'numbers': 3,
        }
    }

    # Save (yaml.CDumper requires PyYAML built with LibYAML;
    # use yaml.Dumper if it is not available)
    with open('example.yml', 'w') as f:
        yaml.dump(data, f, Dumper=yaml.CDumper)


if __name__ == '__main__':
    main()



The interface for saving looks almost the same as that of the json module, which is really nice: it is easy to remember how to save data with either package.
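
To show the similarity side by side, here is a small sketch that saves and reloads the same data with both modules. In-memory buffers stand in for real files, and I use the `safe_` variants of the PyYAML functions; the variable names are only for illustration.

```python
import io
import json
import yaml

data = {'home': {'pets': ['dog', 'cat', 'cat'], 'numbers': 3}}

# Both modules follow the same dump(obj, file) / load(file) pattern
json_buf = io.StringIO()
json.dump(data, json_buf)

yaml_buf = io.StringIO()
yaml.safe_dump(data, yaml_buf)

# Rewind the buffers before reading them back
json_buf.seek(0)
yaml_buf.seek(0)

from_json = json.load(json_buf)
from_yaml = yaml.safe_load(yaml_buf)

print(from_json == from_yaml == data)
```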

Next, let's open the saved example.yml file and have a look:

home:
  numbers: 3
  pets:
  - dog
  - cat
  - cat
office:
  numbers: 3
  pets:
  - fish
  - crab
  - shrimp



Quite easy to understand, and manual modification is also convenient.
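
Notice that `numbers` comes before `pets` in the file even though the dict lists `pets` first: `yaml.dump` sorts keys alphabetically by default. Here is a small sketch that keeps the original order instead, assuming PyYAML 5.1 or newer, where the `sort_keys` parameter is available:

```python
import yaml

data = {'home': {'pets': ['dog', 'cat', 'cat'], 'numbers': 3}}

sorted_text = yaml.dump(data)                     # default: keys are sorted
ordered_text = yaml.dump(data, sort_keys=False)   # keep dict insertion order

print(ordered_text)
```

Keeping the insertion order can make a hand-edited config file easier to follow, since related keys stay where you put them.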


Load YAML format files

Next comes the loading part.

# coding: utf-8
import yaml
from pprint import pprint


def main():
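    # yaml.CLoader requires PyYAML built with LibYAML; use yaml.Loader otherwise.
    # For files you do not fully trust, prefer yaml.safe_load(stream).
    with open('example.yml', 'r') as stream:
        data = yaml.load(stream, Loader=yaml.CLoader)

    pprint(data)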

if __name__ == '__main__':
    main()



Output:

{'home': {'numbers': 3, 'pets': ['dog', 'cat', 'cat']},
 'office': {'numbers': 3, 'pets': ['fish', 'crab', 'shrimp']}}
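
One caveat: as far as I know, `yaml.CLoader` is the C version of the full `Loader`, which can construct arbitrary Python objects from specially tagged input. For files you do not fully control, the safe variants are the usual choice. Below is a minimal sketch that also falls back to the pure-Python classes when PyYAML was installed without LibYAML; the aliases `SafeLoader` and `SafeDumper` are just local names chosen for this example.

```python
import yaml

# Prefer the C-accelerated safe classes when PyYAML was built with LibYAML,
# otherwise fall back to the pure-Python ones.
try:
    from yaml import CSafeLoader as SafeLoader, CSafeDumper as SafeDumper
except ImportError:
    from yaml import SafeLoader, SafeDumper

data = {'office': {'pets': ['fish', 'crab', 'shrimp'], 'numbers': 3}}

text = yaml.dump(data, Dumper=SafeDumper)
loaded = yaml.load(text, Loader=SafeLoader)   # only builds plain Python types

print(loaded)
```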

It’s that simple. In the future, I plan to move many of the parameter files I need to edit by hand to the YAML format.

