YAML (short for “YAML Ain’t Markup Language”) is a human-readable data serialization format that is often used for configuration files, data exchange, and other purposes. It’s designed to be simple and easy to read, making it popular in many programming languages, including Python.
PyYAML is a popular Python library for working with YAML data. It lets you dump Python data to YAML, load YAML into Python objects such as dictionaries and lists, modify the loaded data, and, together with the standard json module, convert it to JSON.
In this guide, we will explore the main features of PyYAML for supporting YAML in your Python applications.

Dumping Data to YAML using PyYaml
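PyYAML provides the dump() and dump_all() functions for converting Python data to YAML format. dump() serializes a single object, while dump_all() serializes a sequence of objects as separate documents in one stream. When no output stream is given, both return the YAML text as a string. Here’s an example: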
import yaml
# Create a Python dictionary
data = {'name': 'John', 'age': 30, 'city': 'New York'}
# Dump the dictionary to YAML format
yaml_data = yaml.dump(data)
# Print the YAML data
print(yaml_data)
Output (by default, dump() sorts dictionary keys alphabetically):
age: 30
city: New York
name: John
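dump_all() is mentioned above but not shown, so here is a minimal sketch of it. It assumes PyYAML 5.1 or later (for the sort_keys argument), and the sample records are made up for illustration. It writes two documents into one stream and reads them back with safe_load_all():

```python
import yaml

# Hypothetical sample records, used only for illustration
people = [
    {'name': 'John', 'age': 30},
    {'name': 'Jane', 'age': 28},
]

# dump_all() serializes each item as its own YAML document,
# separated by '---' markers; sort_keys=False keeps insertion order
stream = yaml.dump_all(people, sort_keys=False)
print(stream)

# safe_load_all() returns a generator that yields one object per document
restored = list(yaml.safe_load_all(stream))
print(restored == people)
```

Passing sort_keys=False is a matter of taste: sorted keys give stable diffs, while insertion order usually reads more naturally in configuration files.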
Writing YAML Data to a File with PyYaml
You can also write YAML data directly to a file by passing an open file object as the second argument to dump(). Here’s an example:
import yaml
# Create a Python dictionary
data = {'name': 'John', 'age': 30, 'city': 'New York'}
# Write the dictionary to a YAML file
with open('data.yaml', 'w') as f:
    yaml.dump(data, f)
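A slightly more defensive variant is sketched below. It assumes PyYAML 5.1 or later, and the temporary directory is only there so the example cleans up after itself. It uses safe_dump(), sets an explicit file encoding, and keeps the dictionary's insertion order:

```python
import os
import tempfile
import yaml

data = {'name': 'John', 'age': 30, 'city': 'New York'}

# Write into a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.yaml')
    with open(path, 'w', encoding='utf-8') as f:
        # safe_dump() only emits standard YAML tags;
        # sort_keys=False keeps the dictionary's insertion order
        yaml.safe_dump(data, f, sort_keys=False)
    # Read the file back as plain text to see what was written
    with open(path, 'r', encoding='utf-8') as f:
        written_text = f.read()

print(written_text)
```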
Loading YAML Data and Safe Loading with PyYaml
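To read YAML data from a file or a string, PyYAML provides the load() and safe_load() functions. safe_load() is recommended for any input you do not fully trust: it only constructs standard YAML types such as dictionaries, lists, strings, and numbers, and it refuses tags that would create arbitrary Python objects. Calling load() with Loader=yaml.SafeLoader, as below, is equivalent to calling safe_load(). Here’s an example: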
import yaml
# Load YAML data from a file
with open('data.yaml', 'r') as f:
    loaded_data = yaml.load(f, Loader=yaml.SafeLoader)
# Access and print the loaded data
print(loaded_data)
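To make the safety difference concrete, the following sketch feeds safe_load() a document that uses a Python-specific tag. The payload string is a made-up, harmless example; the point is only that the safe loader refuses it instead of acting on it:

```python
import yaml

# A document that asks the loader to call a Python function.
# The payload is a harmless echo, chosen only for illustration.
untrusted = "!!python/object/apply:os.system ['echo unsafe']"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as exc:
    # SafeLoader has no constructor for python/* tags, so it raises
    rejected = True
    print('Rejected:', type(exc).__name__)
```

Catching yaml.YAMLError covers both syntax errors and rejected tags, which is usually what you want at the boundary where untrusted input enters the application.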
Modifying Dictionary Values in YAML Data using PyYaml
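Once loaded, YAML mappings are ordinary Python dictionaries, so you can read and modify their values with normal dictionary operations. The change only affects the in-memory data until you dump it back to a file. Here’s an example: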
import yaml
# Load YAML data from a file
with open('data.yaml', 'r') as f:
    loaded_data = yaml.load(f, Loader=yaml.SafeLoader)
# Modify a value in the loaded data
loaded_data['age'] = 31
# Dump the modified data to YAML format
yaml_data = yaml.dump(loaded_data)
# Print the modified YAML data
print(yaml_data)
Converting YAML to JSON using PyYaml
PyYAML does not write JSON itself, and dump() has no JSON output mode: its default_style argument only controls how scalars are quoted. To convert YAML to JSON, load the YAML data with PyYAML and serialize the result with Python’s built-in json module. Here’s an example:
import json
import yaml
# Load YAML data from a file
with open('data.yaml', 'r') as f:
    loaded_data = yaml.load(f, Loader=yaml.SafeLoader)
# Convert the loaded data to JSON
json_data = json.dumps(loaded_data, indent=2)
# Print the JSON data
print(json_data)
Output
{
  "age": 30,
  "city": "New York",
  "name": "John"
}
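One thing to watch for when handing loaded YAML to the json module is that YAML has types JSON lacks. The sketch below uses a made-up document with a date field. PyYAML turns an unquoted ISO date into a datetime.date object, which json.dumps() rejects unless you tell it how to convert such values:

```python
import json
import yaml

# YAML resolves unquoted ISO dates to datetime.date objects
document = """
name: John
joined: 2024-01-15
"""
loaded = yaml.safe_load(document)
print(type(loaded['joined']).__name__)

# json.dumps() cannot serialize date objects on its own;
# default=str converts any unsupported value to its string form
json_text = json.dumps(loaded, default=str)
print(json_text)
```

Using default=str is the bluntest possible choice. A stricter application might pass a function that handles only date types and raises for everything else.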
Converting YAML to Python Dictionaries using PyYaml
Loading is also how you convert YAML data to Python dictionaries: load() and safe_load() return a dictionary whenever the top level of the YAML document is a mapping. Here’s an example:
import yaml
# Load YAML data from a file
with open('data.yaml', 'r') as f:
    loaded_data = yaml.load(f, Loader=yaml.SafeLoader)
# Access and print the loaded data as Python dictionary
print(loaded_data)
Output
{'age': 30, 'city': 'New York', 'name': 'John'}
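The result is only a dictionary when the document's top level is a mapping. The short sketch below, with made-up inputs, shows the other cases. It also shows a common surprise: PyYAML follows YAML 1.1, where unquoted yes, no, on, and off are booleans.

```python
import yaml

# The top-level YAML node decides which Python type comes back
as_dict = yaml.safe_load("name: John\nage: 30")
as_list = yaml.safe_load("- apples\n- pears")
as_scalar = yaml.safe_load("42")
print(type(as_dict).__name__, type(as_list).__name__, type(as_scalar).__name__)

# Unquoted yes/no/on/off load as booleans;
# quote them when a string is intended
flags = yaml.safe_load("enabled: yes\ncountry: 'no'")
print(flags)
```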
Example of Using YAML in a Python Application
Install PyYAML
First, you need to install the PyYAML module. You can do this using pip, the Python package manager, by running the following command in your terminal or command prompt:
pip install pyyaml
Create a YAML file
Next, create a YAML file with the parameters you want to add. For example, let’s say you want to add parameters for a machine learning model configuration, such as learning rate, batch size, and number of epochs. You can create a file named config.yml with the following content:
learning_rate: 0.001
batch_size: 32
num_epochs: 100
Load YAML data in Python
Now, you can load the YAML data in Python using PyYAML. Here’s an example that shows how to load the YAML data from the config.yml file:
import yaml
# Load YAML data from file
with open('config.yml', 'r') as file:
    config = yaml.load(file, Loader=yaml.SafeLoader)
# Access parameters
learning_rate = config['learning_rate']
batch_size = config['batch_size']
num_epochs = config['num_epochs']
# Print the parameters
print('Learning Rate:', learning_rate)
print('Batch Size:', batch_size)
print('Number of Epochs:', num_epochs)
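Indexing the dictionary directly works, but a typo in the file only surfaces later as a KeyError. One possible way to catch problems earlier is sketched below. The TrainingConfig class and parse_config function are hypothetical names introduced for this illustration and are not part of PyYAML:

```python
from dataclasses import dataclass
import yaml

@dataclass
class TrainingConfig:
    """Hypothetical container for the three settings in config.yml."""
    learning_rate: float = 0.001
    batch_size: int = 32
    num_epochs: int = 100

def parse_config(text: str) -> TrainingConfig:
    raw = yaml.safe_load(text) or {}  # an empty document loads as None
    if not isinstance(raw, dict):
        raise ValueError('config must be a YAML mapping')
    # Unknown keys raise TypeError here, which catches typos early
    config = TrainingConfig(**raw)
    if not isinstance(config.learning_rate, float) or config.learning_rate <= 0:
        raise ValueError('learning_rate must be a positive float')
    return config

# Keys missing from the YAML text fall back to the dataclass defaults
config = parse_config("learning_rate: 0.01\nbatch_size: 64\n")
print(config)
```

The explicit float check is there because PyYAML only recognizes a float when it contains a decimal point, so a value written as 1e-3 loads as the string '1e-3' rather than a number.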
Update parameters
You can easily update the parameters in the YAML data using standard Python dictionary operations. Here’s an example that demonstrates how to update the learning_rate parameter:
# Update learning rate
config['learning_rate'] = 0.01
# Save updated data to YAML file
with open('config.yml', 'w') as file:
    yaml.dump(config, file)
Once you have loaded the YAML data and updated the parameters, you can access them in your application as needed. For example, you can use the updated learning rate value in your machine learning model training loop.
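Opening the file with 'w' truncates it before the new content is written, so a crash at the wrong moment can leave an empty config behind. The sketch below shows one way around that. The update_config helper is a hypothetical name, not a PyYAML function, and the demonstration runs in a throwaway directory:

```python
import os
import tempfile
import yaml

def update_config(path, **changes):
    """Hypothetical helper: load, update, and atomically rewrite a YAML file."""
    with open(path, 'r', encoding='utf-8') as f:
        config = yaml.safe_load(f) or {}
    config.update(changes)
    # Write to a temporary file in the same directory, then swap it in,
    # so a crash mid-write cannot leave a truncated config behind
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    with os.fdopen(fd, 'w', encoding='utf-8') as f:
        yaml.safe_dump(config, f, sort_keys=False)
    os.replace(tmp_path, path)
    return config

# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'config.yml')
    with open(demo_path, 'w', encoding='utf-8') as f:
        f.write("learning_rate: 0.001\nbatch_size: 32\nnum_epochs: 100\n")
    updated = update_config(demo_path, learning_rate=0.01)
    with open(demo_path, 'r', encoding='utf-8') as f:
        saved_text = f.read()

print(saved_text)
```

Any comments in the original file are lost on rewrite, because PyYAML discards comments when loading.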
YAML Serialization and Deserialization Example
To serialize Python objects into YAML format, you can use the yaml.dump() function provided by PyYAML. Its first argument is the Python object to serialize. The second, optional argument is a file-like object to write to; when it is omitted, as in this example, dump() returns the YAML text as a string. Here’s an example:
import yaml
# Create a Python dictionary
data = {
    'name': 'John',
    'age': 30,
    'city': 'New York'
}
# Serialize the dictionary to YAML
yaml_data = yaml.dump(data)
# Print the serialized YAML data
print(yaml_data)
Output
age: 30
city: New York
name: John
To deserialize YAML data into Python objects, you can use the yaml.load() function provided by PyYAML. It takes the YAML source, either a string or an open file-like object, plus a Loader argument that selects the loader class (required since PyYAML 6.0). Here’s an example:
import yaml
# YAML data
yaml_data = '''
name: John
age: 30
city: New York
'''
# Deserialize the YAML data to a Python dictionary
data = yaml.load(yaml_data, Loader=yaml.SafeLoader)
# Access the values in the dictionary
print('Name:', data['name'])
print('Age:', data['age'])
print('City:', data['city'])
Output
Name: John
Age: 30
City: New York
Note: In the above example, we used yaml.SafeLoader as the loader. It only constructs standard YAML types, so YAML data cannot use it to create arbitrary Python objects or run Python code.
Anchors, Aliases, Tags, and Multi-line strings using the PyYAML library in Python
Here’s an example that demonstrates anchors and aliases using the PyYAML library in Python. When the same Python object appears more than once in the data, dump() writes an anchor (&id001) at its first occurrence and an alias (*id001) at each later one. Tags (such as !!str) and multi-line block scalars (| and >) are YAML syntax that PyYAML interprets when a document is loaded.
import yaml
# Create a nested dictionary
data = {
    'name': 'John',
    'age': 30,
    'city': 'New York',
    'pets': [
        {'type': 'dog', 'name': 'Buddy'},
        {'type': 'cat', 'name': 'Whiskers'},
        {'type': 'fish', 'name': 'Nemo'}
    ],
    'hobbies': [
        {'name': 'reading', 'level': 'advanced'},
        {'name': 'coding', 'level': 'intermediate'},
        {'name': 'cooking', 'level': 'beginner'}
    ],
    'address': {
        'street': '123 Main St',
        'city': 'New York',
        'state': 'NY',
        'zip': '10001'
    }
}
# Point the second list entry at the same dict object as the first;
# PyYAML emits an anchor and an alias for the shared object
data['pets'][1] = data['pets'][0]
# Serialize the dictionary to YAML
yaml_data = yaml.dump(data)
# Print the serialized YAML data
print(yaml_data)
Output
address:
  city: New York
  state: NY
  street: 123 Main St
  zip: '10001'
age: 30
city: New York
hobbies:
- level: advanced
  name: reading
- level: intermediate
  name: coding
- level: beginner
  name: cooking
name: John
pets:
- &id001
  name: Buddy
  type: dog
- *id001
- name: Nemo
  type: fish
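The loading side of these features can be sketched as follows. The YAML document is invented for illustration. It combines an anchor with an alias and a merge key (<<), an explicit !!str tag, and both multi-line block scalar styles:

```python
import yaml

document = """
defaults: &defaults
  retries: 3
  timeout: 30
service:
  <<: *defaults
  timeout: 60
backup: *defaults
version: !!str 1.0
description: |
  First line
  Second line
summary: >
  Folded text
  on one line
"""

data = yaml.safe_load(document)

# Merge key: 'service' copies the anchored mapping, then overrides timeout
print(data['service'])
# Alias: 'backup' is the very same dict object as 'defaults'
print(data['backup'] is data['defaults'])
# Tag: !!str forces a string where 1.0 would otherwise load as a float
print(repr(data['version']))
# Literal style (|) keeps line breaks; folded style (>) joins lines with spaces
print(repr(data['description']))
print(repr(data['summary']))
```

Because an alias loads as the same object as its anchor, mutating data['backup'] would also change data['defaults']. Copy the dictionary first if the two should be independent.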
Final Thoughts
In this guide, we explored the PyYAML module for supporting YAML in Python applications. We covered how to dump Python data to YAML format, write YAML data to a file, load and safe load YAML data, modify dictionary values in YAML data, convert YAML to JSON and Python dictionaries, and work with anchors and aliases. PyYAML provides a convenient interface for working with YAML data in Python applications, making it a valuable tool for developers.