If we have two YAML files, how can we compare their keys and print the mismatched and/or missing ones? I tried DeepDiff, but it takes dictionaries, iterables, etc. How would I convert YAML files to dictionaries and use DeepDiff or any other method?
martineau
asked Sep 2, 2020 at 9:04
The following worked for me:
import yaml
from deepdiff import DeepDiff

def yaml_as_dict(my_file):
    # Merge the top-level keys of every document in the file into one dict
    my_dict = {}
    with open(my_file, 'r') as fp:
        docs = yaml.safe_load_all(fp)
        for doc in docs:
            for key, value in doc.items():
                my_dict[key] = value
    return my_dict

if __name__ == '__main__':
    # yaml_file1 and yaml_file2 must be set to the paths of the two files to compare
    a = yaml_as_dict(yaml_file1)
    b = yaml_as_dict(yaml_file2)
    ddiff = DeepDiff(a, b, ignore_order=True)
    print(ddiff)
answered Sep 2, 2020 at 10:09
Faisal
Try out the deepdiff package. I had a similar use case and found it very helpful.
answered Sep 2, 2020 at 9:11
Origin
Use PyYAML to convert to a flattened dict, then compare.
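A minimal sketch of the flatten-then-compare idea, for illustration only. The helper names `flatten` and `compare_flat` and the sample data are made up here, and the input dicts stand in for what `yaml.safe_load` would return. Nested mappings become dotted keys, lists are kept as leaf values, and set operations on the flattened keys give the missing and mismatched entries.

```python
def flatten(data, prefix=""):
    """Flatten nested dicts into {'a.b.c': value}; lists stay as leaf values."""
    flat = {}
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            # Descend into non-empty nested mappings
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def compare_flat(a, b):
    """Return (keys missing in b, keys missing in a, mismatched values)."""
    fa, fb = flatten(a), flatten(b)
    missing_in_b = sorted(fa.keys() - fb.keys())
    missing_in_a = sorted(fb.keys() - fa.keys())
    mismatched = {
        key: (fa[key], fb[key])
        for key in sorted(fa.keys() & fb.keys())
        if fa[key] != fb[key]
    }
    return missing_in_b, missing_in_a, mismatched


# Hypothetical sample data standing in for two parsed YAML files
first = {"name": "John", "address": {"city": "New York", "zip": "10001"}}
second = {"name": "Jane", "address": {"city": "Paris"}, "hobby": ["cricket"]}

missing_in_second, missing_in_first, mismatched = compare_flat(first, second)
print("missing in second:", missing_in_second)
print("missing in first:", missing_in_first)
print("mismatched:", mismatched)
```

Joining keys with a dot is an assumption that works only when the keys themselves contain no dots; a tuple path would avoid that ambiguity.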
answered Sep 2, 2020 at 9:10
Abhijit Sarkar
To load a YAML file as a dictionary you can use PyYAML:
import yaml
with open("example.yaml", 'r') as fp:
    d = yaml.safe_load(fp)
answered Sep 2, 2020 at 9:11
bp7070
Python is a powerful programming language widely used in many applications. One of its many practical applications is in working with YAML files.
In this article, we will learn how to use the yaml module (PyYAML), ruamel.yaml, and a few other approaches to compare two YAML files in Python and check whether they are equivalent.
What is a YAML file?
A YAML file is used for storing data in the YAML format. It is a human-readable data serialization format that helps store data in a structured way. YAML files are often used as configuration, data, and script files.
These files are easy to read and understand; they can be edited with any text editor and are often used in conjunction with other data files, such as JSON files.
They can store data of a variety of types, including:
- Strings
- Integers
- Floats
- Booleans
- Arrays
- Objects
Some of the characteristics of YAML files include:
- Human-readable
- Machine-readable
- Easy to edit
- Easy to parse
- Well suited for use in configuration files
- Well suited for use in localization files
Comparing YAML Files in Python
Let’s compare two YAML files using the different options available in Python. Consider two YAML files (named file1.yaml and file2.yaml) with the following data.
---
name: John
age: 25
city: New York
---
name: Jane
age: 24
city: Paris
hobby:
- cricket
- hockey
- football
Now let’s look at different approaches to comparing two YAML files in Python.
01.
Using difflib.unified_diff()
If we want to compare the two files line by line, we can use the difflib.unified_diff() function. It reports the lines that differ between the two YAML files in unified diff format.
import difflib
# open the two files to be compared and read their lines
with open('file1.yaml', 'r') as file1, open('file2.yaml', 'r') as file2:
    text1 = file1.readlines()
    text2 = file2.readlines()
# compare the two files using unified_diff()
for line in difflib.unified_diff(text1, text2, fromfile='file1.yaml', tofile='file2.yaml'):
    # lines from readlines() already end with a newline
    print(line, end='')
02.
Using difflib.ndiff()
If we also want to see which characters changed within a line, we can use the difflib.ndiff() function. It compares two lists of lines and adds '?' guide lines that mark the characters that differ.
import difflib
import yaml
data1 = '''
name: John
age: 25
city: New York
'''
data2 = '''
name: Jane
age: 24
city: Paris
'''
yaml1 = yaml.safe_load(data1)
yaml2 = yaml.safe_load(data2)
# ndiff() expects sequences of lines, so split the dumped text first
diff = difflib.ndiff(
    yaml.dump(yaml1, default_flow_style=False).splitlines(keepends=True),
    yaml.dump(yaml2, default_flow_style=False).splitlines(keepends=True),
)
print(''.join(diff))
03.
Using the yaml Python library
The code below reads the two files, parses them, and reports whether their contents are identical. It does not print the character or line differences between the two files.
import yaml
# load both files inside a with block so they are closed afterwards
with open("file1.yaml") as f1, open("file2.yaml") as f2:
    file1 = yaml.safe_load(f1)
    file2 = yaml.safe_load(f2)
if file1 == file2:
    print("The files are the same.")
else:
    print("The files are different.")
04.
Without using the yaml library
Here also, the code reads the YAML files and prints “files are the same” if they are equal; otherwise, “files are different.” No YAML library is used to read the files, so this is a plain text comparison: differences in whitespace, comments, or key order count as differences.
from itertools import zip_longest
with open('file1.yaml') as f1, open('file2.yaml') as f2:
    # zip_longest (unlike zip) also catches files with different numbers of lines
    for line1, line2 in zip_longest(f1, f2):
        if line1 != line2:
            print("files are different")
            break
    else:
        print("files are the same")
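The same kind of raw comparison can also be sketched with the standard library's `filecmp` module. This is an alternative illustration, not the article's method; the helper names `same_bytes` and `demo` and the temporary file contents are made up for the example. It shows that two files holding the same keys in a different order are reported as different by a byte-level check.

```python
import filecmp
import os
import tempfile


def same_bytes(path1, path2):
    """Return True if the two files have identical contents."""
    # shallow=False forces a content comparison instead of relying on os.stat() data
    return filecmp.cmp(path1, path2, shallow=False)


def demo():
    """Write two small sample files and compare them byte for byte."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "a.yaml")
        second = os.path.join(tmp, "b.yaml")
        with open(first, "w") as f:
            f.write("name: Jane\nage: 24\n")
        with open(second, "w") as f:
            # Same keys and values, different order
            f.write("age: 24\nname: Jane\n")
        return same_bytes(first, first), same_bytes(first, second)


print(demo())
```

If reordered keys should count as equal, parse the files first and compare the resulting dictionaries instead.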
05.
Using the ruamel.yaml library
Here we compare two YAML files using the ruamel.yaml Python library. The code does not report character or line differences between the two files; it only tells whether they are identical or different.
from ruamel.yaml import YAML

def compare_yaml_files(file1, file2):
    # Recent ruamel.yaml releases no longer support ruamel.yaml.load(..., Loader=...);
    # a YAML() instance defaults to round-trip mode, matching RoundTripLoader
    yaml = YAML()
    with open(file1) as f1, open(file2) as f2:
        data1 = yaml.load(f1)
        data2 = yaml.load(f2)
    if data1 != data2:
        print(file1, 'and', file2, 'are not identical.')
    else:
        print(file1, 'and', file2, 'are identical.')

if __name__ == '__main__':
    file1 = "file1.yaml"
    file2 = "file2.yaml"
    compare_yaml_files(file1, file2)
Now let’s consider two YAML files (file2.yaml and file2_copy.yaml) that contain the data below.
---
name: Jane
age: 24
city: Paris
hobby:
- cricket
- hockey
- football
---
name: Jane
age: 25
city: Paris
hobby:
- cricket
- hockey
- football
In file2.yaml we have age = 24, but in file2_copy.yaml we have age = 25. The rest of the data is the same. Now let’s see how we can spot this difference.
import yaml
file1 = 'file2.yaml'
file2 = 'file2_copy.yaml'
with open(file1) as f1, open(file2) as f2:
    data1 = yaml.safe_load(f1)
    data2 = yaml.safe_load(f2)
diff = {}
for key in data1:
    if key not in data2:
        diff[key] = data1[key]
    elif data1[key] != data2[key]:
        diff[key] = [data1[key], data2[key]]
for key in data2:
    if key not in data1:
        diff[key] = data2[key]
print(diff)
The above code will print the below output:
{'age': [24, 25]}
As you can see, it has successfully found the difference between the two files.
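One limitation of a single diff dictionary is that it cannot tell a key missing from the first file apart from a key missing from the second. A possible variation, sketched below with a hypothetical helper named `categorize`, keeps three separate buckets. The literal dictionaries stand in for the result of `yaml.safe_load` on the sample files, and only top-level keys are compared.

```python
def categorize(data1, data2):
    """Split top-level differences into missing and changed keys."""
    keys1, keys2 = set(data1), set(data2)
    return {
        "only_in_first": {k: data1[k] for k in keys1 - keys2},
        "only_in_second": {k: data2[k] for k in keys2 - keys1},
        "changed": {
            k: [data1[k], data2[k]]
            for k in keys1 & keys2
            if data1[k] != data2[k]
        },
    }


# Dictionaries equivalent to the parsed sample files from this article
file1 = {"name": "John", "age": 25, "city": "New York"}
file2 = {"name": "Jane", "age": 24, "city": "Paris",
         "hobby": ["cricket", "hockey", "football"]}
file2_copy = dict(file2, age=25)

print(categorize(file2, file2_copy))
print(categorize(file1, file2))
```

For nested mappings, the same function could be applied recursively to values that are dictionaries on both sides.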