
The problem with canonicalizing and then diffing is that changing dict keys breaks the diff: since “bar” was changed to “zab,” the canonical representation changed, and the traditional diff algorithm considered them separate edits (lines 2 and 15 of the diff). In contrast, here is Graphtage’s output for the same pair of files:

In general, optimally mapping one graph to another cannot be executed in polynomial time, and is therefore not tractable for graphs of any useful size (unless P=NP). This is true even for restricted classes of graphs like DAGs. However, trees and forests are special cases that can be mapped in polynomial time, with reasonable constraints on the types of edits possible.

Graphtage’s diffing algorithms operate on an intermediate representation rather than on the data structures of the original file format. This allows Graphtage to have generic comparison algorithms that can work on any input file type. Therefore, to add support for a new file type, all one needs to do is “lift” it to the intermediate representation.
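The idea of lifting different file formats into one intermediate representation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Graphtage’s actual API: the node classes (`Leaf`, `ListNode`, `DictNode`), the functions `lift`, `lift_json`, and `lift_properties`, and the `key=value` text format are all hypothetical names introduced here.

```python
import json
from dataclasses import dataclass
from typing import Any, FrozenSet, Tuple, Union


# Hypothetical intermediate representation (not Graphtage's real classes).
@dataclass(frozen=True)
class Leaf:
    value: Union[str, int, float, bool, None]


@dataclass(frozen=True)
class ListNode:
    items: Tuple["Node", ...]  # ordered: position is significant


@dataclass(frozen=True)
class DictNode:
    entries: FrozenSet[Tuple[str, "Node"]]  # unordered: position is ignored


Node = Union[Leaf, ListNode, DictNode]


def lift(obj: Any) -> Node:
    """Lift plain Python data (dicts, lists, scalars) into the tree."""
    if isinstance(obj, dict):
        return DictNode(frozenset((str(k), lift(v)) for k, v in obj.items()))
    if isinstance(obj, (list, tuple)):
        return ListNode(tuple(lift(v) for v in obj))
    return Leaf(obj)


def lift_json(text: str) -> Node:
    """Lift a JSON document."""
    return lift(json.loads(text))


def lift_properties(text: str) -> Node:
    """Lift a hypothetical flat 'key=value' line format into the same tree."""
    pairs = (line.split("=", 1) for line in text.splitlines() if line.strip())
    return DictNode(frozenset((k.strip(), Leaf(v.strip())) for k, v in pairs))


if __name__ == "__main__":
    a = lift_json('{"name": "graphtage", "lang": "python"}')
    b = lift_properties("lang=python\nname=graphtage")
    print(a == b)
```

Once both inputs are lifted, any comparison written against the node types works regardless of where the data came from, which is why adding a format only requires writing another `lift_*` function.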

How are existing diff tools insufficient?

Ordered nodes in the tree (e.g., JSON lists) and, in particular, mappings (e.g., JSON dicts) are challenging. Most extant diffing algorithms and utilities assume that the structures are ordered. Take this JSON as an example:

Existing tools effectively canonicalize the JSON (e.g., sort dictionary elements by key and format lists with one item per line), and then perform a traditional diff. We don’t need no fancy tools for that! Here’s effectively what they do:

$ cat original.json | jq -M --sort-keys >
$ cat modified.json | jq -M --sort-keys >

That result is not very useful, particularly if the input files are large.
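As a rough illustration of the canonicalize-then-diff approach, the following Python sketch re-serializes two JSON documents with sorted keys and one item per line, then runs a line-oriented diff using only the standard library. The two documents are hypothetical stand-ins whose only change is a key renamed from "bar" to "zab"; `canonicalize` and `canonical_diff` are names introduced here and are not part of jq or Graphtage.

```python
import difflib
import json
from typing import List

# Hypothetical inputs: the only change is that the key "bar" became "zab".
ORIGINAL = '{"foo": [1, 2, 3], "bar": "testing"}'
MODIFIED = '{"foo": [1, 2, 3], "zab": "testing"}'


def canonicalize(text: str) -> List[str]:
    """Parse JSON and re-serialize it with sorted keys, one item per line."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2).splitlines()


def canonical_diff(a: str, b: str) -> List[str]:
    """Line-oriented unified diff of the two canonical forms."""
    return list(
        difflib.unified_diff(
            canonicalize(a),
            canonicalize(b),
            fromfile="original",
            tofile="modified",
            lineterm="",
        )
    )


if __name__ == "__main__":
    print("\n".join(canonical_diff(ORIGINAL, MODIFIED)))
```

Although only one key was renamed, the line diff reports the removal of "bar" and the addition of "zab" as unrelated edits, because sorting moved the entry to the other side of "foo". It also reports a spurious change to the line that closes the list, since the trailing comma moved.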
With Graphtage, you can even compare files that are in two different formats! And when paired with our PolyFile tool, you can semantically diff arbitrary file formats.

Tree-like file formats are becoming increasingly common as a means for transmitting and storing data. If you’ve ever wrangled a gnarly REST API, disentangled the output of a template-generated webpage, or confabulated a config file (and subsequently needed to figure out which specific change was the one that made things work), you’ve probably fought with, and been disappointed by, the current state of open-source semantic diffing tools.

To install the utility, run:

pip3 install graphtage

Graphtage is a command line utility and underlying library for semantically comparing and merging tree-like structures such as JSON, JSON5, XML, HTML, YAML, and TOML files. Its name is a portmanteau of “graph” and “graftage” (i.e., the horticultural practice of joining two trees together so they grow as one).

What Graphtage does differently and better

Graphtage lets you see what’s different between two files quickly and easily, but it isn’t a standard line-oriented comparison tool like diff. Graphtage is semantically aware, which allows it to map differences across unordered structures like JSON dicts and XML element tags.
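To make the difference between a line-oriented and a semantic view of an unordered structure concrete, here is a deliberately naive Python sketch that reports a renamed dict key as a single edit. It is a greedy heuristic written for illustration only and is not Graphtage’s matching algorithm; the function name `match_renamed_keys` and the sample dicts are assumptions introduced here.

```python
from typing import Any, Dict, List, Tuple


def match_renamed_keys(old: Dict[str, Any], new: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Pair each key that disappeared with a key that appeared and
    carries an equal value (naive greedy heuristic, illustration only)."""
    removed = [k for k in old if k not in new]
    added = [k for k in new if k not in old]
    renames = []
    for r in removed:
        for a in added:
            if old[r] == new[a]:
                renames.append((r, a))
                added.remove(a)  # each new key can be matched only once
                break
    return renames


if __name__ == "__main__":
    original = {"foo": [1, 2, 3], "bar": "testing"}
    modified = {"foo": [1, 2, 3], "zab": "testing"}
    print(match_renamed_keys(original, modified))
```

Because the comparison works on keys and values rather than on lines of text, the position of an entry within the dict has no effect on the result.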
