Show added, modified, or deleted DVC-tracked files and directories between two commits in the DVC repository, or between a commit and the workspace.
usage: dvc diff [-h] [-q | -v]
[--show-json] [--show-hash] [--show-md]
[a_rev] [b_rev]
positional arguments:
a_rev Old Git commit to compare (defaults to HEAD)
b_rev New Git commit to compare (defaults to the current workspace)
Prints a list of files and directories added, modified, deleted in a Git commit
b_rev
as compared to another Git commit a_rev
. Both a_rev
and b_rev
accept any Git revision - branch or tag
name, Git commit hash, etc.
It defaults to comparing the current workspace and the last commit (HEAD
), if
arguments a_rev
and b_rev
are not specified.
Options --show-json
and --show-hash
can be used to modify format of the
output of this command. See the Options and Examples
sections below for more details.
dvc diff
does not have an effect when the repository is not tracked by Git,
for example when dvc init
was used with the --no-scm
option.
Note that current
dvc diff
implementation does not show the line-to-line comparison among the files in each revision, likegit diff
or GNUdiff
can. This is because the data data tracked by DVC can come in many possible formats e.g. structured text, or binary blobs, etc. For an example on how to create line-to-line text file comparison, refer to this comment.
--show-json
- prints the command's output in easily parsable JSON format,
instead of a human-readable table.--show-md
- prints the command's output in Markdown table format.--show-hash
- print file and directory hash values along with their path.
Useful for debug purposes.--hide-missing
- do not list data missing from both workspace and cache
(not in cache
). Only list files and directories which have been expliclity
added, modified, or deleted. This option does nothing when comparing two Git
commits.-h
, --help
- prints the usage/help message, and exit.-q
, --quiet
- do not write anything to standard output. Exit with 0 if no
problems arise, otherwise 1.-v
, --verbose
- displays detailed tracing information.For these examples we can use the Get Started project.
The minimal dvc diff
, run without arguments, defaults to comparing DVC-tracked
files between HEAD
(last Git commit) and the current workspace
(uncommitted changes, if any):
$ dvc diff
To see the difference between the very previous commit of the project and the
workspace, we can use HEAD^
as a_rev
:
$ dvc diff HEAD^
Added:
data/data.xml
files summary: 1 added, 0 deleted, 0 modified
$ dvc diff baseline-experiment bigrams-experiment
Modified:
data/features/
data/features/test.pkl
data/features/train.pkl
model.pkl
prc.json
scores.json
files summary: 0 added, 0 deleted, 5 modified
The output from this command confirms that there's a difference in 5 files
between the tags baseline-experiment
and bigrams-experiment
.
Let's use the same command as above, but with JSON output and including hash values:
$ dvc diff --show-json --show-hash \
baseline-experiment bigrams-experiment
It outputs:
{
"added": [],
"deleted": [],
"modified": [
{
"path": "data/features/",
"hash": {
"old": "3338d2c21bdb521cda0ba4add89e1cb0.dir",
"new": "42c7025fc0edeb174069280d17add2d4.dir"
}
},
...
{
"path": "model.pkl",
"hash": {
"old": "43630cce66a2432dcecddc9dd006d0a7",
"new": "662eb7f64216d9c2c1088d0a5e2c6951"
}
}
...
]
}