Like Git, DVC allows for a distributed environment and collaboration. We make it
easy to consistently get all your data files and directories onto any machine,
along with matching source code. All you need to do is set up remote storage
for your DVC project and push the data there, so others can reach it. Currently DVC
supports Amazon S3, Microsoft Azure Blob Storage, Google Drive, Google Cloud
Storage, SSH, HDFS, and other remote locations. The list is constantly growing.
(For a complete list and configuration instructions, refer to dvc remote add.)
As an example, let's take a look at how you could set up S3 remote storage for a DVC project, and push/pull to/from it.
If you don't already have a bucket available in your S3 account, follow the
instructions in Create a Bucket. As an advanced alternative, you may use the
aws s3 mb command instead.
To actually configure an S3 remote in the project, supply the URL to the bucket
where the data should be stored to the dvc remote add command. For example:
$ dvc remote add -d myremote s3://mybucket/path
Setting 'myremote' as a default remote.
The -d (--default) option sets myremote as the default remote storage for this
project.
This will add myremote to your .dvc/config. The config file now has a remote
section for it:
['remote "myremote"']
url = s3://mybucket/path
[core]
remote = myremote
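To double-check which remote a project treats as its default, the [core] section can be read straight from the config file. The sketch below is only an illustration: it recreates the config shown above in a scratch directory (the temporary path and the variable names are assumptions, not part of DVC) and extracts the default remote name with awk.

```shell
# Recreate the example config in a scratch directory (illustration only).
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/.dvc"
cat > "$tmpdir/.dvc/config" <<'EOF'
['remote "myremote"']
url = s3://mybucket/path
[core]
remote = myremote
EOF

# Track whether we are inside [core]; print the value of its "remote" key.
default_remote=$(awk '
    /^\[core\]/ { in_core = 1; next }   # entering the [core] section
    /^\[/       { in_core = 0 }         # any other section header ends it
    in_core && $1 == "remote" { print $3 }
' "$tmpdir/.dvc/config")

echo "$default_remote"

# Remove the scratch directory.
rm -rf "$tmpdir"
```

In a real project, point the awk command at .dvc/config in the project root instead of the scratch copy.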
dvc remote modify provides a wide variety of options to configure S3 buckets.
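As a rough sketch of what such configuration can look like, the commands below set a region and a credentials profile on the remote. The region value us-east-2 and the profile name myprofile are placeholders chosen for illustration; consult dvc remote modify for the options your DVC version supports. The block works in a throwaway directory and skips itself if dvc is not installed, so it does not touch a real project.

```shell
# Scratch directory so that no real project is modified.
workdir=$(mktemp -d)

if command -v dvc >/dev/null 2>&1; then
    (
        cd "$workdir" || exit 1

        # --no-scm initializes DVC without a Git repository (scratch use only).
        dvc init --no-scm --quiet
        dvc remote add -d myremote s3://mybucket/path

        # Placeholder values: use your bucket's region and your own AWS profile.
        dvc remote modify myremote region us-east-2
        dvc remote modify myremote profile myprofile

        # Show the resulting remote section.
        cat .dvc/config
    )
else
    echo "dvc not found; skipping the example" >&2
fi
```

In an existing project only the two dvc remote modify lines are needed. Since they change .dvc/config, the file should be committed with Git afterwards.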
Let's commit your changes and push your code:
$ git add .dvc/config
$ git commit -m "Configure S3 remote storage"
$ git push
After you add data to the project with dvc run or other commands, it is stored
in your local cache. Upload it to remote storage with the dvc push command:
$ dvc push
Code and DVC-files can be safely committed and pushed with Git.
Please use regular Git commands to download code and DVC-files from your Git servers. For example:
$ git clone https://github.com/example/project.git
$ cd project
or
$ git pull
To download data files for your project, run:
$ dvc pull
dvc pull will download the missing data files from the default remote storage
configured in the .dvc/config file.
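The two download steps are often run together. The helper below is a minimal sketch of that habit; the function name sync_project is hypothetical and not part of Git or DVC. It is only defined here, not called, since running it needs a cloned project with a configured remote.

```shell
# sync_project: hypothetical helper that updates code first, then data.
# Assumes it is called from inside a Git repository that is also a DVC project.
sync_project() {
    # Fetch the latest code and DVC-files from the Git server;
    # give up early if that fails, so data is not pulled for stale DVC-files.
    git pull || return 1

    # Download the data files that the updated DVC-files refer to.
    dvc pull
}
```

Calling sync_project from the project root then replaces running git pull and dvc pull by hand. To fetch from a remote other than the default, dvc pull accepts the -r option with a remote name, for example dvc pull -r myremote.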