Every month we share news, findings, interesting reads, community takeaways, and everything else along the way.
Look here for updates about DVC, our journey as a startup, projects by our users and big ideas about best practices in ML and data science.
A view from Barrancas del Cobre, shot by Jorge Orpinel Pérez. Jorge has mastered the art of working on DVC remotely.
Welcome to the April Heartbeat, our monthly roundup of cool happenings, good reads and other bright spots in our community.
Adapting to the pandemic. Although the world seems different than when we posted last month, the DVC community is steady and strong. As a predominantly distributed company, we've been developing our infrastructure for remote work from the get-go. It isn't always easy to schedule an all-hands meeting across 9 time zones but we make it work. This experience has prepared us well for the COVID-19 pandemic: although there are new challenges (like caring for families while working from home) we've been able to weather the transition to fully remote work relatively well.
Before social distancing started, DVC technical writer Jorge Orpinel Pérez worked from a canoe. Check out more photos from his workations on Instagram.
DVC sponsors DivOps. In a time when many conferences are going remote out of necessity, we were fortunate to be part of an intentionally remote conference this month! We sponsored DivOps, a fully online meeting led by women in DevOps. The DivOps lineup included speakers from GitHub, Dropbox, Gremlin and more. DVC data scientist Elle (that's me!) gave a ten-minute talk about MLOps and CI/CD, so please check out the video. Another very relevant talk was from Anna Petrovicheva, CEO of Xperience AI; Anna spoke about her team's development workflow for deep learning projects and gave a clear overview of how they use DVC.
DVC on the airwaves. In early March, Elle was interviewed on an episode of The Data Stream podcast about a DVC data science project, building a public dataset of posts from the "Am I the Asshole?" subreddit.
This month, DVC has released some new features and updates:
metrics diff functionality, which lets you compare metrics from different commits side by side (check out the docs to learn more).
DVC and R working together. One of our favorite blogs this month came from Marcel Ribeiro-Dantas, a developer and PhD student at the Institut Curie. Marcel wrote about using DVC to manage projects in R, particularly defining and versioning pipelines of data processing and analysis that can be reproduced easily. While DVC is language-agnostic, much of our user content has been Python-centric, so it's exciting to see a detailed post for the R-using data scientist (for more about R with DVC, see Marija Ilić's post)!
Also, Marcel recently gave an interview on The Data Hackers Podcast, a Portuguese-language show. Listen for a shout-out about DVC!
DVC is in another book! Last month we reported that DVC is part of a Packt book, "Learn Python by Building Data Science Applications". This month, DVC got a mention in a just-released O'Reilly book, "Building Machine Learning Pipelines" by Hannes Hapke and Catherine Nelson.
Some more links we like. Here are a few other discussions that have caught our attention.
MLOps can be fun. Jeroen France's blog, "MLOps: Not as boring as it sounds!", reads like a "coming of age" story about embracing engineering as a data scientist. It's part motivational, part tutorial, and definitely worth a read. Here's a sample:
No-one wants to baby-sit, maintain, and troubleshoot their own models once they are in production. Every data scientist secretly hopes they can pawn that job off to an engineering team, or maybe an intern, right? Well, in fact MLOps is going to make your data science life a lot better.
From Jeroen France's blog.
Thanks for reading. As always, let us know what you're making with DVC and what links are catching your interest in the blog comments, on Twitter, and in our Discord channel. Be safe and be in touch!