Empirical laws for how long it takes to review and release a pull request, depending on its size. We inferred them from 100,000 pull requests to commercial closed-source repositories belonging to the clients of our software development analytics SaaS.

I like to read posts that describe software development best practices. It’s fun to compare your own experience with other people’s. I admit that I don’t always follow them, since so many authors like to position their findings as flawless and beyond dispute. My 15 years of software engineering have taught me that everything is a compromise, and every decision depends on the context. And my context is often different from the one behind yet another FAANG insight. Still, there is one specific best practice that is hard to argue with.

Good developers don’t create big PRs.

Developers should review no more than 200 to…


My experience of deploying Google’s Kubernetes ML toolkit on physical servers with multiple GPUs

Attack on Kubeflow. Image by Anastasia Markovtseva, CC-BY-SA 4.0.

Hardware

I’ve got 3 standard Supermicro towers with 256GB RAM, an SSD, 5 HDDs, and 4 GPUs each. Ethernet connects them to the “controller” Dell server, which has access to the internet and is supposed to gate SSH connections to the cluster. I name the towers after the home cities of the team’s members; I find that scheme more interesting than assigning random words (“aardvark”, “intrepid”), indexed prefixes (“data-science1”, “data-science2”), or Greek alphabet letters (“alpha”, “beta”), which I’ve seen too many times everywhere I used to work. …


Hands-on Tutorials

Data-driven algorithm design using Python and linear programming on a billion Git commit signatures and more.

I’ve recently had to solve an interesting problem: given two unordered lists of real people’s names, match the identities between them.

Two lists with matched people’s names. Image by Author.

Looks easy, right? Sort both lists, and you are done. Alas, my problem is about two independent lists of people’s names taken from separate information sources: GitHub and JIRA users belonging to the same organization, to be precise. I have to deal with the following complications:

  • No perfect match exists. There are names in the first list belonging to people that are not present in the second list, and vice versa.
  • The lists have different lengths.
  • The…
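
To make the first two complications concrete, here is a minimal sketch of a matcher that tolerates missing counterparts and lists of different lengths. It is not the linear programming approach of the post: it swaps in a simple greedy pairing over string similarity from Python’s standard `difflib` module. The function names and the `threshold` value of 0.6 are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_names(first, second, threshold=0.6):
    """Greedily pair names from two lists by descending similarity.

    Hypothetical helper: a name stays unmatched when no free candidate
    scores at least `threshold`, so the lists may differ in length.
    """
    # Score every cross-list pair, then visit the most similar pairs first.
    scored = sorted(
        (
            (similarity(a, b), i, j)
            for i, a in enumerate(first)
            for j, b in enumerate(second)
        ),
        key=lambda item: (-item[0], item[1], item[2]),
    )
    used_first, used_second, pairs = set(), set(), []
    for score, i, j in scored:
        if score < threshold:
            break  # all remaining pairs are even less similar
        if i in used_first or j in used_second:
            continue  # one of the two names is already paired
        used_first.add(i)
        used_second.add(j)
        pairs.append((first[i], second[j]))
    return pairs


if __name__ == "__main__":
    github = ["Alice Johnson", "Bob Smith", "Carol White"]
    jira = ["bob smith", "Alice Jonson"]
    for left, right in match_names(github, jira):
        print(f"{left} <-> {right}")
```

A greedy pass like this can settle on a locally good pair that blocks a better overall assignment, which is one reason to prefer a global optimization formulation once the lists grow large.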

Google’s Coral project has recently graduated from beta. According to the benchmarks, Coral devices provide excellent neural network inference acceleration for DIY makers. Those devices are built on the specialized Tensor Processing Unit ASIC (Edge TPU), which proved to be somewhat tricky to work with, but working within its enforced limitations and quirks is rewarding. I was eager to explore the deep internals of the interoperation between TensorFlow and Edge TPU, and to hack both to do cool, nonstandard, crazy things.

The following assumes that you are familiar with TensorFlow and Edge TPU basics. The official documentation is good, so looking through…

Vadim Markovtsev

Machine learning and software engineer. Teams manager. Public speaker. Google Developer Expert in Machine Learning.
