One on One with Davin Potts: 4. How You Can Help With Your Favorite Open Source Project Without Being an Expert

Posted March 7, 2019 by Paige Roberts, Vertica Open Source Relations Manager

Vertica One on One with Davin Potts, CEO Appliomics, Founder KNIME, Core Python Committer
At the recent Data Day Texas event, I sat down with Davin Potts and had a long conversation about a wide variety of subjects. I divided the conversation into multiple chunks by subject, and have been posting them one chunk at a time. In the first post, we discussed the wide variety of programming languages and tools in use for data science projects right now, and how he became a core Python committer. In the second post, we discussed the advantages of KNIME for a data science consultant like Potts, and the advantages of using SQL in a database to do data manipulation and analysis. In the third post, we dove into a cool new feature coming in the next version of Python.

In this post, Potts gives a few tips on how anyone who uses open source projects like Python can contribute in an important way without being an expert.



Davin Potts: KNIME is open source, Python is open source – there are a lot of open source projects. I think a lot of people use open source, and harbor a hidden desire to contribute back in some way, but they’re hesitant because they think, “I don’t know enough,” or “I’m not good enough,” or “I need to learn a lot more,” or “I’ll do that this summer,” or something like that. It’s one of those New Year’s resolution type things that never gets met.

Paige Roberts: Also, if it’s not part of your day job, it’s hard to convince yourself that you should work on code you’re not going to get paid for.

Potts: Exactly. And so, if as part of your day job, you use any kind of open source, even if you’re not a developer, I think it’s generally true that if you’re using some sort of an open source tool and something doesn’t behave the way you wanted it to, it didn’t do what you thought it should, it’s incredibly helpful …

Roberts: Report that.

Potts: Because you need that to work. It’s part of your job. Report, “Hey, I had this problem.”

Roberts: Nobody can fix it if they don’t know there’s a bug.

Potts: Right. And while you’re there, because it doesn’t take much to add an entry to say, “Hey, I did the following.” The hardest part is trying to explain it coherently to another human being who wasn’t sitting there watching you do that.

Roberts: So they can reproduce it.

Potts: If you can describe it enough. Thankfully, there are a lot of people who do that. But there are a lot of people who don’t coherently describe, …

Roberts: It’s a lot harder than you would think to give reproducible directions.

Potts: Writing articles about things, writing up documentation for things, it’s a lot harder than people give credit for. But another very impactful and helpful thing is while you’re there, if you’re going to ever take the time to add an entry to say, “Hey, I did the following, and it turned out to be red. I thought it should’ve been blue.” When you add that entry, look around at some of the other things on the issue tracker. What you’ll find is other entries that don’t clearly describe what happened.

If you add an entry to those saying, “I don’t understand what you were trying to explain,” you’re already taking a huge load off of other people, because for any open source project, there are people who actually work on the source code itself, and they can’t keep up with all of the issues that get opened up. If you help review, and provide even the most cursory feedback of “I know what you’re talking about. I had the same problem the other week. It only happens when the moon is half full.” …

Roberts: If you stand on one leg and rotate clockwise… Yeah. [laughing]

Potts: That is so terribly helpful to be able to say, “Well, I just did that on my system and it worked for me.” You’ve just saved other people who have extremely limited time a huge amount of effort to provide that feedback, and everybody can potentially do that, and be motivated to do it as part of their day job.

Roberts: And you don’t have to know a whole lot about the tech to do that. All you have to be is a user.

Potts: Right. And that’s super meaningful. The number of issues that get opened up against code that I’m supposed to be responsible for, I can’t keep up with that. I don’t have enough time in the day. And even if they double my salary for working on that, …

Roberts: Double nothing is still nothing.

Potts: [laughing] Yeah. So, I can’t possibly keep up with it even if that was the only thing that I did. And so having other people willing to spend even just an occasional bit of time … If I see an issue that’s been opened up by an individual and no one else has commented on it, versus another issue that’s been opened up where at least one other person has commented on it, and it’s a different person, guess which one I’m going to pay attention to first?

Roberts: Especially if the comment is, “Yeah, I had that problem, too. And it does it when you do x, y, z every time.”

Potts: It’s supposed to be a community, in the sense that if it’s open source, there’s an invitation to others to participate. It doesn’t mean that they have to. They’re not under an obligation. They should never feel like it’s an obligation. But if they want to make a contribution back and it can be justified as part of the work that they do as part of their day job, everybody wins, including your employer.

Roberts: Because you get better code to work with.

Potts: Yes. In terms of Python, I have some focused effort on making sure that the shared memory code makes it into the release, that it jumps through the hurdles.

Roberts: It needs to get properly tested and go through those cycles.

Potts: Exactly. And so that’s pulling quite a bit of attention there. If there are other people who are interested and excited by the idea of shared memory, who have a use case for it, double fantastic. I would love to talk to those people.

Roberts: Do you want me to put contact information for you in the blog post?

Potts: Sure, if it’s Python related, my e-mail is just Davin@python.org.

Roberts: Oh. Nice.

Potts: I don’t get paid, but there are a few fringe benefits. That vanity email is one.

Roberts: That is a nice one.



Don’t miss the earlier parts of this discussion with Davin Potts. In the first post, we discussed the wide variety of programming languages and tools in use for data science projects right now, and how he became a core Python committer. In the second post, we discussed the advantages of KNIME for a data science consultant like Potts, and the advantages of using SQL to do data manipulation and analysis. In the third post, Potts shared some exciting news about the upcoming Python 3.8 release.

And don’t miss the next post, where we’ll talk about the new Uber-created open source Python-Vertica interface.

Learn more about the open source Python-Vertica interface.
Learn more about how you can help with the Python 3.8 test cycle.
Get a copy of Python 3.8 alpha 2 and test.
Read related post: Open Source Software is Free, Like a Puppy