Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The thing about becoming a PG contributor is that the barrier to entry is fairly high.

I love Postgres so much I have a PG tattoo, but from the perspective of the two ways you can contribute:

- As a random user, in your free time: There's not a ton of "Good first issue" type tickets. Where you can ease your way into PG dev by working on something that doesn't require you to have context on many parts of the PG architecture and at least a little historical knowledge on why things are written the way they are. Also, it can be a bit intimidating to have your patches reviewed by the likes of Tom or Andres.

- As a developer for a paid PG company like EDB/PG Pros/Crunchy etc: It's a sort of Catch-22 scenario here, where it's difficult to get hired as a junior without having previous PG hacking experience, but the path to doing that is not the easiest thing in the world.

If I was going to work somewhere that wasn't $CURRENT_CO, it'd be somewhere doing PG work, but there's not a lot of viable avenues/inroads there.



PostgreSQL isn't that special as a codebase. Every codebase has its quirks, every project has its own processes and there's a learning curve. When you switch to a new job as a software engineer, you pick it up. PostgreSQL is no different: you can hire an engineer to work on PostgreSQL.

I'm not sure how well that path works in growing new contributors, though. In a usual company setting, the goals are better defined, and the company is in control. Once you reach the goals, mission accomplished. With an open source project it's more nebulous. Others might have different criteria and different priorities. You are not in control. Choosing the right problems to work on is important.

Other storage or database projects would be a good source of new contributors. If you have worked on another DBMS, you're already familiar with the domain, and the usual techniques and tradeoffs. But to stick around, you need some internal desire to contribute, not just achieve some specific goals.


The biggest hurdle I see is that it is a C project, unfortunately something we can do nothing about. It is so much harder to trust a random code not have to have serious implications for the database. It will take ages for someone to get comfortable with the pg-code-base way of handling errors, basic string manipulation, memory alloc/free etc.

I want to highlight the difference in "making a non-core contribution" to "understanding database internals". I am highlighting it is not the latter, but the former that is the first hurdle.

I wanted to reuse builtin pg code to parse the printed statements from logs - I ended up writing a parser (in a non-C language) myself which was faster.


Couple of points in this post, so will address a few of them:

  "(Paraphrased) C is bad, and it takes forever to pick up the PG-specific C idioms"
There's probably not a productive conversation to be had about C as a language. I will say that as of C23, the language is not quite as barebones as it used to be and incorporates a lot of modern improvements.

On the topic of PG-specific C -- there are a handful of replacements for common operations that you use in PG. Things like "palloc/pfree", and the built-in macros for error and warning logging, etc.

I genuinely don't think it would take a motivated party more than a day or two to pick all of these up -- there aren't that many of them and they tend to map to things you're already used to.

  "I wanted to reuse builtin pg code to parse the printed statements from logs - I ended up writing a parser (in a non-C language) myself which was faster."
It's true that the core PG code isn't written in a modular way that's friendly to integration piecemeal in other projects (outside of libpq).

For THIS PARTICULAR case, the pganalyze team has actually extracted out the parser of PG for including in your own projects:

https://github.com/pganalyze/libpg_query


libpg_query is a godsend of a library. I spent a lot of time writing a custom parser before I found it - was very happy to replace the whole thing. A major boon was the fingerprinting ability - one of my needs was to track query versions in metadata.


I disagree on this. Yes it's C. But I've heard people comment "I don't like writing C, but I don't mind Postgres C".

The bigger hurdle which Peter mentioned in another thread is simply building up enough expertise with the system and having the right level of domain expertise.


> Yes it's C. But I've heard people comment "I don't like writing C, but I don't mind Postgres C".

While "Postgres C" might be wonderful, in practice learning the project's unique idioms is yet another hurdle for newcomers to overcome.


Every project has unique idioms. Let alone ones that are 30+ years old.

Idioms are a baked in cost of learning to contribute to any project.


I found that I learned a lot when trying to write a logical decoding plugin. So I guess if you are a user of Postgres and there’s some small friction you could reduce by writing a plugin, it’s a good way to get started. Scratch your own itch, you don’t have to publish the results :-)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: