The thing about becoming a PG contributor is that the barrier to entry is fairly...

hlinnaka · on Oct 10, 2023

PostgreSQL isn't that special as a codebase. Every codebase has its quirks, every project has its own processes and there's a learning curve. When you switch to a new job as a software engineer, you pick it up. PostgreSQL is no different: you can hire an engineer to work on PostgreSQL.

I'm not sure how well that path works in growing new contributors, though. In a usual company setting, the goals are better defined, and the company is in control. Once you reach the goals, mission accomplished. With an open source project it's more nebulous. Others might have different criteria and different priorities. You are not in control. Choosing the right problems to work on is important.

Other storage or database projects would be a good source of new contributors. If you have worked on another DBMS, you're already familiar with the domain, and the usual techniques and tradeoffs. But to stick around, you need some internal desire to contribute, not just achieve some specific goals.

harikb · on Oct 10, 2023

The biggest hurdle I see is that it is a C project, unfortunately something we can do nothing about. It is so much harder to trust that a random piece of code won't have serious implications for the database. It will take ages for someone to get comfortable with the pg-code-base way of handling errors, basic string manipulation, memory alloc/free, etc.

I want to highlight the difference between "making a non-core contribution" and "understanding database internals". It is not the latter but the former that is the first hurdle.

I wanted to reuse builtin pg code to parse the printed statements from logs - I ended up writing a parser (in a non-C language) myself which was faster.

gavinray · on Oct 10, 2023

Couple of points in this post, so will address a few of them:

  "(Paraphrased) C is bad, and it takes forever to pick up the PG-specific C idioms"

There's probably not a productive conversation to be had about C as a language. I will say that as of C23, the language is not quite as barebones as it used to be and incorporates a lot of modern improvements.

On the topic of PG-specific C -- there are a handful of replacements for common operations that you use in PG. Things like "palloc/pfree", and the built-in macros for error and warning logging, etc.

I genuinely don't think it would take a motivated party more than a day or two to pick all of these up -- there aren't that many of them and they tend to map to things you're already used to.
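To make the "maps to things you're already used to" point concrete, here is a toy sketch of the idea behind palloc/pfree: allocations belong to a memory context, and resetting the context releases everything in it at once, so individual frees become optional. This is not PostgreSQL's implementation. `ToyContext`, `toy_alloc`, `toy_strdup`, `toy_reset` and `toy_demo` are hypothetical names made up for illustration, layered on plain malloc/free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types for illustration; NOT PostgreSQL's MemoryContext. */
typedef union ToyChunk {
    union ToyChunk *next;   /* link to the previously allocated chunk */
    max_align_t align;      /* pads the header so the payload stays aligned */
} ToyChunk;

typedef struct ToyContext {
    ToyChunk *chunks;       /* every chunk handed out by this context */
    size_t live;            /* number of chunks not yet released */
} ToyContext;

/* Rough analogue of palloc: allocate and remember the chunk in the context. */
static void *toy_alloc(ToyContext *cxt, size_t size)
{
    assert(cxt != NULL);
    ToyChunk *chunk = malloc(sizeof(ToyChunk) + size);
    if (chunk == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    chunk->next = cxt->chunks;
    cxt->chunks = chunk;
    cxt->live++;
    return chunk + 1;       /* payload starts right after the header */
}

/* Rough analogue of pstrdup: copy a string into context-owned memory. */
static char *toy_strdup(ToyContext *cxt, const char *src)
{
    size_t len = strlen(src) + 1;
    char *copy = toy_alloc(cxt, len);
    memcpy(copy, src, len);
    return copy;
}

/* Rough analogue of resetting a context: free every chunk in one go. */
static size_t toy_reset(ToyContext *cxt)
{
    size_t freed = 0;
    while (cxt->chunks != NULL) {
        ToyChunk *next = cxt->chunks->next;
        free(cxt->chunks);
        cxt->chunks = next;
        freed++;
    }
    cxt->live = 0;
    return freed;
}

/* Allocate two strings, print them, then release both with one reset. */
static size_t toy_demo(void)
{
    ToyContext cxt = {NULL, 0};
    char *a = toy_strdup(&cxt, "SELECT 1");
    char *b = toy_strdup(&cxt, "SELECT 2");
    printf("%s; %s; live chunks: %zu\n", a, b, cxt.live);
    return toy_reset(&cxt);
}
```

Calling `toy_demo()` from a `main` function returns the number of chunks released by the single reset. The real thing differs in many ways (PG allocates from a current context rather than one passed in explicitly, for instance), but the mental model of "allocate freely, clean up per context" is the part that carries over.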

  "I wanted to reuse builtin pg code to parse the printed statements from logs - I ended up writing a parser (in a non-C language) myself which was faster."

It's true that the core PG code isn't written in a modular way that's friendly to piecemeal integration into other projects (outside of libpq).

For THIS PARTICULAR case, the pganalyze team has actually extracted the PG parser so you can include it in your own projects:

https://github.com/pganalyze/libpg_query

zxexz · on Oct 10, 2023

libpg_query is a godsend of a library. I spent a lot of time writing a custom parser before I found it - was very happy to replace the whole thing. A major boon was the fingerprinting ability - one of my needs was to track query versions in metadata.

craigkerstiens · on Oct 10, 2023

I disagree on this. Yes it's C. But I've heard people comment "I don't like writing C, but I don't mind Postgres C".

The bigger hurdle which Peter mentioned in another thread is simply building up enough expertise with the system and having the right level of domain expertise.

stouset · on Oct 10, 2023

> Yes it's C. But I've heard people comment "I don't like writing C, but I don't mind Postgres C".

While "Postgres C" might be wonderful, in practice learning the project's unique idioms is yet another hurdle for newcomers to overcome.

eatonphil · on Oct 10, 2023

Every project has unique idioms. Let alone ones that are 30+ years old.

Idioms are a baked in cost of learning to contribute to any project.

fanf2 · on Oct 10, 2023

I found that I learned a lot when trying to write a logical decoding plugin. So I guess if you are a user of Postgres and there’s some small friction you could reduce by writing a plugin, it’s a good way to get started. Scratch your own itch, you don’t have to publish the results :-)