You might have already read the news in the Citus blog post by Sumedh Pathak: PostgreSQL Expert Dimitri Fontaine joins Citus Data. I am very happy to join a talented team here at Citus, and excited to work on an Open Source solution for distributed SQL on top of PostgreSQL! In this article I’m going to cover my first technical contributions to the Citus database, as it happens that a few patches of mine have already made it to the main source tree.
TL;DR It’s good to be working on PostgreSQL-related source code again, and to have the opportunity to solve PostgreSQL-related problems at scale!
Contributing to an Open Source project
Work done on the Citus distributed database is Open Source and public, so you can browse the GitHub repository for the main Citus extension and its project history at citus/commits/master.
A good way to discover a new code base is to either fix known bugs or extend the basic functionality, doing some of the grunt work that nobody has had the time to handle yet.
Doing small things at the beginning is important, so as to cover the basics and discover how things are done at the source code level. Moreover, it lets you discover how to contribute to the project: workflow, explicit and implicit policies, communication with the team, tests and coverage, the CI integration, all those things that revolve around the code itself and are even more important if you want to be able to contribute!
So the first couple of weeks at Citus Data have been just that for me. Lots of things to learn and information to process, and thanks to a great onboarding by Marco, some patches could be written in the meantime.
Early patches
So what do we have? My first ever patch to the Citus database implements ALTER TABLE … RENAME TO … for distributed tables. As you can see, it’s a small feature on top of the existing product, all about convenience for the end users. A perfect way to start contributing!
The next patch is pretty obvious and implements ALTER INDEX … RENAME TO … for Citus distributed tables. Some more convenience!
While on the theme, ALTER TABLE|INDEX … SET|RESET () on distributed tables was not yet handled by the Citus database either, so you know what? Yeah, we added that.
This made us realize that a known bug in a related area needed to be taken care of, so the next patch in the series fixes CREATE INDEX with storage options on distributed tables.
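To make the list of patches more concrete, here is a minimal sketch of the kind of statements they cover. It assumes a Citus cluster with the extension already installed; the table name, index name, column names, and fillfactor values are all hypothetical and chosen only for illustration, and fillfactor is just one example of a storage option.

```sql
-- Hypothetical schema: names and values are illustrative, not from the patches.
CREATE TABLE events (device_id bigint, payload jsonb);

-- Turn the table into a Citus distributed table, sharded on device_id.
SELECT create_distributed_table('events', 'device_id');

-- CREATE INDEX with storage options on a distributed table.
CREATE INDEX events_device_idx ON events (device_id) WITH (fillfactor = 80);

-- ALTER TABLE ... RENAME TO ... and ALTER INDEX ... RENAME TO ...
ALTER TABLE events RENAME TO device_events;
ALTER INDEX events_device_idx RENAME TO device_events_device_idx;

-- ALTER TABLE|INDEX ... SET|RESET (...) for storage parameters.
ALTER TABLE device_events SET (fillfactor = 70);
ALTER TABLE device_events RESET (fillfactor);
ALTER INDEX device_events_device_idx SET (fillfactor = 90);
ALTER INDEX device_events_device_idx RESET (fillfactor);
```

These are all plain PostgreSQL statements; the point of the patches is that they can be issued against a distributed table the same way as against a local one.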
In the making of those patches, we covered lots of ground with Marco and the team. I’m quite excited about what’s next on the plan now ;-)
Plans
Well, with some good luck and lots of hard work, we might some day write “and the rest is history.” But we’re not there yet, are we? So in the following weeks and months, I’m going to keep working on making the Citus distributed database easier to use.
It’s already quite amazing, all the things that you’ll find in this PostgreSQL extension, really: Citus implements clean semantics for distributed transactions and transparent query routing, and builds on top of that a very well integrated product with no-downtime scale-out properties.
I’m joining a talented team working on a mature product. I expect most of the work, at the beginning, to be on its rough edges, making the user experience smoother wherever possible!