New York!

April, 17 2014

A couple of weeks ago I had the chance to participate in the PGConf NYC 2014 Conference, one of the biggest conferences about PostgreSQL worldwide.

 

Last week some PostgreSQL users, contributors and advocates organized a really great conference in Stockholm, Sweden, where I had the pleasure to give the following talk:

 

In our previous article Aggregating NBA data, PostgreSQL vs MongoDB we spent time comparing the pretty new MongoDB Aggregation Framework with the decades old SQL aggregates. Today, let's showcase more of those SQL aggregates, producing a nice histogram right from our SQL console.

 

When reading the article Crunching 30 Years of NBA Data with MongoDB Aggregation I couldn't help but think that we've been enjoying aggregates in SQL for 3 or 4 decades already. When using PostgreSQL it's even easy to add your own aggregates, thanks to the SQL command create aggregate.

 

Back from the FOSDEM 2014 Conference, here are the slides I've been using for the Advanced Extension Use Cases talk I gave, based on the ongoing work to be found under the Tour of Extensions index on this web site.

 

FOSDEM 2014

January, 29 2014

This year again the PostgreSQL community is organising a FOSDEM PGDay right before the main event. Have a look at the PostgreSQL FOSDEM Schedule, it's packed with awesome talks... personally, it's been a while since I wanted to see so many of them!

 

A long time ago we talked about how to Import fixed width data with pgloader, following up on other stories still online at Postgres OnLine Journal and on David Fetter's blog. Back then, I showed that using pgloader made it easier to import the data, but also showed quite poor performance characteristics due to using the debug mode in the timings. Let's update that article with current pgloader wonders!

 

As presented at the PostgreSQL Conference Europe, the new version of pgloader is now able to fully migrate a MySQL database, including discovering the schema, casting data types, transforming data and default values. Sakila is the traditional MySQL example database; in this article we're going to fully migrate it over to PostgreSQL.

 

Back From Dublin

November, 05 2013

Last week I had the pleasure to present two talks at the awesome PostgreSQL Conference Europe. The first one was actually a tutorial about Writing & using Postgres Extensions where we spent 3 hours on what PostgreSQL Extensions are, what you can expect from them, and how to develop a new one. Then I also had the opportunity to present the new version of pgloader in a talk about Migrating from MySQL to PostgreSQL.

 

Denormalizing Tags

October, 24 2013

In our Tour of Extensions today's article is about advanced tag indexing. We have a great data collection to play with and our goal today is to be able to quickly find data matching a complex set of tags. So, let's find out those lastfm tracks that are tagged as blues and rhythm and blues, for instance.

 

At the Open World Forum two weeks ago I had the pleasure to meet with Colin Charles. We had a nice talk about the current state of both MariaDB and PostgreSQL, and were even both interviewed by the Open World Forum Team. The interview is now available online. Dear French readers, it's in English.

 

PostgreSQL is an all-round impressive Relational DataBase Management System which implements the SQL standard (see the very useful reference page Comparison of different SQL implementations for details). PostgreSQL also provides unique solutions in the database market and has been leading innovation for some years now. Still, there's no support for Autonomous Transactions within the server itself. Let's have a look at how to easily implement them with PL/Proxy.

 

Let's get back to our Tour of Extensions that had to be kept aside for a while with other concerns such as last chance PostgreSQL data recovery. Now that we have a data loading tool up to the task (read about it in the Loading Geolocation Data article) we're going to be able to play with the awesome ip4r extension from RhodiumToad.

 

Last Friday I had the chance to be speaking at the Open World Forum in the NewSQL track, where we had lots of interest and excitement around the NoSQL offerings. Of course, my talk was about explaining how PostgreSQL is Web Scale with some historical background and technical examples about what this database engine is currently capable of.

 

In our previous article about Loading Geolocation Data, we did load some data into PostgreSQL and saw the quite noticeable impact of a user transformation. As it happens, the function that did the integer to IP representation was so naive as to scratch the micro optimisation itch of some Common Lisp hackers: thanks a lot guys, in particular stassats who came up with the solution we're seeing now.
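
To make the transformation concrete, here is a minimal Python sketch of turning a 32-bit integer into its dotted-quad IPv4 text form. The function name `int_to_ipv4` is hypothetical, and this is neither the original naive function nor the optimised Common Lisp solution from the article, only an illustration of what such a conversion computes.

```python
def int_to_ipv4(n: int) -> str:
    """Render a 32-bit unsigned integer as dotted-quad IPv4 text."""
    if not 0 <= n <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit unsigned integer")
    # Extract the four bytes, most significant first.
    return ".".join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(int_to_ipv4(3232235777))
```

Bit shifting and masking avoid any intermediate string or list manipulation, which is the kind of detail that matters when the function runs once per loaded row.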

 

Loading Geolocation Data

October, 01 2013

As I've been mentioning in the past already, I'm currently rewriting pgloader from scratch in Common Lisp. In terms of technical debt that's akin to declaring bankruptcy, which is both sad news and good news as there's suddenly new hope of doing it right this time.

 

Open World Forum 2013

September, 19 2013

Have you heard about the Open World Forum conference that takes place in Paris, October 3-5, 2013? I'll be presenting a talk about PostgreSQL in the track NewSQL: Managing large data sets with relational technologies.

 

PostgreSQL data recovery

September, 17 2013

The following story is only interesting to read if you like it when bad things happen, or if you don't have a trustworthy backup policy in place. By trustworthy I mean that each backup you take must be tested with a test recovery job. Only tested backups will prove useful when you need them. So go read our Backup and Restore documentation chapter then learn how to set up Barman for handling physical backups and Point In Time Recovery. Come back when you have proper backups, including recovery testing in place. We are waiting for you. Back? Ok, let's see how bad you can end up without backups, and how to still recover. With luck.

 

Using trigrams against typos

September, 06 2013

In our ongoing Tour of Extensions we played with earth distance in How far is the nearest pub? then with hstore in a series about triggers, first to generalize Trigger Parameters then to enable us to Auditing Changes with Hstore. Today we are going to work with pg_trgm which is the trigrams PostgreSQL extension: its usage got seriously enhanced in recent PostgreSQL releases and it's now a poor man's Full Text Search engine.
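
For readers new to trigrams, here is a small Python sketch of the idea behind trigram similarity. It follows the convention documented for pg_trgm (lowercase each word, pad it with two leading spaces and one trailing space, then compare the sets of 3-character substrings), but it is only an approximation: the helper names are hypothetical, and the word-splitting regular expression assumes plain ASCII alphanumerics.

```python
import re


def trigrams(text: str) -> set:
    """Return the set of padded trigrams for every word in text."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("cat")))
print(similarity("word", "two words"))
```

Because a typo only disturbs the few trigrams that overlap the wrong character, misspelled input still shares most of its trigrams with the intended word, which is what makes the technique useful against typos.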

 

In a previous article about Trigger Parameters we have been using the extension hstore in order to compute some extra field in our records, where the fields used both for the computation and for storing the results were passed in as dynamic parameters. Today we're going to see another trigger use case for hstore: we are going to record changes made to our tuples.

 

Trigger Parameters

August, 23 2013

Sometimes you want to compute values automatically at INSERT time, like for example a duration column out of a start and an end column, both timestamptz. It's easy enough to do with a BEFORE TRIGGER on your table. What's more complex is to come up with a parametrized spelling of the trigger, where you can attach the same stored procedure to any table even when the column names are different from one another.

 

There was SQL before window functions and SQL after window functions: that's how powerful this tool is. Being that much of a game changer unfortunately means that it can be quite hard to grasp the feature. This article aims at making it crystal clear so that you can begin using it today and are able to reason about it and recognize cases where you want to be using window functions.

 

About the only time when I will accept to work with MySQL is when you need help to migrate away from it because you decided to move to PostgreSQL instead. And it's already been too much of a pain really, so after all this time I began consolidating what I know about that topic and am writing a piece of software to help me here. Consider it the MySQL Migration Toolkit.

 

In our recent article about The Most Popular Pub Names we did have a look at how to find the pubs nearby, but didn't compute the distance between that pub and us. That's because computing a distance from positions on the earth expressed as longitude and latitude is not that easy. Today, we are going to solve that problem nonetheless, thanks to PostgreSQL Extensions.
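
To show why the computation is not a one-liner, here is a Python sketch of the haversine great-circle formula. This is a swapped-in textbook technique, not the code of the PostgreSQL extension the article relies on; the function name and the mean earth radius of 6371 km are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the earth is not a perfect sphere


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating point rounding before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# A quarter of the way around the equator.
print(haversine_km(0.0, 0.0, 0.0, 90.0))
```

A spherical model is usually good enough for "how far is the nearest pub" questions; more precise work needs an ellipsoidal model.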

 

In his article titled The Most Popular Pub Names Ross Lawley did show us how to perform some quite interesting geographic queries against MongoDB, using some nice Open Data found at the Open Street Map project.

 

After spending an awesome week in San Francisco, CA I'm lucky enough to be spending another week in the USA, in Portland, OR. The main excuse for showing up here has been OSCON where I presented a talk about the fotolog migration from MySQL to PostgreSQL.

 

Talking at the SFPUG

July, 19 2013

These days feel really lucky to me. I'm currently visiting friends and customers in San Francisco, and really enjoying my trip here! Of course Josh Berkus took the opportunity to organise an SFPUG meeting and I had the pleasure of being the speaker over there.

 

Back from CHAR(13)

July, 15 2013

Last week the CHAR(13) conference was held in a great venue in the UK countryside. Not only did we discover the UK under good weather conditions and some local beers, we also did share a lot of good ideas!

 

In a recent article here we've been talking about how to do Batch Updates in a very efficient way, using the Writable CTE features available in PostgreSQL 9.1. I sometimes read how Common Table Expressions changed the life of fellow DBAs and developers, and would say that Writable CTE are at least the same boost again.

 

In a recent article Craig Kerstiens from Heroku did demo the really useful crosstab extension. That function allows you to pivot a table so that you can see the data from different categories in separate columns in the same row rather than in separate rows. The article from Craig is Pivoting in Postgres.

 

Conferences Report

July, 03 2013

Recently I've been to some more conferences and didn't take the time to blog about them, even though I really did have great fun over there. So I felt I should take some time and report about my experience at those conferences. And of course, some more is on the way, as the PostgreSQL Conference Tour gets busier each year it seems.

 

Tonight I had the pleasure to present a talk at the Dublin PostgreSQL User Group using remote technologies. The talk is about how to make the most out of PostgreSQL when using SQL as a developer, and tries to convince you to dive into mastering SQL by showing how to solve an application example all in SQL, using window functions and common table expressions.

 

Nearest Big City

May, 02 2013

In this article, we want to find the town with the greatest number of inhabitants near a given location.

 

Bulk Replication

March, 18 2013

In the previous article here we talked about how to properly update more than one row at a time, under the title Batch Update. We did consider performance, including network round trips, and did look at the behavior of our results when used concurrently.

 

Batch Update

March, 15 2013

Performance consulting involves some tricks that you have to teach over and over again. One of them is that SQL tends to be so much better at dealing with plenty of rows in a single statement when compared to running as many statements, each one against a single row.

 

Emacs Conference

March, 04 2013

The Emacs Conference is happening, it's real, and it will take place at the end of this month in London. Check it out, and register at Emacs Conference Event Brite. It's free and there's still some availability.

 

HyperLogLog Unions

February, 26 2013

In the article from yesterday we talked about PostgreSQL HyperLogLog with some details. The real magic of that extension has been skimmed over though, and needs another very small article all by itself, in case you missed it.

 

PostgreSQL HyperLogLog

February, 25 2013

If you've been following along at home the newer statistics developments, you might have heard about this new State of The Art Cardinality Estimation Algorithm called HyperLogLog. This technique is now available for PostgreSQL in the extension postgresql-hll available at https://github.com/aggregateknowledge/postgresql-hll and soon to be in debian.

 

Playing with pgloader

February, 12 2013

While making progress with both Event Triggers and Extension Templates, I needed to take a little break. My current keeping sane mental exercise seems to mainly involve using Common Lisp, a programming language that ships with about all the building blocks you need.

 

Live Upgrading PGQ

February, 08 2013

Some skytools related news today, it's been a while. For those who were at my FOSDEM talk about Implementing High Availability you might have heard that I really like working with PGQ. A new version has been released a while ago, and the most recent version is now 3.1.3, as announced in the Skytools 3.1.3 email.

 

Another Great FOSDEM

February, 04 2013

This year's FOSDEM has been a great edition, in particular the FOSDEM PGDAY 2013 was a great way to begin a 3-day marathon of talking about PostgreSQL with people not only from our community but also from plenty of other Open Source communities too: users!

 

A Sunday at FOSDEM

January, 30 2013

The previous article FOSDEM 2013 said to be careful with the PostgreSQL devroom schedule because one of my talks there might get swapped with a slot on the FOSDEM PGDay 2013 which happens this Friday and has been sold out anyway.

 

FOSDEM 2013

January, 29 2013

This year again I'm going to FOSDEM, and to the extra special PostgreSQL FOSDEM day. It will be the first time that I'm going to be at the event for the full week-end rather than just commuting in for the day.

 

pgloader: what's next?

January, 28 2013

pgloader is a tool to help loading data into PostgreSQL, adding some error management to the COPY command. COPY is the fast way of loading data into PostgreSQL and is transaction safe. That means that if a single error appears within your bulk of data, you will have loaded none of it. pgloader will submit the data again in smaller chunks until it's able to isolate the bad from the good, and then the good is loaded in.
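
The retry strategy described above can be sketched in a few lines of Python. This is a hedged illustration, not pgloader's actual code: splitting each failed batch in half is an assumption about how "smaller chunks" might be chosen, and `load_with_isolation`, `copy_batch` and `fake_copy` are hypothetical names. `copy_batch` stands in for an all-or-nothing COPY that raises `ValueError` when any row in the batch is bad.

```python
def load_with_isolation(rows, copy_batch):
    """Load rows through an all-or-nothing copy_batch, isolating bad rows.

    Returns (loaded, rejected). A failed batch is split in two and each
    half is retried, until failing batches contain a single row.
    """
    loaded, rejected = [], []

    def attempt(batch):
        if not batch:
            return
        try:
            copy_batch(batch)      # either the whole batch goes in, or none of it
            loaded.extend(batch)
        except ValueError:
            if len(batch) == 1:
                rejected.append(batch[0])   # the bad row is now isolated
            else:
                mid = len(batch) // 2
                attempt(batch[:mid])
                attempt(batch[mid:])

    attempt(list(rows))
    return loaded, rejected


def fake_copy(batch):
    """Stand-in for COPY: fails if any row is not an integer literal."""
    for row in batch:
        int(row)


loaded, rejected = load_with_isolation(["1", "2", "x", "4"], fake_copy)
print(loaded, rejected)
```

With few bad rows, most of the data still goes through in large batches, so the cost of error handling stays proportional to the number of errors rather than the number of rows.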

 

Another day, another migration from MySQL to PostgreSQL... or at least that's how it feels sometimes. This time again I've been using some quite old scripts to help me do the migration.

 

Extensions Templates

January, 08 2013

In a recent article titled Inline Extensions we detailed the problem of how to distribute an extension's package to a remote server without having access to its file system at all. The solution to that problem is non trivial, let's say. But thanks to the awesome PostgreSQL Community we finally have some practical ideas on how to address the problem as discussed on pgsql-hackers, our development mailing list.

 

Inline Extensions

December, 13 2012

We've been having the CREATE EXTENSION feature in PostgreSQL for a couple of releases now, so let's talk about how to go from here. The first goal of the extension facility has been to allow for a clean dump and restore process of contrib modules. As such it's been tailored to the needs of deploying files on the file system because there's no escaping from that when you have to ship binary and executable files, those infamous .so, .dll or .dylib things.

 

Editing SQL

November, 06 2012

It's hard to read my blog yet not know I'm using Emacs. It really is a great tool and has a lot to compare to PostgreSQL in terms of extensibility, documentation quality and community. And there's even a native implementation of the PostgreSQL Protocol written in Emacs Lisp.

 

PostgreSQL for developers

November, 02 2012

As Guillaume says, we've been enjoying a great evening conference in Lyon 2 days ago, presenting PostgreSQL to developers. He did the first hour presenting the project and the main things you want to know to start using PostgreSQL in production, then I took the opportunity to be talking to developers to show off some SQL.

 

Another awesome conf

October, 30 2012

Last week was PostgreSQL Conference Europe 2012 in Prague, and it's been awesome. Many thanks to the organisers who did manage to host a very smooth conference with 290 attendees, including speakers. That means you kept walking into interesting people to talk to, and in particular the Hallway Track has been a giant success.

 

Prefixes and Ranges

October, 16 2012

It's been a long time since I last had some time to spend on the prefix PostgreSQL extension and its prefix_range data type. With PostgreSQL 9.2 out, some users wanted me to update the extension for that release, and hinted that it was high time that I fix that old bug for which I already had a patch.

 

Reset Counter

October, 05 2012

I've been given a nice puzzle that I think is a good blog article opportunity, as it involves some thinking and window functions.

 

PostgreSQL 9.3

September, 15 2012

PostgreSQL 9.2 is released! It's an awesome new release that I urge you to consider trying and adopting. An upgrade from even 9.1 should be very well worth it, as your hardware could suddenly be able to process a much higher load. Indeed, better performance means more work done on the same budget, that's the name of the game!

 

Autumn 2012 Conferences

August, 02 2012

The PostgreSQL community hosts a number of conferences all year round, and the next ones I'm lucky enough to get to are approaching fast now. First, next month in September, we have Postgres Open in Chicago, where my talk about Large Scale Migration from MySQL to PostgreSQL has been selected!

 

PGDay France 2012

June, 08 2012

The French PostgreSQL Conference, pgday.fr, was yesterday in Lyon. We had a very good time and a great schedule with a single track packed with 7 talks, addressing a diverse set of PostgreSQL related topics, from GIS to fuzzy logic, including replication.

 

Back From PgCon

May, 24 2012

Last week was the annual PostgreSQL Hackers gathering in Canada, thanks to the awesome pgcon conference. This year's edition has been packed with good things, beginning with the Cluster Summit, followed the next day by the Developer Meeting, itself followed (yes, on the same day) by the In Core Replication Meeting. That was a packed schedule!

 

Clean PGQ Subconsumers

April, 26 2012

Now that you're all using the wonders of Cooperative Consumers to help you efficiently and reliably implement your business constraints and offload them from the main user transactions, you're reaching a point where you have to clean up your development environment (because that's what happens to development environments, right?), and you want a way to start again from a clean empty place.

 

PGQ Coop Consumers

March, 12 2012

While working on a new PostgreSQL architecture for a high scale project that used to be in the top 10 of internet popular web sites (in terms of visitors), I needed to be able to off load some processing from the main path: that's called a batch job. This needs to be transactional: don't run the job if we did rollback the transaction; process all events that were part of the same transaction in the same transaction, etc.

 

PostgreSQL 9.1 includes proper extension support, as you might well know if you ever read this very blog here. Some hosting facilities are playing with PostgreSQL at big scale (hello Heroku!) and still meet with small caveats making their life uneasy.

 

pgbouncer munin plugin

November, 16 2011

It seems that if you search for a munin plugin for pgbouncer it's easy enough to reach an old page of mine with an old version of my plugin, and a broken link. Let's remedy that by publishing here the newer version of the plugin. To be honest, I thought it already made its way into the official munin 1.4 set of plugins, but I've not been following closely enough.

 

Back From Amsterdam

October, 26 2011

Another great conference took place last week, PostgreSQL Conference Europe 2011 was in Amsterdam and plenty of us PostgreSQL geeks were too. I attended a lot of talks and did learn some more about our project, its community and its features, but more than that it was a perfect occasion to meet with the community.

 

Implementing backups

October, 12 2011

I've been asked about my opinion on backup strategy and best practices, and it so happens that I have some kind of an opinion on the matter.

 

Scaling Stored Procedures

October, 06 2011

In the news recently stored procedures were used as an excuse for moving logic away from the database layer to the application layer, and to migrate away from a powerful technology to a simpler one, now that there's no logic anymore in the database.

 

See you in Amsterdam

October, 04 2011

The next PostgreSQL conference is approaching very fast now, I hope you have your ticket already: it's a very promising event! If you want some help in deciding whether to register or not, just have another look at the schedule. Pick the talks you want to see. It's hard, given how packed with good ones the schedule is. When your mind is all set, review the list. Registered?

 

Skytools3: walmgr

September, 21 2011

Let's begin the Skytools 3 documentation effort, which is long overdue. The code is waiting for you over at github, and is stable and working. Why is it still in release candidate status, I hear you asking? Well because it's missing updated documentation.

 

PostgreSQL and debian

September, 05 2011

After talking about it for a very long time, work finally did begin! I'm talking about the apt.postgresql.org build system that will allow us, in the long run, to propose debian versions of binary packages for PostgreSQL and its extensions, compiled for a bunch of debian and ubuntu versions.

 

On the PostgreSQL Hackers mailing lists, Andrew Dunstan just proposed some new options for pg_dump and pg_restore to ease our lives. One of the answers was talking about some scripts available to exploit the pg_restore listing that you play with using options -l and -L, or the long name versions --list and --use-list. The pg_staging tool allows you to easily exploit those lists too.

 

Skytools, version 3

August, 26 2011

You can find skytools3 in debian experimental already, it's in release candidate status. What's missing is the documentation, so here's an idea: I'm going to make a blog post series about skytools' next features, how to use them, what they are good for, etc. This first article of the series will just list what those new features are.

 

pgfincore in debian

August, 19 2011

As of pretty recently, pgfincore is now in debian, as you can see on its postgresql-9.0-pgfincore page. The reason why it entered the debian archives is that it reached the 1.0 release!

 

pgloader constant cols

August, 12 2011

The previous articles in the pgloader series detailed How To Use PgLoader then How to Setup pgloader, then what to expect from a parallel pgloader setup, and then pgloader reformating. Another need you might encounter when you get to use pgloader is adding constant values into a table's column.

 

pgloader reformating

August, 05 2011

Back to our series about pgloader. The previous articles detailed How To Use PgLoader then How to Setup pgloader, then what to expect from a parallel pgloader setup. This article will detail how to reformat input columns so that what PostgreSQL sees is not what's in the data file, but the result of a transformation from this data into something acceptable as an input for the target data type.

 

See Tsung in action

August, 02 2011

Tsung is an open-source multi-protocol distributed load testing tool and a mature project. It's been available for about 10 years and is built with the Erlang system. It supports several protocols, including the PostgreSQL one.

 

Parallel pgloader

August, 01 2011

This article continues the series that began with How To Use PgLoader then detailed How to Setup pgloader. We have some more fine points to talk about here, today's article is about loading your data in parallel with pgloader.

 

How to Setup pgloader

July, 29 2011

In a previous article we detailed how to use pgloader, let's now see how to write the pgloader.conf that instructs pgloader about what to do.

 

Next month partitions

July, 27 2011

When you do partition your tables monthly, then comes the question of when to create the next partitions. I tend to create them just the week before next month and I have some nice nagios scripts to alert me in case I've forgotten to do so. How to check that by hand at the end of a month?

 

How To Use PgLoader

July, 22 2011

This question about pgloader usage comes in quite frequently, and I think the examples README goes a long way in answering it. It's not exactly a tutorial but is almost there. Let me paste it here for reference:

 

Skytools3 talk Slides

July, 19 2011

In case you're wondering, here are the slides from the CHAR(11) talk I gave, about Skytools 3.0, soon to be released. That means as soon as I have enough time available to polish (or write) the documentation.

 

Back From CHAR(11)

July, 13 2011

CHAR(11) finished sometime in the night leading to today, if you consider the social events to be part of it, which I definitely do. This conference has been a very good one, both on the organisation side of things and of course for its content.

 

We still have this problem to solve with extensions and their packaging. How to best organize things so that your extension is compatible with before 9.1 and 9.1 and following releases of PostgreSQL?

 

While Magnus is all about PG Conf EU already, you have to realize we've just landed back from PG Con in Ottawa. My next stop in the annual conferences is CHAR 11, the Clustering, High Availability and Replication conference in Cambridge, 11-12 July. Yes, on the old continent this time.

 

Preparing for PGCON

May, 12 2011

It's this time of the year again, the main international PostgreSQL Conference is next week in Ottawa, Canada. If previous years are any indication, this will be a great event where you can meet with a lot of the members of your community. The core team will be there, developers will be there, and we will meet with users and their challenging use cases.

 

Let's say you need to ALTER TABLE foo ALTER COLUMN bar TYPE bigint;, and PostgreSQL is helpfully telling you that no you can't because such and such views depend on the column. The basic way to deal with that is to copy paste from the error message the names of the views involved, then prepare a script wherein you first DROP VIEW ...; then ALTER TABLE and finally CREATE VIEW again, all in the same transaction.

 

While currently too busy at work to deliver much Open Source contributions, let's debunk an old habit of PostgreSQL extension authors. It's all down to copy pasting from contrib, and there's no reason to continue doing $libdir this way ever since 7.4 days.

 

I've been working on skytools3 packaging lately. I've been pushing quite a lot of work into it, in order to have exactly what I needed out of the box, after some 3 years of production and experiences with the products. Plural, yes, because even if pgbouncer and plproxy are siblings to the projects (same developers team, separate life cycle and releases), then skytools still includes several sub-projects.

 

towards pg_staging 1.0

March, 29 2011

If you don't remember what pg_staging is all about, it's a central console from where to control all your PostgreSQL databases. Typically you use it to manage your development and pre-production setup, where developers ask you pretty often to install them some newer dump from the production, and you want that operation streamlined and easy.

 

Extensions in 9.1

March, 01 2011

If you've not been following closely you might have missed out on extensions integration. Well, Tom spent some time on the patches I've been preparing for the last 4 months. And not only did he commit most of the work but he also enhanced some parts of the code (better factoring) and basically finished it.

 

Back from FOSDEM

February, 07 2011

This year we were in the main building of the conference, and apparently the booth went very well, selling lots of PostgreSQL merchandise etc. I had the pleasure to once again meet with the community, but being there only 1 day I didn't spend as much time as I would have liked with some of the people there.

 

Going to FOSDEM

February, 01 2011

A quick blog entry to say that yes:

 

pg_basebackup

November, 07 2010

Hannu just gave me a good idea in this email on -hackers, proposing that pg_basebackup should get the xlog files again and again in a loop for the whole duration of the base backup. That's now done in the aforementioned tool, whose options got a little more useful now:

 

Introducing Extensions

October, 21 2010

After reading Simon's blog post, I can't help but try to give some details about what it is exactly that I'm working on. As he said, there are several aspects to extensions in PostgreSQL, it all begins here: Chapter 35. Extending SQL.

 

These days, thanks to my community oriented job, I'm working full time on a PostgreSQL patch to complete basic support for extending SQL. First thing I want to share is that patching the backend code is not as hard as one would think. Second one is that git really is helping.

 

Date puzzle for starters

October, 08 2010

The PostgreSQL IRC channel is a good place to be, for all the very good help you can get there, because people are always wanting to remain helpful, because of the off-topic discussions sometimes, or to get to talk with community core members. And to start up your day too.

 

Yeah I'm back on working on my part of the extension thing in PostgreSQL.

 

The major reason why I dislike perl so much, and ruby too, and the thing I'd want different in the Emacs Lisp API so far is how they set developers' minds into using regexp. You know the quote, don't you?

 

The drawback of hosting a static only website is, obviously, the lack of comments. What happens actually, though, is that I receive very few comments by direct mail. As I don't get another spam source to cleanup, I'm left unconvinced that's such a drawback. I still miss the low probability of seeing blog readers exchange directly, but I think a tapoueh.org mailing list would be my answer, here...

 

Window Functions example

September, 09 2010

So, when 8.4 came out there were all those comments about how getting window functions was an awesome addition. Now, it seems that a lot of people seeking help in #postgresql just don't know what kind of problem this feature helps solve. I've already been using them in some cases here in this blog, for getting some nice overview about Partitioning: relation size per “group”.

 

Synchronous Replication

September, 06 2010

Although the new asynchronous replication facility that ships with 9.0 ain't released to the wide public yet, our hacker heroes are already working on the synchronous version of it. A part of the facility is rather easy to design, we want something comparable to DRBD flexibility, but specific to our database world. So synchronous would either mean recv, fsync or apply, depending on what you need the standby to have already done when the master acknowledges the COMMIT. Let's call that the service level.

 

Happy Numbers

August, 30 2010

After discovering the excellent Gwene service, which allows you to subscribe to newsgroups to read RSS content (blogs, planets, commits, etc), I came to read this nice article about Happy Numbers. That's a little problem that fits well an interview style question, so I first solved it yesterday evening in Emacs Lisp as that's the language I use the most these days.
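
For reference, a number is happy when repeatedly replacing it with the sum of the squares of its digits eventually reaches 1; unhappy numbers end up in a cycle instead. Here is a short Python sketch of that check, not the Emacs Lisp solution from the article; the function name is hypothetical.

```python
def is_happy(n: int) -> bool:
    """Return True when iterating the digit-square sum from n reaches 1."""
    if n < 1:
        raise ValueError("happy numbers are defined for positive integers")
    seen = set()
    while n != 1 and n not in seen:
        seen.add(n)  # remember visited values to detect a cycle
        n = sum(int(digit) ** 2 for digit in str(n))
    return n == 1


print([n for n in range(1, 21) if is_happy(n)])
```

Tracking the values already seen is what guarantees termination: the sequence either hits 1 or revisits a number.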

 

Playing with bit strings

August, 26 2010

The idea of the day ain't directly from me, I'm just helping with a very thin subpart of the problem. I can't say much about the problem itself, so let's just assume you want to reduce the storage of MD5 in your database, and you want to abuse bit strings for that. A solution using them works fine, but the datatype is still missing some facilities, for example going to and from a hexadecimal representation in text.
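To make the missing facility concrete, here's a rough sketch of the two conversions in Python, outside the database. It only illustrates the idea and is not the SQL solution; the helper names `hex_to_bits` and `bits_to_hex` are hypothetical.

```python
def hex_to_bits(hex_text):
    """Turn a hexadecimal string into a string of '0'/'1', 4 bits per hex digit."""
    width = len(hex_text) * 4
    # zfill keeps the leading zero bits that int() would otherwise drop.
    return bin(int(hex_text, 16))[2:].zfill(width)


def bits_to_hex(bits):
    """Turn a string of '0'/'1' (length a multiple of 4) back into hexadecimal."""
    width = len(bits) // 4
    return format(int(bits, 2), "x").zfill(width)
```

An MD5 digest is 32 hexadecimal digits, so `hex_to_bits` gives a 128-bit string, and `bits_to_hex` brings it back unchanged.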

 

We're using constants in some constraints here, for example in cases where several servers are replicating to the same federating one: each origin server has its own schema, and all is replicated nicely on the central host, thanks to Londiste, as you might have guessed already.

 

In trying to help an extension Debian packaging effort, I've once again offered to handle it. That's because I'm now beginning to know how to do it, as you can see in my package overview page at the Debian QA facility. There's a reason why I volunteered here: yet another tool of mine is now to be found in Debian, and should greatly help extension packaging there. You can already check the postgresql-server-dev-all package page if you're that impatient!

 

Some user on IRC was reading the release notes in order to plan for a minor upgrade of his 8.3.3 installation, and was puzzled about potential needs for rebuilding GiST indexes. That's from the 8.3.5 release notes, and from the 8.3.8 notes you see that you need to consider hash indexes on interval columns too. Now the question is, how to find out if any such beasts are in use in your database?

 

Today I'm being told once again about SQLite as an embedded database software. That one ain't a database server but a software library that you can use straight from your main program. I have yet to use it, but it looks like its SQL support is good enough for simple things, and that covers loads of things. I guess read-only cache and configuration storage would be the obvious ones, because it seems that SQLite use cases don't include mixed concurrency, that is workloads with concurrent readers and writers.

 

This time, we are trying to figure out where the bulk of the data is on disk. The trick is that we're using DDL partitioning, but we want a “nice” view of size per partition set. Meaning that if you have for example a parent table foo with partitions foo_201006 and foo_201007, you would want to see a single category foo containing the accumulated size of all the partitions underneath foo.
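The grouping itself can be sketched in a few lines of Python, under a loud assumption: partitions are recognised purely by a `_YYYYMM` name suffix, as in the foo_201006 example. The article works from inside the database; the function names and the naming rule here are only illustrative.

```python
from collections import defaultdict


def partition_group(name):
    """Map a partition name such as foo_201006 to its parent foo.

    Assumption: a partition is any relation whose name ends in an
    underscore followed by exactly six digits (YYYYMM).
    """
    parent, sep, suffix = name.rpartition("_")
    if parent and sep and len(suffix) == 6 and suffix.isdigit():
        return parent
    return name  # not a partition: it forms its own group


def size_per_group(relation_sizes):
    """Accumulate sizes (in bytes) per partition set."""
    totals = defaultdict(int)
    for name, size in relation_sizes.items():
        totals[partition_group(name)] += size
    return dict(totals)
```

With `{"foo_201006": 10, "foo_201007": 20, "bar": 5}` as input, the two foo partitions end up under a single foo entry of size 30, and bar keeps its own line.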

 

Emacs and PostgreSQL

July, 22 2010

Those are my two all-time favorite pieces of Open Source Software. Or Free Software in the GNU sense of the word, as both the BSD and the GPL are labeled free there. Even if I prefer the Debian Free Software Guidelines as a global definition and the WTFPL license. But that's a digression.

 

Background writers

July, 19 2010

There's currently a thread on hackers about bg worker: overview and a series of 6 patches. Thanks a lot Markus! This is all about generalizing a concept already in use in the autovacuum process, where you have an independent subsystem that requires having an autonomous daemon running and able to start its own workers.

 

Logs analysis

July, 13 2010

Nowadays to analyze logs and provide insights, the most common tool to use is pgfouine, which does an excellent job. But there have been some improvements in logging capabilities that we're not benefiting from yet, and I'm thinking about the CSV log format.

 

There's a big trend nowadays about using column storage as opposed to what PostgreSQL is doing, which would be row storage. The difference is that if you have the same column value in a lot of rows, you could get to a point where you have this value only once in the underlying storage file. That means high compression. Then you tweak the executor to be able to load this value only once, not once per row, and you cut another huge source of data traffic (often enough, from disk).
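One simple way to picture "this value only once" is run-length encoding of a single column. That's an assumption picked for illustration, as the text doesn't name a particular compression scheme, and the function names below are made up.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_decode(encoded):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in encoded for _ in range(count)]
```

A column holding "fr" three times and then "uk" twice is stored as just two pairs, and a reader only has to load each value once per run rather than once per row.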

 

MVCC in the Cloud

July, 06 2010

At CHAR(10) Markus had a talk about Using MVCC for Clustered Database Systems and explained how Postgres-R does it. The scope of his project is to maintain a set of database servers in the same state, eventually.

 

Back from CHAR(10)

July, 05 2010

It surely does not feel like a full month and some more went by since we were enjoying PGCon 2010, but in fact it was already time for CHAR(10). The venue was most excellent, as Oxford is a very beautiful city. Also, the college was like a city in the city, and having the accommodation all in there really smoothed it all.

 

Back from PgCon2010

May, 27 2010

This year's edition has been the best pgcon ever for me. Granted, it's only my third time, but still :) As Josh said the "Hall Track" in particular was very good, and the Dev Meeting has been very effective!

 

So, following previous blog entries about importing fixed width data, from Postgres Online Journal and David (perl) Fetter, I couldn't resist following the meme and showing how to achieve the same thing with pgloader.

 

Yes. This pgloader project is still maintained and somewhat active. Development happens when I receive a complaint, either about a bug in existing code or a feature in yet-to-write code. If you have a bug to report, just send me an email!

 

This time we're having a database where sequences were used, but not systematically as a default value of a given column. It's mainly an historic bad idea, but you know the usual excuse with bad ideas and bad code: the first 6 months it's experimental, after that it's historic.

 

So, if you followed the previous blog entry, now you have a new database containing all the static tables encoded in UTF-8 rather than SQL_ASCII. Because if it was not yet the case, you now severely distrust this non-encoding.

 

It happens that you have to manage databases designed by your predecessor, and it even happens that the team used to not have a DBA. Those histerical raisins can lead to having a SQL_ASCII database. The horror!

 

So, after restoring a production dump with intermediate filtering, none of our sequences were set to the right value. I could have tried to review the process of filtering the dump here, but it's a one-shot action and you know what that sometimes means. With some pressure you don't script enough of it and you just crawl more and more.

 

prefix 1.1.0

November, 30 2009

So I had two bug reports about prefix in less than a week. It means several things, one of them is that my code is getting used in the wild, which is nice. The other side of the coin is that people do find bugs in there. This one is about the behavior of the btree opclass of the type prefix range. We cheat a lot there by simply having written one, because a range does not have a strict ordering: is [1-3] before or after [2-4]? But when you know you have no overlapping intervals in your prefix_range column, being able to have it part of a primary key is damn useful.

 

moment. Lots of attendees, lots of quality talks (slides are online), good food, great party: all the ingredients were there!

 

prefix 1.0.0

October, 06 2009

So there it is, at long last, the final 1.0.0 release of prefix! It's on its way into the Debian repository (targeting sid, in testing in 10 days) and available on pgfoundry too.

 

prefix 1.0~rc2-1

July, 09 2009

I've been having problems with building both the postgresql-8.3-prefix and postgresql-8.4-prefix Debian packages from the same source package, and fixing the packaging issue forced me into modifying the main prefix Makefile. So while reaching rc2, I tried to think about missing pieces that are easy to add this late in the game: and there's one, that's a function length(prefix_range), so that you don't have to cast to text anymore in the following widespread query:

 

At long last, after millions and millions of queries just here at work and some more in other places, the prefix project is reaching the 1.0 milestone. The release candidate is getting uploaded into Debian at the moment of this writing, and is available at the following place: prefix-1.0~rc1.tar.gz.

 

PgCon 2009

May, 27 2009

I can't really compare PgCon 2009 with previous years' editions, as the last time I enjoyed the event was in 2006, in Toronto. But still I found the experience to be a great one, and I hope I'll be there next year too!

 

On the performance mailing list, a recent thread drew my attention. It turned out to be about using connection pooling software and prepared statements in order to increase the scalability of PostgreSQL when confronted with a lot of concurrent clients all doing simple select queries. The advantage of the pooler is to reduce the number of backends needed to serve the queries, thus reducing PostgreSQL internal bookkeeping. Of course, my choice of software here is clear: PgBouncer is an excellent top grade solution, performs real well (it won't parse queries), and is reliable and flexible.

 

It's time for Skytools news again! First, we improved the documentation of the current stable branch by hosting high level presentations and tutorials on the PostgreSQL wiki. Do check out the Londiste Tutorial, it seems that's what people hesitating to try out londiste were missing the most.

 

The prefix project is about matching a literal against prefixes in your table, the typical example being a telecom routing table. Thanks to the excellent work around generic indexes in PostgreSQL with GiST, indexing prefix matches is easy to support in an external module. Which is what the prefix extension is all about.
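To illustrate what "matching a literal against prefixes" means in the routing case, here's a naive Python sketch that picks the longest stored prefix the number starts with. It's a plain linear scan, not the GiST-indexed search the extension provides, and the function name and sample prefixes are hypothetical.

```python
def longest_prefix_match(number, prefixes):
    """Return the longest prefix that the given number starts with, or None."""
    matches = [prefix for prefix in prefixes if number.startswith(prefix)]
    # Several prefixes can match at once; routing wants the most specific one.
    return max(matches, key=len) if matches else None
```

With the prefixes "01", "0146" and "02" in the table, the number "0146640123" matches both "01" and "0146", and the routing decision goes with "0146".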

 

The problem was raised this week on IRC and this time again I felt it would be a good occasion for a blog entry: how to load an XML file content into a single field?

 

In this Russian page you'll see a nice presentation of Skype database architectures by Asko Oja himself. It's the talk given at the Russian PostgreSQL Community meeting, October 2008, Moscow, and it's a good read.

 

As it happens, I've got some environments where I want to make sure HOT (aka Heap Only Tuples) is in use, because we're doing so many updates a second that I want to make sure it's not killing my database server. I not only wrote a checking view to see about it, but also made a quick article about it on the French PostgreSQL website. Hanging around in #postgresql means that I'm now bound to write about it in English too!

 

Londiste Trick

January, 21 2009

So, you're using londiste and the ticker has not been running all night long, due to some restart glitch in your procedures, and the on call admin didn't notice the restart failure. If you blindly restart the replication daemon, it will load in memory all those events produced during the night, at once, because you now have only one tick to put them all in.

 

Fake entry

December, 04 2008

This is a test of a fake entry to see how muse will manage this.