In our previous article Aggregating NBA data, PostgreSQL vs MongoDB we spent time comparing the pretty new MongoDB Aggregation Framework with the decades-old SQL aggregates. Today, let's showcase more of those SQL aggregates, producing a nice histogram right from our SQL console.
When reading the article Crunching 30 Years of NBA Data with MongoDB Aggregation I couldn't help but think that we've been enjoying aggregates in SQL for 3 or 4 decades already. When using PostgreSQL it's even easy to add your own aggregates, thanks to the SQL command create aggregate.
Our first meetup of Parisian PostgreSQL users took place on November 28, 2013 and was the subject of an introductory post right here: the article Groupe d'Utilisateurs PostgreSQL à Paris announces the creation of the group!
The code of El-Get has been pretty stable for a long time now. The whole set of patches against the 4.x series has been about fixing corner cases (sometimes cases that had nothing cornery about them) and providing more and more recipes. That's what you expect from stable software, and that's what allows us to move to a rolling release model.
As announced in an email on the pgsql-fr-generale mailing list, I have the pleasure of starting a PostgreSQL meetup group in Paris! Practical details are available on PostgreSQL User Group Paris on Meetup, including the address and how to register. Registering matters because we plan to welcome sponsors whose investment will go toward feeding and quenching the thirst of our merry group!
A long time ago we talked about how to Import fixed width data with pgloader, following up on stories from the Postgres OnLine Journal and David Fetter's blog. Back then, I showed that using pgloader made it easier to import the data, but the timings also showed quite poor performance characteristics. Let's update that article with current results.
As presented at the PostgreSQL Conference Europe, the new version of pgloader is now able to fully migrate a MySQL database, including discovering the schema, casting data types, transforming data and default values. Sakila is the traditional MySQL example database; in this article we're going to fully migrate it over to PostgreSQL.
Last week I had the pleasure to present two talks at the awesome PostgreSQL Conference Europe. The first one was actually a tutorial about Writing & using Postgres Extensions, where we spent 3 hours on what PostgreSQL Extensions are, what you can expect from them, and how to develop a new one. Then I also had the opportunity to present the new version of pgloader in a talk about Migrating from MySQL to PostgreSQL.
In our Tour of Extensions, today's article is about advanced tag indexing. We have a great data collection to play with, and our goal today is to be able to quickly find data matching a complex set of tags. So let's find those lastfm tracks that are tagged as both blues and rhythm and blues, for instance.
PostgreSQL is an all-round impressive Relational Database Management System which implements the SQL standard (see the very useful reference page Comparison of different SQL implementations for details). PostgreSQL also provides unique solutions in the database market and has been leading innovation for some years now. Still, there's no support for Autonomous Transactions within the server itself. Let's have a look at how to easily implement them with PL/Proxy.
Let's get back to our Tour of Extensions, which had to be set aside for a while for other concerns such as last chance PostgreSQL data recovery. Now that we have a data loading tool up to the task (read about it in the Loading Geolocation Data article), we're going to be able to play with the awesome ip4r extension from RhodiumToad.
Last Friday I had the chance to be speaking at the Open World Forum in the NewSQL track, where we had lots of interest and excitement around the NoSQL offerings. Of course, my talk was about explaining how PostgreSQL is Web Scale with some historical background and technical examples about what this database engine is currently capable of.
In our previous article about Loading Geolocation Data, we loaded some data into PostgreSQL and saw the quite noticeable impact of a user transformation. As it happens, the function that did the integer to IP representation was so naive as to scratch the micro-optimisation itch of some Common Lisp hackers: thanks a lot guys, in particular stassats who came up with the solution we're seeing now.
The following story is only interesting to read if you like it when bad things happen, or if you don't have a trustworthy backup policy in place. By trustworthy I mean that each backup you take must be tested with a test recovery job. Only tested backups will prove useful when you need them. So go read our Backup and Restore documentation chapter, then learn how to set up Barman for handling physical backups and Point In Time Recovery. Come back when you have proper backups, including recovery testing, in place. We are waiting for you. Back? Ok, let's see how bad you can end up without backups, and how to still recover. With luck.
In our ongoing Tour of Extensions we played with earth distance in How far is the nearest pub? then with hstore in a series about triggers, first to generalize Trigger Parameters, then to enable Auditing Changes with Hstore. Today we are going to work with pg_trgm, which is the trigrams PostgreSQL extension: its usage got seriously enhanced in recent PostgreSQL releases and it's now a poor man's Full Text Search engine.
In a previous article about Trigger Parameters we have been using the extension hstore in order to compute some extra field in our records, where the fields used both for the computation and for storing the results were passed in as dynamic parameters. Today we're going to see another trigger use case for hstore: we are going to record changes made to our tuples.
Sometimes you want to compute values automatically at INSERT time, like deriving a duration column from a start and an end column, both timestamptz. It's easy enough to do with a BEFORE trigger on your table. What's more complex is to come up with a parametrized spelling of the trigger, where you can attach the same stored procedure to any table even when the column names are different from one another.
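As a sketch of the simple, non-parametrized case (the table and column names here are made up for the example), such a trigger could look like this:

```sql
-- Hypothetical table with a duration to maintain automatically.
create table meeting
(
  id       serial primary key,
  starts   timestamptz,
  ends     timestamptz,
  duration interval
);

-- Compute the duration from the two timestamptz columns.
create function compute_duration()
  returns trigger
  language plpgsql
as $$
begin
  new.duration := new.ends - new.starts;
  return new;
end;
$$;

create trigger compute_duration
  before insert on meeting
  for each row
  execute procedure compute_duration();
```

The parametrized version is the interesting part, of course, since this function only works when the columns are named starts, ends and duration.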
There was SQL before window functions and SQL after window functions: that's how powerful this tool is. Being that much of a game changer unfortunately means that the feature can be quite hard to grasp. This article aims at making it crystal clear so that you can begin using it today, and are able to reason about it and recognize the cases where you want to be using window functions.
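As a minimal sketch of what the feature looks like (the employees table here is hypothetical), a window function computes over a set of related rows without collapsing them into one:

```sql
-- Rank each employee within their department by salary,
-- keeping every row visible in the output.
select name, department, salary,
       rank() over (partition by department
                    order by salary desc) as dept_rank
  from employees;
```

Contrast that with GROUP BY, which would only give you one row per department.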
About the only time I will accept to work with MySQL is when you need help migrating away from it because you decided to move to PostgreSQL instead. And it's already been too much of a pain, really, so after all this time I began consolidating what I know about the topic and am writing a piece of software to help me here. Consider it the MySQL Migration Toolkit.
In our recent article about The Most Popular Pub Names we had a look at how to find the pubs nearby, but didn't compute the distance between those pubs and us. That's because computing a distance given positions on Earth expressed as longitude and latitude is not that easy. Today, we are going to solve that problem nonetheless, thanks to PostgreSQL Extensions.
Those days feel really lucky to me. I'm currently visiting friends and customers in San Francisco, and really enjoying my trip here! Of course Josh Berkus took the opportunity to organise a SFPUG meeting and I had the pleasure of being the speaker over there.
Last week the CHAR(13) conference was held in a great venue in the UK countryside. Not only did we discover the UK under good weather conditions and some local beers, we also shared a lot of good ideas!
In a recent article here we've been talking about how to do Batch Updates in a very efficient way, using the Writable CTE features available in PostgreSQL 9.1. I sometimes read how Common Table Expressions changed the life of fellow DBAs and developers, and would say that Writable CTEs are at least the same boost again.
In a recent article, Craig Kerstiens from Heroku demoed the really useful crosstab function. It allows you to pivot a table so that you can see the data from different categories in separate columns in the same row rather than in separate rows. The article from Craig is Pivoting in Postgres.
Recently I've been to some more conferences and didn't take the time to blog about them, even though I really did have great fun over there. So I felt I should take some time and report about my experience at those conferences. And of course, some more is on the way, as the PostgreSQL Conference Tour gets busier each year it seems.
Tonight I had the pleasure to present a talk at the Dublin PostgreSQL User Group using remote technologies. The talk is about how to make the most out of PostgreSQL when using SQL as a developer, and tries to convince you to dive into mastering SQL by showing how to solve an application example all in SQL, using window functions and common table expressions.
Last week came with two bank holidays in a row, and I took the opportunity to design a command language for pgloader. While doing that, I unexpectedly stumbled across a very nice AHA! moment, and I now want to share it with you, dear reader.
In this article, we want to find the town with the greatest number of inhabitants near a given location.
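As a sketch of the query we're after, assuming a hypothetical towns table with a point column named location, and taking Paris as the given position:

```sql
-- Most populated town within a given radius of a point.
-- The 0.5 radius is in coordinate units, for illustration only.
select name, population
  from towns
 where (location <-> point(2.35, 48.85)) < 0.5
 order by population desc
 limit 1;
```

The real article refines how to express "near", of course, which is where the fun starts.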
Yes, it did happen, for real, in London: the Emacs Conference. It was Easter weekend, yet the conference managed to have more than 60 people meet together and spend a full day talking about Emacs. If you weren't there, a live stream was available, and soon enough (wait for about two weeks) the video material will be published, as Sacha is working on it.
Yesterday was the fifth edition of the conference organized by Dalibo, where outside speakers are regularly invited. Yesterday's theme was both clear and very broad: performance.
In the previous article here we talked about how to properly update more than one row at a time, under the title Batch Update. We considered performance, including network round trips, and looked at the behavior of our results when used concurrently.
Performance consulting involves some tricks that you have to teach over and over again. One of them is that SQL tends to be so much better at dealing with plenty of rows in a single statement than at running as many statements, each one against a single row.
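As a sketch of the idea (the table and column names are made up), a single statement can update many rows at once by joining against a VALUES list, instead of issuing one UPDATE per row:

```sql
-- One round trip, one statement, many rows updated.
update users
   set email = v.email
  from (values (1, 'one@example.com'),
               (2, 'two@example.com'))
       as v(id, email)
 where users.id = v.id;
```

One round trip to the server instead of hundreds makes a dramatic difference in practice.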
In the article from yesterday we talked about PostgreSQL HyperLogLog with some details. The real magic of that extension has been skimmed over though, and needs another very small article all by itself, in case you missed it.
If you've been following the newer statistics developments along at home, you might have heard about a new state-of-the-art cardinality estimation algorithm called HyperLogLog. That technique is now available for PostgreSQL in the extension at https://github.com/aggregateknowledge/postgresql-hll.
I recently discovered another Emacs facility that I have since used several times a day, and I wonder how I did without it before: the command mark-sexp.
Skytools-related news today; it's been a while. For those who were at FOSDEM's talk about Implementing High Availability, you might have heard that I really like working with PGQ. A new version was released a while ago, and the most recent version is now 3.1.3, as announced in the Skytools 3.1.3 email.
This year's FOSDEM has been a great edition; in particular the FOSDEM PGDAY 2013 was a great way to begin a 3-day marathon of talking about PostgreSQL with people not only from our community but also from plenty of other Open Source communities: users!
pgloader is a tool to help load data into PostgreSQL, adding some error management to the COPY command. COPY is the fast way of loading data into PostgreSQL and is transaction safe: that means that if a single error appears within your bulk of data, you will have loaded none of it. pgloader will submit the data again in smaller chunks until it's able to isolate the bad from the good, and then the good is loaded in.
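The retry logic can be sketched like this, in Python rather than pgloader's actual implementation; here insert stands for any routine that loads a batch atomically, or raises without loading anything when the batch contains a bad row:

```python
def load(rows, insert):
    """Load rows with insert(batch); on error, split the batch in two
    and retry each half, until bad rows are isolated from good ones."""
    try:
        insert(rows)        # the whole batch went in fine
        return []           # no rejected rows
    except Exception:
        if len(rows) == 1:
            return rows     # a single failing row is a bad row
        mid = len(rows) // 2
        return load(rows[:mid], insert) + load(rows[mid:], insert)
```

Good rows end up loaded and the bad ones are returned for inspection, which is roughly the service pgloader provides on top of COPY.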
Another day, another migration from MySQL to PostgreSQL... or at least that's how it feels sometimes. This time again I've been using some quite old scripts to help me do the migration.
In a recent article titled Inline Extensions we detailed the problem of how to distribute an extension's package to a remote server without having access to its file system at all. The solution to that problem is non-trivial, let's say. But thanks to the awesome PostgreSQL Community we finally have some practical ideas on how to address the problem, as discussed on pgsql-hackers, our development mailing list.
We've been having the CREATE EXTENSION feature in PostgreSQL for a couple of releases now, so let's talk about how to go from here. The first goal of the extension facility has been to allow for a clean restore process of contrib modules. As such it's been tailored to the needs of deploying files on the file system, because there's no escaping from that when you have to install executable files, those infamous shared libraries.
I've been asked about how to integrate the ack tool (you know, the one that is better than grep) into Emacs today. Again. And I just realized that I didn't blog about my solution. That might explain why I keep getting asked about it after all...
Thanks to Mickael on twitter I ran into that article about implementing a very basic Hello World! program as a way to get into a new concurrent language or facility. The original article, titled Concurrent Hello World in Go, Erlang and C++ is all about getting to know The Go Programming Language better.
As Guillaume says, we've been enjoying a great evening conference in Lyon 2 days ago, presenting PostgreSQL to developers. He did the first hour presenting the project and the main things you want to know to start using PostgreSQL in production, then I took the opportunity to be talking to developers to show off some SQL.
Last week was PostgreSQL Conference Europe 2012 in Prague, and it's been awesome. Many thanks to the organisers, who managed to host a very smooth conference for 290 attendees, including speakers. That means you kept walking into interesting people to talk to, and in particular one track has been a giant success.
It's been a long time since I last had some time to spend on the prefix PostgreSQL extension and its prefix_range data type. With PostgreSQL 9.2 out, some users wanted me to update the extension for that release, and hinted that it was high time I fixed that old bug for which I already had a patch.
I've been given a nice puzzle that I think is a good blog article opportunity, as it involves some thinking and window functions.
PostgreSQL 9.2 is released! It's an awesome new release that I urge you to consider trying and adopting: an upgrade from even 9.1 should be very well worth it, as your hardware could suddenly be able to process a much higher load. Indeed, better performance means more work done on the same budget, and that's the name of the game!
Please welcome the new stable version of El-Get: the much-awaited 4.1 has now been branched for your pleasure. It's packed with lots of features to make your life easy, comes with an Info documentation book, and even has a logo. That's no joke, I found one, at least:
I stumbled onto an interesting article about performance when using Python, called Python performance the easy(ish) way, where the author tries to get the best available performance out of the dumbest possible Python code, trying to solve a very simple and stupid problem.
The french PostgreSQL Conference, pgday.fr, was yesterday in Lyon. We had a very good time and a great schedule with a single track packed with 7 talks, addressing a diverse set of PostgreSQL related topics, from GIS to fuzzy logic, including replication.
A friend of mine just asked me for advice to tweak some Emacs features, and I think that's really typical of using Emacs: rather than getting used to the way things are shipped to you, when using Emacs, you start wanting to adapt the tools to the way you want things to be working instead. And you can call that the awesome!
Last week was the annual PostgreSQL hackers gathering in Canada, thanks to the awesome pgcon conference. This year's edition has been packed with good things, beginning with the Cluster Summit, followed the next day by the Developer Meeting, itself followed (yes, on the same day) by the In-Core Replication Meeting. That was a packed schedule!
Now that you're all using the wonders of Cooperative Consumers to help you efficiently and reliably implement your business constraints and offload them from the main user transactions, you're reaching a point where you have to clean up your development environment (because that's what happens to development environments, right?), and you want a way to start again from a clean empty place.
While working on a new PostgreSQL architecture for a high-scale project that used to be in the top 10 of popular internet web sites (in terms of visitors), I needed to be able to offload some processing from the main path: that's called a batch job. This needs to be transactional: don't run the job if we did rollback the transaction, process all events that were part of the same transaction in the same transaction, etc.
PostgreSQL 9.1 includes proper extension support, as you might well know if you ever read this very blog. Some hosting facilities are playing with PostgreSQL at big scale (hello Heroku!) and still run into small caveats that make their life uneasy.
Last night I had the chance to take part in the Battle Language at La Marmite, where I had offered to talk about Emacs Lisp, a proposal which turned into flag-bearing for the whole Lisp family. I happily used some content from Lisperati in my presentation, and I recommend a detour through that site!
It seems that if you search for a munin plugin for pgbouncer, it's easy enough to reach an old page of mine with an old version of my plugin, and a broken link. Let's remedy that by publishing here the newer version of the plugin. To be honest, I thought it had already made its way into the official 1.4 set of plugins, but I've not been following closely enough.
The European conference in Amsterdam was a very good community event, with impeccable organization in a welcoming hotel. I had the pleasure of speaking there about extensions and their use in "in-house" application development, under the title Extensions are good for business logic.
Another great conference took place last week: PostgreSQL Conference Europe 2011 was in Amsterdam, and plenty of us PostgreSQL geeks were too. I attended a lot of talks and learned some more about our project, its community and its features, but more than that it was a perfect occasion to meet with the community.
I've been asked about my opinion on backup strategy and best practices, and it so happens that I have some kind of an opinion on the matter.
The annual European PostgreSQL conference takes place next week in Amsterdam, and I hope you already have your tickets, because this edition promises to be a very good vintage!
In the news recently, stored procedures were used as an excuse for moving logic away from the database layer to the application layer, and for migrating away from a powerful technology to a simpler one, now that there's no logic anymore in the database.
The next PostgreSQL conference is approaching very fast now; I hope you have your ticket already: it's a very promising event! If you want some help deciding whether to register or not, just have another look at the schedule. Pick the talks you want to see. It's hard, given how packed with good ones the schedule is. When your mind is all set, review the list. Registered?
In less than a month, the European PostgreSQL conference, pgconf.eu, takes place. It's four days dedicated to your favorite DBMS, where you can meet the European community, made up of users, companies of all sizes, developers, and participants of all kinds.
The el-get project releases its new stable version, 3.1. This new release fixes bugs, adds a host of new recipes (we have 420 of them and counting), and brings some nice new features too. You really want to upgrade.
PostgreSQL 9.1 has shipped! You don't have this new version in production yet? Haven't yet evaluated why you should consider migrating to it? There are many good reasons to move to this release, and few pitfalls.
Last time we talked about the string escaping rules in PostgreSQL, and mentioned that using those techniques to protect data inserted into SQL queries is not a good idea, given that PostgreSQL offers a feature much better suited to the task.
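The better-suited feature alluded to here is query parameters, which keep the data out of the SQL text entirely; as a minimal server-side sketch (the users table is hypothetical):

```sql
-- The parameter value is never spliced into the SQL string,
-- so no escaping is needed at all.
prepare find_user(text) as
  select id, name
    from users
   where name = $1;

execute find_user('O''Reilly');
```

Client drivers expose the same mechanism directly, so you pass values alongside the query instead of escaping them into it.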
On the PostgreSQL Hackers mailing list, Andrew Dunstan just proposed some new options for pg_restore to ease our lives. One of the answers mentioned some scripts available to exploit the listing that you play with using the -L option. The pg_staging tool allows you to easily exploit those lists too.
You can find skytools3 in debian experimental already; it's in release candidate status. What's missing is the documentation, so here's an idea: I'm going to make a blog post series about the next Skytools features, how to use them, what they are good for, etc. This first article of the series will just list those new features.
From the first days of el-get it was quite clear to me that we would reach a point where users would want a nice listing including descriptions of the packages, and a major mode allowing you to select packages to install, remove and update. It was also quite clear that I was not much interested in doing it myself, even if I would appreciate having it done.
When using Emacs we get to manage a larger and larger setup file (such as ~/.emacs.d/init.el), sometimes with lots of dependencies, and some sub-files thanks to the load function.
Back to our series about pgloader. The previous articles detailed How To Use PgLoader then How to Setup pgloader, then what to expect from a parallel pgloader setup. This article will detail how to reformat input columns so that what PostgreSQL sees is not what's in the data file, but the result of a transformation from this data into something acceptable as an input for the target data type.
I've just published a post in English titled How to Setup pgloader, which complements the more complete pgloader tutorial currently being written. Once again, I haven't taken the time to translate this article into French before finding out whether that interests you, dear readers. If it does, just let me know by mail (or courriel, after all) so that I can add it.
Emacs comes with a pretty good implementation of a terminal emulator, term. Well, not that good actually, but given what I use it for, it's just what I need. Particularly if you add to that my cssh tool, so that ssh to a remote host is just a C-= away, which runs the command cssh-term-remote-open and completes on the host name.
When you partition your tables monthly, then comes the question of when to create the next partitions. I tend to create them the week before next month begins, and I have some nice nagios scripts to alert me in case I've forgotten to do so. How do you check that by hand at the end of a month?
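A sketch of such a hand check, assuming monthly partitions named with a hypothetical foo_YYYYMM convention:

```sql
-- Does next month's partition already exist?
-- 1 means yes, 0 means you still have work to do.
select count(*)
  from pg_class
 where relname = 'foo_'
               || to_char(now() + interval '1 month', 'YYYYMM');
```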
It's a question that comes up regularly, and one I thought I had answered satisfactorily with the pgloader examples. That document looks a bit like a tutorial, in English, and I detailed it in the article how to use pgloader on this very site, in English. If there's enough demand, I'll translate it into French.
I stumbled upon the following cheat sheet for Emacs yesterday, and it's worth sharing. I already learnt or discovered again some nice default chords: for example, C-x C-o runs the command delete-blank-lines and C-M-o runs the command split-line. I guess I will use the latter one a lot.
What better way to spend the train ride home from CHAR(11) than to turn reporter for the occasion? In truth, sleeping would also be an idea, given how late the evenings ran!
CHAR(11) ended sometime in the night leading to today, if you consider the social events to be part of it, which I definitely do. This conference has been a very good one, both on the organisation side of things and of course for its content.
We still have this problem to solve with extensions and their packaging: how to best organize things so that your extension is compatible with both pre-9.1 and following releases of PostgreSQL.
In a blog article, you're shown a quite long function that loops through your buffers to find out if any of them is associated with a file belonging to one of your "projects". Well, you should not be afraid of using something simpler.
Current el-get status is stable, ready for daily use and packed with extra features that make life easier. There are some more things we could do, as always, but they will be about smoothing things further.
It's this time of the year again: the main international PostgreSQL Conference is next week in Ottawa, Canada. If previous years are any indication, this will be a great event at which to meet a lot of the members of your community. The core team will be there, developers will be there, and we will meet with users and their challenging use cases.
Let's say you need to ALTER TABLE foo ALTER COLUMN bar TYPE bigint;, and PostgreSQL is helpfully telling you that no, you can't, because such and such views depend on the column. The basic way to deal with that is to copy from the error message the names of the views involved, then prepare a script wherein you first DROP VIEW ...;, then ALTER TABLE, and finally CREATE VIEW again, all in the same transaction.
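In sketch form, with a hypothetical view v_foo depending on the column:

```sql
-- Drop, alter and recreate in one transaction, so concurrent
-- sessions never see the view missing.
begin;
drop view v_foo;
alter table foo alter column bar type bigint;
create view v_foo as
  select id, bar from foo;
commit;
```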
While currently too busy at work to deliver many Open Source contributions, let's debunk an old habit of PostgreSQL extension authors. It all comes down to copy pasting from contrib, and there's no reason to keep doing it that way.
I've been working on skytools3 packaging lately. I've been pushing quite a lot of work into it, in order to have exactly what I needed out of the box, after some 3 years of production and experience with the products. Plural, yes, because even if plproxy is a sibling to the project (same developer team, separate life cycle and releases), Skytools itself includes several sub-projects.
If you don't remember what pg_staging is all about, it's a central console from which to control all your PostgreSQL databases. Typically you use it to manage your development and pre-production setup, where developers pretty often ask you to install some newer dump from production, and you want that operation streamlined and easy.
If you've not been following closely you might have missed out on extensions integration. Well, Tom spent some time on the patches I've been preparing for the last 4 months. And not only did he commit most of the work but he also enhanced some parts of the code (better factoring) and basically finished it.
This year we were in the main building of the conference, and apparently the booth went very well, selling lots of PostgreSQL merchandise etc. I had the pleasure to once again meet with the community, but being there only 1 day I didn't spend as much time as I would have liked with some of the people there.
A quick blog entry to say that yes:
It so happens that a colleague of mine wanted to start using Emacs but couldn't get to it. He insists on having proper color themes in all applications and some sensible defaults full of nifty add-ons everywhere, and didn't want to have to learn that much about Emacs and Emacs Lisp to get started. I'm not even sure that he will Take the Emacs tour.
Yes, you read it well: el-get keeps gaining recipes, and is now at its 1.1 release. The reason for this release is mainly that I have two big chunks of code to review and the current code has been very stable for a while. It seems better to release the stable code that exists now before shaking it up this much. If you're wondering when to jump in the water and switch to using el-get, now is a pretty good time.
You certainly know that implementing dynamic triggers in PostgreSQL is impossible. But I had a very bad night, being up from as soon as 3:30 am today, so that when a developer asked me about reusing the same trigger function code from more than one table and for a dynamic column name, I didn't remember it being impossible.
The PostgreSQL IRC channel is a good place to be, for all the very good help you can get there, because people always want to remain helpful, for the off-topic discussions sometimes, or to get to talk with community core members. And to start up your day too.
Yeah I'm back on working on my part of the extension thing in PostgreSQL.
It's been a week since the last commits in the el-get repository, and those were all about fixing and adding recipes, and about notifications. No core plumbing, you see. Also, 0.9 was released on 2010-08-24 and felt pretty complete already, then received lots of improvements. It's high time to cross the line and call it 1.0.
Nowadays, most people would think that email is something simple: you just setup your preferred client (that's called a MUA) with some information such as the smtp host you want it to talk to (that's called an MTA, and this one is your relayhost). Then there's all the receiving-mail part, and that's again on the server side. Then there's how to get those mails, read them, flag them, manage them, and that's better served by IMAP. Let's talk about sending mails over smtp for this entry.
I wanted to play with the idea of using the whole keyboard for my switch-window utility, but wondered how to get those keys in the right order and all. I finally found quail-keyboard-layout, which seems to exist for such uses, as you can see:
The drawback of hosting a static-only website is, obviously, the lack of comments. What actually happens, though, is that I receive very few comments by direct mail. As I don't get another spam source to clean up, I'm left unconvinced that it's such a drawback. I still miss the low probability of seeing blog readers exchange directly, but I think a mailing list would be my answer here...
When 8.4 came out there were all those comments about how getting window functions was an awesome addition. Now, it seems that a lot of people seeking help in #postgresql just don't know what kind of problem this feature helps solve. I've already been using them in some cases here on this blog, for getting some nice overview about Partitioning: relation size per “group”.
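As a sketch of that kind of overview, a window function can attach a per-group total to every row without grouping the rows away (relkind 'r' selects ordinary tables):

```sql
-- Each table's size alongside the total size of its schema.
select relname,
       pg_relation_size(oid) as bytes,
       sum(pg_relation_size(oid))
         over (partition by relnamespace) as namespace_bytes
  from pg_class
 where relkind = 'r';
```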
Although the new asynchronous replication facility that ships with 9.0 hasn't been released to the wide public yet, our hacker heroes are already working on the synchronous version of it. A part of the facility is rather easy to design: we want something comparable to DRBD flexibility, but specific to our database world. So synchronous could mean several different things, depending on what you need the standby to have already done when the master returns from COMMIT.
Yes, that's another el-get related entry. It seems to take a lot of my attention these days. After having set up the git repository so that you can el-get from within itself (so that it's self-contained), the next logical step suggested itself.
After discovering the excellent Gwene service, which allows you to subscribe to newsgroups to read RSS content (commits, etc.), I came to read this nice article about Happy Numbers. That's a little problem that fits an interview-style question well, so I first solved it yesterday in Emacs Lisp, as that's the language I use the most these days.
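For reference, the problem itself is tiny; here's a sketch in Python rather than the Emacs Lisp I used:

```python
def is_happy(n):
    """A number is happy when repeatedly replacing it by the sum of
    the squares of its digits eventually reaches 1 (and unhappy when
    that process falls into a cycle instead)."""
    seen = set()
    while n != 1 and n not in seen:
        seen.add(n)                              # remember, to detect cycles
        n = sum(int(d) ** 2 for d in str(n))     # sum of squared digits
    return n == 1
```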
A very good remark from some users: installing and managing el-get should be simpler. They wanted both an easy install of the thing, and a way to manage it afterwards (like updating the local copy against the authoritative source). So I decided it was high time to get the code out of my ~/.emacs.d git repository and up to a public place:
The idea of the day ain't directly from me, I'm just helping with a very
thin subpart of the problem. The problem itself I can't say much about; let's
just assume you want to reduce the storage of
MD5 in your database, so you want
bit strings. Using them works fine, but the datatype is
still missing some facilities, for example going from and to hexadecimal
representation in text.
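As a minimal sketch of what going from and to hexadecimal looks like in PostgreSQL (the digest value is just an example, and whether you pick bit strings or bytea is an assumption here):

```sql
-- a hexadecimal literal converts straight to a bit string
select x'ff'::bit(8);                               -- 11111111

-- storing an MD5 as 16 bytes of bytea instead of 32 chars of text
select decode(md5('hello'), 'hex');                 -- text hex to bytea
select encode(decode(md5('hello'), 'hex'), 'hex');  -- and back to hex text
```

The bytea route halves the storage compared to keeping the hex text, which is the kind of reduction the entry is after.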
I've been receiving some requests for
el-get, some of them even included a
patch. So now there's support for
http-tar, augmenting the
existing support for
Thanks to you readers of Planet Emacsen taking the time to try those pieces of emacs lisp found in my blog, and also the time to comment on them, some bugs have been fixed, and new releases appeared.
We're using constants in some constraints here, for example in cases where several servers are replicating to the same federating one: each origin server has its own schema, and all is replicated nicely on the central host, thanks to Londiste, as you might have guessed already.
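A minimal sketch of that technique (the schema, table, and server id are all hypothetical, not the original setup): each origin schema pins its own identifier with a constant inside a CHECK constraint.

```sql
-- origin server number 1 owns schema s1; every row it produces
-- must carry that server's id, enforced by a constant
create schema s1;
create table s1.events
(
  id        serial primary key,
  origin_id integer not null check (origin_id = 1),
  payload   text
);
```

The same table definition is repeated per origin schema, each with its own constant, so the federating host can always tell rows apart.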
In trying to help an extension debian packaging effort, I've once again proposed to handle it. That's because I now begin to know how to do it, as you can see in my package overview page at debian QA facility. There's a reason why I proposed myself here, it's that yet another tool of mine is now to be found in debian, and should greatly help extension packaging there. You can already check for the postgresql-server-dev-all package page if you're that impatient!
Some user on
IRC was reading the release notes in order to plan for a minor
upgrade of his
8.3.3 installation, and was puzzled about potential needs for
GIST indexes. That's from the
8.3.5 release notes, and from the
8.3.8 notes you see that you need to consider
hash indexes on
columns too. Now the question is, how to find out if any such beasts are in
use in your database?
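One way to hunt for those beasts is to join the catalogs and filter on the index access method; here's a sketch (not necessarily the query from the original article):

```sql
-- list every gist and hash index, with its schema and table
select n.nspname as schema_name,
       i.relname as index_name,
       t.relname as table_name,
       a.amname  as method
  from pg_index x
  join pg_class i on i.oid = x.indexrelid
  join pg_class t on t.oid = x.indrelid
  join pg_namespace n on n.oid = i.relnamespace
  join pg_am a on a.oid = i.relam
 where a.amname in ('gist', 'hash');
```

An empty result means the release-note caveats don't apply to your database.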
I've been using emacs for a long time, and a long time it took me to consider learning Emacs Lisp. Before that, I didn't trust my level of understanding enough to be comfortable in managing my setup efficiently.
Today I'm being told once again about SQLite as an embedded database software. That one ain't a database server but a software library that you can use straight in your main program. I've yet to use it, but it looks like its SQL support is good enough for simple things, and that covers loads of things. I guess read-only cache and configuration storage would be the obvious ones, because it seems that SQLite's use cases don't include mixed concurrency, that is workloads with concurrent readers and writers.
This time, we are trying to figure out where the bulk of the data on
disk is. The trick is that we're using
DDL partitioning, but we want a “nice”
view of size per
partition set. Meaning that if you have for example a
foo with partitions
foo_201007, you would want
to see a single category
foo containing the accumulated size of all the
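A sketch of such a query, assuming partitions are named after their parent table with a numeric suffix (that naming pattern is an assumption, not necessarily the one from the article):

```sql
-- group partitions like foo_201007 under a single "foo" category
select regexp_replace(relname, '_[0-9]+$', '') as partition_set,
       pg_size_pretty(sum(pg_relation_size(oid))::bigint) as total_size
  from pg_class
 where relkind = 'r'
 group by partition_set
 order by sum(pg_relation_size(oid)) desc;
```

Stripping the suffix with regexp_replace is what turns a forest of partitions into one line per logical table.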
Thanks to amazing readers of
planet emacsen, two annoyances of
switch-window.el have already been fixed! The first is that handling of
isn't exactly an option after all, and the other is that you want to avoid
the buffer creation in the simple cases (1 or 2 windows only), because it's
the usual case.
So it's Sunday and I'm thinking I'll get into
el-get sometime later. Now is
the time to present
dim-switch-window.el which implements a
C-x o. I
know of only one way to present a
visual effect, and that's with a screenshot:
There's currently a thread on hackers about bg worker: overview and a series of 6 patches. Thanks a lot Markus! This is all about generalizing a concept already in use in the autovacuum process, where you have an independent subsystem that requires an autonomous daemon running and able to start its own workers.
Nowadays, to analyze logs and provide insights, the most common tool to use
is pgfouine, which does an excellent job. But there have been some
improvements in logging capabilities that we're not benefiting from yet, and
I'm thinking about the
CSV log format.
There's a big trend nowadays about using column storage as opposed to what PostgreSQL is doing, which would be row storage. The difference is that if you have the same column value in a lot of rows, you could get to a point where you have this value only once in the underlying storage file. That means high compression. Then you tweak the executor to be able to load this value only once, not once per row, and you save another huge source of data traffic (often enough, from disk).
It surely does not feel like a full month and some more went by since we were enjoying PGCon 2010, but in fact it was already time for CHAR(10). The venue was most excellent, as Oxford is a very beautiful city. Also, the college was like a city within the city, and having the accommodation all in there really smoothed it all.
Yes. This pgloader project is still maintained and somewhat active. Development happens when I receive a complaint, either about a bug in existing code or a feature in yet-to-write code. If you have a bug to report, just send me an email!
This time we're having a database where sequences were used, but not systematically as a default value of a given column. It's mainly an historic bad idea, but you know the usual excuse with bad ideas and bad code: the first 6 months it's experimental, after that it's historic.
Now you know what piece of software is used to publish this blog. I really
like it, the major mode makes it a great experience to be using this tool,
and the fact that you can
rsync it all from within Emacs
(C-c C-p then
C-c C-r, with some easy
elisp code) is a big advantage as far
as I'm concerned. No need to resort to
So, if you followed the previous blog entry, now you have a new database
containing all the
static tables encoded in
UTF-8 rather than
SQL_ASCII. Because if it was not yet the case, you now severely distrust
It happens that you have to manage databases
designed by your predecessor,
and it even happens that historical
raisins can lead to having a
SQL_ASCII database. The horror!
So, after restoring a production dump with intermediate filtering, none of our sequences were set to the right value. I could have tried to review the process of filtering the dump here, but it's a one-shot action and you know what that sometimes means. With some pressure you don't script enough of it and you just crawl more and more.
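The per-sequence fix is the classic setval-from-max dance; a sketch with hypothetical table, column, and sequence names:

```sql
-- realign the sequence with the data that actually got restored
select setval('foo_id_seq', (select max(id) from foo));

-- a variant that also copes with an empty table:
-- when there's no max, restart the sequence at 1 (is_called = false)
select setval('foo_id_seq',
              coalesce((select max(id) from foo), 1),
              (select max(id) is not null from foo));
```

Repeating that for every sequence is exactly the part worth scripting, pressure or not.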
At pgday there was this form you could fill in to give speakers some feedback about their talks. And that's a really nice way, as a speaker, to know what to improve. And as Magnus was searching for a nice-looking charting facility in Python and I spoke about matplotlib, it felt like having to publish something.
So I had two bug reports about
prefix in less than a week. It means several
things, one of them is that my code is getting used in the wild, which is
nice. The other side of the coin is that people do find bugs in there. This
one is about the behavior of the
btree opclass of the type
prefix range. We
cheat a lot there by simply having written one, because a range does not
have a strict ordering: is
[1-3] before or after
[2-4]? But when you know
you have no overlapping intervals in your
prefix_range column, being able to
have it as part of a
primary key is damn useful.
As you might have noticed, this little blog of mine is not compromising much
and is entirely maintained from Emacs. Until today, I had to resort to
upload my publications, though, as I've been too lazy to hack up the tools
integration for simply doing a single
rsync command line. That was one time
So there it is, at long last, the final
1.0.0 release of prefix! It's on its
way into the debian repository (targeting sid, in testing in 10 days) and
So you have a
rolodex-like database in your Emacs, or you have this phone
number in a mail and you want to call it. It happens you have a
VoIP setup and use
Twinkle to make your calls. Maybe you'll then find this
The function didn't allow for using more than one
mailrc file, which isn't a
good idea, so I've just added that. Oh and for
gnus integration, what I need is
(add-hook 'message-mode-hook 'mail-abbrevs-setup), it seems... so that if
I type the alias it'll get automatically expanded. And to be real lazy and
avoid having to type in the entire alias,
mail-abbrev-complete-alias to the
rescue, assigned to some easy-to-reach keys.
So I've been advised to use
~/.mailrc for keeping a basic address book in
Emacs, for use within
gnus for example. I had to resort to the manual to
find out how to use the file aliases when I need them, that is when
composing a mail. For the record, here's what I had to do:
First, here's a way to insert at current position the last message printed
into the minibuffer... well not exactly, into the
*Messages* buffer in fact. I was
tired of doing it myself after invoking, e.g.,
At long last,
here it is. With binary versions both for
postgresql-8.4! Unfortunately my other packaging efforts are still waiting in the
NEW queue, but I hope to soon see
I've been having problems with building both
postgresql-8.4-prefix debian packages from the same source package, and
fixing the packaging issue forced me into modifying the main
Makefile. So while reaching
rc2, I tried to think about missing pieces easy
to add this late in the game: and there's one, that's a function
length(prefix_range), so that you don't have to cast to text anymore in the
following widespread query:
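The query itself got cut from this excerpt; here's a hedged reconstruction of that widespread pattern, typical of longest-prefix routing lookups (table and column names are hypothetical):

```sql
-- before: cast the prefix_range to text just to sort on its length
select * from ranges
 where prefix @> '0146640123'
 order by length(prefix::text) desc
 limit 1;

-- with length(prefix_range), the cast goes away
select * from ranges
 where prefix @> '0146640123'
 order by length(prefix) desc
 limit 1;
```

Sorting matches by descending prefix length is what picks the most specific route.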
I can't really compare PgCon 2009 with previous years' editions; the last time I enjoyed the event was in 2006, in Toronto. But still, I found the experience to be a great one, and I hope I'll be there next year too!
On the performance mailing list, a recent
thread drew my attention. It
veered into being about using connection pooling software and prepared statements
in order to increase the scalability of PostgreSQL when confronted with a lot of
concurrent clients all doing simple
select queries. The advantage of the
pooler is to reduce the number of
backends needed to serve the queries, thus
reducing PostgreSQL internal bookkeeping. Of course, my choice of software
here is clear:
PgBouncer is an excellent top-grade solution: it performs really
well (it won't parse queries), and is reliable and flexible.
The prefix project is about matching a literal against prefixes in your table, the typical example being a telecom routing table. Thanks to the excellent work around generic indexes in PostgreSQL with GiST, indexing prefix matches is easy to support in an external module. Which is what the prefix extension is all about.
The problem was raised this week on
IRC and this time again I felt it would
be a good occasion for a blog entry: how to load an
XML file's content into a
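The entry is cut short here, but one hedged way to do it from psql (the file path and table are hypothetical, and this relies on psql's backtick execution and :'var' interpolation):

```sql
-- slurp the file into a psql variable, then insert it as a single value
\set content `cat /path/to/file.xml`
create table docs (body xml);
insert into docs (body) values (:'content');
```

The quoting done by :'content' is what keeps the XML payload safe as a single literal.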
In this Russian page you'll see a nice presentation of Skype's database architecture by Asko Oja himself. It's the talk from the Russian PostgreSQL Community meeting, October 2008, Moscow, and it's a good read.
As it happens, I've got some environments where I want to make sure
HOT (Heap Only Tuples) is in use, because we're doing so many updates a second
that I want to be sure it's not killing my database server. I not only
wrote some checking view to see about it, but also wrote
about it on the
French PostgreSQL website. Writing it in French
means that I'm now bound to write about it in English too!
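The checking view itself isn't reproduced here, but the statistics collector already exposes what it needs; a sketch of a HOT-ratio query over pg_stat_user_tables:

```sql
-- share of updates that were HOT, per table
select relname, n_tup_upd, n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 2) as hot_pct
  from pg_stat_user_tables
 where n_tup_upd > 0
 order by n_tup_upd desc;
```

A low hot_pct on a heavily updated table is the signal worth investigating, typically by looking at fillfactor and indexed columns.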
So, you're using
londiste and the
ticker has not been running all night
long, due to some restart glitch in your procedures, and the
on-call admin
didn't notice the restart failure. If you blindly restart the replication
daemon, it will load in memory all those events produced during the night,
at once, because you now have only one tick where to put them all.
So finally, blogging software for geeks exists?
This is a test of a fake entry to see how muse will manage this.