<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>tail -f /dev/dim</title>
    <link>http://tapoueh.org/index.html</link>
    <description>Dimitri Fontaine's blog</description>
    <language>en-us</language>
    <generator>Emacs Muse</generator>
<item>
  <title>pgbouncer munin plugin</title>
  <link>http://tapoueh.org/blog/2011/11/16-pgbouncer-munin.html</link>
  <description><![CDATA[<p>It seems that if you search for a <a href="http://munin-monitoring.org/">munin</a> plugin for <a href="http://wiki.postgresql.org/wiki/PgBouncer">pgbouncer</a> it's easy
enough to reach an old page of mine with an old version of my plugin, and a
broken link. Let's remedy that by publishing here the newer version of the
plugin. To be honest, I thought it had already made its way into the official
munin <code>1.4</code> set of plugins, but I've not been following closely enough.</p>

<center>
<p><img src="../../../images/bouncing_elephant.gif" alt=""></p>
</center>

<p>As the plugin is 300 lines of Python code, it's not a good idea to just
inline it here, so please grab it at <a href="../../../resources/pgbouncer_">pgbouncer_</a>.</p>

<p>You might need to know that the script name, once installed, should follow the
form <code>pgbouncer_dbname_stats_requests</code> or <code>pgbouncer_dbname_pools</code>, where of
course <code>dbname</code> can contain any number of <code>_</code> characters. This script supports
quite old versions of <em>pgbouncer</em> that didn't accept the normal <code>pq</code> protocol:
back then you had to use <code>psql</code> to have any chance of getting the data from a
script, and you couldn't just use a PostgreSQL driver such as <a href="http://initd.org/psycopg/">psycopg2</a>.</p>
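
<p>To make the naming rule concrete, here is a minimal Python sketch of how such
a script name could be split back into a database name and a graph type. It is
not taken from the plugin itself: the function name <code>parse_plugin_name</code> is
hypothetical, and it assumes the two suffixes mentioned above are the only ones.</p>

<pre class="src">
def parse_plugin_name(script_name):
    """Split a munin plugin file name into (dbname, graph)."""
    prefix = "pgbouncer_"
    if not script_name.startswith(prefix):
        raise ValueError("not a pgbouncer_ plugin name: %r" % script_name)
    rest = script_name[len(prefix):]
    # Match the known suffixes from the right, so that any "_" left over
    # belongs to the database name.
    for graph in ("stats_requests", "pools"):
        suffix = "_" + graph
        if rest.endswith(suffix) and len(rest) > len(suffix):
            return rest[:-len(suffix)], graph
    raise ValueError("unknown graph in plugin name: %r" % script_name)
</pre>

<p>Matching the suffix from the right is what lets <code>dbname</code> contain underscores:
a link named <code>pgbouncer_my_app_db_pools</code> is read as the database <code>my_app_db</code>
and the graph <code>pools</code>.</p>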
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 16 Nov 2011 14:00:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/11/16-pgbouncer-munin.html</guid>
</item>
<item>
  <title>Extensions in plain SQL</title>
  <link>http://tapoueh.org/blog/2011/10/31-extensions-sql.html</link>
  <description><![CDATA[<p>The <a href="http://2011.pgconf.eu/">European conference in Amsterdam</a> was a very good community
event, flawlessly organised in a welcoming hotel. I had the pleasure of
talking there about extensions and how to use them for “in-house”
application development, under the title
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/138-extensions-are-good-for-business-logic/">Extensions are good for business logic</a>.</p>


<center>
<p><a class="image-link" href="http://wiki.postgresql.org/images/f/f1/Using-extensions.pdf">
<img src="../../../images/using-extensions-10.png"></a></p>
</center>

<p>The idea of my talk, which I suppose most of you missed (I only had a small
handful of French people in the room, and I hope to have readers who were
not in Amsterdam), is to use the mechanisms offered by extensions to
maintain the <code>PL</code> code you run in production.</p>

<p>Most of the time these are procedures that implement part of your
applications' business logic, but sit so close to the data that they end up
directly in the database. That's a good thing, in particular since
<em>PostgreSQL 9.1</em>, a release that offers fairly complete extension
management.</p>

<p>The work consists of <em>packaging</em> your procedures by following the online
documentation and its chapter
<a href="http://docs.postgresqlfr.org/9.1/extend-extensions.html">35.15. Packaging Related Objects into an Extension</a> (linked here in its French translation). Once that is done, you
can deploy your set of stored procedures with the
command <code>CREATE EXTENSION mesprocs;</code>, and the <code>psql</code> command <code>\dx</code> then
lists the installed extensions and their version numbers.</p>

<p>Updates are also handled by a dedicated SQL command,
<code>ALTER EXTENSION mesprocs UPDATE [TO version];</code>. You only need to
provide intermediate scripts named, for example, <code>mesprocs--1.0--1.1.sql</code>
and <code>mesprocs--1.1--1.2.sql</code>, and PostgreSQL will know how to go from <code>1.0</code> to <code>1.2</code>.</p>

<p>There, you now know almost everything from my Amsterdam talk, and you can
find the rest in the slides linked at the top of this article. Of course I
have not reproduced here the interesting questions I was asked, a good part
of which have been added to my Christmas list for
extensions. If you want to be sure to find that under your tree,
though, the best way is still to talk to me about it: sponsoring
Open Source development is a fine thing to do :)</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 31 Oct 2011 14:22:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/10/31-extensions-sql.html</guid>
</item>
<item>
  <title>Back From Amsterdam</title>
  <link>http://tapoueh.org/blog/2011/10/26-back-from-amsterdam.html</link>
  <description><![CDATA[<p>Another great conference took place last week,
<a href="http://2011.pgconf.eu/">PostgreSQL Conference Europe 2011</a> was in Amsterdam and plenty of us
PostgreSQL geeks were too. I attended a lot of talks and did learn some
more about our project, its community and its features, but more than that
it was a perfect occasion to meet with the community.</p>

<center>
<p><img src="../../../images/ams-conf-room.jpg" alt=""></p>
</center>

<p><a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/speaker/2-dave-page/">Dave Page</a> talked about <code>SQL/MED</code> under the title
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/146-postgresql-at-the-center-of-your-dataverse/">PostgreSQL at the center of your dataverse</a> and detailed what to expect from
a <em>Foreign Data Wrapper</em> in PostgreSQL 9.1, then how to write your own.
Wherever you currently manage your data, you can easily enough have
PostgreSQL fetch it on demand to answer your queries. That means real-time
data federation: you don't copy data around, you access it remotely when
executing the query.</p>

<p>I might need to come up with new <em>Foreign Data Wrappers</em> in the not too distant
future, now that I have a better grasp of how much work it really takes. It
appears to be a good migration strategy too:</p>

<pre class="src">
  INSERT INTO real.table
       SELECT * FROM foreign.table;
</pre>

<p>Another discovery is that apparently <a href="http://code.google.com/p/plv8js/wiki/PLV8">PLv8</a> is ready for public consumption.
Using it can lead to <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/174-heralding-the-death-of-nosql/">Heralding the Death of NoSQL</a>, so use it with care.</p>

<p>In the presentation of <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/156-synchronous-replication-and-durability-tuning/">Synchronous Replication and Durability Tuning</a> we
mainly saw that mixing <em>synchronous</em> and <em>asynchronous</em> transactions in your
application is the key to real performance across the ocean, as the speed
of light is not infinite. From Baltimore to Amsterdam the latency cannot
be better than <code>100ms</code>, and that's not the same as <em>instant</em> nowadays.</p>

<p>Then again, depending on the number of concurrent queries to sync over the
ocean link, the experimental setup was able to achieve several thousand
queries per second, which validates the model we picked for <em>sync rep</em> and
its implementation.</p>
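
<p>A back-of-the-envelope Python sketch shows why concurrency matters so much
here. It assumes, as a simplification of mine rather than a result from the
talk, that each synchronous commit waits for one full <code>100ms</code> round trip
and that sessions wait in parallel without limiting each other; the function
name is hypothetical.</p>

<pre class="src">
def max_sync_commits_per_second(latency_seconds, concurrent_sessions):
    """Upper bound on synchronous commits per second over a slow link."""
    # A single session can commit at most once per round trip...
    per_session = 1.0 / latency_seconds
    # ...but sessions wait in parallel, so the bound scales with their number.
    return per_session * concurrent_sessions
</pre>

<p>Under those assumptions a single session is capped at 10 synchronous commits
per second, while a few hundred concurrent sessions already reach the
“several thousand” range.</p>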

<p>If you want to read the slides again at home, or if you could not be there
for some reason, then most of the talks are now available online at the
<a href="http://wiki.postgresql.org/wiki/PostgreSQL_Conference_Europe_Talks_2011">PostgreSQL Conference Europe Talks 2011</a> wiki page.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 26 Oct 2011 10:08:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/10/26-back-from-amsterdam.html</guid>
</item>
<item>
  <title>Implementing backups</title>
  <link>http://tapoueh.org/blog/2011/10/12-backup-strategy.html</link>
  <description><![CDATA[<p>I've been asked about my opinion on backup strategy and best practices, and
it so happens that I have some kind of an opinion on the matter.</p>

<p>I tend to think best practice here begins with defining properly the <em>backup
plan</em> you want to implement. It's quite a complex matter, so be sure to ask
yourself about your needs: what do you want to be protected from?</p>

<center>
<p><img src="../../../images/online-backup.jpg" alt=""></p>
</center>

<p>The two main things you want to be protected from are hardware loss (crash,
disaster, plane in the data center, fire, flood, etc.) and human error
(<code>UPDATE</code> without a where clause). Replication is an answer to the former,
archiving and dumps to the latter. You generally need both.</p>

<p>Often enough “backups” include <code>WAL</code> <em>archiving</em> and <em>shipping</em> and nightly or
weekly <em>base backups</em>, with some retention and some scripts or procedures
ready to set up <a href="http://www.postgresql.org/docs/9.1/static/continuous-archiving.html">Point In Time Recovery</a> and recover some data without
interfering with the WAL archiving and shipping. Of course with PostgreSQL
9.0 and 9.1, the <em>WAL Shipping</em> can be implemented with <em>streaming replication</em>
and you can even have a <em>Hot Standby</em>. But for backups you still want
archiving.</p>

<p>Mostly I still implement <code>pg_dump -Fc</code> nightly backups with a custom retention
(for example, 1 backup a month kept 2 years, 1 backup a week kept 6 or 12
months, 1 backup a night kept 1 to 2 weeks), when the database size allows
the <code>pg_dump</code> run to fit within the <em>maintenance window</em>, if any.</p>
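
<p>As an illustration, such a retention rule could be sketched in Python as
below. The thresholds come from the example above, picking the longer option
each time (2 weeks, 12 months, 2 years); treating the Sunday dump as the
weekly one and the dump from the 1st as the monthly one is an assumption of
this sketch, and <code>keep_backup</code> is a hypothetical name.</p>

<pre class="src">
from datetime import date, timedelta

def keep_backup(backup_day, today):
    """Tell whether the nightly dump taken on backup_day is still retained."""
    age = today - backup_day
    # Every nightly dump is kept for two weeks.
    if timedelta(days=14) >= age:
        return True
    # Assumed weekly dump: the one taken on Sunday, kept for a year.
    if backup_day.weekday() == 6 and timedelta(days=365) >= age:
        return True
    # Assumed monthly dump: the one taken on the 1st, kept for two years.
    if backup_day.day == 1 and timedelta(days=730) >= age:
        return True
    return False
</pre>

<p>A cleanup job would then call <code>keep_backup</code> for each dump file's date and
delete the files for which it returns <code>False</code>.</p>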

<p>Don't forget that while <code>pg_dump</code> runs, you can't roll out <em>DDL changes</em> to the
production system any more, so you want to be careful about this
<em>maintenance window</em> thing. When you have one.</p>

<p><em>Physical backups</em> don't lock <em>rollouts</em> out, but they often eat up a good
deal of the <em>IO bandwidth</em>, so you need to pick the right time to run them.
That's how you end up with a weekly base backup plus WAL <em>archiving</em>.</p>

<p>If you can't <code>pg_dump</code> production, maybe you can have <em>automated restore jobs</em>
from the <em>physical backups</em> that you then <code>pg_dump -Fc</code>, so that you still have
that. That can come in handy, really: you can't test your <em>major upgrade</em> out
of a <em>physical backup</em>.</p>

<p>Also, <strong><em>obviously</em></strong>, never consider your backup strategy implemented until you
have either <em>automated restores</em> in place or a regular schedule to exercise
them (<em>staging instances</em>, devel instances).</p>

<p>Then as far as the practical tools go, I tend to think that <a href="http://tapoueh.org/pgsql/pgstaging.html">pg_staging</a> is
worth its installation complexity, and for WAL archiving and base backups I
recommend <a href="http://skytools.projects.postgresql.org/doc/walmgr.html">walmgr</a> from <a href="http://wiki.postgresql.org/wiki/SkyTools">Skytools</a>, a very handy tool. When using
PostgreSQL <code>9.0</code> or <code>9.1</code>, consider using <a href="http://packages.debian.org/experimental/skytools3-walmgr">walmgr3</a> so that it behaves nicely
alongside <em>streaming replication</em>.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 12 Oct 2011 22:22:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/10/12-backup-strategy.html</guid>
</item>
<item>
  <title>Extensions, applications</title>
  <link>http://tapoueh.org/blog/2011/10/10-extensions-applicatives.html</link>
  <description><![CDATA[<p>The <a href="http://2011.pgconf.eu/">annual PostgreSQL conference in Europe</a> takes place next week in
Amsterdam, and I hope you already have your tickets, because this edition
promises to be a very good vintage!</p>

<p>So I will be presenting how to use extensions. The talk's title is
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/138-extensions-are-good-for-business-logic/">Extensions are good for business logic</a>, and the idea is to see how to
leverage extensions to better manage your database
updates.</p>

<p>The life cycle of production databases often includes
a development database where the schema evolves as
the developers' needs dictate, and from time to time part of
those changes is consolidated (into <em>rollouts</em>, scripts containing mostly
<code>DDL</code>) in order to deploy them to production, if possible with an
intermediate pre-production step all the same.</p>

<p>Knowing what is deployed in development, and how to derive from it the script
to run in production, can sometimes be tedious.  When it is not,
it's because the work was done upstream, which is a sign of good
organisation, with the extra costs you can imagine.</p>

<p><a href="http://www.postgresql.org/docs/9.1/static/extend-extensions.html">Extensions</a> as found in PostgreSQL 9.1 let you
handle this kind of situation better by optimising the extra cost: it does not disappear,
but becomes operational rather than remaining an organisational burden.</p>

<p>Right, I'll leave you now, I have to get ready for the conference :)</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 10 Oct 2011 10:35:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/10/10-extensions-applicatives.html</guid>
</item>
<item>
  <title>Scaling Stored Procedures</title>
  <link>http://tapoueh.org/blog/2011/10/06-scaling-with-stored-procedures.html</link>
  <description><![CDATA[<p>In the news recently, <em>stored procedures</em> were used as an excuse for moving
logic away from the database layer to the application layer, and for migrating
away from a powerful technology to a simpler one, now that there's no logic
left in the database.</p>

<p>It's not the way I would typically approach scaling problems, and apparently
I'm not alone in the <em>Stored Procedures</em> camp.  Did you read this nice blog
post <a href="http://ora-00001.blogspot.com/2011/07/mythbusters-stored-procedures-edition.html">Mythbusters: Stored Procedures Edition</a> already?  Well, it happens in
another land than the one where my comfort zone is, but it still has some
interesting things to say.</p>

<p>I won't try to address all of the myths they attack in a single article.
Let's pick the scalability problems: the two I have in mind are code
management and performance.  We are quite well equipped for that in
PostgreSQL, really.</p>

<p>For code maintenance we now have <a href="http://www.postgresql.org/docs/9.1/static/extend-extensions.html">PostgreSQL Extensions</a>, which allow you to
pack all your procedures into separate <em>extensions</em>, and to maintain a version
number and upgrade procedures for each of them.  You can handle separate
rollouts in development for going from <code>1.12</code> to <code>1.13</code> then <code>1.14</code> and, after the
developers have tested it more completely and changed their minds again on the
best API they want to work with, <code>1.15</code>, which is stamped ok for production.
At this point, <code>ALTER EXTENSION ... UPDATE</code> will happily apply all the rollouts
in sequence to upgrade from <code>1.12</code> straight to <code>1.15</code> in one go.  And if you
prefer to bake a special careful script to handle that big jump, you
can also provide a specific <code>extension--1.12--1.15.sql</code> script.</p>
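
<p>To picture how a chain of update scripts gets picked, here is a simplified
Python model of the idea, not PostgreSQL's actual code. It treats each
<code>name--from--to.sql</code> file as an edge between two versions and looks for the
path with the fewest scripts, which is how I understand the documented
behaviour when several paths exist; <code>update_path</code> is a hypothetical name.</p>

<pre class="src">
from collections import deque

def update_path(script_names, extension, source, target):
    """Return the shortest list of update scripts from source to target."""
    # Build a graph: version, then the (next version, script) pairs it leads to.
    edges = {}
    for name in script_names:
        if not name.endswith(".sql"):
            continue
        parts = name[:-len(".sql")].split("--")
        if len(parts) == 3 and parts[0] == extension:
            edges.setdefault(parts[1], []).append((parts[2], name))
    # Breadth-first search returns a path using the fewest scripts.
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        version, path = queue.popleft()
        if version == target:
            return path
        for next_version, script in edges.get(version, []):
            if next_version not in seen:
                seen.add(next_version)
                queue.append((next_version, path + [script]))
    return None  # no chain of scripts leads to the target version
</pre>

<p>With only the three incremental scripts available, going from <code>1.12</code> to
<code>1.15</code> returns all three in order; once <code>extension--1.12--1.15.sql</code> is
added, that single script wins because it is the shorter path.</p>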

<p>Of course you're managing all those files with your favorite <em>SCM</em>, which
answers another myth from the blog post we are loosely following.</p>

<center>
<p><a class="image-link" href="http://postgresqlrussia.org/articles/view/131">
<img src="../../../images/Moskva_DB_Tools.v3.png"></a></p>
</center>

<p>I wanted to talk about the other side of the scalability problem, which is
the operations side of it.  What happens when you need to scale the database
in terms of its size and level of concurrent activity?  PostgreSQL has earned a
very good reputation for being able to scale up, but what about scaling out?
Surely, now that you're all in on <em>Stored Procedures</em>, it's going to be
a very bad situation?</p>

<p>Well, in fact, you're then in a very good position here, thanks to <a href="http://wiki.postgresql.org/wiki/PL/Proxy">PLproxy</a>.
This <em>extension</em> is a custom procedural language whose job is to handle a
cluster of database shards that all expose the same PL API, and it's very
good at doing that.</p>

<p><em>Stored Procedures</em> are a very good tool to have, be sure to get comfortable
enough with them so that you can choose exactly when to use them.  If you're
not sure about that, we at <a href="http://www.2ndquadrant.com/">2ndQuadrant</a> will be happy to help you there!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 06 Oct 2011 18:23:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/10/06-scaling-with-stored-procedures.html</guid>
</item>
<item>
  <title>See you in Amsterdam</title>
  <link>http://tapoueh.org/blog/2011/10/04-see-you-in-Amsterdam.html</link>
  <description><![CDATA[<p>The next <a href="http://2011.pgconf.eu/">PostgreSQL conference</a> is approaching very fast now, I hope you have
your ticket already: it's a very promising event!  If you want some help in
deciding whether to register or not, just have another look at <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/">the schedule</a>.
Pick the talks you want to see.  It's hard, given how packed with good ones
the schedule is.  When your mind is all set, review the list.  Registered?</p>

<p>I'll be presenting another talk about extensions, but this time I've geared
it towards use cases, with <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/138-extensions-are-good-for-business-logic/">Extensions are good for business logic</a>.  The idea is
not to talk about how to make PostgreSQL play fair with extensions including
at <em>dump</em> and <em>restore</em> times, that's already done and I've been talking only
too much about it.  The idea this time is to figure out how much you get
from this feature.</p>

<p>If you ever felt like something is missing in your processes between pushing
rollouts in devel environments and refining them as developers are testing
and preparing something for the live databases, then we have something for
you here.  Including how to easily compare state between production and
development, but without having to guess or reverse engineer anything.</p>

<p>Yeah, extensions are all about getting even more professional!  A great tool
you'll be happy to master!</p>

<p>And now I need to prepare a damn good slide deck, right?  See you there! :)</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 04 Oct 2011 14:25:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/10/04-see-you-in-Amsterdam.html</guid>
</item>
<item>
  <title>PostgreSQL in Amsterdam</title>
  <link>http://tapoueh.org/blog/2011/09/27-pgconf-eu.html</link>
  <description><![CDATA[<p>In less than a month the European PostgreSQL conference,
<a href="http://2011.pgconf.eu/">pgconf.eu</a>, takes place.  It's four days devoted to your favourite RDBMS, where
you can meet the European community, made up of users,
companies of all sizes, developers, and participants of every
kind.</p>

<p>It's the place to go to learn how the project works,
understand the impact of new releases on your architecture, have
a sharp technical discussion about that feature you'd like
to see land in the next release, or simply take in
the tremendous energy being poured into this project!</p>

<p>Obviously <a href="http://2ndquadrant.fr/">2ndQuadrant</a> will be there, and we will present several of our
<a href="http://www.2ndquadrant.com/fr/les-fonctionnalites-de-postgresql-91/">PostgreSQL 9.1 contributions</a>.  It starts with the full-day
training given by <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/speaker/81-greg-smith/">Greg</a>, <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/162-performance-from-start-to-crash/">Performance From Start to Crash</a>: if you want to
learn how to approach PostgreSQL server performance from the
international <em>leader</em> in the field, author of the book
<a href="http://www.amazon.fr/Bases-donn%C3%A9es-PostgreSQL-Gregory-Smith/dp/274402483X/ref=sr_1_1?ie=UTF8&amp;qid=1316183931&amp;sr=8-1">Bases de données PostgreSQL 9.0</a>, book your seat quickly!</p>

<p>The regular-format talks begin the next day, and over three
days the list of talks from our <a href="http://www.2ndquadrant.com/fr/profil-de-lequipe/">2ndQuadrant team</a> is quite
hefty.  Let's have a look.</p>

<p>We start with <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/144-migration-to-postgresql-a-holistic-view/">Migration to PostgreSQL - a holistic view</a> by
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/speaker/78-harald-armin-massa/">Harald Armin Massa</a>, who offers an interesting point of view on the reasons
holding some migrations back.  Then <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/speaker/34-gianni-ciolli/">Gianni Ciolli</a> will present
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/159-look-out-the-window-functions-and-free-your-sql/">Look Out The Window Functions (and free your SQL)</a>, or how to solve
complex problems simply when you have advanced tools at hand.</p>

<p>Another talk not to be missed,
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/156-synchronous-replication-and-durability-tuning/">Synchronous Replication and Durability Tuning</a>, details how to get the
most out of PostgreSQL 9.1 to obtain the data durability guarantees
your application wants.  This talk is given
by <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/speaker/81-greg-smith/">Greg Smith</a> and <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/speaker/17-simon-riggs/">Simon Riggs</a>.  The latter developed <em>synchronous
replication</em>, and <em>Hot Standby</em> before that.  You won't find anyone in the world
better placed to give this talk!</p>

<p>The next two talks from our <a href="http://expert-postgresql.fr/">PostgreSQL experts</a>, continuing
our reading of the <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/">pgconf.eu schedule</a> in order, take place at the same
time.  It won't be easy to choose between <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/158-improving-vacuum-suction/">Improving VACUUM Suction</a>, by Greg
again, and a comparison of <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/183-londiste-3-et-slony-21/">londiste 3 et slony 2.1</a> by
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/speaker/57-cedric-villemain/">Cédric Villemain</a>, given in French.</p>

<p>Next up is <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/138-extensions-are-good-for-business-logic/">Extensions are good for business logic</a>, which I will present
myself; you can see my talk on the page bearing my name:
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/speaker/14-dimitri-fontaine/">Dimitri Fontaine</a>.  It's a talk in English that details
how to use extensions to maintain the
<em>stored procedures</em> part of an application.</p>

<p>And to close the second day of 2ndQuadrant talks, you can
learn with Gianni about
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/160-debugging-complex-sql-queries-with-writable-ctes/">Debugging complex SQL queries with writable CTEs</a>, a feature
contributed to the project by another <a href="http://www.2ndquadrant.com/fr/contact/">2ndQuadrant</a> consultant, Marko Tiikkaja.</p>

<p>And there's still one more day!  We're not lying when we say the
schedule is packed!  The last day of the conference is not the least
interesting, and I hope you will have saved some energy to follow…</p>

<p><a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/speaker/17-simon-riggs/">Simon Riggs</a>, who will present his vision of the <a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/199-postgresql-roadmap/">PostgreSQL Roadmap</a> for the
coming years.  It is of course only his personal vision, but
when you take stock of these last 7 years of
<a href="http://www.2ndquadrant.com/fr/histoire-postgresql/">contributions to PostgreSQL</a>, you can see how much weight his personal opinion
can carry in the project's development.</p>

<p>Next, Greg's talk on his favourite subject:
<a href="http://www.postgresql.eu/events/schedule/pgconfeu2011/session/157-bottom-up-database-benchmarking/">Bottom-up Database Benchmarking</a>.  Everything you always wanted to
know about measuring the performance of your databases, but never
dared to ask.  Something along those lines anyway :)</p>

<p>Of course other talks are on offer and will catch your
attention; this post only covers those given by
the <a href="http://expert-postgresql.fr/">PostgreSQL experts</a> of <a href="http://www.2ndquadrant.com/fr/expertise-postgresql/">2ndQuadrant</a>.  Wishing you all a good conference,
I hope to have the pleasure of seeing you in Amsterdam next
month!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 27 Sep 2011 11:10:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/09/27-pgconf-eu.html</guid>
</item>
<item>
  <title>Skytools3: walmgr</title>
  <link>http://tapoueh.org/blog/2011/09/21-skytools-walmgr-part-1.html</link>
  <description><![CDATA[<p>Let's begin the <a href="http://wiki.postgresql.org/wiki/SkyTools">Skytools 3</a> documentation effort, which is long overdue.  The
code is waiting for you over at <a href="https://github.com/markokr/skytools">github</a>, and is stable and working.  Why is
it still in <em>release candidate</em> status, I hear you asking?  Well because it's
missing updated documentation.</p>

<p><a href="http://packages.debian.org/experimental/skytools3-walmgr">WalMgr</a> is the Skytools component that manages <em>WAL shipping</em> for you, and
archiving too.  It knows how to prepare your master and standby setup, how
to take a base backup and push it to the standby's system, how to archive
(at the standby) the master's WAL files as they are produced and have the
standby restore from this archive.</p>

<p>What's new in <code>walmgr</code> from Skytools 3 is its support for <em>Streaming
Replication</em> that made its way into PostgreSQL 9.0 and is even more useful in
PostgreSQL 9.1 (better monitoring, synchronous replication option).</p>

<h2>Getting ready</h2>

<p class="first">Now, I'm using Debian here, and a build virtual machine where I'm doing the
<em>backporting</em> work.  As <a href="http://www.postgresql.org/about/news.1349">PostgreSQL 9.1</a> is now out, let's use that.</p>

<pre class="src">
:~$ pg_lsclusters
Version Cluster   Port Status Owner    Data directory
8.4     main      5432 online postgres /var/lib/postgresql/8.4/main ...
9.0     main      5433 online postgres /var/lib/postgresql/9.0/main ...
9.1     main      5434 online postgres /var/lib/postgresql/9.1/main ...
</pre>

<p>After some editing of the configuration files (enabling <em>hot standby</em> and
switching <code>pg_hba.conf</code> to <code>trust</code> for the sake of this example), we can see
that the cluster is ready to be abused:</p>

<pre class="src">
:~$ sudo pg_ctlcluster 9.1 main restart
:~$ psql --cluster 9.1/main  -U postgres \
-c <span style="color: #ad7fa8; font-style: italic;">"select name, setting from pg_settings where name in ('max_wal_senders', 'wal_level')"</span>
      name       |   setting
-----------------+-------------
 max_wal_senders | 1
 wal_level       | hot_standby
(2 rows)

:~$ sudo mkdir -p /etc/walshipping/9.1/main /var/lib/postgresql/walshipping
:~$ sudo chown -R postgres:postgres /etc/walshipping /var/lib/postgresql/walshipping

:~$ ssh-keygen -t dsa
:~/.ssh$ cp id_dsa.pub authorized_keys
:~$ ssh localhost
</pre>

<p>So the order of operations is to prepare a standby, then have it restore
from the archives, then activate the wal streaming and check that the setup
allows the standby to switch back and forth between the streaming and the
archives.</p>


<h2>Setting up walmgr</h2>

<p class="first">To prepare the standby, we will do a <em>base backup</em> of the master.  That step
is handled by <code>walmgr</code>, so we first need to set it up.  Here's the sample
<code>master.ini</code> file:</p>

<pre class="src">
[<span style="color: #8ae234; font-weight: bold;">walmgr</span>]
<span style="color: #eeeeec;">job_name</span>             = wal-master
<span style="color: #eeeeec;">logfile</span>              = /var/log/postgresql/%(job_name)s.log
<span style="color: #eeeeec;">pidfile</span>              = /var/run/postgresql/%(job_name)s.pid
<span style="color: #eeeeec;">use_skylog</span>           = 0

<span style="color: #eeeeec;">master_db</span>            = port=5434 dbname=template1
<span style="color: #eeeeec;">master_data</span>          = /var/lib/postgresql/9.1/main/
<span style="color: #eeeeec;">master_config</span>        = /etc/postgresql/9.1/main/postgresql.conf
<span style="color: #eeeeec;">master_bin</span>           = /usr/lib/postgresql/9.1/bin

<span style="color: #888a85;"># </span><span style="color: #888a85;">set this only if you can afford database restarts during setup and stop.
</span><span style="color: #eeeeec;">master_restart_cmd</span>   = pg_ctlcluster 9.1 main restart

<span style="color: #eeeeec;">slave</span> = 127.0.0.1
<span style="color: #eeeeec;">slave_config</span> = /etc/walshipping/9.1/main/standby.ini

<span style="color: #eeeeec;">walmgr_data</span>          = /var/lib/postgresql/walshipping/9.1/main
<span style="color: #eeeeec;">completed_wals</span>       = %(walmgr_data)s/logs.complete
<span style="color: #eeeeec;">partial_wals</span>         = %(walmgr_data)s/logs.partial
<span style="color: #eeeeec;">full_backup</span>          = %(walmgr_data)s/data.master
<span style="color: #eeeeec;">config_backup</span>        = %(walmgr_data)s/config.backup

<span style="color: #888a85;"># </span><span style="color: #888a85;">syncdaemon update frequency
</span><span style="color: #eeeeec;">loop_delay</span>           = 10.0
<span style="color: #888a85;"># </span><span style="color: #888a85;">use record based shipping available since 8.2
</span><span style="color: #eeeeec;">use_xlog_functions</span>   = 0

<span style="color: #888a85;"># </span><span style="color: #888a85;">pass -z to rsync, useful on low bandwidth links
</span><span style="color: #eeeeec;">compression</span>          = 0

<span style="color: #888a85;"># </span><span style="color: #888a85;">keep symlinks for pg_xlog and pg_log
</span><span style="color: #eeeeec;">keep_symlinks</span>        = 1

<span style="color: #888a85;"># </span><span style="color: #888a85;">tell walmgr to set wal_level to hot_standby during setup
</span><span style="color: #888a85;">#</span><span style="color: #888a85;">hot_standby          = 1
</span>
<span style="color: #888a85;"># </span><span style="color: #888a85;">periodic sync
</span><span style="color: #888a85;">#</span><span style="color: #888a85;">command_interval     = 600
</span><span style="color: #888a85;">#</span><span style="color: #888a85;">periodic_command     = /var/lib/postgresql/walshipping/periodic.sh
</span></pre>

<p>And the <code>/etc/walshipping/9.1/main/standby.ini</code> companion:</p>

<pre class="src">
[<span style="color: #8ae234; font-weight: bold;">walmgr</span>]
<span style="color: #eeeeec;">job_name</span>             = wal-standby
<span style="color: #eeeeec;">logfile</span>              = /var/log/postgresql/%(job_name)s.log
<span style="color: #eeeeec;">use_skylog</span>           = 0

<span style="color: #eeeeec;">slave_data</span>           = /var/lib/postgresql/9.1/standby
<span style="color: #eeeeec;">slave_bin</span>            = /usr/lib/postgresql/9.1/bin
<span style="color: #eeeeec;">slave_stop_cmd</span>       = pg_ctlcluster 9.1 standby stop
<span style="color: #eeeeec;">slave_start_cmd</span>      = pg_ctlcluster 9.1 standby start
<span style="color: #eeeeec;">slave_config_dir</span>     = /etc/postgresql/9.1/standby/

<span style="color: #eeeeec;">walmgr_data</span>          = /var/lib/postgresql/walshipping/9.1/main
<span style="color: #eeeeec;">completed_wals</span>       = %(walmgr_data)s/logs.complete
<span style="color: #eeeeec;">partial_wals</span>         = %(walmgr_data)s/logs.partial
<span style="color: #eeeeec;">full_backup</span>          = %(walmgr_data)s/data.master
<span style="color: #eeeeec;">config_backup</span>        = %(walmgr_data)s/config.backup

<span style="color: #eeeeec;">backup_datadir</span>       = no
<span style="color: #eeeeec;">keep_backups</span>         = 0
<span style="color: #888a85;"># </span><span style="color: #888a85;">archive_command =
</span>
<span style="color: #888a85;"># </span><span style="color: #888a85;">primary database connect string for hot standby -- enabling
</span><span style="color: #888a85;"># </span><span style="color: #888a85;">this will cause the slave to be started in hot standby mode.
</span><span style="color: #eeeeec;">primary_conninfo</span>     = host=127.0.0.1 port=5434 user=postgres
</pre>
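
<p>Both files reuse values through the <code>%(name)s</code> notation. The sketch below
shows how such references expand, assuming they follow the basic interpolation
rules of Python's standard <code>configparser</code> module; the <code>expand</code> helper and
the trimmed-down sample are made up for the illustration and are not part of
<code>walmgr</code>.</p>

```python
import configparser

# A trimmed-down copy of the standby configuration shown above.
SAMPLE = """
[walmgr]
job_name       = wal-standby
logfile        = /var/log/postgresql/%(job_name)s.log
walmgr_data    = /var/lib/postgresql/walshipping/9.1/main
completed_wals = %(walmgr_data)s/logs.complete
"""


def expand(text, section="walmgr"):
    """Hypothetical helper: parse the text and return the expanded values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # items() applies the %(name)s substitutions before returning the values.
    return dict(parser.items(section))


settings = expand(SAMPLE)
print(settings["logfile"])
print(settings["completed_wals"])
```

<p>Here <code>logfile</code> expands to <code>/var/log/postgresql/wal-standby.log</code>, which is
why changing <code>job_name</code> alone is enough to rename the log file.</p>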

<p>And let's get started:</p>

<pre class="src">
:~$ cp standby.ini /etc/walshipping/9.1/main/

:~$ walmgr3 -v master.ini setup
2011-09-21 16:57:05,685 30450 INFO Configuring WAL archiving
2011-09-21 16:57:05,687 30450 DEBUG found 'archive_mode' in config -- enabling it
2011-09-21 16:57:05,687 30450 DEBUG found 'wal_level' in config -- setting to 'archive'
2011-09-21 16:57:05,688 30450 DEBUG modifying configuration: {'archive_mode': 'on', 'wal_level': 'archive', 'archive_command': '/usr/bin/walmgr3 /var/lib/postgresql/master.ini xarchive %p %f'}
2011-09-21 16:57:05,688 30450 DEBUG found parameter archive_mode with value ''off''
2011-09-21 16:57:05,690 30450 DEBUG found parameter wal_level with value ''minimal''
2011-09-21 16:57:05,690 30450 DEBUG found parameter archive_command with value ''''
2011-09-21 16:57:05,691 30450 INFO Restarting postmaster
2011-09-21 16:57:05,691 30450 DEBUG Execute cmd: 'pg_ctlcluster 9.1 main restart'
2011-09-21 16:57:09,404 30450 DEBUG Execute cmd: 'ssh' '-Tn' '-o' 'Batchmode=yes' '-o' 'StrictHostKeyChecking=no' '127.0.0.1' '/usr/bin/walmgr3' '/etc/walshipping/9.1/main/standby.ini' 'setup'
2011-09-21 16:57:09,712 30450 INFO Done

postgres@squeeze64:~$ walmgr3 master.ini backup
2011-09-21 17:00:17,259 30702 INFO Backup lock obtained.
2011-09-21 17:00:17,277 30692 INFO Execute SQL: select pg_start_backup('FullBackup'); [port=5434 dbname=template1]
2011-09-21 17:00:17,791 30712 INFO Removing expired backup directory: /var/lib/postgresql/walshipping/9.1/main/data.master
2011-09-21 17:00:18,200 30692 INFO Checking tablespaces
2011-09-21 17:00:18,202 30692 INFO pg_log does not exist, skipping
2011-09-21 17:00:18,259 30692 INFO Backup conf files from /etc/postgresql/9.1/main
2011-09-21 17:00:18,590 30731 INFO First useful WAL file is: 000000010000000200000092
2011-09-21 17:00:19,901 30759 INFO Backup lock released.
2011-09-21 17:00:19,919 30692 INFO Full backup successful

:~$ walmgr3 /etc/walshipping/9.1/main/standby.ini listbackups

List of backups:

Backup set      Timestamp                Label       First WAL
--------------- ------------------------ ----------- ------------------------
data.master     2011-09-21 17:00:17 CEST FullBackup  000000010000000200000092
</pre>

<p>The following articles will show how to manage that archive and how to go from
there to a <em>Hot Standby</em> fed by either <em>Streaming Replication</em> or <em>Archives</em>.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 21 Sep 2011 17:21:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/09/21-skytools-walmgr-part-1.html</guid>
</item>
<item>
  <title>el-get-3.1</title>
  <link>http://tapoueh.org/blog/2011/09/16-el-get-3.1.html</link>
  <description><![CDATA[<p>The <a href="https://github.com/dimitri/el-get">el-get</a> project releases its new stable version, <code>3.1</code>. This new release
fixes bugs, adds a host of new recipes (we have 420 of them and counting) and
brings some nice new features too.  You really want to upgrade.</p>

<h2>New features</h2>

<p class="first">Among the features you will find dependency management and <code>M-x
el-get-list-packages</code>, which you should try as soon as possible.  Of course,
don't miss <code>M-x el-get-self-update</code>, which eases the process somewhat.</p>

<center>
<p><img src="../../../images/emacs-el-get-list-packages.png" alt=""></p>
</center>

<p>This shows the result of <code>M-x el-get-list-packages</code>.  The packages that don't
have a description are the ones from <a href="http://www.emacswiki.org/cgi-bin/wiki?action=index;match=%5C.(el&#124;tar)(%5C.gz)%3F%24">emacswiki</a>, which doesn't provide a listing
of the filename <em>and</em> the first line of the file (it usually follows the
format <code>;;; filename.el --- description here</code>).  As we don't want to mirror
the website just to be able to provide descriptions, we just don't have them
for now.</p>

<p>Another nice new feature, contributed by a user who wanted to learn
<a href="http://www.gnu.org/software/emacs/manual/html_node/elisp/index.html">elisp</a>, is the <code>el-get-user-package-directory</code> support.  Just place
some <code>init-my-package.el</code> files in there, and when <em>el-get</em> wants to init the <code>my-package</code>
package, it will load that file for you.  That helps manage your setup,
and I'm already using it in my own <code>~/.emacs.d/</code> repository.</p>


<h2>Upgrading</h2>

<p class="first">Upgrading requires some care, though, because you need to edit
your packaging setup.  The <code>el-get-sources</code> variable used to be both the place to
set up extra recipes and the list of packages you want installed, and
several people rightfully insisted that I should change that.  I've been
slow to be convinced, but there it is, they were right.</p>

<p>So now, <a href="http://www.emacswiki.org/emacs/el-get">el-get</a> works from the current status of packages and will init all
those packages you have installed.  Which means that you just <code>M-x
el-get-install</code> a package and don't think about it anymore.  If you need to
override this behavior, it's still possible to do so by specifying the whole
list of packages you want initialized (and installed if necessary) on the
<code>(el-get 'sync ...)</code> call.</p>

<p>That latter setup is useful if you want to share your el-get selection across
several machines.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 16 Sep 2011 14:13:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/09/16-el-get-3.1.html</guid>
</item>
<item>
  <title>PostgreSQL 9.1</title>
  <link>http://tapoueh.org/blog/2011/09/19-sortie-de-9.1.html</link>
  <description><![CDATA[<p><a href="http://www.postgresql.org/about/news.1349">PostgreSQL 9.1</a> is out! You don't have this new
version in production yet?  Haven't yet evaluated why you should consider
migrating to it?  There are many good reasons to move
to this version, and few pitfalls.</p>

<p>We're starting to see articles picking up the news in the
French press, and I'm pleased to mention the one from <a href="http://www.programmez.com/actualites.php?titre_actu=Sortie-de-PostgreSQL-91-&#33;&amp;id_actu=10190">programmez.com</a>,
which announces "an unrivalled extension system".  As the developer
of <a href="http://www.postgresql.org/docs/9.1/static/extend-extensions.html">Extensions</a> in PostgreSQL, I can only agree
with them, and feel flattered too :)</p>

<p>Happy testing to all, and happy upgrading to the luckiest among you!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 14 Sep 2011 10:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/09/19-sortie-de-9.1.html</guid>
</item>
<item>
  <title>Avoiding SQL injections</title>
  <link>http://tapoueh.org/blog/2011/09/07-eviter-les-injections-sql.html</link>
  <description><![CDATA[<p>Last time we talked about the <a href="http://tapoueh.org/blog/2011/08/18-echappements-de-chaine.html">string escaping</a> rules in
PostgreSQL, and mentioned that using those techniques to protect the
data inserted into SQL queries is not a good idea, since
PostgreSQL offers a much better suited feature.</p>

<p>We are facing a security problem that is very well described in the
humorous <a href="http://xkcd.com/327/">Little Bobby Tables</a> strip, which I recommend you
read. The idea is simple, but implementing countermeasures is riddled with
subtle pitfalls, unless you use the solution described below.</p>

<center>
<p><img src="http://imgs.xkcd.com/comics/exploits_of_a_mom.png" alt=""></p>
</center>

<p>When we send an SQL query to PostgreSQL, it contains
a jumble of SQL keywords and user data. In the
query <code>SELECT colname FROM table WHERE pk = 1234;</code>
the element <code>1234</code> is data given to PostgreSQL. When using
other data types, we talk about a <em>literal</em>, which may or may not be
<em>decorated</em>.  An example?</p>

<pre class="src">
=# SELECT <span style="color: #ad7fa8; font-style: italic;">'undecorated literal'</span>, pg_typeof(<span style="color: #ad7fa8; font-style: italic;">'undecorated literal'</span>),
          date <span style="color: #ad7fa8; font-style: italic;">'today'</span>, pg_typeof(date <span style="color: #ad7fa8; font-style: italic;">'today'</span>);
      ?column?       | pg_typeof |    date    | pg_typeof
<span style="color: #888a85;">---------------------+-----------+------------+-----------
</span> undecorated literal | unknown   | 2011-09-07 | date
(1 row)
</pre>

<p>Beyond the data type aspect (an undecorated literal is of type <em>unknown</em>
until an operation forces its type, which is what allows
polymorphism in PostgreSQL), we see here that PostgreSQL must tell
the SQL itself apart from the parameters it contains. It knows how to
do that, of course: you only need to enclose the values in single
quotes or use the notation known as <a href="http://docs.postgresqlfr.org/9.0/sql-syntax.html#sql-syntax-dollar-quoting">dollar quoting</a>. But if
no precautions are taken, the user can terminate the quoted
sequence from the form's input field…</p>

<p><a href="http://docs.postgresql.fr/9.1/libpq.html">libpq</a> is PostgreSQL's standard client library; it provides the connection <em>API</em>
and offers a function called <a href="http://docs.postgresql.fr/9.1/libpq-exec.html#libpq-pqexecparams">PQexecParams</a>. This function exposes a
mechanism available in the PostgreSQL communication protocol
itself: the SQL and the data it contains can be sent
in two different parts of the message rather than mixed
together. That way the server no longer has to guess where the data starts and
ends within the query; it just looks into the
separate array holding the data when it needs it.</p>

<p>No more SQL injections!</p>

<p>Note: this function is exposed in most connection drivers,
even in PHP, whose popularity and exposure prompt me to give a
more precise reference: use <a href="http://fr2.php.net/manual/en/function.pg-query-params.php">pg_query_params</a>. Its benefit is not
merely syntactic, it goes all the way down to how data is
exchanged between the client (your PHP code) and the server (PostgreSQL).</p>
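
<p>Here is a minimal sketch of the idea in Python. To keep it runnable without
a server, the standard library <code>sqlite3</code> module stands in for PostgreSQL (an
assumption of this example, not something this post relies on), and the
<code>students</code> table and the <code>find_students</code> function are made up for the
occasion. What matters is that the query text and the value are handed over
separately.</p>

```python
import sqlite3


def find_students(conn, name):
    """Hypothetical lookup: the value is bound, never spliced into the SQL."""
    # The "?" placeholder is filled in by the driver from the separate
    # parameter tuple, so quotes inside the value cannot end the literal.
    cur = conn.execute("SELECT name FROM students WHERE name = ?", (name,))
    return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name text)")
conn.executemany("INSERT INTO students (name) VALUES (?)",
                 [("Alice",), ("Bob",)])

hostile = "Robert'); DROP TABLE students;--"
print(find_students(conn, hostile))   # the hostile input matches no row
print(find_students(conn, "Alice"))
```

<p>The hostile string is compared as plain data and the table is still there
afterwards, which is the behaviour you get from passing parameters separately
from the query text.</p>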
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 07 Sep 2011 11:36:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/09/07-eviter-les-injections-sql.html</guid>
</item>
<item>
  <title>PostgreSQL and debian</title>
  <link>http://tapoueh.org/blog/2011/09/05-apt-postgresql-org.html</link>
  <description><![CDATA[<p>After talking about it for a very long time, work has finally begun!  I'm
talking about the <a href="https://github.com/dimitri/apt.postgresql.org">apt.postgresql.org</a> build system that will allow us, in the
long run, to offer <code>debian</code> versions of binary packages for <a href="http://www.postgresql.org/">PostgreSQL</a> and
its extensions, compiled for a bunch of debian and ubuntu versions.</p>

<p>We're now thinking of supporting the <code>i386</code> and <code>amd64</code> architectures for <code>lenny</code>,
<code>squeeze</code>, <code>wheezy</code> and <code>sid</code>, and also for <code>maverick</code> and <code>natty</code>, maybe <code>oneiric</code> too
while we're at it.</p>

<p>It's still the very beginning of the effort, and it was triggered by the
decision to move <code>sid</code> to <code>9.1</code>.  While it's a good decision in itself, I still
hate having to pick only one PostgreSQL version per debian stable release
when we have all the technical support we need to be able to support all
stable releases that <em>upstream</em> is willing to maintain. If you've been living
under a rock, or if you couldn't care less about <code>debian</code> choices, the problem
here for debian is ensuring security updates (and bug fixes) for PostgreSQL:
they promise in the social contract that they will handle the job just fine, and
they don't want to have to do it without support from PostgreSQL if a <em>debian stable</em>
release contains a deprecated PostgreSQL version.</p>

<p>That opens the door for the PostgreSQL community to handle the packaging of its
solutions as a service to its debian users.  We intend to open with support
for <code>8.4</code>, <code>9.0</code> and <code>9.1</code>, and maybe <code>8.3</code> too, as <a href="http://qa.debian.org/developer.php?login=myon">Christoph Berg</a> is making good
progress on this front.  See, it's teamwork here!</p>

<p>We still have more work to do, and setting up the build environment so that
we are able to provide the packages for so many targets will indeed be
interesting. We're getting there, one step at a time.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 05 Sep 2011 17:14:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/09/05-apt-postgresql-org.html</guid>
</item>
<item>
  <title>pg_restore -L &amp; pg_staging</title>
  <link>http://tapoueh.org/blog/2011/08/29-pgstaging-and-pgrestore-listing.html</link>
  <description><![CDATA[<p>On the <a href="http://archives.postgresql.org/pgsql-hackers">PostgreSQL Hackers</a> mailing list, <a href="http://people.planetpostgresql.org/andrew/">Andrew Dunstan</a> just proposed some
new options for <code>pg_dump</code> and <code>pg_restore</code> to ease our lives.  One of the
answers mentioned some scripts available to exploit the <a href="http://www.postgresql.org/docs/9.0/static/app-pgrestore.html">pg_restore</a>
listing that you play with using options <code>-l</code> and <code>-L</code>, or the long name
versions <code>--list</code> and <code>--use-list</code>.  The <a href="../../../pgsql/pgstaging.html">pg_staging</a> tool allows you to easily
exploit those lists too.</p>

<p>The <code>pg_restore</code> list is just a listing, one object per line, of all objects
contained in a <em>custom</em> dump, that is one made with <code>pg_dump -Fc</code>.  You can
then tweak this listing in order to comment out some objects (prepending a <code>;</code>
to the line where you find each of them), and give your hacked file back to <code>pg_restore
--use-list</code> so that it will skip them.</p>

<p>What's pretty useful here, among other things, is that a table will in
fact have more than one line in the listing.  One is for the <code>TABLE</code> definition,
another one for the <code>TABLE DATA</code>.  That is how <code>pg_staging</code> is able to provide you
with options for restoring only some <em>schemas</em>, some <em>schemas_nodata</em> and even
some <em>tablename_nodata_regexp</em>, to use the configuration option names
directly.</p>

<p>How do you do a very simple exclusion of some tables' data when restoring a
dump, you ask?  Here we go.  Let's first prepare an environment,
where I have only a <a href="http://www.postgresql.org/">PostgreSQL</a> server running.</p>

<pre class="src">
$ git clone git://github.com/dimitri/pg_staging.git
$ git clone git://github.com/dimitri/pgloader.git
$ for s in */*.sql; do psql -f $s; done
$ pg_dump -Fc &gt; pgloader.dump
</pre>

<p>Now I have a dump with some nearly random SQL objects in it, so let's filter
out the data of the tables named <em>reformat</em> and <em>parallel</em>.  We will take the
sample setup from the <code>pg_staging</code> project.  Going the quick route, we will
not even change the default sample database name that's used, which is
<code>postgres</code>.  After all, the <code>catalog</code> command of <code>pg_staging</code> that we're using
here is a <em>developer</em> command, and you're supposed to be using <code>pg_staging</code> for a
lot more services than just this one.</p>

<pre class="src">
$ cp pg_staging/pg_staging.ini .
$ (echo <span style="color: #bc8f8f;">"schemas = public"</span>;
   echo <span style="color: #bc8f8f;">"tablename_nodata_regexp = parallel,reformat"</span>) \
  &gt;&gt; pg_staging.ini
$ echo <span style="color: #bc8f8f;">"catalog postgres pgloader.dump"</span> \
   | python pg_staging/pg_staging.py -c pg_staging.ini
 ; Archive created at Mon Aug 29 17:17:49 2011
 ;
 ; [EDITED OUTPUT]
 ;
 ; Selected TOC Entries:
 ;
3; 2615 2200 SCHEMA - public postgres
1864; 0 0 COMMENT - SCHEMA public postgres
1536; 1259 174935 TABLE public parallel dimitri
1537; 1259 174943 TABLE public partial dimitri
1538; 1259 174951 TABLE public reformat dimitri
;1853; 0 174935 TABLE DATA public parallel dimitri
1854; 0 174943 TABLE DATA public partial dimitri
;1855; 0 174951 TABLE DATA public reformat dimitri
1834; 2606 174942 CONSTRAINT public parallel_pkey dimitri
1836; 2606 174950 CONSTRAINT public partial_pkey dimitri
1838; 2606 174955 CONSTRAINT public reformat_pkey dimitri
</pre>

<p>We can see that the objects are indeed skipped. Now, here is how to actually
run the <code>pg_restore</code>:</p>

<pre class="src">
$ createdb foo
$ echo <span style="color: #bc8f8f;">"catalog postgres pgloader.dump"</span> \
 |python pg_staging/pg_staging.py -c pg_staging.ini &gt; short.list
$ pg_restore -L short.list -d foo pgloader.dump
</pre>
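
<p>For the simplest case, the same kind of filtering can be approximated by
hand. The following Python sketch comments out the <code>TABLE DATA</code> entries of
some tables in a listing; it assumes the line layout shown in the output
above, the function name is made up, and it is in no way how <code>pg_staging</code>
itself is implemented.</p>

```python
import re

# Assumed layout, from the listing above:
#   dump_id; catalog_oid object_oid TABLE DATA schema table owner
TABLE_DATA = re.compile(r"^\d+; \d+ \d+ TABLE DATA (\S+) (\S+) \S+$")


def comment_out_table_data(listing, tables):
    """Prefix the TABLE DATA entries of the given tables with ';'."""
    out = []
    for line in listing.splitlines():
        match = TABLE_DATA.match(line)
        if match and match.group(2) in tables:
            line = ";" + line      # pg_restore -L skips commented entries
        out.append(line)
    return "\n".join(out)


LISTING = """\
1536; 1259 174935 TABLE public parallel dimitri
1853; 0 174935 TABLE DATA public parallel dimitri
1854; 0 174943 TABLE DATA public partial dimitri
1855; 0 174951 TABLE DATA public reformat dimitri"""

print(comment_out_table_data(LISTING, {"parallel", "reformat"}))
```

<p>The table definitions are left alone, only the data entries of <em>parallel</em>
and <em>reformat</em> get the leading <code>;</code>.</p>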

<p>The little bonus with using <code>pg_staging</code> is that when filtering out a <em>schema</em>
it will track all tables and triggers from that schema, and also the
functions used in the trigger definition.  Which is not as easy as it
sounds, believe me!</p>

<p>The practical use case is when filtering out <code>PGQ</code> and <code>Londiste</code>, then the <code>PGQ</code>
triggers will automatically be skipped by <code>pg_staging</code> rather than polluting
the <code>pg_restore</code> logs because the <code>CREATE TRIGGER</code> command could not find the
necessary implementation procedure.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 29 Aug 2011 18:05:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/29-pgstaging-and-pgrestore-listing.html</guid>
</item>
<item>
  <title>Skytools, version 3</title>
  <link>http://tapoueh.org/blog/2011/08/26-skytools3.html</link>
  <description><![CDATA[<p>You can find <a href="http://packages.debian.org/source/experimental/skytools3">skytools3</a> in debian experimental already, it's in <em>release
candidate</em> status.  What's missing is the documentation, so here's an idea:
I'm going to write a series of blog posts about the new <a href="https://github.com/markokr/skytools">skytools</a> features, how to
use them, what they are good for, etc.  This first article of the series
will just list those new features.</p>

<p>Here are the slides from the <a href="http://www.char11.org/">CHAR(11)</a> talk I made last month, about that
very subject:</p>

<center>
<p><a class="image-link" href="../../../images/confs/CHAR_2011_Skytools3.pdf">
<img src="../../../images/confs/CHAR_2011_Skytools3.png"></a></p>
</center>


<p>The new version comes with a lot of new features.  <code>PGQ</code> now is able to
duplicate the queue events from one node to the next, so that it's able to
manage <em>switching over</em>.  To do that we have three types of nodes now, <em>root</em>,
<em>branch</em> and <em>leaf</em>.  <code>PGQ</code> also supports <em>cooperative consumers</em>, meaning that you
can share the processing load among many <em>consumers</em>, or workers.</p>

<p><code>Londiste</code> now benefits from the <em>switch over</em> feature, and is packed with new
little features like <code>add &lt;table&gt; --create</code>, the new <code>--trigger-flags</code> argument,
and the new <code>--handler</code> thing (to do e.g. partial table replication).  Let's
not forget the much awaited <code>execute &lt;script&gt;</code> command that allows you to include
<code>DDL</code> commands in the replication stream, nor the <em>parallel</em> <code>COPY</code> support that
will boost your initial setup.</p>

<p><code>walmgr</code> in the new version behaves correctly when using <a href="http://www.postgresql.org">PostgreSQL</a> 9.0.
Meaning that as soon as no more <em>WAL</em> files are available in the archives, it
returns an error code to the <em>archiver</em> so that the server switches to
<em>streaming</em> live from the <code>primary_conninfo</code>, then back to replaying the files
from the archive if the connection were to fail, etc.  All in all, it just
works.</p>

<p>Details to follow here, stay tuned!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 26 Aug 2011 21:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/26-skytools3.html</guid>
</item>
<item>
  <title>pgfincore in debian</title>
  <link>http://tapoueh.org/blog/2011/08/19-pgfincore-in-debian.html</link>
  <description><![CDATA[<p>As of pretty recently, <a href="http://villemain.org/projects/pgfincore">pgfincore</a> is in debian, as you can see on its
<a href="http://packages.debian.org/sid/postgresql-9.0-pgfincore">postgresql-9.0-pgfincore</a> page.  The reason why it entered the <a href="http://www.debian.org/">debian</a>
archives is that it reached the <code>1.0</code> release!</p>

<p>Rather than talking about what <em>pgfincore</em> is all about (<em>A set of functions to
manage pages in memory from PostgreSQL</em>), I will talk about its packaging and
support as a <em>debian package</em>.  Here's the first example of a modern
multi-version packaging I have to offer.  <a href="https://github.com/dimitri/pgfincore/tree/master/debian">pgfincore packaging</a> supports
building for <code>8.4</code> and <code>9.0</code> and <code>9.1</code> out of the box, even if the only binary
you'll find in <em>debian</em> sid is the <code>9.0</code> one, as you can check on the
<a href="http://packages.debian.org/source/sid/pgfincore">pgfincore debian source package</a> page.</p>

<p>Also, this is the first package I've done properly using the newer version
of <a href="http://kitenet.net/~joey/code/debhelper/">debhelper</a>, which makes the <a href="https://github.com/dimitri/pgfincore/blob/master/debian/rules">debian/rules</a> file easier than ever.  Let's have
a look at it:</p>

<pre class="src">
<span style="color: #b8860b;">SRCDIR</span> = $(<span style="color: #b8860b;">CURDIR</span>)
<span style="color: #b8860b;">TARGET</span> = $(<span style="color: #b8860b;">CURDIR</span>)/debian/pgfincore-%v
<span style="color: #b8860b;">PKGVERS</span> = $(<span style="color: #b8860b;">shell</span> dpkg-parsechangelog | awk -F <span style="color: #bc8f8f;">'[:-]'</span> <span style="color: #bc8f8f;">'/^Version:/ { print substr($$2, 2) }'</span>)
<span style="color: #b8860b;">EXCLUDE</span> = --exclude-vcs --exclude=debian

<span style="color: #7f007f;">include</span> <span style="color: #b8860b;">/usr/share/postgresql-common/pgxs_debian_control.mk</span>

<span style="color: #0000ff;">override_dh_auto_clean</span>: debian/control
        pg_buildext clean $(<span style="color: #b8860b;">SRCDIR</span>) $(<span style="color: #b8860b;">TARGET</span>) <span style="color: #bc8f8f;">"$(</span><span style="color: #b8860b;">CFLAGS</span><span style="color: #bc8f8f;">)"</span>
        dh_clean

<span style="color: #0000ff;">override_dh_auto_build</span>:
<span style="background-color: #ff69b4;">        #</span><span style="color: #b22222;"> </span><span style="color: #b22222;">build all supported version
</span>        pg_buildext build $(<span style="color: #b8860b;">SRCDIR</span>) $(<span style="color: #b8860b;">TARGET</span>) <span style="color: #bc8f8f;">"$(</span><span style="color: #b8860b;">CFLAGS</span><span style="color: #bc8f8f;">)"</span>

<span style="color: #0000ff;">override_dh_auto_install</span>:
<span style="background-color: #ff69b4;">        #</span><span style="color: #b22222;"> </span><span style="color: #b22222;">then install each of them
</span>        for v in <span style="color: #bc8f8f;">`pg_buildext supported-versions $(</span><span style="color: #b8860b;">SRCDIR</span><span style="color: #bc8f8f;">)`</span>; do \
                dh_install -ppostgresql-$$v-pgfincore ;\
        done

<span style="color: #0000ff;">orig</span>: clean
        cd .. &amp;&amp; tar czf pgfincore_$(<span style="color: #b8860b;">PKGVERS</span>).orig.tar.gz $(<span style="color: #b8860b;">EXCLUDE</span>) pgfincore

<span style="color: #0000ff;">%</span>:
        dh <span style="color: #0000ff;">$</span><span style="color: #5f9ea0;">@</span>
</pre>

<p>The <code>debian/rules</code> file is known to be the cornerstone of your debian
packaging, and usually is the most complex part of it.  It's a <code>Makefile</code> at
its heart, and we can see that thanks to the <code>debhelper</code> magic it's not that
complex to maintain anymore.</p>

<p>Then, this file relies on a bunch of helper commands, each of
which comes with its own man page and does a little part of the work.  The
overall idea behind <code>debhelper</code> is that what it does covers 90% of the cases
out there, and it's not aiming for more.  You have to <em>override</em> the parts where
its defaults are wrong.</p>

<p>Here for example the build system has to produce files for all three
supported versions of <a href="http://www.postgresql.org/">PostgreSQL</a>, which means invoking the same build system
three times with some changes in the <em>environment</em> (mainly setting the
<code>PG_CONFIG</code> variable correctly).  But even for that we have a <em>Debian</em> facility,
that comes in the package <a href="http://packages.debian.org/sid/postgresql-server-dev-all">postgresql-server-dev-all</a>, called <code>pg_buildext</code>.  As
long as your extension build system is <code>VPATH</code> friendly, it's all automated.</p>

<p>Please read that last sentence another time.  <code>VPATH</code> is the thing that allows
<code>Make</code> to find your source tree somewhere in the system, not in the current
working directory.  That allows you to cleanly build the same sources in
different build locations, which is exactly what we need here, and is
cleanly supported by <a href="http://www.postgresql.org/docs/9.1/static/extend-pgxs.html">PGXS</a>, the <a href="http://www.postgresql.org/docs/9.1/static/extend-pgxs.html">PostgreSQL Extension Building Infrastructure</a>.</p>

<p>Which means that the main <code>Makefile</code> of <em>pgfincore</em> had to be simplified, and
the code layout too.  Some advanced <code>Make</code> features such as <code>$(wildcard ...)</code>
and the like will not work here.  See what we got at the end:</p>

<pre class="src">
ifndef VPATH
<span style="color: #b8860b;">SRCDIR</span> = .
else
<span style="color: #b8860b;">SRCDIR</span> = $(<span style="color: #b8860b;">VPATH</span>)
endif

<span style="color: #b8860b;">EXTENSION</span>    = pgfincore
<span style="color: #b8860b;">EXTVERSION</span>   = $(<span style="color: #b8860b;">shell</span> grep default_version $(<span style="color: #b8860b;">SRCDIR</span>)/$(<span style="color: #b8860b;">EXTENSION</span>).control | \
               sed -e <span style="color: #bc8f8f;">"s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/"</span>)

<span style="color: #b8860b;">MODULES</span>      = $(<span style="color: #b8860b;">EXTENSION</span>)
<span style="color: #b8860b;">DATA</span>         = sql/pgfincore.sql sql/uninstall_pgfincore.sql
<span style="color: #b8860b;">DOCS</span>         = doc/README.$(<span style="color: #b8860b;">EXTENSION</span>).rst

<span style="color: #b8860b;">PG_CONFIG</span>    = pg_config

<span style="color: #b8860b;">PG91</span>         = $(<span style="color: #b8860b;">shell</span> $(<span style="color: #b8860b;">PG_CONFIG</span>) --version | grep -qE <span style="color: #bc8f8f;">"8\.|9\.0"</span> &amp;&amp; echo no || echo yes)

ifeq ($(<span style="color: #b8860b;">PG91</span>),yes)
<span style="color: #0000ff;">all</span>: pgfincore--$(<span style="color: #b8860b;">EXTVERSION</span>).sql

<span style="color: #0000ff;">pgfincore--$(</span><span style="color: #0000ff;">EXTVERSION</span><span style="color: #0000ff;">).sql</span>: sql/pgfincore.sql
        cp $<span style="color: #5f9ea0;">&lt;</span> <span style="color: #0000ff;">$</span><span style="color: #5f9ea0;">@</span>

<span style="color: #b8860b;">DATA</span>        = pgfincore--unpackaged--$(<span style="color: #b8860b;">EXTVERSION</span>).sql pgfincore--$(<span style="color: #b8860b;">EXTVERSION</span>).sql
<span style="color: #b8860b;">EXTRA_CLEAN</span> = sql/$(<span style="color: #b8860b;">EXTENSION</span>)--$(<span style="color: #b8860b;">EXTVERSION</span>).sql
endif

<span style="color: #b8860b;">PGXS</span> := $(<span style="color: #b8860b;">shell</span> $(<span style="color: #b8860b;">PG_CONFIG</span>) --pgxs)
<span style="color: #7f007f;">include</span> $(<span style="color: #b8860b;">PGXS</span>)

<span style="color: #0000ff;">deb</span>:
        dh clean
        make -f debian/rules orig
        debuild -us -uc -sa
</pre>
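
<p>The <code>EXTVERSION</code> line in the <code>Makefile</code> above shells out to <code>grep</code> and <code>sed</code> to
read <code>default_version</code> from the control file.  As a rough illustration only,
here is the same extraction sketched in Python.  The control file content
below is a made-up sample (only the <code>1.0</code> version comes from this article),
and unlike the <code>grep</code> call this sketch only accepts the key at the start of a
line.</p>

```python
import re

# Hypothetical control file content; the real file is $(SRCDIR)/pgfincore.control
CONTROL = """\
# pgfincore extension
comment = 'examine and manage the os buffer cache'
default_version = '1.0'
relocatable = true
"""


def default_version(control_text):
    """Return the quoted value of default_version, or None when the key is missing."""
    match = re.search(r"^default_version\s*=\s*'([^']*)'", control_text, re.MULTILINE)
    return match.group(1) if match else None


version = default_version(CONTROL)
# Name of the versioned script that the Makefile produces for 9.1
print("pgfincore--%s.sql" % version)
```

<p>The captured group plays the role of <code>\1</code> in the <code>sed</code> expression: everything
between the single quotes, and nothing else.</p>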

<p>No more <code>Make</code> magic to find source files.  Frankly though, when your sources
are 1 <code>c</code> file and 2 <code>sql</code> files, you don't need that much magic anyway.  You
just want to believe that a single generic <code>Makefile</code> will happily build any
project you throw at it, only requiring minor adjustments.  Well, the reality
is that you might need a few more adjustments if you want to benefit
from <code>VPATH</code> building, and have the binaries for <code>8.4</code>, <code>9.0</code> and <code>9.1</code> built
seamlessly in a simple loop.  Like we have here for <em>pgfincore</em>.</p>

<p>Now the <code>Makefile</code> still contains a little bit of magic, in order to parse the
extension version number from its <em>control file</em> and produce a <em>script</em> named
accordingly.  Then you'll notice a difference between the
<a href="https://github.com/dimitri/pgfincore/blob/master/debian/postgresql-9.1-pgfincore.install">postgresql-9.1-pgfincore.install</a> file and the
<a href="https://github.com/dimitri/pgfincore/blob/master/debian/postgresql-9.0-pgfincore.install">postgresql-9.0-pgfincore.install</a>.  We're just not shipping the same files:</p>

<pre class="src">
debian/pgfincore-9.0/pgfincore.so usr/lib/postgresql/9.0/lib
sql/pgfincore.sql usr/share/postgresql/9.0/contrib
sql/uninstall_pgfincore.sql usr/share/postgresql/9.0/contrib
</pre>

<p>Compare that with the <code>9.1</code> file:</p>

<pre class="src">
debian/pgfincore-9.1/pgfincore.so usr/lib/postgresql/9.1/lib
debian/pgfincore-9.1/pgfincore*.sql usr/share/postgresql/9.1/extension
sql/pgfincore--unpackaged--1.0.sql usr/share/postgresql/9.1/extension
</pre>

<p>So, now that we uncovered all the relevant magic, packaging and building
your next extension so that it supports as many PostgreSQL major releases as
you need to will be that easy.</p>

<p>For reference, you might need to also tweak
<code>/usr/share/postgresql-common/supported-versions</code> so that it allows you to
build for all those versions you claim to support in the <a href="https://github.com/dimitri/pgfincore/blob/master/debian/pgversions">debian/pgversions</a>
file.</p>

<pre class="src">
$ sudo dpkg-divert \
--divert /usr/share/postgresql-common/supported-versions.distrib \
--rename /usr/share/postgresql-common/supported-versions

$ cat /usr/share/postgresql-common/supported-versions
#! /bin/bash

dpkg -l postgresql-server-dev-* \
| awk -F '[ -]' '/^ii/ &amp;&amp; ! /server-dev-all/ {print $6}'
</pre>
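
<p>If the <code>awk</code> one-liner in that script looks cryptic, here is a sketch of the
same filter in Python.  The sample <code>dpkg -l</code> lines are invented for the
illustration, and the function name is mine, not part of
<code>postgresql-common</code>.</p>

```python
import re

# Hypothetical sample of `dpkg -l postgresql-server-dev-*` output lines
SAMPLE = """\
ii  postgresql-server-dev-8.4  8.4.8-1  development files for PostgreSQL 8.4
ii  postgresql-server-dev-9.0  9.0.4-1  development files for PostgreSQL 9.0
ii  postgresql-server-dev-all  118      extension build tool
rc  postgresql-server-dev-8.3  8.3.14-1 development files for PostgreSQL 8.3
"""


def supported_versions(dpkg_output):
    """Mimic the awk filter: installed (ii) packages, minus server-dev-all, 6th field."""
    versions = []
    for line in dpkg_output.splitlines():
        if not line.startswith("ii") or "server-dev-all" in line:
            continue
        # Same separator as awk -F '[ -]': every single space or hyphen splits
        fields = re.split(r"[ -]", line)
        # awk fields are 1-based, so $6 is index 5 here
        versions.append(fields[5])
    return versions


print(supported_versions(SAMPLE))
```

<p>Because every single space or hyphen is a separator, the two spaces after
<code>ii</code> yield an empty field, which is why the version number lands in the
sixth position.</p>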

<p>All of this will come in pretty handy when we finally sit down and work on a
way to provide binary packages for PostgreSQL and its extensions, and all
supported versions of those at that.  This very project is not dead, it's
just sleeping some more.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 19 Aug 2011 23:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/19-pgfincore-in-debian.html</guid>
</item>
<item>
  <title>String escaping</title>
  <link>http://tapoueh.org/blog/2011/08/18-echappements-de-chaine.html</link>
  <description><![CDATA[<p>Among the new features of the <a href="http://www.postgresql.org/about/news.1331">upcoming release</a> of <a href="http://www.postgresql.org/">PostgreSQL</a>, the much-awaited <code>9.1</code>,
one worth pointing out is the change in the default value of the
<code>standard_conforming_strings</code> setting, which becomes <em>true</em>.</p>

<p>Indeed, using escapes with the backslash character does not conform to
the SQL standard.  The <code>standard_conforming_strings</code> parameter controls how
PostgreSQL behaves when it reads a string literal in an SQL query.</p>

<p>Let's see some examples (the session uses French messages: <code>ATTENTION</code> is
the <code>WARNING</code> and <code>ASTUCE</code> is the <code>HINT</code>):</p>

<pre class="src">
dimitri=# set standard_conforming_strings to true;
SET
dimitri=# select 'hop''';
 ?column?
----------
 hop'
(1 ligne)

dimitri=# select 'hop\'';
dimitri'# ';
 ?column?
----------
 hop\';

(1 ligne)

dimitri=# select E'hop\'';
 ?column?
----------
 hop'
(1 ligne)

dimitri=# set standard_conforming_strings to false;
SET
dimitri=# select E'hop\'';
 ?column?
----------
 hop'
(1 ligne)

dimitri=# select 'hop\'';
ATTENTION:  utilisation non standard de \' dans une cha&#238;ne litt&#233;rale
LIGNE 1 : select 'hop\'';
                 ^
ASTUCE : Utilisez '' pour &#233;crire des guillemets dans une cha&#238;ne ou utilisez
la syntaxe de cha&#238;ne d'&#233;chappement (E'...').
 ?column?
----------
 hop'
(1 ligne)
</pre>

<p>There is a way to force PostgreSQL to accept backslash escapes regardless
of the value of <code>standard_conforming_strings</code>: the <code>E</code> prefix notation.  It is
recommended to always use it as soon as the string contains backslashes
used as escapes (of the single quote character, most of the time).</p>

<p>Finally, the <code>escape_string_warning</code> parameter disables warnings such as the
one shown in the last example above when it is set to <code>off</code>.  Of course, its
default value is <code>on</code>.</p>

<p>Any occurrence of this <em>WARNING</em> while <code>escape_string_warning</code> is <code>on</code> means
that your application is not ready to migrate to <code>9.1</code> with its default
settings.  There are two possible courses of action: change the setting from
its new default value back to the previous one, or fix your applications to
use the <code>E</code> prefix wherever it is needed.</p>

<p>Setting <code>standard_conforming_strings</code> to <code>on</code> brings another benefit besides
conforming to the SQL standard: protection against injections.  If it's not
possible to escape the single quote that ends any user-supplied string, it
becomes complicated to outsmart the <em>parser</em>.  The best approach here remains
of course to use parameterized queries, to be covered in an upcoming
article.</p>
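
<p>As a taste of what a parameterized query looks like, here is a minimal
Python sketch.  It uses the <code>sqlite3</code> module only because it ships with
Python and needs no server; that choice is mine, not something this article
prescribes.  With PostgreSQL you would go through your driver's own
placeholders instead (<code>psycopg2</code> uses <code>%s</code>).</p>

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a real PostgreSQL server
connection = sqlite3.connect(":memory:")
connection.execute("create table notes (body text)")

# A value containing both a backslash and single quotes
tricky = "hop\\' or it's a trap"

# The value travels separately from the SQL text, so no escaping is involved
connection.execute("insert into notes (body) values (?)", (tricky,))

stored = connection.execute("select body from notes").fetchone()[0]
print(stored == tricky)
```

<p>The query text never contains the user-supplied value, so the question of
how quotes and backslashes are escaped simply does not come up.</p>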
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 18 Aug 2011 19:01:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/18-echappements-de-chaine.html</guid>
</item>
<item>
  <title>el-get-list-packages</title>
  <link>http://tapoueh.org/blog/2011/08/18-el-get-list-packages.html</link>
  <description><![CDATA[<p>From the first days of <a href="../../../emacs/el-get.html">el-get</a> it was quite clear to me that we would reach
a point where users would want a nice listing including descriptions of the
packages, and a <em>major mode</em> allowing you to select packages to install,
remove and update.  It was also quite clear that I was not much interested
in doing it myself, even if I would appreciate having it done.</p>

<p>Well, that's the joy of Open Source &amp; Free Software (pick your own poison).
<a href="https://github.com/jglee1027">jglee1027</a> is the <em>GitHub</em> user who offered an implementation of said
facility, and who added descriptions for almost all of the now <code>402</code> recipes
that we have included with <a href="../../../emacs/el-get.html">el-get</a>.</p>

<p>Here's an image of what you get:</p>

<center>
<p><img src="../../../images/emacs-el-get-list-packages.png" alt=""></p>
</center>

<p>The packages with no description are those fetched by <code>M-x el-get-emacswiki-refresh</code>,
which does not download all the <a href="http://emacswiki.org">emacswiki</a> content locally just so that it can
parse each script's header and get a local description.  Maybe it's time to
ask for another page over there, like the <a href="http://www.emacswiki.org/cgi-bin/wiki?action=index;match=%5C.(el%7Ctar)(%5C.gz)%3F%24">emacswiki page index</a> but containing the
first line too.</p>

<p>For recipes we offer, this first line often looks like the following:</p>

<pre class="src">
<span style="color: #b22222;">;;; </span><span style="color: #b22222;">123-menu.el --- Simple menuing system, reminiscent of Lotus 123 in DOS
</span></pre>

<p>Of course some files over there do not follow this convention, but that
would be good enough already.</p>
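
<p>To make the idea concrete, here is a small Python sketch of how such a
first line could be split into a file name and a description.  This is only
an illustration: <a href="../../../emacs/el-get.html">el-get</a> itself is written in Emacs Lisp, and the
function name and regular expression below are my own assumptions.</p>

```python
import re

# Conventional first line of an Emacs Lisp file: ";;; NAME --- DESCRIPTION"
HEADER = re.compile(r"^;+\s*(\S+)\s+-+\s+(.*\S)\s*$")


def parse_header(first_line):
    """Return (file name, description), or None when the line does not follow the convention."""
    match = HEADER.match(first_line)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_header(";;; 123-menu.el --- Simple menuing system, reminiscent of Lotus 123 in DOS"))
```

<p>Files that do not follow the convention simply yield no description, which
is the "good enough" behaviour mentioned above.</p>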

<p>All in all, I hope you enjoy <code>M-x el-get-list-packages</code>!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 18 Aug 2011 18:10:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/18-el-get-list-packages.html</guid>
</item>
<item>
  <title>pgloader tutorial</title>
  <link>http://tapoueh.org/blog/2011/08/15-tutoriel-pgloader.html</link>
  <description><![CDATA[<p>Building on the content of the articles in the <a href="http://tapoueh.org/pgsql/pgloader.html">pgloader</a> series, I took the
time to compile a complete tutorial, in English.  Judging from the few
emails I have regularly been receiving about <code>pgloader</code> for some years now,
this should help new users.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 15 Aug 2011 15:39:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/15-tutoriel-pgloader.html</guid>
</item>
<item>
  <title>pgloader tutorial</title>
  <link>http://tapoueh.org/blog/2011/08/15-pgloader-tutorial.html</link>
  <description><![CDATA[<p>To finish up the pgloader series, I've compiled all the information into a
single page, the long awaited <a href="http://tapoueh.org/pgsql/pgloader.html#sec5">pgloader tutorial</a>.  That should help lots of
users to get started with <code>pgloader</code>.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 15 Aug 2011 15:33:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/15-pgloader-tutorial.html</guid>
</item>
<item>
  <title>pgloader constant cols</title>
  <link>http://tapoueh.org/blog/2011/08/12-pgloader-udc.html</link>
  <description><![CDATA[<p>The previous articles in the <a href="../../../pgsql/pgloader.html">pgloader</a> series detailed <a href="http://tapoueh.org/blog/2011/07/22-how-to-use-pgloader.html">How To Use PgLoader</a>
then <a href="http://tapoueh.org/blog/2011/07/29-how-to-setup-pgloader.html">How to Setup pgloader</a>, then what to expect from a <a href="http://tapoueh.org/blog/2011/08/01-parallel-pgloader.html">parallel pgloader</a>
setup, and then <a href="http://tapoueh.org/blog/2011/08/05-reformating-modules-for-pgloader.html">pgloader reformating</a>.  Another need you might encounter when
you get to use <a href="../../../pgsql/pgloader.html">pgloader</a> is adding <em>constant</em> values into a table's column.</p>

<p>The basic situation where you need to do so is adding an <em>origin</em> field to
your table.  The value of that is not to be found in the data file itself,
typically, but known in the pgloader setup.  That could even be the <code>filename</code>
you are importing data from.</p>

<p>In <a href="../../../pgsql/pgloader.html">pgloader</a> that's called a <em>user defined column</em>.  Here's what the relevant
<a href="https://github.com/dimitri/pgloader/blob/master/examples/pgloader.conf">examples/pgloader.conf</a> setup looks like:</p>

<pre class="src">
[<span style="color: #228b22;">udc</span>]
<span style="color: #b8860b;">table</span>           = udc
<span style="color: #b8860b;">format</span>          = text
<span style="color: #b8860b;">filename</span>        = udc/udc.data
<span style="color: #b8860b;">input_encoding</span>  = <span style="color: #bc8f8f;">'latin1'</span>
<span style="color: #b8860b;">field_sep</span>       = %
<span style="color: #b8860b;">columns</span>         = b:2, d:1, x:3, y:4
<span style="color: #b8860b;">udc_c</span>           = constant value
<span style="color: #b8860b;">copy_columns</span>    = b, c, d
</pre>

<p>And the data file is:</p>

<pre class="src">
1%5%foo%bar
2%10%bar%toto
3%4%toto%titi
4%18%titi%baz
5%2%baz%foo
</pre>

<p>And here's what the loaded table looks like:</p>

<pre class="src">
pgloader/examples$ pgloader -Tsc pgloader.conf udc
Table name        |    duration |    size |  copy rows |     errors
====================================================================
udc               |      0.201s |       - |          5 |          0

pgloader/examples$ psql --cluster 8.4/main pgloader -c <span style="color: #bc8f8f;">"table udc"</span>
 b  |       c        | d
----+----------------+---
  5 | constant value | 1
 10 | constant value | 2
  4 | constant value | 3
 18 | constant value | 4
  2 | constant value | 5
(5 rows)
</pre>

<p>Of course the configuration is not so straightforward as to process fields
in the data file in the order that they appear; after all,
<a href="https://github.com/dimitri/pgloader/blob/master/examples/pgloader.conf">examples/pgloader.conf</a> doubles as a test suite.</p>
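
<p>To see how <code>columns</code>, <code>udc_c</code> and <code>copy_columns</code> combine, here is a small
Python sketch that rebuilds the rows shown above from the data file.  It
only mirrors the configuration semantics as described in this article; it is
not pgloader's actual code, and all the names in it are mine.</p>

```python
# Values copied from the [udc] section and the udc.data file shown above
FIELD_SEP = "%"
COLUMNS = {"b": 2, "d": 1, "x": 3, "y": 4}   # column name: 1-based field position
USER_DEFINED = {"c": "constant value"}       # from udc_c = constant value
COPY_COLUMNS = ["b", "c", "d"]

DATA = """1%5%foo%bar
2%10%bar%toto
3%4%toto%titi
4%18%titi%baz
5%2%baz%foo"""


def build_row(line):
    """Pick the configured fields from one data line and add the constant columns."""
    fields = line.split(FIELD_SEP)
    values = {name: fields[position - 1] for name, position in COLUMNS.items()}
    values.update(USER_DEFINED)
    # Only the columns listed in copy_columns are kept, in that order
    return tuple(values[name] for name in COPY_COLUMNS)


rows = [build_row(line) for line in DATA.splitlines()]
for row in rows:
    print(row)
```

<p>Columns <code>x</code> and <code>y</code> are read from the file but never copied, while <code>c</code> is copied
without ever being read from the file.</p>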

<p>Long story short: if you need to add some <em>constant</em> values into the target
table you're loading data to, <a href="../../../pgsql/pgloader.html">pgloader</a> will help you there!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 12 Aug 2011 11:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/12-pgloader-udc.html</guid>
</item>
<item>
  <title>Emacs Startup</title>
  <link>http://tapoueh.org/blog/2011/08/06-emacs-startup-notification.html</link>
  <description><![CDATA[<p>Using <a href="http://www.gnu.org/software/emacs/">Emacs</a> we get to manage a larger and larger setup file (either <code>~/.emacs</code>
or <code>~/.emacs.d/init.el</code>), sometimes with lots of dependencies, and some
sub-files thanks to the <code>load</code> function or the <code>provide</code> and <code>require</code> mechanism.</p>

<p>Some users are even starting Emacs often enough for the startup time to be a
concern.  With an <code>emacs-uptime</code> (yes it's a command, you can <code>M-x
emacs-uptime</code>) of days to weeks (<code>10 days, 17 hours, 45 minutes, 34 seconds</code> as
of this writing), it's not something I really care about much.</p>

<p>But I know that some <a href="http://tapoueh.org/emacs/el-get.html">el-get</a> users still do care, and will use <code>el-get-is-lazy</code>
and do all their Emacs tweaking as <code>eval-after-load</code> blocks.  To get an idea
of how long a <em>worst case</em> startup with <a href="http://www.emacswiki.org/emacs/el-get">el-get</a> takes, I have added the
following piece of <code>elisp</code> at the very end of my startup code:</p>

<pre class="src">
(<span style="color: #7f007f;">defun</span> <span style="color: #0000ff;">dim:notify-startup-done</span> ()
  <span style="color: #bc8f8f;">" notify user that Emacs is now ready"</span>
  (el-get-notify
   <span style="color: #bc8f8f;">"Emacs is ready."</span>
   (format <span style="color: #bc8f8f;">"The init sequence took %g seconds."</span>
           (float-time (time-subtract after-init-time before-init-time)))))

(add-hook 'after-init-hook 'dim:notify-startup-done)
</pre>

<p>The <code>el-get-notify</code> function will adapt and either use the dbus implementation
from Emacs 24, or <a href="http://www.emacswiki.org/emacs/notify.el">notify.el</a> from <a href="http://www.emacswiki.org/">EmacsWiki</a> (just <code>M-x el-get-install</code> it if
you need it), or will use its own implementation of an Emacs <a href="http://growl.info/">Growl</a> client
(it's about 5 lines long), and barring all of that will use the <code>message</code>
function.</p>

<p>The reason I say <em>worst case</em> is that I have a lot of packages to initialize
at startup, and that I made absolutely no effort to make this initialization
quick.  Still, my Emacs setup is taking about 20 seconds to boot.  Pretty
good I would say, for a weekly operation.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Sat, 06 Aug 2011 14:58:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/06-emacs-startup-notification.html</guid>
</item>
<item>
  <title>pgloader reformating</title>
  <link>http://tapoueh.org/blog/2011/08/05-reformating-modules-for-pgloader.html</link>
  <description><![CDATA[<p>Back to our series about <a href="../../../pgsql/pgloader.html">pgloader</a>.  The previous articles detailed
<a href="http://tapoueh.org/blog/2011/07/22-how-to-use-pgloader.html">How To Use PgLoader</a> then <a href="http://tapoueh.org/blog/2011/07/29-how-to-setup-pgloader.html">How to Setup pgloader</a>, then what to expect from a
<a href="http://tapoueh.org/blog/2011/08/01-parallel-pgloader.html">parallel pgloader</a> setup.  This article will detail how to <em>reformat</em> input
columns so that what <a href="http://www.postgresql.org/">PostgreSQL</a> sees is not what's in the data file, but the
result of a <em>transformation</em> from this data into something acceptable as an
<em>input</em> for the target data type.</p>

<p>Here's what the <a href="http://pgloader.projects.postgresql.org/">pgloader documentation</a> has to say about this <em>reformat</em>
parameter: <em>The value of this option is a comma separated list of columns to
rewrite, which are a colon separated list of column name, reformat module
name, reformat function name</em>.</p>

<p>And here's the <a href="https://github.com/dimitri/pgloader/blob/master/examples/pgloader.conf">examples/pgloader.conf</a> section that deals with reformat:</p>

<pre class="src">
[<span style="color: #8ae234; font-weight: bold;">reformat</span>]
<span style="color: #eeeeec;">table</span>           = reformat
<span style="color: #eeeeec;">format</span>          = text
<span style="color: #eeeeec;">filename</span>        = reformat/reformat.data
<span style="color: #eeeeec;">field_sep</span>       = |
<span style="color: #eeeeec;">columns</span>         = id, timestamp
<span style="color: #eeeeec;">reformat</span>        = timestamp:mysql:timestamp
</pre>

<p>The documentation says some more about it, so check it out.  Also, the
<code>reformat_path</code> option (set either on the command line or in the configuration
file) is used to find the python module implementing the reformat function.
Please refer to the manual as to how to set it.</p>
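
<p>To make the syntax of the <code>reformat</code> value concrete, here is a minimal Python
sketch that splits it the way the documentation describes.  It is only an
illustration with a function name of my own, not the parser that pgloader
ships.</p>

```python
def parse_reformat(value):
    """Split 'column:module:function, ...' into (column, module, function) tuples."""
    rules = []
    for item in value.split(","):
        parts = [part.strip() for part in item.split(":")]
        if len(parts) != 3:
            raise ValueError("expected column:module:function, got %r" % item)
        rules.append(tuple(parts))
    return rules


# The value used in the [reformat] section above
print(parse_reformat("timestamp:mysql:timestamp"))
```

<p>In the example section, this reads as: rewrite the <code>timestamp</code> column with
the <code>timestamp</code> function found in the <code>mysql</code> module.</p>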

<p>Now, obviously, for the <em>reformat</em> to happen we need to write some code.
That's the whole point of the option: when you need something very specific
and are in a position to write the 5 lines of code needed to make it happen,
<a href="http://tapoueh.org/pgsql/pgloader.html">pgloader</a> allows you to do just that.  Of course, the code needs to be
written in python here, so that you can even benefit from the
<a href="http://tapoueh.org/blog/2011/08/01-parallel-pgloader.html">parallel pgloader</a> settings.</p>


<p>Let's see a reformat module example, as found in <a href="https://github.com/dimitri/pgloader/blob/master/reformat/mysql.py">reformat/mysql.py</a> in the
<code>pgloader</code> sources:</p>

<pre class="src">
<span style="color: #888a85;"># </span><span style="color: #888a85;">Author: Dimitri Fontaine &lt;<a href="mailto:dim&#64;tapoueh.org">dim&#64;tapoueh.org</a>&gt;
</span><span style="color: #888a85;">#</span><span style="color: #888a85;">
</span><span style="color: #888a85;"># </span><span style="color: #888a85;">pgloader mysql reformating module
</span><span style="color: #888a85;">#</span><span style="color: #888a85;">
</span>
<span style="color: #729fcf; font-weight: bold;">def</span> <span style="color: #edd400; font-weight: bold; font-style: italic;">timestamp</span>(reject, <span style="color: #729fcf;">input</span>):
    <span style="color: #ad7fa8; font-style: italic;">""" Reformat str as a PostgreSQL timestamp

    MySQL timestamps are like:  20041002152952
    We want instead this input: 2004-10-02 15:29:52
    """</span>
    <span style="color: #729fcf; font-weight: bold;">if</span> <span style="color: #729fcf;">len</span>(<span style="color: #729fcf;">input</span>) != 14:
        <span style="color: #eeeeec;">e</span> = <span style="color: #ad7fa8; font-style: italic;">"MySQL timestamp reformat input too short: %s"</span> % <span style="color: #729fcf;">input</span>
        reject.log(e, <span style="color: #729fcf;">input</span>)

    <span style="color: #eeeeec;">year</span>    = <span style="color: #729fcf;">input</span>[0:4]
    <span style="color: #eeeeec;">month</span>   = <span style="color: #729fcf;">input</span>[4:6]
    <span style="color: #eeeeec;">day</span>     = <span style="color: #729fcf;">input</span>[6:8]
    <span style="color: #eeeeec;">hour</span>    = <span style="color: #729fcf;">input</span>[8:10]
    <span style="color: #eeeeec;">minute</span>  = <span style="color: #729fcf;">input</span>[10:12]
    <span style="color: #eeeeec;">seconds</span> = <span style="color: #729fcf;">input</span>[12:14]

    <span style="color: #729fcf; font-weight: bold;">return</span> <span style="color: #ad7fa8; font-style: italic;">'%s-%s-%s %s:%s:%s'</span> % (year, month, day, hour, minute, seconds)
</pre>

<p>This reformat module will <em>transform</em> a <code>timestamp</code> representation as issued by
certain versions of MySQL into something that PostgreSQL is able to read as
a timestamp.</p>

<p>If you're in the camp that wants to write as little code as possible rather
than easy to read and maintain code, I guess you could write it this way
instead:</p>

<pre class="src">
<span style="color: #729fcf; font-weight: bold;">import</span> re
<span style="color: #729fcf; font-weight: bold;">def</span> <span style="color: #edd400; font-weight: bold; font-style: italic;">timestamp</span>(reject, <span style="color: #729fcf;">input</span>):
    <span style="color: #ad7fa8; font-style: italic;">""" 20041002152952 -&gt; 2004-10-02 15:29:52 """</span>
    <span style="color: #eeeeec;">g</span> = re.match(r<span style="color: #ad7fa8; font-style: italic;">"(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"</span>, <span style="color: #729fcf;">input</span>)
    <span style="color: #729fcf; font-weight: bold;">return</span> <span style="color: #ad7fa8; font-style: italic;">'%s-%s-%s %s:%s:%s'</span> % <span style="color: #729fcf;">tuple</span>([g.group(x+1) <span style="color: #729fcf; font-weight: bold;">for</span> x <span style="color: #729fcf; font-weight: bold;">in</span> <span style="color: #729fcf;">range</span>(6)])
</pre>
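
<p>To try such a function outside of pgloader, a small harness helps.  The
sketch below is only that: <code>FakeReject</code> is a hypothetical stand-in for the
reject object that pgloader passes in, and returning the input unchanged when
it does not look like a MySQL timestamp is an assumption of this example, not
documented pgloader behaviour.</p>

```python
import re

class FakeReject(object):
    """Hypothetical stand-in for the reject object pgloader passes in."""
    def __init__(self):
        self.messages = []

    def log(self, message, data):
        # Remember what would have been written to the reject log.
        self.messages.append((message, data))

def timestamp(reject, input):
    """ Turn 20041002152952 into 2004-10-02 15:29:52, logging bad input """
    g = re.match(r"(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$", input)
    if g is None:
        # Assumption: hand the value back untouched and let PostgreSQL decide.
        reject.log("not a 14 digits MySQL timestamp: %s" % input, input)
        return input
    return '%s-%s-%s %s:%s:%s' % g.groups()

reject = FakeReject()
print(timestamp(reject, "20041002152952"))
print(timestamp(reject, "2004"))
print(reject.messages)
```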

<p>Whenever you have an input file with data that PostgreSQL chokes upon, you
can solve this problem from <a href="http://tapoueh.org/pgsql/pgloader.html">pgloader</a> itself: no need to resort to scripting
and pipelines of <a href="http://www.gnu.org/software/gawk/manual/gawk.html">awk</a> (which I use a lot in other cases, don't get me
wrong) or other tools.  See, you finally have an excuse to <a href="http://diveintopython.org/">Dive into Python</a>!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 05 Aug 2011 11:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/05-reformating-modules-for-pgloader.html</guid>
</item>
<item>
  <title>Reformatting with pgloader</title>
  <link>http://tapoueh.org/blog/2011/08/05-reformater-avec-pgloader.html</link>
  <description><![CDATA[<p>In our series of articles about <a href="http://tapoueh.org/tags/pgloader.html">pgloader</a>, the latest one details how
to use the <em>reformatting</em> feature of this tool.  When working with
an <a href="http://fr.wikipedia.org/wiki/Extract_Transform_Load">ETL</a>, this corresponds to the <em>Transform</em> phase, which makes
<code>pgloader</code> a <em>simple</em> solution for your ETL needs.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 05 Aug 2011 11:26:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/05-reformater-avec-pgloader.html</guid>
</item>
<item>
  <title>See Tsung in action</title>
  <link>http://tapoueh.org/blog/2011/08/02-see-tsung-in-action.html</link>
  <description><![CDATA[<p><a href="http://tsung.erlang-projects.org/">Tsung</a> is an open-source multi-protocol distributed load testing tool and a
mature project.  It's been available for about 10 years and is built with
the <a href="http://www.erlang.org/">Erlang</a> system.  It supports several protocols, including the <a href="http://www.postgresql.org/">PostgreSQL</a>
one.</p>

<p>When you want to benchmark your own application, to know how many more
clients it can handle or how much gain you will see with some new shiny
hardware, <a href="http://tsung.erlang-projects.org/">Tsung</a> is the tool to use.  It will allow you to <em>record</em> a number of
sessions then replay them at high scale.  <a href="http://pgfouine.projects.postgresql.org/tsung.html">pgfouine</a> supports Tsung and is
able to turn your PostgreSQL logs into Tsung sessions, too.</p>

<p>Tsung has also been used in the video game world, where a version of it called
<a href="http://www.developer.unitypark3d.com/tools/utsung/">uTsung</a> apparently builds on the <a href="http://www.developer.unitypark3d.com/index.html">uLink</a> game development facilities.  They even
made a video demo of uTsung that you might find interesting:</p>

<blockquote>
<p class="quoted"><a class="image-link" href="http://www.youtube.com/watch?v=rxBhqIP_7ls">
<img src="../../../images/utsung-demo.png"></a></p>
</blockquote>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 02 Aug 2011 10:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/02-see-tsung-in-action.html</guid>
</item>
<item>
  <title>Parallel pgloader</title>
  <link>http://tapoueh.org/blog/2011/08/01-parallel-pgloader.html</link>
  <description><![CDATA[<p>This article continues the series that began with <a href="http://tapoueh.org/blog/2011/07/22-how-to-use-pgloader.html">How To Use PgLoader</a> and then
detailed <a href="http://tapoueh.org/blog/2011/07/29-how-to-setup-pgloader.html">How to Setup pgloader</a>.  We have some more fine points to talk about:
today's article is about loading your data in parallel with <a href="../../../pgsql/pgloader.html">pgloader</a>.</p>

<h2>several files at a time</h2>

<p class="first">Parallelism is implemented in 3 different ways in pgloader.  First, you can
load more than one file at a time thanks to the <code>max_parallel_sections</code>
parameter, which has to be set in the <em>global section</em> of the configuration file.</p>

<p>This setting is quite simple and already covers the most common use case.</p>
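
<p>How pgloader schedules its sections is not shown in this article, so the
following is only a sketch of the idea, assuming one thread per section and a
semaphore that lets at most <code>max_parallel_sections</code> of them work at once.
The names <code>run_sections</code> and <code>load</code> are hypothetical.</p>

```python
import threading

def run_sections(section_names, max_parallel_sections, load):
    """Run load(name) for every section, a bounded number at a time."""
    slots = threading.BoundedSemaphore(max_parallel_sections)

    def guarded(name):
        with slots:              # blocks while all the slots are taken
            load(name)

    threads = [threading.Thread(target=guarded, args=(name,))
               for name in section_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

loaded = []
run_sections(["simple", "partial", "cluttered"], 2, loaded.append)
print(sorted(loaded))
```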


<h2>several workers per file</h2>

<p class="first">The other use case is when you have huge files to load into the database.
Then you want to be able to have more than one worker reading the file at
the same time.  By using <a href="../../../pgsql/pgloader.html">pgloader</a>, you have already accepted the compromise of loading the
whole content in more than one transaction, so there's no further drawback
in spreading those multiple transactions per file across more than
one load <em>worker</em>.</p>

<p>There are basically two ways to split the work between several workers here,
and both are implemented in pgloader.</p>

<h3>N workers, N splits of the file</h3>

<pre class="src">
<span style="color: #eeeeec;">section_threads</span>    = 4
<span style="color: #eeeeec;">split_file_reading</span> = True
</pre>

<p>Set up this way, <a href="../../../pgsql/pgloader.html">pgloader</a> will launch 4 different <em>threads</em> (see the <strong>caveat</strong>
section of this article).  Each thread is then given a part of the input
data file and will run the whole usual pgloader processing on its own.  For
this to work you need to be able to <code>seek</code> in the input stream, which might
not always be convenient.</p>
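
<p>To picture why <code>seek</code> is needed, here is a minimal sketch of cutting a
file into one byte range per worker.  It is not pgloader's code: the function
names are made up for the example, and it assumes that one physical line is one
row, so it ignores the escaped newlines that the <code>text</code> format supports.</p>

```python
import os
import tempfile

def split_offsets(path, workers):
    """Return (start, end) byte ranges, one per worker, cut at line ends."""
    size = os.path.getsize(path)
    cuts = [0]
    with open(path, "rb") as f:
        for i in range(1, workers):
            f.seek(size * i // workers)   # jump close to the i-th cut...
            f.readline()                  # ...then move on to the next line start
            cuts.append(f.tell())
    cuts.append(size)
    return list(zip(cuts[:-1], cuts[1:]))

def read_range(path, start, end):
    """What a single worker would read: only its own part of the file."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start)

# Ten rows of six bytes each, written to a scratch file.
data = "".join("%d|row\n" % i for i in range(10)).encode("ascii")
with tempfile.NamedTemporaryFile(suffix=".data", delete=False) as tmp:
    tmp.write(data)
ranges = split_offsets(tmp.name, 4)
parts = [read_range(tmp.name, start, end) for start, end in ranges]
os.remove(tmp.name)
print(ranges)
```

<p>Each range starts at the beginning of a line, and the ranges put back
together give the whole file.</p>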


<h3>one reader, N workers</h3>

<pre class="src">
<span style="color: #eeeeec;">section_threads</span>    = 4
<span style="color: #eeeeec;">split_file_reading</span> = False
<span style="color: #eeeeec;">rrqueue_size</span>       = 5000
</pre>

<p>With such a setup, <a href="../../../pgsql/pgloader.html">pgloader</a> will start 4 different worker <em>threads</em> that will
receive the input data through internal <a href="http://docs.python.org/library/collections.html#deque-objects">python queues</a>.  Another active <em>thread</em>
will be responsible for reading the input file and filling the queues in a
<em>round robin</em> fashion, leaving all the processing of the data to the
workers, of course.</p>
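
<p>The round robin idea can be sketched with the standard <code>queue</code> and
<code>threading</code> modules.  This is an illustration under assumptions, not
pgloader's implementation: here <code>rrqueue_size</code> is taken to be the capacity
of each queue, the reader runs in the calling thread, and upper-casing a row
stands in for the real processing.</p>

```python
import queue
import threading

DONE = object()   # sentinel telling a worker that the input is exhausted

def worker(inbox, results):
    while True:
        row = inbox.get()
        if row is DONE:
            break
        results.append(row.upper())   # stands in for the real processing

def load(rows, section_threads=4, rrqueue_size=5000):
    queues = [queue.Queue(maxsize=rrqueue_size) for _ in range(section_threads)]
    results = [[] for _ in queues]
    workers = [threading.Thread(target=worker, args=(q, r))
               for q, r in zip(queues, results)]
    for w in workers:
        w.start()
    # The reader: deal the rows to the queues in turn, blocking on a full one.
    for i, row in enumerate(rows):
        queues[i % section_threads].put(row)
    for q in queues:
        q.put(DONE)
    for w in workers:
        w.join()
    return results

print(load(["a", "b", "c", "d", "e"], section_threads=2, rrqueue_size=1))
```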


<h3>how many threads?</h3>

<p class="first">If you're using a mix and match of <code>max_parallel_sections</code> and <code>section_threads</code>
with <code>split_file_reading</code> set to <code>True</code> or <code>False</code>, it's hard to know exactly
how many <em>threads</em> will run at any time during the loading.  How can you tell
which sections will run in parallel when that depends on the timing of the
loading?</p>

<p>The advice here is the usual one: don't overestimate the capabilities of
your system unless you are in a position to check beforehand by doing trial
runs.</p>
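
<p>For a back-of-the-envelope upper bound, you can count only the threads
named in this article.  The function below is a hypothetical helper which
assumes that every section uses the same settings, and it does not count any
other thread pgloader might start for its own bookkeeping.</p>

```python
def max_threads(max_parallel_sections, section_threads, split_file_reading):
    """Rough upper bound on loading threads, counting only those named above."""
    per_section = section_threads
    if not split_file_reading:
        per_section += 1          # the extra reader thread feeding the queues
    return max_parallel_sections * per_section

print(max_threads(4, 4, True))    # 16
print(max_threads(4, 4, False))   # 20
```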



<h2>caveat</h2>

<p class="first">The current implementation of all the parallelism in <a href="../../../pgsql/pgloader.html">pgloader</a> is built on
the <a href="http://docs.python.org/library/threading.html">python threading</a> API.  While this makes it easy to
exchange data between threads, it suffers from the
<a href="http://docs.python.org/c-api/init.html#thread-state-and-the-global-interpreter-lock">Global Interpreter Lock</a> issue.  This means that while the code is organized to do its
processing in parallel, the <em>runtime</em> only executes python code in one thread at a time.  You might still benefit
from the current implementation if you have hard to parse files, or custom
reformat modules that are part of the loading bottleneck.</p>


<h2>future</h2>

<p class="first">The solution would be to switch to the newer <a href="http://docs.python.org/library/multiprocessing.html">python multiprocessing</a>
API, and some preliminary work has been done in pgloader to allow for that.
If you're interested in real parallel bulk loading, <a href="dim%20(at)%20tapoueh%20(dot)%20org">contact me</a>!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 01 Aug 2011 12:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/08/01-parallel-pgloader.html</guid>
</item>
<item>
  <title>Configuring pgloader</title>
  <link>http://tapoueh.org/blog/2011/07/29-configurer-pgloader.html</link>
  <description><![CDATA[<p>I have just published a post in English titled <a href="http://tapoueh.org/blog/2011/07/29-how-to-setup-pgloader.html">How to Setup pgloader</a>, which
complements the more complete <a href="http://tapoueh.org/pgsql/pgloader.html">pgloader tutorial</a> that I'm currently writing.  Once
again, I did not take the time to translate this article into French before
knowing whether you, dear readers, are interested.  If you are, just
tell me by mail (or <em>courriel</em>, after all) and I will add it to
my <code>TODO</code> list.</p>

<p>Happy reading!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 29 Jul 2011 15:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/29-configurer-pgloader.html</guid>
</item>
<item>
  <title>How to Setup pgloader</title>
  <link>http://tapoueh.org/blog/2011/07/29-how-to-setup-pgloader.html</link>
  <description><![CDATA[<p>In a previous article we detailed <a href="http://tapoueh.org/blog/2011/07/22-how-to-use-pgloader.html">how to use pgloader</a>; let's now see how to
write the <code>pgloader.conf</code> file that tells <a href="http://tapoueh.org/pgsql/pgloader.html">pgloader</a> what to do.</p>

<p>This file is expected in the <code>INI</code> format, with a <em>global</em> section then one
section per file you want to import.  The <em>global</em> section defines some
default options and how to connect to the <a href="http://tapoueh.org/pgsql/index.html">PostgreSQL</a> server.</p>

<p>The configuration setup is fully documented on the <a href="http://pgloader.projects.postgresql.org/">pgloader man page</a>, which
you can easily find online.  Like all <em>unix</em> style man pages, though, it's
more a complete reference than introductory material.  Let's review.</p>

<h2>global section</h2>

<p class="first">Here's the <em>global</em> section of the <a href="https://github.com/dimitri/pgloader/blob/master/examples/pgloader.conf">examples/pgloader.conf</a> file from the source
tree.  Well, some options there are really <em>debugging</em> only options, so I changed
their values so that what you see here is a better starting point.</p>

<pre class="src">
[<span style="color: #8ae234; font-weight: bold;">pgsql</span>]
<span style="color: #eeeeec;">base</span> = pgloader

<span style="color: #eeeeec;">log_file</span>            = /tmp/pgloader.log
<span style="color: #eeeeec;">log_min_messages</span>    = INFO
<span style="color: #eeeeec;">client_min_messages</span> = WARNING

<span style="color: #eeeeec;">lc_messages</span>         = C
<span style="color: #eeeeec;">pg_option_client_encoding</span> = <span style="color: #ad7fa8; font-style: italic;">'utf-8'</span>
<span style="color: #eeeeec;">pg_option_standard_conforming_strings</span> = on
<span style="color: #eeeeec;">pg_option_work_mem</span> = 128MB

<span style="color: #eeeeec;">copy_every</span>      = 15000

<span style="color: #eeeeec;">null</span>         = <span style="color: #ad7fa8; font-style: italic;">""</span>
<span style="color: #eeeeec;">empty_string</span> = <span style="color: #ad7fa8; font-style: italic;">"\ "</span>

<span style="color: #eeeeec;">max_parallel_sections</span> = 4
</pre>

<p>You don't see the whole connection setup here: <code>base</code> was enough.  You might
need to set <code>host</code>, <code>port</code> and <code>user</code>, and maybe even <code>pass</code>, too, to be able to
connect to the PostgreSQL server.</p>

<p>The logging options allow you to choose a file in which to log all <code>pgloader</code>
messages, which are categorized as either <code>DEBUG</code>, <code>INFO</code>, <code>WARNING</code>, <code>ERROR</code> or
<code>CRITICAL</code>.  The options <code>log_min_messages</code> and <code>client_min_messages</code> are another
good idea stolen from <a href="http://www.postgresql.org/">PostgreSQL</a> and allow you to set the level of chatter
you want to see in the log file and on the interactive console (standard
output and standard error streams), respectively.</p>
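
<p>The same two-level idea is easy to reproduce with the standard python
<code>logging</code> module.  The sketch below only illustrates the principle and is
not taken from pgloader: the logger name, the <code>setup_logging</code> function and
the log file location are all made up for the example.</p>

```python
import logging
import os
import sys
import tempfile

def setup_logging(log_file, log_min_messages, client_min_messages):
    """Send messages to a file and to the console, each with its own level."""
    log = logging.getLogger("pgloader-sketch")
    log.setLevel(logging.DEBUG)        # let the handlers do the filtering
    to_file = logging.FileHandler(log_file)
    to_file.setLevel(getattr(logging, log_min_messages))
    to_console = logging.StreamHandler(sys.stderr)
    to_console.setLevel(getattr(logging, client_min_messages))
    log.addHandler(to_file)
    log.addHandler(to_console)
    return log

log_file = os.path.join(tempfile.gettempdir(), "pgloader-sketch.log")
log = setup_logging(log_file, "INFO", "WARNING")
log.debug("dropped by both handlers")
log.info("written to the file only")
log.warning("written to the file and shown on the console")
```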

<p>Please note that the <code>DEBUG</code> level will produce more than 3 times as much data
as the data file you're importing.  Unless you're a <code>pgloader</code> contributor, or
helping one to, well, <em>debug</em> it, you want to avoid setting the log chatter to
this level.</p>

<p>The <code>client_encoding</code> will be <a href="http://www.postgresql.org/docs/current/static/sql-set.html">SET</a> by <a href="http://tapoueh.org/pgsql/pgloader.html">pgloader</a> on the PostgreSQL connection it
establishes.  You can now even set any parameter you want by using the
<code>pg_option_parameter_name</code> magic settings.  Note that the command line option
<code>--pg-options</code> (or <code>-o</code> for brevity) allows you to override that.</p>
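
<p>To make the <code>pg_option_</code> convention concrete, here is a sketch that reads
a global section with the standard <code>configparser</code> module and builds one
<code>SET</code> command per magic setting.  It is a guess at the principle rather than
pgloader's own code: the <code>set_statements</code> function is hypothetical, and
quoting every value as a string literal is a choice made for this example.</p>

```python
import configparser

CONF = """
[pgsql]
base = pgloader
pg_option_client_encoding = 'utf-8'
pg_option_standard_conforming_strings = on
pg_option_work_mem = 128MB
"""

def set_statements(config, section="pgsql", prefix="pg_option_"):
    """Build one SET command per pg_option_* entry of the section."""
    statements = []
    for name, value in config.items(section):
        if name.startswith(prefix):
            # Drop quotes found in the file, then quote as a SQL string literal.
            literal = value.strip("'").replace("'", "''")
            statements.append("SET %s TO '%s'" % (name[len(prefix):], literal))
    return statements

config = configparser.ConfigParser()
config.read_string(CONF)
for statement in set_statements(config):
    print(statement)
```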

<p>Then, the <code>copy_every</code> parameter is set to <code>5</code> in the examples, because the test
files contain fewer than 10 lines and we want to test several <em>batches</em>
of commits when using them.  So for your real loading, stick to the default
value (<code>10 000</code> lines per <code>COPY</code> command), or more.  You can play with this
parameter: depending on the network (or local access) and disk system you're
using, you might see improvements by reducing it or enlarging it.  There's not
so much theory of operation here as empirical testing and tuning.  For a
one-off operation, just remove the line from the configuration.</p>
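
<p>What <code>copy_every</code> does to the input can be pictured as plain batching.
The generator below is a sketch with a made-up name, <code>batches</code>, in which each
yielded list stands for the rows sent in one <code>COPY</code> command.</p>

```python
from itertools import islice

def batches(rows, copy_every=10000):
    """Yield lists of at most copy_every rows, one list per COPY command."""
    rows = iter(rows)
    while True:
        batch = list(islice(rows, copy_every))
        if not batch:
            return
        yield batch

sizes = [len(batch) for batch in batches(range(12), copy_every=5)]
print(sizes)   # [5, 5, 2]
```

<p>With <code>copy_every = 5</code>, a test file of fewer than 10 lines is indeed
loaded in two batches.</p>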

<p>The parameters <code>null</code> and <code>empty_string</code> are related to interpreting the data in
the text or <code>csv</code> files you have, and the documentation is quite clear about
them.  Note that they can be set both globally and per section.</p>
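
<p>With the values used in the configuration above, the interpretation can
be sketched as follows.  The <code>interpret</code> function is hypothetical, and
testing for <code>null</code> before <code>empty_string</code> is an assumption of this example.</p>

```python
def interpret(field, null="", empty_string="\\ "):
    """Map a raw text field to None (SQL NULL), '' or the field itself."""
    if field == null:
        return None
    if field == empty_string:
        return ""
    return field

raw = ["4", "\\ ", "empty value", "5", "", "null value"]
print([interpret(field) for field in raw])
```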

<p>The last parameter of this example, <code>max_parallel_sections</code>, is detailed later
in the article.</p>


<h2>files section</h2>

<p class="first">After the <em>global</em> section come as many sections as you have files to load,
plus the <em>template</em> sections, which are only there so that you can share a
bunch of parameters between several sections.  Picture a series of data files
all of the same format, where the only thing that changes is the <code>filename</code>.
Use a template section in this case!</p>

<p>Let's see an example:</p>

<pre class="src">
[<span style="color: #8ae234; font-weight: bold;">simple_tmpl</span>]
<span style="color: #eeeeec;">template</span>     = True
<span style="color: #eeeeec;">format</span>       = text
<span style="color: #eeeeec;">datestyle</span>    = dmy
<span style="color: #eeeeec;">field_sep</span>    = |
<span style="color: #eeeeec;">trailing_sep</span> = True

[<span style="color: #8ae234; font-weight: bold;">simple</span>]
<span style="color: #eeeeec;">use_template</span>    = simple_tmpl
<span style="color: #eeeeec;">table</span>           = simple
<span style="color: #eeeeec;">filename</span>        = simple/simple.data
<span style="color: #eeeeec;">columns</span>         = a:1, b:3, c:2
<span style="color: #eeeeec;">skip_head_lines</span> = 2

<span style="color: #888a85;"># </span><span style="color: #888a85;">these reject settings are the default ones
</span><span style="color: #eeeeec;">reject_log</span>   = /tmp/simple.rej.log
<span style="color: #eeeeec;">reject_data</span>  = /tmp/simple.rej

[<span style="color: #8ae234; font-weight: bold;">partial</span>]
<span style="color: #eeeeec;">table</span>        = partial
<span style="color: #eeeeec;">format</span>       = text
<span style="color: #eeeeec;">filename</span>     = partial/partial.data
<span style="color: #eeeeec;">field_sep</span>    = %
<span style="color: #eeeeec;">columns</span>      = *
<span style="color: #eeeeec;">only_cols</span>    = 1-3, 5
</pre>

<p>That's 2 of the examples from the <a href="https://github.com/dimitri/pgloader/blob/master/examples/pgloader.conf">examples/pgloader.conf</a> file, in 3 sections
so that we see one template example.  Of course, with a single section
using it, the template is only here for the sake of the example.</p>
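
<p>One way to picture what a template does is as a set of defaults that the
section's own options override.  The sketch below uses the standard
<code>configparser</code> module on a shortened copy of the example.  The
<code>section_options</code> function is hypothetical, and the override rule is an
assumption about the behaviour, not pgloader's code.</p>

```python
import configparser

CONF = """
[simple_tmpl]
template     = True
format       = text
field_sep    = |
trailing_sep = True

[simple]
use_template = simple_tmpl
table        = simple
filename     = simple/simple.data
"""

def section_options(config, name):
    """Options of a section, with its template's options as defaults."""
    options = {}
    if config.has_option(name, "use_template"):
        options.update(config.items(config.get(name, "use_template")))
    options.update(config.items(name))
    for key in ("template", "use_template"):
        options.pop(key, None)     # bookkeeping entries, not loading options
    return options

config = configparser.ConfigParser()
config.read_string(CONF)
print(section_options(config, "simple"))
```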


<h2>data file format</h2>

<p class="first">The most important setting that you have to care about is the file format.
Your choice here is either <code>text</code>, <code>csv</code> or <code>fixed</code>.  Mostly, what we are given
nowadays is <code>csv</code>.  You might remember having read that the nice thing about
standards is that there are so many to choose from... well, <code>csv</code> land is
one where it's pretty hard to find two producers that understand the format
the same way.</p>

<p>So when you fail to have pgloader load your <em>mostly csv</em> files with a <code>csv</code>
setup, it's time to consider using <code>text</code> instead.  The <code>text</code> file format
accepts a lot of tunables to adapt to crazy situations, but it is implemented entirely in <code>python</code>
code, whereas the <a href="http://docs.python.org/library/csv.html">python csv module</a> is a C-coded module and so more efficient.</p>

<p>If you're wondering what kind of format we're talking about here, here's the
<a href="https://github.com/dimitri/pgloader/blob/master/examples/cluttered/cluttered.data">cluttered pgloader example</a> for your reading pleasure, using <code>^</code> (caret) as
the field separator:</p>

<pre class="src">
1^some multi\
line text with\
newline escaping^and some other data following^
2^and another line^clean^
3^and\
a last multiline\
escaped line
with a missing\
escaping^just to test^
4^\ ^empty value^
5^^null value^
6^multi line\
escaped value\
\
with empty line\
embeded^last line^
</pre>

<p>And here's what we get by loading that:</p>

<pre class="src">
pgloader/examples$ pgloader -c pgloader.conf -s cluttered
Table name        |    duration |    size |  copy rows |     errors
====================================================================
cluttered         |      0.193s |       - |          6 |          0

pgloader/examples$ psql pgloader -c <span style="color: #ad7fa8; font-style: italic;">"table cluttered;"</span>
 a |               b               |        c
---+-------------------------------+------------------
 1 | and some other data following | some multi
                                   : line text with
                                   : newline escaping
 2 | clean                         | and another line
 3 | just to test                  | and
                                   : a last multiline
                                   : escaped line
                                   : with a missing
                                   : escaping
 4 | empty value                   |
 5 | null value                    |
 6 | last line                     | multi line
                                   : escaped value
                                   :
                                   : with empty line
                                   : embeded
(6 rows)
</pre>

<p>So when you have this kind of data, well, it might be that <code>pgloader</code> is still
able to help you!</p>
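
<p>To get a feel for what reading such a file involves, here is a small
sketch that groups physical lines into rows by counting field separators and
then turns the escaped newlines into plain ones.  Counting separators is only a
guess at why the row with a missing escape still loads.  The function name
<code>logical_rows</code> is made up, and pgloader's real parser is certainly more
careful than this.</p>

```python
def logical_rows(lines, field_sep="^", columns=3):
    """Group physical lines into rows of `columns` fields (trailing separator)."""
    buffered = ""
    for line in lines:
        buffered += line
        if buffered.count(field_sep) == columns:
            fields = buffered.rstrip("\n").split(field_sep)[:columns]
            # An escaped newline becomes a plain newline inside the field.
            yield [field.replace("\\\n", "\n") for field in fields]
            buffered = ""

sample = [
    "1^some multi\\\n",
    "line text with\\\n",
    "newline escaping^and some other data following^\n",
    "2^and another line^clean^\n",
]
for row in logical_rows(sample):
    print(row)
```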

<p>Please refer to the <a href="http://pgloader.projects.postgresql.org/">pgloader man page</a> to learn about each and every parameter
that you can define, the values accepted, etc.  As for the <em>fixed</em> data format,
it is to be used when you're not given a field separator but field positions in
the file.  Yes, we still encounter those from time to time.  Who needs
variable size storage, after all?</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 29 Jul 2011 15:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/29-how-to-setup-pgloader.html</guid>
</item>
<item>
  <title>Emacs ANSI colors</title>
  <link>http://tapoueh.org/blog/2011/07/29-emacs-ansi-colors.html</link>
  <description><![CDATA[<p><a href="http://tapoueh.org/emacs/index.html">Emacs</a> comes with a pretty good implementation of a terminal emulator, <code>M-x
term</code>.  Well not that good actually, but given what I use it for, it's just
what I need.  Particularly if you add to that my <a href="http://tapoueh.org/emacs/cssh.html">cssh</a> tool, so that
connecting with <code>ssh</code> to a remote host is just a <code>C-=</code> away (that key runs the command
<code>cssh-term-remote-open</code>), and completes on the host name thanks to
<code>~/.ssh/known_hosts</code>.</p>

<p>Now, a problem that I still had to solve was the colors used in the
terminal.  As I'm using the <em>tango</em> color theme for emacs, the default <em>ANSI</em>
palette's blue color was not readable.  Here's how to fix that:</p>

<pre class="src">
   (<span style="color: #729fcf; font-weight: bold;">require</span> '<span style="color: #8ae234;">ansi-color</span>)
   (setq ansi-color-names-vector
         (vector (frame-parameter nil 'background-color)
               <span style="color: #ad7fa8; font-style: italic;">"#f57900"</span> <span style="color: #ad7fa8; font-style: italic;">"#8ae234"</span> <span style="color: #ad7fa8; font-style: italic;">"#edd400"</span> <span style="color: #ad7fa8; font-style: italic;">"#729fcf"</span>
               <span style="color: #ad7fa8; font-style: italic;">"#ad7fa8"</span> <span style="color: #ad7fa8; font-style: italic;">"cyan3"</span> <span style="color: #ad7fa8; font-style: italic;">"#eeeeec"</span>)
         ansi-term-color-vector ansi-color-names-vector
         ansi-color-map (ansi-color-make-color-map))
</pre>

<p>Now your colors in an emacs terminal are easy to read, as you can see:</p>

<blockquote>
<p class="quoted"><img src="../../../images/emacs-tango-term-colors.png" alt=""></p>
</blockquote>

<p>Hope you enjoy!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 29 Jul 2011 10:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/29-emacs-ansi-colors.html</guid>
</item>
<item>
  <title>How to Setup pgloader</title>
  <link>http://tapoueh.org/blog/2011/07/29-how-to-setup-pgloader.html</link>
  <description><![CDATA[<p>In a previous article we detailed <a href="http://tapoueh.org/blog/2011/07/22-how-to-use-pgloader.html">how to use pgloader</a>; let's now see how to
write the <code>pgloader.conf</code> file that tells <a href="http://tapoueh.org/pgsql/pgloader.html">pgloader</a> what to do.</p>

<p>This file is expected in the <code>INI</code> format, with a <em>global</em> section then one
section per file you want to import.  The <em>global</em> section defines some
default options and how to connect to the <a href="http://tapoueh.org/pgsql/index.html">PostgreSQL</a> server.</p>

<p>The configuration setup is fully documented on the <a href="http://pgloader.projects.postgresql.org/">pgloader man page</a>, which
you can easily find online.  Like all <em>unix</em> style man pages, though, it's
more a complete reference than introductory material.  Let's review.</p>

<h2>global section</h2>

<p class="first">Here's the <em>global</em> section of the <a href="https://github.com/dimitri/pgloader/blob/master/examples/pgloader.conf">examples/pgloader.conf</a> file from the source
tree.  Well, some options there are really <em>debugging</em> only options, so I changed
their values so that what you see here is a better starting point.</p>

<pre class="src">
[<span style="color: #8ae234; font-weight: bold;">pgsql</span>]
<span style="color: #eeeeec;">base</span> = pgloader

<span style="color: #eeeeec;">log_file</span>            = /tmp/pgloader.log
<span style="color: #eeeeec;">log_min_messages</span>    = INFO
<span style="color: #eeeeec;">client_min_messages</span> = WARNING

<span style="color: #eeeeec;">lc_messages</span>         = C
<span style="color: #eeeeec;">pg_option_client_encoding</span> = <span style="color: #ad7fa8; font-style: italic;">'utf-8'</span>
<span style="color: #eeeeec;">pg_option_standard_conforming_strings</span> = on
<span style="color: #eeeeec;">pg_option_work_mem</span> = 128MB

<span style="color: #eeeeec;">copy_every</span>      = 15000

<span style="color: #eeeeec;">null</span>         = <span style="color: #ad7fa8; font-style: italic;">""</span>
<span style="color: #eeeeec;">empty_string</span> = <span style="color: #ad7fa8; font-style: italic;">"\ "</span>

<span style="color: #eeeeec;">max_parallel_sections</span> = 4
</pre>

<p>You don't see the whole connection setup here: <code>base</code> was enough.  You might
need to set <code>host</code>, <code>port</code> and <code>user</code>, and maybe even <code>pass</code>, too, to be able to
connect to the PostgreSQL server.</p>

<p>The logging options allow you to choose the file where all <code>pgloader</code>
messages get logged.  Messages are categorized as either <code>DEBUG</code>, <code>INFO</code>, <code>WARNING</code>, <code>ERROR</code> or
<code>CRITICAL</code>.  The options <code>log_min_messages</code> and <code>client_min_messages</code> are another
good idea stolen from <a href="http://www.postgresql.org/">PostgreSQL</a>: they allow you to set the level of chatter
you want to see in the log file and on the interactive console (standard output and standard
error streams), respectively.</p>

<p>Please note that the <code>DEBUG</code> level will produce more than 3 times as much data
as the data file you're importing.  Unless you're a <code>pgloader</code> contributor, or
helping one to, well, <em>debug</em> it, you want to avoid setting the log chatter to
this level.</p>

<p>The <code>client_encoding</code> will be <a href="http://www.postgresql.org/docs/current/static/sql-set.html">SET</a> by <a href="http://tapoueh.org/pgsql/pgloader.html">pgloader</a> on the PostgreSQL connection it
establishes.  You can now even set any parameter you want by using the
<code>pg_option_parameter_name</code> magic settings.  Note that the command line option
<code>--pg-options</code> (or <code>-o</code> for brevity) allows you to override that.</p>
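
<p>As a rough illustration of the <code>pg_option_</code> convention, here is a minimal Python sketch that turns such settings into <code>SET</code> statements.  This is not pgloader's actual code: the helper names <code>pg_options_to_set</code> and <code>quote</code> are hypothetical, and the sketch only assumes that whatever follows the prefix is the PostgreSQL parameter name.</p>

```python
import configparser

PREFIX = "pg_option_"

def quote(value):
    """Single-quote a value unless the configuration already quoted it."""
    if value.startswith("'") and value.endswith("'"):
        return value
    return "'" + value.replace("'", "''") + "'"

def pg_options_to_set(section):
    """Return one SET statement per pg_option_* key (hypothetical helper)."""
    statements = []
    for key, value in section.items():
        if key.startswith(PREFIX):
            name = key[len(PREFIX):]
            statements.append("SET %s TO %s;" % (name, quote(value)))
    return statements

config = configparser.ConfigParser(interpolation=None)
config.read_string("""
[pgsql]
base = pgloader
pg_option_client_encoding = 'utf-8'
pg_option_standard_conforming_strings = on
pg_option_work_mem = 128MB
""")

statements = pg_options_to_set(config["pgsql"])
for statement in statements:
    print(statement)
```

<p>Quoting every value keeps the sketch simple, since PostgreSQL accepts quoted values for settings such as <code>work_mem</code>.</p>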

<p>Then, the <code>copy_every</code> parameter is set to <code>5</code> in the examples, because the test
files contain fewer than 10 lines and we want to test several <em>batches</em>
of commits when using them.  So for your real loading, stick to the default
value (<code>10 000</code> lines per <code>COPY</code> command), or more.  You can play with this
parameter: depending on the network (or local access) and disk system you're
using, you might see improvements by reducing it or enlarging it.  There's not
so much a theory of operation here as empirical testing and tuning.  For a
one-off operation, just remove the setting from the configuration.</p>
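
<p>To picture what <code>copy_every</code> controls, here is a small Python sketch that cuts an input into batches, one batch per <code>COPY</code> command.  The function name <code>batches</code> is an assumption made for the illustration; it does not claim to mirror how pgloader buffers its data internally.</p>

```python
from itertools import islice

def batches(lines, copy_every):
    """Yield lists of at most copy_every lines, one list per COPY command."""
    iterator = iter(lines)
    while True:
        batch = list(islice(iterator, copy_every))
        if not batch:
            return
        yield batch

# 12 rows with copy_every = 5, as in the test setup: several small batches
rows = ["row %d" % i for i in range(12)]
sizes = [len(batch) for batch in batches(rows, 5)]
print(sizes)
```

<p>With the value used in the examples, even a file of a dozen lines goes through several batches, which is exactly what the tests want to exercise.</p>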

<p>The parameters <code>null</code> and <code>empty_string</code> are related to interpreting the data in
the text or <code>csv</code> files you have, and the documentation is quite clear about
them.  Note that you can set them globally and per section too.</p>

<p>The last parameter of this example, <code>max_parallel_sections</code>, is detailed later
in the article.</p>


<h2>files section</h2>

<p class="first">After the <em>global</em> section come as many sections as you have files to load,
plus the <em>template</em> sections, which are only there so that you can share a
bunch of parameters between several sections.  Picture a series of data files
all of the same format, where the only thing that changes is the <code>filename</code>.
Use a template section in this case!</p>

<p>Let's see an example:</p>

<pre class="src">
[<span style="color: #8ae234; font-weight: bold;">simple_tmpl</span>]
<span style="color: #eeeeec;">template</span>     = True
<span style="color: #eeeeec;">format</span>       = text
<span style="color: #eeeeec;">datestyle</span>    = dmy
<span style="color: #eeeeec;">field_sep</span>    = |
<span style="color: #eeeeec;">trailing_sep</span> = True

[<span style="color: #8ae234; font-weight: bold;">simple</span>]
<span style="color: #eeeeec;">use_template</span>    = simple_tmpl
<span style="color: #eeeeec;">table</span>           = simple
<span style="color: #eeeeec;">filename</span>        = simple/simple.data
<span style="color: #eeeeec;">columns</span>         = a:1, b:3, c:2
<span style="color: #eeeeec;">skip_head_lines</span> = 2

<span style="color: #888a85;"># </span><span style="color: #888a85;">these reject settings are the default ones
</span><span style="color: #eeeeec;">reject_log</span>   = /tmp/simple.rej.log
<span style="color: #eeeeec;">reject_data</span>  = /tmp/simple.rej

[<span style="color: #8ae234; font-weight: bold;">partial</span>]
<span style="color: #eeeeec;">table</span>        = partial
<span style="color: #eeeeec;">format</span>       = text
<span style="color: #eeeeec;">filename</span>     = partial/partial.data
<span style="color: #eeeeec;">field_sep</span>    = %
<span style="color: #eeeeec;">columns</span>      = *
<span style="color: #eeeeec;">only_cols</span>    = 1-3, 5
</pre>

<p>That's 2 of the examples from the <a href="https://github.com/dimitri/pgloader/blob/master/examples/pgloader.conf">examples/pgloader.conf</a> file, in 3 sections
so that we see one template example.  Of course, with a single section
using it, the template is only here for the sake of the example.</p>
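
<p>The way a section inherits from its template can be sketched with Python's standard <code>configparser</code> module.  The <code>resolve</code> helper below is hypothetical; it assumes the simplest possible rule, where the section's own settings override those of the template.</p>

```python
import configparser

CONF = """
[simple_tmpl]
template     = True
format       = text
datestyle    = dmy
field_sep    = |
trailing_sep = True

[simple]
use_template = simple_tmpl
table        = simple
filename     = simple/simple.data
columns      = a:1, b:3, c:2
"""

def resolve(config, name):
    """Merge a section with its template, the section's own keys winning."""
    section = dict(config[name])
    template = section.pop("use_template", None)
    merged = {}
    if template is not None:
        merged.update(config[template])
        merged.pop("template", None)   # the marker itself is not a setting
    merged.update(section)
    return merged

# interpolation is disabled so that a separator such as % stays literal
config = configparser.ConfigParser(interpolation=None)
config.read_string(CONF)
settings = resolve(config, "simple")
print(settings["format"], settings["field_sep"], settings["table"])
```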

<h3>data file format</h3>

<p class="first">The most important setting that you have to care about is the file format.
Your choice here is either <code>text</code>, <code>csv</code> or <code>fixed</code>.  Mostly, what we are given
nowadays is <code>csv</code>.  You might remember having read that the nice thing about
standards is that there's so many to choose from... well, the <code>csv</code> land is
one where it's pretty hard to find different producers that understand it
the same way.</p>

<p>So when you fail to have pgloader load your <em>mostly csv</em> files with a <code>csv</code>
setup, it's time to consider using <code>text</code> instead.  The <code>text</code> file format
accepts a lot of tunables to adapt to crazy situations, but it is all <code>python</code>
code, whereas the <a href="http://docs.python.org/library/csv.html">python csv module</a> is a C-coded module and thus more efficient.</p>

<p>If you're wondering what kind of format we're talking about here, here's the
<a href="https://github.com/dimitri/pgloader/blob/master/examples/cluttered/cluttered.data">cluttered pgloader example</a> for your reading pleasure, using <code>^</code> (caret) as
the field separator:</p>

<pre class="src">
1^some multi\
line text with\
newline escaping^and some other data following^
2^and another line^clean^
3^and\
a last multiline\
escaped line
with a missing\
escaping^just to test^
4^\ ^empty value^
5^^null value^
6^multi line\
escaped value\
\
with empty line\
embeded^last line^
</pre>

<p>And here's what we get by loading that:</p>

<pre class="src">
pgloader/examples$ pgloader -c pgloader.conf -s cluttered
Table name        |    duration |    size |  copy rows |     errors
====================================================================
cluttered         |      0.193s |       - |          6 |          0

pgloader/examples$ psql pgloader -c <span style="color: #ad7fa8; font-style: italic;">"table cluttered;"</span>
 a |               b               |        c
---+-------------------------------+------------------
 1 | and some other data following | some multi
                                   : line text with
                                   : newline escaping
 2 | clean                         | and another line
 3 | just to test                  | and
                                   : a last multiline
                                   : escaped line
                                   : with a missing
                                   : escaping
 4 | empty value                   |
 5 | null value                    |
 6 | last line                     | multi line
                                   : escaped value
                                   :
                                   : with empty line
                                   : embeded
(6 rows)
</pre>

<p>So when you have this kind of data, well, it might be that <code>pgloader</code> is still
able to help you!</p>
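
<p>To give an idea of what reading such a file involves, here is a simplified Python sketch.  It is an assumption about one possible approach, not pgloader's implementation: the hypothetical <code>read_records</code> function keeps gluing physical lines together until it has seen one separator per column (the file uses a trailing separator), which also copes with the line where the escaping is missing.  It then applies the <code>null</code> and <code>empty_string</code> markers from the <em>global</em> section, and leaves column reordering aside.</p>

```python
DATA = r"""1^some multi\
line text with\
newline escaping^and some other data following^
2^and another line^clean^
3^and\
a last multiline\
escaped line
with a missing\
escaping^just to test^
4^\ ^empty value^
5^^null value^
"""

def convert(field, null, empty_string):
    """Map the null and empty-string markers to None and ''."""
    if field == null:
        return None
    if field == empty_string:
        return ""
    return field

def read_records(lines, field_sep="^", ncols=3, null="", empty_string="\\ "):
    """Rebuild logical records from physical lines (hypothetical sketch)."""
    pending = []
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\\"):
            line = line[:-1]              # drop the newline-escaping backslash
        pending.append(line)
        record = "\n".join(pending)
        if record.count(field_sep) != ncols:
            continue                      # record not complete yet
        fields = record.split(field_sep)[:ncols]
        yield [convert(field, null, empty_string) for field in fields]
        pending = []

records = list(read_records(DATA.splitlines()))
for record in records:
    print(record)
```

<p>The sketch ignores the case of a separator appearing inside a field, which real data will sooner or later contain.</p>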

<p>Please refer to the <a href="http://pgloader.projects.postgresql.org/">pgloader man page</a> to learn about each and every parameter
that you can define, the values accepted, etc.  The <em>fixed</em> data format
is to be used when you're not given a field separator but field positions in
the file.  Yes, we still encounter those from time to time.</p>
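
<p>For readers who have never met a <em>fixed</em> file, the idea fits in a few lines of Python.  The layout below (three fields given as start and length pairs) and the <code>split_fixed</code> name are made up for the illustration; see the man page for how pgloader actually expects the positions to be written.</p>

```python
def split_fixed(line, positions):
    """Cut a line into fields given (start, length) pairs, 0-based."""
    return [line[start:start + length].strip() for start, length in positions]

# hypothetical layout: a 4 character id, a 10 character name, an 8 digit date
positions = [(0, 4), (4, 10), (14, 8)]
line = "0001" + "Paris".ljust(10) + "20110729"
fields = split_fixed(line, positions)
print(fields)
```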



<h2>parallel processing</h2>

<h3>one reader, multiple workers</h3>


<h3>multiple workers, each reading</h3>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 29 Jul 2011 09:57:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/29-how-to-setup-pgloader.html</guid>
</item>
<item>
  <title>Next month partitions</title>
  <link>http://tapoueh.org/blog/2011/07/27-check-parts-for-next-month.html</link>
  <description><![CDATA[<p>When you partition your tables monthly, the question arises of when
to create the next partitions.  I tend to create them the week before the next
month begins, and I have some nice <a href="http://www.nagios.org/">nagios</a> scripts to alert me in case I've forgotten
to do so.  How do you check that by hand at the end of a month?</p>

<p>Here's a catalog query to help you there:</p>

<pre class="src">
=&gt; select *
-&gt;   from
-&gt;   (
(&gt;   select <span style="color: #ad7fa8; font-style: italic;">'previous parts'</span> as schemaname, count(*)::text as tablename
(&gt;     from pg_tables
(&gt;    where schemaname not in (<span style="color: #ad7fa8; font-style: italic;">'pg_catalog'</span>,<span style="color: #ad7fa8; font-style: italic;">'information_schema'</span>)
(&gt;      and tablename like to_char(now(), <span style="color: #ad7fa8; font-style: italic;">'%YYYYMM'</span>)
(&gt;
(&gt;   union
(&gt;
(&gt;   select schemaname, substring(tablename,1,length(tablename)-6) || <span style="color: #ad7fa8; font-style: italic;">'201108'</span>
(&gt;     from pg_tables
(&gt;    where schemaname not in (<span style="color: #ad7fa8; font-style: italic;">'pg_catalog'</span>,<span style="color: #ad7fa8; font-style: italic;">'information_schema'</span>)
(&gt;      and tablename like to_char(now(), <span style="color: #ad7fa8; font-style: italic;">'%YYYYMM'</span>)
(&gt;
(&gt;   except
(&gt;
(&gt;   select schemaname, tablename
(&gt;     from pg_tables
(&gt;    where schemaname not in (<span style="color: #ad7fa8; font-style: italic;">'pg_catalog'</span>,<span style="color: #ad7fa8; font-style: italic;">'information_schema'</span>)
(&gt;      and tablename like to_char(now() + interval <span style="color: #ad7fa8; font-style: italic;">'1 month'</span>, <span style="color: #ad7fa8; font-style: italic;">'%YYYYMM'</span>)
(&gt;   ) as t
-&gt; order by schemaname &lt;&gt; <span style="color: #ad7fa8; font-style: italic;">'previous parts'</span>, schemaname;
   schemaname   |       tablename
<span style="color: #888a85;">----------------+------------------------
</span> previous parts | 1
 central        | stats_entrantes_201108
(2 rows)
</pre>

<p>As you see, our partitions are named <code>_YYYYMM</code> so that it's easy to match
them in our queries, but I guess just about everyone does much the same here.
The <code>to_char</code> expressions are only there to avoid typing <code>'%201108'</code> by hand in
the query text.  And there's a trick so that we display how many partitions
we have this month, adding a line to the result...</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 27 Jul 2011 22:35:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/27-check-parts-for-next-month.html</guid>
</item>
<item>
  <title>How to Use pgloader</title>
  <link>http://tapoueh.org/blog/2011/07/22-comment-utiliser-pgloader.html</link>
  <description><![CDATA[<p>This question comes up regularly, and I thought I had
given a satisfying answer to it with <a href="https://github.com/dimitri/pgloader/tree/master/examples">the pgloader examples</a>. That document
reads a bit like a <em>tutorial</em>, in English, and I detailed it in
the article <a href="22-how-to-use-pgloader.html">how to use pgloader</a> on this same site, also in English. If there is
enough demand, I will translate it into French.</p>

<p>Meanwhile, happy reading!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 22 Jul 2011 13:48:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/22-comment-utiliser-pgloader.html</guid>
</item>
<item>
  <title>How To Use PgLoader</title>
  <link>http://tapoueh.org/blog/2011/07/22-how-to-use-pgloader.html</link>
  <description><![CDATA[<p>This question about <a href="../../../pgsql/pgloader.html">pgloader</a> usage comes up quite frequently, and I think the
examples <a href="https://github.com/dimitri/pgloader/tree/master/examples">README</a> goes a long way in answering it.  It's not exactly a
<em>tutorial</em> but it is almost there. Let me paste it here for reference:</p>

<h2>installing pgloader</h2>

<p class="first">Either use the <a href="http://packages.debian.org/source/pgloader">debian package</a> or the one for your distribution of choice if
you use another one.  RedHat, CentOS, FreeBSD, OpenBSD and some more already
include a binary package that you can use directly.</p>

<p>Or you could <code>git clone https://github.com/dimitri/pgloader.git</code> and go from
there.  As it's all <code>python</code> code, it runs fine interpreted from the source
directory, you don't <em>need</em> to install it in a special place in your system.</p>


<h2>setting up the test environment</h2>

<p class="first">To use the examples, please first create a <code>pgloader</code> database, then for each example
create the tables it needs, then issue the pgloader command:</p>

<pre class="src">
$ createdb --encoding=utf-8 pgloader
$ cd examples
$ psql pgloader &lt; simple/simple.sql
$ ../pgloader.py -Tvc pgloader.conf simple
</pre>

<p>If you want to load data from all examples, create tables for all of them
first, then run pgloader without argument.</p>


<h2>example description</h2>

<p class="first">The provided examples are:</p>

<ul>
<li>simple

<p>This dataset shows the basic case, with trailing separator and data
reordering.</p></li>

<li>xzero

<p>Same as simple but using \0 as the null marker.</p></li>

<li>errors

<p>Same test, but with impossible dates. Should report some errors. If it
does not report errors, check you're not using psycopg 1.1.21.</p>

<p>Should report 3 errors out of 7 lines (4 updates).</p></li>

<li>clob

<p>This dataset shows some text large object importing to PostgreSQL text
datatype.</p></li>

<li>cluttered

<p>A dataset with newline escaped and multi-line input (without quoting).
Beware of data reordering, too.</p></li>

<li>csv

<p>A dataset with csv delimiter ',' and quoting '&quot;'.</p></li>

<li>partial

<p>A dataset from which we only load some columns of the provided one.</p></li>

<li>serial

<p>In this dataset the id field is omitted, it's a serial which will be
automatically set by PostgreSQL while COPYing.</p></li>

<li>reformat

<p>A timestamp column is formatted the way MySQL dumps its timestamps,
which is not the same as the way PostgreSQL reads them. The
reformat.mysql module is used to reformat the data on-the-fly.</p></li>

<li>udc

<p>A user-defined column test, where not all file columns are used but
a new constant one, not found in the input datafile, is added while
loading data.</p></li>
</ul>
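
<p>As an illustration of what the <em>reformat</em> example is about, here is a Python sketch of a timestamp rewrite.  It assumes the dump uses the compact 14-digit <code>YYYYMMDDHHMMSS</code> form, and the <code>mysql_timestamp</code> function is written for this article; the real <code>reformat.mysql</code> module may well differ in name, signature and behaviour.</p>

```python
def mysql_timestamp(value):
    """Turn 'YYYYMMDDHHMMSS' into 'YYYY-MM-DD HH:MM:SS' (hypothetical helper)."""
    if len(value) != 14 or not value.isdigit():
        raise ValueError("unexpected timestamp: %r" % value)
    date = "-".join([value[0:4], value[4:6], value[6:8]])
    time = ":".join([value[8:10], value[10:12], value[12:14]])
    return date + " " + time

print(mysql_timestamp("20071119150718"))
```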


<h2>running the import</h2>

<p class="first">You can launch all those pgloader tests in one run, provided you created the
necessary tables:</p>

<pre class="src">
 $ for sql in */*sql; do psql pgloader &lt; $sql; done
 $ ../pgloader.py -Tsc pgloader.conf

  errors       WARNING  COPY error, trying to find on which line
  errors       WARNING  COPY data buffer saved in /tmp/errors.AhWvAv.pgloader
  errors       WARNING  COPY error recovery done (2/3) in 0.064s
  errors       WARNING  COPY error, trying to find on which line
  errors       WARNING  COPY data buffer saved in /tmp/errors.BclHtj.pgloader
  errors       WARNING  COPY error recovery done (1/1) in 0.054s
  errors       ERROR    3 errors found into [errors] data
  errors       ERROR    please read /tmp/errors.rej.log for errors log
  errors       ERROR    and /tmp/errors.rej for data still to process
  errors       ERROR    3 database errors occured
  reformat     WARNING  COPY error, trying to find on which line
  reformat     WARNING  COPY data buffer saved in /tmp/reformat.6P4WCD.pgloader
  reformat     WARNING  COPY error recovery done (1/4) in 0.034s
  reformat     ERROR    1 errors found into [reformat] data
  reformat     ERROR    please read /tmp/reformat.rej.log for errors log
  reformat     ERROR    and /tmp/reformat.rej for data still to process
  reformat     ERROR    1 database errors occured

  Table name        |    duration |    size |  copy rows |     errors
  ====================================================================
  allcols           |      0.025s |       - |          8 |          0
  clob              |      0.034s |       - |          7 |          0
  cluttered         |      0.061s |       - |          6 |          0
  csv               |      0.035s |       - |          6 |          0
  errors            |      0.113s |       - |          4 |          3
  fixed             |      0.045s |       - |          3 |          0
  partial           |      0.030s |       - |          7 |          0
  reformat          |      0.036s |       - |          4 |          1
  serial            |      0.029s |       - |          7 |          0
  simple            |      0.050s |       - |          7 |          0
  udc               |      0.020s |       - |          5 |          0
  ====================================================================
  Total             |      0.367s |       - |         64 |          4
</pre>

<p>Please note that the errors test should return 3 errors and the reformat test 1 error.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 22 Jul 2011 13:38:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/22-how-to-use-pgloader.html</guid>
</item>
<item>
  <title>Emacs Cheat Sheet</title>
  <link>http://tapoueh.org/blog/2011/07/20-emacs-cheat-sheet.html</link>
  <description><![CDATA[<p>I stumbled upon the following <em>cheat sheet</em> for <a href="http://www.gnu.org/software/emacs/">Emacs</a> yesterday, and it's
worth sharing.  I already learnt or discovered again some nice default
chords, like for example <code>C-x C-o runs the command delete-blank-lines</code> and
<code>C-M-o runs the command split-line</code>.  I guess I will use the latter one a lot.</p>

<center>
<p><a class="image-link" href="../../../images/emacs-cheat-sheet.png">
<img src="../../../images/emacs-cheat-sheet-tn.png"></a></p>
</center>

<p>Hope you'll like it!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 20 Jul 2011 10:44:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/20-emacs-cheat-sheet.html</guid>
</item>
<item>
  <title>Skytools3: the slides</title>
  <link>http://tapoueh.org/blog/2011/07/19-skytools3-slides.html</link>
  <description><![CDATA[<p>Now that the <a href="http://char11.org/">CHAR(11)</a> conference is over, it is customary to publish
the <em>slides</em> used.  I presented <a href="http://wiki.postgresql.org/wiki/SkyTools">Skytools</a> <code>3.0</code>, whose next version
will be released as soon as I have had the time to finish reviewing (in fact
mainly writing) the documentation.</p>

<center>
<p><a class="image-link" href="../../../images/skytools3.pdf">
<img src="../../../images/skytools3-0.png"></a></p>
</center>

<p>The <em>slides</em> of all the talks should eventually be published online,
but that won't happen as quickly as we would all
like.  So here's some reading while waiting for the rest!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 19 Jul 2011 14:39:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/19-skytools3-slides.html</guid>
</item>
<item>
  <title>Skytools3 talk Slides</title>
  <link>http://tapoueh.org/blog/2011/07/19-skytools3-talk-slides.html</link>
  <description><![CDATA[<p>In case you're wondering, here are the slides from the <a href="http://char11.org/">CHAR(11)</a> talk I gave,
about <a href="http://wiki.postgresql.org/wiki/SkyTools">Skytools</a> <code>3.0</code>, <em>soon</em> to be released.  That means as soon as I have
enough time available to polish (or write) the documentation.</p>

<center>
<p><a class="image-link" href="../../../images/skytools3.pdf">
<img src="../../../images/skytools3-0.png"></a></p>
</center>


<p>The slides for all the talks should eventually make their way to a central
place, but expect some noticeable delay here.  Sorry about that, and enjoy the
read in the meantime!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 19 Jul 2011 14:24:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/19-skytools3-talk-slides.html</guid>
</item>
<item>
  <title>Elisp Breadcrumbs</title>
  <link>http://tapoueh.org/blog/2011/07/14-elisp-breadcrumbs.html</link>
  <description><![CDATA[<p>A <a href="http://en.wikipedia.org/wiki/Breadcrumb_(navigation)">breadcrumb</a> is a navigation aid.  I just added one to this website, so that
it gets easier to browse from any article to its local and parent indexes
and back to <a href="../../../index.html">/dev/dim</a>, the root webpage of this site.</p>

<p>As it was not that much work to implement, here's the whole of it:</p>

<pre class="src">
<span style="color: #b22222;">;;;</span><span style="color: #b22222;">
</span><span style="color: #b22222;">;;; </span><span style="color: #b22222;">Breadcrumb support
</span><span style="color: #b22222;">;;;</span><span style="color: #b22222;">
</span>(<span style="color: #7f007f;">defun</span> <span style="color: #0000ff;">tapoueh-breadcrumb-to-current-page</span> ()
  <span style="color: #bc8f8f;">"Return a list of (name . link) from the index root page to current one"</span>
  (<span style="color: #7f007f;">let*</span> ((current (muse-current-file))
         (cwd     (file-name-directory current))
         (project (muse-project-of-file current))
         (root    (muse-style-element <span style="color: #da70d6;">:path</span> (caddr project)))
         (path    (tapoueh-path-to-root))
         (dirs    (split-string (file-relative-name current root) <span style="color: #bc8f8f;">"/"</span>)))
    <span style="color: #b22222;">;; </span><span style="color: #b22222;">("blog" "2011" "07" "13-back-from-char11.muse")
</span>    (append
     (list (cons <span style="color: #bc8f8f;">"/dev/dim"</span> (concat path <span style="color: #bc8f8f;">"index.html"</span>)))
     (<span style="color: #7f007f;">loop</span> for p in (butlast dirs)
           collect (cons p (format <span style="color: #bc8f8f;">"%s%s/index.html"</span> path p))
           do (setq path (concat path p <span style="color: #bc8f8f;">"/"</span>))))))

(<span style="color: #7f007f;">defun</span> <span style="color: #0000ff;">tapoueh-insert-breadcrumb-div</span> ()
  <span style="color: #bc8f8f;">"The real HTML inserting"</span>
  (insert <span style="color: #bc8f8f;">"&lt;div id=\"breadcrumb\"&gt;"</span>)
  (<span style="color: #7f007f;">loop</span> for (name . link) in (tapoueh-breadcrumb-to-current-page)
        do (insert (format <span style="color: #bc8f8f;">"&lt;a href=%s&gt;%s&lt;/a&gt;"</span> link name) <span style="color: #bc8f8f;">" / "</span>))
  (insert <span style="color: #bc8f8f;">"&lt;/div&gt;\n"</span>))

(<span style="color: #7f007f;">defun</span> <span style="color: #0000ff;">tapoueh-insert-breadcrumb</span> ()
  <span style="color: #bc8f8f;">"Must run with current buffer being a muse article"</span>
  (<span style="color: #7f007f;">save-excursion</span>
    (goto-char (point-min))
    (<span style="color: #7f007f;">when</span> (tapoueh-extract-directive <span style="color: #bc8f8f;">"author"</span> (muse-current-file))
      (re-search-forward <span style="color: #bc8f8f;">"&lt;body&gt;"</span> nil t) <span style="color: #b22222;">; </span><span style="color: #b22222;">find where the article content is
</span>      (re-search-forward <span style="color: #bc8f8f;">"&lt;h2&gt;"</span> nil t)   <span style="color: #b22222;">; </span><span style="color: #b22222;">that's the title line
</span>      (beginning-of-line)
      (open-line 1)
      (tapoueh-insert-breadcrumb-div)

      (re-search-forward <span style="color: #bc8f8f;">"&lt;h2&gt;"</span> nil t 2) <span style="color: #b22222;">; </span><span style="color: #b22222;">that's the TAG line
</span>      (beginning-of-line)
      (open-line 1)
      (tapoueh-insert-breadcrumb-div))))
</pre>

<p>This code is now called in the <code>:after</code> function of my <a href="http://www.emacswiki.org/emacs/EmacsMuse">Muse</a> project style, and
it gets the work done.</p>
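
<p>For readers less at ease with Emacs Lisp, the walk done by <code>tapoueh-breadcrumb-to-current-page</code> can be sketched in Python.  This is only a transcription of the loop above under the assumption that the relative file name and the path back to the root are already known; the <code>breadcrumb</code> function does not exist anywhere in the site's code.</p>

```python
def breadcrumb(relative_file, path_to_root):
    """Return (name, link) pairs from the site root down to the file's directory."""
    dirs = relative_file.split("/")
    crumbs = [("/dev/dim", path_to_root + "index.html")]
    path = path_to_root
    for name in dirs[:-1]:               # skip the file name itself
        crumbs.append((name, "%s%s/index.html" % (path, name)))
        path = path + name + "/"         # descend one directory level
    return crumbs

crumbs = breadcrumb("blog/2011/07/13-back-from-char11.muse", "../../../")
for name, link in crumbs:
    print(name, link)
```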
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 14 Jul 2011 18:44:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/14-elisp-breadcrumbs.html</guid>
</item>
<item>
  <title>Back from CHAR(11)</title>
  <link>http://tapoueh.org/blog/2011/07/13-de-retour-de-char11.html</link>
  <description><![CDATA[
<div id="article">
<p>What better occupation on the train back from <a href="http://char11.org/schedule">CHAR(11)</a> than
playing reporter for the occasion?  Actually, sleeping would be an idea, given
how late the evenings went!</p>

<p>We had the pleasure of listening to <strong><em>Jan Wieck</em></strong> present a simplified
history of replication with <a href="http://www.postgresql.org/">PostgreSQL</a>.  As he is himself one of the
pioneers of the field, his point of view is most interesting.  He
talked about the evolution of replication solutions, and I can't help
thinking that in many ways <a href="http://wiki.postgresql.org/wiki/SKytools">Skytools</a> is an evolution of <a href="http://slony.info/">Slony</a>; Jan,
the author of Slony, seemed to agree with that.</p>

<p>Indeed, Skytools was born from limitations of Slony.  Some of them
still exist, such as the lack of separation between the <strong><em>queuing</em></strong> layer
and the replication layer itself, and some have been solved since,
such as the difficulty of sustaining heavy write loads.  The
two solutions even share part of their implementation, since
PostgreSQL 8.3, with the <code>txid</code> and <a href="http://www.postgresql.org/docs/8.3/interactive/functions-info.html#FUNCTIONS-TXID-SNAPSHOT">txid_snapshot</a> data types.  Of course,
the goal of Skytools is to be the simplest possible solution,
perfectly suited to a precise and bounded set of use cases,
whereas Slony tries to automatically solve the hardest problems
of the field, at the price of a very complex interface.</p>

<p>Of course, <strong><em>Jan</em></strong> took the time to objectively compare these
replication solutions with the one integrated into PostgreSQL, <em>Streaming Replication</em>
and <em>Hot Standby</em>.  We already had asynchronous binary replication;
PostgreSQL 9.1 brings us synchronous replication with per-transaction
control.  <a href="http://database-explorer.blogspot.com/">Simon Riggs</a>, author of the feature, insisted on
the innovation this represents: no other project allows controlling
the data durability guarantee with such flexible and
precise granularity!</p>

<p><a href="http://projects.2ndquadrant.com/repmgr">repmgr</a> is an administration solution for <em>clusters</em> running <em>Hot Standby</em>
and <em>Streaming Replication</em> (synchronous or not).  How it works was
detailed by <strong><em>Greg Smith</em></strong> and <strong><em>Cédric Villemain</em></strong>.  The former showed how
to build an architecture that spreads the read
load, and the latter how to get a fault-tolerant system thanks
to the automatic <em>failover</em> built into repmgr. This innovative solution was
developed in large part by 2ndQuadrant France, and we have already
stamped it <em>production ready</em>.</p>

<p><strong><em><a href="http://www.hagander.net/">Magnus Hagander</a></em></strong> has worked a lot on the <em>streaming</em> protocol used
for the replication integrated into PostgreSQL 9.1, as well as on the tools
that use this protocol.  He naturally presented that, and the idea
of a <em>proxy</em> relaying the binary stream of transaction logs came back
in the discussions (we had already considered this in 2010, and the
English article <a href="../../2010/05/27-back-from-pgcon2010.html">Back from PgCon2010</a> contains some elements on the subject).  With
synchronous replication, it becomes possible to design advanced,
robust and versatile architectures: the proxy could now take care of
both the archives and the <em>standby</em> servers.</p>

<p><a href="http://database-explorer.blogspot.com/">Simon Riggs</a> then offered us a retrospective of the last 7 years
of work he has done with PostgreSQL, from the implementation of <em>Point in
Time Recovery</em> to synchronous replication, by way of <em>Hot Standby</em>.  What
we have in PostgreSQL 9.0 already matches the most advanced thing Oracle
offers in terms of data durability, and 9.1 takes
the next step.  That does not slow down <strong><em>Simon</em></strong>, who was already talking about
projects for the next 10 years.</p>

<p>Finally, <a href="http://www.heroku.com/">Heroku</a> presented their incredible business.  They
now have more than <code>150 000</code> PostgreSQL instances in production,
demonstrating that our favorite <code>RDBMS</code> is ready for hosting providers. <strong><em>Heroku</em></strong> is
designing and building a ready-to-use solution for the
famous <em>Cloud</em> that is so hard to define.  Here, it's about being able
to add new read-only replicas on the fly to absorb
traffic peaks, create development instances in one click, etc.</p>

<p>This article only covers a small selection of the topics addressed at the
conference; I'm counting on <a href="http://blog.guillaume.lelarge.info/">Guillaume</a> to also tell you about <a href="http://char11.org/schedule">CHAR(11)</a>,
but you might have to wait for his return from the <a href="http://2011.rmll.info/">RMLL</a> (what energy!).</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresqlfr.html">PostgreSQLFr</a> <a href="../../../tags/conferences.html">Conferences</a> <a href="../../../tags/skytools.html">Skytools</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 13 Jul 2011 17:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/13-de-retour-de-char11.html</guid>
</item>
<item>
  <title>Back From CHAR(11)</title>
  <link>http://tapoueh.org/blog/2011/07/13-back-from-char11.html</link>
  <description><![CDATA[<h1>Back From CHAR(11)</h1>
<div class="date">Wednesday, July 13 2011, 17:15</div>
<div id="article">
<p><a href="http://char11.org/schedule">CHAR(11)</a> ended sometime last night, if you consider
the <em>social events</em> to be part of it, which I definitely do.  This conference
has been a very good one, both on the organisation side of things and of
course for its content.</p>

<p>It began with a perspective on the evolution of replication solutions, by
<strong><em>Jan Wieck</em></strong> himself.  In some ways <a href="http://wiki.postgresql.org/wiki/SKytools">Skytools</a> is an evolution of <a href="http://slony.info/">Slony</a>, in the
sense that it reuses the same concepts and part of the design, and even shares
bits of the implementation (like the <a href="http://www.postgresql.org/docs/8.3/interactive/functions-info.html#FUNCTIONS-TXID-SNAPSHOT">txid_snapshot</a> datatype that was added
in PostgreSQL 8.3).  The evolution occurred in choosing a subset of the
features of Slony and then simplifying the user interface as much as
possible.  And with Skytools 3.0, those features that were removed but are
still useful to solve real-life problems are now available too.</p>

<p>Of course the talk also covered the other replication solutions (not just
the trigger-based ones), and compared <a href="http://wiki.postgresql.org/wiki/Setting_up_RServ_with_PostgreSQL_7.0.3">RServ</a> to <a href="http://bucardo.org/">Bucardo</a>, for example.  Then
all of those were compared to the <a href="http://www.postgresql.org/">PostgreSQL</a> core replication facilities,
which are quite a different animal.  It was a really nice <em>keynote</em>,
preparing the audience's minds to make the most of all the other talks.</p>

<p>I will not review all the talks in detail, as I'm pretty sure some other
attendees will turn into reporters themselves: scaling the write load!</p>

<p>Still, <a href="http://projects.2ndquadrant.com/repmgr">repmgr</a> got its share of attention.  <a href="http://www.2ndquadrant.com/books/postgresql-9-0-high-performance/">Greg Smith</a> and <a href="http://www.2ndquadrant.fr/">Cédric Villemain</a>
presented how to do both <strong>read scaling</strong> and <strong>auto failover</strong> management with
this tool, going into fine detail about how it works internally and how to
best design your services architecture for maximum <strong>data availability</strong>.  The
question and answer session stressed that you cannot have
data availability with fewer than 3 production nodes.</p>

<p><a href="http://www.hagander.net/">Magnus Hagander</a> detailed how flexible the core protocol support for
replication (and streaming) really is.  That flexibility means that you can
quite easily speak this protocol from any application, and the idea of a <em>wal
proxy</em> popped up again (see the <a href="../../2010/05/27-back-from-pgcon2010.html">Back from PgCon2010</a> article for my first
mention of the idea).  The main difference is that we now have
<em>synchronous replication</em> support, so that the proxy could be trusted both for
archiving and serving standbys.</p>

<p>Of course <a href="http://database-explorer.blogspot.com/">Simon</a> still has lots of ideas about the next 10 years of
replication-oriented projects for core PostgreSQL code, and his talk nicely
summarized the previous 7 years.  The future is bright, and guess what, it's
beginning today!</p>

<p>We also heard about <a href="http://www.heroku.com/">Heroku</a>, and these guys are doing crazy impressive
things.  Like running <code>150 000</code> PostgreSQL instances, for example, showing
that you can actually use our preferred database server in the hosting
business.  I expect that the maturing solutions and tool sets providing data
availability are soon to be a game changer here.  What they are doing is
designing a <strong>flexible data architecture</strong> with strong guarantees (<strong>no data
loss</strong>).  The <em>cloud elasticity</em> is reaching out from the stateless services,
and <em>those guys</em> are making it happen now.</p>

<p>May you live in interesting times!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/skytools.html">skytools</a> <a href="../../../tags/conferences.html">Conferences</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 13 Jul 2011 17:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/13-back-from-char11.html</guid>
</item>
<item>
  <title>Muse setup revised</title>
  <link>http://tapoueh.org/blog/2011/07/blog/2011/07/05-muse-setup-revised.html</link>
  <description><![CDATA[<p>Most of you are probably reading my posts directly in your <code>RSS</code> reader
(mine is <a href="http://www.gnus.org/">gnus</a> thanks to the <a href="http://gwene.org/">Gwene</a> service), so you probably missed it, but I
just <em>pushed</em> a whole new version of <a href="http://tapoueh.org">my website</a>, still using <a href="https://github.com/alexott/muse">Emacs Muse</a> as the
engine.</p>

<p>My setup is tentatively called <a href="../../../tapoueh.el.html">tapoueh.el</a> and browsable online.  It consists
of some tweaks on top of Muse, so that I can enjoy <a href="../../../tags/index.html">tags</a> and proper <a href="../../../rss/">rss</a>
support.  By <em>proper</em>, I mean that I want to be able to produce as many <em>topic</em>
<code>RSS</code> <em>feeds</em> as I like from a single <em>blog</em>, and thanks to the <em>tags</em> support
that's now what I have.</p>

<p>The <code>RSS</code> handling and the tagging system are ad hoc code, and this very
article begins like this:</p>

<pre class="src">
#author Dimitri Fontaine
#title <span style="font-size: 140%; font-weight: bold;"> Muse setup revised</span>
#date   20110705-19:55
#tags   Emacs Muse
</pre>

<p>All the information for the site navigation is taken from there, and at
long last the <code>RSS</code> I publish now contains proper <code>URLs</code> without abusing
<a href="../../../blog.dim.html">anchors</a>, as in the previous link, which is a compatibility page in case you
had some bookmarks.  The compatibility page only works with javascript (did
you know that <em>anchors</em> are not part of the <code>URL</code> that is sent to the server, so
that you can't apply <code>RedirectMatch</code> or other tweaks?), but all it needs is
<em>2 lines of code</em>, so I guess that's not so bad.</p>

<pre class="src">
<span style="color: #7f007f;">var</span> <span style="color: #b8860b;">anchor</span> = window.location.hash;
document.location.href=document.getElementById(anchor).href;
</pre>

<p>I hope you like the new setup as much as I do, even if I'm left with some
debugging to do.  That's the price to pay for doing it yourself, I guess.
But I still don't know of a ready-to-use solution (as in <em>off the shelf</em>) that
meets my criteria for web publishing.  More on that topic another time.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 05 Jul 2011 19:55:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/blog/2011/07/05-muse-setup-revised.html</guid>
</item>
<item>
  <title>Ready for CHAR(11)?</title>
  <link>http://tapoueh.org/blog/2011/07/04-pret-pour-char11.html</link>
  <description><![CDATA[<h1>Ready for CHAR(11)?</h1>
<div class="date">Monday, July 04 2011, 20:15</div>
<div id="article">
<p>Next week <strong>already</strong>, <a href="http://www.char11.org/">CHAR(11)</a> takes place: the conference dedicated to
<em>Clustering</em>, <em>High Availability</em> and <em>Replication</em> with <a href="http://www.postgresql.org/">PostgreSQL</a>.
It is in Europe, in Cambridge this time, and it is in English even though
several compatriots will be in the audience.</p>

<p>If you have not yet had a look at the <a href="http://www.char11.org/schedule">schedule</a>, I encourage you to do so,
even if you were not planning to come, because there is enough in it to
change your mind!</p>

<p>It is already hard to follow the <a href="http://archives.postgresql.org/">PostgreSQL mailing lists</a> in English, simply
for lack of time, but sometimes the language barrier plays a part too. So in
case you have not followed closely, let me introduce the main speakers at
this conference.</p>

<p><strong><em>Jan Wieck</em></strong> gives the first talk, a retrospective of replication solutions
for PostgreSQL. He started <a href="http://slony.info/">Slony</a> and remains very active in its
architecture and development.</p>

<p><strong><em>Greg Smith</em></strong>, a colleague at <a href="http://www.2ndquadrant.us/">2ndQuadrant</a>, is Mr. "low-level" performance:
his specialty is getting the most out of your hardware, your server
configuration, PostgreSQL itself, and the queries you send it. His book
<a href="http://www.2ndquadrant.com/books/postgresql-9-0-high-performance/">PostgreSQL High Performance</a> is a must-read, and as such has been
<a href="http://blog.guillaume.lelarge.info/index.php/post/2011/05/01/%C2%AB-Bases-de-donn%C3%A9es-PostgreSQL&#44;-Gestion-des-performances-%C2%BB">translated into French</a>.</p>

<p>Next we have <strong><em>Magnus Hagander</em></strong>, who recently joined the <em>core team</em> (the
project's steering group) and who has been contributing to the PostgreSQL
code for more than 10 years.</p>

<p><strong><em>Simon Riggs</em></strong>, also one of <a href="http://www.2ndquadrant.com/about/#riggs">our colleagues</a>, implemented <em>PITR</em>, transaction
log archiving, asynchronous replication and, for the next version of
PostgreSQL, synchronous replication.</p>

<p><strong><em>Hannu Krosing</em></strong> (guess <a href="http://www.2ndquadrant.com/">where</a> he works?) designed the architecture (and the
tools) that let <a href="http://www.skype.com/">Skype</a> claim infinite scalability, or at least announce
support for up to <a href="http://highscalability.com/skype-plans-postgresql-scale-1-billion-users">1 billion users</a>.</p>

<p><strong><em>Koichi Suzuki</em></strong> leads the work on the promising <a href="http://postgres-xc.sourceforge.net/">Postgres-XC</a> product, a fine
example of collaboration between different market players, here
<a href="http://www.enterprisedb.com/">EnterpriseDB</a> and <a href="https://www.oss.ecl.ntt.co.jp/ossc/">NTT Open Source Software Center</a>. This shows once more
that <a href="http://fr.wikipedia.org/wiki/Open_source">Open Source</a> is firmly rooted in commercial companies.</p>

<p>Of course Cédric and I, from the French branch of <a href="http://www.2ndquadrant.fr/">2ndQuadrant</a>, will be
there too. We will speak about topics we know well, having taken part in
their development and deploying and maintaining them in production:
<a href="http://projects.2ndquadrant.com/repmgr">repmgr</a> and <a href="http://wiki.postgresql.org/wiki/Londiste_Tutorial">Londiste</a>.</p>

<p>And I am skipping other speakers whose topics will be no less interesting.
In short, if <em>replication</em> and <em>clustering</em> are themes you want to combine
with PostgreSQL, this is the place to spend the beginning of next week!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresqlfr.html">PostgreSQLfr</a> <a href="../../../tags/conferences.html">Conferences</a> <a href="../../../tags/skytools.html">skytools</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 04 Jul 2011 20:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/07/04-pret-pour-char11.html</guid>
</item>




<item>
  <title>Multi-Version support for Extensions</title>
  <link>http://tapoueh.org/blog/2011/06/29-multi-version-support-for-extensions.html</link>
  <description><![CDATA[<h1>Multi-Version support for Extensions</h1>
<div class="date">Wednesday, June 29 2011, 09:50</div>
<div id="article">
<p>We still have this problem to solve with extensions and their packaging.
How to best organize things so that your extension is compatible both with
releases of <a href="http://www.postgresql.org/">PostgreSQL</a> before <code>9.1</code> and with <code>9.1</code> and later?</p>

<p>Well, I had to do it for the <a href="http://pgfoundry.org/projects/ip4r/">ip4r</a> contribution, and I wanted the following
to happen:</p>

<pre class="src">
dpkg-deb: building package `postgresql-8.3-ip4r' ...
dpkg-deb: building package `postgresql-8.4-ip4r' ...
dpkg-deb: building package `postgresql-9.0-ip4r' ...
dpkg-deb: building package `postgresql-9.1-ip4r' ...
</pre>

<p>And here's a simple enough way to achieve that.  First, you have to get your
packaging ready the usual way, and install the build dependencies.  Then,
because <code>/usr/share/postgresql-common/supported-versions</code> from the
latest <code>postgresql-common</code> package will only return <code>8.3</code> in <code>lenny</code> (yes, I'm
doing some <em>backporting</em> here), we have to tweak it.</p>

<pre class="src">
postgresql-server-dev-8.4
postgresql-server-dev-9.0
postgresql-server-dev-9.1
postgresql-server-dev-all

$ sudo dpkg-divert \
--divert /usr/share/postgresql-common/supported-versions.distrib \
--rename /usr/share/postgresql-common/supported-versions

$ cat /usr/share/postgresql-common/supported-versions
#! /bin/bash

dpkg -l postgresql-server-dev-* \
| awk -F '[ -]' '/^ii/ &amp;&amp; ! /server-dev-all/ {print $6}'
</pre>
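
<p>To see what the <code>awk</code> filter in that replacement script keeps, here is a
minimal sketch that runs the same field splitting on canned input instead of
a live <code>dpkg</code> database.  The <code>list_dev_versions</code> name and the sample lines
(package versions included) are made up for the illustration, and the
<code>server-dev-all</code> test is written as an <code>if</code> only to avoid escaping
ampersands here.</p>

<pre class="src">
#! /bin/bash

# Hypothetical helper: same filter as the script above, reading "dpkg -l"
# style lines from stdin so that it can be tried without any package installed.
list_dev_versions() {
    awk -F '[ -]' '/^ii/ { if ($0 !~ /server-dev-all/) print $6 }'
}

# Made-up lines shaped like "dpkg -l" output: status, two spaces, name, version.
printf '%s\n' \
    'ii  postgresql-server-dev-8.4  8.4.8-1  development files' \
    'ii  postgresql-server-dev-9.1  9.1~beta2-1  development files' \
    'ii  postgresql-server-dev-all  115  extension build tool' \
    'un  postgresql-server-dev-9.0  (none)  (no description)' \
    | list_dev_versions
</pre>

<p>With a field separator of one space or one hyphen, the version is the
sixth field, so this prints <code>8.4</code> and <code>9.1</code> and skips both the <code>-all</code>
package and the one that is not installed.</p>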

<p>Now we are allowed to build our extension for all those versions, so we add
<code>9.1</code> to the <code>debian/pgversions</code> file.  And <code>debuild</code> will do the right thing now,
thanks to <a href="http://manpages.debian.net/cgi-bin/man.cgi?query=pg_buildext">pg_buildext</a> from <a href="http://packages.debian.org/sid/postgresql-server-dev-all">postgresql-server-dev-all</a>.</p>

<p>The problem we face is that what gets built is not an <a href="http://www.postgresql.org/docs/9.1/static/extend-extensions.html">extension</a> in the <code>9.1</code> sense, so
things like <code>\dx</code> in <code>psql</code> and <a href="http://www.postgresql.org/docs/9.1/static/sql-createextension.html">CREATE EXTENSION</a> will not work out of the box.
First, we need a control file.  Then we need to remove the transaction
control from the install script (here, <code>ip4r.sql</code>), and finally, this script
needs to be called <code>ip4r--1.05.sql</code>.  Here's how I did it:</p>

<pre class="src">
$ cat ip4r.control
comment = 'IPv4 and IPv4 range index types'
default_version = '1.05'
relocatable = yes

$ cat debian/postgresql-9.1-ip4r.install
debian/ip4r-9.1/ip4r.so usr/lib/postgresql/9.1/lib
ip4r.control usr/share/postgresql/9.1/extension
debian/ip4r-9.1/ip4r.sql usr/share/postgresql/9.1/extension

$ cat debian/postgresql-9.1-ip4r.links
usr/share/postgresql/9.1/extension/ip4r.sql usr/share/postgresql/9.1/extension/ip4r--1.05.sql
</pre>

<p>Be careful not to forget to remove any and all <code>BEGIN;</code> and <code>COMMIT;</code> lines from
the <code>ip4r.sql</code> file, which meant that I also removed support for <em>Rtree</em>, which
is not relevant for modern versions of PostgreSQL saith the script (post
<code>8.2</code>).  That means I'm not publishing this very work yet, but I wanted to
share the <code>debian/postgresql-9.1-extension.links</code> idea.</p>

<p>Notice that I didn't change anything about the <code>.sql.in</code> make rule, so I
didn't have to use the support for <code>module_pathname</code> in the control file.</p>
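
<p>For comparison, a control file that did rely on it might look like the
sketch below.  This is an assumption about an alternative setup, not what
the packaging above ships: the install script would then refer to the
library as <code>MODULE_PATHNAME</code> and let <code>CREATE EXTENSION</code> substitute the
value given here.</p>

<pre class="src">
$ cat ip4r.control
comment = 'IPv4 and IPv4 range index types'
default_version = '1.05'
# hypothetical addition, not used in the packaging above
module_pathname = '$libdir/ip4r'
relocatable = yes
</pre>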

<p>Now, after the usual <code>debuild</code> step, I can just <code>sudo debi</code> to install all the
just-built packages and <code>CREATE EXTENSION</code> will run fine.  And in <code>9.0</code> you get
the old way to install it, but it still works:</p>

<pre class="src">
$ psql -U postgres --cluster 9.0/main -1 \
-f /usr/share/postgresql/9.0/contrib/ip4r.sql
&lt;lots of chatter&gt;

$ psql -U postgres --cluster 9.1/main -c 'create extension ip4r;'
CREATE EXTENSION
</pre>

<p>That's it :)</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/extensions.html">Extensions</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/ip4r.html">ip4r</a> <a href="../../../tags/9.1.html">9.1</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 29 Jun 2011 09:50:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/06/29-multi-version-support-for-extensions.html</guid>
</item>
<item>
  <title>Don't be afraid of 'cl</title>
  <link>http://tapoueh.org/blog/2011/06/20-dont-be-afraid-of-cl.html</link>
  <description><![CDATA[<h1>Don't be afraid of 'cl</h1>
<div class="date">Monday, June 20 2011, 00:15</div>
<div id="article">
<p>In this <a href="http://tsengf.blogspot.com/2011/06/confirm-to-quit-when-editing-files-from.html">blog article</a>, you're shown quite a long function that loops through
your buffers to find out if any of them is associated with a file whose full
name includes <code>&quot;projects&quot;</code>.  Well, you should not be afraid of using <code>cl</code>:</p>

<pre class="src">
(<span style="color: #729fcf; font-weight: bold;">require</span> '<span style="color: #8ae234;">cl</span>)
(<span style="color: #729fcf; font-weight: bold;">loop</span> for b being the buffers
      when (string-match <span style="color: #ad7fa8; font-style: italic;">"projects"</span> (or (buffer-file-name b) <span style="color: #ad7fa8; font-style: italic;">""</span>))
      return t)
</pre>

<p>If you want to collect the list of buffers whose name matches your test,
then replace <code>return t</code> by <code>collect b</code> and you're done.  Really, this <code>loop</code> thing
is worth learning.</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 20 Jun 2011 00:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/06/20-dont-be-afraid-of-cl.html</guid>
</item>






<item>
  <title>Back from Ottawa, preparing for Cambridge</title>
  <link>http://tapoueh.org/blog/2011/05/30-back-from-ottawa-preparing-for-cambridge.html</link>
  <description><![CDATA[<h1>Back from Ottawa, preparing for Cambridge</h1>
<div class="date">Monday, May 30 2011, 11:00</div>
<div id="article">
<p>While <a href="http://blog.hagander.net/">Magnus</a> is all about <a href="http://2011.pgconf.eu/">PG Conf EU</a> already, you have to realize we have
only just landed back from <a href="http://www.pgcon.org/2011/">PG Con</a> in Ottawa.  My next stop in the annual conferences
is <a href="http://char11.org/">CHAR 11</a>, the <em>Clustering, High Availability and Replication</em> conference in
Cambridge, 11-12 July.  Yes, on the old continent this time.</p>

<p>This year's <em>pgcon</em> hot topics, for me, have been centered around getting a
better grasp of <a href="http://www.postgresql.org/docs/9.1/static/transaction-iso.html#XACT-SERIALIZABLE">SSI</a> and <em>DDL Triggers</em>.  Having those beasts in <a href="http://www.postgresql.org/">PostgreSQL</a> would
allow for auditing, finer privilege management and some more automated
replication facilities.  Imagine that <code>ALTER TABLE</code> is able to fire a <em>trigger</em>,
provided by <em>Londiste</em> or <em>Slony</em>, that will do what's needed on the cluster by
itself.  That would be awesome, wouldn't it?</p>

<p>At <em>CHAR 11</em> I'll be talking about <a href="http://wiki.postgresql.org/wiki/SkyTools">Skytools 3</a>.  You know I've been working on
its <em>debian</em> packaging; now is the time to review the documentation and make
it as good looking as the monitoring systems are...</p>

<p>Well, expect some news and a nice big-picture diagram overview soon, if my
work schedule leaves me any time: that's what I want to be working on now.</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/pgcon.html">pgcon</a> <a href="../../../tags/conferences.html">Conferences</a> <a href="../../../tags/skytools.html">skytools</a> <a href="../../../tags/9.1.html">9.1</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 30 May 2011 11:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/05/30-back-from-ottawa-preparing-for-cambridge.html</guid>
</item>
<item>
  <title>el-get 2.2</title>
  <link>http://tapoueh.org/blog/2011/05/26-el-get-22.html</link>
  <description><![CDATA[<h1>el-get 2.2</h1>
<div class="date">Thursday, May 26 2011, 12:00</div>
<div id="article">
<p>We spotted, a little too late for our own taste, a discrepancy in the
source tree: a work-in-progress patch landed in git just before we released
<a href="https://github.com/dimitri/el-get">el-get</a> stable.  So we cleaned the tree (thanks again <a href="http://julien.danjou.info/">Julien</a>), branched a
stable maintenance tree, and released <code>2.2</code> from there.</p>

<p>You're back to enjoying <code>el-get</code> :)</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/release.html">release</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 26 May 2011 12:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/05/26-el-get-22.html</guid>
</item>
<item>
  <title>el-get 2.1</title>
  <link>http://tapoueh.org/blog/2011/05/26-el-get-21.html</link>
  <description><![CDATA[<h1>el-get 2.1</h1>
<div class="date">Thursday, May 26 2011, 10:00</div>
<div id="article">
<p>Current <a href="https://github.com/dimitri/el-get">el-get</a> status is stable, ready for daily use and packed with extra
features that make life easier.  There are some more things we could do, as
always, but they will be about smoothing things further.</p>

<h3>Latest released version</h3>

<p><a href="https://github.com/dimitri/el-get">el-get</a> version <code>2.1</code> is available, with a boatload of features, including
autoloads support, byte-compiling in an external <em>clean room</em> <a href="http://www.gnu.org/software/emacs/">Emacs</a> instance,
custom support, lazy initialisation support (deferring all <em>init</em> functions to
<code>eval-after-load</code>), and multi-repository <code>ELPA</code> support.</p>


<h3>Version numbering</h3>

<p class="first">Version strings are now inspired by how Emacs itself numbers its versions.
First is the major version number, then a dot, then the minor version
number.  The minor version number is <code>0</code> while the next major version is
still being developed.  So <code>3.0</code> is a developer release while <code>3.1</code> will be
the next stable release.</p>
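
<p>As a minimal sketch of that rule, the following shell function classifies
a version string.  The <code>release_kind</code> name is made up for the illustration
and is not part of el-get, and it assumes a plain <code>MAJOR.MINOR</code> string.</p>

<pre class="src">
#! /bin/sh

# Hypothetical helper, not part of el-get: a minor number of 0 marks a
# development release, anything else a stable one.
release_kind() {
    minor=${1#*.}    # strip everything up to and including the first dot
    if [ "$minor" = "0" ]; then
        echo development
    else
        echo stable
    fi
}

release_kind 3.0
release_kind 3.1
</pre>

<p>This prints <code>development</code> for <code>3.0</code> and <code>stable</code> for <code>3.1</code>.</p>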

<p>Please note that this versioning policy has been picked while backing
<code>1.2~dev</code>, so <code>1.0</code> was a <em>stable</em> release in fact.  Ah, history.</p>




<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/release.html">release</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 26 May 2011 10:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/05/26-el-get-21.html</guid>
</item>
<item>
  <title>Preparing for PGCON</title>
  <link>http://tapoueh.org/blog/2011/05/12-preparing-for-pgcon.html</link>
  <description><![CDATA[<h1>Preparing for PGCON</h1>
<div class="date">Thursday, May 12 2011, 10:30</div>
<div id="article">
<p>It's that time of the year again: the main international
<a href="http://www.pgcon.org/2011/">PostgreSQL Conference</a> is next week in Ottawa, Canada.  If previous years are
any indication, this will be a great event at which to meet a lot of the
members of your community.  The core team will be there, developers will be
there, and we will meet with users and their challenging use cases.</p>

<p>This is a very good time to review both what you did in the project over the
last 12 months, and what you plan to do next year.  To help with that,
several <em>meeting</em> events are organized.  They're like a whole-day round table
with a kind of an agenda, a limited number of invited people, and
very intense on-topic discussions about how to organize ourselves for
another great year of innovation in PostgreSQL.</p>

<p>Then we have two days full of talks where I usually learn some new aspect of
the project or of the product, and where ideas tend to just pop-up in a
continuous race.  Being away from home and with people you see only once a
year (some of them more than that of course, hi European fellows!) seems to
allow for some broader thinking.</p>

<p>The talks I want to go to include
<a href="http://www.pgcon.org/2011/schedule/events/361.en.html">Database Scalability Patterns: Sharding for Unlimited Growth</a> by
<a href="http://www.pgcon.org/2011/schedule/speakers/20.en.html">Robert Treat</a>, <a href="http://www.pgcon.org/2011/schedule/events/366.en.html">Maintaining Terabytes</a> by <a href="http://www.pgcon.org/2011/schedule/speakers/112.en.html">Selena Deckelmann</a>, <a href="http://www.pgcon.org/2011/schedule/events/307.en.html">NTT’s Case Report</a>
by <a href="http://www.pgcon.org/2011/schedule/speakers/192.en.html">Tetsuo Sakata</a>, <a href="http://www.pgcon.org/2011/schedule/events/350.en.html">Hacking the Query Planner</a> by <a href="http://www.pgcon.org/2011/schedule/speakers/202.en.html">Tom Lane</a>.  That's for a first
day, right?</p>

<p>Then, on the second day, I notice <a href="http://www.pgcon.org/2011/schedule/events/311.en.html">Range Types</a> by <a href="http://www.pgcon.org/2011/schedule/speakers/83.en.html">Jeff Davis</a>,
<a href="http://www.pgcon.org/2011/schedule/events/309.en.html">SP-GiST - a new indexing infrastructure for PostgreSQL</a> by <a href="http://www.pgcon.org/2011/schedule/speakers/29.en.html">Oleg</a> and <a href="http://www.pgcon.org/2011/schedule/speakers/33.en.html">Teodor</a>,
<a href="http://www.pgcon.org/2011/schedule/events/337.en.html">The Write Stuff</a> by <a href="http://www.pgcon.org/2011/schedule/speakers/110.en.html">Greg Smith</a> (a colleague at <a href="http://www.2ndquadrant.fr/">2ndQuadrant</a>).</p>

<p>I will miss <a href="http://www.pgcon.org/2011/schedule/events/333.en.html">Serializable Snapshot Isolation in Postgres</a> by <a href="http://www.pgcon.org/2011/schedule/speakers/113.en.html">Kevin Grittner</a>
and <a href="http://www.pgcon.org/2011/schedule/speakers/197.en.html">Dan Ports</a>, unfortunately, because I'll be talking about
<a href="http://www.pgcon.org/2011/schedule/events/280.en.html">Extensions Development</a> at the same time.</p>

<p>Of course this list is just a first selection: the hallway track is often
what guides me to a talk or makes me skip one.</p>

<p>See you there!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/pgcon.html">pgcon</a> <a href="../../../tags/extensions.html">Extensions</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 12 May 2011 10:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/05/12-preparing-for-pgcon.html</guid>
</item>
<item>
  <title>Mailq modeline display</title>
  <link>http://tapoueh.org/blog/2011/05/05-mailq-modeline-display.html</link>
  <description><![CDATA[<h1>Mailq modeline display</h1>
<div class="date">Thursday, May 05 2011, 14:10</div>
<div id="article">
<p>If you've not been following along, you might have missed it: it appears to
me that even today, in 2011, mail systems work much better when set up the
old way, meaning with a local <a href="http://en.wikipedia.org/wiki/Mail_Transfer_Agent">MTA</a> for outgoing mail, plus some niceties
such as <a href="http://tapoueh.org/articles/news/_Postfix_sender_dependent_relayhost_maps.html">sender dependent relayhost maps</a>.</p>

<p>That's why I needed <a href="http://tapoueh.org/projects.html#sec21">M-x mailq</a> to display the <em>mail queue</em> and have some easy
shortcuts in order to operate it (mainly <code>f runs the command
mailq-mode-flush</code>, but per site and per id delivery are useful too).</p>

<p>Now, I also happen to set up outgoing mail routes to go through an <em>SSH
tunnel</em>, which thanks to both <a href="http://www.manpagez.com/man/5/ssh_config/">~/.ssh/config</a> and <a href="https://github.com/dimitri/cssh">cssh</a> (<code>C-= runs the
command cssh-term-remote-open</code>, with completion) is only a couple of
keystrokes away.  I still sometimes forget to start it, though, and then
mail sits in the queue until I realise it has not been delivered, which
always takes just a bit too long.</p>

<p>A solution I've been thinking about is to add a little flag in the <a href="http://www.gnu.org/s/emacs/manual/html_node/elisp/Mode-Line-Format.html">modeline</a>
in my <a href="http://www.gnus.org/">gnus</a> <code>*Group*</code> and <code>*Summary*</code> buffers.  The flag would show up as ✔ when
no mail is queued and waiting for me to open the tunnel, or ✘ as soon as the
queue is not empty.  Here's what it looks like:</p>

<center>
<p><img src="../../../images//mailq-modeline-display.png" alt=""></p>
</center>

<p>Well I'm pretty happy with the setup.  The flag is refreshed every minute,
and here is, as an example, how I set up <code>mailq</code> in my <a href="https://github.com/dimitri/el-get">el-get-sources</a>:</p>

<pre class="src">
         (<span style="color: #729fcf;">:name</span> mailq
                <span style="color: #729fcf;">:after</span> (<span style="color: #729fcf; font-weight: bold;">lambda</span> () (mailq-modeline-display)))
</pre>
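
<p>The check behind the flag is simple enough to sketch outside of Emacs
too.  This is not the code of the Emacs package, only a minimal shell
illustration of the idea: the <code>queue_flag</code> function name is made up, and it
assumes a Postfix <code>mailq</code>, which prints <code>Mail queue is empty</code> when nothing is
waiting.</p>

<pre class="src">
# Hypothetical helper, not part of the Emacs package: read the output
# of `mailq` on stdin and print the flag the modeline would show.
# Assumption: Postfix, whose mailq says "Mail queue is empty".
queue_flag() {
    if grep -q 'Mail queue is empty'; then
        echo '✔'
    else
        echo '✘'
    fi
}

# Example:
#   mailq | queue_flag
</pre>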

<p>I'm not sure how many of you dear readers are using a local MTA to deliver
your mail, but well, the ones who do (or consider doing so) might even find
this article useful!</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/modeline.html">modeline</a> <a href="../../../tags/cssh.html">cssh</a> <a href="../../../tags/mailq.html">mailq</a> <a href="../../../tags/postfix.html">postfix</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 05 May 2011 14:10:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/05/05-mailq-modeline-display.html</guid>
</item>
<item>
  <title>Tables and Views dependencies</title>
  <link>http://tapoueh.org/blog/2011/05/04-tables-and-views-dependencies.html</link>
  <description><![CDATA[<h1>Tables and Views dependencies</h1>
<div class="date">Wednesday, May 04 2011, 11:45</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>Let's say you need to <code>ALTER TABLE foo ALTER COLUMN bar TYPE bigint;</code>, and
<a href="http://postgresql.org">PostgreSQL</a> is helpfully telling you that no, you can't, because such and such
<em>views</em> depend on the column.  The basic way to deal with that is to copy
the names of the views involved from the error message, then prepare a
script wherein you first <code>DROP VIEW ...;</code> then <code>ALTER TABLE</code> and finally <code>CREATE
VIEW</code> again, all in the same transaction.</p>

<p>So you also have to copy and paste the view definitions.  With large view
definitions, that quickly gets cumbersome.  When you're working
on operations, you have to bear in mind that cumbersome is in fact a synonym for
<em>error prone</em>, so you want another solution if possible.</p>

<p>Oh, and the other drawback of this solution is that <code>ALTER TABLE</code> will first
take a <code>LOCK</code> on the table, locking out any activity.  And more than that, the
lock acquisition will queue behind current activity on the table, which
means waiting for a fairly long time and damaging the service quality on a
moderately loaded server.</p>

<p>It's also possible to abuse the <a href="http://www.postgresql.org/docs/current/static/catalogs.html">system catalogs</a> in order to find all <em>views</em> that
depend on a given table.  For that, you have to play with <code>pg_depend</code> and
you have to know that internally, a <em>view</em> is in fact a <em>rewrite rule</em>.
Here's how to produce the two scripts that we need:</p>

<pre class="src">
=# \t
Showing only tuples.

=# \o /tmp/drop.sql
=# select <span style="color: #ad7fa8; font-style: italic;">'DROP VIEW '</span> || views || <span style="color: #ad7fa8; font-style: italic;">';'</span>
     from (select distinct(r.ev_class::regclass) as views
            from pg_depend d join pg_rewrite r on r.oid = d.objid
           where refclassid = <span style="color: #ad7fa8; font-style: italic;">'pg_class'</span>::regclass
             and refobjid = <span style="color: #ad7fa8; font-style: italic;">'SCHEMA.TABLENAME'</span>::regclass
             and classid = <span style="color: #ad7fa8; font-style: italic;">'pg_rewrite'</span>::regclass
             and pg_get_viewdef(r.ev_class, true) ~ <span style="color: #ad7fa8; font-style: italic;">'COLUMN_NAME'</span>) as x;

=# \o /tmp/create.sql
=# select <span style="color: #ad7fa8; font-style: italic;">'CREATE VIEW '</span> || views || E<span style="color: #ad7fa8; font-style: italic;">' AS \n'</span>
       || pg_get_viewdef(views, true) || <span style="color: #ad7fa8; font-style: italic;">';'</span>
     from (select distinct(r.ev_class::regclass) as views
          from pg_depend d join pg_rewrite r on r.oid = d.objid
         where refclassid = <span style="color: #ad7fa8; font-style: italic;">'pg_class'</span>::regclass
           and refobjid = <span style="color: #ad7fa8; font-style: italic;">'SCHEMA.TABLENAME'</span>::regclass
           and classid = <span style="color: #ad7fa8; font-style: italic;">'pg_rewrite'</span>::regclass
           and pg_get_viewdef(r.ev_class, true) ~ <span style="color: #ad7fa8; font-style: italic;">'COLUMN_NAME'</span>) as x;

=# \o
</pre>

<p>Replace <code>SCHEMA.TABLENAME</code> and <code>COLUMN_NAME</code> with your targets here and the
first query should give you one row per candidate view.  That is, if you're not
using the <code>\o</code> trick; if you are, check out the generated file
instead, with <code>\! cat /tmp/drop.sql</code> for example.</p>

<p>Please note that this catalog query is only approximate: it selects as a
candidate any view that both depends on the target table and happens to
contain the <code>column_name</code> anywhere in its text definition.  So either filter out the
candidates properly (by careful proofreading, then another <code>WHERE</code> clause), or
just accept that you might <code>DROP</code> then <code>CREATE</code> again more <em>views</em> than need be.</p>
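
<p>Once the candidates are reviewed, the two generated files can be wrapped
around the <code>ALTER TABLE</code> to get the single-transaction script described at
the top of this article.  Here is a minimal shell sketch of that step: the
<code>make_migration</code> name, the <code>/tmp/migrate.sql</code> path and the <code>dbname</code> placeholder are
made up for the illustration, and nothing here reorders the statements should
some views depend on other views.</p>

<pre class="src">
# Hypothetical helper: print one transaction that drops the views,
# alters the table, then creates the views again.
# usage: make_migration DROP_FILE CREATE_FILE 'ALTER TABLE ...;'
make_migration() {
    echo 'BEGIN;'
    cat "$1"
    echo "$3"
    cat "$2"
    echo 'COMMIT;'
}

# Example, with the files produced by the psql session above:
#   make_migration /tmp/drop.sql /tmp/create.sql \
#     'ALTER TABLE foo ALTER COLUMN bar TYPE bigint;' > /tmp/migrate.sql
#   psql -v ON_ERROR_STOP=1 -f /tmp/migrate.sql dbname
</pre>

<p>With <code>ON_ERROR_STOP</code> set, <code>psql</code> stops at the first error, so the <code>COMMIT</code> is
never sent and the views are left as they were.</p>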

<p>If you need some more details about the <code>\t \o</code> sequence you might be
interested in this older article about <a href="http://tapoueh.org/articles/blog/_Resetting_sequences._All_of_them&#44;_please&#33;.html">resetting sequences</a>.</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/catalogs.html">catalogs</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 04 May 2011 11:45:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/05/04-tables-and-views-dependencies.html</guid>
</item>
<item>
  <title>Extension module_pathname and .sql.in</title>
  <link>http://tapoueh.org/blog/2011/05/02-extension-module_pathname-and-sqlin.html</link>
  <description><![CDATA[<h1>Extension module_pathname and .sql.in</h1>
<div class="date">Monday, May 02 2011, 17:30</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>While I'm currently too busy at work to deliver many Open Source contributions,
let's debunk an old habit of <a href="http://www.postgresql.org/">PostgreSQL</a> extension authors.  It all comes down to
copy-pasting from <em>contrib</em>, and there has been no reason to handle <code>$libdir</code>
this way since the <code>7.4</code> days.</p>

<p>Let's take an example here, with the <a href="https://github.com/dimitri/prefix">prefix</a> extension.  This one too will
need some love, but is still far down my spare-time todo list, sorry about
that.  So, in the <code>prefix.sql.in</code> we read:</p>

<pre class="src">
  CREATE OR REPLACE FUNCTION prefix_range_in(cstring)
  RETURNS prefix_range
  AS <span style="color: #ad7fa8; font-style: italic;">'MODULE_PATHNAME'</span>
  LANGUAGE <span style="color: #ad7fa8; font-style: italic;">'C'</span> IMMUTABLE STRICT;
</pre>

<p>Two things are to change here.  First, the PostgreSQL <em>backend</em> will
understand just fine if you just say <code>AS '$libdir/prefix'</code>.  So the <code>sql</code>
script has to know the name of the shared object library, but if it does,
you can directly maintain a <code>prefix.sql</code> script instead.</p>
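
<p>Getting from the existing <code>prefix.sql.in</code> to a <code>prefix.sql</code> you maintain by hand
is a one-off substitution.  A minimal shell sketch, where the
<code>expand_module_pathname</code> function name is made up for the illustration:</p>

<pre class="src">
# Hypothetical one-off conversion: read a .sql.in script on stdin and
# print it with MODULE_PATHNAME replaced by the $libdir path of the
# shared object named in $1 (the backslash keeps $libdir literal).
expand_module_pathname() {
    sed "s|MODULE_PATHNAME|\$libdir/$1|g"
}

# Example:
#   cat prefix.sql.in | expand_module_pathname prefix > prefix.sql
</pre>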

<p>The advantage is that you now can avoid a compatibility problem when you
want to support PostgreSQL from <code>8.2</code> to <code>9.1</code> in your extension (older than
that and it's <a href="http://wiki.postgresql.org/wiki/PostgreSQL_Release_Support_Policy">no longer supported</a>).  You directly ship your script.</p>

<p>For compatibility, you could also use the <a href="http://developer.postgresql.org/pgdocs/postgres/extend-extensions.html">control file</a> <code>module_pathname</code>
property.  But for <code>9.1</code> you then have to add an implicit <code>Make</code> rule so that the
script is derived from your <code>.sql.in</code>.  And as you are managing several scripts,
so that you can handle <em>versioning</em> and <em>upgrades</em>, it can get hairy (<em>hint</em>:
you need to copy <code>prefix.sql</code> as <code>prefix--1.1.1.sql</code>, then change its name at
the next revision, and think about <em>upgrade</em> scripts too).  The <code>module_pathname</code>
facility is best kept for when you manage more than a single extension in
the same directory, as the <a href="http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=contrib/spi/Makefile;h=0c11bfcbbd47b0c3ed002874bfefd9e2022cf5ac;hb=HEAD">SPI contrib</a> does.</p>
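
<p>The script names in that hint follow the <code>9.1</code> extension naming scheme:
<code>name--version.sql</code> for an install script and <code>name--old--new.sql</code> for an
upgrade script.  A small shell sketch of the bookkeeping, where the two
function names and the <code>1.1.0</code> version are made up for the illustration:</p>

<pre class="src">
# Hypothetical helpers: print the script file names expected for an
# extension, given its name and version(s).
versioned_script() {
    echo "$1--$2.sql"
}

upgrade_script() {
    echo "$1--$2--$3.sql"
}

# Example: ship the maintained script under its versioned name
#   cp prefix.sql "$(versioned_script prefix 1.1.1)"
# and name the upgrade script from an earlier, made-up 1.1.0
#   upgrade_script prefix 1.1.0 1.1.1
</pre>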

<p>Sure, maintaining an extension that targets both antique releases of
PostgreSQL and <a href="http://developer.postgresql.org/pgdocs/postgres/sql-createextension.html">CREATE EXTENSION</a> super-powered one(s) (not yet released) is a
little more involved than that.  We'll get back to that, as some people are
still pioneering the movement.</p>

<p>On my side, I'm working with a <a href="http://www.debian.org/">debian</a> <a href="http://qa.debian.org/developer.php?login=myon">developer</a> on how to best manage the
packaging of those extensions, and this work could end up as a specialized
<em>policy</em> document and a coordinated <em>team</em> of maintainers for all things
PostgreSQL in <code>debian</code>.  This will also give some more steam to the PostgreSQL
effort for debian packages: the idea is to maintain packages for all
supported versions (from <code>8.2</code> up to, soon, <code>9.1</code>), something <code>debian</code> itself cannot
commit to.</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/extensions.html">Extensions</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/prefix.html">prefix</a> <a href="../../../tags/9.1.html">9.1</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 02 May 2011 17:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/05/02-extension-module_pathname-and-sqlin.html</guid>
</item>
<item>
  <title>Emacs and PostgreSQL, PL line numbering</title>
  <link>http://tapoueh.org/blog/2011/04/23-emacs-and-postgresql-pl-line-numbering.html</link>
  <description><![CDATA[<h1>Emacs and PostgreSQL, PL line numbering</h1>
<div class="date">Saturday, April 23 2011, 10:30</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>A while ago I fixed and published <a href="https://github.com/dimitri/pgsql-linum-format">pgsql-linum-format</a> separately.
It lets you number <code>PL/whatever</code> code lines when editing from <a href="http://www.gnu.org/software/emacs/">Emacs</a>, and
it's something very useful to turn on when debugging.</p>

<center>
<p><img src="../../../images//emacs-pgsql-linum.png" alt=""></p>
</center>


<p>The carets on the <em>fringe</em> in the emacs window are the result of
<code>(setq-default indicate-buffer-boundaries 'left)</code> and here they
just clutter the image somewhat.  But the idea is to just <code>M-x linum-mode</code>
when you need it, at least that's my usage of it.</p>

<p>You can use <a href="https://github.com/dimitri/el-get">el-get</a> to easily get (then update) this little <code>Emacs</code> extension.</p>



<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/pgsql-linum-format.html">pgsql-linum-format</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Sat, 23 Apr 2011 10:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/04/23-emacs-and-postgresql-pl-line-numbering.html</guid>
</item>
<item>
  <title>Emacs Kicker</title>
  <link>http://tapoueh.org/blog/2011/04/15-emacs-kicker.html</link>
  <description><![CDATA[<h1>Emacs Kicker</h1>
<div class="date">Friday, April 15 2011, 21:30</div>
<div id="article">
<p>Following up on the very popular <a href="https://github.com/technomancy/emacs-starter-kit">emacs-starter-kit</a>, I'm now proposing the
<a href="https://github.com/dimitri/emacs-kicker">emacs-kicker</a>.  It's the <code>.emacs</code> file you've seen in older posts here,
which I maintain for some colleagues.  After all, if they find it useful,
some more people might too, so I've decided to publish it.</p>

<p>What you'll find is a very simple <code>128</code>-line <a href="http://www.gnu.org/software/emacs/">Emacs</a> user init file, based on
<a href="https://github.com/dimitri/el-get">el-get</a> for external packages.  A not so <em>random</em> selection of those is used,
here's the list when you hide some details:</p>

<pre class="src">
 '(el-get                       <span style="color: #888a85;">; </span><span style="color: #888a85;">el-get is self-hosting
</span>   escreen                      <span style="color: #888a85;">; </span><span style="color: #888a85;">screen for emacs, C-\ C-h
</span>   php-mode-improved            <span style="color: #888a85;">; </span><span style="color: #888a85;">if you're into php...
</span>   psvn                         <span style="color: #888a85;">; </span><span style="color: #888a85;">M-x svn-status
</span>   switch-window                <span style="color: #888a85;">; </span><span style="color: #888a85;">takes over C-x o
</span>   auto-complete                <span style="color: #888a85;">; </span><span style="color: #888a85;">complete as you type with overlays
</span>   emacs-goodies-el             <span style="color: #888a85;">; </span><span style="color: #888a85;">the debian addons for emacs
</span>   yasnippet                    <span style="color: #888a85;">; </span><span style="color: #888a85;">powerful snippet mode
</span>   zencoding-mode               <span style="color: #888a85;">; </span><span style="color: #888a85;">http://www.emacswiki.org/emacs/ZenCoding
</span>   (<span style="color: #729fcf;">:name</span> buffer-move           <span style="color: #888a85;">; </span><span style="color: #888a85;">move buffers around in windows
</span>   (<span style="color: #729fcf;">:name</span> smex                  <span style="color: #888a85;">; </span><span style="color: #888a85;">a better (ido like) M-x
</span>   (<span style="color: #729fcf;">:name</span> magit                 <span style="color: #888a85;">; </span><span style="color: #888a85;">git meet emacs, and a binding
</span>   (<span style="color: #729fcf;">:name</span> goto-last-change      <span style="color: #888a85;">; </span><span style="color: #888a85;">move pointer back to last change
</span></pre>

<p>Another interesting thing to note in this <code>kicker</code> is a choice of some key
bindings that are rather unusual (so far), I guess.</p>

<pre class="src">
(global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"C-x C-b"</span>) 'ido-switch-buffer)
(global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"C-x C-c"</span>) 'ido-switch-buffer)
(global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"C-x B"</span>) 'ibuffer)
</pre>

<p>Yes, you see that I've rebound <code>C-x C-c</code> to switching buffers.  That key is
really easy to use and I don't think that <code>M-x kill-emacs</code> deserves it.  Keys
that are so easy to use should be kept for frequent actions, and quitting
emacs is a once-a-day to once-a-month action here.  And you can still quit
from the window manager button or from the menu or from <code>M-x</code>.</p>

<p><em>Mac</em> users are not left behind either: you will see some settings that
are adapted to the system (like choosing another <em>font</em>, keeping the
<code>menu-bar</code> displayed, or not installing the darkish <code>tango-color-mode</code> on this system,
where it renders poorly in my opinion), as you can see here:</p>

<pre class="src">
(<span style="color: #729fcf; font-weight: bold;">if</span> (string-match <span style="color: #ad7fa8; font-style: italic;">"apple-darwin"</span> system-configuration)
    (set-face-font 'default <span style="color: #ad7fa8; font-style: italic;">"Monaco-13"</span>)
  (set-frame-font <span style="color: #ad7fa8; font-style: italic;">"Monospace-10"</span>))

(<span style="color: #729fcf; font-weight: bold;">when</span> (string-match <span style="color: #ad7fa8; font-style: italic;">"apple-darwin"</span> system-configuration)
  (setq mac-allow-anti-aliasing t)
  (setq mac-command-modifier 'meta)
  (setq mac-option-modifier 'none))
</pre>

<p>So all in all, I don't expect this <code>emacs-kicker</code> to please everyone, but I
expect it to be simple and rich enough (thanks to <a href="https://github.com/dimitri/el-get">el-get</a>), and it should be
a good <em>kick start</em> that's easy to adapt.</p>

<p>If you want to try it without installing it, that's very easy to do.  Just
clone the <code>git</code> repository then start an <code>Emacs</code> that will use it.  For
example, using the excellent <a href="http://emacsformacosx.com/">Emacs For MacOSX</a>, that could be:</p>

<pre class="src">
 $ /Applications/Emacs.app/Contents/MacOS/Emacs -Q -l init.el
</pre>

<p>I hope some readers will find it useful! :)</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/switch-window.html">switch-window</a> <a href="../../../tags/emacs-kicker.html">emacs-kicker</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 15 Apr 2011 21:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/04/15-emacs-kicker.html</guid>
</item>
<item>
  <title>Some notes about Skytools3</title>
  <link>http://tapoueh.org/blog/2011/04/11-some-notes-about-skytools3.html</link>
  <description><![CDATA[<h1>Some notes about Skytools3</h1>
<div class="date">Monday, April 11 2011, 11:30</div>
<div id="article">
<p>I've been working on <a href="http://github.com/markokr/skytools">skytools3</a> packaging lately.  I've been pushing quite a
lot of work into it, in order to have exactly what I need out of the box,
after some 3 years of production experience with the products.  Plural,
yes, because even though <a href="http://wiki.postgresql.org/wiki/PgBouncer">pgbouncer</a> and <a href="http://wiki.postgresql.org/wiki/PL/Proxy">plproxy</a> are siblings of the project (same
developer team, separate life cycles and releases), <code>skytools</code> itself still
includes several sub-projects.</p>

<p>Here's what the <code>skytools3</code> packaging is going to look like:</p>

<pre class="src">
skytools3              Skytool's replication and queuing
python-pgq3            Skytool's PGQ python library
python-skytools3       python scripts framework for skytools
skytools-ticker3       PGQ ticker daemon service
skytools-walmgr3       high-availability archive and restore commands
postgresql-8.4-pgq3    PGQ server-side code (C module for PostgreSQL)
postgresql-9.0-pgq3    PGQ server-side code (C module for PostgreSQL)
</pre>

<p>This split is needed so that you can install your <em>daemons</em> (we call them
<em>consumers</em>) on machines other than the one where you run <a href="http://postgresql.org">PostgreSQL</a>.  The
<code>walmgr</code> part, on the other hand, makes no sense to install if you don't have a local
PostgreSQL service, as it provides the <code>archive</code> and <code>restore</code> commands.  As for
the <em>ticker</em>, you're free to run it on any machine really, so it gets a package
of its own (in <code>skytools3</code> the <em>ticker</em> is written in <code>C</code> and does not depend on the
python framework any more).</p>

<p>What you can't see here yet are the new goodies that wrap it all up as a quality
<code>debian</code> package.  A new <code>skytools</code> user is created for you when you install the
<code>skytools3</code> package (which contains the services), along with a skeleton file
<code>/etc/skytools.ini</code> and a user directory <code>/etc/skytools/</code>.  Put your service
configuration files in there, and register those services in the
<code>/etc/skytools.ini</code> file itself.  They will then be taken care of by the <code>init</code>
sequence at startup and shutdown of your server.</p>

<p>The services will run under the <code>skytools</code> system user, and will by default
write their logs into <code>/var/log/skytools/</code>.  The <code>pidfile</code> of each service goes into
<code>/var/run/skytools/</code>.  All integrated, automated.</p>

<p>The next big <em>TODO</em> is documentation, reviewing and polishing it, and then I
think that <code>skytools3</code> will be ready for public release.  Yes, you read
that right, it's happening this very year!  I'm very excited about it, and
have several architectures that will greatly benefit from the switch to
<code>skytools3</code>.  More on that later, though!  (Yes, my <em>to blog later</em> list is
getting quite long now).</p>



<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/skytools.html">skytools</a> <a href="../../../tags/restore.html">restore</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 11 Apr 2011 11:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/04/11-some-notes-about-skytools3.html</guid>
</item>
<item>
  <title>towards pg_staging 1.0</title>
  <link>http://tapoueh.org/blog/2011/03/29-towards-pg_staging-10.html</link>
  <description><![CDATA[<h1>towards pg_staging 1.0</h1>
<div class="date">Tuesday, March 29 2011, 15:30</div>
<div id="article">
<p>If you don't remember what <a href="pgstaging.html">pg_staging</a> is all about, it's a central
console from which to control all your <a href="http://www.postgresql.org/">PostgreSQL</a> databases.  Typically you
use it to manage your development and pre-production setup, where developers
ask you pretty often to install a newer dump from production for them,
and you want that operation streamlined and easy.</p>

<center>
<p><img src="../../../images//pg_staging.png" alt=""></p>
</center>


<h3>Usage</h3>

<p class="first">The typical session would be something like this:</p>

<pre class="src">
pg_staging&gt; databases foodb.dev
                    foodb      foodb_20100824 :5432
           foodb_20100209      foodb_20100209 :5432
           foodb_20100824      foodb_20100824 :5432
                pgbouncer           pgbouncer :6432
                 postgres            postgres :5432

pg_staging&gt; dbsizes foodb.dev
foodb.dev
           foodb_20100209: -1
           foodb_20100824: 104 GB
Total                    = 104 GB

pg_staging&gt; restore foodb.dev
...
pg_staging&gt; switch foodb.dev today
</pre>

<p>The list of supported commands is quite long now, and documented too (it
comes with two man pages).  The <code>restore</code> one is the most important: it will
create the database, add it to the <code>pgbouncer</code> setup, fetch the backup named
<code>dbname.`date -I`.dump</code>, prepare a filtered object list (more on that below), load
the <em>pre</em> <code>SQL</code> scripts, launch <code>pg_restore</code>, <code>VACUUM ANALYZE</code> the database when
configured to do so, load the <em>post</em> <code>SQL</code> scripts, then optionally <em>switch</em> the
<code>pgbouncer</code> setup to default to this new database.</p>
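
<p>As a minimal sketch of the backup naming convention mentioned above, here is
how that name could be computed in Python.  The <code>backup_filename</code> helper is
hypothetical and not part of <code>pg_staging</code>; it only assumes that <code>date -I</code>
prints an ISO 8601 date.</p>

```python
from datetime import date


def backup_filename(dbname, day=None):
    """Return the dump name for DBNAME on DAY (defaults to today).

    Hypothetical helper: it mirrors the dbname.DATE.dump pattern, where
    DATE is what the shell command date -I prints, e.g. 2010-08-24.
    """
    day = day or date.today()
    return "{}.{}.dump".format(dbname, day.isoformat())


# The dump that a restore run on 2010-08-24 would look for
print(backup_filename("foodb", date(2010, 8, 24)))  # foodb.2010-08-24.dump
```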


<h3>Filtering</h3>

<p class="first">The newer option is called <code>tablename_nodata_regexp</code>, and here's its documentation in full:</p>

<blockquote>
<p class="quoted">
List of table names regexp (comma separated) to restore without
content. The <code>pg_restore</code> catalog <code>TABLE DATA</code> sections will get
filtered out.  The regexp is applied against <code>schemaname.tablename</code>
and non-anchored by default.</p>

</blockquote>

<p>This supplements the <code>schemas</code> and <code>schemas_nodata</code> options, which allow you
to restore only the objects from a given set of <em>schemas</em> (filtering out triggers
that call functions living in the excluded schemas, such as the
<a href="http://wiki.postgresql.org/wiki/Skytools">Londiste</a> ones) or to restore only the <code>TABLE</code> definitions while skipping
the <code>TABLE DATA</code> entries.</p>
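
<p>To illustrate what <em>non-anchored</em> means for <code>tablename_nodata_regexp</code>, the
following Python sketch applies such a pattern list to a qualified table
name.  The function names and the sample patterns are made up for the
example and are not taken from the <code>pg_staging</code> code base.</p>

```python
import re


def parse_patterns(option_value):
    """Split the comma separated option value into compiled regexps."""
    parts = [p.strip() for p in option_value.split(",")]
    return [re.compile(p) for p in parts if p]


def skip_table_data(schemaname, tablename, patterns):
    """Return True when the table should be restored without its content."""
    qualified = "{}.{}".format(schemaname, tablename)
    # re.search is non-anchored: a pattern may match anywhere in the name
    return any(p.search(qualified) for p in patterns)


# Hypothetical option value: two patterns, each one anchored by hand
patterns = parse_patterns(r"^public\.log_, _archive$")
print(skip_table_data("public", "log_2011", patterns))       # True
print(skip_table_data("stats", "visits_archive", patterns))  # True
print(skip_table_data("public", "users", patterns))          # False
```

<p>Since the match is non-anchored, add <code>^</code> or <code>$</code> yourself when you want to pin
a pattern to the start or the end of <code>schemaname.tablename</code>.</p>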


<h3>Setup</h3>

<p class="first">To set up your environment for <em>pg_staging</em>, you need to take some steps.  It's
not complex but it's fairly involved.  The benefit is this amazingly useful
single central console to control as many databases as you need.</p>

<p>You need a <code>pg_staging.ini</code> file that describes your environment.  I
typically name the sections in there after the database to restore,
followed by a <code>dev</code> or <code>preprod</code> extension.</p>

<p>You need to have all your backups available through <code>HTTP</code>, and as of now,
served by the famous <em>apache</em> <code>mod_dir</code> directory listing.  It's easy to add
support for other methods, but it has not been done yet.  You also need to
have a cluster wide <code>--globals-only</code> backup available somewhere so that you
can easily create the users etc. you need from <code>pg_staging</code>.</p>

<p>You also need to run a <code>pgbouncer</code> daemon on each database server, allowing
you to bypass editing connection strings when you <code>switch</code> a new database
version live.</p>

<p>You also need to install the <em>client</em> script, have a local <code>pgstaging</code> system
user and allow it to run the client script as root, so that it's able to
control some services and edit <code>pgbouncer.ini</code> for you.</p>


<h3>Status</h3>

<p class="first">I'm still using it a lot (several times a week) to manage a whole
set of development and pre-production environments, so the very low
<a href="https://github.com/dimitri/pg_staging">code activity</a> of the project tells you that it's pretty stable (the latest series
of <em>commits</em> are all bug fixes and rounded corners).</p>

<p>Given that, I'm thinking in terms of <code>pg_staging 1.0</code> soon!  Now is a pretty
good time to try it and see how it can help you.</p>



<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/skytools.html">skytools</a> <a href="../../../tags/backup.html">backup</a> <a href="../../../tags/restore.html">restore</a> <a href="../../../tags/pg_staging.html">pg_staging</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 29 Mar 2011 15:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/03/29-towards-pg_staging-10.html</guid>
</item>
<item>
  <title>Extensions in 9.1</title>
  <link>http://tapoueh.org/blog/2011/03/01-extensions-in-91.html</link>
  <description><![CDATA[<h1>Extensions in 9.1</h1>
<div class="date">Tuesday, March 01 2011, 16:30</div>
<div id="article">
<p>If you've not been following closely you might have missed out on extensions
integration.  Well, <a href="http://en.wikipedia.org/wiki/Tom_Lane_(computer_scientist)">Tom</a> spent some time on the patches I've been preparing
for the last 4 months.  And not only did he commit most of the work but he
also enhanced some parts of the code (better factoring) and basically
finished it.</p>

<p>At the <a href="http://wiki.postgresql.org/wiki/PgCon_2010_Developer_Meeting">previous developer meeting</a> his advice was to avoid putting too much
into the very first version of the patch, so that it would stand a chance of
being integrated.  During the review process, more than one major
<a href="http://www.postgresql.org/">PostgreSQL</a> contributor expressed worries about the size of the patch and the
number of features proposed.  Which is the usual process.</p>

<p>Then what happened is that <strong><em>Tom</em></strong> ended up following the same reasoning as mine
while working on the feature.  Once you have the infrastructure in place,
it's not that much more work to provide the really interesting features, and
that maximizes the benefits.  What's complex is agreeing on exactly what
their specifications are.  And in the <em>little</em> time window we got in this commit fest
(well, we hijacked about 2 full weeks there), we managed to get there.</p>

<p>So in the end the result is quite amazing, and you can see that on the
documentation chapter about it:
<a href="http://developer.postgresql.org/pgdocs/postgres/extend-extensions.html">35.15. Packaging Related Objects into an Extension</a>.</p>

<p>All the <em>contrib</em> modules that install <code>SQL</code> objects into databases for
you to use are now converted to <strong><em>Extensions</em></strong> too, and will be released
in <code>9.1</code> with an upgrade script that allows you to <em>upgrade from unpackaged</em>.
That means that once you've upgraded from a past PostgreSQL release to
<code>9.1</code>, registering <em>extensions</em> as such will be a command away.  I
expect third party <em>extension</em> authors (from <a href="http://pgfoundry.org/projects/ip4r/">ip4r</a> to <a href="http://pgfoundry.org/projects/temporal">temporal</a>) to release an
<em>upgrade-from-unpackaged</em> version of their work too.</p>

<p>Of course, a big use case for <em>extensions</em> is also in-house <code>PL</code> code, and
having version numbers and multi-stage upgrade scripts there will be
fantastic too.  I can't wait to work with such a tool set myself.  Some later
blog post will detail the benefits and usage.  I'm already trying to think
about how much of this version and upgrade facility could be expanded to classic
<code>DDL</code> objects…</p>

<p>So expect some more blog posts from me on this subject.  I will have to talk
about <em>debian packaging</em> an extension (it's getting damn easy with
<a href="http://packages.debian.org/squeeze/postgresql-server-dev-all">postgresql-server-dev-all</a>, and yes, it has received some planning ahead), and
about how to package your own extension, manage upgrades, turn your current
<code>pre-9.1</code> extension into a <em>full blown extension</em>, and maybe how to stop
worrying about extensions when you're a DBA.</p>

<p>If you have some features you would want to discuss for next releases,
please do contact me!</p>

<p>Meanwhile, I'm very happy that this project of mine finally made it to <em>core</em>,
it's been long in the making.  Some years of talking about it and then finally
4 months of coding that I'll remember as a marathon.  Many thanks go to all
who helped here, from <a href="http://www.2ndquadrant.com/">2ndQuadrant</a> to early reviewers to people I talked to
over beers at conferences… lots of people really.</p>

<p>To an extended PostgreSQL (and beyond) :)</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/pgcon.html">pgcon</a> <a href="../../../tags/conferences.html">Conferences</a> <a href="../../../tags/extensions.html">Extensions</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/ip4r.html">ip4r</a> <a href="../../../tags/9.1.html">9.1</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 01 Mar 2011 16:30:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/03/01-extensions-in-91.html</guid>
</item>
<item>
  <title>desktop-mode and readahead</title>
  <link>http://tapoueh.org/blog/2011/02/23-desktop-mode-and-readahead.html</link>
  <description><![CDATA[<h1>desktop-mode and readahead</h1>
<div class="date">Wednesday, February 23 2011, 16:45</div>
<div id="article">
<p>I'm using <a href="http://www.gnu.org/software/emacs/manual/html_node/elisp/Desktop-Save-Mode.html#Desktop-Save-Mode">Desktop Save Mode</a> so that <a href="http://www.gnu.org/software/emacs/">Emacs</a> knows to reopen all the
buffers I've been using.  That goes quite well with how often I start <code>Emacs</code>,
that is once a week or once a month.  Right now, the last line of <code>M-x ibuffer</code> reads as
follows:</p>

<pre class="src">
    718 buffers         19838205                  668 files, 15 processes
</pre>

<p>That means that at startup, <code>Emacs</code> will load that many files.  In order not
to have to wait until it's done doing so, I've set things up this way:</p>

<pre class="src">
<span style="color: #888a85;">;; </span><span style="color: #888a85;">and the session
</span>(setq desktop-restore-eager 20
      desktop-lazy-verbose nil)
(desktop-save-mode 1)
(savehist-mode 1)
</pre>

<p>The problem is that it's still slow.  An idea I had was to use the <a href="https://fedorahosted.org/readahead/browser/README">readahead</a>
tool that some distributions use to reduce their boot time.  Of course this tool
does not expect the same file format as <code>emacs-desktop</code> uses.  Still,
converting is quite easy with some <code>awk</code> magic.  Here's the result:</p>

<pre class="src">
<span style="color: #888a85;">;;; </span><span style="color: #888a85;">dim-desktop.el --- Dimitri Fontaine
</span><span style="color: #888a85;">;;</span><span style="color: #888a85;">
</span><span style="color: #888a85;">;; </span><span style="color: #888a85;">Allows to prepare a readahead file list from desktop-save
</span>
(<span style="color: #729fcf; font-weight: bold;">require</span> '<span style="color: #8ae234;">desktop</span>)

(<span style="color: #729fcf; font-weight: bold;">defvar</span> <span style="color: #eeeeec;">dim-desktop-file-readahead-list</span>
  <span style="color: #ad7fa8; font-style: italic;">"~/.emacs.desktop.readahead"</span>
  <span style="color: #888a85;">"*Where to save the emacs desktop `readahead` file list"</span>)

(<span style="color: #729fcf; font-weight: bold;">defvar</span> <span style="color: #eeeeec;">dim-desktop-filelist-command</span>
  <span style="color: #ad7fa8; font-style: italic;">"gawk -F '[ \"]' '/desktop-.*-buffer/ {getline; if($4) print $4}' %s"</span>
  <span style="color: #888a85;">"Command to run to prepare the readahead file list"</span>)

(<span style="color: #729fcf; font-weight: bold;">defun</span> <span style="color: #edd400; font-weight: bold; font-style: italic;">dim-desktop-get-readahead-file-list</span> (<span style="color: #8ae234; font-weight: bold;">&amp;optional</span> filename dir)
  <span style="color: #888a85;">"Get the file list for readahead from the desktop file in DIR, or ~"</span>
  (<span style="color: #729fcf; font-weight: bold;">with-temp-file</span> (or filename dim-desktop-file-readahead-list)
    (insert
     (shell-command-to-string
      (format dim-desktop-filelist-command
              (expand-file-name desktop-base-file-name (or dir <span style="color: #ad7fa8; font-style: italic;">"~"</span>)))))))

<span style="color: #888a85;">;; </span><span style="color: #888a85;">This will not work because the hook runs before the buffers are added to
</span><span style="color: #888a85;">;; </span><span style="color: #888a85;">the desktop file.
</span><span style="color: #888a85;">;;</span><span style="color: #888a85;">
</span><span style="color: #888a85;">;;</span><span style="color: #888a85;">(add-hook 'desktop-save-hook 'dim-desktop-get-readahead-file-list)
</span>
<span style="color: #888a85;">;; </span><span style="color: #888a85;">so instead, advise the function
</span>(<span style="color: #729fcf; font-weight: bold;">defadvice</span> <span style="color: #edd400; font-weight: bold; font-style: italic;">desktop-save</span> (after desktop-save-readahead activate)
  <span style="color: #888a85;">"Prepare a readahead(8) file for the desktop file"</span>
  (dim-desktop-get-readahead-file-list))

(<span style="color: #729fcf; font-weight: bold;">provide</span> '<span style="color: #8ae234;">dim-desktop</span>)
</pre>
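
<p>For readers who don't speak <code>awk</code>, here is a rough Python sketch of what the
<code>gawk</code> command above does.  The function name and the sample lines are made
up for the example, and the sample only assumes that the desktop file puts
the file name, quoted and indented by two spaces, on the line following each
<code>desktop-create-buffer</code> entry.</p>

```python
import re


def readahead_file_list(lines):
    """Return the file names found on the line after each buffer entry."""
    files = []
    it = iter(lines)
    for line in it:
        if re.search(r"desktop-.*-buffer", line):
            # like awk's getline: consume the next line right away
            following = next(it, "")
            # like awk -F '[ "]': every space or double quote splits a field
            fields = re.split(r'[ "]', following.rstrip("\n"))
            # awk's $4 is fields[3]; the slice copes with short lines
            fourth = "".join(fields[3:4])
            if fourth:
                files.append(fourth)
    return files


# Assumed excerpt of a desktop file: one file buffer, one buffer without file
sample = [
    "(desktop-create-buffer 206\n",
    '  "/home/dim/.emacs"\n',
    '  ".emacs"\n',
    "(desktop-create-buffer 206\n",
    "  nil\n",
    '  "*scratch*"\n',
]
print(readahead_file_list(sample))  # ['/home/dim/.emacs']
```

<p>Like the <code>awk</code> version, this sketch cuts a file name at its first space, which
is good enough as long as your paths don't contain any.</p>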

<p>The <code>awk</code> construct <code>getline</code> lets you process the next line of the input file,
which is very practical here (and in a host of other situations).  Now that
we have a file containing the list of files <code>Emacs</code> will load, we have to
tweak the system to <code>readahead</code> those disk blocks.  As I'm currently using <a href="http://kde.org/">KDE</a>
again, I've done it this way:</p>

<pre class="src">
% cat ~/.kde/Autostart/readahead.emacs.sh
#! /bin/bash

# just readahead the emacs desktop files
# this file listing is maintained directly from Emacs itself
readahead ~/.emacs.desktop.readahead
</pre>

<p>So, well, it works.  The files that <code>Emacs</code> will need are pre-read, so by the
time the desktop really gets to them, I see no more disk activity (laptops
have an LED that shows it happening).  But the desktop loading time has not
changed...</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/restore.html">restore</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 23 Feb 2011 16:45:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/02/23-desktop-mode-and-readahead.html</guid>
</item>
<item>
  <title>Back from FOSDEM</title>
  <link>http://tapoueh.org/blog/2011/02/07-back-from-fosdem.html</link>
  <description><![CDATA[<h1>Back from FOSDEM</h1>
<div class="date">Monday, February 07 2011, 11:10</div>
<div id="article">
<p>This year we were in the main building of the conference, and apparently the
booth went very well, selling lots of <a href="http://postgresqleu.spreadshirt.net/">PostgreSQL merchandise</a> etc.  I had the
pleasure to once again meet with the community, but being there only 1 day I
didn't spend as much time as I would have liked with some of the people there.</p>

<p>In case you're wondering, my <a href="http://fosdem.org/2011/schedule/event/pg_extension1">extensions talk</a> went quite well, and several
people were kind enough to tell me they appreciated it!  It was recorded on
video, so we will soon have proof showing how bad it really was
and how <em>polite</em> those people really are :)</p>

<p>I will soon be able to write an article series detailing what an Extension is
and how you deal with one, either as a user or as an author.  Well in fact the
goal is for any user to easily become an extension author, as I think lots
of people are already maintaining server side code but are missing the tools to
manage it properly.  But that will begin once the patch is in, so that I
present <em>the real stuff</em> rather than what I proposed to the community… Stay
tuned!</p>



<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/conferences.html">Conferences</a> <a href="../../../tags/extensions.html">Extensions</a> <a href="../../../tags/fosdem.html">FOSDEM</a> <a href="../../../tags/9.1.html">9.1</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 07 Feb 2011 11:10:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/02/07-back-from-fosdem.html</guid>
</item>
<item>
  <title>Going to FOSDEM</title>
  <link>http://tapoueh.org/blog/2011/02/01-going-to-fosdem.html</link>
  <description><![CDATA[<h1>Going to FOSDEM</h1>
<div class="date">Tuesday, February 01 2011, 13:35</div>
<div id="article">
<p>A quick blog entry to say that yes:</p>

<center>
<p><img src="../../../images//going-to-fosdem-2011.png" alt=""></p>
</center>


<p>And I will even give my <a href="http://fosdem.org/2011/schedule/event/pg_extension1">Extensions talk</a>, which was a <a href="http://blog.hagander.net/archives/183-Feedback-from-PGDay.EU-the-speakers.html">success at pgday.eu</a>.  The
talk will be updated to include the latest developments of the extensions
feature, as some of it has already changed in between, and to detail the plan
for the <code>ALTER EXTENSION ... UPGRADE</code> feature that I'd like to see included as
soon as <code>9.1</code>, but time is running so fast.</p>

<p>In fact the design for the <code>UPGRADE</code> has been done and reviewed already, but
consensus has yet to be reached on how to specify which upgrade file to
use when upgrading from a given version to another.  I've solved it in my
patch, of course, by adding properties into the extension's <em>control
file</em>.  That's the best place for that setup I think: it allows lots of
flexibility, leaves the extension's author in charge, and avoids hard
coding any kind of assumption about file naming or whatever.</p>

<p>The coming days and reviews will tell us more about how the design is received.
Meanwhile, we're working on finalizing the main extensions patch, offering
<code>pg_dump</code> support.</p>

<p>See you at <a href="http://fosdem.org/2011/">FOSDEM</a>!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/conferences.html">Conferences</a> <a href="../../../tags/extensions.html">Extensions</a> <a href="../../../tags/fosdem.html">FOSDEM</a> <a href="../../../tags/9.1.html">9.1</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 01 Feb 2011 13:35:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/02/01-going-to-fosdem.html</guid>
</item>


<item>
  <title>Starting afresh with el-get</title>
  <link>http://tapoueh.org/blog/2011/01/blog/2011/01/11-starting-afresh-with-el-get.html</link>
  <description><![CDATA[<p>It so happens that a colleague of mine wanted to start using <a href="http://www.gnu.org/software/emacs/">Emacs</a> but
couldn't get into it. He insists on having proper color themes in all
applications and some sensible defaults full of nifty add-ons everywhere,
and didn't want to have to learn that much about <em>Emacs</em> and <em>Emacs Lisp</em> to get
started. I'm not even sure that he will <a href="http://www.gnu.org/software/emacs/tour/">Take the Emacs tour</a>.</p>

<p>You might tell me that there's nothing we can do for such unfriendly
users. Well, here's what I did:</p>

<pre class="src">
<span style="color: #b22222;">;; </span><span style="color: #b22222;">emacs setup
</span>
(add-to-list 'load-path <span style="color: #bc8f8f;">"~/.emacs.d/el-get/el-get"</span>)
(<span style="color: #7f007f;">require</span> '<span style="color: #5f9ea0;">el-get</span>)
(setq
 el-get-sources
 '(el-get
   php-mode-improved
   psvn
   auto-complete
   switch-window

   (<span style="color: #da70d6;">:name</span> buffer-move
          <span style="color: #da70d6;">:after</span> (<span style="color: #7f007f;">lambda</span> ()
                   (global-set-key (kbd <span style="color: #bc8f8f;">"&lt;C-S-up&gt;"</span>)     'buf-move-up)
                   (global-set-key (kbd <span style="color: #bc8f8f;">"&lt;C-S-down&gt;"</span>)   'buf-move-down)
                   (global-set-key (kbd <span style="color: #bc8f8f;">"&lt;C-S-left&gt;"</span>)   'buf-move-left)
                   (global-set-key (kbd <span style="color: #bc8f8f;">"&lt;C-S-right&gt;"</span>)  'buf-move-right)))

   (<span style="color: #da70d6;">:name</span> magit
          <span style="color: #da70d6;">:after</span> (<span style="color: #7f007f;">lambda</span> ()
                   (global-set-key (kbd <span style="color: #bc8f8f;">"C-x C-z"</span>) 'magit-status)))

   (<span style="color: #da70d6;">:name</span> goto-last-change
          <span style="color: #da70d6;">:after</span> (<span style="color: #7f007f;">lambda</span> ()
                   <span style="color: #b22222;">;; </span><span style="color: #b22222;">azerty keyboard here, don't use C-x C-/
</span>                   (global-set-key (kbd <span style="color: #bc8f8f;">"C-x C-_"</span>) 'goto-last-change)))))

(<span style="color: #7f007f;">when</span> window-system
   (add-to-list 'el-get-sources  'color-theme-tango))

(el-get 'sync)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">visual settings
</span>(setq inhibit-splash-screen t)
(menu-bar-mode -1)
(tool-bar-mode -1)
(scroll-bar-mode -1)

(line-number-mode 1)
(column-number-mode 1)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">Use the clipboard, pretty please, so that copy/paste "works"
</span>(setq x-select-enable-clipboard t)

(set-frame-font <span style="color: #bc8f8f;">"Monospace-10"</span>)

(global-hl-line-mode)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">follow external changes to files
</span>(global-auto-revert-mode 1)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">for colors in M-x shell
</span>(autoload 'ansi-color-for-comint-mode-on <span style="color: #bc8f8f;">"ansi-color"</span> nil t)
(add-hook 'shell-mode-hook 'ansi-color-for-comint-mode-on)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">S-arrows to switch windows
</span>(windmove-default-keybindings)
(setq windmove-wrap-around t)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">find-file-at-point when it makes sense
</span>(setq ffap-machine-p-known 'accept) <span style="color: #b22222;">; </span><span style="color: #b22222;">no pinging
</span>(setq ffap-url-regexp nil) <span style="color: #b22222;">; </span><span style="color: #b22222;">disable URL features in ffap
</span>(setq ffap-ftp-regexp nil) <span style="color: #b22222;">; </span><span style="color: #b22222;">disable FTP features in ffap
</span>(define-key global-map (kbd <span style="color: #bc8f8f;">"C-x C-f"</span>) 'find-file-at-point)

(<span style="color: #7f007f;">require</span> '<span style="color: #5f9ea0;">ibuffer</span>)
(global-set-key <span style="color: #bc8f8f;">"\C-x\C-b"</span> 'ibuffer)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">use iswitchb-mode for C-x b
</span>(iswitchb-mode)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">I can't remember having meant to use C-z as suspend-frame
</span>(global-set-key (kbd <span style="color: #bc8f8f;">"C-z"</span>) 'undo)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">winner-mode to go back to the previous layout with C-c &lt;left&gt;
</span>(winner-mode 1)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">dired-x for C-x C-j
</span>(<span style="color: #7f007f;">require</span> '<span style="color: #5f9ea0;">dired-x</span>)

<span style="color: #b22222;">;; </span><span style="color: #b22222;">full screen
</span>(<span style="color: #7f007f;">defun</span> <span style="color: #0000ff;">fullscreen</span> ()
  (interactive)
  (set-frame-parameter nil 'fullscreen
                       (<span style="color: #7f007f;">if</span> (frame-parameter nil 'fullscreen) nil 'fullboth)))
(global-set-key [f11] 'fullscreen)
</pre>

<p>With just these 87 simple lines (all included) of setup, my local user is
very happy to switch to using <a href="http://www.gnu.org/software/emacs/">our favorite editor</a>. And he's not even afraid
(yet) of his <code>~/.emacs</code>. I say that's a very good sign of where we are with
<a href="https://github.com/dimitri/el-get">el-get</a>!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 11 Jan 2011 16:20:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/01/blog/2011/01/11-starting-afresh-with-el-get.html</guid>
</item>
<item>
  <title>Starting afresh with el-get</title>
  <link>http://tapoueh.org/blog/2011/01/11-starting-afresh-with-el-get.html</link>
  <description><![CDATA[<h1>Starting afresh with el-get</h1>
<div class="date">Tuesday, January 11 2011, 16:20</div>
<div id="article">
<p>It so happens that a colleague of mine wanted to start using <a href="http://www.gnu.org/software/emacs/">Emacs</a> but
couldn't get into it. He insists on having proper color themes in all
applications and some sensible defaults full of nifty add-ons everywhere,
and doesn't want to have to learn that much about <em>Emacs</em> and <em>Emacs Lisp</em> to get
started. I'm not even sure that he will <a href="http://www.gnu.org/software/emacs/tour/">Take the Emacs tour</a>.</p>

<p>You might tell me that there's nothing we can do for such unfriendly
users. Well, here's what I did:</p>

<pre class="src">
<span style="color: #888a85;">;; </span><span style="color: #888a85;">emacs setup
</span>
(add-to-list 'load-path <span style="color: #ad7fa8; font-style: italic;">"~/.emacs.d/el-get/el-get"</span>)
(<span style="color: #729fcf; font-weight: bold;">require</span> '<span style="color: #8ae234;">el-get</span>)
(setq
 el-get-sources
 '(el-get
   php-mode-improved
   psvn
   auto-complete
   switch-window

   (<span style="color: #729fcf;">:name</span> buffer-move
          <span style="color: #729fcf;">:after</span> (<span style="color: #729fcf; font-weight: bold;">lambda</span> ()
                   (global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"&lt;C-S-up&gt;"</span>)     'buf-move-up)
                   (global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"&lt;C-S-down&gt;"</span>)   'buf-move-down)
                   (global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"&lt;C-S-left&gt;"</span>)   'buf-move-left)
                   (global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"&lt;C-S-right&gt;"</span>)  'buf-move-right)))

   (<span style="color: #729fcf;">:name</span> magit
          <span style="color: #729fcf;">:after</span> (<span style="color: #729fcf; font-weight: bold;">lambda</span> ()
                   (global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"C-x C-z"</span>) 'magit-status)))

   (<span style="color: #729fcf;">:name</span> goto-last-change
          <span style="color: #729fcf;">:after</span> (<span style="color: #729fcf; font-weight: bold;">lambda</span> ()
                   <span style="color: #888a85;">;; </span><span style="color: #888a85;">azerty keyboard here, don't use C-x C-/
</span>                   (global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"C-x C-_"</span>) 'goto-last-change)))))

(<span style="color: #729fcf; font-weight: bold;">when</span> window-system
   (add-to-list 'el-get-sources  'color-theme-tango))

(el-get 'sync)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">visual settings
</span>(setq inhibit-splash-screen t)
(menu-bar-mode -1)
(tool-bar-mode -1)
(scroll-bar-mode -1)

(line-number-mode 1)
(column-number-mode 1)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">Use the clipboard, pretty please, so that copy/paste "works"
</span>(setq x-select-enable-clipboard t)

(set-frame-font <span style="color: #ad7fa8; font-style: italic;">"Monospace-10"</span>)

(global-hl-line-mode)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">follow external changes to files
</span>(global-auto-revert-mode 1)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">for colors in M-x shell
</span>(autoload 'ansi-color-for-comint-mode-on <span style="color: #ad7fa8; font-style: italic;">"ansi-color"</span> nil t)
(add-hook 'shell-mode-hook 'ansi-color-for-comint-mode-on)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">S-arrows to switch windows
</span>(windmove-default-keybindings)
(setq windmove-wrap-around t)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">find-file-at-point when it makes sense
</span>(setq ffap-machine-p-known 'accept) <span style="color: #888a85;">; </span><span style="color: #888a85;">no pinging
</span>(setq ffap-url-regexp nil) <span style="color: #888a85;">; </span><span style="color: #888a85;">disable URL features in ffap
</span>(setq ffap-ftp-regexp nil) <span style="color: #888a85;">; </span><span style="color: #888a85;">disable FTP features in ffap
</span>(define-key global-map (kbd <span style="color: #ad7fa8; font-style: italic;">"C-x C-f"</span>) 'find-file-at-point)

(<span style="color: #729fcf; font-weight: bold;">require</span> '<span style="color: #8ae234;">ibuffer</span>)
(global-set-key <span style="color: #ad7fa8; font-style: italic;">"\C-x\C-b"</span> 'ibuffer)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">use iswitchb-mode for C-x b
</span>(iswitchb-mode)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">I can't remember having meant to use C-z as suspend-frame
</span>(global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"C-z"</span>) 'undo)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">winner-mode to go back to the previous layout with C-c &lt;left&gt;
</span>(winner-mode 1)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">dired-x for C-x C-j
</span>(<span style="color: #729fcf; font-weight: bold;">require</span> '<span style="color: #8ae234;">dired-x</span>)

<span style="color: #888a85;">;; </span><span style="color: #888a85;">full screen
</span>(<span style="color: #729fcf; font-weight: bold;">defun</span> <span style="color: #edd400; font-weight: bold; font-style: italic;">fullscreen</span> ()
  (interactive)
  (set-frame-parameter nil 'fullscreen
                       (<span style="color: #729fcf; font-weight: bold;">if</span> (frame-parameter nil 'fullscreen) nil 'fullboth)))
(global-set-key [f11] 'fullscreen)
</pre>

<p>With just these 87 simple lines (all included) of setup, my local user is
very happy to switch to using <a href="http://www.gnu.org/software/emacs/">our favorite editor</a>. And he's not even afraid
(yet) of his <code>~/.emacs</code>. I say that's a very good sign of where we are with
<a href="https://github.com/dimitri/el-get">el-get</a>!</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/switch-window.html">switch-window</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 11 Jan 2011 16:20:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2011/01/11-starting-afresh-with-el-get.html</guid>
</item>
<item>
  <title>el-get 1.1, with 174 recipes</title>
  <link>http://tapoueh.org/blog/2010/12/blog/2010/12/20-el-get-11-with-174-recipes.html</link>
  <description><![CDATA[<p>Yes, you read that right, <a href="https://github.com/dimitri/el-get">el-get</a> currently <em>features</em> <code>174</code> <a href="https://github.com/dimitri/el-get/tree/master/recipes">recipes</a>, and is now
reaching the <code>1.1</code> release. The reason for this release is mainly that I have
two big chunks of code to review and the current code has been very stable
for a while. It seems better to do a release with the stable code that exists
now before shaking it up this much. If you're wondering when to jump in the
water and switch to using <em>el-get</em>, now is a pretty good time.</p>

<h3>New source types</h3>

<p class="first">We now have support for the <a href="http://www.archlinux.org/pacman/">pacman</a> package manager for <a href="http://www.archlinux.org/">archlinux</a>, and a
way to handle a package whose name differs between the recipe and the
distribution. We also have support for <a href="http://mercurial.selenic.com/">mercurial</a>, <a href="http://subversion.tigris.org/">subversion</a> and <a href="http://darcs.net/">darcs</a>.</p>

<p>Also, <a href="http://wiki.debian.org/Apt">apt-get</a> will sometimes prompt you to validate its choices: that's the
infamous <em>Do you want to continue?</em> prompt. We now handle that smoothly.</p>


<h3>(el-get 'sync)</h3>

<p class="first">In <code>1.1</code>, that really means <em>synchronous</em>. That means we install one package
after the other, and any error will stop it all. Before that, it was an
active wait loop over a parallel install: this option is still available
by calling <code>(el-get 'wait)</code>.</p>


<h3>No more <em>failed to install</em></h3>

<p class="first">Exactly. This error, which you may have encountered at some point, comes from trying to
install a package over a previous failed install attempt (network outage,
disk full, bad work-in-progress recipe, etc). After a while in the field it
was clear that no case was found where you would regret it if <a href="https://github.com/dimitri/el-get">el-get</a> just
removed the previous failed installation for you before going on to
install again, as asked. So that's now automatic.</p>


<h3>Featuring an overhauled :build facility</h3>

<p class="first">The <code>build</code> commands can now either be a list, as before, or something that we
<em>evaluate</em> for you. That makes for easier-to-maintain <em>recipes</em>, and here's an
example of that:</p>

<pre class="src">
(<span style="color: #da70d6;">:name</span> distel
       <span style="color: #da70d6;">:type</span> svn
       <span style="color: #da70d6;">:url</span> <span style="color: #bc8f8f;">"http://distel.googlecode.com/svn/trunk/"</span>
       <span style="color: #da70d6;">:info</span> <span style="color: #bc8f8f;">"doc"</span>
       <span style="color: #da70d6;">:build</span> `,(mapcar
                 (<span style="color: #7f007f;">lambda</span> (target)
                   (concat <span style="color: #bc8f8f;">"make "</span> target <span style="color: #bc8f8f;">" EMACS="</span> el-get-emacs))
                 '(<span style="color: #bc8f8f;">"clean"</span> <span style="color: #bc8f8f;">"all"</span>))
       <span style="color: #da70d6;">:load-path</span> (<span style="color: #bc8f8f;">"elisp"</span>)
       <span style="color: #da70d6;">:features</span> distel)
</pre>

<p>As you can see, that also allows maintaining multi-platform build
recipes, and supporting multiple Emacs versions too. It's still a little too much on
the <em>awkward</em> side of things, though, and that's one of the areas of ongoing work
for the next version.</p>


<h3>Misc improvements</h3>

<p class="first">We are now able to <code>byte-compile</code> your packages, and offer some more hooks
(<code>el-get-init-hooks</code> was requested along with a nice usage example). There's a new
<code>:localname</code> property that lets you pick where to save the local file when
using the <code>HTTP</code> method for retrieval, and that in turn allows fixing some
<em>recipes</em>.</p>

<pre class="src">
(<span style="color: #da70d6;">:name</span> xcscope
       <span style="color: #da70d6;">:type</span> http
       <span style="color: #da70d6;">:url</span> <span style="color: #bc8f8f;">"http://cscope.cvs.sourceforge.net/viewvc/cscope/cscope/contrib/xcscope/xcscope.el?revision=1.14&amp;content-type=text%2Fplain"</span>
       <span style="color: #da70d6;">:localname</span> <span style="color: #bc8f8f;">"xcscope.el"</span>
       <span style="color: #da70d6;">:features</span> xcscope)
</pre>

<p>Oh and you even get <code>:before</code> user function support, even if needing it often
shows that you're doing it in a strange way. More often than not it's
possible to do all you need to in the <code>:after</code> function, but this tool is
there so that you spend less time on having a working environment, not more,
right? :)</p>


<h3>Switch notice</h3>

<p class="first">All in all, if you're already using <a href="https://github.com/dimitri/el-get">el-get</a> you should consider switching to
<code>1.1</code> (by issuing <code>M-x el-get-update</code> of course), and if you're hesitating, just
join the fun now!</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 20 Dec 2010 16:45:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/12/blog/2010/12/20-el-get-11-with-174-recipes.html</guid>
</item>
<item>
  <title>el-get 1.1, with 174 recipes</title>
  <link>http://tapoueh.org/blog/2010/12/20-el-get-11-with-174-recipes.html</link>
  <description><![CDATA[<h1>el-get 1.1, with 174 recipes</h1>
<div class="date">Monday, December 20 2010, 16:45</div>
<div id="article">
<p>Yes, you read that right, <a href="https://github.com/dimitri/el-get">el-get</a> currently <em>features</em> <code>174</code> <a href="https://github.com/dimitri/el-get/tree/master/recipes">recipes</a>, and is now
reaching the <code>1.1</code> release. The reason for this release is mainly that I have
two big chunks of code to review and the current code has been very stable
for a while. It seems better to do a release with the stable code that exists
now before shaking it up this much. If you're wondering when to jump in the
water and switch to using <em>el-get</em>, now is a pretty good time.</p>

<h3>New source types</h3>

<p class="first">We now have support for the <a href="http://www.archlinux.org/pacman/">pacman</a> package manager for <a href="http://www.archlinux.org/">archlinux</a>, and a
way to handle a package whose name differs between the recipe and the
distribution. We also have support for <a href="http://mercurial.selenic.com/">mercurial</a>, <a href="http://subversion.tigris.org/">subversion</a> and <a href="http://darcs.net/">darcs</a>.</p>

<p>Also, <a href="http://wiki.debian.org/Apt">apt-get</a> will sometimes prompt you to validate its choices: that's the
infamous <em>Do you want to continue?</em> prompt. We now handle that smoothly.</p>


<h3>(el-get 'sync)</h3>

<p class="first">In <code>1.1</code>, that really means <em>synchronous</em>. That means we install one package
after the other, and any error will stop it all. Before that, it was an
active wait loop over a parallel install: this option is still available
by calling <code>(el-get 'wait)</code>.</p>


<h3>No more <em>failed to install</em></h3>

<p class="first">Exactly. This error, which you may have encountered at some point, comes from trying to
install a package over a previous failed install attempt (network outage,
disk full, bad work-in-progress recipe, etc). After a while in the field it
was clear that no case was found where you would regret it if <a href="https://github.com/dimitri/el-get">el-get</a> just
removed the previous failed installation for you before going on to
install again, as asked. So that's now automatic.</p>


<h3>Featuring an overhauled :build facility</h3>

<p class="first">The <code>build</code> commands can now either be a list, as before, or something that we
<em>evaluate</em> for you. That makes for easier-to-maintain <em>recipes</em>, and here's an
example of that:</p>

<pre class="src">
(<span style="color: #729fcf;">:name</span> distel
       <span style="color: #729fcf;">:type</span> svn
       <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"http://distel.googlecode.com/svn/trunk/"</span>
       <span style="color: #729fcf;">:info</span> <span style="color: #ad7fa8; font-style: italic;">"doc"</span>
       <span style="color: #729fcf;">:build</span> `,(mapcar
                 (<span style="color: #729fcf; font-weight: bold;">lambda</span> (target)
                   (concat <span style="color: #ad7fa8; font-style: italic;">"make "</span> target <span style="color: #ad7fa8; font-style: italic;">" EMACS="</span> el-get-emacs))
                 '(<span style="color: #ad7fa8; font-style: italic;">"clean"</span> <span style="color: #ad7fa8; font-style: italic;">"all"</span>))
       <span style="color: #729fcf;">:load-path</span> (<span style="color: #ad7fa8; font-style: italic;">"elisp"</span>)
       <span style="color: #729fcf;">:features</span> distel)
</pre>

<p>As you can see, that also allows maintaining multi-platform build
recipes, and supporting multiple Emacs versions too. It's still a little too much on
the <em>awkward</em> side of things, though, and that's one of the areas of ongoing work
for the next version.</p>


<h3>Misc improvements</h3>

<p class="first">We are now able to <code>byte-compile</code> your packages, and offer some more hooks
(<code>el-get-init-hooks</code> was requested along with a nice usage example). There's a new
<code>:localname</code> property that lets you pick where to save the local file when
using the <code>HTTP</code> method for retrieval, and that in turn allows fixing some
<em>recipes</em>.</p>

<pre class="src">
(<span style="color: #729fcf;">:name</span> xcscope
       <span style="color: #729fcf;">:type</span> http
       <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"http://cscope.cvs.sourceforge.net/viewvc/cscope/cscope/contrib/xcscope/xcscope.el?revision=1.14&amp;content-type=text%2Fplain"</span>
       <span style="color: #729fcf;">:localname</span> <span style="color: #ad7fa8; font-style: italic;">"xcscope.el"</span>
       <span style="color: #729fcf;">:features</span> xcscope)
</pre>

<p>Oh and you even get <code>:before</code> user function support, even if needing it often
shows that you're doing it in a strange way. More often than not it's
possible to do all you need to in the <code>:after</code> function, but this tool is
there so that you spend less time on having a working environment, not more,
right? :)</p>


<h3>Switch notice</h3>

<p class="first">All in all, if you're already using <a href="https://github.com/dimitri/el-get">el-get</a> you should consider switching to
<code>1.1</code> (by issuing <code>M-x el-get-update</code> of course), and if you're hesitating, just
join the fun now!</p>



<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/release.html">release</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 20 Dec 2010 16:45:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/12/20-el-get-11-with-174-recipes.html</guid>
</item>












<item>
  <title>Dynamic Triggers in PLpgSQL</title>
  <link>http://tapoueh.org/blog/2010/11/24-dynamic-triggers-in-plpgsql.html</link>
  <description><![CDATA[<h1>Dynamic Triggers in PLpgSQL</h1>
<div class="date">Wednesday, November 24 2010, 16:45</div>
<div id="article">
<p>You certainly know that implementing <em>dynamic</em> triggers in <code>PLpgSQL</code> is
impossible. But I had a very bad night, having been up since 3:30 am
today, so when a developer asked me about reusing the same trigger
function code for more than one table and for a dynamic column name, I
didn't remember that it was impossible.</p>

<p>Here's what happens in such cases, after a long time on the problem (yes,
overall, that's a slow day). Note that I'm abusing the <code>(record_literal).*</code>
notation a lot in there, and even the <code>(record_literal).column_name</code> too.</p>

<pre class="src">
CREATE OR REPLACE FUNCTION public.update_timestamp()
 RETURNS TRIGGER
 LANGUAGE plpgsql
AS $f$
DECLARE
    ts_column varchar;
    old_timestamp timestamptz;
    attname name;
    n text;
    v text;
BEGIN
    IF TG_NARGS != 1
    THEN
        RAISE EXCEPTION <span style="color: #ad7fa8; font-style: italic;">'Trigger public.update_timestamp() called with % args'</span>,
                         TG_NARGS;
    END IF;

    ts_column := TG_ARGV[0];

    EXECUTE <span style="color: #ad7fa8; font-style: italic;">'SELECT n.'</span> || ts_column
         || <span style="color: #ad7fa8; font-style: italic;">' FROM (SELECT ('</span>
         || quote_literal(OLD) || <span style="color: #ad7fa8; font-style: italic;">'::'</span> || TG_RELID::regclass
         || <span style="color: #ad7fa8; font-style: italic;">').*) as n'</span>
       INTO old_timestamp;

    <span style="color: #888a85;">-- build NEW record text
</span>    n := <span style="color: #ad7fa8; font-style: italic;">'('</span>;
    FOR attname IN
      EXECUTE <span style="color: #ad7fa8; font-style: italic;">'SELECT attname '</span>
           || <span style="color: #ad7fa8; font-style: italic;">'  FROM pg_class c left join pg_attribute a on a.attrelid = c.oid'</span>
           || <span style="color: #ad7fa8; font-style: italic;">' WHERE c.oid = $1 and attnum &gt; 0 and not attisdropped order by attnum'</span>
       USING TG_RELID
    LOOP
        EXECUTE <span style="color: #ad7fa8; font-style: italic;">'SELECT ('</span> || quote_literal(NEW) || <span style="color: #ad7fa8; font-style: italic;">'::'</span> || TG_RELID::regclass || <span style="color: #ad7fa8; font-style: italic;">').'</span> || attname INTO v;

        IF n != <span style="color: #ad7fa8; font-style: italic;">'('</span> THEN n := n || <span style="color: #ad7fa8; font-style: italic;">','</span>; END IF;

        IF attname = ts_column
           AND v::timestamptz IS NOT DISTINCT FROM old_timestamp
        THEN
                n := n || now();
        ELSE
                n := n || COALESCE(v, <span style="color: #ad7fa8; font-style: italic;">''</span>);
        END IF;
    END LOOP;
    n := n || <span style="color: #ad7fa8; font-style: italic;">')'</span>;

    EXECUTE <span style="color: #ad7fa8; font-style: italic;">'SELECT ($1::'</span> || TG_RELID::regclass || <span style="color: #ad7fa8; font-style: italic;">').*'</span>
      INTO NEW
     USING n;

    RETURN NEW;
END;
$f$;
</pre>

<p>It's not pretty, and not fast. It's about <code>2 ms</code> per call on a table with <code>15</code>
columns, in some preliminary tests. But it sure was a nice challenge!</p>


<h2>Tags</h2>

<p><a href="../../../tags/plpgsql.html">plpgsql</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 24 Nov 2010 16:45:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/11/24-dynamic-triggers-in-plpgsql.html</guid>
</item>
<item>
  <title>pg_basebackup</title>
  <link>http://tapoueh.org/blog/2010/11/07-pg_basebackup.html</link>
  <description><![CDATA[<h1>pg_basebackup</h1>
<div class="date">Sunday, November 07 2010, 13:45</div>
<div id="article">
<p><a href="http://2ndquadrant.com/about/#krosing">Hannu</a> just gave me a good idea in <a href="http://archives.postgresql.org/pgsql-hackers/2010-11/msg00236.php">this email</a> on <a href="http://archives.postgresql.org/pgsql-hackers/">-hackers</a>, proposing that
<a href="https://github.com/dimitri/pg_basebackup">pg_basebackup</a> should get the <code>xlog</code> files again and again in a loop for the
whole duration of the <em>base backup</em>. That's now done in the aforementioned
tool, whose options have become a little more useful:</p>

<pre class="src">
Usage: pg_basebackup.py [-v] [-f] [-j jobs] <span style="color: #ad7fa8; font-style: italic;">"dsn"</span> dest

Options:
  -h, --help            show this help message and exit
  --version             show version and quit
  -x, --pg_xlog         backup the pg_xlog files
  -v, --verbose         be verbose and about processing progress
  -d, --debug           show debug information, including SQL queries
  -f, --force           remove destination directory if it exists
  -j JOBS, --jobs=JOBS  how many helper jobs to launch
  -D DELAY, --delay=DELAY
                        pg_xlog subprocess loop delay, see -x
  -S, --slave           auxilliary process
  --stdin               get list of files to backup from stdin
</pre>

<p>Yeah, as implementing the <code>xlog</code> idea required having some kind of
parallelism, I built on it and the script now has a <code>--jobs</code> option for you to
set up how many processes to launch in parallel, each fetching some <code>base
backup</code> files over its own standard (<code>libpq</code>) <a href="http://www.postgresql.org/">PostgreSQL</a> connection, in
compressed chunks of <code>8 MB</code> (so what's sent over is less than <code>8 MB</code> per chunk).</p>
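
<p>To make the chunking idea concrete, here's a minimal sketch (not the code
from the script itself): it reads a stream <code>8 MB</code> at a time and compresses each
chunk on its own. The use of <code>zlib</code> and the <code>compressed_chunks</code> name are
assumptions made for the illustration.</p>

<pre class="src">
import io
import zlib

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MB read at a time, before compression


def compressed_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield each chunk read from stream, compressed on its own (zlib assumed)."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield zlib.compress(chunk)


# 20 MB of zeroes gets read as three chunks: 8 MB, 8 MB and 4 MB
data = io.BytesIO(bytes(20 * 1024 * 1024))
sizes = [len(c) for c in compressed_chunks(data)]
</pre>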

<p>The <code>xlog</code> loop will fetch again, wholesale, any <code>WAL</code> file whose <code>ctime</code> has changed.
It's easier this way, and tools to get optimized behavior already
do exist, either <a href="http://skytools.projects.postgresql.org/doc/walmgr.html">walmgr</a> or <a href="http://www.postgresql.org/docs/9.0/interactive/warm-standby.html#STREAMING-REPLICATION">walreceiver</a>.</p>
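
<p>Here's a hedged sketch of that <code>ctime</code> check, run against a local directory
rather than over a PostgreSQL connection; the <code>changed_files</code> helper and its
<code>seen</code> dictionary are made up for the example.</p>

<pre class="src">
import os


def changed_files(directory, seen):
    """Return the names of the files whose ctime changed since the last call.

    seen maps each file name to the ctime recorded the last time it was
    fetched, and is updated in place.
    """
    changed = []
    for name in sorted(os.listdir(directory)):
        ctime = os.stat(os.path.join(directory, name)).st_ctime
        if seen.get(name) != ctime:
            seen[name] = ctime
            changed.append(name)
    return changed
</pre>

<p>Calling it in a loop with the same <code>seen</code> dictionary returns every file the
first time around, then only the files that changed since the previous call.</p>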

<p>The script is still a short, self-contained <a href="http://python.org/">python</a> file, it just went
from about <code>100</code> lines of code to about <code>400</code> lines. There's no external
dependency, all it needs is provided by a standard python installation. The
problem with that is that it's using <code>select.poll()</code>, which I think is not
available on windows. Rather than supporting every system or adding to the dependencies,
I chose what's easiest for me.</p>

<pre class="src">
    <span style="color: #729fcf; font-weight: bold;">import</span> select
    <span style="color: #729fcf; font-weight: bold;">import</span> sys
    <span style="color: #eeeeec;">p</span> = select.poll()
    p.register(sys.stdin, select.POLLIN)
</pre>
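
<p>For comparison, here is a hedged sketch of the same kind of readiness check
written with the <code>selectors</code> module, which entered the python standard
library in version 3.4, well after this script was written, and which picks
the best mechanism the platform offers. The script itself does not use it, and
the helper name <code>wait_readable</code> is hypothetical. Note that on Windows this only
works with sockets, so it would not cover the <code>sys.stdin</code> case above.</p>

<pre class="src">
import selectors
import socket


def wait_readable(sock, timeout):
    """Return True if sock has data to read before the timeout expires."""
    with selectors.DefaultSelector() as selector:
        selector.register(sock, selectors.EVENT_READ)
        return bool(selector.select(timeout))


if __name__ == "__main__":
    reader, writer = socket.socketpair()
    print(wait_readable(reader, 0))  # nothing has been sent yet
    writer.send(b"x")
    print(wait_readable(reader, 1))  # one byte is waiting
    reader.close()
    writer.close()
</pre>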

<p>If you get to try it, please report back: you should know or easily
discover my <em>email</em>!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/skytools.html">skytools</a> <a href="../../../tags/backup.html">backup</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Sun, 07 Nov 2010 13:45:00 +0100</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/11/07-pg_basebackup.html</guid>
</item>


<item>
  <title>Introducing Extensions</title>
  <link>http://tapoueh.org/blog/2010/10/21-introducing-extensions.html</link>
  <description><![CDATA[<h1>Introducing Extensions</h1>
<div class="date">Thursday, October 21 2010, 13:45</div>
</div>
<div id="article">
<p>After reading <a href="http://database-explorer.blogspot.com/2010/10/extensions-in-91.html">Simon's blog post</a>, I can't help but try to give some details
about what it is exactly that I'm working on. As he said, there are several
aspects to <em>extensions</em> in <a href="http://www.postgresql.org/">PostgreSQL</a>, and it all begins here:
<a href="http://www.postgresql.org/docs/9.0/interactive/extend.html">Chapter 35. Extending SQL</a>.</p>

<p>It's possible, and mostly simple enough, to add your own code or behavior to
PostgreSQL, so that it will use your code and your semantics while solving
user queries. That's highly useful and it's easy to understand how so when
you look at some projects like <a href="http://postgis.refractions.net/">PostGIS</a>, <a href="http://pgfoundry.org/projects/ip4r/">ip4r</a> (index searches of <code>ip</code> in a
<code>range</code>, not limited to <code>CIDR</code> notation), or our own <em>Key Value Store</em>, <a href="http://www.postgresql.org/docs/9.0/interactive/hstore.html">hstore</a>.</p>

<h3>So, what's in an <em>Extension</em>?</h3>

<p class="first">An <em>extension</em> in its simplest form is a <code>SQL</code> <em>script</em> that you load into your
database but manage separately, meaning you don't want the script to be
part of your backups. Often, that kind of script will create new datatypes
and operators, support functions, user functions and index support, and it
will then rely on some <code>C</code> code that ships in a <em>shared library object</em>.</p>

<p>As far as PostgreSQL is concerned, at least in the current version of my
patch, the extension is first a <em>meta</em> information file that is used to
register it. We currently call that the <code>control</code> file. Then, it's an <code>SQL</code>
script that is <em>executed</em> by the server when you <code>create</code> the <em>extension</em>.</p>

<p>If it so happens that the <code>SQL</code> script depends on some <em>shared library object</em>
file, that file has to be present at the right place (<code>MODULE_PATHNAME</code>) for the
<em>extension</em> to be successfully created, but that's always been the case.</p>

<p>The problem with current releases of PostgreSQL, which the <em>extension</em> patch
solves, is <code>pg_dump</code> and <code>pg_restore</code> support. As said above, you don't want
the <code>SQL</code> script to be part of your dump, because it's not maintained in your
database, but in some code repository out there. What you want is to be able
to install the <em>extension</em> again at the file system level, then <code>pg_restore</code> your
database, which depends on the extension being there.</p>

<p>And that's exactly what the <em>extension</em> patch provides. Now that we have a <code>SQL</code>
object called an <code>extension</code>, maintained in the new <code>pg_extension</code> catalog,
we have an <code>Oid</code> to refer to. We use it by recording a dependency between
any object created by the script and the <em>extension</em> <code>Oid</code>, so that <code>pg_dump</code> can
be instructed to skip those objects.</p>
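
<p>As a toy model of that idea, and in no way how <code>pg_dump</code> is actually
implemented, the following python sketch keeps only the objects that have no
recorded dependency on an extension. The function name, the data layout and
the sample oids are all made up for illustration.</p>

<pre class="src">
def objects_to_dump(objects, depends_on_extension):
    """Keep objects that have no recorded dependency on an extension."""
    return [desc for oid, desc in objects if oid not in depends_on_extension]


if __name__ == "__main__":
    # (oid, description) pairs standing in for the contents of a database.
    objects = [
        (18498, "extension pg_trgm"),
        (18499, "function set_limit(real)"),
        (18505, "type gtrgm"),
        (18543, "table test"),
    ]
    # oid of each object created by the install script, mapped to the
    # oid of the extension that owns it.
    depends_on_extension = {18499: 18498, 18505: 18498}
    print(objects_to_dump(objects, depends_on_extension))
</pre>

<p>The extension itself stays in the result, which is what lets a restore
install it again, while the function and the type it created are skipped.</p>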


<h3>Examples?</h3>

<p class="first">So, let's have a look at what you can do if you play with a patched
development server version, or if you play directly from the <code>git</code> repository
at
<a href="http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension">http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension</a></p>

<pre class="src">
dim ~ createdb exts
dim ~ psql exts
psql (9.1devel)
Type <span style="color: #ad7fa8; font-style: italic;">"help"</span> for help.

exts=# \dx+
                                                        List of extensions
        Name        | | |                               Description
--------------------+-+-+-------------------------------------------------------------------------
 adminpack          | | | Administrative functions for PostgreSQL
 auto_username      | | | functions for tracking who changed a table
 autoinc            | | | functions for autoincrementing fields
 btree_gin          | | | GIN support for common types BTree operators
 btree_gist         | | | GiST support for common types BTree operators
 chkpass            | | | Store crypt()ed passwords
 citext             | | | case-insensitive character string type
 cube               | | | data type for representing multidimensional cubes
 dblink             | | | connect to other PostgreSQL databases from within a database
 dict_int           | | | example of an add-on dictionary template for full-text search
 dict_xsyn          | | | example of an add-on dictionary template for full-text search
 earthdistance      | | | calculating great circle distances on the surface of the Earth
 fuzzystrmatch      | | | determine similarities and distance between strings
 hstore             | | | storing sets of key/value pairs
 int_aggregate      | | | integer aggregator and an enumerator (obsolete)
 intarray           | | | one-dimensional arrays of integers: functions, operators, index support
 isn                | | | data types for the international product numbering standards
 lo                 | | | managing Large Objects
 ltree              | | | data type for hierarchical tree-like structure
 moddatetime        | | | functions for tracking last modification time
 pageinspect        | | | inspect the contents of database pages at a low level
 pg_buffercache     | | | examine the shared buffer cache in real time
 pg_freespacemap    | | | examine the free space map (FSM)
 pg_stat_statements | | | tracking execution statistics of all SQL statements executed
 pg_trgm            | | | determine the similarity of text, with indexing support
 pgcrypto           | | | cryptographic functions
 pgrowlocks         | | | show row locking information for a specified table
 pgstattuple        | | | obtain tuple-level statistics
 prefix             | | | Prefix Match Indexing
 refint             | | | functions for implementing referential integrity
 seg                | | | data type for representing line segments, or floating point intervals
 tablefunc          | | | various functions that return tables, including crosstab(text sql)
 test_parser        | | | example of a custom parser for full-text search
 timetravel         | | | functions for implementing time travel
 tsearch2           | | | backwards-compatible text search functionality (pre-8.3)
 unaccent           | | | text search dictionary that removes accents
(36 rows)
</pre>

<p>Ok, I've visibly edited the output to leave the <em>Version</em> and <em>Custom
Variable Classes</em> columns out. They take up a lot of screen space and are not
that useful here. Maybe the <em>classes</em> one will even get dropped from the
patch before reaching <code>9.1</code>, we'll see.</p>

<p>Let's pick an extension there and install it in our new database:</p>

<pre class="src">
exts=# create extension pg_trgm;
NOTICE:  Installing extension 'pg_trgm' from '/Users/dim/pgsql/exts/share/contrib/pg_trgm.sql', with user data
CREATE EXTENSION
exts=# \dx
                                           List of extensions
  Name   |  |  |                       Description
---------+--+--+---------------------------------------------------------
 pg_trgm |  |  | determine the similarity of text, with indexing support
(1 row)
</pre>

<p>See, that was easy enough. Same thing, the extra columns have been
removed. So, what's in this extension, you will ask me: what are those
objects that you would normally (that is, before the patch) find in your
<code>pg_dump</code> backup script?</p>

<pre class="src">
exts=# select * from pg_extension_objects('pg_trgm');
    class     | classid | objid |                                                                objdesc
--------------+---------+-------+----------------------------------------------------------------------------------------------------------------------------------------
 pg_extension |    3996 | 18498 | extension pg_trgm
 pg_proc      |    1255 | 18499 | function set_limit(real)
 pg_proc      |    1255 | 18500 | function show_limit()
 pg_proc      |    1255 | 18501 | function show_trgm(text)
 pg_proc      |    1255 | 18502 | function similarity(text,text)
 pg_proc      |    1255 | 18503 | function similarity_op(text,text)
 pg_operator  |    2617 | 18504 | operator %(text,text)
 pg_type      |    1247 | 18505 | type gtrgm
 pg_proc      |    1255 | 18506 | function gtrgm_in(cstring)
 pg_proc      |    1255 | 18507 | function gtrgm_out(gtrgm)
 pg_type      |    1247 | 18508 | type gtrgm[]
 pg_proc      |    1255 | 18509 | function gtrgm_consistent(internal,text,integer,oid,internal)
 pg_proc      |    1255 | 18510 | function gtrgm_compress(internal)
 pg_proc      |    1255 | 18511 | function gtrgm_decompress(internal)
 pg_proc      |    1255 | 18512 | function gtrgm_penalty(internal,internal,internal)
 pg_proc      |    1255 | 18513 | function gtrgm_picksplit(internal,internal)
 pg_proc      |    1255 | 18514 | function gtrgm_union(bytea,internal)
 pg_proc      |    1255 | 18515 | function gtrgm_same(gtrgm,gtrgm,internal)
 pg_opfamily  |    2753 | 18516 | operator family gist_trgm_ops for access method gist
 pg_opclass   |    2616 | 18517 | operator class gist_trgm_ops for access method gist
 pg_amop      |    2602 | 18518 | operator 1 %(text,text) of operator family gist_trgm_ops for access method gist
 pg_amproc    |    2603 | 18519 | function 1 gtrgm_consistent(internal,text,integer,oid,internal) of operator family gist_trgm_ops for access method gist
 pg_amproc    |    2603 | 18520 | function 2 gtrgm_union(bytea,internal) of operator family gist_trgm_ops for access method gist
 pg_amproc    |    2603 | 18521 | function 3 gtrgm_compress(internal) of operator family gist_trgm_ops for access method gist
 pg_amproc    |    2603 | 18522 | function 4 gtrgm_decompress(internal) of operator family gist_trgm_ops for access method gist
 pg_amproc    |    2603 | 18523 | function 5 gtrgm_penalty(internal,internal,internal) of operator family gist_trgm_ops for access method gist
 pg_amproc    |    2603 | 18524 | function 6 gtrgm_picksplit(internal,internal) of operator family gist_trgm_ops for access method gist
 pg_amproc    |    2603 | 18525 | function 7 gtrgm_same(gtrgm,gtrgm,internal) of operator family gist_trgm_ops for access method gist
 pg_proc      |    1255 | 18526 | function gin_extract_trgm(text,internal)
 pg_proc      |    1255 | 18527 | function gin_extract_trgm(text,internal,smallint,internal,internal)
 pg_proc      |    1255 | 18528 | function gin_trgm_consistent(internal,smallint,text,integer,internal,internal)
 pg_opfamily  |    2753 | 18529 | operator family gin_trgm_ops for access method gin
 pg_opclass   |    2616 | 18530 | operator class gin_trgm_ops for access method gin
 pg_amop      |    2602 | 18531 | operator 1 %(text,text) of operator family gin_trgm_ops for access method gin
 pg_amproc    |    2603 | 18532 | function 1 btint4cmp(integer,integer) of operator family gin_trgm_ops for access method gin
 pg_amproc    |    2603 | 18533 | function 2 gin_extract_trgm(text,internal) of operator family gin_trgm_ops for access method gin
 pg_amproc    |    2603 | 18534 | function 3 gin_extract_trgm(text,internal,smallint,internal,internal) of operator family gin_trgm_ops for access method gin
 pg_amproc    |    2603 | 18535 | function 4 gin_trgm_consistent(internal,smallint,text,integer,internal,internal) of operator family gin_trgm_ops for access method gin
(38 rows)
</pre>

<p>The main intended users of this function are the <em>extension</em> authors themselves, so
that it's easy for them to figure out which system identifier (the <code>objid</code>
column) has been assigned to each <code>SQL</code> object from their install
script. With this knowledge, you can prepare some <em>upgrade</em> scripts. But
that's for another patch altogether, so we'll get back to the matter in
another blog entry.</p>

<p>So we chose <a href="http://www.postgresql.org/docs/9.0/interactive/pgtrgm.html">trgm</a> as an example, let's follow the documentation and create a
test table and a custom index in there, just so that the extension is put to
good use. Then let's try to <code>DROP</code> our extension, because we're testing the
infrastructure, right?</p>

<pre class="src">
exts=# create table test(id bigint, name text);
CREATE TABLE
exts=# CREATE INDEX idx_test_name ON test USING gist (name gist_trgm_ops);
CREATE INDEX
exts=# drop extension pg_trgm;
ERROR:  cannot drop extension pg_trgm because other objects depend on it
DETAIL:  index idx_test_name depends on operator class gist_trgm_ops for access method gist
HINT:  Use DROP ... CASCADE to drop the dependent objects too.
</pre>

<p>Of course PostgreSQL is smart enough here: the <em>extension</em> patch had nothing
special to do to achieve that, apart from recording the dependencies. Next,
as we didn't <code>drop extension pg_trgm cascade;</code>, it's still in the database. So
let's see what a <code>pg_dump</code> will look like. As it's quite a lot of text to
paste, let's see the <code>pg_restore</code> catalog (its listing of the archive contents) instead.
And that's a feature that deserves to be better known, too.</p>

<pre class="src">
dim ~ pg_dump -Fc exts | pg_restore -l | grep -v '^;'
1812; 1262 18497 DATABASE - exts dim
1; 3996 18498 EXTENSION - pg_trgm
1813; 0 0 COMMENT - EXTENSION pg_trgm
6; 2615 2200 SCHEMA - public dim
1814; 0 0 COMMENT - SCHEMA public dim
1815; 0 0 ACL - public dim
320; 2612 11602 PROCEDURAL LANGUAGE - plpgsql dim
1521; 1259 18543 TABLE public test dim
1809; 0 18543 TABLE DATA public test dim
1808; 1259 18549 INDEX public idx_test_name dim
</pre>

<p>As you see, the only <code>pg_trgm</code> objects that got into the backup are the <code>EXTENSION</code>
and its <code>COMMENT</code>. Nothing like the types or the functions that the <code>pg_trgm</code>
script creates.</p>
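
<p>If you want to check such a listing from a script, a small parser is
enough. The sketch below is python and rests on an assumption drawn from the
output above: each entry is a dump id, a semicolon, a catalog oid, an object
oid, then free-form text. The helper names <code>parse_toc_line</code> and
<code>extension_entries</code> are hypothetical.</p>

<pre class="src">
def parse_toc_line(line):
    """Split one listing entry into its three ids and the remaining text."""
    dump_id, rest = line.split(";", 1)
    catalog_oid, object_oid, description = rest.split(None, 2)
    return int(dump_id), int(catalog_oid), int(object_oid), description


def extension_entries(lines):
    """Return the parsed entries whose text starts with EXTENSION."""
    entries = [parse_toc_line(line) for line in lines if line.strip()]
    return [entry for entry in entries if entry[3].startswith("EXTENSION")]


if __name__ == "__main__":
    listing = [
        "1812; 1262 18497 DATABASE - exts dim",
        "1; 3996 18498 EXTENSION - pg_trgm",
        "1813; 0 0 COMMENT - EXTENSION pg_trgm",
        "1521; 1259 18543 TABLE public test dim",
    ]
    print(extension_entries(listing))
</pre>

<p>The two oids of the <code>EXTENSION</code> entry are the same <code>classid</code> and <code>objid</code> that
<code>pg_extension_objects</code> reported for the extension earlier.</p>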


<h3>What does it mean for extension authors?</h3>

<p class="first">In order to be an <em>extension</em>, you have to prepare a <em>control</em> file in which you
give the necessary information to register your script. This file must be
named <code>extension.control</code> if the script is named <code>extension.sql</code>, at least at
the moment. This file can benefit from some variable expansion too, as
the current <code>extension.sql.in</code> does: if you provide an
<code>extension.control.in</code> file, the term <code>VERSION</code> will be expanded to whatever
<code>$(VERSION)</code> is set to in your <code>Makefile</code>.</p>

<p>If you have never written a <code>C</code> coded <em>extension</em> for PostgreSQL, this might look
complex and irrelevant. The bottom line is that you need a <code>Makefile</code> so that you can
benefit easily from the PostgreSQL infrastructure work and have the <code>make
install</code> operation put your files in the right place, including the new
<code>control</code> file.</p>


<h3>That's it for today, folks</h3>

<p class="first">An upcoming blog entry will detail what happens with extensions providing <em>user
data</em>, and the <code>CREATE EXTENSION name WITH NO DATA;</code> variant. Stay tuned!</p>



<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/extensions.html">Extensions</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/ip4r.html">ip4r</a> <a href="../../../tags/plpgsql.html">plpgsql</a> <a href="../../../tags/backup.html">backup</a> <a href="../../../tags/restore.html">restore</a> <a href="../../../tags/prefix.html">prefix</a> <a href="../../../tags/9.1.html">9.1</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 21 Oct 2010 13:45:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/10/21-introducing-extensions.html</guid>
</item>
<item>
  <title>Extensions: writing a patch for PostgreSQL</title>
  <link>http://tapoueh.org/blog/2010/10/15-extensions-writing-a-patch-for-postgresql.html</link>
  <description><![CDATA[<h1>Extensions: writing a patch for PostgreSQL</h1>
<div class="date">Friday, October 15 2010, 11:30</div>
</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>These days, thanks to my <a href="http://2ndquadrant.com/">community oriented job</a>, I'm working full time on a
<a href="http://www.postgresql.org/">PostgreSQL</a> patch to complete basic support for <a href="http://www.postgresql.org/docs/9/static/extend.html">extending SQL</a>. The first thing I
want to share is that patching the <em>backend code</em> is not as hard as one would
think. The second is that <a href="http://git-scm.com/">git</a> really is helping.</p>

<p><em>“Not as hard as one would think</em>, are you kidding me?”, I hear some
say. Well, that's true. It's <code>C</code> code in there, but with a very good layer of
abstractions so that you're not dealing with subtle problems that much. Of
course it happens that you have to, and managing the memory isn't
optional. That said, <code>palloc()</code> and the <em>memory contexts</em> implementation make
that as easy as <em>in lots of cases, you don't have to think about it</em>.</p>

<p>PostgreSQL is very well known for its reliability, and that's not something
that just happened. All the source code is organized in a way that makes it
possible, so your main task is to write code that looks as much as possible
like the existing surrounding code. And we all know how to <em>copy paste</em>,
right?</p>

<p>So, my current work on the <em>extensions</em> is to make it so that if you install
<a href="http://www.postgresql.org/docs/9.0/interactive/hstore.html">hstore</a> in your database (to pick an example), your backup won't contain any
<em>hstore</em> specific objects (types, functions, operators, index support objects,
etc) but rather a single line that tells PostgreSQL to install <em>hstore</em> again.</p>

<pre class="src">
CREATE EXTENSION hstore;
</pre>

<p>The feature already works in <a href="http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension">my git branch</a> and I'm extracting infrastructure
work from there to ease review. That's when <code>git</code> helps a lot. What I've done is
create a new branch from the master one, then <a href="http://www.kernel.org/pub/software/scm/git/docs/git-cherry-pick.html">cherry pick</a> the patches of
interest. Well, sometimes you have to resort to helper tools. I've been told
after the fact that using <code>git cherry-pick -n</code> would have made the
following much simpler:</p>

<pre class="src">
dim ~/dev/PostgreSQL/postgresql-extension git cherry-pick 3f291b4f82598309368610431cf2a18d7b7a7950
error: could not apply 3f291b4... Implement dependency tracking for CREATE EXTENSION, and DROP EXTENSION ... CASCADE.
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add &lt;paths&gt;' or 'git rm &lt;paths&gt;'
hint: and commit the result with 'git commit -c 3f291b4'
dim ~/dev/PostgreSQL/postgresql-extension git status \
| awk '/modified/ &amp;&amp; ! /both/ &amp;&amp; ! /genfile/ {print $3}
       /deleted/ {print $5}
       /both/    {print $4}' \
| xargs echo git reset -- \
| sh
Unstaged changes after reset:
M       src/backend/catalog/dependency.c
M       src/backend/catalog/heap.c
M       src/backend/catalog/pg_aggregate.c
M       src/backend/catalog/pg_conversion.c
M       src/backend/catalog/pg_namespace.c
M       src/backend/catalog/pg_operator.c
M       src/backend/catalog/pg_proc.c
M       src/backend/catalog/pg_type.c
M       src/backend/commands/extension.c
M       src/backend/commands/foreigncmds.c
M       src/backend/commands/opclasscmds.c
M       src/backend/commands/proclang.c
M       src/backend/commands/tsearchcmds.c
M       src/backend/nodes/copyfuncs.c
M       src/backend/nodes/equalfuncs.c
M       src/backend/parser/gram.y
M       src/include/catalog/dependency.h
M       src/include/commands/extension.h
M       src/include/nodes/parsenodes.h
</pre>

<p>That's what I did to prepare a side branch containing only changes to a part
of my current work. I had to filter the diff so much only because I'm
committing in fairly big steps rather than in very small chunks at a time. In
this case that means I had a single patch with several <em>units</em> of changes and
I wanted to extract only one. Well, it happens that even in such a case, <code>git</code>
is helping!</p>

<p>There's more to say about the <em>extension</em> related feature of course, but
that'll do it for this article. I'll just end with the following nice
<em>diffstat</em> of 4 days of work:</p>

<pre class="src">
dim ~/dev/PostgreSQL/postgresql-extension git --no-pager diff master..|wc -l
    3897
dim ~/dev/PostgreSQL/postgresql-extension git --no-pager diff master..|diffstat
 doc/src/sgml/extend.sgml               |   46 ++
 doc/src/sgml/ref/allfiles.sgml         |    2
 doc/src/sgml/ref/create_extension.sgml |   95 ++++
 doc/src/sgml/ref/drop_extension.sgml   |  115 +++++
 doc/src/sgml/reference.sgml            |    2
 src/backend/access/transam/xlog.c      |   95 ----
 src/backend/catalog/Makefile           |    1
 src/backend/catalog/dependency.c       |   25 +
 src/backend/catalog/heap.c             |    9
 src/backend/catalog/objectaddress.c    |   14
 src/backend/catalog/pg_aggregate.c     |    7
 src/backend/catalog/pg_conversion.c    |    7
 src/backend/catalog/pg_namespace.c     |   13
 src/backend/catalog/pg_operator.c      |    7
 src/backend/catalog/pg_proc.c          |    7
 src/backend/catalog/pg_type.c          |    8
 src/backend/commands/Makefile          |    3
 src/backend/commands/comment.c         |    6
 src/backend/commands/extension.c       |  688 +++++++++++++++++++++++++++++++++
 src/backend/commands/foreigncmds.c     |   19
 src/backend/commands/functioncmds.c    |    7
 src/backend/commands/opclasscmds.c     |   13
 src/backend/commands/proclang.c        |    7
 src/backend/commands/tsearchcmds.c     |   25 +
 src/backend/nodes/copyfuncs.c          |   22 +
 src/backend/nodes/equalfuncs.c         |   18
 src/backend/parser/gram.y              |   51 ++
 src/backend/tcop/utility.c             |   27 +
 src/backend/utils/adt/genfile.c        |  193 +++++++++
 src/backend/utils/init/postinit.c      |    3
 src/backend/utils/misc/Makefile        |    2
 src/backend/utils/misc/cfparser.c      |  113 +++++
 src/backend/utils/misc/guc-file.l      |   26 -
 src/backend/utils/misc/guc.c           |  160 ++++++-
 src/bin/pg_dump/common.c               |    6
 src/bin/pg_dump/pg_dump.c              |  520 ++++++++++++++++++++++--
 src/bin/pg_dump/pg_dump.h              |   10
 src/bin/pg_dump/pg_dump_sort.c         |    7
 src/bin/psql/command.c                 |    3
 src/bin/psql/describe.c                |   45 ++
 src/bin/psql/describe.h                |    3
 src/bin/psql/help.c                    |    1
 src/include/catalog/dependency.h       |    1
 src/include/catalog/indexing.h         |    6
 src/include/catalog/pg_extension.h     |   61 ++
 src/include/catalog/pg_proc.h          |   13
 src/include/catalog/toasting.h         |    1
 src/include/commands/extension.h       |   54 ++
 src/include/nodes/nodes.h              |    2
 src/include/nodes/parsenodes.h         |   20
 src/include/parser/kwlist.h            |    1
 src/include/utils/builtins.h           |    4
 src/include/utils/cfparser.h           |   18
 src/include/utils/guc.h                |   11
 src/makefiles/pgxs.mk                  |   21 -
 55 files changed, 2456 insertions(+), 188 deletions(-)
</pre>



<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/extensions.html">Extensions</a> <a href="../../../tags/backup.html">backup</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 15 Oct 2010 11:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/10/15-extensions-writing-a-patch-for-postgresql.html</guid>
</item>
<item>
  <title>Date puzzle for starters</title>
  <link>http://tapoueh.org/blog/2010/10/08-date-puzzle-for-starters.html</link>
  <description><![CDATA[<h1>Date puzzle for starters</h1>
<div class="date">Friday, October 08 2010, 10:00</div>
</div>
<div id="article">
<p>The <a href="http://www.postgresql.org/">PostgreSQL</a> <code>IRC</code> channel is a good place to be, for all the very good help
you can get there, because people there always want to be helpful,
sometimes because of the off-topic discussions, or to get to talk with
community core members. And to start up your day too.</p>

<p>This morning's question started simple: “how can I check if today is the
&quot;first sunday of the month&quot;, or &quot;the second tuesday of the month&quot; etc?”</p>

<p>And the first version of the answer, quite simple it is too:</p>

<pre class="src">
dim=#   with begin(d) as (select date_trunc(<span style="color: #ad7fa8; font-style: italic;">'month'</span>, <span style="color: #ad7fa8; font-style: italic;">'today'</span>::date)::date)
dim-# select d + (7 - extract(dow from d)::int) % 7 as sunday from begin;
   sunday
<span style="color: #888a85;">------------
</span> 2010-10-03
(1 row)
</pre>

<p>So you just have to compare the result of the query with <code>'today'::date</code>
and there you go. The problem is that the question could be read
the other way round, like, what is today in <em>first</em> or <em>second</em> <em>day name</em> of this
month <em>format</em>? Once more, <a href="http://blog.rhodiumtoad.org.uk/">RhodiumToad</a> to the rescue:</p>

<pre class="src">
select to_char(current_date,
               <span style="color: #ad7fa8; font-style: italic;">'"'</span> || ((ARRAY[<span style="color: #ad7fa8; font-style: italic;">'First'</span>,<span style="color: #ad7fa8; font-style: italic;">'Second'</span>,<span style="color: #ad7fa8; font-style: italic;">'Third'</span>,<span style="color: #ad7fa8; font-style: italic;">'Fourth'</span>,<span style="color: #ad7fa8; font-style: italic;">'Fifth'</span>])
                             [(extract(day from current_date)::integer - 1)/7 + 1]
                      )
                   || <span style="color: #ad7fa8; font-style: italic;">'" Day'</span>);
     to_char
<span style="color: #888a85;">------------------
</span> Second Friday
(1 row)
</pre>

<p>That's a straight answer to the question, read that way!</p>
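
<p>The arithmetic in the array subscript is easy to check outside the database.
Here is a python sketch of the same computation, where the function name
<code>ordinal_weekday</code> is made up for the example: days 1 to 7 of a month hold the
first occurrence of each weekday, days 8 to 14 the second, and so on.</p>

<pre class="src">
import datetime

ORDINALS = ["First", "Second", "Third", "Fourth", "Fifth"]
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def ordinal_weekday(day):
    """Describe a date as e.g. 'Second Friday' of its month."""
    # Integer division maps days 1-7 to 0, days 8-14 to 1, and so on.
    index = (day.day - 1) // 7
    # date.weekday() counts from Monday, which is 0.
    return ORDINALS[index] + " " + DAY_NAMES[day.weekday()]


if __name__ == "__main__":
    print(ordinal_weekday(datetime.date(2010, 10, 8)))  # Second Friday
</pre>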

<p>But the part that I found nice to play with was my first reading of the
question, as I don't let go of my ideas that easily, you see… so what
about writing a function to return the date of any <em>nth</em> occurrence of a given
<em>day of week</em> in a <em>given month</em>, defaulting to this very month?</p>

<pre class="src">
create or replace function get_nth_dow_of_month
 (
  nth int,
  dow int,
  begin date default current_date
 )
 returns date
 language sql
 strict
 as
$$
with month(d) as (
  select generate_series(date_trunc(<span style="color: #ad7fa8; font-style: italic;">'month'</span>, $3),
                         date_trunc(<span style="color: #ad7fa8; font-style: italic;">'month'</span>, $3) + interval <span style="color: #ad7fa8; font-style: italic;">'1 month - 1 day'</span>,
                         interval <span style="color: #ad7fa8; font-style: italic;">'1 day'</span>)::date
),
     repeat as (
  select d, extract(dow from d) as dow, (d - date_trunc(<span style="color: #ad7fa8; font-style: italic;">'month'</span>, $3)::date) / 7 as repeat
    from month
)
select d
  from repeat
 where dow = $2 and repeat = $1;
$$;

dim=# select get_nth_dow_of_month(0, 0);
 get_nth_dow_of_month
<span style="color: #888a85;">----------------------
</span> 2010-10-03
(1 row)

dim=# select get_nth_dow_of_month(1, 4, <span style="color: #ad7fa8; font-style: italic;">'2010-09-12'</span>);
 get_nth_dow_of_month
<span style="color: #888a85;">----------------------
</span> 2010-09-09
(1 row)
</pre>

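<p>So you see we just got the first Sunday of this month <code>(0, 0)</code> and the second
Thursday <code>(1, 4)</code> of the previous one. Both arguments count from zero: <code>nth</code> is
<code>0</code> for the first occurrence, and <code>dow</code> is <code>0</code> for Sunday, as <code>extract(dow from ...)</code>
returns it. Any date within a month is a good way
to tell which month you want to work in, as the function's written, abusing
<code>date_trunc</code> like it does.</p>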

<p>Now the way the function is written is unfinished. You want to fix it in one
of two ways. Either stop using <code>generate_series</code> to only output one row at a
time, or fix the <code>API</code> so that you can ask for more than a <em>nth dow</em> at a
time. Of course, that was a starter for me, not a problem I need to solve
directly, and that was a good excuse for a blog entry, so I won't fix
it. That's left as an exercise to our interested readers!</p>
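
<p>For readers who want a head start on the first option, here is one possible
take, sketched in python rather than <code>SQL</code> and certainly not the only way to do
it: compute the day directly from the weekday of the first of the month, with
no series to generate. It keeps the conventions of the function above (both
<code>nth</code> and <code>dow</code> count from zero, <code>0</code> being Sunday); returning <code>None</code> when the
month has no such day is a choice made for this sketch.</p>

<pre class="src">
import calendar
import datetime


def get_nth_dow_of_month(nth, dow, begin=None):
    """Return the nth (0-based) given weekday of begin's month, or None."""
    if begin is None:
        begin = datetime.date.today()
    first = begin.replace(day=1)
    # ISO numbers Sunday as 7 while PostgreSQL's dow uses 0.
    first_dow = first.isoweekday() % 7
    day = 1 + (dow - first_dow) % 7 + 7 * nth
    last_day = calendar.monthrange(first.year, first.month)[1]
    if day not in range(1, last_day + 1):
        return None  # for instance, no sixth Sunday in any month
    return first.replace(day=day)


if __name__ == "__main__":
    print(get_nth_dow_of_month(0, 0, datetime.date(2010, 10, 8)))
    print(get_nth_dow_of_month(1, 4, datetime.date(2010, 9, 12)))
</pre>

<p>The two calls print <code>2010-10-03</code> and <code>2010-09-09</code>, the same dates as the
<code>SQL</code> version returned above.</p>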


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/9.1.html">9.1</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 08 Oct 2010 10:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/10/08-date-puzzle-for-starters.html</guid>
</item>
<item>
  <title>Resuming work on Extensions, first little step</title>
  <link>http://tapoueh.org/blog/2010/10/07-resuming-work-on-extensions-first-little-step.html</link>
  <description><![CDATA[<h1>Resuming work on Extensions, first little step</h1>
<div class="date">Thursday, October 07 2010, 17:15</div>
</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>Yeah, I'm back to working on my part of the extension thing in <a href="http://www.postgresql.org/">PostgreSQL</a>.</p>

<p>First step is a little one, but as it has public consequences, I figured I'd
talk about it already. I've just refreshed my <code>git</code> repository to follow the
new <code>master</code> one, and you can see that here
<a href="http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=commitdiff;h=9a88e9de246218e93c04b6b97e1ef61d97925430">http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=commitdiff;h=9a88e9de246218e93c04b6b97e1ef61d97925430</a>.</p>

<p>It's been easier than I feared, mainly:</p>

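<pre class="src">
# Review my work, then save it as a patch file outside the repository
$ git --no-pager diff master..extension
$ git --no-pager format-patch master..extension
$ cp 0001-First-stab-at-writing-pg_execute_from_file-function.patch ..
# Replace the local master with the new official one
$ git checkout master
$ git pull -f pgmaster
$ git reset --hard pgmaster/master
# Restart the extension branch from master, then apply the hand-edited patch
$ git checkout extension
$ git reset --hard master
$ git am -s ../0001-First-stab-at-writing-pg_execute_from_file-function.edit.patch
# Check the result, then overwrite the published repository
$ git status
$ git log --pretty=short | head
$ git log -n2 --oneline
$ git push -f
</pre>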

<p>So that's still more steps than one would want to call dead simple, but still. The
<code>format-patch</code> command is to save my work away (all patches that are in the
<em>extension</em> branch but not in <em>master</em>; well, that was only one of them
here). Then, as the master repository <code>URL</code> didn't change, I can simply <code>pull</code>
the changes in. Of course I got a nice <em>warning: no common commits</em> message.</p>
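
<p>To see what the <code>master..extension</code> range selects, here is a small sketch in a throwaway repository. Everything in it is made up for the illustration (the <code>demo_range</code> function, the file name, the commit messages); it only assumes a reasonably recent <code>git</code> and <code>mktemp</code>.</p>

```shell
#!/bin/sh
# Hypothetical demo: count the commits that master..extension selects,
# i.e. those reachable from extension but not from master.
demo_range() (
    set -eu
    work=$(mktemp -d)
    cd "$work"
    git init -q repo
    cd repo
    # Force the branch name, whatever the local default is.
    git symbolic-ref HEAD refs/heads/master
    git config user.name 'Demo User'
    git config user.email 'demo@example.org'
    git config commit.gpgsign false

    # Two commits on master...
    echo one > file.txt
    git add file.txt
    git commit -q -m 'master: first'
    echo two >> file.txt
    git commit -q -a -m 'master: second'

    # ...and a single extra commit on the extension branch.
    git checkout -q -b extension
    echo three >> file.txt
    git commit -q -a -m 'extension: only commit'

    # Prints the number of commits format-patch would turn into patches.
    git rev-list --count master..extension

    cd /
    rm -rf "$work"
)
```

<p>Calling <code>demo_range</code> prints a single number; with one commit on the branch, <code>format-patch</code> given the same range would write a single <code>0001-…</code> file.</p>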

<p>Once pulled, I trashed my local <code>master</code> and replaced it with the new official
one, that's <code>git reset --hard pgmaster/master</code>. Then I could trash the <em>extension</em>
branch too and have it point at the local <code>master</code> again.</p>

<p>Of course, <code>git am</code> wouldn't apply my patch as-is: there were some
underlying changes in the source files, as the identification tag changed from
<code>$PostgreSQL$</code> to, e.g., <code>src/backend/utils/adt/genfile.c</code>, and I had to cope
with that. Maybe there's some tool (<code>git am -3</code>?) to do it automatically; I
just edited the <code>.patch</code> file by hand.</p>
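
<p>For what it's worth, a three-way apply can be tried out in a throwaway repository. The sketch below is only an illustration under assumptions: the six-line file, the commit messages and the <code>write_genfile</code> and <code>demo_am_3way</code> helpers are made up, and it relies on <code>git am -3</code> finding the patch's base blobs in the local object store, which may or may not hold after a history rewrite like the one above.</p>

```shell
#!/bin/sh
# Hypothetical demo, not the actual PostgreSQL history: replay a saved patch
# with "git am -3" after master changed another line of the same file.

# Write the six-line demo file; $1 is the tag line, $2 is the fourth line.
write_genfile() {
    printf '%s\n' "$1" 'line 2' 'line 3' "$2" 'line 5' 'line 6' > genfile.c
}

demo_am_3way() (
    set -eu
    work=$(mktemp -d)
    cd "$work"
    git init -q repo
    cd repo
    # Force the branch name, whatever the local default is.
    git symbolic-ref HEAD refs/heads/master
    git config user.name 'Demo User'
    git config user.email 'demo@example.org'
    git config commit.gpgsign false

    # master starts with the old-style identification tag.
    write_genfile '/* $PostgreSQL$ */' 'line 4'
    git add genfile.c
    git commit -q -m 'Initial import'

    # One commit on the extension branch, saved away as a patch file.
    git checkout -q -b extension
    write_genfile '/* $PostgreSQL$ */' 'line 4 (extension)'
    git commit -q -a -m 'First stab at the new function'
    git format-patch -q -o "$work" master..extension

    # Meanwhile master replaces the tag, which sits in the patch context.
    git checkout -q master
    write_genfile '/* src/backend/utils/adt/genfile.c */' 'line 4'
    git commit -q -a -m 'Change the identification tag'

    # Restart the branch from master and replay the patch with a 3-way merge.
    git checkout -q extension
    git reset -q --hard master
    git am -q -3 "$work"/0001-*.patch

    # Show the merged file, then clean up.
    cat genfile.c
    cd /
    rm -rf "$work"
)
```

<p>Calling <code>demo_am_3way</code> should print the file with both the new tag line and the extension change. When the two changes touch the same lines, <code>git am -3</code> stops with conflicts to resolve by hand instead.</p>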

<p>Lastly, it's all about checking the result and publishing it. The
last line, <code>git push -f</code>, is where I trashed and replaced my
<a href="http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=summary">postgresql-extension</a> community repository. I don't think anybody was
following it, but if you were, you will have to <em>reinit</em> your copy.</p>

<p>More blog posts to come about extensions, as I arranged to have some real
time to devote to the topic. At least I was able to arrange things so that I
can work on the subject for real, and the first thing I did, the very night
before it was meant to begin, was catch <em>tonsillitis</em>. Lost about a week, not
the project! Stay tuned!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/extensions.html">Extensions</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 07 Oct 2010 17:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/10/07-resuming-work-on-extensions-first-little-step.html</guid>
</item>
<item>
  <title>el-get reaches 1.0</title>
  <link>http://tapoueh.org/blog/2010/10/07-el-get-reaches-10.html</link>
  <description><![CDATA[<div id="article">
<p>It's been a week since the last commits in the <a href="http://github.com/dimitri/el-get">el-get repository</a>, and those
were all about fixing and adding recipes, and about notifications. Nothing
like <em>core plumbing</em> you see. Also, <code>0.9</code> was released on <em>2010-08-24</em> and felt
pretty complete already, then received lots of improvements. It's high time
to cross the line and call it <code>1.0</code>!</p>

<p>Now existing users will certainly just be moderately happy to see the tool
reach that version number, depending on whether they think more about the bugs
they want to see fixed (ftp is supported, only called http) and the new
features they want to see added (<em>info</em> documentation), or more about what <code>el-get</code>
already does for them today...</p>

<p>For the new users, or the yet-to-be-convinced users, let's take some time
and talk about <code>el-get</code>. A <em>FAQ</em>-like session might be best.</p>

<h3>How is el-get different from ELPA?</h3>

<p><a href="http://tromey.com/elpa/">ELPA</a> is the <em>Emacs Lisp Package Archive</em> and is also known as <code>package.el</code>, to
be included in Emacs 24. This allows Emacs Lisp extension authors to <em>package</em>
their work. That means they have to follow some guidelines and format their
contribution, then propose it for upload.</p>

<p>This requires licence checks (good), and for the <a href="http://elpa.gnu.org/">new official ELPA mirror</a> it
even requires dead-tree paper exchange, contracts and copyright
assignments, I believe.</p>


<h3>Why have both?</h3>

<p class="first">While <em>ELPA</em> is a great thing to have, it's so easy to find some high quality
Emacs extensions out there that are not part of the offer. Either authors are
not interested in uploading to ELPA, or they don't know how to properly
<em>package</em> for it (it's only simple for single file extensions, see).</p>

<p>So <code>el-get</code> is a pragmatic answer here. It's there because it so happens that
I don't depend only on Emacs extensions that are available with Emacs
itself, in my distribution's <code>site-lisp</code> and in <code>ELPA</code>. I need some more, and I
don't want it to be complex to find it, fetch it, init it and use it.</p>

<p>Of course I could try and package any extension I find I need and submit it
to <code>ELPA</code>, but really, to do that nicely I'd need to contact the extension
author (<em>upstream</em>) and get my patch accepted, or else consider a fork.</p>

<p>With <code>el-get</code> I propose distributed packaging if you will. Let's have a look
at two <em>recipes</em> here. First, the <code>el-get</code> one itself:</p>

<pre class="src">
(<span style="color: #729fcf;">:name</span> el-get
       <span style="color: #729fcf;">:type</span> git
       <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"git://github.com/dimitri/el-get.git"</span>
       <span style="color: #729fcf;">:features</span> el-get
       <span style="color: #729fcf;">:compile</span> <span style="color: #ad7fa8; font-style: italic;">"el-get.el"</span>)
</pre>

<p>Then a much more complex one, the <a href="http://bbdb.sourceforge.net/">bbdb</a> one:</p>

<pre class="src">
(<span style="color: #729fcf;">:name</span> bbdb
       <span style="color: #729fcf;">:type</span> git
       <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"git://github.com/barak/BBDB.git"</span>
       <span style="color: #729fcf;">:load-path</span> (<span style="color: #ad7fa8; font-style: italic;">"./lisp"</span> <span style="color: #ad7fa8; font-style: italic;">"./bits"</span>)
       <span style="color: #729fcf;">:build</span> (<span style="color: #ad7fa8; font-style: italic;">"./configure"</span> <span style="color: #ad7fa8; font-style: italic;">"make autoloads"</span> <span style="color: #ad7fa8; font-style: italic;">"make"</span>)
       <span style="color: #729fcf;">:build/darwin</span> (<span style="color: #ad7fa8; font-style: italic;">"./configure --with-emacs=/Applications/Emacs.app/Contents/MacOS/Emacs"</span> <span style="color: #ad7fa8; font-style: italic;">"make autoloads"</span> <span style="color: #ad7fa8; font-style: italic;">"make"</span>)
       <span style="color: #729fcf;">:features</span> bbdb
       <span style="color: #729fcf;">:after</span> (<span style="color: #729fcf; font-weight: bold;">lambda</span> () (bbdb-initialize))
       <span style="color: #729fcf;">:info</span> <span style="color: #ad7fa8; font-style: italic;">"texinfo"</span>)
</pre>

<p>The idea is that it's much simpler to just come up with a recipe like this
than to patch existing code and upload it to <code>ELPA</code>. And anybody can share
their <em>recipes</em> very easily, with or without proposing them to me, even if I
would very much like to add some more to the official <code>el-get</code> list.</p>

<p>As a user, you don't even need to twiddle with recipes, mostly, because we
already have them for you. What you do instead is list them in
<code>el-get-sources</code>.</p>


<h3>So, show me how you use it?</h3>

<p class="first">Yeah, sure. Here's a sample of my <code>dim-packages.el</code> file, part of my <code>.emacs</code>
<em>suite</em>. Yeah a single <code>.emacs</code> does not suit me anymore, it's a complete
<code>.emacs.d</code> now, but that's because that's how I like it organised, you
know. So, here's the example:</p>

<pre class="src">
<span style="color: #888a85;">;;; </span><span style="color: #888a85;">dim-packages.el --- Dimitri Fontaine
</span><span style="color: #888a85;">;;</span><span style="color: #888a85;">
</span><span style="color: #888a85;">;; </span><span style="color: #888a85;">Set el-get-sources and call el-get to init all those packages we need.
</span>(<span style="color: #729fcf; font-weight: bold;">require</span> '<span style="color: #8ae234;">el-get</span>)
(add-to-list 'el-get-recipe-path <span style="color: #ad7fa8; font-style: italic;">"~/dev/emacs/el-get/recipes"</span>)

(setq el-get-sources
      '(cssh el-get switch-window vkill google-maps yasnippet verbiste mailq sicp

        (<span style="color: #729fcf;">:name</span> magit
               <span style="color: #729fcf;">:after</span> (<span style="color: #729fcf; font-weight: bold;">lambda</span> () (global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"C-x C-z"</span>) 'magit-status)))

        (<span style="color: #729fcf;">:name</span> asciidoc
               <span style="color: #729fcf;">:type</span> elpa
               <span style="color: #729fcf;">:after</span> (<span style="color: #729fcf; font-weight: bold;">lambda</span> ()
                        (autoload 'doc-mode <span style="color: #ad7fa8; font-style: italic;">"doc-mode"</span> nil t)
                        (add-to-list 'auto-mode-alist '(<span style="color: #ad7fa8; font-style: italic;">"\\.adoc$"</span> . doc-mode))
                        (add-hook 'doc-mode-hook '(<span style="color: #729fcf; font-weight: bold;">lambda</span> ()
                                                    (turn-on-auto-fill)
                                                    (<span style="color: #729fcf; font-weight: bold;">require</span> '<span style="color: #8ae234;">asciidoc</span>)))))

        (<span style="color: #729fcf;">:name</span> goto-last-change
               <span style="color: #729fcf;">:after</span> (<span style="color: #729fcf; font-weight: bold;">lambda</span> ()
                        (global-set-key (kbd <span style="color: #ad7fa8; font-style: italic;">"C-x C-/"</span>) 'goto-last-change)))

        (<span style="color: #729fcf;">:name</span> auto-dictionary <span style="color: #729fcf;">:type</span> elpa)
        (<span style="color: #729fcf;">:name</span> gist            <span style="color: #729fcf;">:type</span> elpa)
        (<span style="color: #729fcf;">:name</span> lisppaste       <span style="color: #729fcf;">:type</span> elpa)))

(el-get) <span style="color: #888a85;">; </span><span style="color: #888a85;">that could/should be (el-get 'sync)
</span>(<span style="color: #729fcf; font-weight: bold;">provide</span> '<span style="color: #8ae234;">dim-packages</span>)
</pre>

<p>Ok that's not all of it, but it should give you a nice idea about what
problem I solve with <code>el-get</code> and how. In my emacs startup sequence, somewhere
inside my <code>~/.emacs.d/init.el</code> file, I have a line that says <code>(require
'dim-packages)</code>. This will set <code>el-get-sources</code> to the list just above, then
call <code>(el-get)</code>, the main function.</p>

<p>This main function will check each given package and install it if necessary
(including <em>building</em> the package, as in <code>make autoloads; make</code>), then <em>init</em>
it. What <em>init</em> means exactly depends on what the recipe says. That can
include <em>byte-compiling</em> some files, taking care of <em>load-path</em>, <em>load</em> and <em>require</em>
commands, taking care of <em>Info-directory-list</em> and <code>ginstall-info</code> too, and some
more.</p>

<p>So in short, it will make it so that your emacs instance is ready for you to
use. And you get the choice to use the given <code>el-get</code> recipes as-is, like I
did for <code>cssh</code>, <code>el-get</code>, <code>switch-window</code> and others, up to <code>sicp</code>, or to tweak them
partly, like in the <code>magit</code> example where I've added a user init function (the
<code>:after</code> property) to bind <code>magit-status</code> to <code>C-x C-z</code> here. You can even embed a
full recipe inline in the <code>el-get-sources</code> variable, that's the case for each
item that gives its <code>:type</code> property, like <code>asciidoc</code> or <code>gist</code>.</p>

<p>And, as you see, we're using <code>ELPA</code> a lot in these sources, so <code>el-get</code> isn't
striving to replace it at all, it's just trying to accommodate a broader
world.</p>


<h3>I read that the el-get-install is asynchronous, tell me more.</h3>

<p class="first">Yeah, right, the example above says <code>(el-get)</code> at its end, and in the cases
when <code>el-get</code> has to install or build sources, this will be done
asynchronously. Which means that not only will several sources get processed
at once (using your multiple cores, yeah) but also that it will let Emacs start up
as if everything was ready.</p>

<p>It happens that's usually what I want, because I seldom add sources in my
setup, but in theory that can break your emacs. What I do is start it again
or fix it by hand; what you can do instead is <code>(el-get 'sync)</code> so that emacs is
blocked waiting for <code>el-get</code> to properly install and initialize all the
sources you've set up. Your choice, just add the <code>'sync</code> parameter there.</p>
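
<p>For illustration, here is a minimal sketch of what the end of such a setup
file could look like with the blocking variant. It assumes that <code>el-get</code> is
already installed and on the <code>load-path</code>, and that <code>el-get-sources</code> has been set
earlier in the same file:</p>

<pre class="src">
;; Assumption: el-get-sources was set above, as in the example list.
(require 'el-get)

;; Block until every source is installed, built and initialized.
(el-get 'sync)
</pre>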


<h3>Now, explain to me why it is better this way, again, please?</h3>

<p class="first">Well, before I wrote <code>el-get</code>, trying out a new extension, setting it up etc
was something quite involved, and that I had to redo on several
machines. The only way not to redo it was to include the extension's code
into my own <code>git</code> repository (my <code>emacs.d</code> is in <code>git</code>, of course).</p>

<p>And putting code I don't maintain into my own <code>git</code> repository is something I
frown upon. I have no business pretending I'll maintain the code, and I know
I will never think to check the <code>URL</code> where I've found it for updates. That's
when I thought of noting down the <code>URL</code> somewhere.</p>

<p>Also, what about sharing the extension with friends? Not easy, at best.</p>

<p>Enter <code>el-get</code>, and I can just add an entry to <code>el-get-sources</code>, based on a file
somewhere in my own <code>el-get-recipe-path</code>. When I'm happy with this file, I can
contribute it to <code>el-get</code> proper or just send it over to any interested
recipient. Adding it to your sources is easy. Copy the file somewhere in your
<code>el-get-recipe-path</code>, add its name to your <code>el-get-sources</code>, then <code>M-x
el-get-install</code> it. Done. If you were given the <code>:after</code> function, it's all
set up already.</p>
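
<p>To make that concrete, a recipe file is essentially a property list. The
sketch below is hypothetical: the package name <code>my-extension</code> and its <code>URL</code> are
made up for illustration, and only the <code>:name</code>, <code>:type</code> and <code>:url</code> properties are
shown. It would be saved as <code>my-extension.el</code> in a directory listed in
<code>el-get-recipe-path</code>:</p>

<pre class="src">
;; Hypothetical recipe, saved as my-extension.el in el-get-recipe-path.
;; The package name and the URL are placeholders, not a real project.
(:name my-extension
       :type git
       :url "http://example.com/my-extension.git")
</pre>

<p>With that file in place, adding the symbol <code>my-extension</code> to <code>el-get-sources</code>
should be enough for <code>M-x el-get-install</code> to find it.</p>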

<p>If you contribute the recipe to <code>el-get</code>, then <code>M-x el-get-update RET el-get
RET</code> and you get it on this other machine where you also use Emacs. Or you
can tell your friend to do the same and benefit from your <em>packaging</em>.</p>


<h3>Well, sounds good. What recipes do you have already?</h3>

<p class="first">I count <code>67</code> of them already. One of them is just a book in <em>info</em> format, with
no <em>elisp</em> at all, can you spot it?</p>

<pre class="src">
ELISP&gt; (directory-files <span style="color: #ad7fa8; font-style: italic;">"~/dev/emacs/el-get/recipes/"</span> nil <span style="color: #ad7fa8; font-style: italic;">"el$"</span>)

(<span style="color: #ad7fa8; font-style: italic;">"auctex.el"</span> <span style="color: #ad7fa8; font-style: italic;">"auto-complete-etags.el"</span> <span style="color: #ad7fa8; font-style: italic;">"auto-complete-extension.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"auto-complete.el"</span> <span style="color: #ad7fa8; font-style: italic;">"auto-install.el"</span> <span style="color: #ad7fa8; font-style: italic;">"autopair.el"</span> <span style="color: #ad7fa8; font-style: italic;">"bbdb.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"blender-python-mode.el"</span> <span style="color: #ad7fa8; font-style: italic;">"color-theme-twilight.el"</span> <span style="color: #ad7fa8; font-style: italic;">"color-theme.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"cssh.el"</span> <span style="color: #ad7fa8; font-style: italic;">"django-mode.el"</span> <span style="color: #ad7fa8; font-style: italic;">"el-get.el"</span> <span style="color: #ad7fa8; font-style: italic;">"emacs-w3m.el"</span> <span style="color: #ad7fa8; font-style: italic;">"emacschrome.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"emms.el"</span> <span style="color: #ad7fa8; font-style: italic;">"ensime.el"</span> <span style="color: #ad7fa8; font-style: italic;">"erc-highlight-nicknames.el"</span> <span style="color: #ad7fa8; font-style: italic;">"erc-track-score.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"escreen.el"</span> <span style="color: #ad7fa8; font-style: italic;">"filladapt.el"</span> <span style="color: #ad7fa8; font-style: italic;">"flyguess.el"</span> <span style="color: #ad7fa8; font-style: italic;">"gist.el"</span> <span style="color: #ad7fa8; font-style: italic;">"google-maps.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"google-weather.el"</span> <span style="color: #ad7fa8; font-style: italic;">"goto-last-change.el"</span> <span style="color: #ad7fa8; font-style: italic;">"haskell-mode.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"highlight-parentheses.el"</span> <span style="color: #ad7fa8; font-style: italic;">"hl-sexp.el"</span> <span style="color: #ad7fa8; font-style: italic;">"levenshtein.el"</span> <span style="color: #ad7fa8; font-style: italic;">"magit.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"mailq.el"</span> <span style="color: #ad7fa8; font-style: italic;">"maxframe.el"</span> <span style="color: #ad7fa8; font-style: italic;">"multi-term.el"</span> <span style="color: #ad7fa8; font-style: italic;">"muse-blog.el"</span> <span style="color: #ad7fa8; font-style: italic;">"nognus.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"nterm.el"</span> <span style="color: #ad7fa8; font-style: italic;">"nxhtml.el"</span> <span style="color: #ad7fa8; font-style: italic;">"offlineimap.el"</span> <span style="color: #ad7fa8; font-style: italic;">"package.el"</span> <span style="color: #ad7fa8; font-style: italic;">"popup-kill-ring.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"pos-tip.el"</span> <span style="color: #ad7fa8; font-style: italic;">"pov-mode.el"</span> <span style="color: #ad7fa8; font-style: italic;">"psvn.el"</span> <span style="color: #ad7fa8; font-style: italic;">"pymacs.el"</span> <span style="color: #ad7fa8; font-style: italic;">"rainbow-mode.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"rcirc-groups.el"</span> <span style="color: #ad7fa8; font-style: italic;">"rinari.el"</span> <span style="color: #ad7fa8; font-style: italic;">"ropemacs.el"</span> <span style="color: #ad7fa8; font-style: italic;">"rt-liberation.el"</span> <span style="color: #ad7fa8; font-style: italic;">"scratch.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"session.el"</span> <span style="color: #ad7fa8; font-style: italic;">"sicp.el"</span> <span style="color: #ad7fa8; font-style: italic;">"smex.el"</span> <span style="color: #ad7fa8; font-style: italic;">"switch-window.el"</span> <span style="color: #ad7fa8; font-style: italic;">"textile-mode.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"todochiku.el"</span> <span style="color: #ad7fa8; font-style: italic;">"twitter.el"</span> <span style="color: #ad7fa8; font-style: italic;">"twittering-mode.el"</span> <span style="color: #ad7fa8; font-style: italic;">"undo-tree.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"verbiste.el"</span> <span style="color: #ad7fa8; font-style: italic;">"vimpulse-surround.el"</span> <span style="color: #ad7fa8; font-style: italic;">"vimpulse.el"</span> <span style="color: #ad7fa8; font-style: italic;">"vkill.el"</span> <span style="color: #ad7fa8; font-style: italic;">"xcscope.el"</span>
<span style="color: #ad7fa8; font-style: italic;">"xml-rpc-el.el"</span> <span style="color: #ad7fa8; font-style: italic;">"yasnippet.el"</span>)
</pre>


<h3>Ok, I want to try it, what's next?</h3>

<p class="first">Visit the following <code>URL</code> <a href="http://github.com/dimitri/el-get">http://github.com/dimitri/el-get</a> and follow the
install instructions. You're given a <em>scratch installer</em> there, that's some
<em>elisp</em> code you copy paste into <code>*scratch*</code> then execute there, and you have
<code>el-get</code> ready to serve.</p>

<p>An excellent idea I stole from <code>ELPA</code>!</p>


<h3>Hey, I already know what el-get is, what's new in 1.0?</h3>

<p class="first">The <em>changelog</em> is quite full of good stuff, really:</p>

<ul>
<li>Implement el-get recipes so that el-get-sources can be a simple list
of symbols. Now that there's an authoritative git repository, where
to share the recipes is easy.</li>

<li>Add support for emacswiki directly, save from having to enter the URL</li>

<li>Implement package status on-disk saving so that installing over a
previously failed install is in theory possible. Currently `el-get'
will refrain from removing your package automatically, though.</li>

<li>Fix ELPA remove method, adding a &quot;removed&quot; state too.</li>

<li>Implement CVS login support.</li>

<li>Add lots of recipes</li>

<li>Add support for `system-type' specific build commands</li>

<li>Byte compile files from the load-path entries or :compile files</li>

<li>Implement support for git submodules with the command
`git submodule update --init --recursive`</li>

<li>Add catch-all post-install and post-update hooks</li>

<li>Add desktop notification on install/update.</li>
</ul>


<h3>I'm still using the deprecated emacswiki version, what now?</h3>

<p class="first">That version didn't have recipes, and the new version should be perfectly
happy with your current <code>el-get-sources</code>, so I recommend using the
<em>scratch installer</em> too. Don't forget to add <code>el-get</code> itself into your
<code>el-get-sources</code> list, of course!</p>



<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/muse.html">Muse</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/extensions.html">Extensions</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/switch-window.html">switch-window</a> <a href="../../../tags/cssh.html">cssh</a> <a href="../../../tags/mailq.html">mailq</a> <a href="../../../tags/rcirc.html">rcirc</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 07 Oct 2010 13:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/10/07-el-get-reaches-10.html</guid>
</item>












<item>
  <title>Regexp performances and Finite Automata</title>
  <link>http://tapoueh.org/blog/2010/09/26-regexp-performances-and-finite-automata.html</link>
  <description><![CDATA[<h1>Regexp performances and Finite Automata</h1>
<div class="date">Sunday, September 26 2010, 21:00</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>The major reason why I dislike <a href="http://www.perl.org/">perl</a> so much, and <a href="http://www.ruby-lang.org">ruby</a> too, and the thing I'd
want different in the <a href="http://www.gnu.org/software/emacs/manual/elisp.html">Emacs Lisp</a> <code>API</code> so far is how they set developers' minds
on using <a href="http://www.regular-expressions.info/">regexp</a>. You know the quote, don't you?</p>

<blockquote>
<p class="quoted">
Some people, when confronted with a problem, think “I know, I'll use regular
expressions.” Now they have two problems.</p>

</blockquote>

<p>That said, some situations require the use of <em>regexp</em>, or are so much
simpler to solve using them that the maintenance hell you're building here
ain't that big a drag. The given expressiveness is hard to match with any
other solution, to the point I sometimes use them in my code (well I use <a href="http://www.emacswiki.org/emacs/rx">rx</a>
to lower the burden sometimes, just see this example).</p>

<pre class="src">
(rx bol (zero-or-more blank) (one-or-more digit) <span style="color: #ad7fa8; font-style: italic;">":"</span>)
<span style="color: #ad7fa8; font-style: italic;">"^[[:blank:]]*[[:digit:]]+:"</span>
</pre>
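
<p>As a quick sanity check of that example, one could evaluate something like
the following sketch. The sample strings are made up; <code>string-match-p</code> returns
the index where the match starts, or <code>nil</code> when there is no match:</p>

<pre class="src">
(require 'rx)

;; Leading blanks, then digits, then a colon: matches at position 0.
(string-match-p (rx bol (zero-or-more blank) (one-or-more digit) ":")
                "  42: some text")

;; No digits before the colon: no match, so this returns nil.
(string-match-p (rx bol (zero-or-more blank) (one-or-more digit) ":")
                "  foo: some text")
</pre>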

<p>The thing you might want to know about <em>regexp</em> is that computing them is a
heavy task usually involving <em>parsing</em> their representation, <em>compiling</em> it to
some executable code, and then <em>executing</em> the generated code. It's been shown in
the past (as early as 1968) that a <em>regexp</em> is just another way to write a
finite automaton, at least as long as you don't need <em>backtracking</em>. The
writing of this article is my reaction to reading
<a href="http://swtch.com/~rsc/regexp/regexp1.html">Regular Expression Matching Can Be Simple And Fast</a> (but is slow in Java,
Perl, PHP, Python, Ruby, ...), a very interesting article — see the
benchmarks in there.</p>

<p>The bulk of it is that we find mainly two categories of <em>regexp</em> engine in the
wild, those that are using <a href="http://en.wikipedia.org/wiki/Nondeterministic_finite_state_machine">NFA</a> and <a href="http://en.wikipedia.org/wiki/Deterministic_finite_automaton">DFA</a> intermediate representation
techniques, and the others. Our beloved <a href="http://www.postgresql.org/">PostgreSQL</a> sure offers the feature,
it's the <code>~</code> and <code>~*</code> <a href="http://www.postgresql.org/docs/9.0/interactive/functions-matching.html">operators</a>. The implementation here is based on
<a href="http://www.arglist.com/regex/">Henry Spencer</a>'s work, which the aforementioned article says</p>

<blockquote>
<p class="quoted">
became very widely used, eventually serving as the basis for the slow
regular expression implementations mentioned earlier: Perl, PCRE, Python,
and so on.</p>

</blockquote>

<p>Having a look at the actual implementation shows that indeed, current
PostgreSQL code for <em>regexp</em> matching uses intermediate representations of
them as <code>NFA</code> and <code>DFA</code>. The code is quite complex, even more than I thought it
would be, and I didn't have the time it would take to check it against the
proposed one from the <em>simple and fast</em> article.</p>

<pre class="src">
postgresql/src/backend/regex
  -rw-r--r--   1 dim  staff   4362 Sep 25 20:59 COPYRIGHT
  -rw-r--r--   1 dim  staff    614 Sep 25 20:59 Makefile
  -rw-r--r--   1 dim  staff  28217 Sep 25 20:59 re_syntax.n
  -rw-r--r--   1 dim  staff  16589 Sep 25 20:59 regc_color.c
  -rw-r--r--   1 dim  staff   3464 Sep 25 20:59 regc_cvec.c
  -rw-r--r--   1 dim  staff  25036 Sep 25 20:59 regc_lex.c
  -rw-r--r--   1 dim  staff  16845 Sep 25 20:59 regc_locale.c
  -rw-r--r--   1 dim  staff  35917 Sep 25 20:59 regc_nfa.c
  -rw-r--r--   1 dim  staff  50714 Sep 25 20:59 regcomp.c
  -rw-r--r--   1 dim  staff  17368 Sep 25 20:59 rege_dfa.c
  -rw-r--r--   1 dim  staff   3627 Sep 25 20:59 regerror.c
  -rw-r--r--   1 dim  staff  27664 Sep 25 20:59 regexec.c
  -rw-r--r--   1 dim  staff   2122 Sep 25 20:59 regfree.c
</pre>

<p>So all in all, I'll continue avoiding <em>regexp</em> as much as I currently do, and
will maintain my tendency to use <a href="http://www.gnu.org/manual/gawk/gawk.html">awk</a> when I need them on files (it allows
refining the search without resorting to more and more pipes in the
command line). And as far as resorting to using <em>regexp</em> in PostgreSQL is
concerned, it seems that the code here is already just about top-notch. Once more.</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/emacs.html">Emacs</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Sun, 26 Sep 2010 21:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/09/26-regexp-performances-and-finite-automata.html</guid>
</item>
<item>
  <title>Postfix sender_dependent_relayhost_maps</title>
  <link>http://tapoueh.org/blog/2010/09/23-postfix-sender_dependent_relayhost_maps.html</link>
  <description><![CDATA[<h1>Postfix sender_dependent_relayhost_maps</h1>
<div class="date">Thursday, September 23 2010, 14:30</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>The previous article about <a href="http://tapoueh.org/articles/news/_Scratch_that_itch:_M-x_mailq.html">M-x mailq</a> prompted several emails asking me
for details about the <a href="http://www.postfix.com/">Postfix</a> setup I'm talking about. The problem we're trying
to solve is having a local <code>MTA</code> to send mails, so that any old-style Unix
tool just works, instead of only the <code>MUA</code> you've spent time setting up.</p>

<p>Postfix makes it possible to do that quite easily, but it gets a little more
involved if you have more than one <em>relayhost</em> that you want to use depending
on your current <em>From</em> address. Think personal email against work email, or
avoiding your <code>ISP</code> network when sending your private mails, <em>hopping</em> directly
to a server you own or trust.</p>

<p>So how do you do just that? Let's see the relevant parts of <code>main.cf</code>.</p>

<pre class="src">
relayhost = your.default.relay.host.here
relay_domains = domain.org, work-domain.com, other-domain.info
smtp_sender_dependent_authentication = yes
sender_dependent_relayhost_maps = hash:/etc/postfix/relaymap
</pre>

<p>The <code>relaymap</code> looks like this:</p>

<pre class="src">
<span style="color: #888a85;"># </span><span style="color: #888a85;">comments
</span>user@domain.org         mail.domain.org
local@work-domain.com   smtp.work-domain.com
<span style="color: #888a85;"># </span><span style="color: #888a85;">that requires a local tunnel started with ssh, see ~/.ssh/config
</span>me@other-domain.info    [127.0.0.1]:10025
</pre>

<p>You need to run <a href="http://www.postfix.org/postmap.1.html">postmap</a> on this file before reloading or restarting your local
instance of Postfix.</p>

<p>Also, you will want to encrypt the communication with your preferred relay
host. Using <code>TLS</code> goes like this:</p>

<pre class="src">
smtp_sasl_auth_enable=yes
smtp_sasl_password_maps=hash:/etc/postfix/sasl-passwords
smtp_sasl_security_options = noanonymous
smtp_sasl_mechanism_filter = login, plain
smtp_sasl_type = cyrus

smtp_tls_session_cache_database = btree:${queue_directory}/smtp_scache
smtp_tls_loglevel = 2
smtp_use_tls = yes
smtp_tls_security_level = may
</pre>

<p>The password file needs to be processed by <code>postmap</code> too, should
have restricted read access, and looks like this:</p>

<pre class="src">
mail.domain.org        user@domain.org:password
smtp.work-domain.com   local@work-domain.com:h4ckm3
[<span style="color: #8ae234; font-weight: bold;">127.0.0.1</span>]:10025      me@other-domain.info:guess
</pre>

<p>Hope this helps you get started; at least that's a document I would have
enjoyed reading when I first started to set up my local relaying <code>MTA</code>.</p>

<p>Oh, and now that you have this, I hope you will enjoy my <code>M-x mailq</code> tool for
occasions when you're wondering why you're not receiving an answer back yet,
then start the ssh tunnel…</p>


<h2>Tags</h2>

<p><a href="../../../tags/mailq.html">mailq</a> <a href="../../../tags/postfix.html">postfix</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 23 Sep 2010 14:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/09/23-postfix-sender_dependent_relayhost_maps.html</guid>
</item>






<item>
  <title>Scratch that itch: M-x mailq</title>
  <link>http://tapoueh.org/blog/2010/09/23-scratch-that-itch-m-x-mailq.html</link>
  <description><![CDATA[<h1>Scratch that itch: M-x mailq</h1>
<div class="date">Thursday, September 23 2010, 09:30</div>
<div id="article">
<p>Nowadays, most people would think that email is something simple: you just
set up your preferred client (that's called an <code>MUA</code>) with some information such
as the <code>smtp</code> host you want it to talk to (that's called an <code>MTA</code> and this one is
your <code>relayhost</code>). Then there's all the receiving mails part, and that's <code>smtp</code>
again on the server side. Then there's how to get those mails, read them,
flag them, manage them, and that's better served by <code>IMAP</code>. Let's talk about
sending mails in <code>smtp</code> for this entry.</p>

<p>The traditional way to handle mail sending is to have your own <code>MTA</code> on each
system you use — there used to be a <em>sysadmin</em> team caring about all those
systems, but we're lost in the personal computer era now — that only means
<strong><em>you</em></strong> are the sysadmin. So just about any Unix tool that wants to send a mail will
do so with the command <code>/usr/bin/sendmail</code> to queue the outgoing message.</p>

<p>My typical <em>workstation</em> setup includes a full-blown <code>MTA</code> (my choice is
<a href="http://www.postfix.com/">Postfix</a>) that will choose the next relay host depending on the message <em>From</em>
field: I don't want to trust any local default relayhost. Note that the next
relay is connected to with authentication and over an encrypted protocol.</p>

<blockquote>
<p class="quoted">
We're getting there, really. But I don't know a better way to present a
piece of software, small as it may be, than talking about the need that led to
its development.</p>

</blockquote>

<p>Some relaying I do atop an <code>ssh</code> tunnel, and it happens that I send mail and
have forgotten about setting up the aforementioned tunnel. In this case, the
advantage is that it will not block my <code>MUA</code> (<a href="http://gnus.org/">gnus</a>, in quite good shape these
days, receiving lots of love), as the queueing happens as usual. The
drawback is that <a href="http://www.postfix.com/">Postfix</a> will <em>silently</em> queue the mail until it's able to
deliver it, which can take days.</p>

<p>Enter <code>M-x mailq</code>! Ok, I could be doing <code>M-! mailq</code> and see <em>Mail queue is empty</em>
in the message area, but then as soon as the queue's not empty I need to
resort to some <em>shell</em> or <em>terminal</em> in order to <em>flush</em> the queue — that's after
setting up the tunnel, as easy as <code>C-= remote</code> in my case, see
<a href="http://github.com/dimitri/cssh">cssh</a>. Scratching that itch, I now only have to hit <code>f</code> here, to flush the
queue. And from the <em>gnus</em> <code>*Group*</code> and <code>*Summary*</code> buffers, it's <code>M-q</code> to see the
mail queue.</p>
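
<p>Should you want the same <code>M-q</code> binding in your own setup, a sketch along
these lines might do. It is only an assumption about how such a binding could
be made, not a description of what <code>mailq.el</code> itself does; the one thing taken
from the text above is that the command is named <code>mailq</code>:</p>

<pre class="src">
;; Assumption: mailq.el is loaded and provides the interactive command mailq.
;; Bind M-q in the gnus *Group* and *Summary* buffers when those modes start.
(add-hook 'gnus-group-mode-hook
          (lambda () (local-set-key (kbd "M-q") 'mailq)))
(add-hook 'gnus-summary-mode-hook
          (lambda () (local-set-key (kbd "M-q") 'mailq)))
</pre>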

<p>Thanks to <a href="http://forum.ubuntu-fr.org/viewtopic.php?id=218883">http://forum.ubuntu-fr.org/viewtopic.php?id=218883</a> here's a visual
sample of the <code>mailq</code> mode, where you see the mail queue in colors and the
<em>keymap</em> you're offered.</p>

<center>
<p><img src="../../../images//mailq-el.png" alt=""></p>
</center>

<p>So you could even <em>flush</em> only a given <code>queue id</code> or a given <code>site</code>, or just <em>kill</em>
the current <code>id</code> or the current <code>site</code> so that it's a <code>C-y</code> away. I hope it's
useful for you too — oh, and it's already in the <a href="http://github.com/dimitri/el-get">el-get</a> recipes, of course!</p>


<h2>Tags</h2>

<p><a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/cssh.html">cssh</a> <a href="../../../tags/mailq.html">mailq</a> <a href="../../../tags/postfix.html">postfix</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 23 Sep 2010 09:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/09/23-scratch-that-itch-m-x-mailq.html</guid>
</item>
<item>
  <title>switch-window reaches 0.8</title>
  <link>http://tapoueh.org/blog/2010/09/13-switch-window-reaches-08.html</link>
  <description><![CDATA[<h1>switch-window reaches 0.8</h1>
<div class="date">Monday, September 13 2010, 17:45</div>
<div id="article">
<p>I wanted to play with the idea of using the whole keyboard for my
<a href="http://github.com/dimitri/switch-window">switch-window</a> utility, but wondered how to get those keys in the right order
and all. I finally found <code>quail-keyboard-layout</code>, which seems to exist for such
uses, as you can see:</p>

<pre class="src">
(<span style="color: #729fcf; font-weight: bold;">loop</span> with layout = (split-string quail-keyboard-layout <span style="color: #ad7fa8; font-style: italic;">""</span>)
  for row from 1 to 4
  collect (<span style="color: #729fcf; font-weight: bold;">loop</span> for col from 1 to 12
 (<span style="color: #ad7fa8; font-style: italic;">"q"</span> <span style="color: #ad7fa8; font-style: italic;">"w"</span> <span style="color: #ad7fa8; font-style: italic;">"e"</span> <span style="color: #ad7fa8; font-style: italic;">"r"</span> <span style="color: #ad7fa8; font-style: italic;">"t"</span> <span style="color: #ad7fa8; font-style: italic;">"y"</span> <span style="color: #ad7fa8; font-style: italic;">"u"</span> <span style="color: #ad7fa8; font-style: italic;">"i"</span> <span style="color: #ad7fa8; font-style: italic;">"o"</span> <span style="color: #ad7fa8; font-style: italic;">"p"</span> <span style="color: #ad7fa8; font-style: italic;">"["</span> <span style="color: #ad7fa8; font-style: italic;">"]"</span>)
 (<span style="color: #ad7fa8; font-style: italic;">"a"</span> <span style="color: #ad7fa8; font-style: italic;">"s"</span> <span style="color: #ad7fa8; font-style: italic;">"d"</span> <span style="color: #ad7fa8; font-style: italic;">"f"</span> <span style="color: #ad7fa8; font-style: italic;">"g"</span> <span style="color: #ad7fa8; font-style: italic;">"h"</span> <span style="color: #ad7fa8; font-style: italic;">"j"</span> <span style="color: #ad7fa8; font-style: italic;">"k"</span> <span style="color: #ad7fa8; font-style: italic;">"l"</span> <span style="color: #ad7fa8; font-style: italic;">";"</span> <span style="color: #ad7fa8; font-style: italic;">"'"</span> <span style="color: #ad7fa8; font-style: italic;">"\\"</span>)
 (<span style="color: #ad7fa8; font-style: italic;">"z"</span> <span style="color: #ad7fa8; font-style: italic;">"x"</span> <span style="color: #ad7fa8; font-style: italic;">"c"</span> <span style="color: #ad7fa8; font-style: italic;">"v"</span> <span style="color: #ad7fa8; font-style: italic;">"b"</span> <span style="color: #ad7fa8; font-style: italic;">"n"</span> <span style="color: #ad7fa8; font-style: italic;">"m"</span> <span style="color: #ad7fa8; font-style: italic;">","</span> <span style="color: #ad7fa8; font-style: italic;">"."</span> <span style="color: #ad7fa8; font-style: italic;">"/"</span> <span style="color: #ad7fa8; font-style: italic;">" "</span> <span style="color: #ad7fa8; font-style: italic;">" "</span>))
</pre>

<p>So now <code>switch-window</code> will use that (but only the first <code>10</code> letters) instead
of <em>hard-coding</em> numbers from 1 to 9 as labels and direct switches. That makes
it more suitable for <a href="http://github.com/dimitri/cssh">cssh</a> users too, I guess.</p>

<p>In other news, I think <a href="http://github.com/dimitri/el-get">el-get</a> is about ready for its <code>1.0</code> release. Please
test it and report any problem very soon before the release!</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/switch-window.html">switch-window</a> <a href="../../../tags/cssh.html">cssh</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 13 Sep 2010 17:45:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/09/13-switch-window-reaches-08.html</guid>
</item>
<item>
  <title>Window Functions example remix</title>
  <link>http://tapoueh.org/blog/2010/09/12-window-functions-example-remix.html</link>
  <description><![CDATA[<h1>Window Functions example remix</h1>
<div class="date">Sunday, September 12 2010, 21:35</div>
<div id="article">
<p>The drawback of hosting a static-only website is, obviously, the lack of
comments. What actually happens, though, is that I receive very few comments
by direct mail. As I don't get another <em>spam</em> source to clean up, I'm left
unconvinced that's such a drawback. I still miss the (low) probability of
seeing blog readers exchange directly, but I think a <code>tapoueh.org</code> mailing
list would be my answer here...</p>

<p>Anyway, <a href="http://people.planetpostgresql.org/dfetter/">David Fetter</a> took the time to send me a comment by mail with a
cleaned-up rewrite of the previous entry's <code>SQL</code>. Here it is, for your pleasure!</p>

<pre class="src">
WITH t AS (
    SELECT
        o, w,
        CASE WHEN
            LAG(w) OVER(w) IS DISTINCT FROM w AND
            ROW_NUMBER() OVER (w) &gt; 1 <span style="color: #888a85;">/* Eliminate first change */</span>
        THEN 1
        END AS change
    FROM (
        VALUES
            (1, 5),
            (2, 10),
            (3, 7),
            (4, 7),
            (5, 7)
    ) AS data(o, w)
    WINDOW w AS (ORDER BY o) <span style="color: #888a85;">/* Factor out WINDOW */</span>
)
SELECT SUM(change) FROM t;
</pre>

<p>As you can see <strong><em>David</em></strong> chose to filter the first change in the subquery rather
than hacking it away with a simple <code>-1</code> at the outer level. I'm still
wondering which way is cleaner (that depends on how you look at the
problem), but I think I know which one is simpler! Thanks <strong><em>David</em></strong> for this
blog entry!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Sun, 12 Sep 2010 21:35:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/09/12-window-functions-example-remix.html</guid>
</item>
<item>
  <title>Window Functions example</title>
  <link>http://tapoueh.org/blog/2010/09/09-window-functions-example.html</link>
  <description><![CDATA[<h1>Window Functions example</h1>
<div class="date">Thursday, September 09 2010, 16:35</div>
<div id="article">
<p>So, when <code>8.4</code> came out there were all those comments about how getting
<a href="http://www.postgresql.org/docs/8.4/interactive/tutorial-window.html">window functions</a> was an awesome addition. Now, it seems that a lot of people
seeking help in <a href="http://wiki.postgresql.org/index.php?title=IRC">#postgresql</a> just don't know what kind of problem this
feature helps solve. I've already used them in some cases here in
this blog, for getting a nice overview of
<a href="http://tapoueh.org/articles/blog/_Partitioning:_relation_size_per_%E2%80%9Cgroup%E2%80%9D.html">Partitioning: relation size per “group”</a>.</p>

<p>Now, another example use case arose on <code>IRC</code> today. I'll quote our user directly here:</p>

<blockquote>
<p class="quoted">
hey there, how can i count the number of (value) changes in one column?</p>
<p class="quoted">  example: a table with a column <em>weight</em>. let's say we have 5 rows, having
the following values for weight: <code>5, 10, 7, 7, 7</code>. the number of changes of
weight would be 2 here (from 5 to 10 and 10 to 7). any idea how I could do
that in SQL using PGSQL 8.4.4? GROUP BY or count(distinct weight)
obviously does not work. thx in advance</p>

</blockquote>

<p>Now, several of us began talking about <em>window functions</em> and about the fact
that you need some other column to identify the ordering of those weights,
obviously, because that's the only way to define what a change is in this
context. Let's have a first try at it.</p>

<pre class="src">
=# select o, w,
          case when lag(w) over(order by o) is distinct from w then 1 end as change
     from (values (1, 5), (2, 10), (3, 7), (4, 7), (5, 7)) as data(o, w);
 o | w  | change
<span style="color: #888a85;">---+----+--------
</span> 1 |  5 |      1
 2 | 10 |      1
 3 |  7 |      1
 4 |  7 |
 5 |  7 |
(5 rows)
</pre>

<p>Not too bad, but of course we are seeing a false change on the first line:
for any <em>window</em> of rows you define, the first row has no previous one, so
<code>lag() over()</code> yields <code>NULL</code> there, which <code>is distinct from</code> any weight. The easiest
way to accommodate that is the following:</p>

<pre class="src">
=# select sum(change) -1 as changes
     from (select case when lag(w) over(order by o) is distinct from w
                       then 1
                   end as change
             from (values (1, 5),
                          (2, 10),
                          (3, 7),
                          (4, 7),
                          (5, 7)) as t(o, w)) as x;
 changes
<span style="color: #888a85;">---------
</span>       2
(1 row)
</pre>
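
<p>For readers more at home in a procedural language, here is a minimal Python
sketch of what the query computes. It is only an illustration: the function
name <code>count_changes</code> is made up for this example, and the list is assumed to
be already sorted on the ordering column.</p>

<pre class="src">
def count_changes(weights):
    """Count how many times two consecutive values differ."""
    changes = 0
    # zip() pairs each value with its predecessor, which is the role
    # played by lag(w) over (order by o) in the SQL version.
    for previous, current in zip(weights, weights[1:]):
        if previous != current:
            changes += 1
    return changes


print(count_changes([5, 10, 7, 7, 7]))  # prints 2
</pre>

<p>As the pairing starts at the second value, the first row never counts as a
change here, so this sketch needs no <code>-1</code> correction.</p>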

<p>So don't be shy and go read about <a href="http://www.postgresql.org/docs/8.4/interactive/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS">window functions in SQL expressions</a> and
<a href="http://www.postgresql.org/docs/8.4/interactive/queries-table-expressions.html#QUERIES-WINDOW">window function processing</a> in the query table expressions. That's a very
nice tool to have, and my guess is that you will soon enough realize that the only
reason you could think you don't need them is that you didn't
know they existed, and what you can do with them. <em>Sharpen your saw!</em> :)</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 09 Sep 2010 16:35:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/09/09-window-functions-example.html</guid>
</item>
<item>
  <title>Synchronous Replication</title>
  <link>http://tapoueh.org/blog/2010/09/06-synchronous-replication.html</link>
  <description><![CDATA[<h1>Synchronous Replication</h1>
<div class="date">Monday, September 06 2010, 18:05</div>
<div id="article">
<p>Although the new asynchronous replication facility that ships with 9.0 ain't
released to the general public yet, our hacker heroes are already working on the
synchronous version of it. A part of the facility is rather easy to design:
we want something comparable to <a href="http://www.drbd.org/">DRBD</a>'s flexibility, but specific to our
database world.  So <em>synchronous</em> would mean either <em>recv</em>, <em>fsync</em> or <em>apply</em>,
depending on what you need the <em>standby</em> to have already done when the master
acknowledges the <code>COMMIT</code>. Let's call that the <em>service level</em>.</p>

<p>The part of the design that's not so easy is more interesting. Do we need to
register standbys and have the <em>service level</em> set up per standby? Can we get
some more flexibility and have the <em>service level</em> set on a per-transaction
basis? The idea here would be that the application knows which transactions
are meant to be extra-safe and which are not, the same way that you can set
<code>synchronous_commit to off</code> when dealing with web sessions, for example.</p>

<p><em>Why choose?</em> I hear you ask. Well, it's all about having more data safety,
and a typical setup would contain an asynchronous reporting server and a
local <em>failover</em> synchronous server. Then add a remote one, too. So even if we
pick the transaction-based facility, we still want to be able to choose at
setup time which server to fail over to. That means we don't want that much
flexibility now: we want to know where the data is safe, we don't want to
have to guess.</p>

<p>One way to solve that is to be able to set up a slave as being the failover
one, or say, the <code>sync</code> one. Now, the detail that ruins it all is that we need
a <em>timeout</em> to handle worst cases when a given slave loses its connectivity
(or power, say). Now, the slave ain't in <em>sync</em> any more and some people will
require that the service is still available (<em>timeout</em> but <code>COMMIT</code>) and some
will require that the service is down: don't accept a new transaction if you
can't make its data safe on the slave too.</p>

<p>The answer would be to have the master arbitrate between what the
transaction wants, what the slave is set up to provide, and what it's able
to provide at the time of the transaction. Given a transaction with a
<em>service level</em> of <em>apply</em> and a slave set up for being <em>async</em>, either the <code>COMMIT</code> does
not have to wait, because there's no known slave able to offer the needed
level, or the <code>COMMIT</code> cannot happen, for the very same reason.</p>

<p>Then I think it all flows quite naturally from there, and while arbitrating
the master could record which slave is currently offering what <em>service
level</em>. And offering the information in a system view too, of course.</p>

<p>The big question that's not answered in this proposal is how to configure
whether being unable to reach the wanted <em>service level</em> is an error or a
warning.</p>

<p>That too would need to be for the master to arbitrate based on a per-standby
and a per-transaction setting, and in the general case it could be a <em>quorum</em>
setup: each slave is given a <em>weight</em> and each transaction a <em>quorum</em> to
reach. The master sums up the weights of the standbys that ack the
transaction at the needed <em>service level</em> and the <code>COMMIT</code> happens as soon as
the quorum is reached, or is canceled as soon as the <em>timeout</em> is reached,
whichever comes first.</p>
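
<p>To make the arbitration rule concrete, here is a minimal Python sketch of
it. Nothing here is PostgreSQL code: the function and variable names are
hypothetical, the ordering of the service levels from weakest to strongest is
my own assumption, and the <em>timeout</em> is modelled simply as the list of acks
running out.</p>

<pre class="src">
# Assumed ordering of the service levels, weakest first.
ORDER = ("async", "recv", "fsync", "apply")


def commit_reaches_quorum(acks, weights, wanted_level, quorum):
    """Tell whether the acks received before the timeout reach the quorum.

    acks is the ordered list of (standby, level) pairs received so far,
    weights maps each standby name to its weight.
    """
    # Levels at least as strong as the one the transaction asks for.
    acceptable = ORDER[ORDER.index(wanted_level):]
    remaining = quorum  # weight still missing before we can COMMIT
    counted = set()
    for standby, level in acks:
        if level in acceptable and standby not in counted:
            counted.add(standby)
            remaining = max(0, remaining - weights[standby])
        if remaining == 0:
            return True  # quorum reached: COMMIT without waiting further
    # No more acks before the timeout: COMMIT only if nothing is missing.
    return remaining == 0
</pre>

<p>With hypothetical weights such as <code>local = 4</code>, <code>remote = 2</code> and
<code>reporting = 1</code>, a transaction asking for a quorum of <code>4</code> can only be
acknowledged by the <code>local</code> standby, as the two others sum up to <code>3</code>.</p>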

<p>Such a model allows for very flexible setups, where each standby has a
<em>weight</em> and offers a given <em>service level</em>, and each transaction waits until a
<em>quorum</em> is reached. Giving the right weights to your standbys (like, powers
of two) allows you to set the quorum in a way that only one given standby is
able to acknowledge the most important transactions. But that's flexible
enough you can change it at any time, it's just a <em>weight</em> that allows a <em>sum</em>
to be made, so my guess would be it ends up in the <em>feedback loop</em> between the
standby and its master.</p>

<p>The most appealing part of this proposal is that it doesn't look complex to
implement, and should allow for highly flexible setups. Of course, the devil
is in the details, and we're talking about latencies in the distributed
system here. That's also being discussed on the <a href="http://archives.postgresql.org/pgsql-hackers/">mailing list</a>.</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/release.html">release</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 06 Sep 2010 18:05:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/09/06-synchronous-replication.html</guid>
</item>


<item>
  <title>Want to share your recipes?</title>
  <link>http://tapoueh.org/blog/2010/08/31-want-to-share-your-recipes.html</link>
  <description><![CDATA[<h1>Want to share your recipes?</h1>
<div class="date">Tuesday, August 31 2010, 14:15</div>
<div id="article">
<p>Yes, that's another <a href="http://github.com/dimitri/el-get/">el-get</a> related entry. It seems to take a lot of my
attention these days. After having set up the <code>git</code> repository so that you can
update <code>el-get</code> from within itself (so that it's <em>self-contained</em>), the next
logical step is providing <em>recipes</em>.</p>

<p>By that I mean that <code>el-get-sources</code> entries will certainly look a lot alike
from one user to another. Let's take the <code>el-get</code> entry itself:</p>

<pre class="src">
(<span style="color: #729fcf;">:name</span> el-get
       <span style="color: #729fcf;">:type</span> git
       <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"git://github.com/dimitri/el-get.git"</span>
       <span style="color: #729fcf;">:features</span> <span style="color: #ad7fa8; font-style: italic;">"el-get"</span>)
</pre>

<p>I guess all <code>el-get</code> users will have just the same 4 lines in their
<code>el-get-sources</code>. So let's call that a <em>recipe</em>, and have <code>el-get</code> look for yours
in the <code>el-get-recipe-path</code> directories. A recipe is found by looking in those
directories in order, and must be named <code>package.el</code>. Now, <code>el-get</code> already
contains a handful of them, as you can see:</p>

<pre class="src">
ELISP&gt; (directory-files <span style="color: #ad7fa8; font-style: italic;">"~/dev/emacs/el-get/recipes/"</span> nil <span style="color: #ad7fa8; font-style: italic;">"[</span><span style="color: #ad7fa8; font-style: italic;">^</span><span style="color: #ad7fa8; font-style: italic;">.]$"</span>)
(<span style="color: #ad7fa8; font-style: italic;">"auctex.el"</span> <span style="color: #ad7fa8; font-style: italic;">"bbdb.el"</span> <span style="color: #ad7fa8; font-style: italic;">"cssh.el"</span> <span style="color: #ad7fa8; font-style: italic;">"el-get.el"</span> <span style="color: #ad7fa8; font-style: italic;">"emms.el"</span> <span style="color: #ad7fa8; font-style: italic;">"erc-track-score.el"</span>
 <span style="color: #ad7fa8; font-style: italic;">"escreen.el"</span> <span style="color: #ad7fa8; font-style: italic;">"google-maps.el"</span> <span style="color: #ad7fa8; font-style: italic;">"haskell-mode.el"</span> <span style="color: #ad7fa8; font-style: italic;">"hl-sexp.el"</span> <span style="color: #ad7fa8; font-style: italic;">"magit.el"</span>
 <span style="color: #ad7fa8; font-style: italic;">"muse-blog.el"</span> <span style="color: #ad7fa8; font-style: italic;">"nxhtml.el"</span> <span style="color: #ad7fa8; font-style: italic;">"psvn.el"</span> <span style="color: #ad7fa8; font-style: italic;">"rainbow-mode.el"</span> <span style="color: #ad7fa8; font-style: italic;">"rcirc-groups.el"</span>
 <span style="color: #ad7fa8; font-style: italic;">"vkill.el"</span> <span style="color: #ad7fa8; font-style: italic;">"xcscope.el"</span> <span style="color: #ad7fa8; font-style: italic;">"xml-rpc-el.el"</span> <span style="color: #ad7fa8; font-style: italic;">"yasnippet.el"</span>)
</pre>

<p>Please note that you can have your own local recipes by adding directories
to <code>el-get-recipe-path</code>. So now your minimalistic <code>el-get-sources</code> list will
look like <code>'(el-get cssh screen)</code>, say. And if you want to override a recipe,
for instance to use the default one but still have a personal <code>:after</code>
function containing your own setup, then simply make your <code>el-get-sources</code>
entry a partial one: leave out <code>:type</code> and <code>el-get</code> will merge your local
overrides atop the default one.</p>
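
<p>The lookup and merge rules above can be sketched in a few lines of Python.
This is not how <code>el-get</code> is written (it is Emacs Lisp, of course): the function
names and the use of plain dictionaries for recipes are assumptions made for
the sake of the illustration.</p>

<pre class="src">
from pathlib import Path


def find_recipe(package, recipe_path):
    """Return the first package.el found in the directories, in order."""
    for directory in recipe_path:
        candidate = Path(directory) / (package + ".el")
        if candidate.is_file():
            return candidate
    return None


def resolve_source(local_entry, default_recipe):
    """Merge a partial local entry atop the default recipe.

    An entry that has its own "type" is complete and is used as is.
    """
    if "type" in local_entry:
        return dict(local_entry)
    merged = dict(default_recipe)
    merged.update(local_entry)  # local overrides win
    return merged
</pre>

<p>Here a local entry holding only a name and an <code>after</code> function would end up
with the default recipe's type and url, plus the personal setup.</p>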

<p>Finally, the way to share your recipes is to send me an email with the
file, or to do the same over the <code>github</code> interface; I guess I'll still
receive a mail then.</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/muse.html">Muse</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/cssh.html">cssh</a> <a href="../../../tags/rcirc.html">rcirc</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 31 Aug 2010 14:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/31-want-to-share-your-recipes.html</guid>
</item>
<item>
  <title>Happy Numbers</title>
  <link>http://tapoueh.org/blog/2010/08/blog/2010/08/30-happy-numbers.html</link>
  <description><![CDATA[<p>After discovering the excellent <a href="http://gwene.org/">Gwene</a> service, which allows you to subscribe
to <em>newsgroups</em> to read <code>RSS</code> content (<em>blogs</em>, <em>planets</em>, <em>commits</em>, etc), I came to
read this nice article about <a href="http://programmingpraxis.com/2010/07/23/happy-numbers/">Happy Numbers</a>. That's a little problem that
fits an interview-style question well, so I first solved it yesterday
evening in <a href="http://www.gnu.org/software/emacs/emacs-lisp-intro/html_node/List-Processing.html#List-Processing">Emacs Lisp</a> as that's the language I use the most these days.</p>

<blockquote>
<p class="quoted">
A happy number is defined by the following process. Starting with any
positive integer, replace the number by the sum of the squares of its
digits, and repeat the process until the number equals 1 (where it will
stay), or it loops endlessly in a cycle which does not include 1. Those
numbers for which this process ends in 1 are happy numbers, while those
that do not end in 1 are unhappy numbers (or sad numbers).</p>

</blockquote>
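
<p>Before going further, here is a minimal Python sketch of that definition, for
reference. The function name <code>is_happy</code> is made up for this example; the set
of already seen numbers is what stops the loop when the process enters a
cycle.</p>

<pre class="src">
def is_happy(n):
    """Tell whether the positive integer n is a happy number."""
    seen = set()
    # Stop either on 1 (happy) or on a number already met (a cycle).
    while n != 1 and n not in seen:
        seen.add(n)
        n = sum(int(digit) ** 2 for digit in str(n))
    return n == 1


print([x for x in range(1, 51) if is_happy(x)])
# prints [1, 7, 10, 13, 19, 23, 28, 31, 32, 44, 49]
</pre>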

<p>Now, what about implementing the same in pure <code>SQL</code>, for more fun? Now that's
interesting! After all, we didn't get <code>WITH RECURSIVE</code> for tree traversal
only, <a href="http://archives.postgresql.org/message-id/e08cc0400911042333o5361b21cu2c9438f82b1e55ce@mail.gmail.com">did we</a>?</p>

<p>Unfortunately, we need a little helper function first, if only to ease the
reading of the recursive query. I didn't try to inline it, but here it goes:</p>

<pre class="src">
create or replace function digits(x bigint)
  returns setof int
  language sql
as $$
  select substring($1::text from i for 1)::int
    from generate_series(1, length($1::text)) as t(i)
$$;
</pre>

<p>That was easy: it will output one row per digit of the input number, and
rather than resorting to powers of ten and divisions and remainders, we
use the plain old text representation and <code>substring</code>. Now, to the real
problem. If you've read what a happy number is and have already read the
fine manual about <a href="http://www.postgresql.org/docs/8.4/interactive/queries-with.html">Recursive Query Evaluation</a>, it should be quite easy to
read the following:</p>

<pre class="src">
with recursive happy(n, seen) as (
    select 7::bigint, <span style="color: #bc8f8f;">'{}'</span>::bigint[]
  union all
    select sum(d*d), h.seen || sum(d*d)
      from (select n, digits(n) as d, seen
              from happy
           ) as h
  group by h.n, h.seen
    having not seen @&gt; array[sum(d*d)]
)
  select * from happy;
  n  |       seen
<span style="color: #b22222;">-----+------------------
</span>   7 | {}
  49 | {49}
  97 | {49,97}
 130 | {49,97,130}
  10 | {49,97,130,10}
   1 | {49,97,130,10,1}
(6 rows)

Time: 1.238 ms
</pre>

<p>That shows how it works for some <em>happy</em> number, and it's easy to test for a
non-happy one, like for example <code>17</code>. The query won't cycle thanks to the <code>seen</code>
array and the <code>having</code> filter, so the only difference between a <em>happy</em> and a
<em>sad</em> number will be that in the former case the last line output by the
recursive query will have <code>n = 1</code>. Let's expand this knowledge
into a proper function (because we want to be able to have the number we
test for happiness as an argument):</p>

<pre class="src">
create or replace function happy(x bigint)
  returns boolean
  language sql
as $$
with recursive happy(n, seen) as (
    select $1, <span style="color: #bc8f8f;">'{}'</span>::bigint[]
  union all
    select sum(d*d), h.seen || sum(d*d)
      from (select n, digits(n) as d, seen
              from happy
           ) as h
  group by h.n, h.seen
    having not seen @&gt; array[sum(d*d)]
)
  select n = 1 as happy
    from happy
order by array_length(seen, 1) desc nulls last
   limit 1
$$;
</pre>

<p>We need the <code>desc nulls last</code> trick in the <code>order by</code> because the <code>array_length()</code>
of any dimension of an empty array is <code>NULL</code>, and we certainly don't want to
return all and any number as unhappy on the grounds that the query result
contains a line <code>input, {}</code>. Let's now play the same tricks as in the puzzle
article:</p>

<pre class="src">
=# select array_agg(x) as happy
     from generate_series(1, 50) as t(x)
    where happy(x);
              happy
<span style="color: #b22222;">----------------------------------
</span> {1,7,10,13,19,23,28,31,32,44,49}
(1 row)

Time: 24.527 ms

=# explain analyze select x
                     from generate_series(1, 10000) as t(x)
                    where happy(x);
                      QUERY PLAN
<span style="color: #b22222;">------------------------------------------------------------
</span> Function Scan on generate_series t
     (cost=0.00..265.00 rows=333 width=4)
     (actual time=2.938..3651.019 rows=1442 loops=1)
   Filter: happy((x)::bigint)
 Total runtime: 3651.534 ms
(3 rows)

Time: 3652.178 ms
</pre>

<p>(Yes, I tricked the <code>EXPLAIN ANALYZE</code> output so that it fits on the page width
here). For what it's worth, finding the happy numbers from 1 to <code>10000</code> in <em>Emacs
Lisp</em> on the same laptop takes <code>2830 ms</code>, also running a recursive version of
the code.</p>

<h3>Update, the Emacs Lisp version, inline:</h3>

<pre class="src">
(<span style="color: #7f007f;">defun</span> <span style="color: #0000ff;">happy?</span> (<span style="color: #228b22;">&amp;optional</span> n seen)
  <span style="color: #bc8f8f;">"return true when n is a happy number"</span>
  (interactive)
  (<span style="color: #7f007f;">let*</span> ((number    (or n (read-from-minibuffer
                           <span style="color: #bc8f8f;">"Is this number happy: "</span>)))
         (digits    (mapcar
                     'string-to-number
                     (subseq (split-string number <span style="color: #bc8f8f;">""</span>) 1 -1)))
         (squares   (mapcar (<span style="color: #7f007f;">lambda</span> (x) (* x x)) digits))
         (happiness (apply '+ squares)))
    (<span style="color: #7f007f;">cond</span> ((eq 1 happiness)      t)
          ((memq happiness seen) nil)
          (t
           (happy? (number-to-string happiness)
                   (push happiness seen))))))

(<span style="color: #7f007f;">defun</span> <span style="color: #0000ff;">find-happy-numbers</span> (<span style="color: #228b22;">&amp;optional</span> limit)
  <span style="color: #bc8f8f;">"find all happy numbers from 1 to limit"</span>
  (interactive)
  (<span style="color: #7f007f;">let</span> ((count (or limit
                   (read-from-minibuffer
                    <span style="color: #bc8f8f;">"List of happy numbers from 1 to: "</span>)))
        happy)
    (<span style="color: #7f007f;">dotimes</span> (n (string-to-number count))
      (<span style="color: #7f007f;">when</span> (happy? (number-to-string (1+ n)))
        (push (1+ n) happy)))
    (nreverse happy)))
</pre>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 30 Aug 2010 11:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/blog/2010/08/30-happy-numbers.html</guid>
</item>
<item>
  <title>Happy Numbers</title>
  <link>http://tapoueh.org/blog/2010/08/30-happy-numbers.html</link>
  <description><![CDATA[<h1>Happy Numbers</h1>
<div class="date">Monday, August 30 2010, 11:00</div>
<div id="article">
<p>After discovering the excellent <a href="http://gwene.org/">Gwene</a> service, which allows you to subscribe
to <em>newsgroups</em> to read <code>RSS</code> content (<em>blogs</em>, <em>planets</em>, <em>commits</em>, etc), I came to
read this nice article about <a href="http://programmingpraxis.com/2010/07/23/happy-numbers/">Happy Numbers</a>. It's a small problem that
would suit an interview-style question well, so I first solved it yesterday
evening in <a href="http://www.gnu.org/software/emacs/emacs-lisp-intro/html_node/List-Processing.html#List-Processing">Emacs Lisp</a>, as that's the language I use the most these days.</p>

<blockquote>
<p class="quoted">
A happy number is defined by the following process. Starting with any
positive integer, replace the number by the sum of the squares of its
digits, and repeat the process until the number equals 1 (where it will
stay), or it loops endlessly in a cycle which does not include 1. Those
numbers for which this process ends in 1 are happy numbers, while those
that do not end in 1 are unhappy numbers (or sad numbers).</p>

</blockquote>

<p>Now, what about implementing the same in pure <code>SQL</code>, for more fun? Now that's
interesting! After all, we didn't get <code>WITH RECURSIVE</code> for tree traversal
only, <a href="http://archives.postgresql.org/message-id/e08cc0400911042333o5361b21cu2c9438f82b1e55ce@mail.gmail.com">did we</a>?</p>

<p>Unfortunately, we need a little helper function first, if only to ease the
reading of the recursive query. I didn't try to inline it, but here it goes:</p>

<pre class="src">
create or replace function digits(x bigint)
  returns setof int
  language sql
as $$
  select substring($1::text from i for 1)::int
    from generate_series(1, length($1::text)) as t(i)
$$;
</pre>

<p>That was easy: it will output one row per digit of the input number, and
rather than resorting to powers of ten, divisions and remainders, we simply
use the plain old text representation and <code>substring</code>. Now, to the real
problem. If you've read what a happy number is and have already read the
fine manual about <a href="http://www.postgresql.org/docs/8.4/interactive/queries-with.html">Recursive Query Evaluation</a>, it should be quite easy to
read the following:</p>

<pre class="src">
with recursive happy(n, seen) as (
    select 7::bigint, <span style="color: #ad7fa8; font-style: italic;">'{}'</span>::bigint[]
  union all
    select sum(d*d), h.seen || sum(d*d)
      from (select n, digits(n) as d, seen
              from happy
           ) as h
  group by h.n, h.seen
    having not seen @&gt; array[sum(d*d)]
)
  select * from happy;
  n  |       seen
<span style="color: #888a85;">-----+------------------
</span>   7 | {}
  49 | {49}
  97 | {49,97}
 130 | {49,97,130}
  10 | {49,97,130,10}
   1 | {49,97,130,10,1}
(6 rows)

Time: 1.238 ms
</pre>

<p>That shows how it works for a <em>happy</em> number, and it's easy to test for a
non-happy one, for example <code>17</code>. The query won't cycle thanks to the <code>seen</code>
array and the <code>having</code> filter, so the only difference between a <em>happy</em> and a
<em>sad</em> number is that in the former case the last line output by the
recursive query has <code>n = 1</code>. Let's expand this knowledge
into a proper function (because we want to be able to pass the number we
test for happiness as an argument):</p>

<pre class="src">
create or replace function happy(x bigint)
  returns boolean
  language sql
as $$
with recursive happy(n, seen) as (
    select $1, <span style="color: #ad7fa8; font-style: italic;">'{}'</span>::bigint[]
  union all
    select sum(d*d), h.seen || sum(d*d)
      from (select n, digits(n) as d, seen
              from happy
           ) as h
  group by h.n, h.seen
    having not seen @&gt; array[sum(d*d)]
)
  select n = 1 as happy
    from happy
order by array_length(seen, 1) desc nulls last
   limit 1
$$;
</pre>

<p>We need the <code>desc nulls last</code> trick in the <code>order by</code> because the <code>array_length()</code>
of any dimension of an empty array is <code>NULL</code>, and we certainly don't want to
return all and any number as unhappy on the grounds that the query result
contains a line <code>input, {}</code>. Let's now play the same tricks as in the puzzle
article:</p>

<pre class="src">
=# select array_agg(x) as happy
     from generate_series(1, 50) as t(x)
    where happy(x);
              happy
<span style="color: #888a85;">----------------------------------
</span> {1,7,10,13,19,23,28,31,32,44,49}
(1 row)

Time: 24.527 ms

=# explain analyze select x
                     from generate_series(1, 10000) as t(x)
                    where happy(x);
                      QUERY PLAN
<span style="color: #888a85;">------------------------------------------------------------
</span> Function Scan on generate_series t
     (cost=0.00..265.00 rows=333 width=4)
     (actual time=2.938..3651.019 rows=1442 loops=1)
   Filter: happy((x)::bigint)
 Total runtime: 3651.534 ms
(3 rows)

Time: 3652.178 ms
</pre>

<p>(Yes, I tricked the <code>EXPLAIN ANALYZE</code> output so that it fits on the page width
here). For what it's worth, finding the happy numbers up to <code>10000</code> in <em>Emacs
Lisp</em> on the same laptop takes <code>2830 ms</code>, also running a recursive version of
the code.</p>
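
<p>For readers who want to cross-check these results outside the database,
here is a minimal sketch of the same idea in Python: keep the values seen so
far and stop on <code>1</code> or on a repeat. It is not the code that was timed above;
the names <code>digits</code>, <code>is_happy</code> and <code>happy_numbers</code> are made up for this
illustration, and <code>digits</code> uses division and remainder rather than the text
trick, just to show the alternative.</p>

```python
def digits(n):
    """Yield the decimal digits of n (lowest first) using divmod."""
    while True:
        n, d = divmod(n, 10)
        yield d
        if n == 0:
            break


def is_happy(n):
    """Return True when n reaches 1, False when a value repeats."""
    seen = set()  # plays the role of the seen array in the SQL query
    while n != 1 and n not in seen:
        seen.add(n)
        n = sum(d * d for d in digits(n))
    return n == 1


def happy_numbers(limit):
    """List the happy numbers from 1 to limit, inclusive."""
    return [n for n in range(1, limit + 1) if is_happy(n)]


print(happy_numbers(50))
```

<p>Run as is, it prints the same eleven numbers as the <code>array_agg</code> query
above, and <code>len(happy_numbers(10000))</code> gives <code>1442</code>, the row count reported
by <code>EXPLAIN ANALYZE</code>.</p>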

<h3>Update, the Emacs Lisp version, inline:</h3>

<pre class="src">
(<span style="color: #729fcf; font-weight: bold;">defun</span> <span style="color: #edd400; font-weight: bold; font-style: italic;">happy?</span> (<span style="color: #8ae234; font-weight: bold;">&amp;optional</span> n seen)
  <span style="color: #888a85;">"return true when n is a happy number"</span>
  (interactive)
  (<span style="color: #729fcf; font-weight: bold;">let*</span> ((number    (or n (read-from-minibuffer
                           <span style="color: #ad7fa8; font-style: italic;">"Is this number happy: "</span>)))
         (digits    (mapcar
                     'string-to-number
                     (split-string number <span style="color: #ad7fa8; font-style: italic;">""</span> t)))
         (squares   (mapcar (<span style="color: #729fcf; font-weight: bold;">lambda</span> (x) (* x x)) digits))
         (happiness (apply '+ squares)))
    (<span style="color: #729fcf; font-weight: bold;">cond</span> ((eq 1 happiness)      t)
          ((memq happiness seen) nil)
          (t
           (happy? (number-to-string happiness)
                   (push happiness seen))))))

(<span style="color: #729fcf; font-weight: bold;">defun</span> <span style="color: #edd400; font-weight: bold; font-style: italic;">find-happy-numbers</span> (<span style="color: #8ae234; font-weight: bold;">&amp;optional</span> limit)
  <span style="color: #888a85;">"find all happy numbers from 1 to limit"</span>
  (interactive)
  (<span style="color: #729fcf; font-weight: bold;">let</span> ((count (or limit
                   (read-from-minibuffer
                    <span style="color: #ad7fa8; font-style: italic;">"List of happy numbers from 1 to: "</span>)))
        happy)
    (<span style="color: #729fcf; font-weight: bold;">dotimes</span> (n (string-to-number count))
      (<span style="color: #729fcf; font-weight: bold;">when</span> (happy? (number-to-string (1+ n)))
        (push (1+ n) happy)))
    (nreverse happy)))
</pre>



<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/emacs.html">Emacs</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 30 Aug 2010 11:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/30-happy-numbers.html</guid>
</item>
<item>
  <title>welcome el-get scratch installer</title>
  <link>http://tapoueh.org/blog/2010/08/blog/2010/08/27-welcome-el-get-scratch-installer.html</link>
  <description><![CDATA[<p><span class="hack"> </span></p>

<p>A very good remark from some users: installing and managing <code>el-get</code> should be
simpler. They wanted both an easy install of the thing, and a way to be able
to manage it afterwards (like, update the local copy against the
authoritative source). So I decided it was high time for getting the code
out of my <code>~/.emacs.d</code> git repository and up to a public place:
<a href="http://github.com/dimitri/el-get">http://github.com/dimitri/el-get</a>.</p>

<p>Then, I added some documentation (a <code>README</code>), and then, a <code>*scratch*
installer</code>, following great ideas from <code>ELPA</code>. So have at it, it's a copy paste
away!</p>

<p>Don't forget to set up your <code>el-get-sources</code> and include the <code>el-get</code>
source there for updates; there's nothing magic about it, so it's up to you. You
may notice that it's not yet possible to init <code>el-get</code> from <code>el-get-sources</code>,
though: that's the drawback of the lack of magic. So you will still have to
add an explicit <code>(require 'el-get)</code> before you go and define your own
<code>el-get-sources</code>, then finally call <code>(el-get)</code>. I don't think that's a problem I need
to solve, though.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 27 Aug 2010 14:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/blog/2010/08/27-welcome-el-get-scratch-installer.html</guid>
</item>
<item>
  <title>welcome el-get scratch installer</title>
  <link>http://tapoueh.org/blog/2010/08/27-welcome-el-get-scratch-installer.html</link>
  <description><![CDATA[<h1>welcome el-get scratch installer</h1>
<div class="date">Friday, August 27 2010, 14:15</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>A very good remark from some users: installing and managing <code>el-get</code> should be
simpler. They wanted both an easy install of the thing, and a way to be able
to manage it afterwards (like, update the local copy against the
authoritative source). So I decided it was high time for getting the code
out of my <code>~/.emacs.d</code> git repository and up to a public place:
<a href="http://github.com/dimitri/el-get">http://github.com/dimitri/el-get</a>.</p>

<p>Then, I added some documentation (a <code>README</code>), and then, a <code>*scratch*
installer</code>, following great ideas from <code>ELPA</code>. So have at it, it's a copy paste
away!</p>

<p>Don't forget to set up your <code>el-get-sources</code> and include the <code>el-get</code>
source there for updates; there's nothing magic about it, so it's up to you. You
may notice that it's not yet possible to init <code>el-get</code> from <code>el-get-sources</code>,
though: that's the drawback of the lack of magic. So you will still have to
add an explicit <code>(require 'el-get)</code> before you go and define your own
<code>el-get-sources</code>, then finally call <code>(el-get)</code>. I don't think that's a problem I need
to solve, though.</p>



<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 27 Aug 2010 14:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/27-welcome-el-get-scratch-installer.html</guid>
</item>
<item>
  <title>Playing with bit strings</title>
  <link>http://tapoueh.org/blog/2010/08/26-playing-with-bit-strings.html</link>
  <description><![CDATA[<h1>Playing with bit strings</h1>
<div class="date">Thursday, August 26 2010, 17:45</div>
<div id="article">
<p>The idea of the day ain't directly from me, I'm just helping with a very
thin subpart of the problem. The problem, I can't say much about, let's just
assume you want to reduce the storage of <code>MD5</code> in your database, so you want
to abuse <a href="http://www.postgresql.org/docs/8.4/interactive/datatype-bit.html">bit strings</a>. A solution to use them works fine, but the datatype is
still missing some facilities, for example going from and to hexadecimal
representation in text.</p>

<pre class="src">
create or replace function hex_to_varbit(h text)
 returns varbit
 language sql
as $$
  select (<span style="color: #ad7fa8; font-style: italic;">'X'</span> || $1)::varbit;
$$;

create or replace function varbit_to_hex(b varbit)
 returns text
 language sql
as $$
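  -- to_hex() drops leading zeros, so pad each 32-bit chunk to 8 hex digits
  select array_to_string(
           array_agg(lpad(to_hex((b &lt;&lt; (32*o))::bit(32)::bigint), 8, <span style="color: #ad7fa8; font-style: italic;">'0'</span>)),
           <span style="color: #ad7fa8; font-style: italic;">''</span>)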
    from (select b, generate_series(0, n-1) as o
            from (select $1, octet_length($1)/4) as t(b, n)) as x
$$;
</pre>

<p>To understand the magic in the second function, let's walk through the tests
one could do when wanting to grasp how things work in the <code>bitstring</code> world
(along with some reading of the fine documentation).</p>

<pre class="src">
=# select ('101011001011100110010110'::varbit &lt;&lt; 0)::bit(8);
   bit
----------
 10101100
(1 row)

=# select ('101011001011100110010110'::varbit &lt;&lt; 8)::bit(8);
   bit
----------
 10111001
(1 row)

=# select ('101011001011100110010110'::varbit &lt;&lt; 16)::bit(8);
   bit
----------
 10010110
(1 row)

=# select * from *TEMP VERSION OF THE FUNCTION FOR TESTING*
 o |                b                 |    x
---+----------------------------------+----------
 0 | 10101100101111010001100011011011 | acbd18db
 1 | 01001100110000101111100001011100 | 4cc2f85c
 2 | 11101101111011110110010101001111 | edef654f
 3 | 11001100110001001010010011011000 | ccc4a4d8
(4 rows)
</pre>

<p>What do we get from that, you ask? Let's see a little example:</p>

<pre class="src">
=# select hex_to_varbit(md5('foo'));
                                                          hex_to_varbit
----------------------------------------------------------------------------------------------------------------------------------
 10101100101111010001100011011011010011001100001011111000010111001110110111101111011001010100111111001100110001001010010011011000
(1 row)

=# select md5('foo'), varbit_to_hex(hex_to_varbit(md5('foo')));
               md5                |          varbit_to_hex
----------------------------------+----------------------------------
 acbd18db4cc2f85cedef654fccc4a4d8 | acbd18db4cc2f85cedef654fccc4a4d8
(1 row)
</pre>
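
<p>The chunking logic is perhaps easier to follow outside SQL. Here is a small
Python sketch, under the assumption that the bit string length is a multiple
of 32 (true for an <code>MD5</code>); the helper names <code>hex_to_bits</code> and <code>bits_to_hex</code>
are hypothetical. It also shows one thing to watch for: PostgreSQL's
<code>to_hex()</code> does not emit leading zeros, so each 32-bit chunk has to be
left-padded to eight hex digits for the round trip to be exact.</p>

```python
import hashlib


def hex_to_bits(h):
    """Turn a hex string into a string of 0 and 1, four bits per hex digit."""
    return "".join(format(int(c, 16), "04b") for c in h)


def bits_to_hex(bits, pad=True):
    """Convert 32 bits at a time back to hex, like the SQL loop over o."""
    words = []
    for start in range(0, len(bits), 32):
        word = int(bits[start:start + 32], 2)
        # Without padding, a word such as 0x0000beef is rendered as "beef".
        words.append(format(word, "08x" if pad else "x"))
    return "".join(words)


digest = hashlib.md5(b"foo").hexdigest()
assert bits_to_hex(hex_to_bits(digest)) == digest

tricky = "0000beef" * 4
print(bits_to_hex(hex_to_bits(tricky), pad=False))  # leading zeros are lost
```

<p>The digest of <code>foo</code> happens to have no 32-bit chunk starting with a zero
digit, which is why the example above round-trips either way; the <code>tricky</code>
value does not.</p>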

<p>Storing <code>varbits</code> rather than the <code>text</code> form of the <code>MD5</code> allows us to go from
<code>6510 MB</code> down to <code>4976 MB</code> on a sample table containing 100 million
rows. We're targeting more than that, so that's a great win down here!</p>

<p>In case you wonder: when querying the main index on <code>varbit</code> rather than the one on
<code>text</code> for a single result row, the conversion with
<code>varbit_to_hex</code> seems to cost around <code>28 µs</code>. We can afford it.</p>

<p>Hope this helps!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 26 Aug 2010 17:45:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/26-playing-with-bit-strings.html</guid>
</item>
<item>
  <title>el-get news</title>
  <link>http://tapoueh.org/blog/2010/08/blog/2010/08/26-el-get-news.html</link>
  <description><![CDATA[<p>I've been receiving some requests for <a href="http://www.emacswiki.org/emacs/el-get.el">el-get</a>, some of them even included a
patch. So now there's support for <code>bzr</code>, <code>CVS</code> and <code>http-tar</code>, augmenting the
existing support for <code>git</code>, <code>git-svn</code>, <code>apt-get</code>, <code>fink</code> and <code>ELPA</code> formats.</p>

<p>Also, the <code>install</code> and even the <code>build</code> are now completely <em>asynchronous</em>
(there's a pending bugfix for the build, which now uses
<a href="http://www.gnu.org/software/emacs/elisp/html_node/Asynchronous-Processes.html">start-process-shell-command</a>). The advantage of doing so is that you're free
to use Emacs as usual while <code>el-get</code> is having your piece of <code>elisp</code> code
compiled, which can take time.</p>

<p>The drawback is that it's not easy to do the associated setup at the right
time without support from <code>el-get</code>, so you have the new option <code>:after</code>, which
takes a <code>functionp</code> object: please consider using that to provide your own
special setup for the external emacs bits and pieces you're using.</p>

<p>Let's see some examples of the new features:</p>

<pre class="src">
  (<span style="color: #da70d6;">:name</span> xml-rpc-el
         <span style="color: #da70d6;">:type</span> bzr
         <span style="color: #da70d6;">:url</span> <span style="color: #bc8f8f;">"lp:xml-rpc-el"</span>)

  (<span style="color: #da70d6;">:name</span> haskell-mode
         <span style="color: #da70d6;">:type</span> http-tar
         <span style="color: #da70d6;">:options</span> (<span style="color: #bc8f8f;">"xzf"</span>)
         <span style="color: #da70d6;">:url</span> <span style="color: #bc8f8f;">"http://projects.haskell.org/haskellmode-emacs/haskell-mode-2.8.0.tar.gz"</span>
         <span style="color: #da70d6;">:load</span> <span style="color: #bc8f8f;">"haskell-site-file.el"</span>
         <span style="color: #da70d6;">:after</span> (<span style="color: #7f007f;">lambda</span> ()
                  (add-hook 'haskell-mode-hook 'turn-on-haskell-doc-mode)
                  (add-hook 'haskell-mode-hook 'turn-on-haskell-indentation)))

  (<span style="color: #da70d6;">:name</span> auctex
         <span style="color: #da70d6;">:type</span> cvs
         <span style="color: #da70d6;">:module</span> <span style="color: #bc8f8f;">"auctex"</span>
         <span style="color: #da70d6;">:url</span> <span style="color: #bc8f8f;">":pserver:anonymous@cvs.sv.gnu.org:/sources/auctex"</span>
         <span style="color: #da70d6;">:build</span> (<span style="color: #bc8f8f;">"./autogen.sh"</span> <span style="color: #bc8f8f;">"./configure"</span> <span style="color: #bc8f8f;">"make"</span>)
         <span style="color: #da70d6;">:load</span>  (<span style="color: #bc8f8f;">"auctex.el"</span> <span style="color: #bc8f8f;">"preview/preview-latex.el"</span>)
         <span style="color: #da70d6;">:info</span> <span style="color: #bc8f8f;">"doc"</span>)
</pre>

<p>As you can see, there are also the new options <code>:module</code> (only used by <code>CVS</code> so
far) and <code>:options</code> (only used by <code>http-tar</code> so far). With the latter method,
the <code>:options</code> key allows you to support virtually any kind of <code>tar</code>
compression (<code>.tar.bz2</code>, etc).</p>

<p>The <code>CVS</code> support currently does not include authentication against the
anonymous <code>pserver</code>, because the only repository I've been asked support for
isn't using that, and the couple of servers that I know of are either
wanting no password at the prompt, or a dummy one. That's for another day,
if needed at all.</p>

<p>That pushes the little local hack to more than a thousand lines of <code>elisp</code>
code, and the next steps include proposing it to <a href="http://tromey.com/elpa/">ELPA</a> so that getting to use
it is easier than ever. You'd just have to choose whether to install <code>ELPA</code>
from <code>el-get</code> or the other way around.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 26 Aug 2010 16:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/blog/2010/08/26-el-get-news.html</guid>
</item>
<item>
  <title>el-get news</title>
  <link>http://tapoueh.org/blog/2010/08/26-el-get-news.html</link>
  <description><![CDATA[<h1>el-get news</h1>
<div class="date">Thursday, August 26 2010, 16:30</div>
<div id="article">
<p>I've been receiving some requests for <a href="http://www.emacswiki.org/emacs/el-get.el">el-get</a>, some of them even included a
patch. So now there's support for <code>bzr</code>, <code>CVS</code> and <code>http-tar</code>, augmenting the
existing support for <code>git</code>, <code>git-svn</code>, <code>apt-get</code>, <code>fink</code> and <code>ELPA</code> formats.</p>

<p>Also, the <code>install</code> and even the <code>build</code> are now completely <em>asynchronous</em>
(there's a pending bugfix for the build, which now uses
<a href="http://www.gnu.org/software/emacs/elisp/html_node/Asynchronous-Processes.html">start-process-shell-command</a>). The advantage of doing so is that you're free
to use Emacs as usual while <code>el-get</code> is having your piece of <code>elisp</code> code
compiled, which can take time.</p>

<p>The drawback is that it's not easy to do the associated setup at the right
time without support from <code>el-get</code>, so you have the new option <code>:after</code>, which
takes a <code>functionp</code> object: please consider using that to provide your own
special setup for the external emacs bits and pieces you're using.</p>

<p>Let's see some examples of the new features:</p>

<pre class="src">
  (<span style="color: #729fcf;">:name</span> xml-rpc-el
         <span style="color: #729fcf;">:type</span> bzr
         <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"lp:xml-rpc-el"</span>)

  (<span style="color: #729fcf;">:name</span> haskell-mode
         <span style="color: #729fcf;">:type</span> http-tar
         <span style="color: #729fcf;">:options</span> (<span style="color: #ad7fa8; font-style: italic;">"xzf"</span>)
         <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"http://projects.haskell.org/haskellmode-emacs/haskell-mode-2.8.0.tar.gz"</span>
         <span style="color: #729fcf;">:load</span> <span style="color: #ad7fa8; font-style: italic;">"haskell-site-file.el"</span>
         <span style="color: #729fcf;">:after</span> (<span style="color: #729fcf; font-weight: bold;">lambda</span> ()
                  (add-hook 'haskell-mode-hook 'turn-on-haskell-doc-mode)
                  (add-hook 'haskell-mode-hook 'turn-on-haskell-indentation)))

  (<span style="color: #729fcf;">:name</span> auctex
         <span style="color: #729fcf;">:type</span> cvs
         <span style="color: #729fcf;">:module</span> <span style="color: #ad7fa8; font-style: italic;">"auctex"</span>
         <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">":pserver:anonymous@cvs.sv.gnu.org:/sources/auctex"</span>
         <span style="color: #729fcf;">:build</span> (<span style="color: #ad7fa8; font-style: italic;">"./autogen.sh"</span> <span style="color: #ad7fa8; font-style: italic;">"./configure"</span> <span style="color: #ad7fa8; font-style: italic;">"make"</span>)
         <span style="color: #729fcf;">:load</span>  (<span style="color: #ad7fa8; font-style: italic;">"auctex.el"</span> <span style="color: #ad7fa8; font-style: italic;">"preview/preview-latex.el"</span>)
         <span style="color: #729fcf;">:info</span> <span style="color: #ad7fa8; font-style: italic;">"doc"</span>)
</pre>

<p>As you can see, there are also the new options <code>:module</code> (only used by <code>CVS</code> so
far) and <code>:options</code> (only used by <code>http-tar</code> so far). With the latter method,
the <code>:options</code> key allows you to support virtually any kind of <code>tar</code>
compression (<code>.tar.bz2</code>, etc).</p>

<p>The <code>CVS</code> support currently does not include authentication against the
anonymous <code>pserver</code>, because the only repository I've been asked support for
isn't using that, and the couple of servers that I know of are either
wanting no password at the prompt, or a dummy one. That's for another day,
if needed at all.</p>

<p>That pushes the little local hack to more than a thousand lines of <code>elisp</code>
code, and the next steps include proposing it to <a href="http://tromey.com/elpa/">ELPA</a> so that getting to use
it is easier than ever. You'd just have to choose whether to install <code>ELPA</code>
from <code>el-get</code> or the other way around.</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 26 Aug 2010 16:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/26-el-get-news.html</guid>
</item>
<item>
  <title>el-get and dim-switch-window status update</title>
  <link>http://tapoueh.org/blog/2010/08/blog/2010/08/09-el-get-and-dim-switch-window-status-update.html</link>
  <description><![CDATA[<p><span class="hack"> </span></p>

<p>Thanks to you readers of <a href="http://planet.emacsen.org/">Planet Emacsen</a> taking the time to try those pieces
of emacs lisp found in my blog, and also the time to comment on them, some
bugs have been fixed, and new releases appeared.</p>

<p><a href="http://tapoueh.org/projects.html#sec20">el-get</a> had a typo kind of bug in its support for <code>apt-get</code> and <code>fink</code>
packages, and I managed to break the <code>elpa</code> and <code>http</code> support when going <em>all
asynchronous</em> by forgetting to update the calling convention I'm using. While fixing
that, I also switched to <code>url-retrieve</code> so that the <code>http</code> support is
<em>asynchronous</em> too. That makes version <code>0.5</code>, available on the <a href="http://www.emacswiki.org/emacs/el-get.el">emacswiki el-get</a>
page.</p>

<p>Meanwhile <a href="http://tapoueh.org/projects.html#sec19">dim-switch-window.el</a> got some testers too and got updated with a
nice fix, or so I think. If you're using it with a small enough emacs frame,
or some very small windows in there, you'd have noticed that the numbers get
so big they don't fit anymore, and all you see while it's waiting for your
window number choice is... blank windows. Not very helpful. Thanks to the
following piece of code, that's no longer the case as of the current
version, available on the <a href="http://www.emacswiki.org/emacs/switch-window.el">emacswiki switch-window</a> page.</p>

<p>In short, where I used to blindly apply <code>dim:switch-window-increase</code> to the
big numbers to display, the code now checks that there's enough room for it,
and adjusts the <em>increase</em> level, scaling it down if
necessary. Very simple, and effective too:</p>

<pre class="src">
    (<span style="color: #7f007f;">with-current-buffer</span> buf
      (text-scale-increase
       (<span style="color: #7f007f;">if</span> (&gt; (/ (float (window-body-height win))
                 dim:switch-window-increase)
              1)
           dim:switch-window-increase
         (window-body-height win)))
      (insert <span style="color: #bc8f8f;">"\n\n    "</span> (number-to-string num)))
</pre>

<p>Centering the text in the window's width is another story entirely, as the
<code>text-scale-increase</code> ain't linear on this axis. I'd take any good idea;
here's where I'm currently at, but it's not there yet:</p>

<pre class="src">
    (<span style="color: #7f007f;">with-current-buffer</span> buf
      (<span style="color: #7f007f;">let*</span> ((w (window-width win))
             (h (window-body-height win))
             (increased-lines (/ (float h) dim:switch-window-increase))
             (scale (<span style="color: #7f007f;">if</span> (&gt; increased-lines 1) dim:switch-window-increase h))
             (lines-before (/ increased-lines 2))
             (margin-left (/ w h) ))
        <span style="color: #b22222;">;; </span><span style="color: #b22222;">increase to maximum dim:switch-window-increase
</span>        (text-scale-increase scale)
        <span style="color: #b22222;">;; </span><span style="color: #b22222;">make it so that the hyuge number appears centered
</span>        (<span style="color: #7f007f;">dotimes</span> (i lines-before) (insert <span style="color: #bc8f8f;">"\n"</span>))
        (<span style="color: #7f007f;">dotimes</span> (i margin-left)  (insert <span style="color: #bc8f8f;">" "</span>))
        (insert (number-to-string num))))
</pre>

<p>So, if you're using one or the other (both?) of those utilities, update your
local version of them!</p>

<p>Note: I also fixed a bug in <a href="http://github.com/dimitri/rcirc-groups">rcirc-groups</a> this week-end, but I'll talk about
it in another entry, if I may.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 09 Aug 2010 15:35:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/blog/2010/08/09-el-get-and-dim-switch-window-status-update.html</guid>
</item>
<item>
  <title>el-get and dim-switch-window status update</title>
  <link>http://tapoueh.org/blog/2010/08/09-el-get-and-dim-switch-window-status-update.html</link>
  <description><![CDATA[<h1>el-get and dim-switch-window status update</h1>
<div class="date">Monday, August 09 2010, 15:35</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>Thanks to you readers of <a href="http://planet.emacsen.org/">Planet Emacsen</a> taking the time to try those pieces
of emacs lisp found in my blog, and also the time to comment on them, some
bugs have been fixed, and new releases appeared.</p>

<p><a href="http://tapoueh.org/projects.html#sec20">el-get</a> had a typo kind of bug in its support for <code>apt-get</code> and <code>fink</code>
packages, and I managed to break the <code>elpa</code> and <code>http</code> support when going <em>all
asynchronous</em> by forgetting to update the calling convention I'm using. While fixing
that, I also switched to <code>url-retrieve</code> so that the <code>http</code> support is
<em>asynchronous</em> too. That makes version <code>0.5</code>, available on the <a href="http://www.emacswiki.org/emacs/el-get.el">emacswiki el-get</a>
page.</p>

<p>Meanwhile <a href="http://tapoueh.org/projects.html#sec19">dim-switch-window.el</a> got some testers too and got updated with a
nice fix, or so I think. If you're using it with a small enough emacs frame,
or some very small windows in there, you'd have noticed that the numbers get
so big they don't fit anymore, and all you see while it's waiting for your
window number choice is... blank windows. Not very helpful. Thanks to the
following piece of code, that's no longer the case as of the current
version, available on the <a href="http://www.emacswiki.org/emacs/switch-window.el">emacswiki switch-window</a> page.</p>

<p>In short, where I used to blindly apply <code>dim:switch-window-increase</code> to the
big numbers to display, the code now checks that there's enough room for them
to fit, and scales the <em>increase</em> level down if
necessary. Very simple, and effective too:</p>

<pre class="src">
    (<span style="color: #729fcf; font-weight: bold;">with-current-buffer</span> buf
      (text-scale-increase
       (<span style="color: #729fcf; font-weight: bold;">if</span> (&gt; (/ (float (window-body-height win))
                 dim:switch-window-increase)
              1)
           dim:switch-window-increase
         (window-body-height win)))
      (insert <span style="color: #ad7fa8; font-style: italic;">"\n\n    "</span> (number-to-string num)))
</pre>

<p>Centering the text in the window's width is another story entirely, as
<code>text-scale-increase</code> ain't linear on this axis. I'll take any good idea;
here's where I'm currently at, but it's not there yet:</p>

<pre class="src">
    (<span style="color: #729fcf; font-weight: bold;">with-current-buffer</span> buf
      (<span style="color: #729fcf; font-weight: bold;">let*</span> ((w (window-width win))
             (h (window-body-height win))
             (increased-lines (/ (float h) dim:switch-window-increase))
             (scale (<span style="color: #729fcf; font-weight: bold;">if</span> (&gt; increased-lines 1) dim:switch-window-increase h))
             (lines-before (/ increased-lines 2))
             (margin-left (/ w h) ))
        <span style="color: #888a85;">;; </span><span style="color: #888a85;">increase to maximum dim:switch-window-increase
</span>        (text-scale-increase scale)
        <span style="color: #888a85;">;; </span><span style="color: #888a85;">make it so that the hyuge number appears centered
</span>        (<span style="color: #729fcf; font-weight: bold;">dotimes</span> (i lines-before) (insert <span style="color: #ad7fa8; font-style: italic;">"\n"</span>))
        (<span style="color: #729fcf; font-weight: bold;">dotimes</span> (i margin-left)  (insert <span style="color: #ad7fa8; font-style: italic;">" "</span>))
        (insert (number-to-string num))))
</pre>

<p>So, if you're using one or the other (both?) of those utilities, update your
local version of them!</p>

<p>Note: I also fixed a bug in <a href="http://github.com/dimitri/rcirc-groups">rcirc-groups</a> this weekend, but I'll talk about
it in another entry, if I may.</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/switch-window.html">switch-window</a> <a href="../../../tags/rcirc.html">rcirc</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 09 Aug 2010 15:35:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/09-el-get-and-dim-switch-window-status-update.html</guid>
</item>
<item>
  <title>Editing constants in constraints</title>
  <link>http://tapoueh.org/blog/2010/08/09-editing-constants-in-constraints.html</link>
  <description><![CDATA[<div class="date">Monday, August 09 2010, 14:45</div>
<div id="article">

<p>We're using constants in some constraints here, for example in cases where
several servers are replicating to the same <em>federating</em> one: each origin
server has its own schema, and everything is replicated nicely to the central host,
thanks to <a href="http://wiki.postgresql.org/wiki/Londiste_Tutorial#Federated_database">Londiste</a>, as you might have guessed already.</p>

<p>For bare-metal recovery scripts, I'm working on how to change those
constants in the constraints, so that <code>pg_dump -s</code> plus some schema tweaking
would kick-start a server. Here's a <code>PLpgSQL</code> snippet to do just that:</p>

<pre class="src">
  FOR rec IN EXECUTE
$s$
SELECT schemaname, tablename, conname, attnames, def
  FROM (
   SELECT n.nspname, c.relname, r.conname,
          (select array_accum(attname)
             from pg_attribute
            where attrelid = c.oid and r.conkey @&gt; array[attnum]) as attnames,
          pg_catalog.pg_get_constraintdef(r.oid, true)
   FROM pg_catalog.pg_constraint r
        JOIN pg_class c on c.oid = r.conrelid
        JOIN pg_namespace n ON n.oid = c.relnamespace
   WHERE r.contype = <span style="color: #ad7fa8; font-style: italic;">'c'</span>
ORDER BY 1, 2, 3
       ) as cons(schemaname, tablename, conname, attnames, def)
WHERE attnames @&gt; array[<span style="color: #ad7fa8; font-style: italic;">'server'</span>]::name[]
$s$
  LOOP
    rec.def := replace(rec.def, <span style="color: #ad7fa8; font-style: italic;">'server = '</span> || old_id,
                                <span style="color: #ad7fa8; font-style: italic;">'server = '</span> || new_id);

    sql := <span style="color: #ad7fa8; font-style: italic;">'ALTER TABLE '</span> || rec.schemaname || <span style="color: #ad7fa8; font-style: italic;">'.'</span> || rec.tablename
        || <span style="color: #ad7fa8; font-style: italic;">' DROP CONSTRAINT '</span> || rec.conname;
    RAISE NOTICE <span style="color: #ad7fa8; font-style: italic;">'%'</span>, sql;
    RETURN NEXT;
    EXECUTE sql;

    sql := <span style="color: #ad7fa8; font-style: italic;">'ALTER TABLE '</span> || rec.schemaname || <span style="color: #ad7fa8; font-style: italic;">'.'</span> || rec.tablename
        || <span style="color: #ad7fa8; font-style: italic;">' ADD '</span> || rec.def;
    RAISE NOTICE <span style="color: #ad7fa8; font-style: italic;">'%'</span>, sql;
    RETURN NEXT;
    EXECUTE sql;

  END LOOP;
</pre>

<p>This relies on the fact that our constraints are on the column <code>server</code>. Why
would this be any better than a <code>sed</code> one-liner, you might ask? I'm fed up
with having pseudo-parsing scripts and taking the risk that the simple
command will change data I didn't want to edit. I want context-aware tools,
pretty please, to <em>feel</em> safe.</p>

<p>Otherwise I might have gone with <code>pg_dump -s| sed -e 's:\(server =\)
17:\1 18:'</code> but this one-liner already contains too much useless magic
for my taste (the space before <code>17</code> ain't in the group match, to allow for
having <code>\1 18</code> on the right-hand side). And this isn't parametrized yet, and
for that I'll need to talk to the database, as that's where I store the servers'
names and their ids (a <code>bigserial</code>; yes, the constraints are all generated from
scripts). I don't want to write an <em>SQL parser</em> and I don't want to play
fast and loose, so the <code>PLpgSQL</code> approach is what I think is the best tool
here. Opinionated answers get to my mailbox!</p>
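
<p>Before running the loop above for real, it can help to preview the
statements it would issue. What follows is a minimal dry-run sketch in plain
<code>SQL</code>, under stated assumptions: the ids <code>17</code> and <code>18</code> are placeholders,
<code>quote_ident()</code> is added to guard against unusual identifiers, and the
constraint is re-added under its original name, which the loop above does
not do. Like the loop, it matches on a substring, so <code>server = 17</code> would also
match <code>server = 170</code>; read the output before executing any of it.</p>

<pre class="src">
-- Dry run: only prints the ALTER TABLE statements, nothing is executed.
-- 17 (old id) and 18 (new id) are placeholder values for this sketch.
SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname)
       || ' DROP CONSTRAINT ' || quote_ident(r.conname) || ';' AS drop_sql,
       'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname)
       || ' ADD CONSTRAINT ' || quote_ident(r.conname) || ' '
       || replace(pg_catalog.pg_get_constraintdef(r.oid, true),
                  'server = 17', 'server = 18') || ';' AS add_sql
  FROM pg_catalog.pg_constraint r
       JOIN pg_catalog.pg_class c ON c.oid = r.conrelid
       JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
 WHERE r.contype = 'c'
       -- keep only check constraints that involve the "server" column
   AND EXISTS (SELECT 1
                 FROM pg_catalog.pg_attribute a
                WHERE a.attrelid = c.oid
                  AND a.attnum = ANY (r.conkey)
                  AND a.attname = 'server')
 ORDER BY n.nspname, c.relname, r.conname;
</pre>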


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/plpgsql.html">plpgsql</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 09 Aug 2010 14:45:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/09-editing-constants-in-constraints.html</guid>
</item>
<item>
  <title>debian packaging PostgreSQL extensions</title>
  <link>http://tapoueh.org/blog/2010/08/06-debian-packaging-postgresql-extensions.html</link>
  <description><![CDATA[<div class="date">Friday, August 06 2010, 13:00</div>
<div id="article">

<p>In trying to help an extension <em>debian packaging</em> effort, I've once again
offered to handle it. That's because I'm now beginning to know how to do it, as
you can see in my <a href="http://qa.debian.org/developer.php?login=dim%40tapoueh.org">package overview</a> page at the <em>debian QA</em> facility. There's another
reason why I volunteered here: yet another tool of mine is now
to be found in <em>debian</em>, and should greatly help <em>extension packaging</em>
there. You can already check the <a href="http://packages.debian.org/sid/postgresql-server-dev-all">postgresql-server-dev-all</a> package page
if you're that impatient!</p>

<p>Back? Ok, so I used to have two main gripes against debian support for
<a href="http://www.postgresql.org/">PostgreSQL</a>. The first one, which is now feeling alone, is that the two projects'
<a href="http://wiki.postgresql.org/wiki/PostgreSQL_Release_Support_Policy">release support policies</a> aren't compatible enough for debian stable to include
all currently supported stable PostgreSQL major versions. It's very bad
that debian stable will only offer one major version, knowing that the
support for several of them is in there.</p>

<p>The problem is twofold: first, debian stable has to maintain any
distributed package. There's no <em>deprecation policy</em> allowing for dropping the
ball. So the other side of this coin is that debian developers must take it upon
themselves to maintain included software for as long as stable is not
renamed <code>oldstable</code>. And it so happens that there's no debian developer who
feels like maintaining <em>end of lined</em> PostgreSQL releases without help from the
<a href="http://www.postgresql.org/community/contributors/">PostgreSQL Core Team</a>. Or, say, without an official statement that they would
help.</p>

<p>Now, why I don't like this situation is because I'm pretty sure there are very
few software development groups offering as long and reliable a maintenance
policy as PostgreSQL does, but debian will still happily distribute
<em>unknown-maintenance-policy</em> pieces of code in its stable repositories. So the
<em>uncertainty</em> excuse is rather poor. And highly frustrating.</p>

<blockquote>
<p class="quoted">
<strong><em>Note:</em></strong> you have to admit that the debian stable management model copes very
well with all the debian included software. You can't release stable with
a new PostgreSQL major version unless each and every package depending on
PostgreSQL will actually work with the newer version, and the debian
scripts will care for upgrading the cluster. Where it's not working well
is when you're using debian for a PostgreSQL server for a proprietary
application, which happens quite frequently too.</p>

</blockquote>

<p>This leads to my second main gripe against debian
support for PostgreSQL: the extensions. It so happens that PostgreSQL
extensions are developed to support several major versions from the
same source code. So typically, all you need to do is recompile the
extension against the new major version, and there you go.</p>

<p>Now, say the new debian stable is coming with <a href="http://packages.debian.org/squeeze/postgresql-8.4">8.4</a> rather than <a href="http://packages.debian.org/lenny/postgresql-8.3">8.3</a> as it used
to. You should be able to just build the extensions (like <a href="http://packages.debian.org/squeeze/postgresql-8.4-prefix">prefix</a>), without
changing the source package, nor dropping <code>postgresql-8.3-prefix</code> from the
distribution on the grounds that <code>8.3</code> ain't in debian stable anymore.</p>

<p>I've been ranting a lot about this state of affairs, and I finally provided a
patch to the <a href="http://packages.debian.org/sid/postgresql-common">postgresql-common</a> debian packaging, which made it into version
<code>110</code>: welcome <a href="http://packages.debian.org/sid/postgresql-server-dev-all">pg_buildext</a>. An example of how to use it can be found in the
git branch for <a href="http://github.com/dimitri/prefix">prefix</a>; it shows up in the <a href="http://github.com/dimitri/prefix/blob/master/debian/pgversions">debian/pgversions</a> and <a href="http://github.com/dimitri/prefix/blob/master/debian/rules">debian/rules</a>
files.</p>

<p>As you can see, the <code>pg_buildext</code> tool allows you to list the PostgreSQL major
versions the extension you're packaging supports, and only those that are
both in your list and in the current debian supported major version list
will get built. <code>pg_buildext</code> will do a <code>VPATH</code> build of your extension, so it's
capable of building the same extension for multiple major versions of
PostgreSQL. Here's how it looks:</p>

<pre class="src">
        # build all supported versions
        pg_buildext build $(SRCDIR) $(TARGET) <span style="color: #ad7fa8; font-style: italic;">"$(CFLAGS)"</span>

        # then install each of them
        for v in `pg_buildext supported-versions $(SRCDIR)`; do \
                dh_install -ppostgresql-$$v-prefix ;\
        done
</pre>

<p>And the files are to be found in those places:</p>

<pre class="src">
dim ~/dev/prefix cat debian/postgresql-8.3-prefix.install
debian/prefix-8.3/prefix.so usr/lib/postgresql/8.3/lib
debian/prefix-8.3/prefix.sql usr/share/postgresql/8.3/contrib

dim ~/dev/prefix cat debian/postgresql-8.4-prefix.install
debian/prefix-8.4/prefix.so usr/lib/postgresql/8.4/lib
debian/prefix-8.4/prefix.sql usr/share/postgresql/8.4/contrib
</pre>

<p>So you still need to maintain <a href="http://github.com/dimitri/prefix/blob/master/debian/pgversions">debian/pgversions</a> and the
<code>postgresql-X.Y-extension.*</code> files, but then a change in debian support for
PostgreSQL major versions will be handled automatically (there's a facility
to trigger automatic rebuild when necessary).</p>

<p>All this ranting to explain that pretty soon, the extension packages that I
maintain will no longer have to be patched when dropping a previously
supported major version of PostgreSQL. I'm breathing a little better, so
thanks a lot <a href="http://www.piware.de/category/debian/">Martin</a>!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/extensions.html">Extensions</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/prefix.html">prefix</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 06 Aug 2010 13:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/06-debian-packaging-postgresql-extensions.html</guid>
</item>
<item>
  <title>Querying the Catalog to plan an upgrade</title>
  <link>http://tapoueh.org/blog/2010/08/05-querying-the-catalog-to-plan-an-upgrade.html</link>
  <description><![CDATA[<div class="date">Thursday, August 05 2010, 11:00</div>
<div id="article">

<p>Some user on <code>IRC</code> was reading the release notes in order to plan for a minor
upgrade of his <code>8.3.3</code> installation, and was puzzled about the potential need for
rebuilding <code>GIST</code> indexes. That's from the <a href="http://www.postgresql.org/docs/8.3/static/release-8-3-5.html">8.3.5 release notes</a>, and from the
<a href="http://www.postgresql.org/docs/8.3/static/release-8-3-8.html">8.3.8 notes</a> you see that you need to consider <em>hash</em> indexes on <em>interval</em>
columns too. Now the question is, how to find out if any such beasts are in
use in your database?</p>

<p>It happens that <a href="http://www.postgresql.org/">PostgreSQL</a> lets you find out those things by querying its
<a href="http://www.postgresql.org/docs/8.4/static/catalogs.html">system catalogs</a>. That might look hairy at first, but it's well worth getting
used to those system tables. You could compare that to the introspection and
reflection facilities of some programming languages, except much more useful,
because you're reaching the whole system at once. But, well, here it goes:</p>

<pre class="src">
SELECT schemaname, tablename, relname, amname, indexdef
  FROM pg_indexes i
       JOIN pg_class c ON i.indexname = c.relname and c.relkind = <span style="color: #ad7fa8; font-style: italic;">'i'</span>
       JOIN pg_am am ON c.relam = am.oid
 WHERE amname = <span style="color: #ad7fa8; font-style: italic;">'gist'</span>;
</pre>

<p>Now you could replace the <code>WHERE</code> clause with <code>WHERE amname IN ('gist', 'hash')</code>
to check both conditions at once. But what about narrowing down the
<em>hash</em> index rebuilds to schedule, as only indexes on
<code>interval</code> columns need one? Well let's try it:</p>

<pre class="src">
SELECT schemaname, tablename, relname as indexname, amname, indclass
  FROM pg_indexes i
       JOIN pg_class c on i.indexname = c.relname and c.relkind = <span style="color: #ad7fa8; font-style: italic;">'i'</span>
       JOIN pg_am am on c.relam = am.oid
       JOIN pg_index x on x.indexrelid = c.oid
 WHERE amname in (<span style="color: #ad7fa8; font-style: italic;">'btree'</span>, <span style="color: #ad7fa8; font-style: italic;">'gist'</span>)
       and schemaname not in (<span style="color: #ad7fa8; font-style: italic;">'pg_catalog'</span>, <span style="color: #ad7fa8; font-style: italic;">'information_schema'</span>);
</pre>

<p>We're not there yet, because as you notice, the catalogs are somewhat
optimized and not always in a normal form. That's good for the system's
performance, but it makes querying a bit awkward. What we want is to find out from
the <code>indclass</code> column (it's an <code>oidvector</code>) whether any of its entries applies
to an <code>interval</code> data type. There's a subtlety here as the index could store
<code>interval</code> data even if the column is not of an <code>interval</code> type itself, so we
have to find both cases.</p>

<p>Well the <em>subtlety</em> applies after you know what an <a href="http://www.postgresql.org/docs/8.4/static/xindex.html">operator class</a> is: <em>“An
operator class defines how a particular data type can be used with an
index”</em> is what the <a href="http://www.postgresql.org/docs/8.4/static/sql-createopclass.html">CREATE OPERATOR CLASS</a> manual page teaches us. What we
need to know here is that an index will talk to an operator class to get to
the data type, either the <em>column</em> data type or the index <em>storage</em> one.</p>
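
<p>To make that concrete, here is a small sketch that lists the operator
classes of the <em>hash</em> access method with both types side by side. It assumes
the <code>pg_opclass</code> layout of <code>8.3</code> and later (<code>opcmethod</code>, <code>opcintype</code>,
<code>opckeytype</code>); the column aliases are only illustrative. In that catalog,
<code>opckeytype</code> is zero when the index stores the same type as the column, so
the sketch falls back to <code>opcintype</code> in that case.</p>

<pre class="src">
-- One row per hash operator class, with the column type it accepts
-- and the type actually stored in the index.
SELECT am.amname,
       o.opcname,
       o.opcintype::regtype AS column_type,
       (CASE WHEN o.opckeytype = 0 THEN o.opcintype
             ELSE o.opckeytype
        END)::regtype AS storage_type
  FROM pg_opclass o
       JOIN pg_am am ON o.opcmethod = am.oid
 WHERE am.amname = 'hash'
 ORDER BY o.opcname;
</pre>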

<pre class="src">
SELECT schemaname, tablename, relname as indexname, amname, indclass, opcname, typname
  FROM pg_indexes i
       JOIN pg_class c on i.indexname = c.relname and c.relkind = <span style="color: #ad7fa8; font-style: italic;">'i'</span>
       JOIN pg_am am on c.relam = am.oid
       JOIN pg_index x on x.indexrelid = c.oid
       JOIN pg_opclass o
         on string_to_array(x.indclass::text, <span style="color: #ad7fa8; font-style: italic;">' '</span>)::oid[] @&gt; array[o.oid]::oid[]
       JOIN pg_type t on o.opckeytype = t.oid
WHERE amname = <span style="color: #ad7fa8; font-style: italic;">'hash'</span> and t.typname = <span style="color: #ad7fa8; font-style: italic;">'interval'</span>

UNION ALL

SELECT schemaname, tablename, relname as indexname, amname, indclass, opcname, typname
  FROM pg_indexes i
       JOIN pg_class c on i.indexname = c.relname and c.relkind = <span style="color: #ad7fa8; font-style: italic;">'i'</span>
       JOIN pg_am am on c.relam = am.oid
       JOIN pg_index x on x.indexrelid = c.oid
       JOIN pg_opclass o
         on string_to_array(x.indclass::text, <span style="color: #ad7fa8; font-style: italic;">' '</span>)::oid[] @&gt; array[o.oid]::oid[]
       JOIN pg_type t on o.opcintype = t.oid
WHERE amname = <span style="color: #ad7fa8; font-style: italic;">'hash'</span> and t.typname = <span style="color: #ad7fa8; font-style: italic;">'interval'</span>;
</pre>

<p>Most certainly this query will return no rows for you, as <em>hash</em> indexes are
not widely used, mainly because they are not crash tolerant. To see some
results you could remove the <code>amname</code> restriction of course, which would show
that the query is working, but don't forget to add the restriction back to plan
for the upgrade!</p>

<p>But hey, why walk the extra mile here, you might ask? After all, in
the second query we would already have had the information we needed had
we added the <code>indexdef</code> column, albeit in a human-reader-friendly way: the
<em>resultset</em> would then contain the <code>CREATE INDEX</code> command you need to issue to
build the index from scratch. That would be enough for checking only the
catalog, but the extra mile allows you to produce a <code>SQL</code> script to build the
indexes that need your attention post upgrade. That last step is left as an
exercise for the reader, though.</p>
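
<p>One possible take on that exercise, as a sketch rather than a definitive
answer: the query below reuses the <code>indclass</code> join condition from the query
above, but goes through <code>pg_class</code> and <code>pg_namespace</code> directly, so that two
indexes with the same name in different schemas are not mixed up, and emits
one <code>REINDEX</code> command per matching index. Choosing <code>REINDEX</code> over replaying
the <code>CREATE INDEX</code> definition is an assumption made for this example, and
the <code>reindex_command</code> alias is made up.</p>

<pre class="src">
-- Emit one REINDEX statement per hash index whose operator class
-- takes or stores interval data.
SELECT DISTINCT
       'REINDEX INDEX '
       || quote_ident(n.nspname) || '.' || quote_ident(c.relname)
       || ';' AS reindex_command
  FROM pg_class c
       JOIN pg_namespace n ON n.oid = c.relnamespace
       JOIN pg_am am ON c.relam = am.oid
       JOIN pg_index x ON x.indexrelid = c.oid
       JOIN pg_opclass o
         ON string_to_array(x.indclass::text, ' ')::oid[] @&gt; array[o.oid]::oid[]
       -- match either the column data type or the index storage type
       JOIN pg_type t ON t.oid IN (o.opcintype, o.opckeytype)
 WHERE c.relkind = 'i'
   AND am.amname = 'hash'
   AND t.typname = 'interval'
 ORDER BY 1;
</pre>

<p>Running it through <code>psql -At</code> prints the bare commands, which can then be
reviewed and fed back to <code>psql</code> after the upgrade.</p>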


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/release.html">release</a> <a href="../../../tags/catalogs.html">catalogs</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 05 Aug 2010 11:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/05-querying-the-catalog-to-plan-an-upgrade.html</guid>
</item>
<item>
  <title>el-get</title>
  <link>http://tapoueh.org/blog/2010/08/blog/2010/08/04-el-get.html</link>
  <description><![CDATA[<p>I've been using emacs for a long time, and a long time it took me to
consider learning <a href="http://www.gnu.org/software/emacs/emacs-lisp-intro/html_node/index.html">Emacs Lisp</a>. Before that, I didn't trust my level of
understanding enough to be comfortable in managing my setup efficiently.</p>

<p>One of the main problems of setting up <a href="http://www.gnu.org/software/emacs/">Emacs</a> is that not only do you tend to
accumulate so many tricks from <a href="http://www.emacswiki.org/">EmacsWiki</a> and <a href="http://planet.emacsen.org/">blog posts</a> that your <code>.emacs</code> has
to grow to a full <code>~/.emacs.d/</code> directory (starting at <code>~/.emacs.d/init.el</code>),
but you also end up depending on several <em>libraries</em> of code you're
neither authoring nor maintaining. Let's call them <em>packages</em>.</p>

<p>Some of them will typically be available on <a href="http://tromey.com/elpa/index.html">ELPA</a>, which allows you to
breathe and keep cool. But most of them, let's face it, are not there. Most
of the packages I use I tend to get either from <a href="http://www.debian.org/">debian</a> (see
<a href="http://packages.debian.org/sid/apt-rdepends">apt-rdepends</a> for the complete list of packages that depend on emacs;
unfortunately I'm not finding an online version of the tool to link to), or
from <code>ELPA</code>, or from their own <code>git</code> repository somewhere. Some of them I even
get directly from an <a href="http://www.splode.com/~friedman/software/emacs-lisp">obscure website</a> not maintained anymore, but always
there when you need them.</p>

<p>Of course, my emacs setup is managed in a private <code>git</code> repository. Some
people on <code>#emacs</code> are using <a href="http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html">git submodules</a> (or was it straight <em>import</em>) for
managing external repositories in there, but all I can say is that I frown
on this idea. I want an easy canonical list of packages I depend on to run
emacs, and I want this documentation to be usable as-is. Enter <a href="http://www.emacswiki.org/emacs/el-get.el">el-get</a>!</p>

<p>As we're all damn lazy, here's a <em>visual</em> introduction to <code>el-get</code>:</p>

<pre class="src">
(setq el-get-sources
      '((<span style="color: #da70d6;">:name</span> bbdb
               <span style="color: #da70d6;">:type</span> git
               <span style="color: #da70d6;">:url</span> <span style="color: #bc8f8f;">"git://github.com/barak/BBDB.git"</span>
               <span style="color: #da70d6;">:load-path</span> (<span style="color: #bc8f8f;">"./lisp"</span> <span style="color: #bc8f8f;">"./bits"</span>)
               <span style="color: #da70d6;">:info</span> <span style="color: #bc8f8f;">"texinfo"</span>
               <span style="color: #da70d6;">:build</span> (<span style="color: #bc8f8f;">"./configure"</span> <span style="color: #bc8f8f;">"make"</span>))

        (<span style="color: #da70d6;">:name</span> magit
               <span style="color: #da70d6;">:type</span> git
               <span style="color: #da70d6;">:url</span> <span style="color: #bc8f8f;">"http://github.com/philjackson/magit.git"</span>
               <span style="color: #da70d6;">:info</span> <span style="color: #bc8f8f;">"."</span>
               <span style="color: #da70d6;">:build</span> (<span style="color: #bc8f8f;">"./autogen.sh"</span> <span style="color: #bc8f8f;">"./configure"</span> <span style="color: #bc8f8f;">"make"</span>))

        (<span style="color: #da70d6;">:name</span> vkill
               <span style="color: #da70d6;">:type</span> http
               <span style="color: #da70d6;">:url</span> <span style="color: #bc8f8f;">"http://www.splode.com/~friedman/software/emacs-lisp/src/vkill.el"</span>
               <span style="color: #da70d6;">:features</span> vkill)

        (<span style="color: #da70d6;">:name</span> yasnippet
               <span style="color: #da70d6;">:type</span> git-svn
               <span style="color: #da70d6;">:url</span> <span style="color: #bc8f8f;">"http://yasnippet.googlecode.com/svn/trunk/"</span>)

        (<span style="color: #da70d6;">:name</span> asciidoc         <span style="color: #da70d6;">:type</span> elpa)
        (<span style="color: #da70d6;">:name</span> dictionary-el    <span style="color: #da70d6;">:type</span> apt-get)
        (<span style="color: #da70d6;">:name</span> emacs-goodies-el <span style="color: #da70d6;">:type</span> apt-get)))

(el-get)
</pre>

<p>So now you have pretty good documentation of the packages you want
installed, where to get them, and how to install them. For the <em>advanced</em>
methods (such as <code>elpa</code> or <code>apt-get</code>), you basically just need the package
name. When relying on a bare <code>git</code> repository, you need to give some more
information, such as the <code>URL</code> to <em>clone</em> and the <code>build</code> steps if any. Then also
what <em>features</em> to <code>require</code> and maybe where to find the <em>texinfo</em> documentation
of the package, for automatic inclusion into your local <em>Info</em> menu.</p>

<p>The good news is that not only do you now have a solid readable description of
all that in a central place, but this very description is all <code>(el-get)</code> needs
to do its magic. This command will check that each and every package is
installed on your system (in <code>el-get-dir</code>) and if that's not the case, it will
actually install it. Then, it will <code>init</code> the packages: that means caring
about the <code>load-path</code>, the <code>Info-directory-list</code> (and <em>dir</em> texinfo menu
building), the <em>loading</em> of the <code>emacs-lisp</code> files, and finally it will <code>require</code>
the <em>features</em>.</p>

<p>Here's a prettified <code>ielm</code> session that will serve as a demo:</p>

<pre class="src">
ELISP&gt; (el-get)
(<span style="color: #bc8f8f;">"aspell-en"</span> <span style="color: #bc8f8f;">"aspell-fr"</span> <span style="color: #bc8f8f;">"muse"</span> <span style="color: #bc8f8f;">"dictionary"</span> <span style="color: #bc8f8f;">"htmlize"</span> <span style="color: #bc8f8f;">"bbdb"</span> <span style="color: #bc8f8f;">"google-maps"</span>
<span style="color: #bc8f8f;">"magit"</span> <span style="color: #bc8f8f;">"emms"</span> <span style="color: #bc8f8f;">"nxhtml"</span> <span style="color: #bc8f8f;">"vkill"</span> <span style="color: #bc8f8f;">"xcscope"</span> <span style="color: #bc8f8f;">"yasnippet"</span> <span style="color: #bc8f8f;">"asciidoc"</span>
<span style="color: #bc8f8f;">"auto-dictionary"</span> <span style="color: #bc8f8f;">"css-mode"</span> <span style="color: #bc8f8f;">"gist"</span> <span style="color: #bc8f8f;">"lua-mode"</span> <span style="color: #bc8f8f;">"lisppaste"</span>)
</pre>

<p>All the packages being already installed, it's running fast enough that I
won't bother measuring the run time, which seems to be somewhere around one
second.</p>
]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 04 Aug 2010 22:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/blog/2010/08/04-el-get.html</guid>
</item>
<item>
  <title>el-get</title>
  <link>http://tapoueh.org/blog/2010/08/04-el-get.html</link>
  <description><![CDATA[<div class="date">Wednesday, August 04 2010, 22:30</div>
<div id="article">
<p>I've been using emacs for a long time, and a long time it took me to
consider learning <a href="http://www.gnu.org/software/emacs/emacs-lisp-intro/html_node/index.html">Emacs Lisp</a>. Before that, I didn't trust my level of
understanding enough to be comfortable in managing my setup efficiently.</p>

<p>One of the main problems of setting up <a href="http://www.gnu.org/software/emacs/">Emacs</a> is that not only do you tend to
accumulate so many tricks from <a href="http://www.emacswiki.org/">EmacsWiki</a> and <a href="http://planet.emacsen.org/">blog posts</a> that your <code>.emacs</code> has
to grow to a full <code>~/.emacs.d/</code> directory (starting at <code>~/.emacs.d/init.el</code>),
but you also end up depending on several <em>libraries</em> of code you're
neither authoring nor maintaining. Let's call them <em>packages</em>.</p>

<p>Some of them will typically be available on <a href="http://tromey.com/elpa/index.html">ELPA</a>, which allows you to
breathe and keep cool. But most of them, let's face it, are not there. Most
of the packages I use I tend to get either from <a href="http://www.debian.org/">debian</a> (see
<a href="http://packages.debian.org/sid/apt-rdepends">apt-rdepends</a> for the complete list of packages that depend on emacs;
unfortunately I'm not finding an online version of the tool to link to), or
from <code>ELPA</code>, or from their own <code>git</code> repository somewhere. Some of them I even
get directly from an <a href="http://www.splode.com/~friedman/software/emacs-lisp">obscure website</a> not maintained anymore, but always
there when you need them.</p>

<p>Of course, my emacs setup is managed in a private <code>git</code> repository. Some
people on <code>#emacs</code> are using <a href="http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html">git submodules</a> (or was it straight <em>import</em>) for
managing external repositories in there, but all I can say is that I frown
on this idea. I want an easy canonical list of packages I depend on to run
emacs, and I want this documentation to be usable as-is. Enter <a href="http://www.emacswiki.org/emacs/el-get.el">el-get</a>!</p>

<p>As we're all damn lazy, here's a <em>visual</em> introduction to <code>el-get</code>:</p>

<pre class="src">
(setq el-get-sources
      '((<span style="color: #729fcf;">:name</span> bbdb
               <span style="color: #729fcf;">:type</span> git
               <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"git://github.com/barak/BBDB.git"</span>
               <span style="color: #729fcf;">:load-path</span> (<span style="color: #ad7fa8; font-style: italic;">"./lisp"</span> <span style="color: #ad7fa8; font-style: italic;">"./bits"</span>)
               <span style="color: #729fcf;">:info</span> <span style="color: #ad7fa8; font-style: italic;">"texinfo"</span>
               <span style="color: #729fcf;">:build</span> (<span style="color: #ad7fa8; font-style: italic;">"./configure"</span> <span style="color: #ad7fa8; font-style: italic;">"make"</span>))

        (<span style="color: #729fcf;">:name</span> magit
               <span style="color: #729fcf;">:type</span> git
               <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"http://github.com/philjackson/magit.git"</span>
               <span style="color: #729fcf;">:info</span> <span style="color: #ad7fa8; font-style: italic;">"."</span>
               <span style="color: #729fcf;">:build</span> (<span style="color: #ad7fa8; font-style: italic;">"./autogen.sh"</span> <span style="color: #ad7fa8; font-style: italic;">"./configure"</span> <span style="color: #ad7fa8; font-style: italic;">"make"</span>))

        (<span style="color: #729fcf;">:name</span> vkill
               <span style="color: #729fcf;">:type</span> http
               <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"http://www.splode.com/~friedman/software/emacs-lisp/src/vkill.el"</span>
               <span style="color: #729fcf;">:features</span> vkill)

        (<span style="color: #729fcf;">:name</span> yasnippet
               <span style="color: #729fcf;">:type</span> git-svn
               <span style="color: #729fcf;">:url</span> <span style="color: #ad7fa8; font-style: italic;">"http://yasnippet.googlecode.com/svn/trunk/"</span>)

        (<span style="color: #729fcf;">:name</span> asciidoc         <span style="color: #729fcf;">:type</span> elpa)
        (<span style="color: #729fcf;">:name</span> dictionary-el    <span style="color: #729fcf;">:type</span> apt-get)
        (<span style="color: #729fcf;">:name</span> emacs-goodies-el <span style="color: #729fcf;">:type</span> apt-get)))

(el-get)
</pre>

<p>So now you have pretty good documentation of the packages you want
installed, where to get them, and how to install them. For the <em>advanced</em>
methods (such as <code>elpa</code> or <code>apt-get</code>), you basically just need the package
name. When relying on a bare <code>git</code> repository, you need to give some more
information, such as the <code>URL</code> to <em>clone</em> and the <code>build</code> steps if any. Then also
what <em>features</em> to <code>require</code> and maybe where to find the <em>texinfo</em> documentation
of the package, for automatic inclusion into your local <em>Info</em> menu.</p>

<p>The good news is that not only do you now have a solid, readable description of
all that in a central place, but this very description is all <code>(el-get)</code> needs
to do its magic. This command will check that each and every package is
installed on your system (in <code>el-get-dir</code>) and if that's not the case, it will
actually install it. Then, it will <code>init</code> the packages: that means taking care
of the <code>load-path</code>, the <code>Info-directory-list</code> (and <em>dir</em> texinfo menu
building), the <em>loading</em> of the <code>emacs-lisp</code> files, and finally it will <code>require</code>
the <em>features</em>.</p>

<p>Here's a prettified <code>ielm</code> session that will serve as a demo:</p>

<pre class="src">
ELISP&gt; (el-get)
(<span style="color: #ad7fa8; font-style: italic;">"aspell-en"</span> <span style="color: #ad7fa8; font-style: italic;">"aspell-fr"</span> <span style="color: #ad7fa8; font-style: italic;">"muse"</span> <span style="color: #ad7fa8; font-style: italic;">"dictionary"</span> <span style="color: #ad7fa8; font-style: italic;">"htmlize"</span> <span style="color: #ad7fa8; font-style: italic;">"bbdb"</span> <span style="color: #ad7fa8; font-style: italic;">"google-maps"</span>
<span style="color: #ad7fa8; font-style: italic;">"magit"</span> <span style="color: #ad7fa8; font-style: italic;">"emms"</span> <span style="color: #ad7fa8; font-style: italic;">"nxhtml"</span> <span style="color: #ad7fa8; font-style: italic;">"vkill"</span> <span style="color: #ad7fa8; font-style: italic;">"xcscope"</span> <span style="color: #ad7fa8; font-style: italic;">"yasnippet"</span> <span style="color: #ad7fa8; font-style: italic;">"asciidoc"</span>
<span style="color: #ad7fa8; font-style: italic;">"auto-dictionary"</span> <span style="color: #ad7fa8; font-style: italic;">"css-mode"</span> <span style="color: #ad7fa8; font-style: italic;">"gist"</span> <span style="color: #ad7fa8; font-style: italic;">"lua-mode"</span> <span style="color: #ad7fa8; font-style: italic;">"lisppaste"</span>)
</pre>

<p>With all the packages already installed, it runs fast enough that I
won't bother measuring the run time, which seems to be somewhere around one
second.</p>



<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/muse.html">Muse</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/el-get.html">el-get</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Wed, 04 Aug 2010 22:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/04-el-get.html</guid>
</item>
<item>
  <title>Database Virtual Machines</title>
  <link>http://tapoueh.org/blog/2010/08/03-database-virtual-machines.html</link>
  <description><![CDATA[<div class="date">Tuesday, August 03 2010, 13:30</div>
<div id="article">
<p>Today I'm being told once again about <a href="http://www.sqlite.org/">SQLite</a> as an embedded database
software. That one ain't a <em>database server</em> but a <em>software library</em> that you
can use straight from your main program. I've yet to use it, but it looks
like <a href="http://www.sqlite.org/lang.html">its SQL support</a> is good enough for simple things, and that covers
<em>loads</em> of things. I guess read-only cache and configuration storage would be
the obvious ones, because it seems that <a href="http://www.sqlite.org/whentouse.html">SQLite use cases</a> don't include
<a href="http://www.sqlite.org/lockingv3.html">mixed concurrency</a>, that is workloads with concurrent readers and writers.</p>

<p>The part that got my full attention is
<a href="http://www.sqlite.org/vdbe.html">The Virtual Database Engine of SQLite</a>, as this blog title would imply. It
seems to be the same idea as what <a href="http://monetdb.cwi.nl/">MonetDB</a> calls their
<a href="http://monetdb.cwi.nl/MonetDB/Documentation/MAL-Synopsis.html">MonetDB Assembly Language</a>, and I've been trying to summarize some ideas about
it in my <a href="http://tapoueh.org/char10.html#sec11">Next Generation PostgreSQL</a> article.</p>

<p>The main thing is how to further optimize <a href="http://www.postgresql.org/">PostgreSQL</a> given what we have. It
seems that among the major road blocks in the performance work is how we get
the data from disk and to the client. We're still spending so much time in
the <code>CPU</code> that the disk bandwidth is not always saturated, and that's a
problem. Further thoughts are in the <a href="http://tapoueh.org/char10.html#sec11">full length article</a>, but that's just about
a one page section now!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 03 Aug 2010 13:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/08/03-database-virtual-machines.html</guid>
</item>


<item>
  <title>Partitioning: relation size per “group”</title>
  <link>http://tapoueh.org/blog/2010/07/26-partitioning-relation-size-per-group.html</link>
  <description><![CDATA[<div class="date">Monday, July 26 2010, 17:00</div>
<div id="article">
<p>This time, we are trying to figure out where the bulk of the data is on
disk. The trick is that we're using <a href="http://www.postgresql.org/docs/current/static/ddl-partitioning.html">DDL partitioning</a>, but we want a “nice”
view of size per <em>partition set</em>. Meaning that if you have for example a
parent table <code>foo</code> with partitions <code>foo_201006</code> and <code>foo_201007</code>, you would want
to see a single category <code>foo</code> containing the accumulated size of all the
partitions underneath <code>foo</code>.</p>

<p>Here we go:</p>

<pre class="src">
select groupe, pg_size_pretty(sum(bytes)::bigint) as size, sum(bytes)
  from (
  select relkind as k, nspname, relname, tablename, bytes,
         case when relkind = <span style="color: #ad7fa8; font-style: italic;">'r'</span> and relname ~ <span style="color: #ad7fa8; font-style: italic;">'[0-9]{6}$'</span>
              then substring(relname from 1 for length(relname)-7)

              when relkind = <span style="color: #ad7fa8; font-style: italic;">'i'</span> and  tablename ~ <span style="color: #ad7fa8; font-style: italic;">'[0-9]{6}$'</span>
              then substring(tablename from 1 for length(tablename)-7)

              else <span style="color: #ad7fa8; font-style: italic;">'core'</span>
          end as groupe
  from (
  select nspname, relname,
         case when relkind = <span style="color: #ad7fa8; font-style: italic;">'i'</span>
              then (select relname
                      from pg_index x
                           join pg_class xc on x.indrelid = xc.oid
                           join pg_namespace xn on xc.relnamespace = xn.oid
                     where x.indexrelid = c.oid
                    )
              else null
           end as tablename,
         pg_size_pretty(pg_relation_size(c.oid)) as relation,
         pg_total_relation_size(c.oid) as bytes,
         relkind
    from pg_class c join pg_namespace n on c.relnamespace = n.oid
   where c.relkind in (<span style="color: #ad7fa8; font-style: italic;">'r'</span>, <span style="color: #ad7fa8; font-style: italic;">'i'</span>)
         and nspname in (<span style="color: #ad7fa8; font-style: italic;">'public'</span>, <span style="color: #ad7fa8; font-style: italic;">'archive'</span>)
         and pg_total_relation_size(c.oid) &gt; 32 * 1024
order by 5 desc
       ) as s
       ) as t
group by 1
order by 3 desc;
</pre>

<p>Note that by running only the inner query (the subquery aliased as <code>t</code>), without
the outer <code>group by</code>, you will get a detailed view of the <em>indexes</em> and <em>tables</em>
that are taking the most volume on disk at your place.</p>
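
<p>To see how the grouping rule behaves on its own, here's a small sketch that
applies the same kind of <code>case</code> expression to literal names rather than to the
catalogs. The three names in the <code>values</code> list are made up for the example;
with them, the two <code>foo</code> partitions should land in group <code>foo</code> and
<code>bar</code> in <code>core</code>:</p>

<pre class="src">
-- hypothetical relation names, only there to exercise the grouping rule
select relname,
       case when relname ~ '[0-9]{6}$'
            then substring(relname from 1 for length(relname) - 7)
            else 'core'
        end as groupe
  from (values ('foo_201006'), ('foo_201007'), ('bar')) as t(relname);
</pre>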

<p>Now, what about using <a href="http://www.postgresql.org/docs/8.4/static/functions-window.html">window functions</a> here so that we get a more
detailed view of historic changes on each partition? With some evolution
figure in percentage from the previous partition of the same year,
accumulated size per partition and per year, yearly sum, you name it. Here's
another one you might want to try, ready for some tuning (schema name, table
name, etc.):</p>

<pre class="src">
WITH s AS (
  select relname,
         pg_relation_size(c.oid) as size,
         pg_total_relation_size(c.oid) as tsize,
         substring(substring(relname from <span style="color: #ad7fa8; font-style: italic;">'[0-9]{6}$'</span>) for 4)::bigint as year
    from pg_class c
         join pg_namespace n on n.oid = c.relnamespace
   where c.relkind = <span style="color: #ad7fa8; font-style: italic;">'r'</span>
     <span style="color: #888a85;">-- and n.nspname = 'public'
</span>     <span style="color: #888a85;">-- and c.relname ~ 'stats'
</span>     and substring(substring(relname from <span style="color: #ad7fa8; font-style: italic;">'[0-9]{6}$'</span>) for 4)::bigint &gt;= 2008
order by relname
),
     sy AS (
  select relname,
         size,
         tsize,
         year,
         (sum(size) over w_year)::bigint as ysize,
         (sum(size) over w_month)::bigint as cumul,
         (lag(size) over (order by relname))::bigint as previous
    from s
  window w_year  as (partition by year),
         w_month as (partition by year order by relname)
),
     syp AS (
  select relname,
         size,
         tsize,
         rank() over (partition by year order by size desc) as rank,
         case when ysize = 0 then ysize
              else round(size / ysize::numeric * 100, 2) end as yp,
         case when previous = 0 then previous
              else round((size / previous::numeric - 1.0) * 100, 2) end as evol,
         cumul,
         year,
         ysize
    from sy
)
  SELECT relname,
         pg_size_pretty(size) as size,
         pg_size_pretty(tsize) as "+indexes",
         evol, yp as "% annuel", rank,
         pg_size_pretty(cumul) as cumul, year,
         pg_size_pretty(ysize) as "yearly sum",
         pg_size_pretty((sum(size) over())::bigint) as total
    FROM syp
ORDER BY relname;
</pre>
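
<p>The window function part can be tried in isolation too. Here's a minimal sketch
on two made-up partitions (the names and sizes are assumptions, not real figures),
computing the running total per year and the evolution against the previous
partition with the same kind of expressions as the query above:</p>

<pre class="src">
-- hypothetical partitions and sizes, standing in for the catalog lookups
WITH s(relname, year, size) AS (
  values ('foo_201006', 2010, 100::bigint),
         ('foo_201007', 2010, 150::bigint)
)
  SELECT relname,
         size,
         (sum(size) over (partition by year order by relname))::bigint as cumul,
         round((size / (lag(size) over (order by relname))::numeric - 1.0) * 100, 2) as evol
    FROM s
ORDER BY relname;
</pre>

<p>The first row has no previous partition, so its <code>evol</code> is null; the second
one should show a <code>cumul</code> of 250 and an <code>evol</code> of 50.00.</p>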

<p>Hope you'll find it useful, I certainly do!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 26 Jul 2010 17:00:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/07/26-partitioning-relation-size-per-group.html</guid>
</item>
<item>
  <title>dim-switch-window.el: fixes</title>
  <link>http://tapoueh.org/blog/2010/07/26-dim-switch-windowel-fixes.html</link>
  <description><![CDATA[<div class="date">Monday, July 26 2010, 11:55</div>
<div id="article">
<p>Thanks to amazing readers of <a href="http://planet.emacsen.org/">planet emacsen</a>, two annoyances of
<a href="http://www.emacswiki.org/emacs/switch-window.el">switch-window.el</a> have already been fixed! The first is that handling of <code>C-g</code>
isn't exactly an option after all, and the other is that you want to avoid
the buffer creation in the simple cases (1 or 2 windows only), because it's
the usual case.</p>

<p>I've received code to handle the second case, which I mostly merged. Thanks a
lot guys, the new version is on <a href="http://www.emacswiki.org">emacswiki</a> already!</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/switch-window.html">switch-window</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 26 Jul 2010 11:55:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/07/26-dim-switch-windowel-fixes.html</guid>
</item>
<item>
  <title>dim-switch-window.el</title>
  <link>http://tapoueh.org/blog/2010/07/25-dim-switch-windowel.html</link>
  <description><![CDATA[<div class="date">Sunday, July 25 2010, 13:25</div>
<div id="article">
<p>So it's Sunday and I'm thinking I'll get into <code>el-get</code> sometime later. Now is
the time to present <code>dim-switch-window.el</code> which implements a <em>visual</em> <code>C-x o</code>. I
know of only one way to present a <em>visual effect</em>, and that's with a screenshot:</p>

<center>
<p><img src="../../../images//emacs-switch-window.png" alt=""></p>
</center>

<p>So as you can see, it's all about showing a <em>big</em> number in each window,
tweaking each window's name, and waiting until the user presses one of the
expected keys, or timing out and staying on the same window as before <code>C-x o</code>. When
there are only 1 or 2 windows displayed, though, the right thing happens and
you see no huge number (in the former case, nothing happens, in the latter,
focus moves to the other window).</p>

<p>The code for that can be found on <a href="http://www.emacswiki.org/">emacswiki</a> under the name
<a href="http://www.emacswiki.org/emacs/switch-window.el">switch-window.el</a>. Hope you'll find it useful!</p>



<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/switch-window.html">switch-window</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Sun, 25 Jul 2010 13:25:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/07/25-dim-switch-windowel.html</guid>
</item>
<item>
  <title>ClusterSSH gets dsh support</title>
  <link>http://tapoueh.org/blog/2010/07/23-clusterssh-gets-dsh-support.html</link>
  <description><![CDATA[<div class="date">Friday, July 23 2010, 22:20</div>
<div id="article">
<p>If you don't know about <a href="cssh.html">ClusterSSH</a>, it's a project that builds on <code>M-x term</code>
and <code>ssh</code> to offer a nice and simple way to open remote terminals. It's
available in <a href="http://tromey.com/elpa/index.html">ELPA</a> and developed in the <a href="http://github.com/dimitri/cssh">github cssh</a> repository.</p>

<p>The default binding is <code>C-=</code> and asks for the name of the server
to connect to, in the <em>minibuffer</em>, with completion. The host list used for
the completion comes from <code>tramp</code> and is pretty complete, all the more if
you've set up <code>~/.ssh/config</code> with <code>HashKnownHosts no</code>.</p>

<p>So the usual way to use <code>cssh.el</code> would be to just open a single remote
connection at a time. But of course you can open as many as you like, and
you get them all in a mosaic of <code>term</code> buffers in your emacs frame, with an input
window at the bottom to control them all. There were two ways to get there:
either open all remote hosts whose name matches a given regexp, using
<code>C-M-=</code>, or go to <code>IBuffer</code>, mark there
the existing remote <code>terms</code> you want to control all at once, then use
<code>C-=</code>.</p>

<p>Well I've just added another mode of operation by supporting <em>enhanced</em> <a href="http://www.netfort.gr.jp/~dancer/software/dsh.html.en">dsh</a>
group files. In such files, you're supposed to have a remote host name per
line and that's it. We've added support for <code>@group</code> lines
so that you can <em>include</em> another group easily. To use the facility,
either open your <code>~/.dsh/group</code> directory in <code>dired</code> and type <code>C-=</code>
when on the right line, or simply use the global <code>C-=</code> you
already know and love. Then, type <code>@</code> and complete to any existing group found
in your <code>cssh-dsh-path</code> (it defaults to the right places, so chances are you
will never have to edit this one). And that's it, <a href="http://www.gnu.org/software/emacs/">Emacs</a> will open one <code>term</code>
per remote host you have in the <code>dsh</code> group you just picked. With a <code>*cssh*</code>
controller window, too.</p>

<p>Coming next, how I solved my <code>init.el</code> dependencies burden thanks to <code>el-get</code>!</p>


<h2>Tags</h2>

<p><a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/el-get.html">el-get</a> <a href="../../../tags/cssh.html">cssh</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Fri, 23 Jul 2010 22:20:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/07/23-clusterssh-gets-dsh-support.html</guid>
</item>
<item>
  <title>Emacs and PostgreSQL</title>
  <link>http://tapoueh.org/blog/2010/07/22-emacs-and-postgresql.html</link>
  <description><![CDATA[<div class="date">Thursday, July 22 2010, 09:30</div>
<div id="article">
<p>Those are my two all-time favorite pieces of Open Source Software. Or <a href="http://www.gnu.org/philosophy/free-sw.html">Free Software</a>
in the <a href="http://www.gnu.org/">GNU</a> sense of the word, as both the <em>BSD</em> and the <em>GPL</em> are labeled free
there. Even if I prefer <a href="http://www.debian.org/social_contract">The Debian Free Software Guidelines</a> as a global
definition and the <a href="http://sam.zoy.org/wtfpl/">WTFPL</a> license. But that's a digression.</p>

<p>I think that <a href="http://www.gnu.org/software/emacs/">Emacs</a> and <a href="http://www.postgresql.org/">PostgreSQL</a> share a lot in common. I'd begin with
the documentation, whose quality is amazing for both projects. Then of
course the extensibility with <a href="http://www.gnu.org/software/emacs/emacs-lisp-intro/html_node/Preface.html#Preface">Emacs Lisp</a> on the one hand and
<a href="http://www.postgresql.org/docs/8.4/static/extend.html">catalog-driven operations</a> on the other hand. Whether you're extending Emacs
or PostgreSQL you'll find that it's pretty easy to tweak the system <em>while
it's running</em>. The other comparison points are less important, like the fact
that both systems get about the same uptime on my laptop (currently <em>13
days, 23 hours, 57 minutes, 10 seconds</em>).</p>

<p>So of course I'm using <em>Emacs</em> to edit <em>PostgreSQL</em> <code>.sql</code> files, including stored
procedures. And it so happens that <a href="http://archives.postgresql.org/pgsql-hackers/2010-07/msg01067.php">line numbering in plpgsql</a> is not as
straightforward as one would naively think, to the point that we'd like to
have better tool support there. So I've extended the Emacs <a href="http://www.gnu.org/software/emacs/manual/html_node/emacs/Minor-Modes.html">linum-mode minor mode</a>
to also display the line numbers as computed by PostgreSQL, and here's what
it looks like:</p>

<center>
<p><img src="../../../images//emacs-pgsql-line-numbers.png" alt=""></p>
</center>

<p>Now, here's also the source code, <a href="https://github.com/dimitri/pgsql-linum-format">pgsql-linum-format</a>. Hope you'll enjoy!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/emacs.html">Emacs</a> <a href="../../../tags/debian.html">debian</a> <a href="../../../tags/plpgsql.html">plpgsql</a> <a href="../../../tags/pgsql-linum-format.html">pgsql-linum-format</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 22 Jul 2010 09:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/07/22-emacs-and-postgresql.html</guid>
</item>
<item>
  <title>Background writers</title>
  <link>http://tapoueh.org/blog/2010/07/19-background-writers.html</link>
  <description><![CDATA[<div class="date">Monday, July 19 2010, 16:30</div>
<div id="article">
<p>There's currently a thread on <a href="http://archives.postgresql.org/pgsql-hackers/">hackers</a> about <a href="http://archives.postgresql.org/pgsql-hackers/2010-07/msg00493.php">bg worker: overview</a> and a series
of 6 patches. Thanks a lot <strong><em>Markus</em></strong>! This is all about generalizing a concept
already in use in the <em>autovacuum</em> process, where you have an independent
subsystem that requires an autonomous <em>daemon</em> running and able to start
its own <em>workers</em>.</p>

<p>I've been advocating generalizing this concept for a while already, in
order to have <em>postmaster</em> able to tell subsystems when to shut down,
start, reload, etc. Some external processes are only external because
there's no need to include them <em>by default</em> in the database engine, not
because there's no sense in having them in there.</p>

<p>So even if <strong><em>Markus</em></strong>'s work is mainly about generalizing <em>autovacuum</em> so that he
has a <em>coordinator</em> to ask for helper backends to handle broadcasting of
<em>writesets</em> for <a href="http://postgres-r.org/">Postgres-R</a>, it still could be a very good first step towards
something more general. What I'd like to see the generalization handle are
things like <a href="http://wiki.postgresql.org/wiki/PGQ_Tutorial">PGQ</a>, or the <em>pgagent scheduler</em>. In some cases, <a href="http://pgbouncer.projects.postgresql.org/doc/usage.html">pgbouncer</a> too.</p>

<p>What we're missing there is an <em>API</em> for everybody to be able to extend
PostgreSQL with its own background processes and workers. What would such a
beast look like? I have some preliminary thoughts about this in my
<a href="char10.html#sec16">Next Generation PostgreSQL</a> article, but those are still early thoughts. The
main idea is to steal as much as sensible from
<a href="http://www.erlang.org/doc/man/supervisor.html">Erlang Generic Supervisor Behaviour</a>, and maybe up to its
<a href="http://www.erlang.org/doc/design_principles/fsm.html">Generic Finite State Machines</a> <em>behavior</em>. In the <em>Erlang</em> world, a <em>behavior</em> is a
generic process.</p>
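<p>To make the supervisor idea a bit more concrete, here is a minimal sketch in Python. It is not PostgreSQL code and not the Erlang API: the names <code>ChildSpec</code>, <code>Supervisor</code>, <code>register</code> and <code>max_restarts</code> are made up for the illustration. It only shows a one-for-one restart policy, where a crashed worker is restarted on its own until a restart limit is reached.</p>

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ChildSpec:
    """Hypothetical description of one worker: a name and the function to run."""
    name: str
    start: Callable[[], None]


@dataclass
class Supervisor:
    """Toy one-for-one supervisor: a failed child is restarted on its own."""
    max_restarts: int = 3
    children: List[ChildSpec] = field(default_factory=list)
    restarts: Dict[str, int] = field(default_factory=dict)

    def register(self, name: str, start: Callable[[], None]) -> None:
        """Declare a worker the supervisor is responsible for."""
        self.children.append(ChildSpec(name, start))
        self.restarts[name] = 0

    def run_child(self, spec: ChildSpec) -> bool:
        """Run one child until it returns; restart it whenever it raises."""
        while True:
            try:
                spec.start()
                return True  # normal exit, nothing more to do
            except Exception:
                if self.restarts[spec.name] >= self.max_restarts:
                    return False  # restart budget exhausted, give up
                self.restarts[spec.name] += 1

    def run(self) -> Dict[str, bool]:
        """Run every registered child in turn and report who ended normally."""
        return {spec.name: self.run_child(spec) for spec in self.children}


# Usage: a worker that crashes twice before doing its job.
attempts = {"count": 0}


def flaky_worker() -> None:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated crash")


supervisor = Supervisor(max_restarts=5)
supervisor.register("flaky", flaky_worker)
outcome = supervisor.run()
```

<p>A real implementation would run the children as separate processes and react to their exit status; the sketch runs them in-process only to keep the restart logic visible.</p>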

<p>The <em>FSM</em> approach would allow for any user daemon to provide an initial state
and register functions that would do some processing then change the
state. My feeling is that if those functions are exposed at the SQL level,
then you can <em>talk</em> to the daemon from anywhere (the Erlang ideas include a
globally —cluster wide— unique name). Of course the goal would be to
provide an easy way for the <em>FSM</em> functions to have a backend connected to the
target database handle the work for it, or be able to connect itself. Then
we'd need something else here, a way to produce events based on the clock. I
guess relying on <code>SIGALRM</code> is a possibility.</p>
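<p>Here is a rough sketch of what such an <em>FSM</em> registration could look like, written in Python rather than at the SQL level. Everything in it is an assumption for the sake of the example: the <code>FsmDaemon</code> class, its <code>register</code> and <code>send</code> methods, and the state and event names. Clock events are simulated by sending a <code>tick</code> event by hand; in a real daemon they could come from a timer signal.</p>

```python
from typing import Callable, Dict, Tuple

# A handler receives the daemon's private data and returns the next state.
Handler = Callable[[dict], str]


class FsmDaemon:
    """Toy finite state machine: handlers are registered per (state, event)."""

    def __init__(self, initial_state: str) -> None:
        self.state = initial_state
        self.data: dict = {}
        self.handlers: Dict[Tuple[str, str], Handler] = {}

    def register(self, state: str, event: str, handler: Handler) -> None:
        """Attach a processing function to an event received in a given state."""
        self.handlers[(state, event)] = handler

    def send(self, event: str) -> str:
        """Deliver an event; events with no handler in this state are ignored."""
        handler = self.handlers.get((self.state, event))
        if handler is not None:
            self.state = handler(self.data)
        return self.state


def start_batch(data: dict) -> str:
    """On a clock tick while idle, count a new batch and start working."""
    data["batches"] = data.get("batches", 0) + 1
    return "working"


def finish_batch(data: dict) -> str:
    """When the work is done, go back to waiting for the next tick."""
    return "idle"


daemon = FsmDaemon("idle")
daemon.register("idle", "tick", start_batch)
daemon.register("working", "done", finish_batch)

daemon.send("tick")   # idle, tick: start a batch
daemon.send("tick")   # already working: this tick is ignored
daemon.send("done")   # back to idle
```

<p>Keying the handlers on the <code>(state, event)</code> pair keeps each function small: it does its processing and names the next state, nothing else.</p>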

<p>I'm not sure about how yet, but I think getting back in consultancy after
having opened <a href="http://2ndQuadrant.com">2ndQuadrant</a> <a href="http://2ndQuadrant.fr">France</a> has some influence on how I think about all
that. My guess is that those blog posts are a first step on a nice journey!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Mon, 19 Jul 2010 16:30:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/07/19-background-writers.html</guid>
</item>
<item>
  <title>Logs analysis</title>
  <link>http://tapoueh.org/blog/2010/07/13-logs-analysis.html</link>
  <description><![CDATA[<h1>Logs analysis</h1>
<div class="date">Tuesday, July 13 2010, 14:15</div>
</div>
<div id="article">
<p>Nowadays to analyze logs and provide insights, the most common tool to use
is <a href="http://pgfouine.projects.postgresql.org/">pgfouine</a>, which does an excellent job. But there have been some
improvements in logging capabilities that we're not benefiting from yet, and
I'm thinking about the <code>CSV</code> log format.</p>

<p>So the idea would be to turn <em>pgfouine</em> into a set of <code>SQL</code> queries against the
logs themselves once imported into the database. Wait. What about having our
next PostgreSQL version, which is meant (I believe) to include CSV support
in <em>SQL/MED</em>, directly expose its logs as a system view?</p>

<p>A good thing would be to expose that as a ddl-partitioned table following
the log rotation scheme as set up in <code>postgresql.conf</code>, or maybe given in some
sort of a setup, in order to support <code>logrotate</code> users. At least some
facilities to do that would be welcome, and I'm not sure plain <em>SQL/MED</em> is
up to that when it comes to <em>source</em> partitioning.</p>

<p>Then all that remains to be done is a set of <code>SQL</code> queries and some static or
dynamic application to derive reports from there.</p>
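<p>As a small illustration of the "reports as <code>SQL</code> queries" idea, here is a sketch in Python. It makes several assumptions: the log sample is made up and cut down to three columns (a real <code>CSV</code> log has many more), the table name <code>pglog</code> is hypothetical, and an in-memory SQLite database stands in for PostgreSQL so that the example runs on its own.</p>

```python
import csv
import io
import sqlite3

# Made-up sample, reduced to three columns: log_time, error_severity, message.
SAMPLE_LOG = """\
2010-07-13 14:15:01 CEST,ERROR,division by zero
2010-07-13 14:15:02 CEST,LOG,checkpoint starting: time
2010-07-13 14:15:07 CEST,ERROR,canceling statement due to user request
2010-07-13 14:15:09 CEST,WARNING,there is no transaction in progress
"""


def load_logs(conn: sqlite3.Connection, text: str) -> int:
    """Import CSV log lines into a table and return the number of rows loaded."""
    conn.execute(
        "CREATE TABLE pglog (log_time TEXT, error_severity TEXT, message TEXT)"
    )
    rows = list(csv.reader(io.StringIO(text)))
    conn.executemany("INSERT INTO pglog VALUES (?, ?, ?)", rows)
    return len(rows)


def severity_report(conn: sqlite3.Connection) -> list:
    """One possible report: how many log lines there are per severity level."""
    return conn.execute(
        "SELECT error_severity, count(*) FROM pglog "
        "GROUP BY error_severity ORDER BY count(*) DESC, error_severity"
    ).fetchall()


conn = sqlite3.connect(":memory:")
loaded = load_logs(conn, SAMPLE_LOG)
report = severity_report(conn)
```

<p>With the logs exposed directly by the server, the import step would go away and only the queries would remain.</p>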

<p>This is yet again an idea I have in mind but don't currently have time to
explore myself, so I talk about it here in the hope that others will share
the interest. Of course, now that I work at <a href="http://2ndQuadrant.com">2ndQuadrant</a>, you can make it so
that we consider the idea in more detail, up to implementing and
contributing it!</p>


<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Tue, 13 Jul 2010 14:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/07/13-logs-analysis.html</guid>
</item>
<item>
  <title>Using indexes as column store?</title>
  <link>http://tapoueh.org/blog/2010/07/08-using-indexes-as-column-store.html</link>
  <description><![CDATA[<h1>Using indexes as column store?</h1>
<div class="date">Thursday, July 08 2010, 11:15</div>
</div>
<div id="article">
<p><span class="hack"> </span></p>

<p>There's a big trend nowadays about using column storage as opposed to what
PostgreSQL is doing, which would be row storage. The difference is that if
you have the same column value in a lot of rows, you could get to a point
where you have this value only once in the underlying storage file. That
means high compression. Then you tweak the <em>executor</em> to be able to load this
value only once, not once per row, and you save another huge source of data
traffic (often enough, from disk).</p>

<p>Well, it occurs to me that maybe we could have column oriented storage
support without adding any new storage facility into PostgreSQL itself, just
using in new ways what we already have now. Column oriented storage looks
somewhat like an index, where any given value is meant to appear only
once. And you have <em>links</em> to know where to find the full row associated in
the main storage.</p>
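<p>The analogy can be sketched in a few lines of Python. This is only an in-memory illustration, not how PostgreSQL stores anything: the function name <code>build_column_index</code> and the sample rows are made up. Each distinct value is kept once, together with the positions of the rows that carry it.</p>

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def build_column_index(rows: Sequence[Tuple], column: int) -> Dict[object, List[int]]:
    """Map each distinct value of one column to the row positions holding it."""
    index = defaultdict(list)
    for row_id, row in enumerate(rows):
        index[row[column]].append(row_id)
    return dict(index)


# "Main storage": full rows, here (country, city).
rows = [("fr", "Paris"), ("fr", "Lyon"), ("us", "Boston"), ("fr", "Nice")]

# The country value "fr" is stored once, with links back to three rows.
by_country = build_column_index(rows, 0)

# A per-value count can be answered from the index alone.
counts = {value: len(row_ids) for value, row_ids in by_country.items()}
```

<p>The links are only followed when a query needs other columns of the row.</p>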

<p>There's work in progress to allow PostgreSQL to use indexes on their
own, without having to go to the main storage to check the
visibility. That relies on the <a href="http://www.postgresql.org/docs/8.4/static/storage-vm.html">Visibility Map</a>, which is still only a hint
in released versions. The goal is to turn that into a crash-safe trustworthy
source in the future, so that we get <em>covering indexes</em>. That means we can use
an index and skip going to the full row in main storage just to get the
visibility information there.</p>
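<p>Here is a toy model of that idea in Python, under loud assumptions: the data structures are invented for the example (a list of index entries, a set of pages flagged as all-visible, a dictionary standing in for the heap) and do not reflect PostgreSQL internals. It only shows when the trip to main storage can be skipped.</p>

```python
from typing import Dict, List, Set, Tuple

IndexEntry = Tuple[int, int, int]          # (value, heap page, slot in page)
Heap = Dict[Tuple[int, int], Tuple[int, bool]]  # (page, slot): (value, visible)


def index_only_scan(
    entries: List[IndexEntry], all_visible_pages: Set[int], heap: Heap
) -> Tuple[List[int], int]:
    """Return the visible values and how many heap fetches were needed."""
    values = []
    heap_fetches = 0
    for value, page, slot in entries:
        if page in all_visible_pages:
            # The map vouches for the whole page: trust the index value.
            values.append(value)
        else:
            # No guarantee: go to main storage to check visibility.
            heap_fetches += 1
            _, visible = heap[(page, slot)]
            if visible:
                values.append(value)
    return values, heap_fetches


entries = [(10, 0, 0), (20, 0, 1), (30, 1, 0), (40, 1, 1)]
all_visible_pages = {0}                      # only page 0 is known all-visible
heap = {(1, 0): (30, True), (1, 1): (40, False)}  # page 1 must be checked

values, heap_fetches = index_only_scan(entries, all_visible_pages, heap)
```

<p>In this example two of the four entries are answered from the index alone, and the other two still cost a heap fetch each.</p>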

<p>Now, once we have that, we could consider using the indexes in more
queries. It could be a win to get the column values from the index when
possible: if you don't <em>output</em> more columns from the <em>heap</em>, return the
values from the index, scanning it only once per value, not once per row.</p>

<p>There's a little more thought on this point in the <a href="char10.html#sec10">Next Generation PostgreSQL</a>
article I've been referencing already, should you be interested.</p>



<h2>Tags</h2>

<p><a href="../../../tags/postgresql.html">PostgreSQL</a> <a href="../../../tags/release.html">release</a></p>


</div>

]]></description>
  <author>dim@tapoueh.org (Dimitri Fontaine)</author>
  <pubDate>Thu, 08 Jul 2010 11:15:00 +0200</pubDate>
  <guid isPermaLink="true">http://tapoueh.org/blog/2010/07/08-using-indexes-as-column-store.html</guid>
</item>
<item>
  <title>MVCC in the Cloud</title>
  <link>http://tapoueh.org/blog/2010/07/06-mvcc-in-the-cloud.html</link>
  <description><![CDATA[<h1>MVCC in the Cloud</h1>
<div class="date">Tuesday, July 06 2010, 10:50</div>
</div>
<div id="article">
<p>At <a href="http://char10.org/">CHAR(10)</a> <strong><em>Markus</em></strong> gave a talk about
<a href="http://char10.org/talk-schedule-details#talk13">Using MVCC for Clustered Database Systems</a> and explained how <a href="http://postgres-r.org/">Postgres-R</a> does
it. The scope of his project is to maintain a set of database servers in the
same state, eventually.</p>

<p>Now, what does it mean to get &quot;In the Cloud&quot;? Well, there is more than one
answer I'm sure; mine would insist on including this &quot;Elasticity&quot; bit. What
I mean here is that it'd be great to be able to add or lose nodes and stay
<em>online</em>. Granted, that's what <em>Postgres-R</em> is providing. Does that make it
ready for the &quot;Cloud&quot;? Well, as it happens, I don't think so.</p>

<p>Once you have elasticity, you also want <em>scalability</em>. That could mean lots of
things, and <em>Postgres-R</em> already provides a great deal of it, at the connect
and reads level: you can do your business <em>unlimited</em> on any node, the others
will eventua
