Category “Encoding” — 2 articles

So, if you followed the previous blog entry, now you have a new database containing all the static tables encoded in UTF-8 rather than SQL_ASCII. Because if it was not yet the case, you now severely distrust this non-encoding.

Now is the time to have a look at properly encoding the live data, those stored in tables that continue to receive write traffic. The idea is to use the UPDATE facilities of PostgreSQL to tweak the data, and too fix the applications so as not to continue inserting badly encoded strings in there.



It happens that you have to manage databases designed by your predecessor, and it even happens that the team used to not have a DBA. Those histerical raisins can lead to having a SQL_ASCII database. The horror! What SQL_ASCII means, if you’re not already familiar with the consequences of such a choice, is that all the text and varchar data that you put in the database is accepted as-is. No checks.

Dimitri Fontaine

PostgreSQL Major Contributor

Open Source Software Engineer

France