Encoding

PostgreSQL Data Types: Text Encoding

Monday 09 Apr 2018 5 mins read PostgreSQL, YeSQL

Continuing our series of PostgreSQL Data Types, today we’re going to introduce the PostgreSQL text data type. The first notion to understand when processing text in any program is of course the notion of encoding.

So when addressing the text data type we must mention encoding settings, and possibly encoding issues too. An encoding is a particular representation of characters in bits and bytes. In the ASCII encoding the letter A is encoded as the 7-bit value 1000001, which is 65 in decimal or 41 in hexadecimal. All those numbers are written the same way on disk, and so is the letter A.

Getting out of SQL_ASCII, part 2

Tuesday 23 Feb 2010 5 mins read PostgreSQL, Encoding

plpgsql Encoding

So, if you followed the previous blog entry, you now have a new database containing all the static tables encoded in UTF-8 rather than SQL_ASCII. And if it was not yet the case, you now severely distrust this non-encoding.

Now is the time to have a look at properly encoding the live data, that is, the data stored in tables that continue to receive write traffic. The idea is to use the UPDATE facilities of PostgreSQL to tweak the data, and also to fix the applications so that they stop inserting badly encoded strings in there.

Getting out of SQL_ASCII, part 1

Thursday 18 Feb 2010 3 mins read PostgreSQL, Encoding

Encoding

It happens that you have to manage databases designed by your predecessor, and it even happens that the team did not have a DBA at the time. Those histerical raisins can lead to having a SQL_ASCII database. The horror! What SQL_ASCII means, if you’re not already familiar with the consequences of such a choice, is that all the text and varchar data that you put in the database is accepted as-is. No checks.
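To make the "no checks" point concrete, here is a small Python sketch of the kind of validation that is skipped. It is an illustration only and not how PostgreSQL implements anything; the helper name is_valid_utf8 is hypothetical. The same character written by a Latin-1 client and by a UTF-8 client gives two different byte strings, and only one of them is valid UTF-8. Without any check, both can end up side by side in the same column.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True when raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same letter, as sent by two clients using different encodings.
from_latin1_client = "é".encode("latin-1")  # b'\xe9'
from_utf8_client = "é".encode("utf-8")      # b'\xc3\xa9'

print(is_valid_utf8(from_latin1_client))  # prints: False
print(is_valid_utf8(from_utf8_client))    # prints: True
```

A check like this one is a common way to spot which rows need attention before moving the data to a UTF-8 database.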

Dimitri Fontaine