The idea of the day ain’t directly from me, I’m just helping with a very thin subpart of the problem. The problem, I can’t say much about, let’s just assume you want to reduce the storage of `MD5` in your database, so you want to abuse bit strings. A solution to use them works fine, but the datatype is still missing some facilities, for example going from and to hexadecimal representation in text.

``````create or replace function hex_to_varbit(h text)
returns varbit
language sql
as \$\$
select ('X' || \$1)::varbit;
\$\$;

create or replace function varbit_to_hex(b varbit)
returns text
language sql
as \$\$
select array_to_string(array_agg(to_hex((b << (32*o))::bit(32)::bigint)), '')
from (select b, generate_series(0, n-1) as o
from (select \$1, octet_length(\$1)/4) as t(b, n)) as x
\$\$;
``````

To understand the magic in the second function, let’s walk through the tests one could do when wanting to grasp how things work in the `bitstring` world (using also some reading of the fine documentation, too).

``````=# select ('101011001011100110010110'::varbit << 0)::bit(8);
bit
----------
10101100
(1 row)

=# select ('101011001011100110010110'::varbit << 8)::bit(8);
bit
----------
10111001
(1 row)

=# select ('101011001011100110010110'::varbit << 16)::bit(8);
bit
----------
10010110
(1 row)

=# select * from *TEMP VERSION OF THE FUNCTION FOR TESTING*
o |                b                 |    x
---+----------------------------------+----------
0 | 10101100101111010001100011011011 | acbd18db
1 | 01001100110000101111100001011100 | 4cc2f85c
2 | 11101101111011110110010101001111 | edef654f
3 | 11001100110001001010010011011000 | ccc4a4d8
(4 rows)
``````

What do we get from that, will you ask? Let’s see a little example:

``````=# select hex_to_varbit(md5('foo'));
hex_to_varbit
----------------------------------------------------------------------------------------------------------------------------------
10101100101111010001100011011011010011001100001011111000010111001110110111101111011001010100111111001100110001001010010011011000
(1 row)

=# select md5('foo'), varbit_to_hex(hex_to_varbit(md5('foo')));
md5                |          varbit_to_hex
----------------------------------+----------------------------------
acbd18db4cc2f85cedef654fccc4a4d8 | acbd18db4cc2f85cedef654fccc4a4d8
(1 row)
``````

Storing `varbits` rather than the `text` form of the `MD5` allows us to go from `6510 MB` down to `4976 MB` on a sample table containing 100 millions rows. We’re targeting more that that, so that’s a great win down here!

In case you wonder, querying the main index on `varbit` rather than the one on `text` for a single result row, the cost of doing the conversion with `varbit_to_hex` seems to be around `28 µs`. We can afford it.

Hope this helps!