Category “Yesql” — 45 articles

Today I want to react to an article that claims that Relational Algebra Is the Root of SQL Problems in which the author hand-waves the following position:

SQL becomes more a hindrance to data manipulation than an efficient tool. SQL’s greatest problem isn’t in the implementation level, but at its theory foundation. The problem can’t be solved by application optimization. Relational algebra isn’t sophisticated enough for handling the complicated data manipulation scenarios.

Then they go on to several arguments from authority to “prove” their point. My reading of the article is that SQL is very hard when you didn’t care to learn it, as most technologies are.

In this article, we’re going to look at the simple examples provided where apparently SQL makes it so much harder to find a solution compared to writing some Java or C++ code. Contrary to the original article, we go as far as to actually writing both the SQL solution and a complete Python solution, so that we can compare.

In another article here, entitled on JSON and SQL, we saw in great details how to import a data set only available as a giant JSON file. Then we normalized the data set, so as to be able to write SQL and process our data. This approach is sometimes very useful and was a good way to learn some of the JSON functions provided by PostgreSQL.

In this article, we’re going to use SQL to export the data from our relational model into a JSON document. The trick that makes it complex in this example is that we have a recursive data model, with a notion of a parent row that exists in the same table as the current one. That’s a nice excuse to learn more about the SQL construct WITH RECURSIVE.

It seems to be usual nowadays to review the previous year, and readers apparently like Top-N Lists — that’s you now, so let’s hope that my understanding works with you too.

Of course 2018 will see its own amount of new and original content added to this blog, with a continuous focus towards how to make the best out of the SQL powerful programming language, and its advanced concurrency semantics.

PostgreSQL ships with an interactive console with the command line tool named psql. It can be used both for scripting and interactive usage and is moreover quite a powerful tool. Interactive features includes autocompletion, readline support (history searches, modern keyboard movements, etc), input and output redirection, formatted output, and more.

In How to Write SQL we saw how to write SQL queries as separate .sql files, and we learnt about using query parameters with the psql syntax for that (:variable, :'variable', and :"identifier").

For writing our database model, the same tooling is all we need. An important aspect of using psql is its capacity to provide immediate feedback, and we can also have that with modeling too.

In our article Exploring a Data Set in SQL we discovered a data set related to music: the Chinook sample database.

Our discovery led us to find albums containing tracks of multiple genres, and for the analytics we were then pursuing, we wanted to clean the data set and assign a single genre per album. We did that in SQL of course, and didn’t actually edit the data.

Finding the most frequent input value in a group is a job for the mode() WITHIN GROUP (ORDER BY sort_expression) Ordered-Set Aggregate Function, as documented in the PostgreSQL page about Aggregate Functions.

PostgreSQL 10 is now available for everyone to use, and hinted by David Fetter I had to review my previous article on Json and SQL to adapt to Set Returning Functions changes.

A Set Returning Function is a PostgreSQL Stored Procedure that can be used as a relation: from a single call it returns an entire result set, much like a subquery or a table.

It used to be possible to use SRF in the SELECT clause, with dubious (but useful at times) semantics, and also in scalar contexts. The semantics have been fixed and are now much clearer, and the uses in scalar contexts are forbidden — they were a hack and never made sense anyway.

on Json and SQL

PostgreSQL has had proper json support for a while now. The unique extensibility approach of the PostgreSQL system allows it to enable native SQL friendly JSON processing.

In this article we’ll play with the Magic: the Gathering card data in JSON format data set, provided with a CC0 licence, and process the information provided. We also see how to normalize a JSON document into a proper database model that benefits from some PostgreSQL advanced features, and how to then inject the JSON documents into the normalized database schema. Finally, we compare some non-trivial processing done against both versions of the database schema.

Dimitri Fontaine

PostgreSQL Major Contributor

Open Source Software Engineer

France