Saturday, January 25, 2014

The best database, PostgreSQL 9.3

Fedora 20 was released few days before Christmas with only slight delay and you can now enjoy many new bleeding-edge applications and one of them is the newest stable PostgreSQL 9.3. What can we find with this new shiny database engine is shortly summarized in this post.

Something on data definition field

One of the most significant news, materialized views, brings to users a possibility to create a real table instead of a virtual view and you can even create indexes on such view. However, after updating the view you need to call REFRESH explicitly, since this feature is not totally completed yet. Still worth using already, though.

Auto-updatable views might already be kind of hacked in previous versions, but now we have this cool feature made by default. Well, such views have to be quite simple, they need to have only one table in FROM clause, they can't modify relation using clauses like DISTINCT or UNION, but still cool, he? For more complicated views you can use INSTEAD OF triggers.

Event triggers allow you to define triggers as a reaction for some DDL query, i.e for queries like CREATE table. This can be used for some complicated settings of users, privileges and so on, where default settings doesn't work sufficiently.

New command COPY can be used for faster import/export data into/out of database, where there is a file or even application (using PIPE COPY) on the other end. The command is executed with rules that postgresql's superuser has and yes, only the superuser can do this, so nothing bad should happen. Another modification of COPY command is FREEZE, which means that constant records get frozen during copying, instead of running quite slow VACUUM FREEZE.

Performance is not without change

Since version 9.3 you can use GiST indexes for unlogged tables, which don't use transactional log and thus are not persistent during crash. Another new indexes, this time for regular expressions, will be surely beneficial for performance of many applications.

Performance improvements can be noticed also during numeric operations, parallel pg_dump, working with new range data type (which was introduced in 9.2), where upstream improved performance using histograms, and the same data type can be made faster using SP-Gist indexes, which was not possible before.

Other performance changes are introduced in shared memory, mmap is now used for implementation; large objects can be now 4TB large, while until now it was only 2GB. Locks can now use limited duration, which can be explicitely specified.

Checksums of data pages are not a performance improvement, rather the opposite, but some paranoid application can benefit from them. For such applications we can also use new style of warning messages, which now include easily parse-able items like SCHEMA NAME, TABLE NAME, CONSTRAINT NAME or LOCATION.

Something for plugin writers

Foreign Data Wrappers (FWD) bring new way how to work with other data than classic PostgreSQL data stack, while the data doesn't need to be only different PostgreSQL databases. We can get or even store data from/into almost any data storage, if technology allows it. New version of PostgreSQL brings also a new API which offers a way how to write new background processes, so called Background worker process. What this will mean in practice we'll see in the future, because there are no other plugins out there so far.

Another new way how to interract with PostgreSQL daemon is offered by pg_xlogdump, which basically prints transactional log in an easily parse-able format. What can we do with it? It seems it is limited only by your imagination.

News in SQL queries writing

I find it hard to explain what new LATERAL JOIN means, so I'll borrow quite easy description from [1] -- Put simply, a LATERAL JOIN enables a subquery in the FROM part of a clause to reference columns from preceding items in the FROM list. Deeper description is beyond this post, so feel free to use upstream's documentation [2].

First steps for supporting JSON data type were introduced in previous version, but now we have finally quite enough to work with it in real application. The thing is that the new version brings possibility to use agregation or we can parse JSON values now, which means we can use only some values from the JSON document. JSON can be also converted to record data type and also some basic values can be easily converted to JSON document. When working with arrays, we can use new functions array_remove and array_replace.

Some bonuses extra

Replication can be now set so that after a master server gets lost, new master can be set from one of the remaining slaves, while until now we had to set all slaves from scratch. PostgreSQL is therefore a bit closer to cloud-oriented database system, while automatic recovery is crucial in such applications.

Not only during systemd boot, where services get run in parallel if possible, we need to know that some service is ready. PostgreSQL now includes a new utility pg_isready, that can be easily used in these cases. It doesn't do nothing more than telling that a PostgreSQL daemon is ready to accept connections. But believe me, it'll help not only package maintainers a lot.

There are some news in the console psql as well -- back-slash commands \gset and \watch. Server now offers possibility to listen on more unix sockets and option unix_socket_directory is now renamed to unix_socket_directories. Well, this last feature is quite important for me personally, since I'm very proud to be recognized as an author of this patch. Well, I should admit that final edits were done by Tom Lane, the most active upstream committer. Anyway, great to see my name in such wonderful project.

And since the year 2014 is here for some time, I'd like to wish all guys involved in the PostgreSQL upstream to have wonderful peaceful time and get much energy for PostgreSQL 9.4. Thank you for all your work during the year 2013.

The article is based on the following articles:
[1] http://wiki.postgresql.org/wiki/What's_new_in_PostgreSQL_9.3
[2] http://www.postgresql.org/docs/devel/static/queries-table-expressions.html#QUERIES-LATERAL
[3] http://www.postgresql.org/docs/devel/static/release-9-3.html
[4] http://www.root.cz/clanky/novinky-v-pripravovanem-postgresql-9-3/

Wednesday, January 01, 2014

Happy New Year Fedora and what to do when upgrading to Heisenbug doesn't work as supposed

The old year 2013 was about to end and I realized I didn't want to start he new one with an old Fedora. Having Fedora 19 still on my personal and also on my working computer, I began. These are my hints and experiences.

My fails but finally successful with fedup

First, on my personal computer, I used dnf (a new, much faster successor-to-be of yum; I recommend it to try it out) to update all packages to the latest F19 version and then ran fedup utility with "--network 20 --nogpgcheck" options. Turning-off GPG checking was the fastest way how to work-around some signature issues. fedup reported some un-available repositories, so I just disabled those for the sake of upgrade.

After downloading packages and installing new fedup drucut I restarted computer and it proceeded without problems. And it still works fine.

So, to sum it up, on my personal computer it went very well, only some un-available repositories and GPG issues were encountered during the whole upgrade, but those were easily worked-around by using appropriate options.

What to do if fedup is not an option

The day after I proceeded with the same steps on my working computer, which is Lenovo T510. I saw the same issues with un-available repositories but not GPG issues any more. So I disabled the repositories and restarted computer after all packages had been downloaded, the same as I did on my personal comuputer.

When fedup dracut started I saw kernel panic after 7s (similar to RHBZ #873459, but I can't reproduce anymore, so I'll try it again on one of my virtual machines -- well quite paradox that I got kind of a heisenbug in a process of getting Fedora Heisenbug) so basically in a very early state. What is more, the same happened when I tried to boot with the newest kernel-3.12.x. But I was still able to boot with older kernel-3.11.x. So I wasn't screwed totally, but I was not able to finish fedup upgrade even repeatedly.

My solution was to go on with another way of upgrading -- upgrading using yum. I put dnf aside, since I wasn't sure if it is able to deal with upgrade (now I regret that I didn't try it) and using distribution-synchronisation option it went very well. After some hours (yeah, I have too many packages) I ran restorecon on whole root to be sure SELinux will work fine and rebooted. A bit surprised, I saw a new Fedora Heisenbug successfully booted and didn't noticed any issues.

Now I'm happy I could welcome the New Year with the bleeding-edge features that are served by Fedora 20 Heisenbug and I would like to wish all (not only) Fedora maintainers to have many solved issues, more interesting challenges and a lot satisfied users.

Guys, thank you all for your contribution in 2013!