Sunday, November 22, 2015

Thoughts from PGConf.eu, PostgreSQL Europe Conference 2015, Day 1

It was my first PGConf.eu and it was awesome. My bad that I waited so long to share some thoughts, but I am fixing that now with a summary. Links to the presentations are at https://wiki.postgresql.org/wiki/PostgreSQL_Conference_Europe_Talks_2015.

The keynote by Tamara Atanasoska, called The bigger picture: More than just code, made the point that a community project is first about people; code comes second. It also stressed how important it is to be open to newcomers and users. A great eye-opener for anybody involved in open source.

The Upcoming PostgreSQL 9.5 Features talk by Bruce Momjian covered these new features: INSERT ... ON CONFLICT (upsert), aggregate function enhancements, in-memory performance, JSON manipulation and operations, and improvements in foreign data wrappers, which can now be part of a sharding setup. I really can't wait for PostgreSQL 9.5 GA.
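
The upsert feature from that list (INSERT ... ON CONFLICT in PostgreSQL 9.5) is easy to try out. Here is a minimal sketch of the idea. Note that it runs against Python's built-in sqlite3 module instead of PostgreSQL, only so that it works without a server; SQLite (3.24 or newer) accepts a very similar ON CONFLICT ... DO UPDATE clause. The table, the columns and the helper function are made up for illustration and do not come from the talk.

```python
import sqlite3


def upsert_counter(conn, name, delta):
    """Insert a counter row, or add to it when the name already exists."""
    conn.execute(
        """
        INSERT INTO counters (name, hits) VALUES (?, ?)
        ON CONFLICT (name) DO UPDATE
            SET hits = counters.hits + excluded.hits
        """,
        (name, delta),
    )


# In-memory SQLite database standing in for a PostgreSQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE counters (name TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
)

upsert_counter(conn, "home", 1)   # inserts a new row
upsert_counter(conn, "home", 2)   # conflicts on name, so it updates instead
upsert_counter(conn, "about", 5)  # inserts a second row

rows = conn.execute("SELECT name, hits FROM counters ORDER BY name").fetchall()
print(rows)
```

The point is that a single statement handles both cases, so the application needs no separate "does the row exist?" check.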

Dockerizing a Larger PostgreSQL Installation: What could possibly go wrong? by Mladen Marinović was exactly what I was looking for, because containers are topic #1 for me right now. Mladen demonstrated using containers mostly as VMs. The basics of Docker were also introduced, since only half of the room was familiar with it. Problems with caching during `docker build` and with locales inside containers were mentioned, but the solutions seemed rather hacky to me (using the date to trick the daemon instead of the straightforward --no-cache). The problem of sending signals to processes was also mentioned, which is something we also looked at while creating containers for OpenShift.

Mladen uses Ansible for building images and uses several containers for specific actions -- e.g. a separate container for tools (dump/load). They support replication -- hot_standby, restart_command (run in a separate container) -- and run on two physical servers. Backups (pg_basebackup plus compression) follow a retention policy and run in yet another container.

It was also mentioned that taking a backup from a slave is not that easy. Monitoring that all containers are alive is done by HA software. Problems with OOM were also mentioned: the system may kill your container processes, and the solution was to use max_connections to limit the number of connections. The problem of a freezing server was solved by using a timeout for every command. Transparent huge pages were mentioned as something not to use -- use normal huge pages instead. Finally, upgrading to a new major version was mentioned as always being a tough point.
See more at: https://bitbucket.org/marin/pgconfeu2015/src
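
The "timeout for every command" approach to the freezing-server problem can be sketched in a few lines. This is only my illustration of the general idea, not the code from the talk: the helper name is made up, and the two child commands are plain Python one-liners standing in for real administration commands.

```python
import subprocess
import sys


def run_with_timeout(argv, seconds):
    """Run a command; return its exit code, or None if it had to be killed."""
    try:
        # subprocess.run kills the child and raises once the timeout expires.
        return subprocess.run(argv, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None


# A command that finishes immediately.
quick = run_with_timeout([sys.executable, "-c", "pass"], 30)

# A command that hangs (here: sleeps) for far longer than we are willing to wait.
hung = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(30)"], 0.5
)

print(quick, hung)
```

A None result tells the caller that the command hung, so it can retry or raise an alert instead of blocking forever.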

DBA's toolbelt by Kaarel Moppel was more of a list of tools to look at when you are serious about administering PostgreSQL. Simply repeating the list may be interesting for someone:
  • docs + pgconfig for configuring
  • pgbench + pgbench-tools for benchmarking
  • pg_activity (top-like) + pg_view + pgstats + pg_stat_statements + pgstattuple + plugins for monitoring systems (Nagios) + postgresql-metrics (Spotify) for monitoring
  • pgBadger + pg_loggrep for log analysis
  • postgresql-toolkit, the Victorinox (Swiss Army knife) for the PostgreSQL DBA
  • pg-utils, pgx_scripts (https://github.com/pgexperts/pgx_scripts), acid-tools
  • pg_locks, pg_stat_activity (locks)
  • pgObserver, pgCluu and many others, depending on what you need to do
  • pgAdmin 4 on the way (backend and web frontend)
  • for developers: pgloader, pg_partman, pgpool-II, PL/Proxy

Managing PostgreSQL with Ansible by Gulcin Yildirim was more of an introduction to Ansible.
https://github.com/gulcin/pgconfeu2015

Let’s turn your PostgreSQL into columnar store with cstore_fdw by Jan Holčapek introduced an interesting extension that can turn the classic row-based PostgreSQL storage engine into a column store by using the foreign data wrapper concept. In some cases this may help performance a lot, once it is ready.

Performance improvements in 9.5 and beyond by Tomas Vondra was not only an interesting insight into the particular areas that PostgreSQL hackers look at, but also a nice motivation to consider upgrading to 9.5. For example, a sorting speed-up of up to 3-4x compared to 9.4 in some cases is something I really wouldn't have expected. A comparison of BRIN vs. BTREE indexes showed that performance is quite similar, but BRIN is much, much smaller. Another set of graphs showed how parallel scan can cut the time of selects by up to half.
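
Why a BRIN index is so much smaller can be shown with a toy model: instead of one index entry per row, a block range index keeps only the minimum and maximum value for each range of table blocks. The sketch below is a simplified in-memory illustration of that idea, not PostgreSQL's implementation; the function names, the data and the range size are all made up.

```python
def build_brin(values, rows_per_range):
    """Summarise each chunk of rows as (min, max, first_row, end_row)."""
    summaries = []
    for start in range(0, len(values), rows_per_range):
        chunk = values[start:start + rows_per_range]
        summaries.append((min(chunk), max(chunk), start, start + len(chunk)))
    return summaries


def brin_search(values, summaries, low, high):
    """Return values in [low, high], scanning only ranges that may match."""
    hits = []
    scanned = 0
    for range_min, range_max, start, end in summaries:
        if range_max < low or range_min > high:
            continue  # the summary proves this range holds no match
        scanned += end - start
        hits.extend(v for v in values[start:end] if low <= v <= high)
    return hits, scanned


# Naturally ordered data, e.g. an ever-growing id or timestamp column.
values = list(range(100_000))
summaries = build_brin(values, 1_000)
hits, scanned = brin_search(values, summaries, 41_500, 41_520)

# rows in table, index entries, matching rows, rows actually scanned
print(len(values), len(summaries), len(hits), scanned)
```

With well-ordered data, 100 summaries stand in for the 100,000 entries a B-tree would hold, at the cost of rescanning one whole range. On randomly ordered data nearly every range would match and the advantage disappears.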

Notes from the second day are here and from the third day are here.