PostgreSQL aka Postgres has been around for nearly 30 years now. More than ever, it’s still one of the best Open Source relational database engine you can find. Last but not least : It remains completly free.
Microsoft is embracing Open Source. Yes, that’s a huge shift. It’s been a discreet Open Source contributor for years and today, one of the most active contributor on GitHub (now part of Microsoft) with 7700 employees involved and projects like Visual Studio Code and its 19000 contributors.
Microsoft Azure public cloud is fantastic playground to build apps using Open Source technologies. Linux is a first class citizen with 50% of virtual machines deployed. Azure can also be operated from Linux, MacOS and Windows using the same tools. I can no longer count how many Open Source tools are integrated and supported by Microsoft’s cloud.
PostgreSQL on Azure
So it makes a lot of sense to have PostgreSQL support on Microsoft Azure. Of course, you can run it using virtual machines, there are a lot of pre-built VM templates for that or using containers but there’s also a service named Azure Database for PostgreSQL. Yes, a PostgreSQL as a Service. Azure Database for PostgreSQL is the real PostgreSQL, not a clone or a modified edition, setup up and operated for you by Microsoft. It means a lot. It means that you no longer have to deal with the operating system updates and patches, complex setup, security mess and boring maintenance. You can focus on your database service and your app. But it also means that you can using all the tools you use to use with PostgreSQL such as psql or pgAdmin. The service has built-in features to find long running queries or wait statistics but when it comes to log analysis, pgBadger is the king of the hill. It is of course compatible with Azure Database for PostgreSQL but it needs some knowledge to make the magic happen.
Setting up Azure Database for PostgreSQL log files
pgBadger is working with PostgreSQL log files so you need to get the right information into those log files. Azure Database for PostgreSQL, well I’ll name it Azure PostgreSQL starting now, does have the possibility to define what will go to the log files. Depending on your selection, pgBadger will be able to extract more or less information. Here is the selection I made :
Here is the kind of log files we’ll get
2019-02-06 15:00:00 UTC-5c5a1da0.3c344-LOG: duration: 15.626 ms statement: /*pga4dash*/ SELECT (SELECT sum(xact_commit) + sum(xact_rollback) FROM pg_stat_database) AS "Transactions", (SELECT sum(xact_commit) FROM pg_stat_database) AS "Commits", (SELECT sum(xact_rollback) FROM pg_stat_database) AS "Rollbacks" 2019-02-06 15:00:00 UTC-5c5a1da0.3c344-LOG: duration: 0.000 ms statement: /*pga4dash*/ SELECT (SELECT count(*) FROM pg_stat_activity) AS "Total", (SELECT count(*) FROM pg_stat_activity WHERE state = 'active') AS "Active", (SELECT count(*) FROM pg_stat_activity WHERE state = 'idle') AS "Idle" 2019-02-06 15:00:00 UTC-5c5a1da0.3c344-LOG: duration: 0.000 ms statement: /*pga4dash*/ SELECT (SELECT sum(tup_inserted) FROM pg_stat_database) AS "Inserts", (SELECT sum(tup_updated) FROM pg_stat_database) AS "Updates", (SELECT sum(tup_deleted) FROM pg_stat_database) AS "Deletes"
Running pgBadger with Azure Database for PostgreSQL log files
PostgreSQL log files may have different formats (stderr, syslog or syslog2) and different prefixes (every log lines will start with some informations such as date/time, process ID and more).
Azure PostgreSQL is using stderr as its log format and we can change that. We are not allowed to specify the log lines prefix too (log_line_prefix parameter). I didn’t find this value in the documentation either so we have to get the configured value from the engine.
First, we need to be able to connect to the service. Azure PostegreSQL is secured by default. That’s a great thing, isn’t it? We will have to allow our client machine to connect to the service. Hopefully, there’s a nice feature to quickly add the current client IP address to the white list.
Now, we can connect to the database using any client tool we like. I choose the light weight psql.
psql -h danvytest.postgres.database.azure.com sportsdb master@danvytest
Finally, let’s get the current log line prefix setting from the service.
SELECT setting FROM pg_settings WHERE name = 'log_line_prefix'; %t-%c-
So, it’s “%t-%c-” by default on Azure PostgreSQL.
We can now run pgBadger on our log files. It has a ton of parameters to play with to produce the desired analysis. By default it will create an out.html file.
pgbadger -v -f stderr -p '%t-%c-' logs/postgresql-2019-02-06_130000.log
pgBadger app is an impressive one file Perl script. So you can run and schedule it in Azure using a VM or an App Service (Web app as a Service). But that’s another story.
I’d like to address a special thank to Gilles Darold, creator and main maintener of the pbBadger project, for his useful tool and the help he gave me.