Planet PostgreSQL

The world's most advanced open source database

How Postgres could index itself

Postgres indexing itself has long been on my wishlist. Andrew Kane from Instacart, and creator of PgHero has bottled up many learnings into a new tool: Dexter. I suspect we’ll get a look at all that went into this, how it works, and how you can leverage it to have a more automatically tuned database.

Scaling a SaaS Application Beyond a Single Postgres with Citus

Migration talks are all to common, from Postgres to MySQL from MySQL to Postgres, or from Dynamo to Postgres. But this one is a little different flavor from Postgres to sharded Postgres with Citus. Sharding into a distributed system of course brings new things to consider and think about, and here you’ll learn about them from first hand experience so hopefully you can avoid mistakes yourself.

Concurrency Deep Dive

This one looks to be a great under the hood look as well as likely very practical. It’ll cover MVCC which is really at so much of the core of how Postgres works, but then bring it up to what it means for things like locks. Best of all, this one like so many others comes with lots of real world experience from Segment.

Postgres window magic

I love me some windows, though not always the easiest things to work with. They can let you easily do things like compute month over month growth between rows in a single query. Bruce whose always a great presenter walks us through them and all the things they’re capable of.

Running PostgreSQL @ Instagram

Instagram is well known a

[...]

PostGIS 2.3.3 Released

Posted by Regina Obe in PostGIS on 2017-07-01 at 00:00

The PostGIS development team is pleased to announce the release of PostGIS 2.3.3 As befits a patch release, the focus is on bugs and breakages. Best served with PostgreSQL 9.6.3+ and pgRouting 2.4.1.

Continue Reading by clicking title hyperlink ..

PG Phriday: Uptown Func

Posted by Shaun M. Thomas in 2ndQuadrant on 2017-06-30 at 15:00

One of the coolest things about Postgres functions is that they can return rows as if they were a table. Not only is it possible, but creating a set-returning function in Postgres is almost frighteningly trivial when compared to other database engines. In fact, some Oracle users are probably rolling their eyes and muttering “Just user a cursor!” already. Just hear us out!

Postgres functions can return cursors too, but using them afterwards isn’t exactly a friendly experience. Cursors are passed by reference, so the function must either accept a parameter to name the cursor, or it generates something wacky like “<unnamed portal 1>”. After that, cursors can only be used with FETCH instead of SELECT, greatly limiting their utility.

What about views, then? Obviously they can return rows like a table, so why not just write everything as a query and turn it into a view? It’s true that practically any function can be emulated with a sufficiently advanced query. This is especially true since Postgres 9 added CTEs, allowing it to generate step-by-step transformations. Some use cases aren’t well covered by that approach, however.

Stop, Wait a Minute

Consider the Fibonacci Sequence. Here’s a function that iteratively generates the whole sequence up to the Nth value:

CREATE OR REPLACE FUNCTION fibonacci(nLoops INT)
RETURNS SETOF NUMERIC AS
$$
DECLARE
  i INT;
  x NUMERIC := 0;
  y NUMERIC := 1;
  z NUMERIC := 1;
BEGIN
  FOR i IN 1..nLoops LOOP
    RETURN NEXT x;
    x = y;
    y = z;
    z = x + y;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT max(fibonacci) FROM fibonacci(100);

          max          
-----------------------
 218922995834555169026

Amusingly enough, this particular function works until the NUMERIC datatype itself is exhausted, provided sufficient hardware. Our test VM was able to survive until the 400,000th result before subsequent attempts exhausted our meager 8GB of temporary disk space. We’d print it here, but sadly a number with 83,595 digits won’t fit in the article.

Take a Sip, Sign a Check

Now cons

[...]

PostgreSQL and the calendar

Posted by Dimitri Fontaine on 2017-06-30 at 12:35

The modern calendar is a trap for the young engineer’s mind. We deal with the calendar on a daily basis and until exposed to its insanity it’s rather common to think that calendar based computations are easy. That’s until you’ve tried to do it once. A very good read about how the current calendar came to be the way it is now is Erik’s Naggum The Long, Painful History of Time.

Channel binding with OpenSSL and Postgres

Posted by Michael Paquier on 2017-06-30 at 01:30

With the SCRAM implementation done in Postgres 10, it is time to move on with other things in this area. The next, and perhaps last, area of focus in the implementation of channel binding, on which a patch has been submitted for integration into Postgres 11.

Channel binding is a concept defined in RFC 5056, to ensure that the frontend and the backend connecting to each other are the same in order to prevent man-in-the-middle attacks.

RFC 5802, which is the specification for SCRAM, has a section dedicated to channel binding. In the context of Postgres, if a client attempts an SSL connection, then the server needs to advertise the SASL mechanism named SCRAM-SHA-256-PLUS, on top of the existing SCRAM-SHA-256 that has been implemented in Postgres 10, to let the client choose if it wants to perform channel binding or not. For Postgres, with an SSL connection, and if the server has published the -PLUS mechanism, a client has to choose the channel binding option or the server would consider that as a downgrade attack if choosing the non-PLUS channel. The protocol used for the exchange is defined in the documentation.

As mentioned by RFC 5929, there are several types of channel bindings, which define the channel binding data their own way:

tls-unique, which uses the TLS finish message bytes.
tls-server-end-point, which uses a hash of the server certificate.
tls-unique-for-telnet, which uses the TLS finish messages sent by the server and the client after the handshake.

Both of them need to be encoded with base-64. Note that per the definition available, a SCRAM implementation has to implement tls-unique, the other two ones being optional.

The patch proposed to the community at the time this post is written implements two of them: tls-unique and tls-server-end-point. The former because of its mandatory nature, and the second one after discussion to ease the integration of channel binding in the Postgres JDBC driver, where getting the TLS finish message data can be a dependency pain.

Now, finding out how to implement bo

[...]

Performance comparison of PostgreSQL connectors in Matlab, Part I: inserting data

Posted by Peter Gagarinov on 2017-06-29 at 07:48

We actively develop software related to financial modelling and risk estimation. Most of model prototypes basing on data mining, machine learning, quantitative analysis we develop as well as certain parts of production code (thanks for JIT compilation and rich visualization capabilities) are implemented in Matlab. And naturally we always face the need to process huge amounts of financial data. Data processing usually generates even more data that needs to be stored somewhere in a persistent and consistent manner. There are many ways to achieve that, but for us the most reasonable choice is to use a relational database server.

Announcing PGConf Austin 2017 -- Call for Papers now Open

Posted by Joshua Drake on 2017-06-28 at 16:37

PGConf US in partnership with the Austin PostgreSQL User Group (AUSPUG) is pleased to announce that the call for papers for PGConf Local: Austin is now open.

AUSTIN, June 28, 2017 -- The inaugural PGConf US Local: Austin Conference (PGConf Austin) will be held November 3 - 4, 2017, at the Norris Conference and Event Center (2525 W. Anderson Ln #365, Austin, TX 78757) located in northwest Austin.

This two day, multi track conference is a perfect opportunity for users, developers, business analysts and enthusiasts from the Southwest to amplify Postgres and participate with the Postgres community.

The Call for Papers for PGConf Austin is now open, and presentation proposals can be submitted at https://pgconf.us/conferences/Austin2017

The call for papers for PGConf Austin will be open until Sunday, August 20, 2017, and speakers will be notified of acceptance/decline no later than Monday, August 28, 2017. Registration for PGConf Austin will also open August 28.

Conference Schedule:

Friday, November 3, 2017: Half-day Trainings with Postgres experts
Saturday, November 4, 2017: Breakout Sessions (lunch and coffee break provided)

Conference speakers receive complimentary entry to the breakout sessions on November 4 -- the half-day trainings are separately priced sessions. As a nonprofit event series, funding is currently not available for speaker travel and lodging accommodations. Everyone is encouraged to submit a talk or training to the first PGConf supported nonprofit educational events in Texas, and join the growing PostgreSQL community.

Sponsorship Opportunities

PGConf Local series is supported by its generous sponsors, including Diamond Sponsor Amazon Web Services and Platinum Sponsors 2ndQuadrant, Compose, and OpenSCG, as well as Media Sponsor AmplifyPostgres. Business leaders and companies interested in sponsoring PGConf US nonprofit events can view the Conference Prospectus .

About PGConf US:

PGConf US is a nonprofit conference series with a focus on growing the Postgres community through increased awa

[...]

Where is the Postgres community?

Posted by Joshua Drake in CommandPrompt on 2017-06-27 at 18:07

There was a poll recently on @amplifypostgres to determine where that Postgres community should have its interactive communication. The options were Google Hangouts, Slack, Reddit or “Other”. The results were not surprising with Google Hangouts beating Slack with 157 votes cast. There was also notable mentions of IRC, and Gitter. There were a couple of long time Postgresql.Org members that asked the inevitable, “What is wrong with IRC?” . Of course there is nothing wrong with IRC but most users when you tell them to use IRC will say, “IRWhat?” which is either a sign of disdain or ignorance depending on the user.

The problem and what is driving this post was an additional comment made by a long time community member that we need to educate the users because the community (Postgresql) is on IRC and mailing lists.

That thinking ignores the reality that many community silos that have been built up over the years. If IRC and mailing lists were sufficient for the needs of the community, these silos wouldn’t thrive. They are active and serve the needs of many communities that are linked but not directly integrated into Postgresql.Org.

Silo	Members
@amplifypostgres	14000+
G+ PostgreSQL	9100+
r/postgresql	5100+
Facebook Postgresql Server	4500+
Facebook PostgreSQL в России	2500+
Slack Postgres	1100+
Notable Mention
StackOverflow #postgresql	70,000+ Questions

That doesn’t count the many other Postgres communities. The NYCPUG has over 2300 users and I guarantee you that at most a small two digit percentage participate on “The mailing lists” or “IRC”. Let us also not forget some of the largest communities in the world such as Japan and Asia. Very few of them participate on the .Org lists, that doesn’t mean they aren’t part of the community.

If as a community our goal is to not only build software but also to build people then we have to let go of our old man, get off my lawn tendencies and embrace new forms of collaboration. I am a bonafide master of “Good lord, why do

[...]

replace_empty_string

Posted by Pavel Stehule on 2017-06-27 at 09:24

Last half of year I am working on migration relative big application from Oracle to Postgres. This application is based on intensive usage of stored procedures, triggers, views. The base tools are ora2pg and plpgsql_check. Many thanks to Gilles Darold for his work on ora2pg. Half year ago this tool has almost zero support for PL/SQL - and now it is able to translate 90% of big code base of old PL/SQL code to PLpgSQL. There was lot of issues, but often was fixed to next day. Thank you.

Some tools I had to write too. I have some points for Orafce. Last tool what I wrote for this project is replace_empty_string extension. Oracle doesn't save empty strings - it does translation to NULL implicitly. To ensure similar behave I wrote generic trigger, that any empty string replaces by NULL. Default is quite behave, but warning (when string is empty string) is possible.

Example:

CREATE EXTENSION replace_empty_string;

CREATE TABLE res (
 id  int4,
 idesc  text,
 test1  varchar,
 test2  text
);

CREATE TRIGGER res_replace_empty_string
 BEFORE UPDATE OR INSERT ON res
 FOR EACH ROW
 EXECUTE PROCEDURE replace_empty_string ();

INSERT INTO res VALUES (1, 'first', NULL, '');
INSERT INTO res VALUES (2, NULL, '', 'Hello');

\pset null ****

SELECT * FROM res;
 id | idesc | test1 | test2 
----+-------+-------+-------
  1 | first | ****  | ****
  2 | ****  | ****  | Hello
(2 rows)

UPDATE res SET idesc = ''
 WHERE id = 1;

SELECT * FROM res;
 id | idesc | test1 | test2 
----+-------+-------+-------
  2 | ****  | ****  | Hello
  1 | ****  | ****  | ****
(2 rows)

A quick check on Postgres 10 Beta Analytics Performance vs version 9.6

Posted by Kaarel Moppel in Cybertec on 2017-06-26 at 08:50

Some weeks ago already the Beta 1 of upcoming Postgres 10 was released. Besides the very visible version numbering scheme change (no more X.Y.Z ) the release notes promised amongst other very cool and longawaited features – like builtin Logical Replication and Table Partitioning – also many performance improvements, especially in the area of analytics. […]

The post A quick check on Postgres 10 Beta Analytics Performance vs version 9.6 appeared first on Cybertec - The PostgreSQL Database Company.

pgAdmin3 LTS Now Supports PostgreSQL 10

Posted by Holly Orr in OpenSCG on 2017-06-22 at 18:15

We are pleased to announce pgAdmin3 LTS is now available on OSX and Windows for PostgreSQL 10!

As v1.0 of pgAdmin4 was released in September of 2016, the pgAdmin Development Team decided to no longer support pgAdmin III with any more fixes. We believe that pgAdmin3 usage should fade out over time and have forked the code to include basic support for PostgreSQL 9.6+.

Try it out!

Install pgAdmin3 LTS by BigSQL

If you haven’t already installed PostgreSQL, start here.

Via Command Line

Windows users don’t prefix the pgc command with ./ as shown in the following commands:

cd <directory of installation>
./pgc install pgadmin3
./pgc start pgadmin3

Via GUI

If you haven’t already, install pgDevOps.

In pgDevOps, navigate to Package Manager, select the pgAdmin3 component, and install.

Already have pgAdmin3 installed?

cd <directory of installation>
./pgc update
./pgc upgrade pgadmin3

Wanna delete pgAdmin3?

cd <directory of installation>
./pgc remove pgadmin3

As always, give us your feedback so that we can continue to support and respond to the needs of the PostgreSQL community.

explain.depesz.com, paste.depesz.com – TOR announcement and some updates

Posted by Hubert 'depesz' Lubaczewski on 2017-06-22 at 16:23

I assume that you're familiar with explain.depesz.com and paste.depesz.com services. From today they are also available as services in TOR, using respectively: explain4pg4j5wbw.onion and paste4sql64vzyry.onion (of course you need to have TOR Browser or some other way to browse darknet). Why I did that? Well, I believe that TOR is one of the greatest inventions […]

Odd Month Arithmetic

Posted by Bruce Momjian in EnterpriseDB on 2017-06-21 at 16:15

You might be aware of the interval data type, which allows mathematical operations on date, time, and timestamp values. This can lead to odd behavior, but this email posting showed me a new oddity when dealing with months containing a different number of days. First, let's summarize how many days are in the first five months of 2017:

2017-01     31
2017-02     28
2017-03     31
2017-04     30
2017-05     31

PGConf Local: Philly 2017 registration open and schedule published

Posted by Joshua Drake on 2017-06-20 at 16:29

The third PGConf US event of the year is happening July 13 - 14th! Tickets are now available!

https://www.pgconf.us/conferences/Philly2017

This year the event is at Huntsman Hall located at The Wharton School. There will be two days of awesome Postgres content. On July 13th we are pleased to offer two great training sessions by two of the most veteran members of the PostgreSQL community, Bruce Momjian and Joshua Drake. Between them they have well over 40 years of hands-on experience with PostgreSQL.

On July 14th we have eight first-rate sessions covering everything from Postgres performance to development. Check out the full schedule here:

https://www.pgconf.us/conferences/Philly2017/schedule/events

Sponsorship opportunities for the conference are still available. If you use Postgres, support Postgres, or if you are looking to recruit a local Postgres expert, this is a great opportunity. Become a part of our dynamic and growing ecosystem! The prospectus is available at:

https://www.pgconf.us/conferences/Philly2017/sponsors

We would not be able to produce PGConf Local: Philly 2017 without the generous support from our sponsors:

Diamond: Amazon Web Services
Platinum: Compose, OpenSCG
Supporter: Azavea, Command Prompt
Venue: Wharton
Media: @amplifypostgres

For more information about PGConf US please visit: PGConf US

PGConf US: People, Postgres, Data

Using Zabbix with PostgreSQL as the database backend

Posted by Rafael Martinez Guerrero on 2017-06-19 at 19:22

Using Zabbix with PostgreSQL as the database backend rafaelma Mon, 19/06/2017 - 21:22

GDAL New Release 2.2 – BigSQL Makes it Easy to Install

Posted by Holly Orr in OpenSCG on 2017-06-19 at 16:44

The GDAL community just announced the release of GDAL 2.2… and BigSQL just made it easier to implement!

Install it Now

We support Diegos and Yodas with 2 ways to install:

Option 1: Install with the pgDevOps GUI

For GIS analysts (scientists, statisticians, analysts, etc) who find themselves as the de facto DBA (among other things).

“I don’t have time for this! I’m saving this sea turtle. Just give me a GUI!”

1. Go to www.bigsql.org/postgresql/installers and download and run the installer for your operating system.

2. Make sure to check the box for including the pgDevOps component with your install.

3. Open pgDevOps in your web browser:

    <ip of machine>:8051

If installing on local machine:

    localhost:8051

4. In the pgDevOps dashboard click the Package Manager icon.

5. Click the PostgreSQL installation you want to install on.

6. Click on the PostGIS icon and follow the instructions to install.

That’s it!

Option 2: Install via Command Line and PGC

For DBAs who find themselves supporting these strange spatially enabled databases.

“Real men only use command line! And what the hell is GIS?”

Install pgc via command line to a sandbox.
*Note: this command will install the BigSQL distribution into the directory you are currently located.

MAC / Linux:

python -c "$(curl -fsSL http://s3.amazonaws.com/pgcentral/install.py)"

Windows:

@powershell -NoProfile -ExecutionPolicy unrestricted -Command "iex ((new-object net.webclient).DownloadString('http://s3.amazonaws.com/pgcentral/install.ps1'))"

Get into the product home directory by navigating via command line to the bigsql directory. Windows users don’t prefix the pgc command with ./ as shown in the below examples:
```
cd bigsql
```
Using the pgc command line tool, run the update command to get the lastest distribution:
```
./pgc update
```
Then run the install, init, and start commands on pg96:
```
./pgc install pg96 
./pgc start pg96
```
Next, install postgis:
```
./pgc install postgis23-pg96
```

That’s it!

The GDAL_DATA Directory Path

In order for all your Po

[...]

Use WITH TIME ZONE

Posted by Bruce Momjian in EnterpriseDB on 2017-06-19 at 14:45

If you often use the TIMESTAMP data type, you might not be making full use of it. In these queries:

CREATE TABLE tztest (x TIMESTAMP);
 
INSERT INTO tztest VALUES (CURRENT_TIMESTAMP);
 
SELECT * FROM tztest;
             x
----------------------------
 2016-10-25 18:49:20.220891
 
SHOW timezone;
  TimeZone
------------
 US/Eastern
 
SET timezone = 'Asia/Tokyo';
 
SELECT * FROM tztest;
             x
----------------------------
 2016-10-25 18:49:20.220891

SQL and Business Logic

Posted by Dimitri Fontaine on 2017-06-19 at 11:30

Business logic is supposed to be the part of the application where you deal with customer or user facing decisions and computations. It is often argued that this part should be well separated from the rest of the technical infrastructure of your code. Of course, SQL and relational database design is meant to support your business cases (or user stories), so then we can ask ourselves if SQL should be part of your business logic implementation. Or actually, how much of your business logic should be SQL?

pg_chameleon 1.3 out

Posted by Federico Campoli on 2017-06-19 at 09:00

After another round of bug fixes I've released the version 1.3 of my pet project pg_chameleon.

The package is available on pypi as usual.

The changelog is available here.

If you have any question/issue to discuss I created a community on gitter.

Please join!

I've also added a RELASE_NOTES file to explain the changes. As this release changes the relay storage and you plan to upgrade please have look to the file.

The initial implementation for the relay data was to have two log tables t_log_replica_1 and t_log_replica_2 with the replica process accessing one table at a time.

This approach allows autovacuum to take care of the unused partition meanwhile the other is written.

The method worked fine with only one replica worker. However as the flip flop between the tables is calculated indepentently for each source this could cause unwanted bloat on the log tables if several sources are replicating all togheter.

In this scenario autovacuum will struggle to truncate the empty space in the table's end.

The pg_chameleon version 1.3 implements the log tables per source. Each source have a dedicated couple of tables still inherited from
the root partition t_log_replica.

The schema upgrade happen automatically at the first run.
Because this one involves a data copy it could take more time than the usual. If the process seems frozen do not stop it otherwise you may lose your replica setup .

Those are the upgrade steps you should follow.

Stop all the replica sources. The show_status command must show all the rows in stopped status
Take a backup of the schema sch_chameleon with pg_dump
Upgrade pg_chameleon with pip install pg_chameleon --upgrade
Run chameleon.py upgrade_schema --config <your_config> --debug
When the upgrade is finished start the replica process as usual

If something goes wrong in the upgrade process you shall restore the sch_chameleon’s backup, then you should downgrade the installation to pg_chameleon 1.2 and start the replica as usual.

Dollar-quoting for escaping single quotes

Posted by Leo Hsu and Regina Obe on 2017-06-17 at 23:54

PostgreSQL has a feature called dollar-quoting, which allows you to include a body of text without escaping the single quotes. This feature has existed for quite some time. You've probably seen this in action when defining functions for example:

CREATE OR REPLACE FUNCTION hello_world(param_your_name text)
RETURNS text AS
$$
SELECT 'Hello world. My name is ' || param_your_name || '.';
$$
language sql STRICT;

Which is easier to read, than the equivalent escape quoted function:


CREATE OR REPLACE FUNCTION hello_world(param_your_name text)
RETURNS text AS
'
SELECT ''Hello world. My name is '' || param_your_name || ''.'';
'
language sql STRICT;

Continue reading "Dollar-quoting for escaping single quotes"

Amazing pgug Estonia

Posted by Federico Campoli on 2017-06-17 at 08:57

Last 30th of May I had the honor to talk at the second Estonia PostgreSQL User Group.

My talk was about my pet project pg_chameleon and it went very well.

The audience was about 40 people highly interested in the topic and the questions were all around.

I'm impressed by the Estonian PostgreSQL User Group.
Their organizational skills are amazing, the members interest is very high and they are fantastic friends.

If you are curious about the talk there is the recording on EEPUG's youtube channel.

PL/Java Adoption

Posted by Bruce Momjian in EnterpriseDB on 2017-06-16 at 15:30

PL/Java has been around since 2005, but it has regularly struggled to gain users. Unfortunately, a lot of these problems are specific to the Java language and hamper its adoption.

First, there are a limited number of people who know both Java and the Postgres backend code. Fortunately Chapman Flack has recently resumed PL/Java development.

Second, there is the deployment complexity of binding PL/Java to a Java Runtime Environment (JRE) in an easily-deployed way. This 2014 email thread discusses the problems of JRE packaging and installation.

PG Phriday: Postgres on Tap

Posted by Shaun M. Thomas in 2ndQuadrant on 2017-06-16 at 15:00

Testing within the context of a database can sometimes seem a bit overwhelming. Not only is it necessary to create a reasonable set of test data to bootstrap multiple tables, but we need some reliable invocation system to compare query, view, and function output to expected results. And that is no small feat. For a very long time, the "easiest" way to test end-user database objects involved building a Makefile that included PGXS. Then we had to create the tests, run the tests, and copy over the output into static expected files. Modifying any test always meant updating the static output file as well.

It’s great that the tools are there, yet why do we need to be halfway to a Postgres developer in order follow good QA practices with Postgres? Well, maybe we don’t. It turns out, like many things in the Postgres world, that there’s an extension for that. These days, all the cool kids are using pgTAP. Not only does pgTAP make it easier to write tests for Postgres, but its design generally encourages test-driven development. Regardless of how any of us might view that development paradigm, having the availability makes Postgres a more welcoming platform for those who fully embrace it.

To that end, let’s say we have our trusty sensor_log table and we want to write a get/set function pair to control and access its contents. As always, the sensor_log table looks like this:

CREATE TABLE sensor_log (
  sensor_log_id  SERIAL PRIMARY KEY,
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
);

CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);

We also need to install pgTAP itself. That part at least, is as easy as the usual extension install. For RHEL and Debian system users, it’s not even necessary to build anything, since pgTAP is in the official Postgres YUM and APT repositories. After that, it’s just a quick extension install:

CREATE EXTENSION pgtap;

In the spirit of TDD, we should write the

[...]

PostgreSQL on ZFS with BPF tracing on top.

Posted by Vasilis Ventirozos in OmniTI on 2017-06-16 at 12:01

At OmniTI we love solaris, my personal favourite features are ZFS and DTrace. Unfortunately not many run postgres on solaris so i have decided to implement similar features in linux. Instead of Dtrace i'll install BPF, in-kernel bytecode that can be used for tracing introduced in recent kernels (4.X).
This post will be a part of a three series post. In this post we'll start with setup, in part #2 with ZFS and how to use it for backups / snapshots. In part #3 we'll dig into BPF a bit more.

Step 1 is to setup a new ubuntu. I setup a VM using ubuntu-16.04.2-server-amd64.iso.

As root :
Add the repo for bcc :
> echo "deb [trusted=yes] https://repo.iovisor.org/apt/xenial xenial-nightly main" | sudo tee /etc/apt/sources.list.d/iovisor.list

sudo apt-get update

Install all necessary and some optional packages :

> apt-get install -y sudo wget apt-transport-https joe less build-essential libreadline-dev \
zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev openssh-server screen git \
binutils bcc bcc-tools libbcc-examples python-bcc zfsutils-linux \
systemtap systemtap-sdt-dev

Make sure kernel is properly compiled :

> cat /boot/config-`uname -r` |grep BPF
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
CONFIG_HAVE_BPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_TEST_BPF=m

Test BCC (stands for BPF Compiler Collection)

> python /usr/share/bcc/examples/tracing/bitehist.py
Tracing... Hit Ctrl-C to end.
^C
kbytes : count distribution
0 -> 1 : 7 |************ |
2 -> 3 : 0 | |
4 -> 7 : 22 |****************************************|
8 -> 15 : 19 |********************************** |
16 -> 31 : 8 |************** |
32 -> 63 : 6 |********** |
64 -> 127 : 1 |*[...]

What's New in SQL:2016

Posted by Markus Winand on 2017-06-15 at 00:00

In December 2016, ISO released a new version of the international SQL standard (ISO/IEC 9075:2016). It supersedes the previous version from 2011.

This article is a brief overview of the new features being introduced into the SQL language. Strictly speaking, this article covers the additions to part 2 of the standard (SQL/Foundation), i.e. the most commonly used part.

This article also shows the availability of these features among six major databases. Note that respective figures—shown below—only reflect whether or not the databases support the features in the way described by the standard. For example, an X in the JSON row does not mean the database has no JSON support—it simply means the database doesn't support the JSON features described by the SQL standard. As a matter of fact, all tested databases support JSON in some way—but not necessarily in the way described by the standard.⁰

Overview of Availability of new SQL:2016 features

Before going through all the shiny new features, let's look at some trivia: part 2 of the SQL:2016 standard has 1732 pages—that's 260 pages (~18%) than the 2011 edition. It introduces 44 new optional features (+14%). Let's take a look at them…

Row Pattern Recognition

Row pattern recognition captures groups of rows that follow a pattern. Rows in matched groups can be filtered, grouped, and aggregated. The pattern itself is described with a simple regular expression syntax.

The main use of row pattern recognition is to check time series for patterns. However, the match_recongize clause combines aspects of the where, group by, having and over clauses (window functions) so it is also useful in many other cases.

I have given a presentation about row pattern recognition. It discusses several examples in two implementation variants: with and without the new match_recongize clause. The examples cover some typical use cases, and also some atypical use cases for row pattern recognition:

Consecutive events: identifying sessions in a web-log; tolerating gaps in sequences
Top-N per group (might be faster than window functions!)
Time in

[...]

When To Use Server-Side Logic

Posted by Bruce Momjian in EnterpriseDB on 2017-06-14 at 14:30

The use of server-side logic, particularly stored procedures, has been a highly contentious topic among database professionals for decades. The question has always been what amount of logic should be encoded in the database vs. in client applications or application servers.

Beyond using the database as a simple data store, there are three levels in which logic can be added to Postgres:

Database constraints, e.g. unique, check, foreign keys
Triggers
Manually-called stored procedures, i.e., functions not called as triggers

Let's look at each of these in turn:

Database constraints are often seen as optional by database application developers. They don't really do anything except prevent invalid data from being entered into the database. The overhead of performing these checks is often seen as negative, particularly foreign key checks. (DEFAULT clauses that call stored procedures are similar to triggers; the serial data type uses this.)
Triggers are stored procedures assigned to tables that are executed automatically during INSERT, UPDATE, DELETE, or TRUNCATE commands. They can be triggered before the command runs, usually for complex constraint checks or to modify incoming data, e.g. capitalization. When triggered after the command, they perform post-command operations, such as adding entries to a log table.
Manually-called stored procedures are functions called usually in WHERE clauses or in the target list of SELECT queries. Stored procedures, including ones used as triggers, can be written in many languages.

Installing pgBackRest on OSX

Posted by Douglas Hunley in OpenSCG on 2017-06-14 at 12:40

If you’ve followed my previous posts (here and here), then you already have one or more versions of PostgreSQL installed on your Mac. Maybe these are solely for test or dev purposes and you don’t really care about any of the data therein, but if you do, let me guide you to pgBackRest.

pgBackRest aims to be a simple, reliable backup and restore system that can seamlessly scale up to the largest databases and workloads.

Instead of relying on traditional backup tools like tar and rsync, pgBackRest implements all backup features internally and uses a custom protocol for communicating with remote systems. Removing reliance on tar and rsync allows for better solutions to database-specific backup challenges. The custom remote protocol allows for more flexibility and limits the types of connections that are required to perform a backup which increases security.

pgBackRest is written in Perl, but don’t hold that against it. As of the 1.19 release, pgBackRest can now use S3 buckets as the storage backend. I really like pgBackRest and tend to use it for myself and customers over any of the other tools in the PostgreSQL ecosphere. So, let’s get started by downloading the latest release from their site, and then installing it. For some reason, no one has added pgBackRest to Homebrew yet (someone, pls!) so let’s do it the manual way:

true

(Keep in mind that I already had Perl setup to connect to PostgreSQL for other uses. You might need to install DBD::Pg.)

Now that pgBackRest is installed, let’s configure it. First, we’ll want to set some of the global properties that affect all pgBackRest operations:

true

As you can see, we set the following:

force the log level for all console output to ‘info’
define the S3 bucket we want to use
define the S3 endpoint to connect to
define our S3 key
define our S3 secret key
set which region our bucket is in
tell pgBackRest that we’re using S3 as the backend
configure retention of full backups
tell pgBackRest to issue a CHECKPOINT so backups can start right away instead of waiting for

[...]