Using psql Variables to Introspect Your Script

Using psql Variables to Introspect Your Script

November 19, 2023

Some good ideas from Lucas’ article

psql support internal variables similar to the variables you can find in a shell.

Use Case: Quantitative Data about a Transaction

I want to run a long transaction that does some data manipulation and transformation, and I want to get an idea about how much it is going to cost me such a transaction, so that I can estimate how to apply the same transformation in production. Usually, I begin the transaction having a look at the current time and WAL position, and I do the same at the end of the transaction. Doing the difference between the values provides me an hint about the wall clock time and the amount of data (assuming no other activity is going on the database). As an example:

testdb=> BEGIN;
BEGIN
testdb=*> SELECT clock_timestamp() AS begin_clock
testdb-*> , pg_current_wal_lsn() AS begin_lsn;
         begin_clock          | begin_lsn
------------------------------+------------
 2023-09-29 10:32:05.51654+02 | 2/A39CC3C0
(1 row)

testdb=*> INSERT INTO t( t )
testdb-*> SELECT 'Dummy ' || v
testdb-*> FROM generate_series( 1, 1000000 ) v;
INSERT 0 1000000
testdb=*> SELECT clock_timestamp() AS end_clock
, pg_current_wal_lsn() AS end_lsn;
           end_clock           |  end_lsn
-------------------------------+------------
 2023-09-29 10:32:48.511892+02 | 2/A81AC000
(1 row)

testdb=*> COMMIT;

Now that I have the times and WAL Isn positions, I can manually compute the cost of this transaction by copying and pasting the results:

testdb=> SELECT '2023-09-29 10:32:48.511892+02'::timestamp
         - '2023-09-29 10:32:05.51654+02'::timestamp AS wall_clock
        , pg_size_pretty( pg_wal_lsn_diff( '2/A81AC000', '2/A39CC3C0' ) ) as size;
   wall_clock    | size
-----------------+-------
 00:00:42.995352 | 72 MB

Using psql Variables to Obtain the Computation Automatically

If I store the begin and end values into psql variables, I can use an immutable query to compute the same results, without having to copy and paste the single values. This trick is made possible by the special command \gset, that allows for the declaration and definition of variables out of a query result.

testdb=> BEGIN;
BEGIN

testdb=*> SELECT clock_timestamp() AS clock
, pg_current_wal_lsn() AS lsn \gset begin_

testdb=*> INSERT INTO t( t )
SELECT 'Dummy ' || v
FROM generate_series( 1, 1000000 ) v;
INSERT 0 1000000

testdb=*> SELECT clock_timestamp() AS clock
, pg_current_wal_lsn() AS lsn \gset end_

testdb=*> SELECT :'end_clock'::timestamp - :'begin_clock'::timestamp as wall_clock
, pg_size_pretty( pg_wal_lsn_diff( :'end_lsn', :'begin_lsn' ) ) as size;
   wall_clock    | size
-----------------+-------
 00:00:11.400421 | 72 MB


testdb=*> COMMIT;
COMMIT

Now you can see how you can use \gset to create variables and then use the results of those variables. I could see using this to get the timing of my scripts and the amount of data that was inserted for example.