Using psql Variables to Introspect Your Script
Some good ideas from Lucas’ article
psql
support internal variables similar to the variables you can find in a
shell.
Use Case: Quantitative Data about a Transaction
I want to run a long transaction that does some data manipulation and transformation, and I want to get an idea about how much it is going to cost me such a transaction, so that I can estimate how to apply the same transformation in production. Usually, I begin the transaction having a look at the current time and WAL position, and I do the same at the end of the transaction. Doing the difference between the values provides me an hint about the wall clock time and the amount of data (assuming no other activity is going on the database). As an example:
testdb=> BEGIN;
BEGIN
testdb=*> SELECT clock_timestamp() AS begin_clock
testdb-*> , pg_current_wal_lsn() AS begin_lsn;
begin_clock | begin_lsn
------------------------------+------------
2023-09-29 10:32:05.51654+02 | 2/A39CC3C0
(1 row)
testdb=*> INSERT INTO t( t )
testdb-*> SELECT 'Dummy ' || v
testdb-*> FROM generate_series( 1, 1000000 ) v;
INSERT 0 1000000
testdb=*> SELECT clock_timestamp() AS end_clock
, pg_current_wal_lsn() AS end_lsn;
end_clock | end_lsn
-------------------------------+------------
2023-09-29 10:32:48.511892+02 | 2/A81AC000
(1 row)
testdb=*> COMMIT;
Now that I have the times and WAL Isn positions, I can manually compute the cost of this transaction by copying and pasting the results:
testdb=> SELECT '2023-09-29 10:32:48.511892+02'::timestamp
- '2023-09-29 10:32:05.51654+02'::timestamp AS wall_clock
, pg_size_pretty( pg_wal_lsn_diff( '2/A81AC000', '2/A39CC3C0' ) ) as size;
wall_clock | size
-----------------+-------
00:00:42.995352 | 72 MB
Using psql Variables to Obtain the Computation Automatically
If I store the begin and end values into psql
variables, I can use an
immutable query to compute the same results, without having to copy and paste
the single values.
This trick is made possible by the special command \gset
, that allows for the
declaration and definition of variables out of a query result.
testdb=> BEGIN;
BEGIN
testdb=*> SELECT clock_timestamp() AS clock
, pg_current_wal_lsn() AS lsn \gset begin_
testdb=*> INSERT INTO t( t )
SELECT 'Dummy ' || v
FROM generate_series( 1, 1000000 ) v;
INSERT 0 1000000
testdb=*> SELECT clock_timestamp() AS clock
, pg_current_wal_lsn() AS lsn \gset end_
testdb=*> SELECT :'end_clock'::timestamp - :'begin_clock'::timestamp as wall_clock
, pg_size_pretty( pg_wal_lsn_diff( :'end_lsn', :'begin_lsn' ) ) as size;
wall_clock | size
-----------------+-------
00:00:11.400421 | 72 MB
testdb=*> COMMIT;
COMMIT
Now you can see how you can use \gset
to create variables and then use the
results of those variables.
I could see using this to get the timing of my scripts and the amount of data
that was inserted for example.