Netezza — Avoid count(*)!

Cristián Canivell Gutiérrez
2 min readMay 13, 2021

What a way to start my first Medium article…

Well, that might not be the biggest problem for some right? Running count(*) over and over again to check how many rows were loaded.

But when your production environment contains 100s of thousands of tables, you might want to speed up the process. You can’t waste resources and time is gold!

Here is when you realise that you got your ETL users into the best practice of running statistics after changing them significantly (some say 5–10% changed volume is the sweet spot). Or you as a DBA, thought that you could schedule the nz_genstats script from Mark Fraase after the ETL to collect statistics in a smart way (e.g not run it on tables that haven’t changed , or other conditions).

Anyways, let’s get straight into it.. here is the SQL you need to make the magic happen (make sure to connect to the database you are interested!).

\c DEMOSELECT current_catalog, tablename, reltuples 
FROM _v_table
WHERE lower(tablename) not like '\_t\_%'
and lower(tablename) not like '\_vt\_%';
---------Results-------DEMO|TEST1|1000000
DEMO|TEST2|5000000
DEMO|TEST3|8000000
DEMO|TEST4|9500000

Now.. this has some caveats… your stats need to be recent, otherwise you might get that these estimates are not perfect.

The value of the row count estimate - reltuples - is going up when data is inserted but is NOT going down when data is deleted!

I hope this was of interest, running this SQL reduced the counting process from hours into seconds.

Another reason for having enforcing good stats…

PS: Thanks to Mark Fraase for warning me to use the _v_table to make life easier :)

--

--