APPROX_COUNT_DISTINCT – fast count distinct values Oracle Database 12C release 1 (12.1)

APPROX_COUNT_DISTINCT is new function available since version 12.1.0.2. It returns approximate number of rows that contain distinct values of expr.

APPROX_COUNT_DISTINCT(expr)

It’s alternative to the COUNT (DISTINCT expr) function, which returns the exact number of rows that contain distinct values of expr.

For processing large amounts of data it’s significantly faster than COUNT, with negligible deviation from the exact result.

Example

Simple table TEST_TBL is created with 32 million of rows to test function APPROX_COUNT_DISTINCT.

CREATE TABLE test_tbl
(
  id1 NUMBER,
  id2 NUMBER,
  id3 NUMBER
) NOLOGGING;


INSERT /*+ APPEND */ INTO test_tbl
SELECT mod(LEVEL, 3000), mod(LEVEL, 1000), mod(LEVEL, 100)
FROM dual
CONNECT BY LEVEL <= 1000000;

COMMIT;

BEGIN
  FOR i IN 1..5
  loop
    INSERT /*+ APPEND */ INTO test_tbl
    SELECT * FROM test_tbl;

    COMMIT;
  END loop;
END;
/

SELECT COUNT(*) FROM test_tbl;

  COUNT(*)
----------
  32000000

Now simple performance test. It’s significant difference in time however results are very precise.

set timing on

SELECT 
  COUNT(DISTINCT id1) cd1, 
  COUNT(DISTINCT id2) cd2, 
  COUNT(DISTINCT id3) cd3
FROM test_tbl; 

       CD1        CD2        CD3
---------- ---------- ----------
      3000       1000        100 

Elapsed: 00:00:35.298

SELECT 
  APPROX_COUNT_DISTINCT(id1) acd1, 
  APPROX_COUNT_DISTINCT(id2) acd2, 
  APPROX_COUNT_DISTINCT(id3) acd3
FROM test_tbl;  

      ACD1       ACD2       ACD3
---------- ---------- ----------
      2962       1012        100 

Elapsed: 00:00:04.468

I find this function very useful especially in warehouses 🙂 when number of rows in tables are big.

Have a fun 🙂
Tomasz

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.