**APPROX_COUNT_DISTINCT** is new function available since version 12.1.0.2. It returns approximate number of rows that contain **distinct** values of *expr.*

APPROX_COUNT_DISTINCT(expr)

It’s alternative to the **COUNT (DISTINCT** *expr***)** function, which returns the exact number of rows that contain **distinct** values of expr.

For processing large amounts of data it’s significantly **faster** than **COUNT**, with negligible deviation from the exact result.

Example

Simple table TEST_TBL is created with 32 million of rows to test function **APPROX_COUNT_DISTINCT**.

CREATE TABLE test_tbl ( id1 NUMBER, id2 NUMBER, id3 NUMBER ) NOLOGGING; INSERT /*+ APPEND */ INTO test_tbl SELECT mod(LEVEL, 3000), mod(LEVEL, 1000), mod(LEVEL, 100) FROM dual CONNECT BY LEVEL <= 1000000; COMMIT; BEGIN FOR i IN 1..5 loop INSERT /*+ APPEND */ INTO test_tbl SELECT * FROM test_tbl; COMMIT; END loop; END; / SELECT COUNT(*) FROM test_tbl; COUNT(*) ----------32000000

Now simple performance test. It’s significant difference in time however results are very precise.

set timing on SELECT COUNT(DISTINCT id1) cd1, COUNT(DISTINCT id2) cd2, COUNT(DISTINCT id3) cd3 FROM test_tbl; CD1 CD2 CD3 ---------- ---------- ---------- 3000 1000 100 Elapsed:00:00:35.298SELECTAPPROX_COUNT_DISTINCT(id1) acd1,APPROX_COUNT_DISTINCT(id2) acd2,APPROX_COUNT_DISTINCT(id3) acd3 FROM test_tbl; ACD1 ACD2 ACD3 ---------- ---------- ---------- 2962 1012 100 Elapsed:00:00:04.468

I find this function very useful especially in warehouses 🙂 when number of rows in tables are big.

Have a fun 🙂

Tomasz