DBMS_PARALLEL_EXECUTE

If you are developer no doubt you have situation that you need to run your code in parallel. I’m not talking about running DML, DDL in parallel – read following article to understand how to do it effectively – Effective parallel execution for DML, DDL in Oracle.

I’m talking about running many independent pieces of your code – SQL, PL/SQL. Everybody knows it can be done using DBMS_SCHEDULER package, however it requires a lot of steps to be done. Define programs, jobs, chains, schedules.

To simplify this process Oracle has introduced very nice package that can help you do it very quickly DBMS_PARALLEL_EXECUTE.

Following picture describes general concept of this package

chunks_1

On the beginning we have “cloud” of independent code to execute:

  • 4 SQL
  • 3 PL/SQL

as next step we split “cloud” code into 4 chunks. You can treat chunk as separate workloads.

  • Chunk 1
  • Chunk 2
  • Chunk 3
  • Chunk 4

and as last step we decide about number of parallel jobs that will run chunks. Oracle will create jobs using DBMS_SCHEDULER and randomly distribute chunks to each job.

  • Job 1 – executes in serial Chunk 1, 2
  • Job 2 – executes in serial Chunk 3, 4

Jobs 1 and 2 will be controlled by master job when flow is started. The flow is called in Oracle documentation as “Task”.

Test data

First test data must be prepared to show functionality of the package

Table CLOUD_OF_CODE that will store our code to execute

create table cloud_of_code
(
  code_id   number,
  code_type varchar2(10),
  code      clob
);

insert into cloud_of_code values 
(1, 'SQL', 'select count(*) from user_tables');

insert into cloud_of_code values 
(2, 'SQL', 'select count(*) from user_tables');

insert into cloud_of_code values 
(3, 'PLSQL', 'begin null; end;');

insert into cloud_of_code values 
(4, 'PLSQL', 'begin null; end;');

commit;

exec dbms_stats.gather_table_stats( user, 'CLOUD_OF_CODE');

Log table LOG_DATA that will store logs about execution of “cloud” code

create table log_data
(
  task_name   varchar2(30),
  start_id    varchar2(30),
  end_id      varchar2(30),
  log_details clob
);

Task

Next step is to create flow – Task to execute. Four Tasks are created to show full functionality of the package.

  • TASK_CHUNKS_BY_COL
  • TASK_CHUNKS_BY_ROWID
  • TASK_CHUNKS_BY_SQL
  • TASK_TO_DROP
begin
  dbms_parallel_execute.create_task
  (
    task_name => 'TASK_CHUNKS_BY_COL',
    comment => 'Task with chunks generated by column'
  ); 
  
  dbms_parallel_execute.create_task
  (
    task_name => 'TASK_CHUNKS_BY_ROWID',
    comment => 'Task with chunks generated by rowid'
  ); 
  
  dbms_parallel_execute.create_task
  (
    task_name => 'TASK_CHUNKS_BY_SQL',
    comment => 'Task with chunks generated by sql'
  ); 

  dbms_parallel_execute.create_task
  (
    task_name => 'TASK_TO_DROP',
    comment => 'Task just to show how to drop it'
  ); 
end;
/

You can find information about created tasks in view

  • DBA|ALL|USER_PARALLEL_EXECUTE_TASKS
select
  task_name, chunk_type, status, table_owner, 
  table_name, number_column
from user_parallel_execute_tasks;

TASK_NAME             CHUNK_TYPE  STATUS   TABLE_NAME  NUMBER_COLUMN 
--------------------- ----------- -------- ----------- --------------
TASK_CHUNKS_BY_ROWID  UNDELARED   CREATED
TASK_CHUNKS_BY_COL    UNDELARED   CREATED
TASK_CHUNKS_BY_SQL    UNDELARED   CREATED
TASK_TO_DROP          UNDELARED   CREATED

Unique task name can be easily generated using function GENERATE_TASK_NAME

select dbms_parallel_execute.generate_task_name
from dual;

GENERATE_TASK_NAME
----------------------
TASK$_22

To drop task run following code

begin
  dbms_parallel_execute.drop_task
  ( task_name => 'TASK_TO_DROP' ); 
end;
/

Chunks

There are three methods to create chunks using DBMS_PARALLEL_PACKAGE

  • CREATE_CHUNKS_BY_NUMBER_COL

This procedure chunks the table (associated with the specified task) by the specified column. The specified column must be a NUMBER column. This procedure takes the MIN and MAX value of the column, and then divide the range evenly according to CHUNK_SIZE.

begin
  dbms_parallel_execute.create_chunks_by_number_col
  (
    task_name    => 'TASK_CHUNKS_BY_COL',
    table_owner  => user,
    table_name   => 'CLOUD_OF_CODE',
    table_column => 'CODE_ID',
    chunk_size   => 2
  ); 
end;
/
  • CREATE_CHUNKS_BY_ROWID

This procedure chunks the table (associated with the specified task) by ROWID. Table statistics NUM_ROWS and BLOCKS are approximate guidance for the size of each chunk. The table to be chunked must be a physical table with physical ROWID having views and table functions. Index Organized Tables are not allowed.

Parameter by_row

  • TRUE refers to NUM_ROWS statistic
  • FALSE refers to BLOCKS statistic

Following example chunks by ROWID based on NUM_ROWS statistic

select table_name, num_rows, blocks 
from user_tables
where table_name='CLOUD_OF_CODE';

TABLE_NAME     NUM_ROWS     BLOCKS
-------------- ---------- ----------
CLOUD_OF_CODE  4          5 

begin
  dbms_parallel_execute.create_chunks_by_rowid
  (
    task_name    => 'TASK_CHUNKS_BY_ROWID',
    table_owner  => user,
    table_name   => 'CLOUD_OF_CODE',
    by_row       => true,
    chunk_size   => 2
  ); 
end;
/
  •  CREATE_CHUNKS_BY_SQL

This procedure chunks the table (associated with the specified task) by means of a user-provided SELECT statement. The select statement which returns the range of each chunk must have two columns: START_ID and END_ID. If task is to chunk by ROWID, then the two columns must be of ROWID type. If the task is to chunk the table by NUMBER column, then the two columns must be of NUMBER type.

Following example chunks by NUMBER and SQL statement

--Query used to create chunks
select 1 start_id, 2 end_id from dual
union all
select 3 start_id, 4 end_id from dual;

  START_ID     END_ID
---------- ----------
         1          2 
         3          4
begin
  dbms_parallel_execute.create_chunks_by_sql
  (
    task_name => 'TASK_CHUNKS_BY_SQL',
    sql_stmt  => 'select 1 start_id, 2 end_id from dual
                  union all
                  select 3 start_id, 4 end_id from dual',
    by_rowid => false   ); 
end; 
/

Task definitions now looks following

select 
  task_name, chunk_type, status, 
  table_name, number_column
from user_parallel_execute_tasks;

TASK_NAME             CHUNK_TYPE   STATUS   TABLE_NAME     NUMBER_COLUMN 
--------------------- ------------ -------- -------------- --------------
TASK_CHUNKS_BY_COL    NUMBER_RANGE CHUNKED  CLOUD_OF_CODE  CODE_ID       
TASK_CHUNKS_BY_ROWID  ROWID_RANGE  CHUNKED  CLOUD_OF_CODE                
TASK_CHUNKS_BY_SQL    NUMBER_RANGE CHUNKED

Chunks details

select chunk_id, task_name, start_id, end_id
from user_parallel_execute_chunks
where task_name in ('TASK_CHUNKS_BY_COL', 'TASK_CHUNKS_BY_SQL')
order by 1,3;
  CHUNK_ID TASK_NAME             START_ID     END_ID
---------- ------------------- ---------- ----------
        93 TASK_CHUNKS_BY_COL           1          2 
        94 TASK_CHUNKS_BY_COL           3          4 
        95 TASK_CHUNKS_BY_SQL           1          2 
        96 TASK_CHUNKS_BY_SQL           3          4

select chunk_id, task_name, start_rowid, end_rowid
from user_parallel_execute_chunks
where task_name in ('TASK_CHUNKS_BY_ROWID')
order by 1,3;

CHUNK_ID TASK_NAME             START_ROWID         END_ROWID
-------- --------------------- ------------------- -------------------
      97 TASK_CHUNKS_BY_ROWID  AAAWnaAAJAAAAFwAAA  AAAWnaAAJAAAAF3CcP

To drop chunks you can run procedure DROP_CHUNKS

DBMS_PARALLEL_EXECUTE.DROP_CHUNKS 
(
   task_name       IN VARCHAR2
);

Wrapper code

It’s time for package DBMS_PARALLEL_EXECUTE to run our chunks. We must provide some SQL code with two placeholders:

  • start_id
  • end_id

like procedure

begin some_procedure(:start_id, :end_id) end;

or some dml

delete from some_table 
where some_column between :start_id and :end_id;

insert into some_table 
values(...,...,:start_id, :end_id);

update some_table 
where some_column between :start_id and :end_id;

Placeholders are automatically assigned when a given chunk is started and is NUMBER type for chunks NUMBER_RANGE or ROWID type for ROWID_RANGE.

So let’s prepare such SQL code. In this case it will be procedure WRAPPER_CODE. I have just added to wrapper extra column “p_task_name” to recognize if chunk code is based on ROWID or NUMBER. The wrapper is quite simple it reads in serial “cloud” code for range start_id-end_id of given chunks and logs it into log table.

create or replace procedure wrapper_code
(
  p_task_name varchar2,
  p_start_id  varchar2, 
  p_end_id    varchar2
)
is
  l_cur        sys_refcursor;
  l_sql        clob;
  l_code_id    cloud_of_code.code_id%type;
  l_code_type  cloud_of_code.code_type%type;
  l_code       cloud_of_code.code%type;
begin
  if p_task_name = 'TASK_CHUNKS_BY_ROWID' then
    l_sql := 'select code_id, code_type, code
                from cloud_of_code
               where rowid between '||p_start_id||' and '||p_end_id;
  else
    l_sql := 'select code_id, code_type, code
                from cloud_of_code
               where code_id between '||p_start_id||' and '||p_end_id;
  end if;

  open l_cur for l_sql;
  
  loop
    fetch l_cur into l_code_id, l_code_type, l_code;
    
    exit when l_cur%notfound;
    
    insert into log_data
    (
      task_name,
      start_id,
      end_id,
      log_details
    )  
    values
    (
      p_task_name, p_start_id, p_end_id, 
      l_code_id||chr(10)||
      l_code_type||chr(10)||
      l_code
    );
  end loop;
end;
/

Run code

Now it’s time to run task using DBMS_PARALLEL_EXECUTE.RUN_TASK. Parallel level is set to 2. So two DBMS_SCHEDULER jobs will be created.

declare
  l_sql    clob;
  l_run_id number;
begin
  l_sql := 'begin 
             wrapper_code(''TASK_CHUNKS_BY_COL'',
                          :start_id,
                          :end_id); 
            end;';

  dbms_parallel_execute.run_task
  (
    task_name      => 'TASK_CHUNKS_BY_COL',
    sql_stmt       => l_sql,
    language_flag  => dbms_sql.native,
    parallel_level => 2
  );
end;
/

Details for execution can be found in view

  • DBA|ALL|USER_PARALLEL_EXECUTE_CHUNKS

Here you can see that two separate jobs were scheduled.

select chunk_id, task_name, status, start_id, end_id, job_name
from user_parallel_execute_chunks
where task_name='TASK_CHUNKS_BY_COL';

CHUNK_ID TASK_NAME           STATUS     START_ID END_ID JOB_NAME
-------- ------------------- ---------- -------- ------ ------------
      93 TASK_CHUNKS_BY_COL  PROCESSED         1      2 TASK$_208_1
      94 TASK_CHUNKS_BY_COL  PROCESSED         3      4 TASK$_208_2

Details can be found as well in standard oracle scheduler job logs view DBA_SCHEDULER_JOB_RUN_DETAILS

select 
 job_name, status, error#, errors
from dba_scheduler_job_run_details
where job_name in ('TASK$_208_1', 'TASK$_208_2');

JOB_NAME     STATUS         ERROR# ERRORS 
------------ ---------- ---------- ----------------
TASK$_208_1  SUCCEEDED           0        
TASK$_208_2  SUCCEEDED           0

our procedure WRAPPER_CODE has also written details logs to own table LOG_DATA.

select * from log_data;

TASK_NAME          START_ID END_ID LOG_DETAILS                     
------------------ -------- ------ --------------------------------
TASK_CHUNKS_BY_COL 1        2      1                               
                                   SQL                             
                                   select count(*) from user_tables

TASK_CHUNKS_BY_COL 1        2      2                               
                                   SQL                             
                                   select count(*) from all_tables 

TASK_CHUNKS_BY_COL 3        4      3                               
                                   PLSQL                           
                                   begin null; end;                

TASK_CHUNKS_BY_COL 3        4      4                               
                                   PLSQL                           
                                   begin null; null; end;

Have a fun 🙂

Tomasz

17 thoughts on “DBMS_PARALLEL_EXECUTE

  1. Hi. When executing DBMS_PARALLEL_EXECUTE repeatedly within one session, using CHUNKS_BY_NUMBER_COL approach, after execution and successfull finish of task, implicit cursor leaves unclosed.
    Thus, when I have task of 500 bulk payments, with 1000 payments in each, in session I sequentially open 500 cursors untill session ends. Basically, 500 bulk payments are processed sequentially and for each bulk payment parallel task is executed to process 1000 payments in 25-entity chunks.
    How to avoid unclosed cursors?

    • Do you get any Oracle error ? Please specify it. Implicit cursors are always closed automatically by Oracle. If it’s not true for your application than it seems you encountered Oracle bug. If you are referring by mistake to explicit cursors like
      OPEN cur
      then you should always close such cursor with
      CLOSE cur

      Regards
      Tomasz

  2. Hi, thanks for your quick reply. After exceeding value of max_open_cursors, newly executed parallel task fails on error: ORA-01000: maximum open cursors exceeded.
    I simulated scenario with 456 bulk payments (i.e. starting same parallel task 456 times sequentialy) and started to fail at 286th iteration, while I have max_open_cursors set on 300 on my database.

    Implicit vs. explicit cursor – I am pretty sure, that cursor is implicit:
    1) It is sort of SELECT MIN(“PKID”), MAX(“PKID”) FROM “MY_PRL_ENTITIES” – and does not occur in application source code
    2) while traced step-by-step, cursor has been open when we executed “DBMS_PARALLEL_EXECUTE.run_task”

    Thanks much,
    Ivan.

  3. … in my own opinion, Oracle did not expected to use DBMS_PRL_EXECUTE this way – I mean repeated calls within one session. But sometimes, situation requires that.

  4. Please help !!
    My table N_MVT_NEW_PREV has 80.000.000 lignes. I want to run an update using tasks.
    But that doesn’t work :
    I think that the tasks hasn’t been runned. it stilled statued at ‘CHUNKED’ (status 4)

    DECLARE
    l_task0 VARCHAR2(30) := ‘test_task’;
    l_sql_stmt VARCHAR2(32767);
    l_try NUMBER;
    l_status NUMBER;
    BEGIN
    SELECT DBMS_PARALLEL_EXECUTE.generate_task_name into l_task0
    FROM dual;
    dbms_output.put_line(l_task0);
    DBMS_PARALLEL_EXECUTE.create_task (l_task0);

    DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(l_task0,’BO’,’N_MVT_NEW_PREV’,TRUE,1000000);

    l_sql_stmt :=’UPDATE /*+ ROWID(dda) */ bo.n_mvt_new_prev N_MVT_NEW_PREV ‘||chr(10)||
    ‘ SET (N_MVT_NEW_PREV.MVT_VER_MT_HT,N_MVT_NEW_PREV.MVT_VER_MT_ADD_HT) = ‘||chr(10)||
    ‘(select B.MVT_VER_MT_HT,B.MVT_VER_MT_ADD_HT ‘||chr(10)||
    ‘from BO.MVT_VER_MAJ_HT_TMP B ‘||chr(10)||
    ‘WHERE N_MVT_NEW_PREV.MVT_CTA_D 4);

    — If there is error, RESUME it for at most 2 times.
    l_try := 0;
    l_status := DBMS_PARALLEL_EXECUTE.task_status(l_task0);
    dbms_output.put_line(‘statut ‘||l_status);
    WHILE(l_try < 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED and l_status != DBMS_PARALLEL_EXECUTE.CHUNKED)
    Loop
    l_try := l_try + 1;
    DBMS_PARALLEL_EXECUTE.resume_task(l_task0);
    l_status := DBMS_PARALLEL_EXECUTE.task_status(l_task0);
    END LOOP;
    l_status := DBMS_PARALLEL_EXECUTE.task_status(l_task0);
    dbms_output.put_line('statut '||l_status);
    DBMS_PARALLEL_EXECUTE.drop_task(l_task0);
    END;
    /

    PL/SQLprocedure successfully completed
    statut 4
    statut 4

    • I don’t understand why people are trying to use dbms_parallel_execute for DML operations. It’s not recommended.

      Much better option is to use clear and easy syntax:
      alter session enable paralel dml;
      then run user DML;
      commit;

      It’s still not recommended if you have many records to update and your table is compressed or you want to avoid row chaining. In such case it’s better to rebuild a given table by CREATE TABLE AS SELECT.

      DBMS_PARALLEL_EXECUTE is created to avoid using complex scheduler. That’s my private opinion :).

      Regards
      Tomasz

      • There are situations where dbms_parallel_execute for DML operations is the most practical way for DML operations.

        For example: even with parallel, one-shot DML of merge into…complex-select-from-huge-table-that-takes-hours-to-open-cursor can run into:
        – out of sort space
        – snapshot too old
        – huge amounts of rolllback generated when other operations are going against the tables by other sessions

        When dealing with high-volume complex SQL, dbms_parallel_execute chunked SQL can open the cursor much quicker and allows for chunked commits; you don’t really want to have to re-execute an operation that takes hours to run from scratch because your one-shot merge has failed after 18 hours with “snapshot too old”.

  5. Hi,
    i have 3 lacks of records in my table, for update the table, i am using
    chunk_size => 1000 in DBMS_PARALLEL_EXECUTION with parallel_level => 10

    AND
    chunk_size => 10000 in DBMS_PARALLEL_EXECUTION with parallel_level => 100

    in difference in Execution time is only 1 minutes only, please tell me what is reason behind that.
    where we can see how much max chunk size and parallel_level we can use in oracle 11.2G
    Thanks in advance:)

  6. Hi, the available views are DBA|USER_PARALLEL_EXECUTE_TASKS and DBA|USER_PARALLEL_EXECUTE_CHUNKS (in Oracle DB 11.g, 12c). Not ALL prefix.

  7. I have a situation where I need to update 150 Million records specific column in a go that too based on some where clause. Please suggest me whether using DBMS_PARALLEL_EXECUTE is a better option or is their a better way out?

    • There are a lot things to consider before running such update:
      – how many columns has got your table
      – is it compressed table
      – is it partitioned table
      – what is current size of the table
      – can you run your sql in parallel or you are blocked by resource manager
      – must the table be online during update
      – how many indexes you have are the global, local
      – have you got extra space in case of CTS (create table as select option)
      and so on and so on

      There is no the better, worse or good option

      Regards
      Tomasz

      • – how many columns has got your table
        table have 17 columns out of which 2 are CLOB which needs to be updated with NULL.
        – is it compressed table
        Yes
        – is it partitioned table
        Yes
        – what is current size of the table
        94401600 MB
        – can you run your sql in parallel or you are blocked by resource manager
        Blocked by resource manager.
        – must the table be online during update
        Yes
        – how many indexes you have are the global, local
        2 LOB type indexes.
        – have you got extra space in case of CTS (create table as select option)
        It will be an extra overhead for such a big table I guess.
        and so on and so on

Leave a Reply to aravind Cancel reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.