Setting up a GoldenGate replication with multiple process groups

If your source system is a very busy OLTP system, with a high rate of data changes, then you will find that a single set of GoldenGate processes is not sufficient to replicate the load to the target environment in real time. In such a situation Oracle recommends splitting the load into multiple process groups.

Getting ready

For this recipe we will use the Order Entry demo schema. This schema has been created in both source and target databases, and is in the same state. The GoldenGate binaries, the GoldenGate Admin user, and manager instance have also been set up in the source and target environment.

How to do it...

In this recipe we will set up the following configuration for replicating the Order Entry schema:

As you can see from the preceding diagram, the replication setup performed here will consist of two Extract processes, two Datapumps, and four Replicat processes.

Perform the following steps in the source database:

  1. Create the first Extract process EGGTEST1 in the source environment:
    ./ggsci
    EDIT PARAMS EGGTEST1
    EXTRACT EGGTEST1
    USERID GGATE_ADMIN@DBORATEST, PASSWORD GGATE_ADMIN
    EXTTRAIL /u01/app/ggate/dirdat/EGGTEST1/st
    TABLE oe.warehouses,FILTER (@RANGE (1,2));
    TABLE oe.subcategory_ref_list_nestedtab,FILTER (@RANGE (1,2));
    TABLE oe.purchaseorder,FILTER (@RANGE (1,2));
    TABLE oe.promotions,FILTER (@RANGE (1,2));
    TABLE oe.product_ref_list_nestedtab,FILTER (@RANGE (1,2));
    TABLE oe.product_information,FILTER (@RANGE (1,2));
    TABLE oe.product_descriptions,FILTER (@RANGE (1,2));
    TABLE oe.order_items,FILTER (@RANGE (1,2));
    TABLE oe.orders,FILTER (@RANGE (1,2));
    TABLE oe.lineitem_table,FILTER (@RANGE (1,2));
    TABLE oe.inventories,FILTER (@RANGE (1,2));
    TABLE oe.customers,FILTER (@RANGE (1,2));
    TABLE oe.categories_tab,FILTER (@RANGE (1,2));
    TABLE oe.action_table,FILTER (@RANGE (1,2));
  2. Create the second Extract process EGGTEST2 in the source environment:
    ./ggsci
    EDIT PARAMS EGGTEST2
    EXTRACT EGGTEST2
    USERID GGATE_ADMIN@DBORATEST, PASSWORD GGATE_ADMIN
    EXTTRAIL /u01/app/ggate/dirdat/EGGTEST2/st
    TABLE oe.warehouses,FILTER (@RANGE (2,2));
    TABLE oe.subcategory_ref_list_nestedtab,FILTER (@RANGE (2,2));
    TABLE oe.purchaseorder,FILTER (@RANGE (2,2));
    TABLE oe.promotions,FILTER (@RANGE (2,2));
    TABLE oe.product_ref_list_nestedtab,FILTER (@RANGE (2,2));
    TABLE oe.product_information,FILTER (@RANGE (2,2));
    TABLE oe.product_descriptions,FILTER (@RANGE (2,2));
    TABLE oe.order_items,FILTER (@RANGE (2,2));
    TABLE oe.orders,FILTER (@RANGE (2,2));
    TABLE oe.lineitem_table,FILTER (@RANGE (2,2));
    TABLE oe.inventories,FILTER (@RANGE (2,2));
    TABLE oe.customers,FILTER (@RANGE (2,2));
    TABLE oe.categories_tab,FILTER (@RANGE (2,2));
    TABLE oe.action_table,FILTER (@RANGE (2,2));
  3. Create the first Datapump process PGGTEST1 in the source environment:
    EDIT PARAMS PGGTEST1
    EXTRACT PGGTEST1
    USERID GGATE_ADMIN@DBORATEST, PASSWORD GGATE_ADMIN
    RMTHOST stdby1-ol6-112 , MGRPORT 8809
    RMTTRAIL /u01/app/ggate/dirdat/PGGTEST1/rt
    TABLE oe.*
  4. Create the second Datapump process PGGTEST2 in the source environment:
    EDIT PARAMS PGGTEST2
    EXTRACT PGGTEST2
    USERID GGATE_ADMIN@DBORATEST, PASSWORD GGATE_ADMIN
    RMTHOST stdby1-ol6-112 , MGRPORT 8809
    RMTTRAIL /u01/app/ggate/dirdat/PGGTEST2/rt
    TABLE oe.*
  5. Add both the Extract processes to the source manager configuration:
    ADD EXTRACT EGGTEST1, TRANLOG, BEGIN NOW
    ADD EXTRACT EGGTEST2, TRANLOG, BEGIN NOW
  6. Add both the local trail locations to the Extract processes:
    ADD EXTTRAIL /u01/app/ggate/dirdat/EGGTEST1/st, EXTRACT EGGTEST1
    ADD EXTTRAIL /u01/app/ggate/dirdat/EGGTEST2/st, EXTRACT EGGTEST2
  7. Add the Datapump processes to the source manager configuration:
    ADD EXTRACT PGGTEST1, EXTTRAILSOURCE /u01/app/ggate/dirdat/EGGTEST1/st
    ADD EXTRACT PGGTEST2, EXTTRAILSOURCE /u01/app/ggate/dirdat/EGGTEST2/st
  8. Add the remote trail locations to the Datapump process:
    ADD RMTTRAIL /u01/app/ggate/dirdat/PGGTEST1/rt, EXTRACT PGGTEST1
    ADD RMTTRAIL /u01/app/ggate/dirdat/PGGTEST2/rt, EXTRACT PGGTEST2

Perform the following steps in the target database:

  1. Create a checkpoint table in the target database:
    ./ggsci
    DBLOGIN, USERID GGATE_ADMIN@TGORTEST PASSWORD *****
    ADD CHECKPOINTTABLE 
  2. Create the first GoldenGate Replicat process in the target environment:
    ./ggsci
    EDIT PARAMS RGGTEST1
    REPLICAT RGGTEST1
    USERID GGATE_ADMIN@TGORTEST, PASSWORD GGATE_ADMIN
    DISCARDFILE /u01/app/ggate/dirrpt/RGGTEST1.dsc,append,MEGABYTES 500
    ASSUMETARGETDEFS
    MAP oe.warehouses, TARGET oe.warehouses, FILTER (@RANGE (1,2));
    MAP oe.subcategory_ref_list_nestedtab, TARGET oe.subcategory_ref_list_nestedtab, FILTER (@RANGE (1,2));
    MAP oe.purchaseorder, TARGET oe.purchaseorder, FILTER (@RANGE (1,2));
    MAP oe.promotions, TARGET oe.promotions, FILTER (@RANGE (1,2));
    MAP oe.product_ref_list_nestedtab, TARGET oe.product_ref_list_nestedtab, FILTER (@RANGE (1,2));
    MAP oe.product_information, TARGET oe.product_information, FILTER (@RANGE (1,2));
    MAP oe.product_descriptions, TARGET oe.product_descriptions, FILTER (@RANGE (1,2));
    MAP oe.order_items, TARGET oe.order_items, FILTER (@RANGE (1,2));
    MAP oe.orders, TARGET oe.orders, FILTER (@RANGE (1,2));
    MAP oe.lineitem_table, TARGET oe.lineitem_table, FILTER (@RANGE (1,2));
    MAP oe.inventories, TARGET oe.inventories, FILTER (@RANGE (1,2));
    MAP oe.customers, TARGET oe.customers, FILTER (@RANGE (1,2));
    MAP oe.categories_tab, TARGET oe.categories_tab, FILTER (@RANGE (1,2));
    MAP oe.action_table, TARGET oe.action_table, FILTER (@RANGE (1,2));
  3. Create the second GoldenGate Replicat process in the target environment:
    ./ggsci
    EDIT PARAMS RGGTEST2
    REPLICAT RGGTEST2
    USERID GGATE_ADMIN@TGORTEST, PASSWORD GGATE_ADMIN
    DISCARDFILE /u01/app/ggate/dirrpt/RGGTEST2.dsc,append,MEGABYTES 500
    ASSUMETARGETDEFS
    MAP oe.warehouses, TARGET oe.warehouses, FILTER (@RANGE (2,2));
    MAP oe.subcategory_ref_list_nestedtab, TARGET oe.subcategory_ref_list_nestedtab, FILTER (@RANGE (2,2));
    MAP oe.purchaseorder, TARGET oe.purchaseorder, FILTER (@RANGE (2,2));
    MAP oe.promotions, TARGET oe.promotions, FILTER (@RANGE (2,2));
    MAP oe.product_ref_list_nestedtab, TARGET oe.product_ref_list_nestedtab, FILTER (@RANGE (2,2));
    MAP oe.product_information, TARGET oe.product_information, FILTER (@RANGE (2,2));
    MAP oe.product_descriptions, TARGET oe.product_descriptions, FILTER (@RANGE (2,2));
    MAP oe.order_items, TARGET oe.order_items, FILTER (@RANGE (2,2));
    MAP oe.orders, TARGET oe.orders, FILTER (@RANGE (2,2));
    MAP oe.lineitem_table, TARGET oe.lineitem_table, FILTER (@RANGE (2,2));
    MAP oe.inventories, TARGET oe.inventories, FILTER (@RANGE (2,2));
    MAP oe.customers, TARGET oe.customers, FILTER (@RANGE (2,2));
    MAP oe.categories_tab, TARGET oe.categories_tab, FILTER (@RANGE (2,2));
    MAP oe.action_table, TARGET oe.action_table, FILTER (@RANGE (2,2));
  4. Create the third GoldenGate Replicat process in the target environment:
    ./ggsci
    EDIT PARAMS RGGTEST3
    REPLICAT RGGTEST3
    USERID GGATE_ADMIN@TGORTEST, PASSWORD GGATE_ADMIN
    DISCARDFILE /u01/app/ggate/dirrpt/RGGTEST3.dsc,append,MEGABYTES 500
    ASSUMETARGETDEFS
    MAP oe.warehouses, TARGET oe.warehouses, FILTER (@RANGE (1,2));
    MAP oe.subcategory_ref_list_nestedtab, TARGET oe.subcategory_ref_list_nestedtab, FILTER (@RANGE (1,2));
    MAP oe.purchaseorder, TARGET oe.purchaseorder, FILTER (@RANGE (1,2));
    MAP oe.promotions, TARGET oe.promotions, FILTER (@RANGE (1,2));
    MAP oe.product_ref_list_nestedtab, TARGET oe.product_ref_list_nestedtab, FILTER (@RANGE (1,2));
    MAP oe.product_information, TARGET oe.product_information, FILTER (@RANGE (1,2));
    MAP oe.product_descriptions, TARGET oe.product_descriptions, FILTER (@RANGE (1,2));
    MAP oe.order_items, TARGET oe.order_items, FILTER (@RANGE (1,2));
    MAP oe.orders, TARGET oe.orders, FILTER (@RANGE (1,2));
    MAP oe.lineitem_table, TARGET oe.lineitem_table, FILTER (@RANGE (1,2));
    MAP oe.inventories, TARGET oe.inventories, FILTER (@RANGE (1,2));
    MAP oe.customers, TARGET oe.customers, FILTER (@RANGE (1,2));
    MAP oe.categories_tab, TARGET oe.categories_tab, FILTER (@RANGE (1,2));
    MAP oe.action_table, TARGET oe.action_table, FILTER (@RANGE (1,2));
  5. Create the fourth GoldenGate Replicat process in the target environment:
    ./ggsci
    EDIT PARAMS RGGTEST4
    REPLICAT RGGTEST4
    USERID GGATE_ADMIN@TGORTEST, PASSWORD GGATE_ADMIN
    DISCARDFILE /u01/app/ggate/dirrpt/RGGTEST4.dsc,append,MEGABYTES 500
    ASSUMETARGETDEFS
    MAP oe.warehouses, TARGET oe.warehouses, FILTER (@RANGE (2,2));
    MAP oe.subcategory_ref_list_nestedtab, TARGET oe.subcategory_ref_list_nestedtab, FILTER (@RANGE (2,2));
    MAP oe.purchaseorder, TARGET oe.purchaseorder, FILTER (@RANGE (2,2));
    MAP oe.promotions, TARGET oe.promotions, FILTER (@RANGE (2,2));
    MAP oe.product_ref_list_nestedtab, TARGET oe.product_ref_list_nestedtab, FILTER (@RANGE (2,2));
    MAP oe.product_information, TARGET oe.product_information, FILTER (@RANGE (2,2));
    MAP oe.product_descriptions, TARGET oe.product_descriptions, FILTER (@RANGE (2,2));
    MAP oe.order_items, TARGET oe.order_items, FILTER (@RANGE (2,2));
    MAP oe.orders, TARGET oe.orders, FILTER (@RANGE (2,2));
    MAP oe.lineitem_table, TARGET oe.lineitem_table, FILTER (@RANGE (2,2));
    MAP oe.inventories, TARGET oe.inventories, FILTER (@RANGE (2,2));
    MAP oe.customers, TARGET oe.customers, FILTER (@RANGE (2,2));
    MAP oe.categories_tab, TARGET oe.categories_tab, FILTER (@RANGE (2,2));
    MAP oe.action_table, TARGET oe.action_table, FILTER (@RANGE (2,2));
  6. Add the Replicat processes to the target manager configuration:
    ADD REPLICAT RGGTEST1, EXTTRAIL /u01/app/ggate/dirdat/PGGTEST1/rt, BEGIN NOW
    ADD REPLICAT RGGTEST2, EXTTRAIL /u01/app/ggate/dirdat/PGGTEST1/rt, BEGIN NOW
    ADD REPLICAT RGGTEST3, EXTTRAIL /u01/app/ggate/dirdat/PGGTEST2/rt, BEGIN NOW
    ADD REPLICAT RGGTEST4, EXTTRAIL /u01/app/ggate/dirdat/PGGTEST2/rt, BEGIN NOW

Perform the following steps in the source database:

  1. Start the Extract and Datapump processes:
    START EXTRACT EGGTEST1
    START EXTRACT EGGTEST2
    START EXTRACT EGGTEST1
    START EXTRACT PGGTEST2

Perform the following steps in the target database:

  1. Start the Replicat processes:
    START REPLICAT RGGTEST1
    START REPLICAT RGGTEST2
    START REPLICAT RGGTEST3
    START REPLICAT RGGTEST4

How it works...

In this example, we have split the extract load into two streams using the FILTER clause with a RANGE option. The FILTER clause is used to refine the changes that a GoldenGate process looks for. The RANGE function distributes the load evenly into multiple streams by calculating the hash value of the key columns in the table. By using both of these options each Extract process only processes half the load. Each Extract process writes its trail files to a separate directory under $GG_HOME/dirdat/ and is configured with a dedicated Datapump process on the source server. These Datapump processes then transfer the trail files to the remote server and write them to separate directories there. The replicat load on the target server is split into four parallel streams. The first two Replicat processes (RGGTEST1 and RGGTEST2) apply the changes captured by the first Extract process (EGGTEST1) on the source server. The next two Replicat processes (RGGTEST3 and RGGTEST4) apply the changes captured by the second Extract process (EGGTEST2) on the source server.

There's more...

You would configure multiple processes to enhance the GoldenGate performance. In most cases you would find that it is the Replicat process that is causing the bottleneck and needs intervention. There are many performance enhancements in the latest release of Oracle GoldenGate. However, at times you would find that adding additional process groups is the only way of achieving the desired replication lag targets. There are various ways to determine how to split tables between various processes which are as follows:

  • You can specify different tables in different processes
  • General rule of thumb is to keep related tables together
  • Tables with referential integrity constraints should always be kept in the same process group
  • You can use the RANGE function to logically split the load of related tables into multiple streams
  • If a single process is not able to handle the load of high rate of change on a single table, then you should split its load using the RANGE clause

See also

  • See the Splitting the replication load into multiple process groups for optimal performance recipe in Chapter 6, Monitoring, Tuning, and Troubleshooting GoldenGate