Friday, 1 May 2015

Performance Tuning With DAC



Performance Tuning With DAC
1.       Managing Indexes
4.       Looping of Workflows

1. Managing Indexes

In general, during the ETL process, before a table is truncated, all the indexes, as defined in the repository, will be dropped before the data is loaded and recreated automatically after the data is loaded.

Index Behavior

The general sequence of actions related to indexes when a task runs is as follows:
1.       Tables are truncated.
2.       Indexes are dropped.
3.       Data is loaded into target tables.
4.       ETL indexes and unique indexes are recreated.
5.       Successor tasks are notified that ETL indexes were created.
6.       Query indexes and additional unique indexes are recreated.
7.       Tables are analyzed.
Note: Successor tasks can start before query-related and unique indexes are created.
To specify an index space for indexes by table type
Setup --> Physical data source-->Index spaces--> generate--> give the value of index space of each table type--> Save

Specifying How Many Indexes Can Be Created in Parallel

1.       In the Physical Data Sources tab of the Setup view, select the appropriate physical data source in the top pane list.
2.       Click the Parallel Indexes subtab.
3.       Click New in the Parallel Indexes toolbar.
4.       In the Name field, query for and enter the names of the table for which you want to specify how many indexes can be created in parallel.
5.       For each table, in the Number of Parallel Indexes field, enter the number of indexes you want to create in parallel.
6.       Click Save.

To specify how many indexes can be created in parallel for all tables associated with a physical data source
1.       In the Physical Data Sources tab of the Setup view, select the appropriate physical data source in the top pane list.
2.       Click the Edit subtab.
3.       In the Num Parallel Indexes Per Table field, enter a numeric value to specify how many indexes can be created in parallel for a table on the specified physical data source.
4.       Click Save.

2. Using Actions to Optimize Indexes and Collect Statistics on Tables

  1. Create a source system parameter called CURRENT_YEAR, using the Timestamp data type and the runtime variable @DAC_ETL_START_TIME_FOR_TARGET.
  1. Create a custom index action to override the drop index behavior.
    1. Create an index action called Drop Partitioned Index.
    2. Enter the following SQL:
3.  alter index getIndexName()
4.  modify partition PART_@DAC_$$CURRENT_YEAR
5.  unusable
For instructions on defining an index action, see "Defining a SQL Script for an Action".
  1. Assign the Drop Partitioned Index action to indexes.
    1. In the Indices tab, query for the table name and indexes that need to be dropped and created during the incremental runs.
    2. For each record in the list of query results, right-click and select Add Actions.
    3. Select Drop Index as the Action Type, Incremental as the Load Type, and Drop Partitioned Index as the Action.
  2. Create a custom index action to override the create index behavior.
    1. Create an index action called Enable Partitioned Index.
    2. Enter the following SQL:
3.  alter index getIndexName()
4.  rebuild partition PART_@DAC_$$CURRENT_YEAR
5.  nologging
  1. Assign the Enable Partitioned Index action to indexes.
    1. In the Indices tab, query for the table name and indexes that need to be dropped and created during the incremental runs.
    2. For each record in the list of query results, right-click and select Add Actions.
    3. Select Drop Index as the Action Type, Incremental as the Load Type, and Enable Partitioned Index as the Action.
  2. Create a custom index action to override the analyze table behavior.
    1. Create an index action called Analyze Current Partition.
    2. Enter the following SQL:
3.  DBMS_STATS.GATHER_TABLE_STATS(
4.  NULL,
5.  TABNAME => 'getTableName()'
6.  CASCADE => FALSE,
7.  PARTNAME => PART_@DAC_CURRENT_YEAR,
8.  ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE,
9.  GRANULARITY => 'FOR ALL INDEXED COLUMNS SIZE AUTO',
10.DEGREE => DBMS_STATS.DEFAULT_DEGREE)
    1. Select Stored Procedure as the Type.
  1. Assign the Analyze Current Partition action to tables.
    1. In the Tables tab, query for the table names that need to be analyzed during the incremental runs.
    2. For each record in the list of query results, right-click and select Add Actions.
    3. Select Analyze Table as the Action Type, Incremental as the Load Type, and Analyze Current Partition as the Action.

3. Using Heuristics to Manage Tasks, Tables and Indexes

Creating a Heuristics Rule

To create a heuristics rule

1. Count: Provides a count of the number of records in the primary source table that will be included in the ETL process (based on the task to which you assign the heuristics rule)
2. Ratio: Provides a count of the number of records in the primary source table and the total number of records in the target table. If you select the Ratio option, the value of the Threshold field is a percentage.
1.       If this heuristics rule applies to a task with multiple primary tables, select one of the following:
o    Sum. Aggregates the total of new or changed records in all of the primary source tables.
o    Average. Returns an average of the new or changed records in all of the primary source tables.
o    Max. Returns the largest count of new or changed records of all of the primary source tables.
o    Min. Returns the smallest count of new or changed records of all of the primary source tables.
2.       In the Source Tables field, select one of the following:
o    Primary. Specifies that intelligence will be gathered on one or more primary tables associated with the task that this heuristics rule is assigned to.
o    With SQL Overrides. Specifies that you will write custom SQL to include tables other than the primary tables.
3.       In the Threshold field, enter a numerical value above which DAC heuristics will allow a particular action to take place.
4.        (Optional) If you want to create this heuristics rule but leave it inactive, select Inactive.
5.       Click Save.

Assigning a Heuristics Rule to a Task

To assign a heuristics rule to a task
Design-->Task-->click on extended properties-->Click new-->select Heuristics.--> Double-click in the Value field -->



STo write custom SQL to create a heuristics rule
  1. In the Design view, select the appropriate source system container from the drop-down list.
  2. Select the Container Specific Actions tab.
  3. Select Heuristic from the rightmost drop-down list on the toolbar.
  4. In the toolbar, click New.
A new, empty record is created.
  1. In the Name field, enter a descriptive name for the SQL, and then click Save.
  2. Click in the Value field to open the Value dialog.
  3. Select a format for the tree view on the left side of the window.
    • Flat view displays the SQL entries in a list format in their order of execution.
    • Category view displays the entries by the categories Custom SQL and Stored Procedure.
You can reorder the entries in the tree by dragging and dropping them.
  1. Click Add in the right-hand toolbar.
A new, empty record is created in the top pane.
  1. Enter or select the appropriate information.
Field
Description
Name
Enter a logical name for the SQL block.
Type
Select one of the following:
o    Select SQL. Indicates the SQL you enter will be a SELECT statement.
o    Stored Procedure. Indicates the SQL you enter will be a stored procedure.
Continue on Fail
Specifies whether an execution should proceed if a given SQL block fails.
Retries
Specifies how many retries are allowed. If the number is not positive, a default number of one (1) will be used.
Valid Database Platform
Specifies the valid database platforms against which the SQL will run. If this field is left empty, the SQL can be run against any database.

4. Looping of Workflows

To define a workflow for looping
1.       Enable the Informatica workflow property Configure Concurrent Execution:
1.       In Informatica Workflow Manager, open the appropriate workflow in the Workflow Designer.
2.       On the menu bar, select Workflows, and then select Edit.
3.       In the Edit Workflow dialog, select the Enabled check box for the Configure Concurrent Execution property.
4.       Click the Configure Concurrent Execution button.S
5.       In the Configure Concurrent Execution dialog, select Allow concurrent run with same instance name.
6.       Click OK.
2.       In the DAC Design view, select the appropriate container from the drop-down list.
3.       In the Tasks tab, query for the task for which you want to configure looping.
4.       With the appropriate task selected in the top pane, click the Extended Properties subtab.
5.       In the bottom pane toolbar, click New.
A new record is created with the name Loops. You cannot change the name of this record.
6.       Click in the Value field of the new record.
The Property Value dialog is displayed.
7.       In the Loops field, enter a numeric value to specify how many times you want the workflow to repeat.
The number of loops that you specify is applicable to all task execution types, that is, Informatica, SQL File, Stored Procedure, and External Program.
8.       (Optional) Select the Parallel check box to have the run instances of the same task execute in parallel. If you do not select the Parallel check box, the run instances of the same task will be executed serially.

Note:
o    Run instances will be executed in parallel only if resources are available. For example, if your environment is set up to run 10 workflows in parallel, and a task's Number of Loops property is set to 20, only the first 10 instances will be executed. These 10 instances will have a Current Runs status of Running. The remaining 10 instances will have a status of Runnable. For a description of all Current Runs statuses, see "Current Runs Tab".
o    When the instances are executed in serial, the first failure encountered stops the rest of the executions.
o    When the instances are executed in parallel, any failure in the running batch stops the rest of the executions.
                        (Optional) Select the Restart All check box to do the following:
o    Reissue the truncate command (if it exists as a task detail).
o    Mark all task details pertaining to the run instances as Queued. The Queued status enables them to be executed again.
If you do not select the Restart All check box, the execution of the task will resume from the point of failure. If any runtime instances of the workflow fails, the task itself is marked as Failed. If you enable the Restart All property, the truncate command (if it exists as a task detail) is reissued, and all the task details pertaining to the run instances are marked as Queued and get re-executed. If you do not enable the Restart All property, the execution resumes from the point of failure
                        Click OK in the Property Value dialog.
                        Click Save in the subtab toolbar.

Accessing the Loops Properties Using Parameters

To create a parameter with a static value
1.       In the DAC Design view, select the appropriate container from the drop-down list.
2.       Do one of the following:
o    To set up a parameter at the task level, go to the Task tab, and click the Parameters subtab.
o    To set up a parameter at the source system level, go to the Source System Parameters tab.
3.       Click New in the toolbar.
4.       In the Name field, enter a name for the parameter.
5.       In the Data Type field, select Text.
6.       In the Load Type field, select one of the following.
o    Full. To specify a full load.
o    Incremental. To specify an incremental load.
o    Both. To specify both full and incremental loads.
7.       Click Save to save the record.
You must save the record before proceeding.
8.       Click in the Value field to open the Enter Parameter Value dialog.
9.       Select Static.
10.   Enter the appropriate value.
11.   Click Save in the toolbar.
To parameterize the Loops property as a runtime value
1.       In the Design view, select the appropriate container from the drop-down list.
2.       Do one of the following:
o    To set up a parameter at the task level, go to the Task tab, and click the Parameters subtab.
o    To set up a parameter at the source system level, go to the Source System Parameters tab.
3.       Click New in the toolbar.
4.       In the Name field, enter a name for the parameter.
5.       In the Data Type field, select Text.
6.       In the Load Type field, select one of the following.
o    Full. To specify a full load.
o    Incremental. To specify an incremental load.
o    Both. To specify both full and incremental loads.
7.       Click Save to save the record.
You must save the record before proceeding.
8.       Click in the Value field to open the Enter Parameter Value dialog.
9.       Select Runtime.
10.   Select the appropriate runtime variable.
11.   Click OK.
12.   Click Save in the toolbar.

5. Customizing customsql.xml to Drop and Create Indexes and Analyze Tables


To edit the Analyze Table syntax
1.       Open the customsql.xml file located in the <DAC_Config_Location>\CustomSQLs directory.
2.       Locate the Analyze Table syntax for the appropriate database type.
For example, the syntax for an Oracle database is as follows:
<SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL COLUMNS SIZE AUTO',cascade => true ) </SqlQuery>
3.       Edit the syntax.
For example, to gather statistics for only the indexed columns, edit the syntax as follows:
<SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL INDEXED COLUMNS',cascade => true ) </SqlQuery>

2 comments:

  1. Hello Venkat, Its very useful information. I am trying to change customsql.xml to create index in parallel, ex:
    but its not taking up.
    CREATE %1 INDEX
    %2
    ON
    %3
    (
    %4
    )
    PARALLEL 10 NOLOGGING

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete