Performance Tuning
With DAC
1. Managing Indexes
In general, during the ETL process,
before a table is truncated, all the indexes, as defined in the repository,
will be dropped before the data is loaded and recreated automatically after the
data is loaded.
Index Behavior
The general sequence of actions related to indexes when a task runs is as follows:
1. Tables are truncated.
2. Indexes are dropped.
3. Data is loaded into target tables.
4. ETL indexes and unique indexes are recreated.
5. Successor tasks are notified that ETL indexes were
created.
6. Query indexes and additional unique indexes are
recreated.
7. Tables are analyzed.
Note: Successor tasks can start before query-related
and unique indexes are created.To specify an index space for indexes by table type
Setup --> Physical data source-->Index spaces--> generate--> give the value of index space of each table type--> Save
Specifying How Many Indexes Can Be Created in Parallel
1. In the Physical Data Sources tab of the Setup
view, select the appropriate physical data source in the top pane list.
2. Click the Parallel Indexes subtab.
3. Click New in the Parallel Indexes toolbar.
4. In the Name field, query for and enter the
names of the table for which you want to specify how many indexes can be
created in parallel.
5. For each table, in the Number of Parallel
Indexes field, enter the number of indexes you want to create in parallel.
6. Click Save.
To specify how many indexes can be
created in parallel for all tables associated with a physical data source
1.
In the
Physical Data Sources tab of the Setup view, select the appropriate physical
data source in the top pane list.
2.
Click
the Edit subtab.
3.
In the
Num Parallel Indexes Per Table field, enter a numeric value to specify how many
indexes can be created in parallel for a table on the specified physical data
source.
4.
Click
Save.
2. Using Actions to Optimize Indexes and Collect Statistics on Tables
- Create a source system parameter called CURRENT_YEAR, using the Timestamp data type and the runtime variable @DAC_ETL_START_TIME_FOR_TARGET.
- Create a custom index action to override the drop index behavior.
- Create an index action called Drop Partitioned Index.
- Enter the following SQL:
3. alter
index getIndexName()
4. modify
partition PART_@DAC_$$CURRENT_YEAR
5. unusable
- Assign the Drop Partitioned Index action to indexes.
- In the Indices tab, query for the table name and indexes that need to be dropped and created during the incremental runs.
- For each record in the list of query results, right-click and select Add Actions.
- Select Drop Index as the Action Type, Incremental as the Load Type, and Drop Partitioned Index as the Action.
- Create a custom index action to override the create index behavior.
- Create an index action called Enable Partitioned Index.
- Enter the following SQL:
3. alter
index getIndexName()
4. rebuild
partition PART_@DAC_$$CURRENT_YEAR
5. nologging
- Assign the Enable Partitioned Index action to indexes.
- In the Indices tab, query for the table name and indexes that need to be dropped and created during the incremental runs.
- For each record in the list of query results, right-click and select Add Actions.
- Select Drop Index as the Action Type, Incremental as the Load Type, and Enable Partitioned Index as the Action.
- Create a custom index action to override the analyze table behavior.
- Create an index action called Analyze Current Partition.
- Enter the following SQL:
3. DBMS_STATS.GATHER_TABLE_STATS(
4. NULL,
5. TABNAME
=> 'getTableName()'
6. CASCADE
=> FALSE,
7. PARTNAME
=> PART_@DAC_CURRENT_YEAR,
8. ESTIMATE_PERCENT
=> DBMS_STATS.AUTO_SAMPLE_SIZE,
9. GRANULARITY
=> 'FOR ALL INDEXED COLUMNS SIZE AUTO',
10.DEGREE
=> DBMS_STATS.DEFAULT_DEGREE)
- Select Stored Procedure as the Type.
- Assign the Analyze Current Partition action to tables.
- In the Tables tab, query for the table names that need to be analyzed during the incremental runs.
- For each record in the list of query results, right-click and select Add Actions.
- Select Analyze Table as the Action Type, Incremental as the Load Type, and Analyze Current Partition as the Action.
3. Using Heuristics to Manage Tasks, Tables and Indexes
Creating a Heuristics Rule
To create a heuristics rule1. Count: Provides a count of the number of records in the primary source table that will be included in the ETL process (based on the task to which you assign the heuristics rule)
2. Ratio: Provides a count of the number of records in the primary source table and the total number of records in the target table. If you select the Ratio option, the value of the Threshold field is a percentage.
1. If
this heuristics rule applies to a task with multiple primary tables, select one
of the following:
o
Sum.
Aggregates the total of new or changed records in all of the primary source
tables.
o
Average.
Returns an average of the new or changed records in all of the primary source
tables.
o
Max.
Returns the largest count of new or changed records of all of the primary
source tables.
o
Min.
Returns the smallest count of new or changed records of all of the primary
source tables.
2. In
the Source Tables field, select one of the following:
o
Primary.
Specifies that intelligence will be gathered on one or more primary tables
associated with the task that this heuristics rule is assigned to.
o
With SQL Overrides.
Specifies that you will write custom SQL to include tables other than the
primary tables.
3. In
the Threshold field, enter a numerical value above which DAC heuristics will
allow a particular action to take place.
4. (Optional) If you want to create this
heuristics rule but leave it inactive, select Inactive.
5. Click
Save.
Assigning a Heuristics Rule to a Task
To assign a heuristics rule to a taskDesign-->Task-->click on extended properties-->Click new-->select Heuristics.--> Double-click in the Value field -->
STo write custom SQL to
create a heuristics rule
- In the Design view, select the appropriate source system container from the drop-down list.
- Select the Container Specific Actions tab.
- Select Heuristic from the rightmost drop-down list on the toolbar.
- In the toolbar, click New.
A new, empty
record is created.
- In the Name field, enter a descriptive name for the SQL, and then click Save.
- Click in the Value field to open the Value dialog.
- Select a format for the tree view on the left side of the window.
- Flat view displays the SQL entries in a list format in their order of execution.
- Category view displays the entries by the categories Custom SQL and Stored Procedure.
You
can reorder the entries in the tree by dragging and dropping them.
- Click Add in the right-hand toolbar.
A new, empty
record is created in the top pane.
- Enter or select the appropriate information.
Field
|
Description
|
Name
|
Enter a logical name for the SQL
block.
|
Type
|
Select one of the following:
o
Select SQL.
Indicates the SQL you enter will be a SELECT statement.
o
Stored
Procedure. Indicates the SQL you enter will be a stored procedure.
|
Continue on Fail
|
Specifies whether an execution
should proceed if a given SQL block fails.
|
Retries
|
Specifies how many retries are
allowed. If the number is not positive, a default number of one (1) will be
used.
|
Valid Database Platform
|
Specifies the valid database
platforms against which the SQL will run. If this field is left empty, the
SQL can be run against any database.
|
4. Looping of Workflows
To define a workflow for looping
1. Enable
the Informatica workflow property Configure Concurrent
Execution:
1. In
Informatica Workflow Manager, open the appropriate workflow in the Workflow
Designer.
2. On
the menu bar, select Workflows, and then select Edit.
3. In
the Edit Workflow dialog, select the Enabled check box
for the Configure Concurrent Execution property.
4. Click
the Configure Concurrent Execution button.S
5. In
the Configure Concurrent Execution dialog, select Allow
concurrent run with same instance name.
6. Click
OK.
2. In
the DAC Design view, select the appropriate container from the drop-down list.
3. In
the Tasks tab, query for the task for which you want to configure looping.
4. With
the appropriate task selected in the top pane, click the Extended
Properties subtab.
5. In
the bottom pane toolbar, click New.
A new record is created with the name Loops. You cannot change the
name of this record.
6. Click
in the Value field of the new record.
The Property Value dialog is displayed.
7. In
the Loops field, enter a numeric value to specify how many times you want the
workflow to repeat.
The number of loops that you specify is applicable to all task
execution types, that is, Informatica, SQL File, Stored Procedure, and External
Program.
8. (Optional)
Select the Parallel check box to have the run instances
of the same task execute in parallel. If you do not select the Parallel check
box, the run instances of the same task will be executed serially.
Note:
o
Run instances will be executed in parallel only if resources are
available. For example, if your environment is set up to run 10 workflows in
parallel, and a task's Number of Loops property is set to 20, only the first 10
instances will be executed. These 10 instances will have a Current Runs status
of Running. The remaining 10 instances will have a status of Runnable. For a
description of all Current Runs statuses, see "Current
Runs Tab".
o
When the instances are executed in serial, the first failure
encountered stops the rest of the executions.
o
When the instances are executed in parallel, any failure in the
running batch stops the rest of the executions.
(Optional) Select the Restart All check
box to do the following:
o
Reissue the truncate command (if it exists as a task detail).
o
Mark all task details pertaining to the run instances as Queued.
The Queued status enables them to be executed again.
If you do not select the Restart All check box, the execution of
the task will resume from the point of failure. If any runtime instances of the
workflow fails, the task itself is marked as Failed. If you enable the Restart
All property, the truncate command (if it exists as a task detail) is reissued,
and all the task details pertaining to the run instances are marked as Queued
and get re-executed. If you do not enable the Restart All property, the execution
resumes from the point of failure
Click OK in the Property Value dialog.
Click Save in the subtab toolbar.
Accessing the Loops Properties Using Parameters
To create a parameter with a static value
1. In
the DAC Design view, select the appropriate container from the drop-down list.
2. Do
one of the following:
o
To set up a parameter at the task level, go to the Task tab, and
click the Parameters subtab.
o
To set up a parameter at the source system level, go to the Source
System Parameters tab.
3. Click
New in the toolbar.
4. In
the Name field, enter a name for the parameter.
5. In
the Data Type field, select Text.
6. In
the Load Type field, select one of the following.
o
Full.
To specify a full load.
o
Incremental.
To specify an incremental load.
o
Both.
To specify both full and incremental loads.
7. Click
Save to save the record.
You must save the record before proceeding.
8. Click
in the Value field to open the Enter Parameter Value
dialog.
9. Select
Static.
10. Enter
the appropriate value.
11. Click
Save in the toolbar.
To parameterize the Loops property as a runtime value
1. In
the Design view, select the appropriate container from the drop-down list.
2. Do
one of the following:
o
To set up a parameter at the task level, go to the Task tab, and
click the Parameters subtab.
o
To set up a parameter at the source system level, go to the Source
System Parameters tab.
3. Click
New in the toolbar.
4. In
the Name field, enter a name for the parameter.
5. In
the Data Type field, select Text.
6. In
the Load Type field, select one of the following.
o
Full.
To specify a full load.
o
Incremental.
To specify an incremental load.
o
Both.
To specify both full and incremental loads.
7. Click
Save to save the record.
You must save the record before proceeding.
8. Click
in the Value field to open the Enter Parameter Value
dialog.
9. Select
Runtime.
10. Select
the appropriate runtime variable.
11. Click
OK.
12. Click
Save in the toolbar.
5. Customizing customsql.xml to Drop and Create Indexes and Analyze Tables
To edit the Analyze Table syntax
1. Open
the customsql.xml file located in the
<DAC_Config_Location>\CustomSQLs directory.
2. Locate
the Analyze Table syntax for the appropriate database type.
For example, the syntax for an Oracle database is as follows:
<SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL COLUMNS SIZE AUTO',cascade => true ) </SqlQuery>
3. Edit
the syntax.
For example, to gather statistics for only the indexed columns,
edit the syntax as follows:
<SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL INDEXED COLUMNS',cascade => true ) </SqlQuery>
Hello Venkat, Its very useful information. I am trying to change customsql.xml to create index in parallel, ex:
ReplyDeletebut its not taking up.
CREATE %1 INDEX
%2
ON
%3
(
%4
)
PARALLEL 10 NOLOGGING
This comment has been removed by the author.
ReplyDelete