My e-Notes about DataScience, Machine Learning, Python, Data Analytics, DataStage, DWH and ETL Concepts


Tuesday, 28 February 2017

ETL Preformance Tuning : Identification of Performance Bottleneck

The goal of performance tuning is to optimize session performance by eliminating performance bottlenecks. To tune the performance of a session, we need to identify a performance bottleneck, eliminate it, and then identify the next performance bottleneck until we are satisfied with the session performance. Test Load option can be used to run sessions while tune session performance.
The most common performance bottleneck occurs when the ETL Server writes to a target database. We can identify performance bottlenecks by the following methods:

Running test sessions. We can configure a test session to read from a flat file source or to write to a flat file target or any stage which can hold the data but not write, to identify source and target bottlenecks.

Studying performance details. We can create a set of information called performance details to identify session bottlenecks. Performance details provide information such as buffer input and output efficiency. Collect Performance Data option in Session Property will enable the Session to generate a Counter of Input and Output rows through each Transformation

Monitoring system performance. System monitoring tools can be used to view percent CPU usage, I/O waits, and paging to identify system bottlenecks.

Once the location of a performance bottleneck is determined, we can eliminate the bottleneck by following these guidelines:

Eliminate source and target database bottlenecks. 
Optimize the query, Increase the database network packet size, or   configuring index and key constraints.

Eliminate mapping bottlenecks. 
Fine tune the pipeline logic and transformation settings and options in  mappings to eliminate mapping bottlenecks.

Eliminate session bottlenecks. 
Session strategy can be optimized performance details can be used to help in tuning session  configuration.

Eliminate system bottlenecks. 
Have the system administrator analyze information from system monitoring tools and improve CPU and network performance.

If all the bottlenecks above are tuned, further optimization of session performance can be done by increasing the number of pipeline partitions in the session. Adding partitions can improve performance by utilizing more of the system hardware while processing the session.

Like the below page to get update


The postings on this site are my own and don't necessarily represent IBM's or other companies positions, strategies or opinions. All content provided on this blog is for informational purposes and knowledge sharing only.
The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of his information.