My e-Notes about DataScience, Machine Learning, Python, Data Analytics, DataStage, DWH and ETL Concepts


Sunday, 10 April 2016

5 Tips For Better DataStage Design #12

1. Minimum number of sort stages should be use in a datastage job. “Don’t sort if previously sorted” in sort Stage, this option should be set this to “true”, which improves the Sort Stage performance. The same Hash key should be used.  In Transformer Stage “Preserve Sort Order” can be used to maintain sort order option.

2. Minimum number of stages should be used in a job; otherwise it affects the performance of the job.
If a job is having more stages then the job should be decomposed into smaller number of small jobs. The use of container is a best way for better visualize and readability. If the existing active stages occupy almost all the CPU resources, the performance can be improved by running multiple parallel copies of the same stage process. This is done by using a share container.

3. Use of minimum of Stage variables in transformer is a good practice. The performance degrades when more stage variables are used.

4. The use of column propagation should be taken care . Columns, which are not needed in the job flow, should not be propagated from one Stage to another and from one job to the next. The best option is to disable the RCP.

5. When there is a need of renaming columns or addition of new columns, use of copy or modify stage is good practice.

Like the below page to get update

No comments:

Post a comment


The postings on this site are my own and don't necessarily represent IBM's or other companies positions, strategies or opinions. All content provided on this blog is for informational purposes and knowledge sharing only.
The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of his information.