My scrapbook about almost anything I stumble upon but mostly around Cloud, K8s, OpenShift, DataScience, Machine Learning, Golang, Python, Data Analytics, DataStage, DWH and ETL Concepts. If you find any pages useful don't forget to give thumbs-up :)


Friday, January 8, 2016

5 Tips For Better DataStage Design #7

#1. In case the partition type for the next immediate stage is to be changed then the ‘Propagate partition’ should be set to ‘Clear’ in the current stage.

#2. Make sure that appropriate partitioning and sorting are used in the stages, where ever possible. This enhances the performances. Make sure that you understand the partitioning being used. Otherwise leave it auto.

#3. For fixed width files, final delimiter should be set to 'none' in the file format property.

#4. If any processing stage requires a key ( like remove duplicate, merge, join, etc ) the Keys, sorting keys and Partitioning keys should be same and in the same order

#5. To improve Funnel, all the input links must be hash partitioned on the sort keys.

Like the below page to get update!forum/datagenx

No comments:

Post a Comment


The postings on this site are my own and don't necessarily represent IBM's or other companies positions, strategies or opinions. All content provided on this blog is for informational purposes and knowledge sharing only.
The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of his information.