My e-Notes about DataScience, Machine Learning, Python, Data Analytics, DataStage, DWH and ETL Concepts

Breaking

Wednesday, 2 March 2016

5 Tips For Better DataStage Design #10



1. Establish baselines (especially with I/O), use copy with no output
2. Avoid the use of only one flow for tuning/performance testing.  Prototyping can be a powerful tool.
3. Work in increments...change 1 thing at a time.
4. Evaluate data skew:  repartition to balance the data flow
5. Isolate and Solve - determine which stage is causing a problem.

6. distribute file systems (if possible) to eliminate bottlenecks
7. Do NOT involve the RDBMS in initial testing.
8. Understand and evaluate the tuning knobs available





Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

No comments:

Post a comment

Disclaimer

The postings on this site are my own and don't necessarily represent IBM's or other companies positions, strategies or opinions. All content provided on this blog is for informational purposes and knowledge sharing only.
The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of his information.