DataStage Partitioning #2

by Atul Singh on November 11, 2016 in Concept, Datastage, DB2, Entire, Hash, Key, Keyless, Learning, Modulus, MPP, Partitioning, Random, Round Robin, Same, SMP, type

Keyless partition
    Round Robin
    Entire
    Same
    Random

Round Robin
The first record goes to the first processing node, the second to the second processing node, and so on. When DataStage reaches the last processing node in the system, it starts over. This method is useful for resizing partitions of an input data set that are not equal in size. The round robin method always creates approximately equal-sized partitions. This method is the one normally used when DataStage initially partitions data.
Example: Assume 8 nodes are allocated to store the records then 1st record will go into the first node and the 2nd record will go into the second node ……8th record will go into the eighth node and the 9th record will go into the first node and so on….

Entire
Send all rows down all partitions.
Example: Assume 8 nodes are allocated, then in all the 8 nodes all the records will be passed.

Same
Preserve the same partitioning.
Example: Two stages in a job (Sort and Dataset). In sort stage you have done “Hash” partition and in the dataset you have given “Same” partition. In the dataset the data will be preserved with the hash partition.

Random
DataStage uses a random algorithm to choose where the rows goes. The result of Random is that you cannot know where a row will end up.

Application Execution: Parallel jobs can be executed in two ways
> Sequential
> Parallel
In the first slot of the below figure sequential execution is shown. Parallel job can be executed in two processing, SMP and MPP, in the second and third slot of the below figure it is shown.

Ref - www.ibm.com

Like the below page to get update
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

About Atul Singh
I am a Data Consultant at a Canadian financial firm. My keen interests varies from Data Analytics, ML, Kubernetes, NLP to ETL. I love to blog and travel in my spare time. If you’d like to get in touch, feel free to say hello through any of the social links.

DataGenX - Atul's Notes

Breaking

Friday, November 11, 2016

DataStage Partitioning #2

No comments:

Post a Comment

-

Follow Us

Search This Blog

Blog Archive

Disclaimer