Showing posts with label transformer. Show all posts
Showing posts with label transformer. Show all posts

Thursday, 14 September 2017

Evaluation Sequence in Transformer Stage - A Quick DataStage Recipe


What is evaluation sequence in Transformer Stage Or Order of Stage & Loop Variable and Derivations


1. Transformer Stage
     a. Stage Variables
     b. Loop Variables
     c. Derivations

How To:

Evaluate each stage variable initial value
For each input row to process:
Evaluate each stage variable derivation value, unless the derivation is empty
For each output link:
Evaluate each column derivation value
Write the output record
Next output link
Next input row

** The stage variables and the columns within a link are evaluated in the order in which they are displayed in the Transformer editor. Similarly, the output links are also evaluated in the order in which they are displayed

Like the below page to get update

Wednesday, 5 July 2017

Conditionally Aborting Jobs with Transformer Stage

How to develop a job which will stop processing when FROM_DATE and TO_DATE is equal in data? Or
I want to abort the job when reject row count is more than 50?

Above scenarios can be implemented using Transformer Stage but How? Let's check this out -

  • The Transformer can be used to conditionally abort a job when incoming data matches a specific rule. 
    • In our case 1, it is FROM_DATE  = TO_DATE 
    • In our case 2, it is some reject condition 
  • Create a new output link that will handle rows that match the abort rule. 
  • Within the link constraints dialog box, apply the abort rule to this output link
  • Set the “Abort After Rows” count to the number of rows allowed before the job should be aborted .
    • In case 1, it should be 1. as we want to abort the job when FROM_DATE is equal to TO_DATE
    • In case 2, it should be 50 as we want to abort the job when reject condition have more than 50 records

But, since the Transformer will abort the entire job flow immediately, it is possible that valid rows will not have been flushed from Sequential File (export) buffers, or committed to database tables.
It is important to set the Sequential File buffer flush  or database commit parameters otherwise we have to manually remove the data which has been inserted into sequential file or database.

Like the below page to get update

Friday, 6 May 2016

DS Fatal Error: Destination "APT_TRinput0Rec0" is already bound

Fatal Error: Destination "APT_TRinput0Rec0" is already bound - Transformer Stage Error

* Check if the output stage is having identical column names
* Check if RCP is enabled in input links

If yes,
Rename the target output name accordingly
or Disable the RCP>

Like the below page to get update

Tuesday, 3 May 2016

Otherwise Constraint - A Quick DataStage Recipe


How to use "Otherwise" constraint in Transformer Stage

How To:

To use "Otherwise" constraint in Transformer stage, Order of link is important.
Typically link with "Otherwise" constraint should be last in Transformer stage link order

Like the below page to get update

Sunday, 10 April 2016

5 Tips For Better DataStage Design #12

1. Minimum number of sort stages should be use in a datastage job. “Don’t sort if previously sorted” in sort Stage, this option should be set this to “true”, which improves the Sort Stage performance. The same Hash key should be used.  In Transformer Stage “Preserve Sort Order” can be used to maintain sort order option.

2. Minimum number of stages should be used in a job; otherwise it affects the performance of the job.
If a job is having more stages then the job should be decomposed into smaller number of small jobs. The use of container is a best way for better visualize and readability. If the existing active stages occupy almost all the CPU resources, the performance can be improved by running multiple parallel copies of the same stage process. This is done by using a share container.

3. Use of minimum of Stage variables in transformer is a good practice. The performance degrades when more stage variables are used.

4. The use of column propagation should be taken care . Columns, which are not needed in the job flow, should not be propagated from one Stage to another and from one job to the next. The best option is to disable the RCP.

5. When there is a need of renaming columns or addition of new columns, use of copy or modify stage is good practice.

Like the below page to get update

Tuesday, 22 March 2016

Transformer Stage alternative - A Quick DataStage Recipe

What to use instead of "Transformer" Stage

Copy Stage

Use "Copy" stage instead of "Transformer" Stage for following:
Renaming columns
Dropping columns
Default type conversions
Job design placeholder between stages

Modify Stage

Use "Modify" stage
Non default type conversions
Null handling
Character string trimming

Filter Stage

Use "Filter" Stage
Using constraints on output data

Will add more.......

Like the below page to get update

Friday, 11 December 2015

DataStage Scenario #11 - Get numeric or alphabets only

Goal - Extract numeric part and alpha part from a string as below



Output -

Source Part1  Part2
ATUL1234 ATUL 1234
I23S IS 23

Like the below page to get update  

Wednesday, 21 October 2015

DataStage Scenario #8 - Populate date between given two

Scenario :

Design a job with 2 job parameters -

1. Start Date   ( DD-MM-YYYY )
2. End Date  ( DD-MM-YYYY )

Read the parameter and populate all the date between these 2 which is not "SUNDAY"

Like the below page to get update!forum/datagenx

Tuesday, 20 October 2015

NULL behavior in datastage transformer

NULL - It is always a challenge for developers or architect while dealing with NULL in data. Every projects need some base rules to process NULLs.
     Today, we will see how the NULL behave in datastage, I hope, this will help you design a better job or flow to process the NULLs.

1. Null values can now be included in any expression

2. Null values no longer need to be explicitly handled

3. A null value in an expression will return a null value result. Example:
           1 + null = null
           "abc":null = null
           trim(null) = null

4. Exception: IsNull(null) and IsNotNull(null) will return true or false as expected

5. Any comparison involving null is unknown (i.e. is not true). Example:
          1 < null is not true
          1 >= null is not true
         null == null is not true
         null != null is not true

6. Logical operators - null is unknown (i.e. is not true)
        True AND null is not true
        True OR null is true
        False OR null is not true

7. As long as the target column is nullable records will not be dropped

8. Stage Variables are now always nullable

9. Stage Variable derivations can now include any input fields

Like the below page to get update!forum/datagenx

Tuesday, 29 September 2015

DataStage Scenario #6 - Populate 'Not Found'

       Design a job which read a flat file and populate the output as NOT FOUND which is not available in the other file.

Example Input:

Input1 Input2
Atul Amruta
Neeraj Atul
Anita Divya
Amruta Santosh
Divya Pramod



Like the below page to get update!forum/datagenx

Monday, 14 September 2015

DataStage Scenario #2 - Find Palindrome Number

Develop a datastage job to identify whether input data is a palindrome number or not, If it is, need to print the output as below - 



For more Scenario -  CLICK HERE

Friday, 28 August 2015

Datastage Transformer String Functions - Index

Index(String, Substring, Occurrence):   
It will returns starting character position of substring. We can use it as a 'grep' command of Linux when we have to check if given character or substring is existed in input string or not. 
1. Suppose, we have to check whether substring 'le' is existing in input column 'Incol' or not?

Transformation derivation  -
If Index(Incol, 'le', 1) > 0 then 'Exist' Else 'Not Exists'

2. Let's assume we are getting 'India' as a input string, what will be the output of below derivations?
Transformation derivation  -
Index('India', 'a', 1)         =  5
Index('India', 'i', 1)         =   4   # this will not give 1 as matching is case sensitive
Index('India', 'nd', 1)      =   2