Showing posts with label Lookup. Show all posts
Showing posts with label Lookup. Show all posts

Thursday, 16 June 2016

5 Tips For Better DataStage Design #14



1. The use of Lookup stage depends upon the volume of data.Sparse lookup type should be used when primary input data volume is small.If the reference data volume is more, Lookup Stage should be avoided.

2. Use of ORDER BY clause in the database is good as compared to use of sort stage.



3. In Dtatastage Administrator, Tuned the 'Project Tunable' for better performance.

4. For Funnel, the use of this stage reduces the performance of a job. Funnel Stage should be run in continuous mode.

5. If the hash file is used only for lookup then "enable Preload to memory". This will improve the performance.






Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Tuesday, 17 May 2016

Lookup Stage behaviour



Today, I am gonna ask you a question, What value I will get from lookup when my datatype is integer (Not Null) and there is no match b/w source and reference data???

Generally, we say, NULL as there is no match b/w source and reference. But that's not true.
So let's see how the DataStage and Lookup behave :-)

http://www.datagenx.net/2016/05/lookup-stage-behaviour.html
When Source and Reference are NULLable -
-       If there is no match b/ source and reference, we will get NULL in output 

When Source and Reference are Not-NULLable -
-       If there is no match b/ source and reference, we will get DataStage Defaults for that datatype.
        such as - 0 for integer and empty string or '' for varchar when data is going out from lookup stage.

So, Be careful when you are planning to filter the data outside lookup stage based on referenced columns value as field in output file is not null, transformer stage don't receive a null (because it comes with the default value 0) and can't handle it as you expec.

Hoping, this will add one pointer in your learning. Let me know your thoughts in comment section.




Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Monday, 1 February 2016

5 Tips For Better DataStage Design #8



#1. Templates have to be created to enhance reusability and enforce coding standard. Jobs should be created using templates.
#2. The template should contain the standard job flow along with proper naming conventions of components, proper Job level annotation and short/long description. Change record section should be kept in log description to keep track.



#3. Don't copy the job design only. copy using 'save as' or create copy option at the job level.
#4. The DataStage connection should be logged off after completion of work to avoid locked jobs.
#5. Creation of common lookup jobs
#6. Some extraction jobs can be created to created reference datasets. The datasets can then be used in different conversion modules





Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://groups.google.com/forum/#!forum/datagenx

Wednesday, 25 November 2015

DataStage Scenario #10 - two realtime scenario



Hello guys, Hoping you are enjoying while solving DataStage Scenarios. Today I am going to ask two real time scenario. try to solve these :-)

Scn1:
We have to design a job, which will extract data from table tab1, when we get some value in a file file1. No relation between table and file.
Simple Hhh? Let's make it little restricted. You can not use the sequencer job, All functionality we need in a single parallel job.





When you able to solve first one, come to this -

Scn2:
Reading source table Stab which is having 20 columns (Sc1, Sc2, Sc3.... ), Need to validate individual column from Sc1 to Scl0 from another table Rtab column Rc1 to Rc10 ( means Sc1 with Rc1, Sc2 with Rc2 .........). The condition is, If any column is got invalid whole row will be dropped and that column captured in a single reject report. Design such a way that we should get two rows in reject file if two column are not valid in a single input row.

Wish you a luck !!



Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://groups.google.com/forum/#!forum/datagenx

Tuesday, 29 September 2015

DataStage Scenario #6 - Populate 'Not Found'



Scenario:
       Design a job which read a flat file and populate the output as NOT FOUND which is not available in the other file.

Example Input:

Input1 Input2
Atul Amruta
Neeraj Atul
Anita Divya
Amruta Santosh
Divya Pramod
Swapnil
Pramod
Vivek
Ashish
Amit
Santosh



Output:

Output
Neeraj
Anita
Swapnil
Vivek
Ashish
Amit



Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://groups.google.com/forum/#!forum/datagenx

Saturday, 22 August 2015

VLookup in MS Excel





We all agree on this, we have to use MS Excel in our day to day work whether it is technical or not. A day before I came across a Excel Function which is same as DS Lookup :-) called VLookUp. I loved this function cause this reduce my work of 30 min to few seconds. So just want to share this function here, by the way there are lot of material you can find on net for this but this is my version ---


What was my task ? 
I have to pickup a hyperlink value from another sheet by matching a doc name.
If Sheet1.Doc = Sheet2.Doc Then HyperLink Else ''

My Excel sheets were look like --




The syntax of the VLOOKUP function is:
=VLOOKUP(lookup value, table range, output column index , [true/false])

Lookup Value        --- First Key column ( A1 )
Table Range         --- Lookup Table Range ( Sheet2!$A$2:$B$15 )
Output column index --- output column number ( for my case 2 (col B))
true/false          --- True means that an approximate match is acceptable, and False means that only an exact match is acceptable.

Table Range can be defined like below --
Lookup_Sheet_Name ! First_Row_First_Col : Last_Row_Last_Col 

Suppose , your sheet name is - Sheet2
First_Row_First_Col - A2    --> $A$2
Last_Row_Last_Col  - B15  --> $B$15

Table Range will be -- Sheet2!$A$2:$B$15

So,  Our VLOOKUP function will be - 
=VLOOKUP(A1, Sheet2!$A$2:$B$15, 2 , true)

Place this function to that cell where you want to populate the value from lookup table, this will work like - 
If A1 = ( table range) then pick sheet 2 column no 2 Else ''