Saturday, 30 July 2016

#1 How to Copy DataSet from One Server to Another Server



Hi Guys...
I've been asked so many times that how can we move/copy one dataset from one server to another So here is the way which I follow.

At very first step, Analyze if you can avoid this by using some other way like creating sequential file and ftp Or load the data into temporary table which can be accessible on another server, if using datastage packs then via mqs, xml or json formats etc. Why I am suggesting these solutions coz these are easy to design and guaranteed the data quality at other end.

If above solutions are not possible, please follow the below steps -

Points I am going to cover here -
1. Generating a dummy dataset
2. Files which we need to move
3. Why we need these files only
4. Reading the dataset on another server

http://www.datagenx.net/2016/06/datastage-quiz-1.html

 

1. Generating a dummy dataset

I have created a dummy job which is generating a dataset with default APT_Config_file which has 2 nodes.

http://www.datagenx.net/2015/12/how-to-use-universe-shell-uvsh-in.html





Here, I am generating 10 dummy rows with the help of Row Generator stage and storing them into a datasset.

a. Config File - I am using the default config file (replaced the server name in "fastname" with serverA)

APT_CONFIG_FILE = /opt/IBM/InformationServer/Server/Configulations/default.apt

http://www.datagenx.net/2016/06/5-tips-for-better-datastage-design-14.html

check out the "resource disk" location in config file, we need it for further processing

RESOURCE DISK = /opt/IBM/InformationServer/Server/DataSets

b. dataset location - I have created this dataset in my home dir named dummy.ds

DATASET LOC = /home/atul/dummy.ds


Keep looking for next post........





Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Thursday, 28 July 2016

Algorithm Books


If you dont have the dropbox account, create from here - https://db.tt/H5VKcNA0

Algorithm Design and Applications [Michael T. Goodrich and Roberto Tamassia] - Book Link and PPTs Link





Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Friday, 22 July 2016

DataStage job got Queued



Today, when I was working on my DEV server ( DS V11.5), the very first time I have faced this 'job queued' status so I waited for some time but my jobs were still in 'queue' so googled this.
I was surprise to know :P that IBM introduced this feature from V9.1 and here I was totally unaware of that.
So sharing what I have found on Developer's GOD - Google   :-)

              Courtesy - www.ibm.com


IBM introduced this feature as Workload Management (WLM) where it is capable to manage the priority and order of DataStage jobs which basically derived from many points such as Max no of running jobs, Memory and Server load etc.
You can manage all these configutations setting under Operation Console. More information you can find on this IBM Knowledge Base Link


I have resolved my issue by resetting the value of WLMON (=0) in DSODBConfig.cfg file.



Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Thursday, 21 July 2016

DBMS Books



If you dont have the dropbox account, create from here - https://db.tt/H5VKcNA0

Database System Concepts 6E - Korth               Book Link and PPTs Link  
Database System 6E_Navathe - Navathe            Book Link and PPTs Link    





** The books and its content belongs only to its author and publication, Shared here for personal learning purpose. 



Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Tuesday, 19 July 2016

ETL Strategy #1



The ETL Load will be designed to extract data from all the required Data Sources. The data to feed the Next Gen BI Database will have to be brought from the sources with specific Extraction, Transformation & Load (ETL) processes. It is essential to define a good ETL strategy to ensure that the execution of these tasks will support appropriate data volumes and Design Should be Scalable, and Maintenance free.


Initial Load Strategy

The Initial Load process is there to support the requirement to include historical data that can’t be included through the regular Delta Refresh.
For the Next Gen BI project, it is expected to have a full extract available for the required sources to prepare an Initial Load prior to the regular Delta Refresh. This Initial extraction of the data will then flow through the regular transformation and load process.
As discussed in the control process, Control tables will be used to initiate the first iteration of the Initial ETL process, a list of source table name with extraction dates will be loaded in the control tables. ETL process can be kicked off through the scheduler and the ETL process will read the control tables and process the full extract.
The rest of the process for an Initial load is same as the delta refresh. As shown in the flow chart under the section (Delta from Sources extracted by Timestamp), The only difference is the loading of the control tables to start the process for the first time when a table gets loaded in the Data Warehouse.


Delta Refresh or CDC Strategy

The Delta refresh process will apply only the appropriate transactions to the Data Warehouse. The result is a greatly reduced volume of information to be processed and applied. Also the Delta transactions for the Warehouse can be reused as input for the different Data mart, since they will be part of the Staging area and already processed. As discussed in the Control Process Strategy, control tables will be used to control delta refresh process.


Continued......ETL Strategy #2



Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/