Thursday, 14 September 2017

Evaluation Sequence in Transformer Stage - A Quick DataStage Recipe



Recipe:

What is the evaluation sequence in the Transformer Stage, i.e. the order in which stage variables, loop variables and derivations are evaluated?

Ingredients:

1. Transformer Stage
     a. Stage Variables
     b. Loop Variables
     c. Derivations


How To:

Evaluate each stage variable initial value
For each input row to process:
    Evaluate each stage variable derivation value, unless the derivation is empty
    For each output link:
        Evaluate each column derivation value
        Write the output record
    Next output link
Next input row


** The stage variables and the columns within a link are evaluated in the order in which they are displayed in the Transformer editor. Similarly, the output links are evaluated in the order in which they are displayed. For example, if stage variable svA is listed above svB, svA is evaluated first for every input row, and if output link lnkOut1 sits above lnkOut2, lnkOut1's columns are derived and written before lnkOut2's.





Tuesday, 12 September 2017

A Newer Version of Jupyter Notebook - Jupyter Lab


We all use Jupyter Notebook (previously known as IPython Notebook) a lot when researching something or doing stuff on stuff :-)
For those who don't know what it is: it is a browser-based notebook which holds Python code as well as its executed output. You can export it to HTML, PDF or its native format (*.ipynb).
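
If you have never tried the export, it is a one-liner with nbconvert on the command line (the notebook name here is just a placeholder, and the PDF export additionally needs a LaTeX installation):

$ jupyter nbconvert --to html my_analysis.ipynb
$ jupyter nbconvert --to pdf my_analysis.ipynb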

So, coming back to the topic: a next generation of the Jupyter Notebook is available. Try it once - I am sure you will fall in love with it. Let's quickly check how you can get it -
Installation:
$ pip install jupyterlab

Execution:
$ jupyter lab

Features which I like the most:
1. You get a file browser on the left side of the notebook window for easy access to your files
2. Five quick-access buttons in the leftmost panel (Files, Running, Commands, CellTools & Tabs)
3. Each notebook opens inside the same browser tab, which means there is one browser tab and multiple Jupyter notebook tabs within it


For more details, visit - https://github.com/jupyterlab/jupyterlab






Monday, 11 September 2017

DataStage: Calling a Script on a Remote Server


Step 1: Set up the UNIX servers so that login happens automatically without prompting for a password.

1. SSH must be installed on both servers (primary and remote)
2. A user ID on both servers

You can find the step-by-step detail at this link - Configuring_SSH_on_Linux
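
In short, the key-based setup usually looks like the below (a sketch only, assuming OpenSSH on both servers and the same UserB/ServerB names used in Step 2; the linked post has the full detail):

$ ssh-keygen -t rsa                 # on the primary server, as the user the DataStage job runs under; accept the defaults
$ ssh-copy-id UserB@ServerB         # appends the public key to ~/.ssh/authorized_keys on the remote server
$ ssh UserB@ServerB hostname        # should print the remote hostname without asking for a password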


Step 2: Creating Datastage job to run script in a remote server

1. Create a new sequence job
2. Add an Execute Command stage
3. In the Command text box on the ExecCommand tab, type -

 ssh UserB@ServerB ksh /home/b/test.ksh

This command will execute the script on the remote server.
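
One point worth checking: ssh exits with the return status of the remote script (or 255 if the connection itself fails), and that status is what the Execute Command activity sees, so the sequence can branch or abort when the script fails. You can confirm it from the primary server first:

$ ssh UserB@ServerB ksh /home/b/test.ksh
$ echo $?        # 0 means the remote script succeeded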





Friday, 8 September 2017

BI Report Testing Trends


Extending my quite old post (ETL Testing - Trends and Challenge - link) by sharing my thoughts on BI Report Testing Issues and Solutions. Feel free to add your views and comments if any.

Business Intelligence
Business Intelligence (BI) and Data Warehouse (DW) systems allow companies to use their data for data analysis, business scenarios and forecasting, business planning, operation optimization and financial management and compliance. To construct Data Warehouses, Extract, Transform and Load (ETL) technologies help to collect data from various sources, transform the data depending on business rules and needs, and load the data into a destination database.

Consolidating data into a single corporate view enables the gathering of intelligence about your business, pulling from different data sources such as your physician, hospital, lab and claims systems.


  • BI Report Testing Trends
Once the ETL part is tested, the data shown on the reports holds utmost importance. The QA team should verify the reported data against the source data for consistency and accuracy.
  • Verify report data against the source (DWH tables/views)
QA should verify the report data (at field/column level) against the source by writing the required SQL themselves, based on the different filter criteria available on the report's filter page.
  • Creating SQLs
Create SQL queries to fetch and verify the data from source and target (a minimal sketch follows this list). Sometimes it is not possible to reproduce the complex transformations done in ETL in a query; in such cases the data can be exported to a file and the calculations performed there.
  • GUI & layout
Verify the report GUI (selection page) and the layout of the report output.
  • Performance verification
Verify the report's performance: the response time should stay under the predefined limit specified by the business need. The report's performance can also be tested with multiple concurrent users (the number of users expected to access the report at the same time, a limit that should also be defined by the business need).
  • Security verification
Verify that only authorized users can access the report, or specific parts of the report that should not be available to general users.
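
As a minimal sketch of such a reconciliation check (the schema, table and view names are made up, and the DB2 command-line client is only an example - use whichever client your warehouse provides):

$ db2 connect to DWHDB
$ db2 -x "SELECT COUNT(*), SUM(CLAIM_AMT) FROM DWH.CLAIM_FACT      WHERE CLAIM_YEAR = 2017"
$ db2 -x "SELECT COUNT(*), SUM(CLAIM_AMT) FROM RPT.CLAIM_SUMMARY_V WHERE CLAIM_YEAR = 2017"

If the counts and totals match for the same filter values, the report figure is reconciled; any difference points to a transformation or filter issue to investigate.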




