
Tuesday, 20 June 2017

Crontab for Windows #iLoveScripting


While working on one of my projects, I needed to take a backup of all my completed work, since the workspace is shared among many developers.
     So, being a Linux person, I was looking for something simple like crontab but ended up with Windows Task Scheduler.

The tool is simple, though not as simple as the Linux crontab, but it did the job I asked of it :-)

How to Use Task Scheduler - 

  • Log in with a user account that has admin privileges
  • Open Run and type "Taskschd.msc"
  • Or go to Start --> Control Panel --> System and Maintenance --> Administrative Tools --> Task Scheduler
  • Click on "Create Task" on the right-hand side
  • This will open a wizard to create the task
  • Fill in the Task Name, Owner, Privilege and "Configure for" fields
  • Now click on the next tab, "Triggers"; here you define the time when you want the program to execute
  • You can fill in the different settings to customize your schedule
  • Now click on the "Actions" tab
  • In this tab, you define the action, i.e. which program/script should be executed when the task is triggered
  • Click OK
  • You can see your task created under "Task Scheduler Library"
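If you prefer a crontab-like one-liner, the same kind of task can also be created from the command prompt with the built-in schtasks utility. This is just a sketch - the task name, script path and time below are placeholders:

schtasks /Create /TN "WorkspaceBackup" /TR "C:\scripts\backup.bat" /SC DAILY /ST 23:00

This creates a task named "WorkspaceBackup" that runs backup.bat every day at 23:00.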


For more details on Task Scheduler, you can visit the link below -
https://technet.microsoft.com/en-us/library/cc748993(v=ws.11).aspx





Monday, 21 November 2016

Reading DSParams - DataStage parameter file



I am sharing a utility which can help you read the DSParams file, which holds all the DataStage environment parameters.

It is a utility to view the contents of the DSParams file - useful when trying to see what the customer has set at the project level.



Usage:
$ cat DSParams | ./DSParamReader.pl | more
or
$ cat DSParams | ./DSParamReader.pl > outputfile


Instructions:
1. Copy the script text below to a file (DSParamReader.pl) on a UNIX system.
2. Set execute permissions on this file: chmod 777 DSParamReader.pl
3. Perl is usually in /usr/bin/perl, but you might have to adjust this path if necessary (hint: "which perl" should tell you which one to use).
4. cat the DSParams file from the project you are concerned with and pipe the output into this script. You may have to use the fully qualified path for this file.
5. Capture the output to the screen or to a file. A file may be useful if the customer needs to send the info to you by email.
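The original script text did not survive here, so below is a minimal Perl sketch with the same usage. It assumes the DSParams file groups its settings into sections such as [EnvVarDefns] and [EnvVarValues] with backslash-delimited fields - verify the layout against your own DSParams file before relying on the output.

#!/usr/bin/perl
# DSParamReader.pl - minimal sketch, not the original utility.
# Assumes DSParams sections like [EnvVarDefns] / [EnvVarValues]
# with backslash-delimited fields.
use strict;
use warnings;

my $section = '';
while ( my $line = <STDIN> ) {
    chomp $line;
    if ( $line =~ /^\[(.+)\]/ ) {        # section header, e.g. [EnvVarValues]
        $section = $1;
        print "\n== $section ==\n";
        next;
    }
    next unless $line =~ /\\/;           # skip lines without delimited fields
    my @f = split /\\/, $line;
    if ( $section eq 'EnvVarValues' ) {  # value lines look like: "NAME"\1\"VALUE"
        ( my $name = $f[0] )  =~ s/"//g;
        ( my $val  = $f[-1] ) =~ s/"//g;
        printf "%-40s = %s\n", $name, $val;
    }
    elsif ( $section eq 'EnvVarDefns' ) {
        print "$f[0]\n";                 # first field is the variable name
    }
}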









Tuesday, 11 October 2016

5 Tips For Better DataStage Design #16


1. Use a 4-node configuration file for unit testing/system testing the job.
2. If there are multiple jobs to be run for the same module, archive the source files in the after-job routine of the last job.
3. Check whether the file exists in the landing directory before moving the sequential file; otherwise the 'mv' command may move the landing directory itself (for example, when the filename variable resolves to empty). A sketch of this check follows this list.

4. Ensure that the UNIX files created by any DataStage job are created by the same UNIX user who ran the job.
5. Make sure that the Short Job Description is filled in using a 'Description Annotation' and that it contains the job name as part of the description. Don't use a plain Annotation for the job description.
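A minimal shell sketch of the check in tip 3 (the directory and file names are placeholders):

LANDING_DIR=/data/landing          # placeholder path
FILE_NAME=input_file.txt           # placeholder file name

if [ -f "$LANDING_DIR/$FILE_NAME" ]; then
    mv "$LANDING_DIR/$FILE_NAME" /data/archive/
else
    echo "ERROR: $LANDING_DIR/$FILE_NAME not found" >&2
    exit 1
fi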






Wednesday, 3 August 2016

#3 How to Copy DataSet from One Server to Another Server

This post is the third and last of the How to Copy DataSet from One Server to Another Server series.

We have generated and populated a dataset and identified the files which we need to move from serverA to another server, serverB.

Continue.......

4. Reading the dataset on another server

This is the most crucial step. By now, all 4 files have been moved to serverB, or to a common location which is accessible from serverB.

In my case, the common dir is my home - /home/users/atul


A. Change the default.apt file
We need to change the fastname in the default.apt (config file) which we copied from serverA [NOT in the default.apt of serverB].

Open the file in any text editor or vi and change the fastname entries, for example -
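Assuming serverA and serverB are the fastnames, a one-line edit could look like this (a sketch - use the actual host names from your config file):

sed -i 's/fastname "serverA"/fastname "serverB"/' ~/default.apt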


Temporarily create the "resource disk" and "resource scratchdisk" locations defined in the above config file, if they do not already exist.
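For instance (the scratchdisk path is a placeholder - take the real one from your default.apt):

mkdir -p /opt/IBM/InformationServer/Server/DataSets
mkdir -p /opt/IBM/InformationServer/Server/Scratch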

B. Copy the dataset data files

Move the dataset data files from the common directory to the "resource disk" location defined in the config file.

cp ~/dummy.ds.* /opt/IBM/InformationServer/Server/DataSets/


Now all the file locations are as follows -

Config file and dataset descriptor file - my home dir (the common dir)
Dataset data files - /opt/IBM/InformationServer/Server/DataSets/


Design a job which will read these dataset files and populate the data into a sequential file or any other output.


Job Parameters -
APT_CONFIG_FILE = /home/users/atul/default.apt

DataSet Properties
DataSet File - /home/users/atul/dummy.ds

That is all - you can now read the copied dataset on serverB. You can populate this data into some other output, such as a sequential file or a table, so that you can avoid further use of the copied default.apt config file, which does not belong to serverB.

Try it out, and let me know if you have any questions.





Monday, 1 August 2016

#2 How to Copy DataSet from One Server to Another Server


This post is the second part of How to Copy DataSet from One Server to Another Server.

Continue.......

After generating the dummy dataset, the next step is to identify the files which we need to copy.

2. Files which we need to move

a. APT_CONFIG_FILE - the configuration file used by the dataset
b. DataSet descriptor file - the *.ds file; in our case it is dummy.ds
c. DataSet data files - the actual data files, stored in the RESOURCE DISK location

So let's get all the paths which we need to access -

APT_CONFIG_FILE = /opt/IBM/InformationServer/Server/Configurations/default.apt
RESOURCE DISK = /opt/IBM/InformationServer/Server/DataSets
DATASET LOC = /home/users/atul/dummy.ds



Use commands or any FTP tool to copy these files to a shared location which is accessible from the other server (serverB).

In my case, I have stored all of them in my Linux home directory, which is common to both servers.

So I executed these commands to copy all the required files into my home directory.


cp  /opt/IBM/InformationServer/Server/Configurations/default.apt ~
cp  /opt/IBM/InformationServer/Server/DataSets/dummy.ds.* ~
cp  /home/users/atul/dummy.ds ~


Now my home directory contains these files -


You can copy these 4 files to serverB, wherever you want to move your dataset. I am not doing so, as my home directory is common to both servers.
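If your home directory is not shared between the servers, a simple scp would do the copy (a sketch - the user, host and target directory are placeholders):

scp ~/default.apt ~/dummy.ds ~/dummy.ds.* atul@serverB:/home/users/atul/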

3. Why we need only these files

The config file was used by DataStage to create the dataset (descriptor file, data files, data file locations).
So we need the config file, the dataset descriptor file and the dataset data files.






Saturday, 30 July 2016

#1 How to Copy DataSet from One Server to Another Server



Hi Guys...
I've been asked so many times how we can move/copy a dataset from one server to another, so here is the way I follow.

As a very first step, analyze whether you can avoid this altogether by using some other approach, like creating a sequential file and FTPing it, or loading the data into a temporary table which is accessible on the other server, or, if you are using DataStage packs, via MQ, XML or JSON formats, etc. I am suggesting these solutions because they are easy to design and guarantee the data quality at the other end.

If the above solutions are not possible, please follow the steps below -

Points I am going to cover here -
1. Generating a dummy dataset
2. Files which we need to move
3. Why we need these files only
4. Reading the dataset on another server


1. Generating a dummy dataset

I have created a dummy job which generates a dataset with the default APT_CONFIG_FILE, which has 2 nodes.






Here, I am generating 10 dummy rows with the help of the Row Generator stage and storing them in a dataset.

a. Config file - I am using the default config file (with the server name in "fastname" replaced by serverA)

APT_CONFIG_FILE = /opt/IBM/InformationServer/Server/Configurations/default.apt


Check out the "resource disk" location in the config file; we will need it for further processing.

RESOURCE DISK = /opt/IBM/InformationServer/Server/DataSets
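For reference, a 2-node config file of this kind looks roughly like the sketch below (the scratchdisk path is a placeholder; your default.apt will have its own values):

{
    node "node1"
    {
        fastname "serverA"
        pools ""
        resource disk "/opt/IBM/InformationServer/Server/DataSets" {pools ""}
        resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
    }
    node "node2"
    {
        fastname "serverA"
        pools ""
        resource disk "/opt/IBM/InformationServer/Server/DataSets" {pools ""}
        resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch" {pools ""}
    }
}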

b. Dataset location - I have created this dataset in my home dir, named dummy.ds

DATASET LOC = /home/users/atul/dummy.ds


Keep an eye out for the next post........






Thursday, 19 May 2016

5 Tips For Better DataStage Design #13



1. Write the query used against the database in such a way that only the required number of rows is fetched. Do not extract columns which are not required.

2. For parallel jobs, a sequential file should not be read using Same partitioning.




3. For huge amounts of data, using the Sequential File stage is not good practice. This stage should also not be used for intermediate storage between jobs, as it degrades job performance.

4. The number of lookups in a job design should be kept to a minimum. The Join stage is a good alternative to the Lookup stage.

5. For parallel jobs, use a proper partitioning method for better job performance and an accurate flow of data.






Monday, 22 February 2016

Python Points #10d - Writing into Files

Friday, 5 February 2016

DataStage Scenario #14 - Get Day specific file



Hi Guys,
Design jobs for one more real-time scenario -


Requirement -
The job will run once a day and read a file from a folder, but the filename changes each day.

File Name -- FileName_Date.txt

Here -
Date - the job run date
FileName - the file name


Example -

FileName on Monday -   InfoSphere_20160201.txt
FileName on Tuesday -   Info_Search_20160202.txt
FileName on Wednesday -   InfoLables_20160203.txt
FileName on Thursday -   InfoLocation_20160204.txt
FileName on Friday -   InfoOptions_20160205.txt
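One way to resolve the date part of the name is to compute it in a wrapper script and pass it to the job as a parameter, for example (a sketch - the prefix and directory are placeholders):

RUN_DATE=$(date +%Y%m%d)                      # e.g. 20160201
FILE_NAME="InfoSphere_${RUN_DATE}.txt"        # prefix is a placeholder
echo "Today's file: /data/landing/$FILE_NAME"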





Thursday, 3 December 2015

Python Installation from Source in Linux