Showing posts with label commands. Show all posts

Saturday, 1 April 2017

Get Over Running DataStage Job Details


The script below lists the jobs that are taking longer to complete than they did on their last run. With a few tweaks, the same script can be used to monitor any kind of process.






Like the pages below to get updates
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Monday, 10 October 2016

MarkDown Cheat Sheet

Tuesday, 31 May 2016

Python Points #13 - Comprehensions

Wednesday, 18 May 2016

Windows & Linux - Command Comparison



I am a Linux command-line geek, and when I work on Windows I miss it, as there is a lot of kung-fu that is easier to do in a Linux terminal than in any GUI tool. So I thought of creating a list of equivalent Windows and Linux commands.

Let me know if there is any everyday command that I missed. I hope this helps you guys.

www.datagenx.net/2016/05/windows-linux-command-comparison.html



Command's Purpose | MS-DOS | Linux | Basic Linux Example
----------------- | ------ | ----- | -------------------
Copies files | copy | cp | cp thisfile.txt /home/thisdirectory
Moves files | move | mv | mv thisfile.txt /home/thisdirectory
Lists files | dir | ls | ls
Clears screen | cls | clear | clear
Closes shell prompt | exit | exit | exit
Displays or sets date | date | date | date
Deletes files | del | rm | rm thisfile.txt
"Echoes" output to the screen | echo | echo | echo this message
Edits text files | edit | gedit | gedit thisfile.txt
Compares the contents of files | fc | diff | diff file1 file2
Finds a string of text in a file | find | grep | grep "word or phrase" thisfile.txt
Displays command help | command /? | man or info | man command
Creates a directory | mkdir, md | mkdir | mkdir directory
Deletes an existing directory | rmdir, rd | rmdir | rmdir directory
Views contents of a file | more | less | less thisfile.txt
Renames a file | ren | mv | mv thisfile.txt thatfile.txt
Displays your location in the file system | chdir | pwd | pwd
Changes directories with a specified path (absolute path) | cd pathname | cd pathname | cd /directory/directory
Changes directories with a relative path | cd.. | cd .. | cd ..
Displays the time | time | date | date
Shows amount of RAM in use | mem | free | free
Displays the contents of a file | type | cat | cat filename
Reverse IP lookup | nslookup | nslookup |
Checking the network status | netstat | netstat | netstat
Pinging any system | ping | ping | ping 127.0.0.1
Running processes | tasklist | ps | ps -ef
Get login username | net session | who | who
Tracing IP | tracert | traceroute |
Sets file permissions | attrib | chmod | chmod 644 thisfile.txt
Display OS version | ver | uname -a | uname -a
Sorts lines in a file | sort | sort | sort filename
OS scheduler | at | crontab | crontab -l
FTP | ftp | ftp | ftp user@host





Thursday, 31 March 2016

echo the 'echo'




The "echo" command is one of the most basic Linux/Unix commands. Most of us know it as the command that prints whatever is passed to it as arguments. Let's go over it once more in case we are missing something.


The "echo" command prints its arguments separated by a single space.
[atul@localhost ~]$ echo Atul learns commands 
Atul learns commands

Printing the arguments separated by a single space means that any extra whitespace between them is dropped: the shell splits the command line into words before echo ever sees them.
[atul@localhost ~]$ echo Atul          learns                         commands 
Atul learns commands

If you want to preserve the spaces between them, turn these 3 arguments into 1 by quoting them.
[atul@localhost ~]$ echo "Atul     learns   commands" 
Atul     learns   commands



One more example :



** Need to remember -
* echo prints its arguments separated by a single space, whether they were separated by one space or many on the command line.
* If you want to keep multiple spaces in the argument string, quote it so that it becomes a single argument.
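
The two points above can be checked directly. This is a small sketch for sh or bash; the helper name count_args and the variable msg are made up for illustration. It shows how many arguments the shell really hands over, and that the same splitting happens when an unquoted variable is expanded.

```shell
# count_args (hypothetical helper): print how many arguments were received
count_args() {
    echo "$#"
}

count_args Atul     learns     commands       # the shell passes 3 arguments
count_args "Atul     learns     commands"     # the shell passes 1 argument

msg="Atul     learns   commands"
echo $msg       # unquoted: split into words again, extra spaces are lost
echo "$msg"     # quoted: one argument, spacing is preserved
```

So it is the shell, not echo, that removes the extra spaces; echo only joins whatever arguments it receives with a single space.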





Friday, 11 March 2016

Python Points #12 - execute OS commands

Thursday, 10 March 2016

Python Points #11 - Set

Wednesday, 17 February 2016

Python Points #10c - File Methods

Saturday, 6 February 2016

Python Points #10a - Reading Files

Sunday, 31 January 2016

Create a Daemon to Trace New Processes


Description

The following code creates a daemon that watches the "ps -ef" output for processes with certain characteristics. When it finds such a process, it attaches a trace utility to it (e.g. strace, truss, tusc; you must change the code to match the platform it runs on). The tool does not need to follow forks, because any child whose "ps -ef" entry shows the same characteristics gets traced as well. This makes it useful for tracing DS PX programs that use rsh, since truss's fork flag ("-f") blocks the rsh from executing.



Usage

The script below should be saved to a file such as /tmp/tracer.sh and given rwx permissions.  Change the trace utility name in the "ps -ef" command and in the "for" loop to the one that is appropriate for your platform.  The script is then run using this syntax:
    /tmp/tracer.sh <search string>
As mentioned above, the search string can be any value that would appear in the "ps -ef" output.  Such values might be a user id, a particular time, a command, or arguments to a command.  The fifth and eighth lines of the script gather the list of all commands to be traced and then attempt to remove the commands that should be ignored.  If you find too many processes getting traced, identify why each was selected and then alter these two lines by adding a "grep -v" to the list of items being ignored.
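
For orientation, here is a minimal sketch of the approach described above. It is not the original tracer.sh, so the line numbers mentioned in the text do not apply to it. The function names, the TRACER variable, the /tmp/trace.<pid> output path and the strace-style "-p"/"-o" flags are all assumptions; other trace utilities take different options.

```shell
#!/bin/sh
# Hypothetical sketch of the watch-and-trace idea; not the original script.
TRACER=${TRACER:-strace}    # assumption: replace with truss, tusc, ... per platform

# select_pids PATTERN: read "ps -ef" lines on stdin and print the PID (column 2)
# of every line matching PATTERN, ignoring grep, the tracer and this script.
select_pids() {
    grep -- "$1" | grep -v -e grep -e "$TRACER" -e 'tracer\.sh' | awk '{print $2}'
}

# watch_and_trace PATTERN: poll "ps -ef" and attach the tracer once per new PID.
watch_and_trace() {
    seen=" "
    while :; do
        for pid in $(ps -ef | select_pids "$1"); do
            case "$seen" in
                *" $pid "*) ;;                        # already being traced
                *)  seen="$seen$pid "
                    # strace-style flags; adjust them for your trace utility
                    "$TRACER" -p "$pid" -o "/tmp/trace.$pid" &
                    ;;
            esac
        done
        sleep 1
    done
}

# Usage (runs until interrupted):
#   watch_and_trace "<search string>"
```

To ignore more processes, add another "-e" pattern to the "grep -v" inside select_pids.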









Like the pages below to get updates
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://groups.google.com/forum/#!forum/datagenx

Thursday, 28 January 2016

How to get table list used in DataStage jobs ?


While developing jobs in DataStage, we sometimes need a list of all the tables used by our DataStage jobs. Unfortunately, there is no direct way to get it.

DataStage has no command that returns the table list, but there are a few ways to get it. All the approaches discussed here need a one-time setup or some development.




1) Setting up a universe query -


We can tweak this query to get the table list for all the DataStage jobs.

2) Parsing job export XML -
a) We can parse the table names out of the job export XML file, for example with a shell script.
b) Or we can develop a DataStage job that reads this XML and parses out all the tables.

Make use of these practices during development -

3) While developing a DataStage project, make it a practice to maintain a table that holds table and job information. This will help a lot afterward.

4) Before using any table in a job, store its metadata in a DataStage Repository folder. This will let you do Usage Analysis afterward.




Wednesday, 27 January 2016

Python Points #8 - Dictionary

Sunday, 24 January 2016

ps command #4 - Sorting with sort command



We can also sort the ps command output with the Unix sort command, which is easy to use. Pipe the ps output to sort with the proper arguments and voilà! You get the output the way you want it.

Let's see how this works [ sort command arguments can differ with your Linux flavour and version ].
I am using CentOS 6.3.


1. Display the top CPU consuming process (Column 3 - %CPU)
$ ps aux | head -1; ps aux | sort -k3 -nr |grep -v 'USER'| head
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND 
atul     21210  2.0  0.1 110232  1140 pts/3    R+   00:13   0:00 ps aux
hduser    2671  0.8  4.1 960428 42436 pts/1    Sl+  Aug22   5:29 mongod
root      1447  0.2  0.3 185112  3384 ?        Sl   Aug22   1:36 /usr/sbin/vmtoolsd
atul      2478  0.2  2.1 448120 21876 ?        Sl   Aug22   1:51 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr --blockFd 3
rtkit     2359  0.1  0.1 168448  1204 ?        SNl  Aug22   0:44 /usr/libexec/rtkit-daemon
root         7  0.1  0.0      0     0 ?        S    Aug22   0:53 [events/0]
root      2204  0.1  4.3 147500 43872 tty1     Ss+  Aug22   0:45 /usr/bin/Xorg :0 -nr -verbose -audit 4 -auth /var/run/gdm/auth-for-gdm-wEmBs1/database -nolisten tcp vt1
root       920  0.0  0.0      0     0 ?        S    Aug22   0:00 [bluetooth]
root         9  0.0  0.0      0     0 ?        S    Aug22   0:00 [khelper]
root         8  0.0  0.0      0     0 ?        S    Aug22   0:00 [cgroup] 

For my Linux, the sort command arguments are --
-kn  ==> selects column n as the sort key, e.g. -k4 for column 4
-n   ==> the column is numeric
-r   ==> reverse order

sort -k3 -nr ==> sort on the third column of the output in reverse numeric order (largest to smallest)
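
The header-plus-sorted-rows pattern used in these one-liners can be wrapped in a small function. This is a sketch for sh or bash; the function name top_by_column is made up for illustration. It reads the header line once, so ps runs only a single time and no "grep -v" is needed to drop the header. It uses -kN,N because -kN alone makes the sort key run from column N to the end of the line.

```shell
# top_by_column COLUMN [COUNT] (hypothetical helper):
# read a table with one header line on stdin, print the header, then the
# COUNT rows (default 10) with the largest numeric value in COLUMN.
top_by_column() {
    col=$1
    count=${2:-10}
    IFS= read -r header || return 0     # consume only the header line
    printf '%s\n' "$header"
    # LC_ALL=C so that "2.0" is read with a dot as the decimal separator
    LC_ALL=C sort -k"$col,$col" -nr | head -n "$count"
}

# Usage:
#   ps aux | top_by_column 3      # top 10 by %CPU
#   ps aux | top_by_column 4 5    # top 5 by %MEM
```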

2. Display the top 10 memory consuming process (Column 4 - %MEM)
$ ps aux | head -1; ps aux | sort -k4 -nr |grep -v 'USER'| head
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND 
root      2204  0.1  4.3 147500 43872 tty1     Ss+  Aug22   0:46 /usr/bin/Xorg :0 -nr -verbose -audit 4 -auth /var/run/gdm/auth-for-gdm-wEmBs1/database -nolisten tcp vt1
hduser    2671  0.8  4.1 960428 42436 pts/1    Sl+  Aug22   5:32 mongod
atul      2458  0.0  2.3 943204 23624 ?        S    Aug22   0:16 nautilus
atul      2516  0.0  2.2 275280 22316 ?        Ss   Aug22   0:06 gnome-screensaver
atul      2478  0.2  2.1 448120 21876 ?        Sl   Aug22   1:52 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr --blockFd 3
atul      2507  0.0  1.6 321388 16680 ?        S    Aug22   0:01 python /usr/share/system-config-printer/applet.py
atul      2589  0.0  1.4 292556 14600 ?        Sl   Aug22   0:14 gnome-terminal
atul      2536  0.0  1.3 395832 13372 ?        S    Aug22   0:00 /usr/bin/gnote --panel-applet --oaf-activate-iid=OAFIID:GnoteApplet_Factory --oaf-ior-fd=22
atul      2502  0.0  1.3 474620 13952 ?        Sl   Aug22   0:01 gpk-update-icon
atul      2537  0.0  1.2 459964 12736 ?        S    Aug22   0:10 /usr/libexec/clock-applet --oaf-activate-iid=OAFIID:GNOME_ClockApplet_Factory --oaf-ior-fd=34
 




3. Display the process by time (Column 4 - TIME)
$ ps vx | head -1; ps vx | sort -k4 -r| grep -v 'PID' | head
PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 2478 ?        Sl     1:52    351   593 447526 21876  2.1 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr --blockFd 3
 2458 ?        S      0:16    228  1763 941440 23624  2.3 nautilus
 2589 ?        Sl     0:15     28   296 292259 14600  1.4 gnome-terminal
 2421 ?        Ssl    0:14     22    34 500541 9676  0.9 /usr/libexec/gnome-settings-daemon
 2479 ?        S      0:13     23   403 310472 11996  1.1 nm-applet --sm-disable
 2537 ?        S      0:10     37   168 459795 12736  1.2 /usr/libexec/clock-applet --oaf-activate-iid=OAFIID:GNOME_ClockApplet_Factory --oaf-ior-fd=34
 2444 ?        Ssl    0:10     25    64 445791 4872  0.4 /usr/bin/pulseaudio --start --log-target=syslog
 2445 ?        S      0:07     51   593 322206 12684  1.2 gnome-panel
 2516 ?        Ss     0:06      4   151 275128 22316  2.2 gnome-screensaver
 2522 ?        Sl     0:05      5    41 231870 1960  0.1 /usr/libexec/gvfs-afc-volume-monitor

 


4. Display the top 10 real memory usage process (Column 8 - RSS)
$ ps vx | head -1; ps vx | sort -k8 -nr| grep -v 'PID' | head 
PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
 2458 ?        S      0:16    228  1763 941440 23624  2.3 nautilus
 2516 ?        Ss     0:06      4   151 275128 22316  2.2 gnome-screensaver
 2478 ?        Sl     1:52    351   593 447526 21876  2.1 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr --blockFd 3
 2507 ?        S      0:01     73     2 321385 16680  1.6 python /usr/share/system-config-printer/applet.py
 2589 ?        Sl     0:15     28   296 292259 14600  1.4 gnome-terminal
 2502 ?        Sl     0:01     29   257 474362 13952  1.3 gpk-update-icon
 2536 ?        S      0:00     92  1607 394224 13372  1.3 /usr/bin/gnote --panel-applet --oaf-activate-iid=OAFIID:GnoteApplet_Factory --oaf-ior-fd=22
 2537 ?        S      0:10     37   168 459795 12744  1.2 /usr/libexec/clock-applet --oaf-activate-iid=OAFIID:GNOME_ClockApplet_Factory --oaf-ior-fd=34
 2445 ?        S      0:07     51   593 322206 12684  1.2 gnome-panel
 2438 ?        Sl     0:03     30   542 433105 12512  1.2 metacity


As in the examples above, you can build many one-liners of your own. Before using any of these commands, check how ps and sort behave on your system.
Most platforms have their own variants of the ps and sort arguments, but the basics are the same. To sort any command's output by a particular column, first understand the output and its columns, then use the sort command.

I hope you find this helpful. Keep learning!!






Tuesday, 19 January 2016

Python Points #7 - Loops

Sunday, 17 January 2016

Count of Jobs - A Quick DataStage Recipe



What to Cook:
How to count the number of jobs in a DS project

Ingredients:
The dsjob "-ljobs" command



How to Cook:
Go to the "Projects" tab of DS Administrator
Click the Command button
Enter the following command to execute:
SH -c "dsjob -ljobs <Project Name> | wc -l"

<Project Name> - Enter your project name
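
The same count can be taken from the server's command line. This is a minimal sketch, assuming dsenv has been sourced and dsjob is on your PATH; the function name count_jobs is made up for illustration. It counts only non-empty lines, in case the listing contains blank ones.

```shell
# count_jobs PROJECT (hypothetical helper): print the number of jobs in PROJECT
count_jobs() {
    # "grep -c ." counts the lines that contain at least one character
    dsjob -ljobs "$1" | grep -c .
}

# Usage:
#   count_jobs <Project Name>
```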







Tuesday, 12 January 2016

Python Points #6 - Strings

Monday, 4 January 2016

Monitoring Memory by DataStage Processes #2



You can find other parts here -> Monitoring Memory by DataStage Processes #1



Continuously capturing memory usage of all osh processes -


$ while :; do date; ps -e -o pid,ppid,user,vsz,time,args | grep -v grep | grep osh; sleep 3; done

To stop, press Ctrl-C.

osh processes are created by DataStage parallel jobs. This command monitors all osh processes over a period of time. In this example, grep filters for processes containing the string "osh", but you can filter by something else, such as user ID or PPID. The loop repeats every three seconds; change the value after the sleep command to adjust the interval. The command keeps running until you press Ctrl-C.



Monitoring only new processes:

Sometimes it is difficult to find a common string to filter the processes you want to monitor. In those cases, and assuming that you can reproduce the problem or condition you want to analyze, you can use this script to keep track of all new processes.

This script monitors the processes created after 'ps_new.sh' itself was started. With it, we can specifically monitor the processes of a DataStage job that is launched after the script starts.

=======================================================================
=======================================================================

How to Use this script?

1. Run the script ps_new.sh
     ./ps_new.sh
2. Start the datastage job or reproduce the issue
3. Press Ctrl-C to stop the script ps_new.sh
4. Analyse the output file generated by script







Monday, 28 December 2015

Python Points #5 - Lists

Thursday, 24 December 2015

Python Points #4 - Conditions

Tuesday, 15 December 2015

How to use Universe Shell (uvsh) in DataStage?


In DataStage administration, we sometimes have to use the DataStage command line (the UniVerse shell) to get information directly from the DataStage UniVerse database.

When accessing it from the command line, this is what a novice admin typically runs into -

$ uvsh
This directory is not set up for DataStage.
Would you like to set it up(Y/N)?   
Confused? What to do?

Always answer "no" to that question; it means you are in the wrong place.
Always launch "uvsh" or "dssh" from one of two places: $DSHOME or a project directory. In a project directory you are good to go; from $DSHOME you need to LOGTO your project before you issue any SQL.



How to use UVSH?

## Go to $DSHOME
$ cd $DSHOME

## Source the dsenv file
$ . ./dsenv

## Run the uvsh command
$ $DSHOME/bin/uvsh

## At the uvsh prompt, switch to the project
>LOGTO <project_name>

Many DataStage admins prefer to run commands from the DataStage Administrator client, or to use dssh instead of uvsh.

How to use DSSH?
## Go to $DSHOME and source the dsenv file
$ cd $DSHOME
$ . $DSHOME/dsenv

## Run the dssh command
$ $DSHOME/bin/dssh

## At the dssh prompt, switch to the project
>LOGTO <project_name>


