Showing posts with label Monitor. Show all posts

Saturday, 1 April 2017

Get Over Running DataStage Job Details


The script below lists the jobs that are taking longer to complete than their last run. With a few tweaks, we can use this script to monitor any kind of process.
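As a minimal sketch of the idea (this is not the original script), the check boils down to comparing how long a process has been running with how long its last run took. The helper names is_overrunning and elapsed_seconds are hypothetical, and the sketch assumes a ps that supports the "etimes" keyword (procps-ng); where the last run time comes from (job log, control table, etc.) is left open.

```bash
#!/bin/sh
# Hypothetical helper: succeeds when the current elapsed time (seconds)
# is greater than the duration of the previous run (seconds).
is_overrunning() {
    elapsed=$1
    last_run=$2
    [ "$elapsed" -gt "$last_run" ]
}

# Hypothetical helper: elapsed seconds of a running PID.
# Assumes the "etimes" output keyword is available in this ps.
elapsed_seconds() {
    ps -p "$1" -o etimes= | tr -d ' '
}

# Example: flag the current shell if it has run longer than 3600 seconds.
# 3600 is a placeholder for the last run time of the job being checked.
if is_overrunning "$(elapsed_seconds $$)" 3600; then
    echo "PID $$ is running longer than its last run"
fi
```

The same comparison works for any process once we have its PID and a reference duration.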






Like the pages below to get updates
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Wednesday, 12 October 2016

Script to Auto Compress the System Log Files


This script was originally written by "Andy Welter" to compress Linux system log files. I have modified it to work better. You can find the modified version below -

Script Usage:
logroll [-compress|-nocompress]

$ logroll -compress
# compress the log files and move to archive directory

$ logroll -nocompress
# move the log files to archive directory without compressing them
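
To make the two modes concrete, here is a rough sketch of what each one does to a single log file. It is not the logroll script itself; the function name roll_log, its arguments, and the timestamp suffix are assumptions for illustration.

```bash
#!/bin/sh
# Hypothetical sketch of the two modes; not the original logroll script.
# Usage: roll_log <logfile> <archive_dir> <-compress|-nocompress>
roll_log() {
    logfile=$1
    archive_dir=$2
    mode=${3:--compress}            # default mode is -compress

    stamp=$(date +%Y%m%d%H%M%S)
    target="$archive_dir/$(basename "$logfile").$stamp"

    mkdir -p "$archive_dir" || return 1
    mv "$logfile" "$target" || return 1

    if [ "$mode" = "-compress" ]; then
        gzip "$target"              # leaves $target.gz in the archive directory
    fi
}
```

A real rotation script also has to recreate the log file or tell the writing daemon to reopen it; that part is left out of this sketch.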


=================================================================== ===================================================================





Monday, 4 January 2016

Monitoring Memory by DataStage Processes #2



You can find other parts here -> Monitoring Memory by DataStage Processes #1



Continuously capturing memory usage of all osh processes -


$ while :; do date; ps -e -o pid,ppid,user,vsz,time,args | grep -v grep | grep osh; sleep 3; done

To Stop, Use Ctrl-C

osh processes are the processes created by DataStage parallel jobs. This command monitors all osh processes over a period of time. In this example, the grep command filters processes containing the string “osh”, but you can filter by something else, such as user ID or ppid. The loop repeats every 3 seconds; change the value after the sleep command to adjust the interval. The command keeps running until you press Ctrl-C.
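
If we want a fixed number of samples instead of an endless loop, the same command can be wrapped in a small function. This is only a sketch: the function name sample_procs and its three arguments are assumptions, not part of any DataStage tooling.

```bash
#!/bin/sh
# Hypothetical helper: sample_procs <pattern> <count> <interval_seconds>
# Prints a timestamp plus the matching ps lines, <count> times.
sample_procs() {
    pattern=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        date
        # Capture the ps output first, then filter it, so the filter
        # process itself is never part of the listing.
        snapshot=$(ps -e -o pid,ppid,user,vsz,time,args)
        # Keep the header (NR == 1) and every line matching the pattern.
        printf '%s\n' "$snapshot" | awk -v pat="$pattern" 'NR == 1 || $0 ~ pat'
        sleep "$interval"
        i=$((i + 1))
    done
}

# Example: 20 samples of osh processes, 3 seconds apart, saved to a file.
# sample_procs osh 20 3 >> osh_memory.log
```

Writing the samples to a file makes it easier to compare the vsz values of one pid over time.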



Monitoring only new processes:

Sometimes it is difficult to find a common string to filter the processes you want to monitor. In those cases, and assuming that you can reproduce the problem or condition you want to analyze, you can use this script to keep track of all new processes.

The script 'ps_new.sh' monitors only the processes that start after the script itself is launched. This lets us track specifically the processes of a DataStage job that is executed after the script has started.

=======================================================================
=======================================================================

How to Use this script?

1. Run the script ps_new.sh
     ./ps_new.sh
2. Start the DataStage job or reproduce the issue
3. Press Ctrl-C to stop the script ps_new.sh
4. Analyse the output file generated by the script
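
The steps above rely on one idea: take a snapshot of all PIDs at the start, then keep reporting only the PIDs that are not in that snapshot. The sketch below shows that idea; it is a reconstruction under assumptions, not the original ps_new.sh, and the function names and file paths are hypothetical.

```bash
#!/bin/sh
# Hypothetical reconstruction of the idea; not the original ps_new.sh.

# list_pids: print every PID currently on the system, sorted, one per line.
list_pids() {
    ps -e -o pid= | tr -d ' ' | sort -u
}

# new_pids <baseline_file>: print PIDs that exist now but are not in the
# baseline snapshot (the baseline must have been written by list_pids).
new_pids() {
    # comm -13 keeps only the lines unique to the second input (stdin).
    list_pids | comm -13 "$1" -
}

# monitor_new <baseline_file> <interval_seconds>: loop until Ctrl-C and
# print details of the processes that appeared after the baseline.
monitor_new() {
    while :; do
        date
        pids=$(new_pids "$1" | paste -sd, -)
        if [ -n "$pids" ]; then
            # Short-lived helpers of this script may be in the list;
            # ps simply skips PIDs that no longer exist.
            ps -o pid,ppid,user,vsz,etime,args -p "$pids"
        fi
        sleep "$2"
    done
}

# Usage:
#   list_pids > /tmp/ps_baseline.txt
#   monitor_new /tmp/ps_baseline.txt 3 > ps_new.out
```

Because the baseline is taken once, every process started later (including the job's osh processes) shows up in each sample until it exits.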







Wednesday, 9 December 2015

ps command #3 - Sorting



To sort the ps command output, we can use the ps --sort option (this is not the Linux sort command). More details can be found in the man page of the ps command.

--sort spec     specify sorting order. Sorting syntax is [+|-]key[,[+|-]key[,...]] Choose a multi-letter key from the 
                STANDARD FORMAT SPECIFIERS section. The "+" is optional since default direction is increasing numerical
                or lexicographic order. Identical to k. 
                For example: ps jax --sort=uid,-ppid,+pid


ps command output - sorted by memory used (high to low)

$ ps aux --sort -rss

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
atul     43584  0.0 16.0 633196 162468 ?       Sl   Dec05   1:01 evince /home/atul/Desktop/Learning/book.pdf
atul     17099  0.3 15.7 1244044 159208 ?      Sl   Dec04  10:00 /usr/lib64/firefox/firefox
root      2272  0.2  8.5 223428 86132 tty1     Ss+  Dec03  12:36 /usr/bin/Xorg :0 -br -verbose -audit 4 -auth /var/run/gdm/auth-for-gdm-kda2x4/database -nolisten tcp vt1
atul      2773  0.0  5.3 1199004 53952 ?       Sl   Dec03   1:18 nautilus
atul      2827  0.0  3.5 296192 36036 ?        Ss   Dec03   0:56 gnome-screensaver
atul     43834  0.0  1.2 990904 12892 ?        Sl   Dec05   1:45 /home/atul/Desktop/sublime_text
atul      2799  0.1  1.1 371080 11216 ?        S    Dec03   8:39 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr --blockFd 3
atul     22246  0.0  0.9 300112 10072 ?        Sl   Dec06   0:12 gnome-terminal
atul      2767  0.0  0.7 502416  7464 ?        Sl   Dec03   0:35 gnome-panel
atul     22937  0.0  0.7 305276  7364 ?        S    Dec06   0:00 gedit
atul      2811  0.0  0.6 324292  6332 ?        S    Dec03   0:00 python /usr/share/system-config-printer/applet.py
root     44117  0.0  0.6  50068  6132 ?        Ss   Dec05   0:02 /usr/sbin/restorecond -u
atul      2852  0.0  0.5 548844  5476 ?        S    Dec03   0:13 /usr/libexec/clock-applet --oaf-activate-iid=OAFIID:GNOME_ClockApplet_Factory --oaf-ior-fd=28
atul      2788  0.0  0.4 331480  5032 ?        S    Dec03   0:48 /usr/libexec/wnck-applet --oaf-activate-iid=OAFIID:GNOME_Wncklet_Factory --oaf-ior-fd=18
atul      2760  0.0  0.4 447048  4900 ?        Sl   Dec03   0:26 metacity
atul      2783  0.0  0.4 469076  4464 ?        Sl   Dec03   0:03 gpk-update-icon
atul      2817  0.0  0.3 262056  3608 ?        S    Dec03   0:01 bluetooth-applet



If you want the list from low to high, remove the '-' before the sort key

$ ps aux --sort rss

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         2  0.0  0.0      0     0 ?        S    Dec03   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    Dec03   0:00 [migration/0]
root         4  0.0  0.0      0     0 ?        S    Dec03   0:05 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S    Dec03   0:00 [stopper/0]
root         6  0.0  0.0      0     0 ?        S    Dec03   0:02 [watchdog/0]
root         7  0.0  0.0      0     0 ?        S    Dec03   4:15 [events/0]
root         8  0.0  0.0      0     0 ?        S    Dec03   0:00 [events/0]
root         9  0.0  0.0      0     0 ?        S    Dec03   0:00 [events_long/0]
root        10  0.0  0.0      0     0 ?        S    Dec03   0:00 [events_power_ef]
root        11  0.0  0.0      0     0 ?        S    Dec03   0:00 [cgroup]
root        12  0.0  0.0      0     0 ?        S    Dec03   0:00 [khelper]
root        13  0.0  0.0      0     0 ?        S    Dec03   0:00 [netns]
root        14  0.0  0.0      0     0 ?        S    Dec03   0:00 [async/mgr]
root        15  0.0  0.0      0     0 ?        S    Dec03   0:00 [pm]
root        16  0.0  0.0      0     0 ?        S    Dec03   0:03 [sync_supers]
root        17  0.0  0.0      0     0 ?        S    Dec03   0:02 [bdi-default]


Sort ps output by pid -


$ ps aux --sort pid       # pid from low to high
$ ps aux --sort -pid      # pid from high to low
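
The full listing is usually more than we need. As a small sketch, --sort can be combined with a user-defined format to show only the top memory consumers. The function name top_mem is hypothetical, and the --sort option assumes the procps version of ps found on Linux.

```bash
#!/bin/sh
# Hypothetical helper: top_mem <n>
# Prints the header plus the n processes with the largest RSS.
top_mem() {
    n=${1:-5}
    # Sort by rss, high to low; awk keeps the header and the first n rows.
    ps -eo pid,user,rss,vsz,comm --sort=-rss |
        awk -v n="$n" 'NR <= n + 1'
}

top_mem 5
```

Replace -rss with -vsz or -pcpu to rank by virtual memory or CPU usage instead.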


GNU sort specifiers - 


STANDARD FORMAT SPECIFIERS

Here are the different keywords that may be used to control the output format (e.g. with option -o) or to sort the
selected processes with the GNU-style --sort option.

For example:  ps -eo pid,user,args --sort user

This version of ps tries to recognize most of the keywords used in other implementations of ps.

The following user-defined format specifiers may contain spaces: args, cmd, comm, command, fname, ucmd, ucomm, lstart,
bsdstart, start.

Some keywords may not be available for sorting.

CODE       HEADER   DESCRIPTION

%cpu       %CPU     cpu utilization of the process in "##.#" format. Currently, it is the CPU time used divided by the
                    time the process has been running (cputime/realtime ratio), expressed as a percentage. It will not
                    add up to 100% unless you are lucky. (alias pcpu).

%mem       %MEM     ratio of the process’s resident set size  to the physical memory on the machine, expressed as a
                    percentage. (alias pmem).

bsdstart   START    time the command started. If the process was started less than 24 hours ago, the output format is
                    " HH:MM", else it is "mmm dd" (where mmm is the three letters of the month).

bsdtime    TIME     accumulated cpu time, user + system. The display format is usually "MMM:SS", but can be shifted to
                    the right if the process used more than 999 minutes of cpu time.

c          C        processor utilization. Currently, this is the integer value of the percent usage over the lifetime
                    of the process. (see %cpu).

comm       COMMAND  command name (only the executable name). Modifications to the command name will not be shown.
                    A process marked <defunct> is partly dead, waiting to be fully destroyed by its parent. The output
                    in this column may contain spaces. (alias ucmd, ucomm). See also the args format keyword, the -f
                    option, and the c option.
                    When specified last, this column will extend to the edge of the display. If ps can not determine
                    display width, as when output is redirected (piped) into a file or another command, the output
                    width is undefined. (it may be 80, unlimited, determined by the TERM variable, and so on) The
                    COLUMNS environment variable or --cols option may be used to exactly determine the width in this
                    case. The w or -w option may be also be used to adjust width.

command    COMMAND  see args. (alias args, cmd).

cp         CP       per-mill (tenths of a percent) CPU usage. (see %cpu).

cputime    TIME     cumulative CPU time, "[dd-]hh:mm:ss" format. (alias time).

egroup     EGROUP   effective group ID of the process. This will be the textual group ID, if it can be obtained and
                    the field width permits, or a decimal representation otherwise. (alias group).

etime      ELAPSED  elapsed time since the process was started, in the form [[dd-]hh:]mm:ss.

euid       EUID     effective user ID. (alias uid).

euser      EUSER    effective user name. This will be the textual user ID, if it can be obtained and the field width
                    permits, or a decimal representation otherwise. The n option can be used to force the decimal
                    representation. (alias uname, user).

gid        GID      see egid. (alias egid).

lstart     STARTED  time the command started.

ni         NI       nice value. This ranges from 19 (nicest) to -20 (not nice to others), see nice(1). (alias nice).

pcpu       %CPU     see %cpu. (alias %cpu).

pgid       PGID     process group ID or, equivalently, the process ID of the process group leader. (alias pgrp).

pid        PID      process ID number of the process.

pmem       %MEM     see %mem. (alias %mem).

ppid       PPID     parent process ID.

rss        RSS      resident set size, the non-swapped physical memory that a task has used (in kiloBytes).
                    (alias rssize, rsz).

ruid       RUID     real user ID.

size       SZ       approximate amount of swap space that would be required if the process were to dirty all writable
                    pages and then be swapped out. This number is very rough!

start      STARTED  time the command started. If the process was started less than 24 hours ago, the output format is
                    "HH:MM:SS", else it is "  mmm dd" (where mmm is a three-letter month name).

sz         SZ       size in physical pages of the core image of the process. This includes text, data, and stack
                    space. Device mappings are currently excluded; this is subject to change. See vsz and rss.

time       TIME     cumulative CPU time, "[dd-]hh:mm:ss" format. (alias cputime).

tname      TTY      controlling tty (terminal). (alias tt, tty).

vsz        VSZ      virtual memory size of the process in KiB (1024-byte units). Device mappings are currently
                    excluded; this is subject to change. (alias vsize).








Tuesday, 8 December 2015

ps command #2 - Advance


ps command #1

Usually, when we monitor processes, we are looking for something that can impact server performance, or for one specific process. To do so, we grep the ps output -

This is how we can list all http processes -
$ ps aux | grep http
atul      7585  0.0  0.0 177676   592 ?        S    Dec06   0:00 /usr/libexec/gvfsd-http --spawner :1.7 /org/gtk/gvfs/exec_spaw/2
root     28848  0.0  0.0   2700   168 pts/0    D+   02:49   0:00 grep http

You can filter the ps command output by any keyword, as above.

There are some ps options which can give you a customized output -

To see every process on the system using standard syntax:
$ ps -e
$ ps -ef
$ ps -eF
$ ps -ely


To see every process on the system using BSD syntax:
$ ps ax
$ ps axu


To print a process tree:
$ ps -ejH
$ ps axjf


To get info about threads:
$ ps -eLf
$ ps axms


To get security info:
$ ps -eo euser,ruser,suser,fuser,f,comm,label
$ ps axZ
$ ps -eM

To see every process running as root (real & effective ID) in user format:
$ ps -U root -u root u


To see every process with a user-defined format:
$ ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
$ ps axo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
$ ps -eo pid,tt,user,fname,tmout,f,wchan

Print only the process IDs of process syslogd:
$ ps -C syslogd -o pid=
 #ps -C <process_name> -o pid=


Print only the name of PID 42:
$ ps -p 42 -o comm=
  #ps -p <process_id> -o comm=
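
These two lookups are handy inside scripts. As a sketch, they can be wrapped in small helpers; the names pids_of and name_of are hypothetical, and the -C option assumes the procps version of ps.

```bash
#!/bin/sh
# Hypothetical helper: PIDs of all processes with this exact command name.
pids_of() {
    ps -C "$1" -o pid=
}

# Hypothetical helper: command name of a single PID.
name_of() {
    ps -p "$1" -o comm=
}

# Example: print the name of the current shell process.
echo "This shell runs as: $(name_of $$)"
```

The trailing "=" after pid and comm suppresses the header line, so the output can be used directly in a variable or a loop.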







Monday, 30 November 2015

Monitoring Memory by DataStage Processes #1



Before we start monitoring memory, we need to be clear about why and when we have to monitor memory on the server.

Why & When?

  • Troubleshoot and identify potential resource bottlenecks
  • Detect memory leaks
  • Check resource usage for better capacity planning
  • More Memory, Better Performance



To monitor DataStage Memory Usage, we have to work on these 3 points -

1. Monitor memory leaks
               Analyzing memory usage is useful in several scenarios, and identifying memory leaks is one of the most common. A memory leak is a type of bug that causes a program to keep increasing its memory usage indefinitely.

2. Tune job design
               Comparing the amount of memory different job designs consume can help you tune your designs to be more memory efficient.

3. Tune job scheduling
               The last scenario is to tune job scheduling. Collecting memory usage by processes over a period of time can help you organize job scheduling to prevent peaks of memory consumption.


Monitoring Memory Usage with ps Command -

- Simple command available in all UNIX/Linux platforms
- Basic syntax to monitor memory usage

ps -e -o pid,ppid,user,vsz,etime,args

Where  -
pid - process id
ppid - parent's process id
user - user that owns process
vsz - amount of virtual memory
etime - elapsed time process has been running
args - command line that started process
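
As a small sketch of how this output can be used, the vsz column can be summed for all processes matching a keyword, which gives a rough total of virtual memory for one group of processes. The function name total_vsz_kb is hypothetical; vsz is reported in KiB by the Linux ps.

```bash
#!/bin/sh
# Hypothetical helper: total_vsz_kb <pattern>
# Sums the vsz column (4th field) of every ps line matching <pattern>.
total_vsz_kb() {
    # Capture the listing first so the awk filter is not part of it.
    snapshot=$(ps -e -o pid,ppid,user,vsz,etime,args)
    printf '%s\n' "$snapshot" |
        awk -v pat="$1" 'NR > 1 && $0 ~ pat { sum += $4 } END { printf "%.0f\n", sum }'
}

# Example: total virtual memory of all osh processes, in KiB.
# total_vsz_kb osh
```

Note that the pattern is matched against the whole line, so a keyword that also appears in the user column will match those processes too.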


Other ps monitoring -- Check Memory Utilization by Datastage processes

More will be in next post.  Monitoring Memory by DataStage Processes #2




Thursday, 15 October 2015

Behavior of Multi-Instance job in Director Client



Multi-Instance Job:
                DataStage supports Multi-Instance jobs, which can run at the same time under different invocation ids.
      Today, we will discuss the behavior of a multi-instance DataStage job in Director.


Running Jobs:
                When we run a multi-instance job, Director asks for an Invocation Id. While the job is running, Director displays a new job in the format <JOB_Name>.<Invok_Id>. Nothing changes in the original job; it stays in compiled status. So, if we invoke the job 3 times with 3 invocation ids, 3 jobs appear in Director -

Jobname.InvkId1
Jobname.InvkId2
Jobname.InvkId3


Monitoring Jobs: 
                 We can monitor each invoked job, since each one is visible in Director under its invocation id. However, the tool writes the job log for all invocation ids to the same RT_LOGnn file. So we see n instances and their logs in Director, but in the backend there is a single file. We can monitor, stop, and check the log of each instance individually.


Deleting Jobs:
                If we delete a job instance from Director, it is removed and the other instances remain. But the job log for the deleted instance is still in the RT_LOGnn file (we can access it from the DataStage command line, but not in DataStage Director, as the job instance has been deleted).


Purging Job logs:
               If we purge the job log in DataStage, it deletes the job instances as well as their job logs from the RT_LOGnn file. So the difference is that the Director delete action only deletes records from RT_STATUS, whereas the purging mechanism deletes records from RT_LOG.
                  




