
Saturday, 23 June 2018

yet another In Memory DataBase - MemSQL


Writing this post after quite a few days. Yet another IN-MEMORY database is on the market, and its tagline promises "The Database for Real-Time Applications".

As per the MemSQL site - MemSQL is a scalable SQL database that ingests data continuously to perform operational analytics for the front lines of your business. Ingest millions of events per day with ACID transactions while simultaneously analyzing billions of rows of data in relational SQL, JSON, or Geospatial formats.

In my current assignment, I've been asked to look into the capabilities of this database, so I am starting with the very first step, "Installation" - 

Installation on Linux is quite simple if you are comfortable with Linux commands. You can follow the installation steps from HERE

1. Download the software (with sudo or root user) - 

sudo su - root
wget http://download.memsql.com/memsql-ops-6.0.11/memsql-ops-6.0.11.tar.gz

2. Extract the tar ball

tar zxvf memsql-ops-6.0.11.tar.gz

This command will extract lots and lots of files :-)

3. Run the installer script

cd memsql-ops-6.0.11

sudo ./install.sh --simple-cluster



By default, MemSQL requires a machine with at least 4 CPU cores and 8 GB of RAM (which is a little unfair ;-)), so remove this constraint with the argument below -

cd memsql-ops-6.0.11

sudo ./install.sh --simple-cluster --ignore-min-requirements
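
Before falling back to --ignore-min-requirements, you can quickly confirm how many CPU cores and how much RAM your machine actually has (a small check using standard Linux commands; the exact output format varies by distribution):

nproc      # number of CPU cores
free -g    # total and available memory in GB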


After MemSQL is installed successfully, the installer starts setting up the MemSQL WebUI.



You can access the MemSQL WebUI on the server's port 9000 by default. 

https://<SERVER_IP>:9000
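
If the page does not come up, a quick way to confirm that something is listening on port 9000 is a simple curl check (assuming curl is installed; -k skips certificate validation in case a self-signed certificate is used):

curl -Ik https://<SERVER_IP>:9000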

4. To connect to MemSQL command line, execute - 

memsql
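
Since MemSQL is MySQL wire-protocol compatible, you can also connect with a standard mysql client and run a quick sanity check (a sketch only; the root user with an empty password on port 3306 is the usual default, but your setup may differ):

mysql -u root -h 127.0.0.1 -P 3306 -e "SHOW DATABASES;"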


In the next post, I will explain how this database is different from other in-memory databases. Till then, keep learning, keep loving.






Monday, 4 January 2016

Monitoring Memory by DataStage Processes #2



You can find other parts here -> Monitoring Memory by DataStage Processes #1



Continuously capturing memory usage of all osh processes -


$ while :; do date; ps -e -o pid,ppid,user,vsz,time,args | grep -v grep | grep osh; sleep 3; done

To Stop, Use Ctrl-C

osh processes are the processes created by DataStage parallel jobs. This command monitors all osh processes over a period of time. In this example, grep filters processes containing the string "osh", but you can change this to filter by something else, such as a user ID or a ppid. The loop repeats every 3 seconds; you can change the interval by adjusting the value after the sleep command. The command keeps running until you press Ctrl-C.
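
If you want to keep these samples for later analysis, the same loop can append its output to a file (a simple variation; the log file path /tmp/osh_mem.log is just an example):

$ while :; do date; ps -e -o pid,ppid,user,vsz,time,args | grep -v grep | grep osh; sleep 3; done >> /tmp/osh_mem.log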



Monitoring only new processes:

Sometimes it is difficult to find a common string to filter the processes you want to monitor. In those cases, and assuming that you can reproduce the problem or condition you want to analyze, you can use this script to keep track of all new processes.

The script 'ps_new.sh' helps us monitor only the processes created after it is started. With it, we can specifically monitor the processes of a DataStage job that is executed after the script has been started.

=======================================================================
=======================================================================
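
As a reference, here is a minimal sketch of how such a ps_new.sh could work (hypothetical; the output file name, temp file locations, and 1-second interval are assumptions):

#!/bin/bash
# ps_new.sh (sketch) - record a baseline of existing PIDs, then keep logging any
# process started after the script, until it is stopped with Ctrl-C.
OUT=ps_new_$(date +%Y%m%d_%H%M%S).out       # output file to analyse afterwards
ps -e -o pid= | sort > /tmp/ps_baseline.$$  # PIDs that already existed
trap 'rm -f /tmp/ps_baseline.$$ /tmp/ps_now.$$; exit' INT TERM
while :; do
    ps -e -o pid= | sort > /tmp/ps_now.$$
    # PIDs present now but not in the baseline are the new processes
    comm -13 /tmp/ps_baseline.$$ /tmp/ps_now.$$ | while read pid; do
        info=$(ps -p "$pid" -o pid=,ppid=,user=,vsz=,etime=,args= 2>/dev/null)
        [ -n "$info" ] && echo "$(date) $info"
    done >> "$OUT"
    sleep 1
done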

How to Use this script?

1. Run the script ps_new.sh
     ./ps_new.sh
2. Start the datastage job or reproduce the issue
3. Press Ctrl-C to stop the script ps_new.sh
4. Analyse the output file generated by the script







Monday, 30 November 2015

Monitoring Memory by DataStage Processes #1



Before going into memory monitoring, we need to be clear about why and when we have to monitor memory on the server.

Why & When?

  • Troubleshooting and identifying potential resource bottlenecks
  • Detecting memory leaks
  • Checking resource usage for better capacity planning
  • More memory, better performance



To monitor DataStage Memory Usage, we have to work on these 3 points -

1. Monitor memory leaks
               Analyzing memory usage can be useful in several scenarios. Some of the most common scenarios include identifying memory leaks. A memory leak is a type of bug that causes a program to keep increasing its memory usage indefinitely.

2. Tune job design
               Comparing the amount of memory different job designs consume can help you tune your designs to be more memory efficient.

3. Tune job scheduling
               The last scenario is to tune job scheduling. Collecting memory usage by processes over a period of time can help you organize job scheduling to prevent peaks of memory consumption.


Monitoring Memory Usage with ps Command -

- Simple command available in all UNIX/Linux platforms
- Basic syntax to monitor memory usage

ps -e -o pid,ppid,user,vsz,etime,args

Where -
pid   - process id
ppid  - parent process id
user  - user that owns the process
vsz   - amount of virtual memory
etime - elapsed time the process has been running
args  - command line that started the process
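
For example, to see which processes are using the most virtual memory, you can sort this output by the vsz column (a small sketch; with this field list vsz is the 4th column, hence sort -k4):

ps -e -o pid,ppid,user,vsz,etime,args | sort -rn -k4 | head -10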


Other ps monitoring -- Check Memory Utilization by Datastage processes

More will come in the next post - Monitoring Memory by DataStage Processes #2




Tuesday, 10 November 2015

Check Memory Utilization by Datastage processes



When we run lots of DataStage jobs on a Linux DataStage server, or when different environments share the same server, it can cause a resource crunch on the server side, which affects job performance.

It is always preferable to keep an eye on resource utilization while jobs are running. Typically, DataStage admins set up a cron job with a resource monitoring script that runs every five minutes (or more), checks the resource statistics on the server, and notifies them accordingly.

The following processes are started on the DataStage Engine server:

dsapi_slave - server side process for DataStage clients like Designer

osh - Parallel Engine process
DSD.StageRun - Server Engine Stage process
DSD.RUN - DataStage supervisor process used to initialize Parallel Engine and Server Engine jobs. There is one DSD.RUN process for each active job

ps auxw | head -1; ps auxw | grep dsapi_slave | sort -rn -k5 | head -10
USER   PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
atul 38846  0.0  0.0 103308   856 pts/0    S+   07:20   0:00 grep dsapi_slave


The example shown lists the top 10 dsapi_slave processes from a memory utilization perspective. We can substitute or add an appropriate argument for grep, such as osh, DSD.RUN, or even the user name that was used to invoke a DataStage task, to get a list that matches our criteria, as shown below.
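
For example, the same one-liner adapted for the parallel engine processes (substituting osh for dsapi_slave; grep -v grep is added to exclude the grep command itself, and VSZ is still the 5th column in the ps auxw output):

ps auxw | head -1; ps auxw | grep osh | grep -v grep | sort -rn -k5 | head -10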


