Monday, 31 August 2015

Linux Shell Scripting Best Practices

What is a script?
A script is a set of commands, either in a file, or typed in at the command line, that perform multiple operations on a file or files.

How do they work?
To run a file as a script, the file must have the execution bit set

Best Practices

¨     Writing clean code, using indents and using comments to explain parts of your code.
¨     Defining variables properly and avoid using unnecessary ones.
¨   Proper usage of functions. This would help edit, replace parts of the scripts easily without   proper functioning of the base script.
¨     Using log functions as when needed
¨     Hard-coding hostnames, files directory paths in a script should be avoided.
¨   The same form should be maintained throughout for the code. Inconsistent coding   styles make debugging a lot harder regardless of who's doing it.
¨     Proper documentation is very much needed.
¨  Proper testing needs to be done on Non-Production before implementing the scripts in a production environment.
¨     Generic scripts should have a common and understandable name.
¨    Understand the requirement and scope as where shell scripts can be implemented and where not.

Friday, 28 August 2015

Datastage Transformer String Functions - Index

Index(String, Substring, Occurrence):

It will returns starting character position of substring. We can use it as a 'grep' command of Linux when we have to check if given character or substring is existed in input string or not.

Ex:-
1. Suppose, we have to check whether substring 'le' is existing in input column 'Incol' or not?

Transformation derivation  -
If Index(Incol, 'le', 1) > 0 then 'Exist' Else 'Not Exists'

2. Let's assume we are getting 'India' as a input string, what will be the output of below derivations?

Transformation derivation  -
Index('India', 'a', 1)         =  5
Index('India', 'i', 1)         =   4   # this will not give 1 as matching is case sensitive
Index('India', 'nd', 1)      =   2

Thursday, 27 August 2015

DB2 Column Compare Tool

Whenever, we are working in Datawarehouse / Database (Oracle / MSSQL / DB2) projects, we often need to compare to tables column wise. When such requirement comes, we might end up in confusion putting the columns list in the excel sheet and comparing each column by column.

Here’s procedure you can create in your local DB and run it feeding it with Tables names as input, and in one run you get three outputs: Matched columns, Table1 unmatched columns, table2 unmatched columns. You might need to give schema name of Table if there are two tables in SYSCOLUMNS with the same name.

You can modify or re-write the same procedure for your respective DBs.

Introduction to vi Editor

• Full-screen editor
• Two modes of operation: command and text
• Utilizes one letter commands
• Does not format text
• Flexible search and replace facility with pattern matching
• Allows for user-defined editing functions using macros

Modes of Operation

The vi editor has two modes of operation. Command mode allows the user to perform
operations on the text, such as cut, paste, cursor movement, and replacement. These
operations are all invoked by one character “commands.” Text mode (also known as
insert mode) allows the user to enter text; all characters typed are inserted into the file.

There is no way to tell which mode the editor is in, other than pressing a key on the
keyboard. If the key you press is inserted into the file, the current mode is text mode. If
the key performs an action, the current mode is command mode.

Switching from command mode to text mode is done with one of the insert text
commands, which will be covered later. Returning to command mode is done by
pressing the <Esc> key.

$vi filename • If the file "filename" does not exist, it will be created • Otherwise, vi will open the existing file When a session is initiated, one of two things happens: • If the file to be edited exists, a copy of the file is put into a buffer in temporary by default. • If the file does not exist, an empty buffer is opened for this session. The tilde characters represent empty lines in the editor. The editor starts in command mode. Tuesday, 25 August 2015 Putty - Command Line Magic Continues...... A) Delete All PuTTY Sessions Together When you are swapping an old computer with a new computer, you may end-up transferring all PuTTY sessions to new computer. Once you’ve transferred all PuTTY sessions, execute “putty -cleanup” from the command line as shown below. Go to the path where Putty.exe in installed and execute below command C:>putty –cleanup B) Magic Command to BackUp Putty Sessions a) Open Run in Windows ( Window + R ) b) Put below command in Run regedit /e "%userprofile%\desktop\putty-registry.reg" HKEY_CURRENT_USER\Software\Simontatham c) This will create a reg file of Putty Session on your desktop. d) Putty Sessions are backed up. C) If you are fed up with putting Username again n again in multiple putty sessions. Try this a) Clicked on Any Saved Session and Load it b) Click Connection > Data in the Category navigation tree. Monday, 24 August 2015 Linux Shell Script Scenario Solution - 1 You can find the question HERE Solution : Sunday, 23 August 2015 Linux Shell Script Scenario - 2 - Write a shell script to list the all server Users with details whether they are valid user or not and of they are valid user what is the home directory for them. - Write a script that validates if there exist files that start with a pattern in a given directory. This should work like search as in Windows. - Write a script which will check for a given file in every 60 seconds in a given directory, if find, send us a mail else keep running. We do not want to set this script in crontab. For more scenario - CLICK HERE Saturday, 22 August 2015 Linux Shell Script Scenario - 1 Write a Shell Script to read a parameter file and run the other script with these parameters. Parameter File :  Script_Name = 'script.ksh' Arg1 = 13 Arg2 = 36  Now, Read this parameter file and kick off the command like below -  ./script.ksh 13 36 For more scenario - CLICK HERE VLookup in MS Excel We all agree on this, we have to use MS Excel in our day to day work whether it is technical or not. A day before I came across a Excel Function which is same as DS Lookup :-) called VLookUp. I loved this function cause this reduce my work of 30 min to few seconds. So just want to share this function here, by the way there are lot of material you can find on net for this but this is my version --- What was my task ? I have to pickup a hyperlink value from another sheet by matching a doc name. If Sheet1.Doc = Sheet2.Doc Then HyperLink Else '' My Excel sheets were look like -- The syntax of the VLOOKUP function is: =VLOOKUP(lookup value, table range, output column index , [true/false]) Lookup Value --- First Key column ( A1 ) Table Range --- Lookup Table Range ( Sheet2!$A$2:$B$15 ) Output column index --- output column number ( for my case 2 (col B)) true/false --- True means that an approximate match is acceptable, and False means that only an exact match is acceptable. Table Range can be defined like below -- Lookup_Sheet_Name ! First_Row_First_Col : Last_Row_Last_Col Suppose , your sheet name is - Sheet2 First_Row_First_Col - A2 -->$A$2 Last_Row_Last_Col - B15 -->$B$15 Table Range will be -- Sheet2!$A$2:$B$15 So, Our VLOOKUP function will be - =VLOOKUP(A1, Sheet2!$A$2:$B\$15, 2 , true)

Place this function to that cell where you want to populate the value from lookup table, this will work like -
If A1 = ( table range) then pick sheet 2 column no 2 Else ''

MongoDB & RDBMS Terminology

Term relationship between MongoDB ( NoSQL) and RDBMS -

 MongoDB Term RDBMS Term Database Database Table Collection Tuple or Row Document Column Field or Key Table Join Embeded Document Primary Key Primary Key ( default _id field) mongod ( DB process) oracle/db2 /mysqld mongo sqlplus/db2 client/mysql

Thursday, 20 August 2015

DataStage - A Journey from VMark to IBM

DataStage was conceived at VMark, a spin off from Prime Computers that developed two notable products: UniVerse database and the DataStage ETL tool. The first VMark ETL prototype was built by Lee Scheffler in the first half of 1996[1]. Peter Weyman was VMark VP of Strategy and identified the ETL market as an opportunity. He appointed Lee Scheffler as the architect and conceived the product brand name "Stage" to signify modularity and component-orientation[2]. This tag was used to name DataStage and subsequently used in related products QualityStage, ProfileStage, MetaStage and AuditStage. Lee Scheffler presented the DataStage product overview to the board of VMark in June 1996 and it was approved for development. The product was in alpha testing in October, beta testing in November and was generally available in January of 1997.

VMark acquired UniData in October of 1997 and renamed itself to Ardent Software[3(1)]. In 1999 Ardent Software was acquired by Informix[4(2)] the database software vendor. In April of 2001 IBM acquired Informix and took just the database business leaving the data integration tools to be spun off as an independent software company called Ascential Software[5(3)]. In March of 2005 IBM acquired Ascential Software[6(4)] and made DataStage part of the WebSphere family as WebSphere DataStage. In 2006 the product was released as part of the IBM Information Server under the Information Management family but was still known as WebSphere DataStage. In 2008 the suite was renamed to InfoSphere Information Server and the product was renamed to InfoSphere DataStage.

DataStage Editions
Enterprise Edition: a name give to the version of DataStage that had a parallel processing architecture and parallel ETL jobs.
Server Edition: the name of the original version of DataStage representing Server Jobs. Early DataStage versions only contained Server Jobs. DataStage 5 added Sequence Jobs and DataStage 6 added Parallel Jobs via Enterprise Edition.
MVS Edition: mainframe jobs, developed on a Windows or Unix/Linux platform and transferred to the mainframe as compiled mainframe jobs.
DataStage for PeopleSoft: a server edition with prebuilt PeopleSoft EPM jobs under an OEM arrangement with PeopeSoft and Oracle Corporation.
DataStage TX: for processing complex transactions and messages, formerly known as Mercator.
DataStage SOA: Real Time Integration pack can turn server or parallel jobs into SOA services.

Tuesday, 18 August 2015

Putty - Command Line Magic

We all are using ssh client PUTTY in day to day task and it is very irritating to login in different server again and again. Today, I come up with Putty command line by which we can make this so easy. Wanna to login, Just a click and Voila !!!

So let's start --

a.  First of all make a shortcut of your putty.exe file by right click --> Send to --> Desktop
b. ,This will display like below, just rename it with your server name or address to know which server is going to connect when we click on it.

c. Here, I have used my linux server  192,168,37.129
d. Now, Right click on Putty Shortcut  ---> Properties. This will display like below -

e. We have to edit the Target command  --

For me  :-
User Name - atul

edited command ---    -ssh user@server -pw password
For my case  ---    -ssh atul@192.168.37.129 -pw atul

Add this command to Target value , after whatever is existed there.  So Target's new value is ( in my case ) -   C:\_mine\putty.exe -ssh atul@192.168.37.129 -pw atul

f. Click on Apply and OK.
g. For accessing 192.168.37.129 without entering username and password, simply click on edited shortcut.

** Caution ** :  Do not edit putty in public computer as your username and password is normal text which can be misused by anyone.

Saturday, 15 August 2015

MongoDB - Installation and Configuration in Linux

MongoDB  is an open-source document database, and the leading NoSQL database. Written in C++.

MongoDB features:

Document-Oriented Storage
Full Index Support
Replication & High Availability
Auto-Sharding
Querying