Wednesday, 19 October 2016

Jenkins with Windows #3

Continuing from my last posts (Jenkins with Windows #1 & Jenkins with Windows #2), I am adding some more tips and solutions for the Jenkins installation.

Install Jenkins as a Windows service -
You can follow the Jenkins Wiki page to create a Windows service for Jenkins, so whenever you have to start/stop/restart it, you do not need to open a command prompt and type commands.
It installs as a Windows service, which you can see in Services (services.msc). The page link is below -

Jenkins Wiki Page Link - Install Jenkins as a Windows service

http://www.datagenx.net/2016/10/jenkins-with-windows-1.html


Configuring Jenkins as a Windows service -
When you install Jenkins as a Windows service, you have to check/change the jenkins.xml file created in the JENKINS_HOME directory.

If you are using a port other than 8080, make the changes below in jenkins.xml; otherwise leave it as it is.

<!--
    if you'd like to run Jenkins with a specific version of Java, specify a full path to java.exe.
    The following value assumes that you have java in your PATH.
  -->
  <executable>java</executable>
  <arguments>-Xrs -Xmx256m -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle -jar "%BASE%\jenkins.war" --httpPort=9090 --webroot="%BASE%\war"</arguments>


Change the httpPort to the port where you want to run Jenkins and then restart the Jenkins service. Now you can access Jenkins on http://localhost:9090
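If you would rather not edit the arguments line by hand, the port change can be sketched in a few lines of Python. This is only a minimal sketch: the function name set_http_port is hypothetical (not part of Jenkins), and it works on the arguments string copied from the jenkins.xml snippet above rather than on the file itself.

```python
import re

# Arguments string as it appears in the <arguments> element of jenkins.xml above.
ARGUMENTS = (
    '-Xrs -Xmx256m -Dhudson.lifecycle=hudson.lifecycle.WindowsServiceLifecycle '
    '-jar "%BASE%\\jenkins.war" --httpPort=9090 --webroot="%BASE%\\war"'
)


def set_http_port(arguments: str, port: int) -> str:
    """Return the arguments string with --httpPort set to the given port.

    Hypothetical helper for illustration; it is not part of Jenkins.
    """
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    option = f"--httpPort={port}"
    if re.search(r"--httpPort=\d+", arguments):
        # Replace the existing port value.
        return re.sub(r"--httpPort=\d+", option, arguments)
    # No port option present yet: append one after the other options.
    return f"{arguments} {option}"


if __name__ == "__main__":
    print(set_http_port(ARGUMENTS, 8181))
```

The new string still has to be saved back into jenkins.xml, and the Jenkins service restarted, before the port change takes effect.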








Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Tuesday, 18 October 2016

Jenkins with Windows #2



In the last post, we covered what Jenkins is and its benefits (http://bit.ly/2e7mNbb). Today we are going to cover its installation on Windows.


Prerequisite & Installation Steps:

1. Your system should have the latest Java installed and the Java path should be configured. You can check it as below -

2. Download the latest win-bash, which is required by Jenkins - https://sourceforge.net/projects/win-bash/


3. Extract the win-bash zip file and add the folder path to the system environment variable PATH. You can check whether it is configured by executing a Linux command




4. Check if port 8080 is available
              netstat -ano | findstr 8080
If you get any output like below, it means port 8080 is not available.

5. Configure the JENKINS_HOME variable (add JENKINS_HOME as an environment variable and add JENKINS_HOME to the PATH variable too).


6. Add "C:\Windows\SysWOW64" to your PATH
 



7. Go to the Jenkins home page - https://jenkins.io
8. Download the latest jenkins.war file by clicking on Download Jenkins

9. Execute the command below to install Jenkins

If port 8080 is available, then -
               java -jar jenkins.war
else
               java -jar jenkins.war --httpPort=9090 (you can give any available port)



10. When the installation is complete (do not close your installation command prompt), open the link below in a web browser (use http://localhost:8080/ if you started Jenkins on the default port) -
               http://localhost:9090/
       This will take you to the very first screen of Jenkins, as below -


11. You can get the password either from the installation log, as below, or from the file path mentioned on the page -


12. After successful login, you will be taken to the plugin screen (choose Install suggested plugins) -

 Plugin installation -

13. After plugin installation, create a new admin user -
14. After user creation, choose Save and Close
15. Refresh the page and log in with the new admin credentials you created.

16. Jenkins installation is complete.
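The port check from step 4 and the command choice from step 9 can also be sketched together in Python. This is a rough sketch, not part of the Jenkins tooling: the helper names port_is_free and jenkins_command are made up for illustration, the check only tries to bind on 127.0.0.1, and it does not verify that the fallback port (9090 here, as in the post) is itself free.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if we can bind to host:port, i.e. nothing else is using it.

    Hypothetical helper; a rough stand-in for `netstat -ano | findstr 8080`.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def jenkins_command(preferred: int = 8080, fallback: int = 9090) -> list[str]:
    """Build the launch command, adding --httpPort only when the default port is busy."""
    command = ["java", "-jar", "jenkins.war"]
    if not port_is_free(preferred):
        # Assumption: the fallback port is free; it is not checked here.
        command.append(f"--httpPort={fallback}")
    return command


if __name__ == "__main__":
    print(" ".join(jenkins_command()))
```

The printed line is the command you would then type in the command prompt from the folder that contains jenkins.war.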

In the next post, we will see how to install Jenkins as a Windows service and what to do if you missed the new admin user creation.





Sunday, 16 October 2016

Jenkins with Windows #1


One of my team members was assigned to install and configure "Jenkins" on our server, so out of curiosity I asked, what is this now? I didn't get a satisfactory answer :-) so I thought of getting my hands dirty with it. Here, I am sharing whatever I learn.

What is Jenkins:-
Wikipedia says: Jenkins is an open source automation server written in Java. Jenkins helps to automate the non-human part of the whole software development process, with now-common things like Continuous Integration, but also by further empowering teams to implement the technical part of Continuous Delivery.

https://jenkins.io/

What is Continuous Integration & Continuous Delivery:-
CI is a process that most developers follow to keep their code base intact. It's a common practice when you work in a group environment. An analogy for this would be constructing a new home. There will be multiple contractors working on the site. So, if we have installed the window glass and the painter comes in and paints the house, there is a high chance that he will drop some paint on the glass or end up breaking it. So, the inspector comes and checks every day to see if something broke. The same process is applied to building new code. A CI system gathers all the code from different developers and makes sure it compiles and builds fine. This is good, but not complete. I will get to that once I finish talking about Jenkins.


Jenkins is the inspector in the analogy. Jenkins is nothing but a middleman between your code repo and your build server. It checks for changes in your code repo every few minutes. If it finds any, it gathers them and sends them to your build server. That's what Jenkins is.

Basically, Continuous Integration is the practice of running your tests on a non-developer machine automatically every time someone pushes new code into the source repository.

This has the tremendous advantage of always knowing whether all tests pass and of getting fast feedback. The fast feedback is important: right after you break the build (that is, introduce changes that make either the compile/build cycle or the tests fail), you know what you did that failed and how to revert it.

If you only run your tests occasionally, a lot of code changes may have happened since the last run, and it is rather hard to figure out which change introduced the problem. When the tests run automatically on every push, it is always pretty obvious what and who introduced the problem.

Built on top of Continuous Integration are Continuous Deployment/Delivery, where after a successful test run you instantly and automatically release the latest version of your codebase. This makes deployment a non-issue and helps you speed up your development.


Jenkins offers the following major features out of the box, and many more can be added through plugins:

Developer time is focused on work that matters: Much of the work of frequent integrations is handled by automated build and testing systems, meaning developer time isn't wasted on large-scale error-ridden integrations.

Software quality is improved: Any issues are detected and resolved almost immediately, keeping software in a state where it can be safely released at any time.

Faster development: Integration costs are reduced both because serious integration issues are less likely and because much of the work of integration is automated.

Easy installation: Just run java -jar jenkins.war, or deploy it in a servlet container. No additional install, no database. Prefer an installer or native package? We have those as well.

Easy configuration: Jenkins can be configured entirely from its friendly web GUI with extensive on-the-fly error checks and inline help.

Rich plugin ecosystem: Jenkins integrates with virtually every SCM or build tool that exists.

Extensibility: Most parts of Jenkins can be extended and modified, and it's easy to create new Jenkins plugins. This allows you to customize Jenkins to your needs.

Distributed builds: Jenkins can distribute build/test loads to multiple computers with different operating systems. Building software for OS X, Linux, and Windows? No problem.


Check out part 2 for installation.



Sources:
https://jenkins.io/
https://en.wikipedia.org/wiki/Jenkins_(software)
http://stackoverflow.com
https://www.quora.com




Wednesday, 12 October 2016

Script to Auto Compress the System Log Files


This script was originally written by "Andy Welter" to compress Linux system log files. I have modified it to work better. You can find the modified version below -

Script Usage:
logroll [-compress|-nocompress]

$ logroll -compress
# Compress the log files and move them to the archive directory

$ logroll -nocompress
# Move the files to the archive directory without compressing them
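To give a rough idea of what the two modes do, here is a small Python sketch. It is not Andy Welter's script and not my modified version of it, only an illustration of the compress/no-compress behaviour described in the usage above. The function name roll_logs, the *.log pattern and the directory arguments are all assumptions.

```python
import gzip
import shutil
from pathlib import Path


def roll_logs(log_dir, archive_dir, compress=True):
    """Move every *.log file from log_dir to archive_dir, gzipping it if asked.

    Illustrative only: the *.log pattern and both directories are assumptions.
    """
    log_dir, archive_dir = Path(log_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    archived = []
    for log_file in sorted(log_dir.glob("*.log")):
        if compress:
            target = archive_dir / (log_file.name + ".gz")
            with open(log_file, "rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            # Remove the original once the compressed copy is written.
            log_file.unlink()
        else:
            target = archive_dir / log_file.name
            shutil.move(str(log_file), str(target))
        archived.append(target)
    return archived
```

Calling roll_logs(some_log_dir, some_archive_dir) corresponds to the -compress mode, and passing compress=False corresponds to -nocompress. A real log roller also has to think about files that a running process still holds open, which this sketch ignores.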


=================================================================== ===================================================================





Tuesday, 11 October 2016

5 Tips For Better DataStage Design #16


1. Use a 4-node configuration file for unit testing/system testing the job.
2. If there are multiple jobs to be run for the same module, archive the source files in the after-job routine of the last job.
3. Check whether the file exists in the landing directory before moving the sequential file. If the file is not found (for example, when the file name resolves to an empty value), the ‘mv’ command may move the landing directory itself.

4. Ensure that the Unix files created by any DataStage job are created by the same Unix user who ran the job.
5. Make sure that the Short Job Description is filled in using ‘Description Annotation’ and that it contains the job name as part of the description. Don’t use a plain Annotation for the job description.
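The existence check from tip 3 can be sketched as below. This is only an illustration in Python, not DataStage code: the function name move_if_present and its arguments are hypothetical, and in a real job the same check would usually live in a before/after-job shell command or routine.

```python
import shutil
from pathlib import Path


def move_if_present(landing_dir, file_name, target_dir):
    """Move landing_dir/file_name into target_dir only if it is an existing file.

    Returns the new path, or None when nothing was moved.
    """
    if not file_name:
        # An empty name would make the source path point at the landing directory itself.
        return None
    source = Path(landing_dir) / file_name
    if not source.is_file():
        return None
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(source), str(target_dir / source.name)))
```

The point of the guard is that a missing or empty file name results in no move at all, instead of the landing directory being moved by accident.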






Monday, 10 October 2016

MarkDown Cheat Sheet

Saturday, 8 October 2016

#2 DataStage Solutions to Common Warnings/Error - Null Handling


Warnings/Errors Related to Null Handling -



1.1       When checking operator: When binding output interface field “XXXXX” to field “XXXXX”: Converting a nullable source to a non-nullable result

Cause: This can happen when reading from an Oracle database, or in any processing stage, where the input column is defined as nullable and the metadata in DataStage is defined as non-nullable.

Resolution: Convert the nullable field to non-nullable by applying the available null-handling functions in DataStage or in the query.


1.2       APT_CombinedOperatorController(1),0: Field 'XXXXX' from input dataset '0' is NULL. Record dropped.

Cause: This can happen when there is no null handling on the column and the same column is used in constraints/Stage Variables.

Resolution: Apply a null-handling function to the column used in the constraint/Stage Variable.


http://www.datagenx.net/2016/09/datastage-solutions-to-common.html


1.3       Fatal Error: Attempt to setIsNull() on the accessor interfacing to non-nullable field "XXXX".

Cause: This can happen when the column is nullable in the source but is defined as non-nullable in the DB2 stage.

Resolution: Change the Nullable property for the column to “Yes” instead of “No”.


1.4       Exporting nullable field without null handling properties

Cause: This can happen when the columns are defined as nullable in the Sequential File stage and no representation for null values is specified.

Resolution: Specify the Null field value in the Format tab of the Sequential File stage.







Thursday, 6 October 2016

Difference between IBM CDC and CDD



Nowadays I am busy with another IBM tool called CDC - Change Data Capture, now known as IBM InfoSphere Data Replication.
IBM CDC replicates your heterogeneous data in near real time to support data migrations, application consolidation, data synchronization, dynamic warehousing, master data management (MDM), business analytics and data quality processes.
In layman's terms, you can replicate any data automatically in near real time.

When we were at the initial stage of our POC, we were unsure whether to use IBM CDC or CDD (Change Data Delivery) and what the difference between them was. Whenever we googled IBM CDD, we got results with IBM CDC links, so we were fairly sure that the two were the same tool or related to each other. Luckily we found an IBM link which says -
InfoSphere CDD is the exact same code (product) as InfoSphere CDC. The only difference is the licensing model. Please reach out to your IBM Sales Representative for additional details 
(http://www-01.ibm.com/support/docview.wss?uid=swg21650361)

So we contacted our IBM sales buddy to understand the licensing model, and this is what we got to know -

* IBM CDC and CDD are the same product
* IBM CDC comes with Source and Target Agents, and there is an individual licensing cost for each agent.
e.g. - Assuming you have 2 different database vendors (Oracle, Db2) on the source side and 3 target database vendors (SQL Server, MySQL, Netezza), you have to pay for 5 CDC replication agents. This number increases or decreases with the number of different database products you are using.


* IBM CDD (Change Data Delivery) offers some cost relief, but only if you already have an IBM DataStage license. IBM CDD comes with multiple source agents and one target agent for IBM DataStage, which saves cost.
 e.g. - Taking the previous example, we need to buy 2 (for source dbs) + 1 (for DataStage) licenses to use replication.
The benefit here is that you only pay for 1 target agent (for DataStage) and use DataStage to deliver your data to any target, which saves the cost of multiple target licenses.



This is the only reason why they have different names despite being the same software - 
IBM CDC - Change Data Capture ( n source + n target )  [DataStage not required]
IBM CDD - Change Data Delivery ( n source  + 1 target ) [DataStage required]
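The agent arithmetic above can be written down as a tiny Python sketch. It only counts agents the way our sales contact described it (one per distinct database vendor, plus one DataStage target for CDD). The function names are made up for illustration and no prices are involved; the actual terms should be confirmed with IBM.

```python
def cdc_agent_licenses(source_vendors, target_vendors):
    """CDC: one agent per distinct source vendor plus one per distinct target vendor."""
    return len(set(source_vendors)) + len(set(target_vendors))


def cdd_agent_licenses(source_vendors):
    """CDD: one agent per distinct source vendor plus a single DataStage target agent."""
    return len(set(source_vendors)) + 1


if __name__ == "__main__":
    sources = ["Oracle", "Db2"]
    targets = ["SQL Server", "MySQL", "Netezza"]
    print("CDC agents:", cdc_agent_licenses(sources, targets))  # 2 + 3 = 5
    print("CDD agents:", cdd_agent_licenses(sources))           # 2 + 1 = 3
```

With the example from this post, CDC needs 5 agents while CDD needs 3, and the gap grows with every additional target database vendor.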



