Showing posts with label Key. Show all posts
Showing posts with label Key. Show all posts

Tuesday, 30 October 2018

Converting PFX Certs to Certificate and Key Files using OpenSSL



Recently, I have an ask where we want to automate Web Service Health Check manual steps. I have choose Python to do that (Cause one only my fav language :-) but faced a very initial hiccup as PFX certs. As our web services are SSL enabled which need Public/Private Key (Or Certificate) to authenticate the Consumer and Python supports certificates and keys only in PEM format.

So let's learn how we can convert PFX (Personal Information Exchange) certs to PEM (Privacy Enhanced Mail) certificate.



  • Extract certificate.pem or certificate.cer from certificate.pfx:
  • -


  • Extract Key - 
This is 2 step process as below, first command extract the encrypted key and then second command use to decrypt the key to simple text.
-


Now, we can use these keys in our Python Health Check code, we will discuss that later.
Till then, Keep Loving this Blog .......... Happy Learning...........



Like the below page to get update  
Facebook Page      Facebook Group      Twitter Feed      Google+ Feed      Telegram Group     


Friday, 11 November 2016

DataStage Partitioning #2


Keyless partition
    Round Robin
    Entire
    Same
    Random

Round Robin
The first record goes to the first processing node, the second to the second processing node, and so on. When DataStage reaches the last processing node in the system, it starts over. This method is useful for resizing partitions of an input data set that are not equal in size. The round robin method always creates approximately equal-sized partitions. This method is the one normally used when DataStage initially partitions data.
Example: Assume 8 nodes are allocated to store the records then 1st record will go into the first node and the 2nd record will go into the second node ……8th record will go into the eighth node and the 9th record will go into the first node and so on….

Entire
Send all rows down all partitions.
Example: Assume 8 nodes are allocated, then in all the 8 nodes all the records will be passed.

Same
Preserve the same partitioning.
Example: Two stages in a job (Sort and Dataset). In sort stage you have done “Hash” partition and in the dataset you have given “Same” partition. In the dataset the data will be preserved with the hash partition.

Random
DataStage uses a random algorithm to choose where the rows goes. The result of Random is that you cannot know where a row will end up.


Application Execution: Parallel jobs can be executed in two ways
> Sequential
> Parallel
In the first slot of the below figure sequential execution is shown. Parallel job can be executed in two processing, SMP and MPP, in the second and third slot of the below figure it is shown.




Ref - www.ibm.com


Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Thursday, 10 November 2016

DataStage Partitioning #1


Partitioning mechanism divides a portion of data into smaller segments, which is then processed independently by each node in parallel. It helps make a benefit of parallel architectures like SMP, MPP, Grid computing and Clusters.

Partition is logical. Partition is to divide memory or mass storage into isolated sections. Memory space will be split into many partitions to have high parallelism. In DOS systems, you can partition a disk, and each partition will behave like a separate disk drive.


Note:
In hash partitioning no specified space will be allocated to a partition in the memory. The partition space is allocated depending upon the data.



Why Partition?
•    Ability to run multiple operating systems, or multiple versions of an operating system, on the same server
•    Ability to improve workload balancing and distribution by managing processor allocations across applications and users on the server
•    Ability to leverage hardware models such as “Capacity on Demand” and "Pay as You Grow.”

Types of partition
  • Hash
  • Modulus
  • DB2
  • Auto
  • Random
  • Range
  • Round Robin
  • Entire
  • Same

Auto

DataStage inserts partitioners as necessary to ensure correct result. Generally chooses Round Robin or Same. Since Datastage has limited awareness of data and business rules, best practice is to explicitly specify partitioning as per requirement when processing requires groups of related records.

Key based partition
  • Hash
  • Modulus
  • DB2
  • Range

Hash
Determines partition based on key value(s). One or more keys with different data type are supported. DataStage’s internal algorithm applied to key values determines the partition. All key values are converted to characters before the algorithm is applied.
Example: Key is State. All “CA” rows go into one partition; all “MA” rows go into one partition. Two rows of the same state never go into different partitions.

Modulus

Partition based on modulus of key divided by the number of partitions. Key is an Integer type. ( partition=MOD(key_value/number of partition) )
Example: Key is OrderNumber (Integer type). Rows with the same order number will all go into the same partition.

DB2
Matches DB2 EEE partitioning, DB2 published its hashing algorithm and DataStage copies that.
Example: This partition is used when loading data into the DB2 table. It takes the partition key from the loading DB2 table and inserts the records effectively. If the partition key is defined in the DB2 database then it takes that Partition key otherwise it defaults to primary key.

Range
The partition is chosen based on a range map, which maps ranges of values to specified partitions. This is similar to Hash, but partition mapping is user-determined and partitions are ordered. Range partitioning requires processing the data twice which makes it hard to find a reason for using it.

This figure gives the clear view of Key based Partitioning and repartitioning.



DataStage's parallel technology operates by a divide-and-conquer technique, splitting the largest integration jobs into subsets ("partition parallelism") and flowing these subsets concurrently across all available processors ("pipeline parallelism"). This combination of pipeline and partition parallelism delivers true linear scalability (defined as an increase in performance proportional to the number of processors) and makes hardware the only mitigating factor to performance.

                However, downstream processes may need data partitioned differently. Consider a transformation that is based on customer last name, but the enriching needs to occur on zip code - for house-holding purposes - with loading into the warehouse based on customer credit card number (more on parallel database interfaces below). With dynamic data re-partitioning, data is re-partitioned on-the-fly between processes - without landing the data to disk - based on the downstream process data partitioning needs.


Ref - www.ibm.com



Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/