My scrapbook about almost anything I stumble upon in my tech world. If you find anything useful don't forget to give thumbs-up :)

Breaking

Friday, February 26, 2016

Audit Strategy in ETL #2


First - Audit Strategy in ETL #1




As shown in the above diagram, the audit services are dependent on source systems and target systems for capturing audit data; unlike error handling services are dependent on ETL process to capture error data. Audit services are created as shared services that plug in to each ETL process.

Within the proposed Next Gen BI environment there are 4 stages (Initial Staging, Clean Staging, EDW, and Data Marts) where data is landed after extracting from source systems.
There will be at least one ETL job flow process at each of these stages extracting and loading data. Audit services are called at each stage by the ETL job flow process and the audit data is captured and reported.


Some of the key audit data that will be captured by the audit services is given below :


Field # Field Description
1 Data Server Name A name that identifies a database service.  This is called a "server" by some products.  Oracle refers to it by the name "instance" or "database".
2 Database Name
3 Data Table Name A name that identifies a Target Data Table within a Database.
4 Source Name A name that identifies the file, table or other data structure that is the origin of the data to be loaded into the target table.
5 ETL Process Name DataStage Sequencer
6 ETL Sub Process Name DataStage Job
7 New Rows in Source Count The number or rows in the source table that have been added since the prior run date.  This is the number that the process attempted to add to the target table.  (If the source is a file, this field should be NULL.)
8 Changed Rows in Source Count The number of rows in the source table which were modified since the prior run date.  This is the number that the process attempted to modify in the target. (If the source is a file, this field will be NULL.)
9 Processed Rows in Source Count The total number of rows in the source table or file that the process attempted to load to the target.  (If the source is a table, this is the sum of the New_Rows_in_Source_Cnt and Changed_Rows_in_Source_Cnt.)
10 Starting Row Count The number of rows in a table at the beginning of a Table Load.
11 Ending Row Count The number of rows in a table at the conclusion of a Table Load.
12 Insert Row Count The number of relational database table rows that are processed in an INSERT operation.
13 Update Row Count The number of relational database table rows that are processed in an UPDATE operation.
14 DW_CREATE_DTTM                            Date/time at which audit data was captured






Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

No comments:

Post a Comment

Disclaimer

The postings on this site are my own and don't necessarily represent IBM's or other companies positions, strategies or opinions. All content provided on this blog is for informational purposes and knowledge sharing only.
The owner of this blog makes no representations as to the accuracy or completeness of any information on this site or found by following any link on this site. The owner will not be liable for any errors or omissions in this information nor for the availability of this information. The owner will not be liable for any losses, injuries, or damages from the display or use of his information.