Showing posts with label DBMS. Show all posts
Showing posts with label DBMS. Show all posts

Sunday, 14 January 2018

Mongo DB - Installation and Configuration


MongoDB  is an open-source document database, and the leading NoSQL database. Written in C++.
  
MongoDB features:
    Document-Oriented Storage
    Full Index Support
    Replication & High Availability
    Auto-Sharding
    Querying
    Fast In-Place Updates
    Map/Reduce
    GridFS


Reduce cost, accelerate time to market, and mitigate risk with proactive support and enterprise-grade capabilities.


Today, We will see how to install and run the MongoDB.

MongoDB Installation on Linux


1. DOWNLOAD the stable version of MongoDB. It will a tar file
2. Extract the tar file to some directory.
 
$ tar -xvf mongodb.tar -C /learn/mongodb


3.  change the permisson of folder to user who run the db here-  In my case User - hduser and Group - hadoop
$ chown -R hduser:hadoop /learn/mongodb

4. Add the env var in .bashrc
export MONGO_HOME=/learn/mongodb
export PATH=$PATH:$MONGO_HOME/bin







5. Create the default DB directory for Mongo
$ mkdir -R /data/db
$ chown -R hduser:hadoop /data/db

This is by default, you can specify ur db path when starting the mongo db






$ mongod --dbpath /app/mongodata
this command will start the mongodb. in other terminal you can start work on db. "--dbpath /app/mongodata" is totally optional

If you just use just $ mongod , it will start n use the default db which we have defined in step 5.


Please don't close the current terminal, It can be kill the mongodb process.







6. Start working on MongoDB
$ mongo










Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Tuesday, 7 June 2016

Tuning SQL Statement #4



Using Functions in QueriesWhen using functions in SQL statements it is best to keep the functions away from any columns involving index matching unless function-based indexes are available and can be created.

Do’s   
SELECT account_name, trans_date, amount
FROM transaction
WHERE account_name LIKE 'CAPITAL%';    


Don’ts
SELECT account_name, trans_date, amount
FROM transaction
WHERE SUBSTR(account_name,1,7) = 'CAPITAL';




Sequences
A sequence is often used to create unique integer identifiers as primary keys for tables. A sequence is a distinct database object and is accessed as sequence.NEXTVAL and sequence.CURRVAL.
Use sequences for unique integer identifiers; they are centralized, more controllable, more easily maintained, and perform better than other methods of counting.

Do’s  
INSERT INTO supplier (supplier_id, name, ticker)
      VALUES (supplier_seq.NEXTVAL,'A new supplier', 'TICK');    


 Don’ts
INSERT INTO supplier (supplier_id, name, ticker)
      VALUES ((SELECT MAX(supplier_id)+1
      FROM supplier), 'A new supplier', 'TICK');


    

Equi, Anti, and Range
Using an equals sign (equi) is the fastest comparison condition if a unique index exists. Any type of anti comparison such as != or NOT is looking for what is not in a table and thus must read the entire table; sometimes full index scans can be used.

Do’s   
SELECT account_name, trans_date, amount
FROM transaction
WHERE amount > 0;    SELECT account_name, trans_date, amount
FROM transaction
WHERE amount != 0;
 

Don’ts
SELECT account_name, trans_date, amount
FROM transaction
WHERE amount > 0;    SELECT account_name, trans_date, amount
FROM transaction
WHERE amount  NOT = 0;



LIKE Pattern MatchingIn general, since LIKE will match patterns which are in no way related to indexes, LIKE will usually read an entire table.






Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Monday, 23 May 2016

Tuning SQL Statement #3


Tuning SQL Statement #1.......
Tuning SQL Statement #2.......

Consider whether a UNION ALL will suffice in place of a UNION.
The UNION statement performs the equivalent of a SELECT DISTINCT on the final result set. In other words, the UNION clause forces all rows returned by each portion of the UNION to be sorted and merged and duplicate to be filtered before the first row is returned.  A UNION ALL simply returns all rows including duplicates and does not have to perform any sort, merge or filter.  If your tables are mutually exclusive (include no duplicate records), or you don't care if duplicates are returned, the UNION ALL is much more efficient as it does not perform the SELECT DISTINCT function, which saves a lot of unnecessary resources from being used.

Do’s    
SELECT acct_num, balance_amt
FROM debit_transactions
WHERE tran_date = '1-JAN-06' 
UNION ALL
SELECT acct_num, balance_amt
FROM credit_transactions
WHERE tran_date = '1-JAN-06' ;  

Don’ts
SELECT acct_num, balance_amt
FROM debit_transactions
WHERE tran_date = '1-JAN-06'
UNION
SELECT acct_num, balance_amt
FROM credit_transactions
WHERE tran_date = '1-JAN-06' ;



http://www.datagenx.net/2016/04/tuning-select-statement-2.html
The DECODE Function
Using the DECODE function as a replacement for multiple query set operators is good for performance but should only be used in extreme cases such as the UNION clause joined SQL statements as shown.

Do’s  
SELECT stock_id||' '||
      DECODE(SIGN(qtyonhand)
            ,-1,'Out of Stock',0,'Out of Stock'
            ,1,DECODE(SIGN(qtyonhand-min)
                   ,-1,'Under Stocked',0,'Stocked'
                   ,1,DECODE(sign(qtyonhand-max)
                        ,-1,'Stocked',0,'Stocked'
                          ,1,'Over Stocked'
                   )
            )
      ) FROM stock;  
    

Don’ts
SELECT stock_id||' Out of Stock' FROM stock WHERE
   qtyonhand <=0
UNION
SELECT stock_id||' Under Stocked' FROM stock
      WHERE qtyonhand BETWEEN 1 AND min-1
UNION
SELECT stock_id||' Stocked' FROM stock
WHERE qtyonhand BETWEEN min AND max
UNION
SELECT stock_id||' Over Stocked' FROM stock
      WHERE qtyonhand > max;


    

Datatype Conversions
Try to avoid using any type of data conversion function in any part of an SQL statement which could potentially match an index, especially if you are trying to assist performance by matching appropriate indexes.

Do’s  
SELECT ...
FROM customer
WHERE cust_id = TO_NUMBER('11');    SELECT ...
FROM customer
WHERE cust_id = '11';

Don’ts
SELECT ...
FROM customer
WHERE cust_type = 1;    SELECT ...
FROM emp
WHERE TO_NUMBER (cust_type) = 1;






Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Wednesday, 27 April 2016

Tuning SQL Statement #2


Tuning SQL Statement #1.......
Tuning SQL Statement #3.......


•    The UNION statement performs the equivalent of a SELECT DISTINCT on the final result set. In other words, the UNION clause forces all rows returned by each portion of the UNION to be sorted and merged and duplicate to be filtered before the first row is returned.  A UNION ALL simply returns all rows including duplicates and does not have to perform any sort, merge or filter.  If your tables are mutually exclusive (include no duplicate records), or you don't care if duplicates are returned, the UNION ALL is much more efficient as it does not perform the SELECT DISTINCT function, which saves a lot of unnecessary resources from being used.

Do's-
SELECT acct_num, balance_amt FROM debit_transactions  WHERE tran_date = '1-JAN-06' 
UNION ALL
SELECT acct_num, balance_amt  FROM credit_transactions  WHERE tran_date = '1-JAN-06' ;  

Don'ts-
SELECT acct_num, balance_amt  FROM debit_transactions  WHERE tran_date = '1-JAN-06'
UNION
SELECT acct_num, balance_amt  FROM credit_transactions  WHERE tran_date = '1-JAN-06' ;



•    Try to avoid using any type of data conversion function in any part of an SQL statement which could potentially match an index, especially if you are trying to assist performance by matching appropriate indexes

Do's-
SELECT ...  FROM customer  

WHERE cust_id = TO_NUMBER('11');  

SELECT ...  FROM customer
WHERE cust_type = 1;    


Don'ts-
SELECT ...  FROM customer
WHERE cust_id = '11';

SELECT ...  FROM emp
WHERE TO_NUMBER (cust_type) = 1;



•    When using functions in SQL statements it is best to keep the functions away from any columns involving index matching unless function-based indexes are available and can be created.

Do's-
SELECT account_name, trans_date, amount  FROM transaction
WHERE account_name LIKE 'CAPITAL%';  

Don'ts-
SELECT account_name, trans_date, amount  FROM transaction
WHERE SUBSTR(account_name,1,7) = 'CAPITAL';



•    If your SELECT statement contains a HAVING clause, write your query so that the WHERE clause does most of the work (removing undesired rows) instead of the HAVING clause do the work of removing undesired rows. Using the WHERE clause appropriately can eliminate unnecessary rows before they get to the GROUP BY and HAVING clause, saving some unnecessary work, and boosting performance

Do's-
SELECT country, AVG (population)  FROM survey
WHERE country != 'INDIA'
AND country != 'CHINA';
GROUP BY country;  

Don'ts-
SELECT country, AVG (population)  FROM survey
GROUP BY country
HAVING country != ‘INDIA'
AND country != 'CHINA’;



•    The ORDER BY clause is always executed after the WHERE clause. Try not to override the WHERE clause with the ORDER BY clause because the Optimizer may choose a less efficient method of execution based on the ORDER BY clause. Try to put your where columns in order by clause if possible.

Do's-
SELECT * FROM ordersline
WHERE order_id < 10
ORDER BY order_id, seq#;  

Don'ts-
SELECT * FROM ordersline
WHERE order_id < 10
ORDER BY seq#;







Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Tuesday, 26 April 2016

Tuning SQL Statement #1



•    Try to use explicit columns and try to read columns in index orders if possible, even to the point of reading indexes and not tables.
•    Use a standard approach to table aliases.  If two identical SQL statements vary because an identical table has two different aliases, then the SQL is different and will not be shared.
•    Use table aliases and prefix all column names by their aliases when more than one table is involved in a query.  This reduces parse time AND prevents future syntax errors if someone adds a column to one of the tables with the same name as a column in another table.
•    Use bind variables.  The values of bind variables do not need to be the same for two statements to be considered identical.  Bind variables are not substituted until a statement has been successfully parsed.

i.e -
Do's     - Select FirstNme, MidName, LastName from atul.employee;
Don'ts  - Select * from atul.employee;



•    Try to match comparison condition column sequences with existing index column sequences, although it is not strictly necessary.
•    Always try to use unique, single-column indexes wherever possible. A single-column unique index is much more likely to produce exact hits. An exact hit is the fastest access method.
•    Try to avoid WHERE clauses that are NEGATIVE. Negative key words are in the WHERE clause, such as "IS NULL", "<>", "!=", "!>", "!<", "NOT", "NOT EXISTS", "NOT IN", "NOT LIKE"

i.e -
Do's     - Select FirstNme from atul.employee where DeptCd Like 'IN%';
Don'ts  - Select FirstNme from atul.employee where SubSTR(DeptCd, 1,2) = 'IN';

Do's     - Select Material from atul.store where quantity > 0;
Don'ts  - Select Material from atul.store where quantity <> 0;

Do's     - Select Material from atul.store where quantity > 3;
Don'ts  - Select Material from atul.store where quantity + 3 > 6;



Tuming SELECT Statement Continues...............




Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Thursday, 27 August 2015

DB2 Column Compare Tool





            Whenever, we are working in Datawarehouse / Database (Oracle / MSSQL / DB2) projects, we often need to compare to tables column wise. When such requirement comes, we might end up in confusion putting the columns list in the excel sheet and comparing each column by column.

              Here’s procedure you can create in your local DB and run it feeding it with Tables names as input, and in one run you get three outputs: Matched columns, Table1 unmatched columns, table2 unmatched columns. You might need to give schema name of Table if there are two tables in SYSCOLUMNS with the same name.




You can modify or re-write the same procedure for your respective DBs.




Saturday, 22 August 2015

MongoDB & RDBMS Terminology





Term relationship between MongoDB ( NoSQL) and RDBMS -






MongoDB Term RDBMS Term
Database  Database
Table Collection
Tuple or Row Document
Column Field or Key
Table Join Embeded Document
Primary Key Primary Key ( default _id field)
mongod ( DB process) oracle/db2 /mysqld
mongo sqlplus/db2 client/mysql




Saturday, 15 August 2015

MongoDB - Installation and Configuration in Linux





MongoDB  is an open-source document database, and the leading NoSQL database. Written in C++.
  
MongoDB features:


    Document-Oriented Storage
    Full Index Support
    Replication & High Availability
    Auto-Sharding
    Querying
    Fast In-Place Updates
    Map/Reduce
    GridFS


Reduce cost, accelerate time to market, and mitigate risk with proactive support and enterprise-grade capabilities.

Today, We will see how to install and run the MongoDB.