Showing posts with label MongoDB. Show all posts

Monday, 18 February 2019

MongoDB Index in Python - Simple Index


Like RDBMS systems, MongoDB provides indexes so that it can process queries faster and return the result set sooner. MongoDB supports different types of indexes, such as single-key, compound, multikey, partial and text indexes. We will look at them one by one.

We start with the simple (single-key) index, which uses only one key from the collection/document [equivalent to table/row in RDBMS systems]. Let's see how -


Mongo Shell Command:  db.<collectionName>.createIndex({<field>:<direction>})
pyMongo Command:      db.<collectionName>.create_index([(<field>, <direction>)])

Let's analyze the impact of Index creation on Query Performance, first via mongo shell, second in python - 

In MongoShell:

In our example, we use the collection 'people', which has the field 'last_name'.

db.people.find({last_name:'Tucker'}).explain('executionStats')

The above command will generate the execution stats for a query where last_name == 'Tucker'.


As the execution plan shows, MongoDB scanned the whole collection (50747 documents in total to fetch 65 records), which is costly when your collection is big.

Now, let's create a simple (single-key) index -

db.people.createIndex({"last_name":1}) 



Now, run the same query again -

db.people.find({last_name:'Tucker'}).explain('executionStats')


This time MongoDB finds that there is an index available on the last_name field and uses it to fetch the result. It scanned only 65 index keys to fetch 65 records.
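
The same check can be done from Python. Below is a minimal sketch, not the exact notebook code from this series: `explain_find` and `summarize_explain` are hypothetical helper names, and `db` is assumed to be a PyMongo database handle. The two sample dictionaries only reuse the numbers quoted above (the keys examined for the collection scan and the documents examined for the index scan are filled in as assumptions), so the summary part runs without a server.

```python
def explain_find(db, collection_name, query):
    """Ask the server for the executionStats of a find (needs a live connection)."""
    return db.command(
        "explain",
        {"find": collection_name, "filter": query},
        verbosity="executionStats",
    )


def summarize_explain(plan):
    """Pick out the counters that show how much work the query did."""
    stats = plan["executionStats"]
    return {
        "returned": stats["nReturned"],
        "keys_examined": stats["totalKeysExamined"],
        "docs_examined": stats["totalDocsExamined"],
    }


# Sample plans built from the figures in this post (not real server output).
before_index = {"executionStats": {
    "nReturned": 65,
    "totalKeysExamined": 0,      # assumption: a collection scan reads no index keys
    "totalDocsExamined": 50747,
}}
after_index = {"executionStats": {
    "nReturned": 65,
    "totalKeysExamined": 65,
    "totalDocsExamined": 65,     # assumption: one document fetched per matching key
}}

print(summarize_explain(before_index))
print(summarize_explain(after_index))
```

Against a real database the flow would roughly be db.people.create_index([("last_name", 1)]) followed by summarize_explain(explain_find(db, "people", {"last_name": "Tucker"})).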

A single-key index can be used in the scenarios below -
   - Querying on a range of indexed key values
   - Querying on selected values of the indexed key

Advantage:
  - A sort on the indexed key can be served straight from the index, so no separate in-memory sort stage is needed
  - The index can be traversed in either sort order - ascending or descending

Consideration while designing a single-key index:
  - Do not create a single-key index on every field of a collection. Each index has to be updated on every write and takes up memory, so too many indexes slow down writes and can hurt read performance as well.





Like the below page to get the update  
Facebook Page      Facebook Group      Twitter Feed      Google+ Feed      Telegram Group     


Friday, 4 January 2019

How to Query on MongoDB by _id - INTERMEDIATE II


What if we have to query a MongoDB collection on the "_id" field? Can we really query on "_id"? If so, what is the syntax? Let's try this out -

Let's first fetch a document's id -

$ db.user.find_one()

{'_id': ObjectId('5c16e863817810ed3fc5e5f9'),
 'Fname': 'atul',
 'Lname': 'Singh',
 'Grade': 12.0,
 'College': 'SGM',
 'Job': 'Student',
 'Address': 'Young St.'}



Now, we will pick the object id and use it to fetch the same document from the collection -

$ db.user.find_one({'_id':'5c16e863817810ed3fc5e5f9'}) #this will return nothing

$ db.user.find_one({'_id':ObjectId('5c16e863817810ed3fc5e5f9')})
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
 in 
----> 1 db.user.find_one({'_id':ObjectId('5c16e863817810ed3fc5e5f9')})

NameError: name 'ObjectId' is not defined


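The first query returns nothing because an ObjectId is not the same as its string representation: the string must be converted to an ObjectId before it is passed to the find command. The second query fails with a NameError simply because the ObjectId class has not been imported yet.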

$ from bson.objectid import ObjectId
$ db.user.find_one({'_id':ObjectId('5c16e863817810ed3fc5e5f9')})

{'_id': ObjectId('5c16e863817810ed3fc5e5f9'),
 'Fname': 'atul',
 'Lname': 'Singh',
 'Grade': 12.0,
 'College': 'SGM',
 'Job': 'Student',
 'Address': 'Young St.'}
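
If the id arrives as a plain string (from a URL or a log line, for example), the conversion can be wrapped in a small helper. This is only a sketch: `looks_like_object_id` and `find_by_id` are hypothetical names, `collection` is assumed to be a PyMongo collection such as db.user, and the 24-hex-character check is a convenience test done before handing the string to ObjectId.

```python
import re


def looks_like_object_id(value):
    """True when value is a 24-character hexadecimal string."""
    return isinstance(value, str) and re.fullmatch(r"[0-9a-fA-F]{24}", value) is not None


def find_by_id(collection, id_string):
    """Fetch one document by the string form of its _id (needs a live connection)."""
    # Imported here so the check above also works where PyMongo is not installed.
    from bson.objectid import ObjectId

    if not looks_like_object_id(id_string):
        raise ValueError("not a valid ObjectId string: %r" % (id_string,))
    return collection.find_one({"_id": ObjectId(id_string)})


print(looks_like_object_id("5c16e863817810ed3fc5e5f9"))
print(looks_like_object_id("Tucker"))
```

With the user collection from this post, find_by_id(db.user, '5c16e863817810ed3fc5e5f9') would be expected to return the same document as the query above.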



IPYTHON Notebook can be found HERE










Thursday, 3 January 2019

MongoDB with Python - Basics IV - Update & Delete Operation


Welcome to one more quick session on MongoDB CRUD basics, this time on Update and Delete operations. MongoDB (through PyMongo) provides the methods below for these operations -

** Update
      * update_one
      * update_many
      * replace_one

** Delete
      * delete_one
      * delete_many

For more on MongoDB -> Link


As the names suggest (_one and _many), these methods perform the operation on a single record or on multiple records, based on the condition passed.
The update statement supports many operators; a few are listed below -

$set - add a new field or update a field's value
$unset - remove a field
$inc - increment the current value
$push - push an element into an array field
$push with $each - push multiple elements into an array field
$pop - remove the first or last element from an array field
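
To make the operators above concrete, here is a hedged sketch of how each one looks as a PyMongo update document. The field names Job, Address and Grade come from the sample user document used elsewhere in this series; the Skills array field, the new values and the helper names `run_update_examples` and `remove_students` are assumptions made only for illustration. `users` is assumed to be a PyMongo collection.

```python
# Filter that selects the sample user.
USER_FILTER = {"Fname": "atul"}

# One update document per operator from the list above.
UPDATE_EXAMPLES = [
    {"$set": {"Job": "Engineer"}},                           # add or overwrite a field
    {"$unset": {"Address": ""}},                             # remove a field
    {"$inc": {"Grade": 1}},                                  # add 1 to the current value
    {"$push": {"Skills": "python"}},                         # append one array element
    {"$push": {"Skills": {"$each": ["mongodb", "sql"]}}},    # append several elements
    {"$pop": {"Skills": 1}},                                 # 1 drops the last element, -1 the first
]


def run_update_examples(users, query=USER_FILTER):
    """Apply each update in turn; return how many documents each call modified."""
    return [users.update_one(query, update).modified_count
            for update in UPDATE_EXAMPLES]


def remove_students(users):
    """Delete every document whose Job is 'Student'; return the number deleted."""
    return users.delete_many({"Job": "Student"}).deleted_count
```

Swapping update_one for update_many applies the same update document to every matching record instead of only the first one.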

MongoDB supports many more update operators; the full list can be found HERE


CRUD Operation - (Update & Delete) : Link








Wednesday, 2 January 2019

How to iterate MongoDB Cursor in Python - Intermediate I


Whenever you query MongoDB, store the output in a variable, called a cursor, before performing any operation on the data. This keeps the data in the variable without cluttering your output area. A PyMongo cursor also supports a few functions that give you information without actually displaying the data, such as the count of retrieved documents or the distinct values of a particular key.

In this session, we will learn about MongoDB cursor variables. For this exercise too, we are going to use the 'USER' database hosted on a free-tier MongoDB Atlas (M0) server.
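
As a rough illustration of the idea (not the notebook code itself), the sketch below iterates a cursor and collects summary information. `list_first_names` and `describe_users` are hypothetical helper names, `users` is assumed to be a PyMongo collection, and the Fname and College fields are taken from the sample user document in this series. The count is taken with the collection-level count_documents, since Cursor.count() is no longer available in PyMongo 4.

```python
def list_first_names(users, query=None):
    """Iterate a cursor one document at a time and collect the first names."""
    cursor = users.find(query or {})   # nothing is fetched until we start iterating
    names = []
    for doc in cursor:                 # the driver pulls documents from the server in batches
        names.append(doc.get("Fname"))
    return names


def describe_users(users):
    """Gather summary information without displaying any document."""
    return {
        "count": users.count_documents({}),
        "colleges": users.distinct("College"),
    }
```

A cursor can be walked only once; wrapping it in list(...) keeps the documents around if they are needed again.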


Jupyter Notebook can be accessed HERE also

= =


Next post in this series and more on MongoDB can be found here -> LINK







Monday, 31 December 2018

MongoDB with Python - Basics III - Find/Select Operation


I hope you are enjoying the NoSQL journey so far (previous posts links). Till now we have seen the basic CRUD operations. From this post onward, I am diving into the details of these operations, starting with the FIND or SELECT operation in MongoDB. We will learn the ways and options MongoDB provides to select or project data.

When you start working with complex queries you might {as I have said, "might"} have difficulty keeping track of braces {([, as I have experienced myself and seen with my team, students and colleagues. But no worries, Jupyter Notebook highlights the matching brace when one is selected, or you can use Notepad++ as well (which I think is less useful, as you are not going to copy/paste the syntax that frequently).

I advise everyone to avoid writing queries directly at the mongo shell prompt, as it doesn't provide any code intelligence and is not good at helping you fix a query once you have made a mistake.

I am sure you will love this post as well. If you have any questions, feel free to ask in the comment section below.

For more on MongoDB -> Link


There are many more Find/Read Operators supported by MongoDB, Full list can be found HERE

CRUD Operation - (Read) : Link






Saturday, 29 December 2018

MongoDB with Python - Basics II - CRUD Operations


In this post, we will learn about advanced Find and Create operations with Sort, Skip and Limit functionality. The PyMongo driver supports almost the same syntax in Python as the mongo shell uses.
The benefit of Python (or any programming language) + Mongo is that you can use both the language's and the DB's functionality to work with your data. Performing an operation inside the database is often faster than doing the same work in Python, but it depends on the activity you are performing.
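
A common use of Sort, Skip and Limit together is paging through results. The following is a minimal sketch under stated assumptions: `page_window` and `fetch_page` are hypothetical helper names, `users` is assumed to be a PyMongo collection, and sorting on the Lname field is just an example choice.

```python
def page_window(page_number, page_size):
    """Translate a 1-based page number into (skip, limit) values."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be positive")
    return (page_number - 1) * page_size, page_size


def fetch_page(users, page_number, page_size=10):
    """Return one page of documents sorted by last name (needs a live connection)."""
    skip, limit = page_window(page_number, page_size)
    cursor = users.find({}).sort("Lname", 1).skip(skip).limit(limit)
    return list(cursor)


print(page_window(3, 10))
```

Here the sorting and slicing happen inside the database, so only one page of documents travels to Python.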



CRUD Operation - (Create, Read, Update & Delete) : Link






Friday, 28 December 2018

MongoDB with Python - Basics I - CRUD Operations


In the previous few posts (Link), we learnt about MongoDB cloud setup, installation and the basic commands for CRUD operations with MongoDB. MongoDB can be accessed from programming languages such as Python, Java and Node.js by using their respective native drivers. We will start with PyMongo (the Python driver) to access Mongo from Python.
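
For a first feel of the PyMongo API, here is a small sketch of one full create/read/update/delete round trip. It is not the code from the linked notebook: `crud_round_trip` is a hypothetical helper, `users` is assumed to be a PyMongo collection (for example MongoClient(uri)["USER"]["user"], where the database and collection names are assumptions), and the sample values are made up.

```python
def crud_round_trip(users):
    """Create, read, update and delete one document; return what was read."""
    # Create: insert_one reports the generated _id of the new document.
    new_id = users.insert_one({"Fname": "atul", "Job": "Student"}).inserted_id
    # Read: fetch the document back by its _id.
    found = users.find_one({"_id": new_id})
    # Update: change a single field with $set.
    users.update_one({"_id": new_id}, {"$set": {"Job": "Engineer"}})
    # Delete: remove the document again so the collection is left unchanged.
    users.delete_one({"_id": new_id})
    return found
```

The method names mirror the mongo shell ones (insertOne, findOne, updateOne, deleteOne), only written in snake_case.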



CRUD Operation - (Create, Read, Update & Delete) : Link






Tuesday, 25 December 2018

MongoDB Atlas - Off Premise Way (DBaaS)


MongoDB also provides a cloud service (Database as a Service - DBaaS), called MongoDB Atlas, to host your Mongo database in the cloud. Let's see how we can set up a cloud account and access MongoDB from a local machine.

Cluster Setup:
1. Create an account on https://cloud.mongodb.com
2. Once you are in, the very first thing it asks is to choose your cluster configuration.
2a. It lets you choose one of the cloud providers - AWS, Google, Azure. Choose whichever you like.
2b. But always choose "FREE TIER CLUSTER" (M0 Instance), else there will be a usage charge.
3. Once you have selected the appropriate config, it will start building your MongoDB cluster. It will take a few minutes to complete the setup.
4. When done, it will look like this. The cluster is usually named Cluster0, though you can modify the name -




How to access from local system:

You need to install the Mongo shell, which comes with the MongoDB package, to access the cloud DB. You can download and install it on your OS (Windows/Linux) from here - https://www.mongodb.com/download-center/enterprise

1. Log in on https://cloud.mongodb.com and click on Clusters in the left-hand side list.
2. Click on Connect and follow the steps below -
3a. Whitelist your IP address so that you can connect from your system. Click on "Add a different IP Address"; to allow connections from any system, use 0.0.0.0 as the IP address.
3b. Create your cluster credentials


4. Once done, you will see the below screen
5a. Now, Click on Choose connection method and click on "Connect with Mongo Shell" -


5b. Now, click on standard connection string


6. Copy the string and replace <PASSWORD> with the password which you created in step 3b.
7. As we have already installed the Mongo shell (see above), we need to add the MongoDB bin directory to the system path. You can add this path to the Windows environment variables or to your Linux user profile so that you can run the mongo command from any location.
8. Once the path has been added, open cmd or a terminal and paste the connection string which you copied and modified in step 6.


9. When connected successfully, you can try to run commands as below -


10. For more commands, you can visit this link - https://www.datagenx.net/2018/12/learn-mongo-db-basics.html

11. The Mongo Atlas cloud setup has been completed and verified successfully. You can connect with the same connection string from any system (if the firewall allows it and the Mongo shell is installed).
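
The password replacement from step 6 can also be scripted when you connect from Python instead of the shell. This is a hedged sketch: `fill_password` is a hypothetical helper, and the connection string, user name and password below are made-up placeholders, not real Atlas values. Passwords containing characters such as @ or / have to be percent-escaped before they go into a MongoDB URI, which is what quote_plus does here.

```python
from urllib.parse import quote_plus


def fill_password(connection_string, password):
    """Replace the <PASSWORD> placeholder with a percent-escaped password."""
    return connection_string.replace("<PASSWORD>", quote_plus(password))


# Hypothetical connection string in the shape Atlas hands out.
template = "mongodb+srv://demo_user:<PASSWORD>@cluster0.example.mongodb.net/test"
print(fill_password(template, "p@ss/word"))
```

The resulting string can then be passed to pymongo.MongoClient(...) in the same way it is pasted after the mongo command in step 8.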

Let me know in comments if you face any issue while doing Atlas setup.


MongoDB - Embedding v/s Referencing


MongoDB, a NoSQL document DB, doesn't support JOINs the way an RDBMS does, even though they are a very useful feature in the DB domain. So what does MongoDB offer to make up for the missing JOIN? Let's understand this.

First of all, MongoDB is NOT a replacement for a standard RDBMS. It is a misconception in the DB world that NoSQL systems will or can replace RDBMS, or vice versa. They won't. Both kinds of database system have their own pros and cons, which we will see later.



Embedding:
As the name itself reveals, embedding means putting all the related data together in one document. This gives better read performance when you want to get all the related data in one read call, because MongoDB stores a document in one place on the disk, so minimal seek time is required when reading it from the disk drive.

Let's suppose we want to create a data model for the requirement below -
==

So, the embedded document will look like -
==


Referencing:
Embedding causes slow performance when there are frequent CRUD operations on the embedded document. With embedding, data duplication is also highly probable. In these cases, we store a reference to the document rather than the document itself. This is similar to the parent-child relationship we have in RDBMS.

Let's see now what our Books collection will look like -
==
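
To make the difference tangible, here is a small sketch using plain Python dictionaries in place of MongoDB documents. The book and author data, the field names and the `resolve_author` helper are all hypothetical, invented only for this illustration; they are not the data model from the original example.

```python
# Embedding: the author details live inside the book document itself.
embedded_book = {
    "_id": 1,
    "title": "Learning MongoDB",
    "author": {"name": "A. Writer", "country": "IN"},
}

# Referencing: the book keeps only the author's _id; the author is stored once, elsewhere.
authors_by_id = {101: {"_id": 101, "name": "A. Writer", "country": "IN"}}
referenced_book = {"_id": 1, "title": "Learning MongoDB", "author_id": 101}


def resolve_author(book, authors):
    """Follow the reference and return a copy of the book with its author attached."""
    resolved = dict(book)                                # leave the stored document untouched
    resolved["author"] = authors[resolved.pop("author_id")]
    return resolved


print(embedded_book["author"]["name"])                                   # one read
print(resolve_author(referenced_book, authors_by_id)["author"]["name"])  # two lookups
```

The embedded form answers the question in one read, while the referenced form needs a second lookup but keeps the author's details in a single place when they change.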



Friday, 21 December 2018

Learn Mongo DB - Basics #1


While going through old pages, I saw these MongoDB posts and thought of continuing the series. We have already completed the installation of MongoDB; now we start off with a few basic commands which will help you play around with MongoDB :-)
==


Sunday, 14 January 2018

Mongo DB - Installation and Configuration


MongoDB is an open-source document database and the leading NoSQL database. It is written in C++.
  
MongoDB features:
    Document-Oriented Storage
    Full Index Support
    Replication & High Availability
    Auto-Sharding
    Querying
    Fast In-Place Updates
    Map/Reduce
    GridFS


Reduce cost, accelerate time to market, and mitigate risk with proactive support and enterprise-grade capabilities.


Today, we will see how to install and run MongoDB.

MongoDB Installation on Linux


1. DOWNLOAD the stable version of MongoDB. It will be a tar file.
2. Extract the tar file to some directory.
 
$ tar -xvf mongodb.tar -C /learn/mongodb


3. Change the ownership of the folder to the user who runs the DB. In my case: user hduser, group hadoop.
$ chown -R hduser:hadoop /learn/mongodb

4. Add the environment variables to .bashrc
export MONGO_HOME=/learn/mongodb
export PATH=$PATH:$MONGO_HOME/bin







5. Create the default DB directory for Mongo
$ mkdir -p /data/db
$ chown -R hduser:hadoop /data/db

This is the default location; you can specify your own DB path when starting MongoDB -






$ mongod --dbpath /app/mongodata
This command will start MongoDB. In another terminal you can start working on the DB. "--dbpath /app/mongodata" is totally optional.

If you just use $ mongod , it will start and use the default DB directory which we created in step 5.


Please don't close the current terminal, as that can kill the mongod process.







6. Start working on MongoDB
$ mongo










Like the below page to get update  
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Saturday, 22 August 2015

MongoDB & RDBMS Terminology





Term relationship between MongoDB (NoSQL) and RDBMS -






RDBMS Term                  MongoDB Term
Database                    Database
Table                       Collection
Tuple or Row                Document
Column                      Field or Key
Table Join                  Embedded Document
Primary Key                 Primary Key (default _id field)
oracle/db2/mysqld           mongod (DB process)
sqlplus/db2 client/mysql    mongo




Saturday, 15 August 2015

MongoDB - Installation and Configuration in Linux





MongoDB is an open-source document database and the leading NoSQL database. It is written in C++.
  
MongoDB features:


    Document-Oriented Storage
    Full Index Support
    Replication & High Availability
    Auto-Sharding
    Querying
    Fast In-Place Updates
    Map/Reduce
    GridFS


Reduce cost, accelerate time to market, and mitigate risk with proactive support and enterprise-grade capabilities.

Today, we will see how to install and run MongoDB.