Monday, 18 February 2019

MongoDB Index in Python - Simple Index

Like RDBMS systems, MongoDB provides indexes to improve its performance, so that it can process a query more quickly and return the result set. MongoDB supports different types of indexes, such as Single Key, Compound, MultiKey, Partial and Text indexes. We will look into these one by one.

We start with the Simple Index, or Single Key Index, which uses only one key from the collection/document [equivalent to Table/Row in RDBMS systems]. Let's see how -

Mongo Shell Command:  db.<collectionName>.createIndex({<field>: <direction>})
pyMongo Command:      db.<collectionName>.create_index([(<field>, <direction>)])

Let's analyze the impact of Index creation on Query Performance, first via mongo shell, second in python - 

In MongoShell:

In our example, we use the collection 'people', which has the field 'last_name'.


The above command generates the execution stats for a query where last_name == Tuker.

As the execution plan shows, MongoDB scanned the whole collection (50747 documents in total to fetch 65 records), which is costly when your collection is big.

Now, let's create a Simple Index (Single Key Index) -


Now, let's run the same query again -


This time MongoDB finds that there is an index available on the last_name field and uses it to fetch the result. It scanned only 65 index keys to fetch 65 records.
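The before/after comparison can also be sketched from Python. This is only a minimal sketch, not the commands from the original post: it assumes a PyMongo `Database` object is passed in as `db`, that the collection is named 'people' as above, and the helper names (`execution_summary`, `explain_last_name`, `compare_before_and_after`) are made up for illustration. It runs the server's `explain` command with `executionStats` verbosity and picks out the counters discussed above.

```python
from typing import Any, Dict, Tuple

ASCENDING = 1  # same value as pymongo.ASCENDING


def execution_summary(explain_output: Dict[str, Any]) -> Dict[str, int]:
    """Pick the counters discussed above out of an explain document."""
    stats = explain_output["executionStats"]
    return {
        "returned": stats["nReturned"],
        "keys_examined": stats["totalKeysExamined"],
        "docs_examined": stats["totalDocsExamined"],
    }


def explain_last_name(db, last_name: str) -> Dict[str, int]:
    """Run the explain command for a find on people.last_name."""
    output = db.command(
        "explain",
        {"find": "people", "filter": {"last_name": last_name}},
        verbosity="executionStats",
    )
    return execution_summary(output)


def compare_before_and_after(db, last_name: str) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Explain the query, create the single key index, then explain again."""
    before = explain_last_name(db, last_name)
    db["people"].create_index([("last_name", ASCENDING)])
    after = explain_last_name(db, last_name)
    return before, after


# Usage (needs a reachable server; the URI and database name are placeholders):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017")["test"]
#   print(compare_before_and_after(db, "Tuker"))
```

With an index in place, `keys_examined` and `docs_examined` should drop to roughly the number of returned documents instead of the size of the collection.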

A Single Key Index can be used in the scenarios below -
  - Querying on a range of indexed key values
  - Querying on selected values of the indexed key
  - Sorting on the indexed key: MongoDB can return the results in index order, so no separate in-memory sort is needed
  - Sorting in either direction: a single key index can be traversed in ascending or descending order
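The scenarios above could look like this in PyMongo. This is a rough sketch under assumptions: the function names are hypothetical, the indexed field is 'last_name' as in the example, and `collection` is expected to be a PyMongo collection that already has the single key index.

```python
from typing import Any, Dict, Iterable

INDEXED_FIELD = "last_name"  # the field carrying the single key index


def range_filter(low: str, high: str) -> Dict[str, Any]:
    """Filter for a range of indexed key values (low inclusive, high exclusive)."""
    return {INDEXED_FIELD: {"$gte": low, "$lt": high}}


def selected_values_filter(values: Iterable[str]) -> Dict[str, Any]:
    """Filter for a fixed set of indexed key values."""
    return {INDEXED_FIELD: {"$in": list(values)}}


def find_sorted(collection, query: Dict[str, Any], descending: bool = False):
    """Query and sort on the indexed key; the same index serves both directions."""
    direction = -1 if descending else 1  # pymongo.DESCENDING / pymongo.ASCENDING
    return collection.find(query).sort(INDEXED_FIELD, direction)


# Usage (collection is a pymongo Collection, e.g. db["people"]):
#   for doc in find_sorted(collection, range_filter("S", "U"), descending=True):
#       print(doc["last_name"])
```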

Considerations while designing a Single Key Index:
  - Do not create a Single Key Index on every field in the collection. Each index has to be updated on every write and takes up memory, so too many indexes slow down write queries and can hurt select queries as well.

Like the below page to get the update  
Facebook Page      Facebook Group      Twitter Feed      Google+ Feed      Telegram Group     

Friday, 1 February 2019

Let's Learn - Git - Pull Specific Folder - sparsecheckout

What if your Git repository has lots of folders, but you only have to work on a specific file in a particular folder? The Git feature for this is called sparse checkout. Older versions of Git do not support this feature, which forces you to check out the whole repository. Sometimes the repository is too big, and getting all of it is a time-consuming process.

Current Git versions support sparse checkout, which lets you check out only a particular folder from a very big repository into your working directory. Let's see how we can achieve it.

Task - Need to sync a folder named 'other' from 'DataGenX' repository 

Step #1: Initialize the Repository
Create a folder where you want to sync your Git repo folder and initialize Git in it.

Step #2: Add the Remote Repository
Add the remote Git repository to this local Git repo as below -

Step #3: Create and Checkout a branch [Optional Step]
Creating a branch is a totally optional step, but it is advisable to create one.

Step #4: Enable the Sparse Checkout Properties
Now, we have to enable the sparse checkout property and add the name of the folder we want to sync (in our case - 'other') to the sparse checkout property file.

Step #5: Pull the Specific Folder
This is the last step, where we pull the specific folder as below -

git pull <remote> <pull_branch_name>  # the remote branch to pull from, not the locally created one

While running this command, give the name of the remote branch you want to pull the data from. In our case, it is master.
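The five steps above can also be scripted. The sketch below is one possible way to do it from Python, not the exact commands of the original post: it assumes the classic `core.sparseCheckout` setting with the `.git/info/sparse-checkout` file, a remote named `origin`, and hypothetical helper names; the repository URL is a placeholder you have to supply.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def setup_commands(remote_url: str, local_branch: Optional[str] = None) -> List[List[str]]:
    """Steps 1-4 as git command lines (the folder list is written separately)."""
    commands = [
        ["git", "init"],                                 # Step 1
        ["git", "remote", "add", "origin", remote_url],  # Step 2
    ]
    if local_branch:                                     # Step 3 (optional)
        commands.append(["git", "checkout", "-b", local_branch])
    commands.append(["git", "config", "core.sparseCheckout", "true"])  # Step 4
    return commands


def pull_command(branch: str = "master") -> List[str]:
    """Step 5: pull from the remote branch, not the locally created one."""
    return ["git", "pull", "origin", branch]


def sync_folder(workdir: str, remote_url: str, folder: str,
                branch: str = "master", local_branch: Optional[str] = None) -> None:
    """Run all steps in workdir so that only `folder` is checked out."""
    root = Path(workdir)
    root.mkdir(parents=True, exist_ok=True)
    for command in setup_commands(remote_url, local_branch):
        subprocess.run(command, cwd=root, check=True)
    # Step 4, second half: list the folder to sync in the sparse checkout file.
    sparse_file = root / ".git" / "info" / "sparse-checkout"
    sparse_file.parent.mkdir(parents=True, exist_ok=True)
    sparse_file.write_text(folder.rstrip("/") + "/\n")
    subprocess.run(pull_command(branch), cwd=root, check=True)


# Usage (the URL is a placeholder for the 'DataGenX' repository):
#   sync_folder("datagenx_other", "<repository-url>", "other", branch="master")
```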

Step #6: List and work with synced directory

Commands as below - 


Friday, 4 January 2019

How to Query on MongoDB by _id - INTERMEDIATE II

What if we have to query a MongoDB collection based on the "_id" field? Can we really query on the "_id" field? If so, what is the syntax? Let's try this out -

Let's first fetch a document's id -

$ db.user.find_one()

{'_id': ObjectId('5c16e863817810ed3fc5e5f9'),
 'Fname': 'atul',
 'Lname': 'Singh',
 'Grade': 12.0,
 'College': 'SGM',
 'Job': 'Student',
 'Address': 'Young St.'}

Now, we will pick the object id and use it to fetch the same document from the collection

$ db.user.find_one({'_id':'5c16e863817810ed3fc5e5f9'}) # returns nothing: the _id is stored as an ObjectId, not a string

$ db.user.find_one({'_id':ObjectId('5c16e863817810ed3fc5e5f9')})
NameError                                 Traceback (most recent call last)
----> 1 db.user.find_one({'_id':ObjectId('5c16e863817810ed3fc5e5f9')})

NameError: name 'ObjectId' is not defined

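The first query returns nothing because an ObjectId is not the same as its string representation; the string must be converted to an ObjectId before it is passed to the find command. The second query raised a NameError only because the ObjectId class had not been imported yet, so we import it first -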

$ from bson.objectid import ObjectId
$ db.user.find_one({'_id':ObjectId('5c16e863817810ed3fc5e5f9')})

{'_id': ObjectId('5c16e863817810ed3fc5e5f9'),
 'Fname': 'atul',
 'Lname': 'Singh',
 'Grade': 12.0,
 'College': 'SGM',
 'Job': 'Student',
 'Address': 'Young St.'}

IPYTHON Notebook can be found HERE


Thursday, 3 January 2019

MongoDB with Python - Basics IV - Update & Delete Operation

Welcome to one more quick session on MongoDB CRUD basics, this time on the Update and Delete operations. MongoDB provides the methods below for these operations -

** Update
      * update_one
      * update_many
      * replace_one

** Delete
      * delete_one
      * delete_many

For more on MongoDB -> Link

As is quite clear from the names themselves (_one and _many), these methods perform the operation on a single record or on multiple records that match the passed condition.
The update statement supports many operators; a few of them are listed below -

$set - add a new field or update a field's value
$unset - remove a field
$inc - increment the current value
$push - push an element into an array field
$push with $each - push multiple elements into an array field
$pop - remove the first (-1) or last (1) element from an array field
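As a rough illustration of the operators listed above, here is a small PyMongo-style sketch. The field names 'Job', 'Address' and 'Grade' come from the sample 'user' document shown in an earlier post; the 'Skills' array field, the sample values and the helper names are assumptions made up for this example.

```python
from typing import Any, Dict, Tuple

# One example update document per operator from the list above.
UPDATE_EXAMPLES: Dict[str, Dict[str, Any]] = {
    "$set": {"$set": {"Job": "Engineer"}},            # add or update a field
    "$unset": {"$unset": {"Address": ""}},            # remove a field
    "$inc": {"$inc": {"Grade": 1}},                   # increment a value
    "$push": {"$push": {"Skills": "python"}},         # append one element
    "$push with $each": {"$push": {"Skills": {"$each": ["git", "mongodb"]}}},
    "$pop": {"$pop": {"Skills": 1}},                  # 1 = last, -1 = first
}


def apply_update(collection, query: Dict[str, Any], update: Dict[str, Any],
                 many: bool = False) -> Tuple[int, int]:
    """Run update_one or update_many; return (matched, modified) counts."""
    method = collection.update_many if many else collection.update_one
    result = method(query, update)
    return result.matched_count, result.modified_count


def delete_matching(collection, query: Dict[str, Any], many: bool = False) -> int:
    """Run delete_one or delete_many; return the number of deleted documents."""
    method = collection.delete_many if many else collection.delete_one
    return method(query).deleted_count


# Usage (collection is a pymongo Collection, e.g. db["user"]):
#   apply_update(collection, {"Fname": "atul"}, UPDATE_EXAMPLES["$inc"])
#   delete_matching(collection, {"Job": "Student"}, many=True)
```

The `_one` variants stop at the first matching document, while the `_many` variants touch every match, which is why the helpers return the counts reported by the server.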

MongoDB supports many more update operators; the full list can be found HERE

CRUD Operation - (Update & Delete) : Link


Wednesday, 2 January 2019

How to iterate MongoDB Cursor in Python - Intermediate I

Whenever you query MongoDB, store the output in a variable, called a cursor, before performing any operation on the data. This keeps the result in the variable instead of flooding your output area. A PyMongo cursor also supports a few functions that give you information without actually looking at the data, such as the count of retrieved documents or the distinct values of a particular key. (Note that `Cursor.count()` was removed in PyMongo 4; `Collection.count_documents()` is the replacement.)

In this session, we will learn about MongoDB cursor variables. For this exercise too, we are going to use the 'USER' database hosted on a free-tier MongoDB Atlas (M0) server.
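Since the notebook itself is only linked, here is a minimal sketch of what working with a cursor can look like. It assumes `collection` is a PyMongo collection such as the 'user' collection from the earlier posts; the helper names are hypothetical, and the count uses `count_documents` rather than the cursor's own count method.

```python
from typing import Any, Dict, List, Optional, Tuple


def collect_field(collection, field: str,
                  query: Optional[Dict[str, Any]] = None) -> List[Any]:
    """Store the query result in a cursor, then iterate it once."""
    cursor = collection.find(query or {})  # no documents are fetched yet
    values = []
    for document in cursor:                # documents arrive as we iterate
        values.append(document.get(field))
    return values                          # the cursor is now exhausted


def count_and_distinct(collection, key: str,
                       query: Optional[Dict[str, Any]] = None) -> Tuple[int, List[Any]]:
    """Get the match count and distinct values without reading the documents."""
    query = query or {}
    total = collection.count_documents(query)
    distinct_values = collection.distinct(key, query)
    return total, distinct_values


# Usage (collection is a pymongo Collection, e.g. db["user"]):
#   print(collect_field(collection, "Fname", {"Job": "Student"}))
#   print(count_and_distinct(collection, "College"))
```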

Jupyter Notebook can be accessed HERE also


The next post in this series and more on MongoDB can be found here -> LINK
