Showing posts sorted by relevance for query pandas. Sort by date Show all posts

Monday, 14 March 2016

Machine Learning Links you must Visit



1. Scikit-Learn Tutorial Series - http://buff.ly/1XnDWv6
2. 7 Free Machine Learning Courses - http://buff.ly/1XoBapa 
3. k-nearest neighbor algorithm using Python - http://buff.ly/1SLyUZX
4. 7 Steps to Mastering Machine Learning With Python - http://buff.ly/1SLyZwR


Analytics in Python 

1. Learning Pandas #1 - Series - http://bit.ly/2hAlZ0u
2. Learning Pandas #2 - DataFrame  -  http://bit.ly/2ii6Mlu
3. Learning Pandas #3 - Working on Summary & Missing Data - http://bit.ly/2iUROTB
4. Learning Pandas #4 - Hierarchical Indexing - http://bit.ly/2i8AMx9






Like the pages below to get updates
https://www.facebook.com/datastage4you
https://twitter.com/datagenx
https://plus.google.com/+AtulSingh0/posts
https://datagenx.slack.com/messages/datascience/

Saturday, 25 March 2017

Check if Python Pandas DataFrame Column is having NaN or NULL


Before implementing any algorithm on the given data, it is best practice to explore it first so that you get an idea of what the data looks like. Today, we will learn how to check for missing/NaN/NULL values in data.

1. Reading the data
Read the CSV data and store it in a pandas DataFrame.
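
As a rough sketch of this step: the original file is not shown here, so the snippet below reads a small made-up CSV from an in-memory buffer. The column names (Loan_ID, Gender, LoanAmount) and all values are assumptions for illustration only; with a real file you would pass its path to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the post's CSV file (made-up rows).
# With a real file, pass its path to pd.read_csv instead of a buffer.
csv_text = """Loan_ID,Gender,LoanAmount
LP001,Male,
LP002,Female,128
LP003,Male,66
"""

# Empty fields are parsed as NaN by default.
data = pd.read_csv(io.StringIO(csv_text))
print(data.shape)
print(data.dtypes)
```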


2. Exploring data
Check how the data looks by using the head() method, which fetches the top rows of the DataFrame (the first five by default).
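
A minimal sketch of the head() call, again on made-up data (the columns and values are assumptions, not the post's real dataset):

```python
import io

import pandas as pd

# Hypothetical sample rows; not the original dataset.
csv_text = """Loan_ID,Gender,LoanAmount
LP001,Male,
LP002,Female,128
LP003,Male,66
LP004,,120
LP005,Male,141
LP006,Female,
"""
data = pd.read_csv(io.StringIO(csv_text))

top = data.head()    # first 5 rows by default
top2 = data.head(2)  # or ask for a specific number of rows
print(top)
```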


3. Checking NULLs
Pandas provides two methods to check for NULLs - isnull() and notnull()
isnull() returns True where a value is NULL/NaN and False otherwise; notnull() returns the opposite. So let's check what they return for our data

isnull() test

notnull() test

Check the 0th row, LoanAmount column - in the isnull() test it is True and in the notnull() test it is False. This means that this cell is holding a null.
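
The two tests can be sketched as follows. The small DataFrame is a made-up stand-in for the loan data, built so that the 0th row of LoanAmount is missing as described above.

```python
import pandas as pd

# Hypothetical data: LoanAmount is missing in row 0, Gender in row 2.
data = pd.DataFrame({
    "Gender": ["Male", "Female", None],
    "LoanAmount": [float("nan"), 128.0, 66.0],
})

null_mask = data.isnull()      # True where a value is missing
notnull_mask = data.notnull()  # the exact opposite

print(null_mask)
print(notnull_mask)
```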

But we would not prefer this approach for a large dataset, as it returns a True/False matrix covering every data point. Instead, we are usually interested in the counts, or in a simple check of whether the dataset is holding any NULLs at all.

Use any()
Pandas also provides an any() method, which returns True if at least one data point satisfies the checked condition.
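
One way this might look, using a made-up DataFrame (column names and values are assumptions). Calling any() once gives a per-column answer; calling it a second time collapses that into a single answer for the whole DataFrame.

```python
import pandas as pd

# Hypothetical data: only LoanAmount contains a missing value.
data = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003"],
    "LoanAmount": [float("nan"), 128.0, 66.0],
})

cols_with_null = data.isnull().any()      # one True/False per column
has_any_null = bool(cols_with_null.any())  # single answer for the frame

print(cols_with_null)
print(has_any_null)
```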


Use all()
Returns True if all the data points satisfy the condition.
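
A small sketch of all(), here paired with notnull() to ask which columns are completely filled. The data is made up for illustration.

```python
import pandas as pd

# Hypothetical data: Loan_ID is complete, LoanAmount has one gap.
data = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003"],
    "LoanAmount": [float("nan"), 128.0, 66.0],
})

complete_cols = data.notnull().all()        # True only for gap-free columns
fully_complete = bool(complete_cols.all())  # True only if no gaps anywhere

print(complete_cols)
print(fully_complete)
```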


Now that we know there are some null/NaN values in our data frame, let's check those out -

data.isnull().sum() - this returns the count of NULL/NaN values in each column.


If you want the total number of NaN values, take the sum once again -

data.isnull().sum().sum()
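
Both counts can be tried on a small made-up DataFrame (the values below are assumptions, chosen so the totals are easy to verify by eye). This works because True counts as 1 and False as 0 when summed.

```python
import pandas as pd

# Hypothetical data: 0 gaps in Loan_ID, 1 in Gender, 2 in LoanAmount.
data = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003", "LP004"],
    "Gender": ["Male", None, "Male", "Female"],
    "LoanAmount": [float("nan"), 128.0, float("nan"), 120.0],
})

per_column = data.isnull().sum()  # missing-value count per column
total = int(per_column.sum())     # missing-value count for the whole frame

print(per_column)
print(total)
```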


If you want the NaN checks for one particular column only -
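
For a single column, the same methods can be applied to the selected Series. A minimal sketch, assuming a LoanAmount column in a made-up DataFrame:

```python
import pandas as pd

# Hypothetical data: LoanAmount is missing in rows 0 and 2.
data = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003", "LP004"],
    "LoanAmount": [float("nan"), 128.0, float("nan"), 120.0],
})

col = data["LoanAmount"]
has_null = bool(col.isnull().any())  # does this column hold any NaN?
null_count = int(col.isnull().sum())  # how many NaN values it holds
null_rows = data[col.isnull()]        # the rows where it is NaN

print(has_null, null_count)
print(null_rows)
```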




Here, I have attached the complete Jupyter Notebook for you -



If you want to download the data, you can get it from HERE.





Friday, 6 January 2017

10 minutes with pandas library

Thursday, 5 January 2017

Learning Pandas #5 - read & write data from file

Wednesday, 4 January 2017

Learning Pandas #4 - Hierarchical Indexing

Sunday, 1 January 2017

Learning Pandas #3 - Working on Summary & Missing Data

Saturday, 31 December 2016

Learning Pandas - DataFrame #2

Friday, 30 December 2016

Learning Pandas - Series #1

Wednesday, 30 November 2016

Learning Graphlab - SFrame #1


I hope you went through the last post (Link -> Getting Started with Graphlab). In this post we will get some hands-on practice with the SFrame data type of Graphlab, which is similar to the DataFrame of the pandas Python library.

i. Reading the CSV file
==
rdCSV

ii. save DataSet 
==

iii. load DataSet
==


iv. Check Total Rows and Columns
==
rowNum

v. Check Columns data type and Name
==
colTypes

vi. Add new column
==
addCol

vii. Delete column
==

viii. Rename column
==
renameCol

ix. Column Swapping (location)
==





