Sunday, 6 March 2016

Python SyntaxError - Non-ASCII character '\xe2' in file


If you get the error below while running your Python code:

SyntaxError: Non-ASCII character '\xe2' in file .\set_learn.py on line 32, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

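then the file contains a non-ASCII character. The '\xe2' byte is usually the first byte of a UTF-8 curly quote or dash (for example, ’ is stored as E2 80 99) pasted in from a web page or a word processor. If you are using Notepad++, here is how to resolve it: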

1. By converting the Text Encoding

Go to Menu -> Encoding -> Convert to UTF-8

and save the file.
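
The error message itself points to PEP 263: Python 2 assumes ASCII source unless the file declares an encoding. As a minimal sketch, assuming the file is saved as UTF-8 and you want to keep the character instead of deleting it, a declaration comment on the first or second line looks like this (the variable name is only for illustration):

```python
# -*- coding: utf-8 -*-
# PEP 263 declaration: it must appear on line 1 or 2 of the source file.

# With the declaration in place, a UTF-8 literal such as the curly
# apostrophe (U+2019, stored as bytes E2 80 99) is accepted by Python 2.
greeting = u"It’s working"
print(greeting.encode("utf-8"))
```

Python 3 already treats source files as UTF-8 by default, so the declaration mainly matters for Python 2.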


2. By searching for the non-ASCII characters and replacing them with nothing

Use Ctrl-H to open the Replace dialog
Select Search Mode as Regular expression
Find what: [^\x00-\x7F]+ to match every run of non-ASCII characters
Leave "Replace with" empty
Click Replace All to delete them
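
To see what the [^\x00-\x7F]+ pattern actually removes, here is a small sketch using Python's re module. The sample string is made up for illustration; it contains an en dash and curly quotes written as escapes so the snippet itself stays ASCII.

```python
import re

# Same character class as in the Notepad++ search: anything outside 0x00-0x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")

# Hypothetical line of code with characters pasted from a web page.
sample = u"total = a \u2013 b  # \u201cdash\u201d pasted from a web page"

cleaned = NON_ASCII.sub(u"", sample)
print(cleaned)
```

Note that deleting is not always the right fix: here the en dash was probably meant to be a minus sign, so the cleaned line would still need a manual edit.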


3. In Linux

a. Find the lines that contain bad characters:
grep -nP '[^\x00-\x7F]' INPUT_FILE
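
If grep -P is not available (for example on Windows), the same check can be sketched in Python. This is only an illustration; the function name find_non_ascii_lines is hypothetical and it reads the file as raw bytes, so it does not depend on the file's encoding.

```python
def find_non_ascii_lines(path):
    """Return (line_number, raw_line) pairs for lines containing bytes >= 0x80."""
    hits = []
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            # bytearray yields integers on both Python 2 and Python 3.
            if any(byte >= 0x80 for byte in bytearray(raw)):
                hits.append((number, raw.rstrip(b"\r\n")))
    return hits
```

Calling find_non_ascii_lines("set_learn.py") would return the offending line numbers together with the raw bytes of each line.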


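b. Some ways to remove them (each command writes the result to clean-file and leaves INPUT_FILE untouched)
LC_ALL=C sed 's/[^[:print:][:space:]]//g' INPUT_FILE > clean-file
LC_ALL=C sed 's/[\x80-\xff]//g' INPUT_FILE > clean-file
tr -cd '\11\12\15\40-\176' < INPUT_FILE > clean-file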

** Word of caution: because these commands delete a whole range of characters, they may remove some that you actually need, so take a backup of your file first.
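
The backup-then-strip step can also be sketched in Python. This is a rough equivalent of the commands above, not a drop-in replacement; the function name strip_non_ascii and the ".bak" suffix are assumptions made for the example.

```python
import shutil


def strip_non_ascii(path, backup_suffix=".bak"):
    """Back up the file, then rewrite it without bytes >= 0x80.

    Returns the number of bytes removed.
    """
    # Keep a copy first, as the caution above recommends.
    shutil.copy2(path, path + backup_suffix)

    with open(path, "rb") as handle:
        data = handle.read()

    # Keep only 7-bit ASCII bytes; tabs and newlines are preserved.
    cleaned = bytes(bytearray(b for b in bytearray(data) if b < 0x80))

    with open(path, "wb") as handle:
        handle.write(cleaned)
    return len(data) - len(cleaned)
```

Unlike the shell commands, this modifies the file in place, which is why it makes the backup before writing anything.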


