First, make sure you have a unicode representation of the string (Python 2 syntax):
s = u'café'
Then run the following:
import unicodedata
unicode(unicodedata.normalize('NFD', s).encode('ascii', 'ignore'), 'utf-8')
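What this does: NFD normalization decomposes each accented character into its base character plus separate combining marks, encoding to ASCII with 'ignore' drops the non-ASCII combining marks, and the result is converted back to unicode. For the example above, the result is u'cafe'.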
Voilà.
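
For anyone on Python 3, where every str is already unicode and the unicode() builtin is gone, the same decompose-then-drop idea might look roughly like the sketch below. The function name strip_accents is just a label picked for this example, not something from the standard library.

```python
import unicodedata


def strip_accents(text):
    """Return text with accents removed (strip_accents is a made-up helper name)."""
    # NFD splits each accented character into a base character
    # followed by one or more combining marks.
    decomposed = unicodedata.normalize('NFD', text)
    # Encoding with 'ignore' silently drops everything that is not ASCII,
    # which includes the combining marks; decode turns the bytes back into str.
    return decomposed.encode('ascii', 'ignore').decode('ascii')


print(strip_accents('café'))
```

One thing to keep in mind: characters that have no ASCII decomposition at all (for example 'ø' or 'ß') are dropped entirely rather than replaced, so this only suits text where losing them is acceptable.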