Searching on encoded string in Postgres with Python
I have a query like this, which I am executing from Python against a Postgres
database:
"Select * from my_tbl where big_string like '%Almodóvar%'"
However, in the column I am searching, Almodóvar is stored as
'Almod\u00f3var', so the query returns nothing.
What can I do to make the two strings match up? I'd prefer to handle this
with Almodóvar on the Python side rather than transform the column in the
database, but I am flexible.
Additional info prompted by comments:
The database uses UTF-8. The field I am querying on is acquired from an
external API (The Movie Database: tmdb.org). The data was retrieved
RESTfully as json and then inserted into a text field of the database
after a json.dump.
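For what it's worth, the mismatch described here can be reproduced without the database at all: json.dumps escapes non-ASCII characters by default (ensure_ascii=True), so the stored text contains the \uXXXX escape rather than the literal character. A minimal sketch (the key "name" is just illustrative, not from the original data):

```python
import json

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
# so what lands in the text column is the \uXXXX form, not the literal ó.
record = {"name": "Almodóvar"}   # "name" is an illustrative key
stored = json.dumps(record)
print(stored)                    # {"name": "Almod\u00f3var"}
print("Almodóvar" in stored)     # False
```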
Because the data includes a lot of foreign names and characters, working
with it has been a series of encoding-related headaches. If there is a
silver bullet for making this data play nice with Python, I would be very
grateful to know what that is.
UPDATE 2:
It looks like the JSON encoding is what created my quandary:
print json.dumps("Almodóvar")
yields
"Almod\u00f3var"
(note the surrounding double quotes that JSON adds around strings)
so I will run json.dumps on my strings before searching for them in the
JSON-encoded text field. It seems inelegant; if someone has a better idea,
please let me know.
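That workaround can be sketched as follows (the helper names are mine, not from the original): encode the search term the same way json.dumps stores it, strip the surrounding quotes JSON adds, and build the LIKE pattern from that. One caveat worth noting: Postgres LIKE treats a backslash as its escape character by default, so the literal backslash in the escaped term may itself need doubling in the pattern.

```python
import json

def json_escape(term):
    """Encode a term the way json.dumps stores it, dropping the
    surrounding double quotes JSON puts around strings."""
    return json.dumps(term)[1:-1]

def like_pattern(term):
    """Build a LIKE pattern; double the backslashes because LIKE
    treats a single backslash as its escape character."""
    return "%" + json_escape(term).replace("\\", "\\\\") + "%"

print(json_escape("Almodóvar"))   # Almod\u00f3var
print(like_pattern("Almodóvar"))  # %Almod\\u00f3var%
```

The pattern would then be passed as a query parameter, e.g. with psycopg2: cur.execute("SELECT * FROM my_tbl WHERE big_string LIKE %s", (like_pattern(term),)). As an alternative that avoids the issue entirely, dumping with json.dumps(data, ensure_ascii=False) at insert time stores the literal characters (fine in a UTF-8 database), so a plain LIKE '%Almodóvar%' would match.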