Wednesday, 11 September 2013

Suggestions for a multi-faceted search software stack

I need to create a search facility as part of a new project for a client.
The records will be things that happen on one or more specific dates. It
would be great to get SO's advice on what tools would be best used for the
following requirements:
- Needs to run multi-faceted searches over tens of thousands of records
  (on fields such as category, date, price etc)
- Needs to search on multi-value fields (e.g. tags)
- Needs to be able to order by static factors (such as price, distance etc)
- Needs to be able to order by dynamic / frequently changing factors (such
  as user engagement / traffic etc)
- Needs to be able to return only records for which there has been activity
  in the user's own social network (i.e. 'only show me results my friends
  have engaged with')
- Will be deployed in EC2
My current thoughts are:
- Use a hybrid of something like Amazon CloudSearch and Redis.
- Tens of thousands of records is not actually that many. Perhaps do the
  bulk of the work in an RDBMS, with CloudSearch for full-text searching?
- Use Redis to maintain a set of recently interacted-with records for each
  user, then union those sets to get the records in a user's network.
My main concern is the latency of pulling back perhaps many thousands of
IDs from various services (Redis/CloudSearch) and then having to union
them in the client code. However, perhaps this is unfounded.
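
To make the concern concrete, here is a rough sketch of the client-side step I have in mind, using plain Python sets in place of the real services. Everything in it is an assumption for illustration: the names `network_record_ids`, `filter_and_rank`, `recent_by_user` and `engagement` are hypothetical, record IDs are assumed to be integers, and the search hits are assumed to arrive as a list of IDs. In Redis the union itself could probably be done server-side with SUNION, so that only the combined set crosses the network.

```python
from typing import Dict, Iterable, List, Set


def network_record_ids(friend_ids: Iterable[str],
                       recent_by_user: Dict[str, Set[int]]) -> Set[int]:
    """Union the recently engaged record IDs of every friend.

    recent_by_user is a hypothetical stand-in for the per-user Redis sets.
    """
    ids: Set[int] = set()
    for friend in friend_ids:
        ids |= recent_by_user.get(friend, set())
    return ids


def filter_and_rank(search_hits: List[int],
                    network_ids: Set[int],
                    engagement: Dict[int, float]) -> List[int]:
    """Keep only hits present in the network set, highest engagement first."""
    kept = [rid for rid in search_hits if rid in network_ids]
    return sorted(kept, key=lambda rid: engagement.get(rid, 0.0), reverse=True)


# Example data (made up for illustration).
recent_by_user = {"alice": {1, 2, 3}, "bob": {3, 4}}
network = network_record_ids(["alice", "bob"], recent_by_user)
ranked = filter_and_rank([4, 2, 9, 3], network, {2: 5.0, 3: 9.5, 4: 1.0})
print(ranked)
```

Set membership tests are constant time on average, so the filtering step is linear in the number of search hits. If that holds up, the network round trips are more likely to dominate than the union and intersection themselves.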
I'm hoping that there is perhaps a technology stack out there which I have
missed that can solve a lot of this for me. I don't want to go reinventing
the wheel.
Any suggestions welcome!
