Skip to content

Instantly share code, notes, and snippets.

@nickhoffman
Created January 4, 2012 21:23
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save nickhoffman/1562222 to your computer and use it in GitHub Desktop.
Save nickhoffman/1562222 to your computer and use it in GitHub Desktop.
How to find filterable values when each result can have different filterable fields

Say you have an online store, and each product belongs to one catalog. In each catalog, you specify what custom attributes its products have. Eg:

  • Catalog: Transformers
    • Product Attributes: Character, Allegiance
  • Catalog: Games
    • Product Attributes: Number of Players, Average Game Time

When viewing a catalog, you want to enable users to filter by any custom product attribute. This is done by choosing which attribute to filter by from a dropdown, and then choosing the desired value for that attribute from a dropdown. Eg: filter products where “Character” is “Megatron”.

Here’s the question: How do you determine what values to put in the second dropdown?

Obviously, you can just iterate over each of the products in the catalog, gathering the values for each custom attribute. However, that requires you to load every product from the database. Yick.

The only alternative that I can think of is to store (in each catalog) the values for its products’ custom attributes.

@batasrki
Copy link

batasrki commented Jan 4, 2012

A comment and a (possible) answer.

Comment:
Do you have or have you considered having standard attributes to filter on per catalog? Stuff that you could kind of hard code, I guess.

Possible answer:
What about building indexes per catalog on the custom attributes? You could then query the indexes rather than products themselves. Should be faster, no?

@nickhoffman
Copy link
Author

I'd love to have a standard set of attributes, but it's just not possible. This is because so many products are so different. For example, the "Character" and "Allegiance" attributes are needed for Transformers products, but don't apply to Barbie or Board Game products.

Your suggestion of storing the custom attributes and their possible values on a per-catalog basis in an index is a more efficient approach to the alternative that I wrote in the gist. It still requires data denormalization, which I think is unavoidable here.

@batasrki
Copy link

batasrki commented Jan 4, 2012

You're using Mongo, denormalization is expected, :). Indexes might be faster, just take care of the "exceeds RAM" issue.

@nickhoffman
Copy link
Author

Hahah yeah, I know it's expected. It's just a pain in the arse, because the index will need to be updated each time the value of a product's custom attribute changes.

@batasrki
Copy link

batasrki commented Jan 4, 2012

Yeah, what you'll need to do is a background job or a message queue that'll do that out of the request cycle. Hopefully, you won't have too many updates to deal with.

@nickhoffman
Copy link
Author

I already use Resque, so performing the update in the background is easy. It's just yet another moving part that will require maintenance. That's what bugs me =P

@batasrki
Copy link

batasrki commented Jan 4, 2012 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment