1. Re: Searching and projecting content
rhauch Mar 5, 2010 11:58 AM (in response to meetoblivion)
What is the primary organizational driver for the content? My guess is that the primary driver is not categories, but something a bit more natural to the content. As such, I'd think it makes more sense to have the content own the categories (or at least the association of which categories the content is in).
Have you thought about storing/caching on each category the number of associated content items, and periodically refreshing those values (via searches)? If you don't need perfectly accurate, up-to-date numbers, this could save a lot of repeated searches and would be very fast. Of course, any need to view the related content under a category (or categories) could be handled via a query. (I'd suggest specifying a limit and offset if you can.)
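To make the caching idea concrete, here is a minimal sketch in plain Java of a count cache that only re-runs the search when the stored value is older than a chosen age. Everything in it is an assumption for illustration: the class name CategoryCountCache, the "counter" callback (a stand-in for whatever search or query your application actually runs), and the injectable clock (there only so staleness is easy to test). It keeps the counts in memory rather than on the category nodes, which is just one of several ways to do it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.ToLongFunction;

/** Hypothetical helper: caches per-category content counts and refreshes stale ones. */
class CategoryCountCache {

    /** A cached count plus the time (in milliseconds) at which it was computed. */
    private static final class Entry {
        final long count;
        final long loadedAtMillis;

        Entry(long count, long loadedAtMillis) {
            this.count = count;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final ToLongFunction<String> counter; // assumption: runs the real (expensive) search
    private final long maxAgeMillis;              // how stale a count may get before a refresh
    private final LongSupplier clock;             // e.g. System::currentTimeMillis

    CategoryCountCache(ToLongFunction<String> counter, long maxAgeMillis, LongSupplier clock) {
        this.counter = counter;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Returns the cached count, re-running the search only if the entry is missing or stale. */
    synchronized long countFor(String category) {
        long now = clock.getAsLong();
        Entry entry = entries.get(category);
        if (entry == null || now - entry.loadedAtMillis >= maxAgeMillis) {
            entry = new Entry(counter.applyAsLong(category), now);
            entries.put(category, entry);
        }
        return entry.count;
    }
}
```

In an application you would pass System::currentTimeMillis as the clock and a callback that performs the real search. The trade-off is the one described above: counts may lag behind the repository by up to the maximum age.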
Another consideration is how the content will actually reference the category: a direct or indirect (weak) reference. The benefit of a direct reference is that the repository will maintain the association and dereferencing [1], but the repository will then enforce referential integrity (meaning you can't delete a category if it's being referenced by content). That may or may not be a good thing in your case. Plus, some JCR advocates recommend against using references.
The benefit of indirect references is that you are more in control. For example, using the category name may make altering existing categories more difficult (which may be acceptable if it is an infrequent activity), while querying and searching would be very fast and the queries more intuitive (e.g., "... WHERE [my:category] = 'Category1'..." or "... WHERE [my:category] IN ('Category1','Category2','Category3') ..."). Alternatively, you could choose to use a numeric identifier, which would make renaming categories easier but would change the queries to use the identifiers in the criteria. As long as your application caches the identifiers for the categories, it wouldn't need to look them up first.
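As a rough illustration of the name-based criteria, the sketch below builds only the criteria fragment for one or more category names, leaving the rest of the query to your application. The class and method names (CategoryCriteria, forCategories, quote) are made up for this example, [my:category] is the property name used in the queries above, and doubling embedded single quotes is my assumption about how string literals should be escaped in your query language.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: builds the [my:category] criteria for one or more category names. */
class CategoryCriteria {

    /** Wraps a value in single quotes, doubling any embedded single quote (assumed escaping rule). */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Returns an equality test for one name, or an IN list for several. */
    static String forCategories(List<String> names) {
        if (names.isEmpty()) {
            throw new IllegalArgumentException("at least one category name is required");
        }
        if (names.size() == 1) {
            return "[my:category] = " + quote(names.get(0));
        }
        return names.stream()
                .map(CategoryCriteria::quote)
                .collect(Collectors.joining(",", "[my:category] IN (", ")"));
    }
}
```

The returned fragment would go after the WHERE in whatever query you already issue. With the JCR 2.0 query API, the limit and offset can then be applied to the query object through Query.setLimit and Query.setOffset rather than in the query text.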
Does this help?
[1] If the content uses REFERENCE properties to store the association to the category nodes, retrieving the referencing content for a category may be as simple as calling "getReferences()" on the category node. That method actually returns, as an iterator, the REFERENCE properties that are owned by the content nodes. And the iterator's size method should return an accurate count, as long as the user has access to all of the content.