When querying a document attribute that is an array of strings, such as the tags or labels associated with a document, it is often necessary to require that at least the queried strings are all present in the array (the array may contain other strings as well).
For example, consider the following entities:
London: {labels:["south", "england", "cold"]}
Manchester: {labels:["north", "england", "cold"]}
New York: {labels:["east", "usa", "cold"]}
California: {labels:["west", "usa", "warm"]}
Texas: {labels:["south", "usa", "warm"]}
To retrieve all documents whose tags/labels include both 'cold' and 'england' (London, Manchester), an 'array-contains-all' operation should be used.
'array-contains-any' would return [London, Manchester, New York].
'in'/'array-contains-only' would return no results (no stored array matches the queried list exactly; the lengths differ).
The proposed 'array-contains-all' would correctly return [London, Manchester].
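The three behaviours above can be illustrated with a small in-memory sketch. This is not a database client: the entity data is copied from the example, while the function names (contains_any, equals_exactly, contains_all, query) are hypothetical and only model the semantics described in the text.

```python
# In-memory model of the three matching semantics discussed above.
# Entity data comes from the example; all function names are hypothetical.
ENTITIES = {
    "London": ["south", "england", "cold"],
    "Manchester": ["north", "england", "cold"],
    "New York": ["east", "usa", "cold"],
    "California": ["west", "usa", "warm"],
    "Texas": ["south", "usa", "warm"],
}


def contains_any(labels, wanted):
    """True if at least one wanted label is present."""
    return any(w in labels for w in wanted)


def equals_exactly(labels, wanted):
    """True only if the stored array is identical to the queried list."""
    return list(labels) == list(wanted)


def contains_all(labels, wanted):
    """True if every wanted label is present; extra labels are allowed."""
    return set(wanted) <= set(labels)


def query(predicate, wanted):
    """Return the names of entities whose labels satisfy the predicate."""
    return [name for name, labels in ENTITIES.items() if predicate(labels, wanted)]


wanted = ["cold", "england"]
any_result = query(contains_any, wanted)
exact_result = query(equals_exactly, wanted)
all_result = query(contains_all, wanted)

print(any_result)
print(exact_result)
print(all_result)
```

Only the subset test in contains_all expresses "at least the queried strings"; the other two predicates either over-match or under-match.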
'array-contains-all' can of course be emulated by post-filtering 'array-contains-any' results. However, with a large number of entities this becomes very inefficient, both in database query response time and payload size and in post-processing time.
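A minimal sketch of that post-filtering workaround, assuming the 'array-contains-any' results have already been fetched into a dictionary. The variable fetched is a hypothetical stand-in for the query response, and post_filter_contains_all is an illustrative helper, not part of any client library.

```python
def post_filter_contains_all(candidates, wanted):
    """Keep only candidates whose labels include every wanted label."""
    required = set(wanted)
    return {
        name: labels
        for name, labels in candidates.items()
        if required.issubset(labels)
    }


# Hypothetical stand-in for what an 'array-contains-any' query on
# ["cold", "england"] would return for the example entities.
fetched = {
    "London": ["south", "england", "cold"],
    "Manchester": ["north", "england", "cold"],
    "New York": ["east", "usa", "cold"],
}

kept = post_filter_contains_all(fetched, ["cold", "england"])

# Every discarded row was still transferred and scanned by the client.
discarded = len(fetched) - len(kept)
print(sorted(kept), discarded)
```

In this toy case only one row is wasted, but the wasted share grows with the number of documents that match any single label without matching all of them.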
Individual boolean attributes could be assigned for each label, but this is infeasible when a very large or growing number of (possibly user-defined) unique tags/labels is possible.
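The boolean-attribute workaround can be sketched as follows, again purely in memory. The helpers to_flags and matches_all_flags are hypothetical names; in a real store each flag would be its own attribute, queried with one equality condition per label.

```python
def to_flags(labels):
    """Turn a label array into one boolean attribute per label."""
    return {label: True for label in labels}


def matches_all_flags(flags, wanted):
    """One equality check per wanted label; a missing flag counts as False."""
    return all(flags.get(label, False) for label in wanted)


london = to_flags(["south", "england", "cold"])
california = to_flags(["west", "usa", "warm"])

print(matches_all_flags(london, ["cold", "england"]))
print(matches_all_flags(california, ["cold", "england"]))

# The set of distinct attributes grows with every new label, which is
# the scaling problem described above.
flag_fields = sorted(set(london) | set(california))
print(flag_fields)
```

Two documents already require six distinct attributes here, so hundreds of user-defined labels would mean hundreds of attributes to maintain.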
We have a very large number of documents, with hundreds of unique labels used to filter and serve content in response to user actions, and a requirement for very low latency.
'array-contains-any' followed by post-filtering is a very inefficient way to retrieve the few documents that contain all of the requested labels.
We need the array-contains-all operator.