Supported text filters

The supported text filters are: `contains`, `exact_match`, `word_match`, `categories`, and `regexp`.

contains

Filtering documents containing "Durian BID" in description using filter_type `contains`.
This filter returns a document only if the specified field contains the given string value as a substring. **This filter is case-sensitive.**

For instance, if a product name is composed of a name and a number (e.g. ABC-123), one might remember the name but not the number. This filter can easily return all products including the ABC string.

filters = [
    {
        "field": "description",
        "filter_type": "contains",
        "condition": "==",
        "condition_value": "Durian BID"
    }
]

filtered_data = ds.get_documents(filters=filters)
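
To make the semantics concrete, here is a minimal local sketch of what a case-sensitive `contains` filter does. It runs on plain Python dictionaries and is not the Relevance AI implementation; the helper name `contains_filter` and the sample documents are hypothetical.

```python
def contains_filter(documents, field, value):
    """Keep documents whose `field` contains `value` as a substring (case-sensitive)."""
    return [doc for doc in documents if value in str(doc.get(field, ""))]


# Hypothetical sample documents, for illustration only
documents = [
    {"_id": "1", "description": "Durian BID leather sofa"},
    {"_id": "2", "description": "durian bid fabric sofa"},
    {"_id": "3", "description": "Oak dining table"},
]

matches = contains_filter(documents, "description", "Durian BID")
print([doc["_id"] for doc in matches])  # only "1"; document "2" differs in case
```

Because the comparison is case-sensitive, the lower-case description is not returned even though it contains the same words.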

exact_match

Filtering documents with "Durian Leather 2 Seater Sofa" as the product_name.
This filter works with string values and returns only documents whose field value exactly matches the filter value. **This filter is case-sensitive.**

For instance, when filtering by 'Samsung galaxy s21', the result contains only products whose specified field is exactly 'Samsung galaxy s21'.

filters = [
    {
        "field": "product_name",
        "filter_type": "exact_match",
        "condition": "==",
        "condition_value": "Durian Leather 2 Seater Sofa"
    }
]

filtered_data = ds.get_documents(filters=filters)
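
The difference from `contains` can be illustrated with a small local sketch. This is only an approximation of the behaviour described above, written against plain dictionaries; `exact_match_filter` and the sample products are hypothetical names and data, not part of the Relevance AI SDK.

```python
def exact_match_filter(documents, field, value):
    """Keep documents whose `field` is exactly equal to `value` (case-sensitive)."""
    return [doc for doc in documents if doc.get(field) == value]


# Hypothetical sample products, for illustration only
products = [
    {"product_name": "Samsung galaxy s21"},
    {"product_name": "Samsung galaxy s21 Ultra"},  # longer name, not an exact match
    {"product_name": "samsung galaxy s21"},        # different case, not an exact match
]

matches = exact_match_filter(products, "product_name", "Samsung galaxy s21")
print(len(matches))  # 1
```

A `contains` filter with the same value would also return the "Ultra" product, since the filter value is a substring of that name.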

word_match

Filtering documents matching "Home curtain" in the description field.
This filter has similarities to both `exact_match` and `contains`. It returns a document only if the field contains a **word** matching the filter value. Substrings are therefore covered, but only when they can be separated from the surrounding text by common word separators such as white space. **This filter is case-sensitive.**

For instance, the filter value "Home Gallery" can return a document with "Buy Home Fashion Gallery Polyester ..." in the description field, as both words appear explicitly in the text.

filters = [
    {
        "field": "description",
        "filter_type": "word_match",
        "condition": "==",
        "condition_value": "Home curtain"
    }
]

filtered_data = ds.get_documents(filters=filters)
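
The word-level behaviour can be sketched locally as follows. The sketch assumes, based on the "Home Gallery" example above, that every white-space-separated word of the filter value must appear as a whole word in the field; the exact tokenization used by Relevance AI is not specified here, and `word_match_filter` and the sample documents are hypothetical.

```python
def word_match_filter(documents, field, value):
    """Keep documents whose `field` contains every word of `value` as a whole word.

    Assumption: words are split on white space only, and matching is case-sensitive.
    """
    wanted = value.split()
    kept = []
    for doc in documents:
        words = set(str(doc.get(field, "")).split())
        if all(word in words for word in wanted):
            kept.append(doc)
    return kept


# Hypothetical sample documents, for illustration only
documents = [
    {"_id": "a", "description": "Home Fashion Gallery curtains"},
    {"_id": "b", "description": "Homely Gallery prints"},   # "Home" is only a substring here
    {"_id": "c", "description": "home gallery wall art"},   # different case
]

matches = word_match_filter(documents, "description", "Home Gallery")
print([doc["_id"] for doc in matches])  # ['a']
```

Document "b" would be returned by a `contains` filter for "Home", but not by a word-level match, because "Homely" is a different word.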

categories

Filtering documents with "LG" or "Samsung" as the brand.
This filter returns the documents in which the field value is one of the values in a given list. **This filter is case-sensitive.**

For instance, it can return all products whose brand is any of Sony, Samsung, or LG.

filters = [
    {
        "field": "brand",
        "filter_type": "categories",
        "condition": ">=",
        # Case-sensitive: each value must use the same casing as the stored brand
        "condition_value": ["LG", "Samsung"]
    }
]

filtered_data = ds.get_documents(filters=filters)
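
As a rough local illustration of the list-membership check and its case sensitivity, consider the sketch below. It operates on plain dictionaries and is not the Relevance AI implementation; `categories_filter` and the sample documents are hypothetical.

```python
def categories_filter(documents, field, allowed_values):
    """Keep documents whose `field` value is one of `allowed_values` (case-sensitive)."""
    allowed = set(allowed_values)
    return [doc for doc in documents if doc.get(field) in allowed]


# Hypothetical sample documents, for illustration only
documents = [
    {"_id": "1", "brand": "LG"},
    {"_id": "2", "brand": "Samsung"},
    {"_id": "3", "brand": "Sony"},
    {"_id": "4", "brand": "samsung"},  # lower case, treated as a different value
]

matches = categories_filter(documents, "brand", ["LG", "Samsung"])
print([doc["_id"] for doc in matches])  # ['1', '2']
```

If the stored values are not consistently cased, one option is to list each variant explicitly in `condition_value`.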

regex

Filtering documents containing "Durian (\w+)" in description using filter_type `regexp`.
This filter returns a document only if the field matches the given regular expression (regexp). Note that substrings are covered in this category. **This filter is case-sensitive.**

For instance, if a product name is composed of a name and a number (e.g. ABC-123), one might remember the name but not the number. A regular expression can describe that pattern and return all products whose name includes the ABC string.

Relevance AI parses these queries with the same regular expression syntax as Apache Lucene.

filters = [
    {
        "field": "description",
        "filter_type": "regexp",
        "condition": "==",
        # Raw string so that Python passes the backslash in \w through unchanged
        "condition_value": r".*Durian (\w+)"
    }
]

filtered_data = ds.get_documents(filters=filters)
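
Lucene-style patterns are anchored, meaning the pattern has to match the entire value rather than any part of it. The sketch below uses Python's built-in `re` module to show the difference between anchored and unanchored matching. Python's `re` is not the Lucene engine and supports a different set of operators, so this is only an approximation of the matching behaviour; the sample descriptions are hypothetical, and whether Relevance AI applies the pattern to the whole field is an assumption.

```python
import re

# Raw string, so the backslash in \w reaches the regex engine unchanged
pattern = r".*Durian (\w+)"

# Hypothetical sample descriptions, for illustration only
descriptions = [
    "Buy Durian BID",
    "Durian sofas on sale",
    "Oak dining table",
]

# re.fullmatch requires the whole value to match (approximates anchored patterns)
anchored = [text for text in descriptions if re.fullmatch(pattern, text)]

# re.search accepts a match anywhere in the value (unanchored)
unanchored = [text for text in descriptions if re.search(pattern, text)]

print(anchored)    # ['Buy Durian BID']
print(unanchored)  # ['Buy Durian BID', 'Durian sofas on sale']
```

Under anchored matching, a leading or trailing `.*` is what allows a pattern to match text in the middle of a longer value.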