Transform your data
The easiest way to transform an existing dataset is to run ds.apply. This function fetches every data point in the dataset, runs the specified function on each one, and writes the result back to the dataset.
For instance, in the sample code below, we count the number of characters in a text field.
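from relevanceai import Client

client = Client(token=YOUR_ACTIVATION_TOKEN)
# Keep a handle to the dataset so it can be used below
ds = client.Dataset("sample")
# Store the length of each "text" value in a new field
ds["text"].apply(lambda x: len(x), output_field="text_character_count")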
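To make the fetch, transform, and write-back cycle described above concrete, here is a minimal in-memory sketch. It does not call the Relevance AI client and is not how the library is implemented. The helper `apply_to_field` and the sample documents are hypothetical names used only for illustration, and skipping documents that lack the field is an assumption.

```python
def apply_to_field(docs, field, func, output_field):
    """Run func on each document's `field` and store the result in `output_field`.

    Hypothetical helper: a plain-Python stand-in for the idea behind a
    per-field apply, operating on a list of dicts instead of a real dataset.
    """
    for doc in docs:
        if field in doc:  # assumption: documents without the field are skipped
            doc[output_field] = func(doc[field])
    return docs


# Made-up sample documents standing in for the rows of a dataset
sample_docs = [
    {"_id": "1", "text": "hello"},
    {"_id": "2", "text": "hi there"},
]

apply_to_field(sample_docs, "text", len, "text_character_count")
```

After the call, each document carries a new `text_character_count` field next to its original `text` field, which mirrors the role of `output_field` in the sample code.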
You can also run functions that operate on batches of documents or on multiple fields with bulk_apply. This is especially useful for code that runs faster on batches.
def bulk_character_count(docs):
    # sample code showing a sample bulk apply function
    for d in docs:
        d['text_character_count'] = len(d['text'])
    return docs

ds.bulk_apply(bulk_character_count, select_fields=["text"])
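The batch-oriented pattern can be tried locally before running it against a real dataset. The sketch below is an illustration under stated assumptions, not the library's implementation: `chunked`, `bulk_apply_local`, and the `batch_size` parameter are hypothetical, and the documents are made up. It shows a function that receives a list of documents being called once per batch.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_character_count(docs):
    # Same shape as a bulk apply function: takes a list of docs, returns the list
    for d in docs:
        d["text_character_count"] = len(d["text"])
    return docs


def bulk_apply_local(docs, bulk_func, batch_size=2):
    """Hypothetical local stand-in: call bulk_func once per batch of documents."""
    updated = []
    for batch in chunked(docs, batch_size):
        updated.extend(bulk_func(batch))
    return updated


# Made-up sample documents; three docs with batch_size=2 gives two batches
sample_docs = [{"text": "a"}, {"text": "bb"}, {"text": "ccc"}]
result = bulk_apply_local(sample_docs, bulk_character_count)
```

Writing the function against a list of documents keeps it easy to test in isolation, and the same function can then be handed to bulk_apply unchanged.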