Dataset
Bases: BaseModel
A class representing a dataset on the Tenyks platform.
Attributes:
Name | Type | Description |
---|---|---|
client | Client | The client to interact with the Tenyks API. |
workspace_name | str | Name of the workspace the dataset belongs to. |
key | str | Key of the dataset. |
name | str | Name of the dataset. |
owner | str | Owner of the dataset. |
owner_email | EmailStr | Owner email of the dataset. |
created_at | datetime | Creation timestamp of the dataset. |
images_location | Optional[Union[AWSLocation, AzureLocation, GCSLocation]] | Directory location of the images of the dataset. |
metadata_location | Optional[Union[AWSLocation, AzureLocation, GCSLocation]] | Directory location of the metadata of the dataset. |
categories | List[Category] | Categories/classes of the dataset. |
models | List | Names of the models of the dataset. |
status | str | Status of the dataset. |
n_images | int | Number of images in the dataset. |
iou_threshold | float | IOU threshold set for the dataset. |
add_image(image_path, annotations=None, tags=None, verbose=False)
Add an image to the dataset along with its annotations and tags.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image_path | str | The path of the image to add. | required |
annotations | Optional[List[Annotation]] | The annotations to add to the image. Defaults to None. | None |
tags | Optional[List[Tag]] | The tags to add to the image. Defaults to None. | None |
verbose | Optional[bool] | If True, provides progress updates. Defaults to False. | False |
count_images(filter=None, model_key=None)
Return the number of images that match the filter criteria.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filter | Optional[str] | Filter conditions for counting. Defaults to None. | None |
model_key | Optional[str] | Model key to filter images. Defaults to None. | None |
Returns:
Name | Type | Description |
---|---|---|
int | int | Number of images that match the filter criteria. |
create_model(name, confidence_threshold=None, iou_threshold=None)
Create a new model for the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The name of the new model. | required |
confidence_threshold | Optional[float] | The confidence threshold for the model. Defaults to None. | None |
iou_threshold | Optional[float] | The IOU threshold for the model. Defaults to None. | None |
Returns:
Name | Type | Description |
---|---|---|
Model | Model | The newly created model. |
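For readers unfamiliar with the term, IOU (intersection over union) measures how much two bounding boxes overlap; a threshold on it is commonly used to decide whether a prediction matches an annotation. The sketch below only illustrates the quantity itself. The `[x, y, width, height]` box layout is an assumption borrowed from the COCO convention, the helper name `iou` is hypothetical, and how the Tenyks platform applies `iou_threshold` is not specified on this page.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


half_shifted = iou([0, 0, 10, 10], [5, 0, 10, 10])  # overlap 50, union 150
identical = iou([0, 0, 10, 10], [0, 0, 10, 10])
disjoint = iou([0, 0, 10, 10], [20, 20, 5, 5])
print(half_shifted, identical, disjoint)
```

With a threshold of 0.5, for example, the half-shifted pair above would not count as a match, while the identical pair would.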
delete_model(key)
Delete a model from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | str | The key of the model to delete. | required |
finetune_search_model(search_query, ground_truth_search_results)
Placeholder method for fine-tuning the search model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
search_query | str | Search query on which to fine-tune the model. | required |
ground_truth_search_results | List[Image] | Ground-truth images that should be retrieved. | required |
get_category_by_id(category_id)
Retrieve a category by its ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
category_id | int | The ID of the category to retrieve. | required |
Returns:
Name | Type | Description |
---|---|---|
Category | Category | The category corresponding to the given ID. |
get_category_by_name(category_name)
Retrieve a category by its name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
category_name | str | The name of the category to retrieve. | required |
Returns:
Name | Type | Description |
---|---|---|
Category | Category | The category corresponding to the given name. |
get_image_by_key(image_key)
Retrieve an image by its key.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image_key | str | The key of the image to retrieve. | required |
Returns:
Name | Type | Description |
---|---|---|
Image | Image | The image corresponding to the given key. |
get_model(key)
Retrieve a model by its key.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | str | The key of the model to retrieve. | required |
Returns:
Name | Type | Description |
---|---|---|
Model | Model | The model corresponding to the given key. |
get_model_names()
Retrieve the names of the models associated with the dataset.
Returns:
Type | Description |
---|---|
List[str] | A list of model display names. |
get_models()
Retrieve the models associated with the dataset.
Returns:
Type | Description |
---|---|
List[Model] | A list of models associated with the dataset. |
get_tag_by_key(tag_key)
Retrieve a tag by its key.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tag_key | str | The key of the tag to retrieve. | required |
Returns:
Name | Type | Description |
---|---|---|
Tag | Tag | The tag corresponding to the given key. |
get_tag_by_name(tag_name)
Retrieve a tag by its display name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tag_name | str | The name of the tag to retrieve. | required |
Returns:
Name | Type | Description |
---|---|---|
Tag | Tag | The tag corresponding to the given display name. |
get_tags()
Retrieve the tags associated with the dataset.
Returns:
Type | Description |
---|---|
List[Tag] | A list of tags created for the dataset. |
head(n=5)
Retrieve the first few images from the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int | The number of images to retrieve. Defaults to 5. | 5 |
Returns:
Type | Description |
---|---|
List[Image] | A list of the first n images. |
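Taking the first few items of a lazily produced sequence can be sketched with the standard library alone. This is an illustration of the idea, not the SDK's implementation: `first_n` and `numbered_keys` are hypothetical names, and plain strings stand in for `Image` objects.

```python
from itertools import islice


def first_n(items, n=5):
    """Return the first n items of any iterable without consuming the rest."""
    return list(islice(items, n))


def numbered_keys():
    """Endless stand-in for a stream of image keys."""
    i = 0
    while True:
        yield f"img_{i}"
        i += 1


preview = first_n(numbered_keys(), 3)
print(preview)  # ['img_0', 'img_1', 'img_2']
```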
images_generator(filter=None, sort_by=None, model_key=None, page_size=250)
Generator to retrieve images from the dataset in a paginated manner.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filter | Optional[str] | Filter conditions for the search. Defaults to None. | None |
sort_by | Optional[str] | Sort criteria for the search. Defaults to None. | None |
model_key | Optional[str] | Model key to filter images. Defaults to None. | None |
page_size | Optional[int] | Number of images per page. Defaults to 250. | 250 |
Yields:
Name | Type | Description |
---|---|---|
Generator | Generator | A generator yielding images. |
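The paginated-generator pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SDK's code: `paginate`, `fake_fetch`, and `FAKE_KEYS` are hypothetical, a list of strings stands in for the server, and the stop condition (a page shorter than `page_size`) is one common choice that may differ from what the Tenyks client does.

```python
from typing import Callable, Iterator, List


def paginate(fetch_page: Callable[[int, int], List[str]],
             page_size: int = 250) -> Iterator[str]:
    """Yield items one at a time, requesting the next page only when needed."""
    page = 0
    while True:
        items = fetch_page(page, page_size)
        yield from items
        if len(items) < page_size:  # a short page means there is nothing left
            return
        page += 1


# Stand-in for the remote service: 600 fake image keys.
FAKE_KEYS = [f"img_{i}" for i in range(600)]
calls = []


def fake_fetch(page: int, page_size: int) -> List[str]:
    calls.append(page)  # record which pages were requested
    start = page * page_size
    return FAKE_KEYS[start:start + page_size]


keys = list(paginate(fake_fetch))
print(len(keys), calls)  # 600 [0, 1, 2]
```

Because the generator is lazy, a caller that stops iterating early never triggers the remaining page requests, which is the main reason to prefer it over fetching everything up front.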
ingest(import_operation=None, verbose=True)
Trigger the ingestion process for the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
import_operation | Optional[str] | The import operation type. Defaults to None. | None |
verbose | Optional[bool] | If True, provides progress updates. Defaults to True. | True |
save_image_metadata(metadata_key, metadata_values)
Add or update custom metadata for images in a dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
metadata_key | str | The key representing the type of metadata to be saved. Must contain only alphanumeric characters (no spaces, underscores, or special characters), e.g. brightness. | required |
metadata_values | Dict[str, Union[int, float]] | A dictionary where the keys are image identifiers and the values are the metadata values to be saved (either integer or float). | required |
Example
    metadata_values = {
        "image1": 0.75,
        "image2": 0.85,
        "image3": 0.65,
        # More image metadata...
    }
    dataset.save_image_metadata(
        metadata_key="brightness",
        metadata_values=metadata_values
    )
Note
The metadata values are sent to the server in batches of 500 to avoid overwhelming the API. Each batch is processed sequentially, and the method logs the progress of each batch. After all batches are processed, the dataset's metadata key is updated accordingly.
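The key constraint and the batching described in the note can be sketched locally. This is an illustration only, not the SDK's implementation: `is_valid_metadata_key` and `iter_batches` are hypothetical helpers, and treating "alphanumeric" as ASCII letters and digits is an assumption.

```python
import re
from typing import Dict, Iterator, Union

BATCH_SIZE = 500  # batch size stated in the note above
Number = Union[int, float]


def is_valid_metadata_key(key: str) -> bool:
    """True if the key holds only ASCII letters and digits (assumed reading)."""
    return re.fullmatch(r"[A-Za-z0-9]+", key) is not None


def iter_batches(values: Dict[str, Number],
                 size: int = BATCH_SIZE) -> Iterator[Dict[str, Number]]:
    """Yield consecutive sub-dictionaries holding at most `size` entries."""
    items = list(values.items())
    for start in range(0, len(items), size):
        yield dict(items[start:start + size])


# 1200 fake entries split into batches of 500, 500, and 200.
metadata_values = {f"image{i}": i / 1200 for i in range(1200)}
batch_sizes = [len(batch) for batch in iter_batches(metadata_values)]
print(batch_sizes)  # [500, 500, 200]
print(is_valid_metadata_key("brightness"), is_valid_metadata_key("mean_brightness"))
```

Checking the key before calling `save_image_metadata` avoids sending any batch for a key that would be rejected, such as one containing an underscore.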
search_images(n_images=250, filter=None, sort_by=None, model_key=None)
Perform image search in the dataset based on filters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_images | Optional[int] | The number of images to retrieve. Defaults to 250. | 250 |
filter | Optional[str] | Filter conditions for the search. Defaults to None. | None |
sort_by | Optional[str] | Sort criteria for the search. Defaults to None. | None |
model_key | Optional[str] | Model key to filter images. Defaults to None. | None |
Returns:
Type | Description |
---|---|
List[Image] | A list of images that match the search criteria. |
search_video(n_videos=50, filter=None, sort_by=None, model_key=None)
Perform video search in the dataset based on filters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_videos | Optional[int] | Number of video clips to return. Defaults to 50. | 50 |
filter | Optional[str] | Filter conditions for the search. Defaults to None. | None |
sort_by | Optional[str] | Sort criteria for the search. Defaults to None. | None |
model_key | Optional[str] | Model key to filter videos. Defaults to None. | None |
Returns:
Type | Description |
---|---|
List[VideoClip] | A list of video clips that match the search criteria. |
update_image(image_key, annotations, tags=None, verbose=False)
Update an existing image's annotations and tags.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image_key | str | The key of the image to update. | required |
annotations | List[Annotation] | The new annotations for the image. | required |
tags | Optional[List[Tag]] | The new tags for the image. Defaults to None. | None |
verbose | Optional[bool] | If True, provides progress updates. Defaults to False. | False |
upload_annotations(coco_path_or_dict, verbose=True)
Upload annotations to the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
coco_path_or_dict | Union[str, dict] | The file path or dictionary of COCO annotations to upload. | required |
verbose | Optional[bool] | If True, provides progress updates. Defaults to True. | True |
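A dictionary in the general COCO detection layout looks roughly like the sketch below. The field names follow the public COCO format; which fields the Tenyks platform requires is not stated on this page, so treat the exact set as an assumption. The file name, category, and the `check_references` helper are hypothetical.

```python
import json

coco = {
    "images": [
        {"id": 1, "file_name": "image1.jpg", "width": 640, "height": 480},
    ],
    "categories": [
        {"id": 0, "name": "car"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # must refer to an entry in "images"
            "category_id": 0,    # must refer to an entry in "categories"
            "bbox": [100.0, 120.0, 200.0, 150.0],  # [x, y, width, height]
            "area": 200.0 * 150.0,
            "iscrowd": 0,
        },
    ],
}


def check_references(coco_dict):
    """Return ids of annotations that point at an unknown image or category."""
    image_ids = {img["id"] for img in coco_dict["images"]}
    category_ids = {cat["id"] for cat in coco_dict["categories"]}
    return [
        ann["id"]
        for ann in coco_dict["annotations"]
        if ann["image_id"] not in image_ids or ann["category_id"] not in category_ids
    ]


dangling = check_references(coco)
coco_json = json.dumps(coco)  # the same content could be written to a file
print(dangling)  # []
```

Since `coco_path_or_dict` accepts either form, the dictionary could be passed directly or saved as JSON and passed by path; a quick reference check like the one above catches dangling ids before the upload.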
upload_annotations_from_cloud(coco_file_location)
Upload annotations to the dataset from a cloud location.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
coco_file_location | Union[AWSLocation, AzureLocation, GCSLocation] | The cloud location of the COCO annotations to upload. | required |
upload_custom_embeddings(embedding_name, embedding_location, embedding_type='images', verbose=True)
Upload custom embeddings to the dataset for use in the Embedding viewer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
embedding_name | str | The display name of the embeddings. | required |
embedding_location | dict | The location of the embeddings in cloud storage. | required |
embedding_type | str | The type of embeddings. At present only 'images' is supported. 'annotations'/'predictions' coming soon! | 'images' |
verbose | Optional[bool] | If True, provides progress updates. Defaults to True. | True |
upload_custom_embeddings_from_local(embedding_name, embedding_path, embedding_type='images', verbose=True)
Upload custom embeddings from a local file to the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
embedding_name | str | The display name of the embeddings. | required |
embedding_path | str | The path to the custom embeddings JSON. | required |
embedding_type | str | The type of embeddings. At present only 'images' is supported. 'annotations'/'predictions' coming soon! | 'images' |
verbose | Optional[bool] | If True, provides progress updates. Defaults to True. | True |
upload_images(image_directory_or_paths, verbose=True)
Upload images to the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image_directory_or_paths | Union[str, Path, List[str]] | The directory or paths of the images to upload. | required |
verbose | Optional[bool] | If True, provides progress updates. Defaults to True. | True |
upload_videos_from_cloud_and_ingest(video_folder_location, sample_rate_per_second, frames_to_subsample, prompts=['objects'], threshold=0.005)
Upload videos from a cloud location and ingest them into the dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
video_folder_location | Union[AWSLocation, GCSLocation, AzureLocation] | The location of the folder of videos from which the images uploaded to the dataset are taken. | required |
video_clip_generator(filter=None, sort_by=None, model_key=None, page_size=50)
Generator to retrieve video clips from the dataset in a paginated manner.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filter | Optional[str] | Filter conditions for the search. Defaults to None. | None |
sort_by | Optional[str] | Sort criteria for the search. Defaults to None. | None |
model_key | Optional[str] | Model key to filter videos. Defaults to None. | None |
page_size | Optional[int] | Number of video clips per page. Defaults to 50. | 50 |
Yields:
Name | Type | Description |
---|---|---|
Generator | Generator | A generator yielding video clips. |