Quickstart Tutorial
This tutorial shows how to run basic Tenyks workflows through the SDK.
from tenyks_sdk.sdk import Tenyks, Annotation, Category, Tag, display_images
from tenyks_sdk.sdk.cloud import AWSLocation, AWSCredentials, AzureLocation, AzureCredentials, AzureTokenType, GCSLocation
Authentication
You can authenticate to the Tenyks platform by setting the following parameters and calling either the authenticate_with_login or the authenticate_with_api_key method.
auth_params = {
    "api_base_url": "https://dashboard.tenyks.ai/api",
    "username": "xxx",
    "password": "xxx",
    "workspace_name": "xxx",
}
tenyks = Tenyks.authenticate_with_login(**auth_params)
Alternatively, authenticate with an API key and secret:
auth_params = {
    "api_base_url": "https://dashboard.tenyks.ai/api",
    "api_key": "xxx",
    "api_secret": "xxx",
    "workspace_name": "xxx",
}
tenyks = Tenyks.authenticate_with_api_key(**auth_params)
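Hard-coded credentials are easy to leak when a notebook is shared. The sketch below is not part of the SDK: it builds the same API-key auth_params dictionary from environment variables. The helper name auth_params_from_env and the variable names (TENYKS_API_KEY and so on) are assumptions chosen for illustration.

```python
import os

DEFAULT_API_BASE_URL = "https://dashboard.tenyks.ai/api"

# Hypothetical environment variable names; rename them to match your setup.
REQUIRED_VARIABLES = {
    "api_key": "TENYKS_API_KEY",
    "api_secret": "TENYKS_API_SECRET",
    "workspace_name": "TENYKS_WORKSPACE",
}


def auth_params_from_env(env=None):
    """Build the API-key auth_params dict from environment variables."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARIABLES.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    params = {key: env[var] for key, var in REQUIRED_VARIABLES.items()}
    # Fall back to the base URL used in this tutorial when none is set.
    params["api_base_url"] = env.get("TENYKS_API_BASE_URL", DEFAULT_API_BASE_URL)
    return params
```

The result can be unpacked in the same way as the literal dictionary: Tenyks.authenticate_with_api_key(**auth_params_from_env()).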
Workspaces
A workspace is an isolated environment where you can create and manage your datasets and models. Each workspace has its own users.
Through the SDK you can list the workspaces you have access to, switch between them, create a new one, and add users to or remove users from a specific workspace.
# Get a list of all workspaces you have access to
tenyks.get_workspaces()
# Set the workspace you want to work with; you can switch between your workspaces at any time
tenyks.set_workspace("WORKSPACE_KEY")
# Create a new workspace
new_workspace = tenyks.create_workspace("new_workspace")
# Get a specific workspace by id
workspace = tenyks.get_workspace("WORKSPACE_ID")
# List all users in the workspace
workspace.get_users()
You can add a new user to a workspace with the following parameters:
* username: username of the user
* sub: unique identifier of the user
* email: email of the user
* providers: list of identity providers; usually a single entry, either "DEFAULT" or "GOOGLE"
Currently, you have to contact the Tenyks team to get a user's information before you can add that user to a workspace.
workspace.add_user("alan_turing", "111-111", "alan@turing.com", ["DEFAULT"])
# Delete users from the workspace by user sub
workspace.delete_users(["111-111"])
Datasets
You can list all the datasets you have access to, create a new one or delete an existing one.
# List all the datasets in your workspace
tenyks.get_datasets()
# Get a specific dataset by its key
dataset = tenyks.get_dataset("DATASET_KEY")
# Delete a dataset by its key
tenyks.delete_dataset("DATASET_KEY")
You can create a new dataset, upload images, upload annotations and trigger the dataset ingestion on the platform. You can upload images and annotations from your local machine or from S3/Azure/GCS.
Create a new dataset from local files
# Create a new empty dataset in your workspace
dataset = tenyks.create_dataset("NEW_DATASET_KEY")
# Upload images to the new dataset from your local storage; you can pass a directory path or a list of file paths
dataset.upload_images("/path/to/images_directory")
# dataset.upload_images(["/path/to/image1.jpg", "/path/to/image2.jpg"])
# Upload annotations to the new dataset from your local storage; pass the path to a valid COCO annotations file
# You can skip this step if you don't have annotations for your images
dataset.upload_annotations("/path/to/annotations_file")
# Trigger the dataset ingestion process on the Tenyks platform (async operation)
dataset.ingest()
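The upload step above expects a valid COCO annotations file. As a reference for what such a file contains, here is a minimal sketch that writes a one-image COCO detection file using only the standard library. The helper write_minimal_coco is hypothetical, the field set follows the public COCO detection layout, and whether Tenyks requires additional fields is not covered in this tutorial.

```python
import json
from pathlib import Path


def write_minimal_coco(path, image_name, width, height, boxes, categories):
    """Write a one-image COCO detection file and return its content.

    boxes is a list of (category_name, [x, y, box_width, box_height]) tuples,
    following the COCO convention of top-left corner plus size.
    """
    category_ids = {name: idx for idx, name in enumerate(categories)}
    coco = {
        "images": [
            {"id": 0, "file_name": image_name, "width": width, "height": height}
        ],
        "categories": [
            {"id": idx, "name": name} for name, idx in category_ids.items()
        ],
        "annotations": [
            {
                "id": ann_id,
                "image_id": 0,
                "category_id": category_ids[name],
                "bbox": list(bbox),
                "area": bbox[2] * bbox[3],
                "iscrowd": 0,
            }
            for ann_id, (name, bbox) in enumerate(boxes)
        ],
    }
    Path(path).write_text(json.dumps(coco, indent=2), encoding="utf-8")
    return coco
```

The path written by this helper is the kind of argument that dataset.upload_annotations takes in the example above.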
Create a new dataset importing from Cloud Storage
You can create a new dataset by importing images and/or annotations from AWS S3, Azure, or Google Cloud Storage. You need to provide your cloud credentials and the location of the files or folder.
AWS S3
my_aws_credentials = AWSCredentials(
    aws_access_key_id="YOUR_AWS_ACCESS_KEY",
    aws_secret_access_key="YOUR_AWS_SECRET_ACCESS_KEY",
    region_name="YOUR_AWS_REGION",
)
images_location = AWSLocation(
    s3_uri="S3_URI_TO_IMAGES_DIRECTORY",
    credentials=my_aws_credentials,
)
annotations_location = AWSLocation(
    s3_uri="S3_URI_TO_ANNOTATIONS_FILE",
    credentials=my_aws_credentials,
)
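The s3_uri placeholders above stand for URIs of the conventional form s3://bucket/key. A small sanity check before building a location can catch a pasted HTTPS URL or a missing bucket early. The helper split_s3_uri below is a hypothetical, standard-library-only sketch and not an SDK function.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key); raise ValueError otherwise."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The key is the path without its leading slash; it may be empty.
    return parsed.netloc, parsed.path.lstrip("/")
```

For example, split_s3_uri("s3://my-bucket/datasets/images/") separates the bucket name from the folder prefix, where both names are made up for illustration.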
Azure
If your images are in Azure, you also have to provide an empty folder for your metadata.
my_azure_credentials = AzureCredentials(
    type=AzureTokenType.CONNECTION_STRING,
    value="DefaultEndpointsProtocol=https;AccountName=xxx;AccountKey=xxxx=;EndpointSuffix=core.windows.net",
)
images_location = AzureLocation(
    azure_uri="AZURE_URI_TO_IMAGES_DIRECTORY",
    credentials=my_azure_credentials,
)
# Empty metadata location folder
metadata_location = AzureLocation(
    azure_uri="AZURE_URI_TO_EMPTY_DIRECTORY",
    credentials=my_azure_credentials,
)
annotations_location = AzureLocation(
    azure_uri="AZURE_URI_TO_ANNOTATIONS_FILE",
    credentials=my_azure_credentials,
)
Google Cloud Storage
If your images are in Google Cloud Storage, you also have to provide an empty folder in GCS for your metadata.
my_gcs_credentials = {
    "type": "service_account",
    "project_id": "my-project-id",
    "private_key_id": "my-private-key-id",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...",
    "client_email": "my-service-account@example.com",
    "client_id": "my-client-id",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/my-service-account@example.com",
}
images_location = GCSLocation(
    gcs_uri="GCS_URI_TO_IMAGES_DIRECTORY",
    credentials=my_gcs_credentials,
)
# Empty metadata location folder
metadata_location = GCSLocation(
    gcs_uri="GCS_URI_TO_EMPTY_DIRECTORY",
    credentials=my_gcs_credentials,
)
annotations_location = GCSLocation(
    gcs_uri="GCS_URI_TO_ANNOTATIONS_FILE",
    credentials=my_gcs_credentials,
)
# Create a new dataset linked to an images folder in cloud storage.
# For Azure/GCS you also have to provide the metadata location; this call shows only images_location.
dataset_from_cloud = tenyks.create_dataset(
    "NEW_DATASET_KEY", images_location=images_location
)
# Upload annotations to the new dataset from the cloud storage
dataset_from_cloud.upload_annotations_from_cloud(annotations_location)
# Trigger the dataset ingestion process on the Tenyks platform (async operation)
dataset_from_cloud.ingest()
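The my_gcs_credentials dictionary in the Google Cloud Storage example has the same fields as a service-account key file downloaded from Google Cloud, so it can be read from disk instead of pasted into code. The sketch below is an illustration only: the helper load_service_account_key is hypothetical, and the list of expected fields simply mirrors the dictionary shown above rather than a documented SDK requirement.

```python
import json

# Field names copied from the credentials dictionary in the GCS example.
EXPECTED_FIELDS = frozenset(
    {
        "type",
        "project_id",
        "private_key_id",
        "private_key",
        "client_email",
        "client_id",
        "auth_uri",
        "token_uri",
        "auth_provider_x509_cert_url",
        "client_x509_cert_url",
    }
)


def load_service_account_key(path):
    """Read a service-account key file and check it has the expected fields."""
    with open(path, encoding="utf-8") as handle:
        credentials = json.load(handle)
    missing = sorted(EXPECTED_FIELDS - set(credentials))
    if missing:
        raise ValueError(f"Key file is missing fields: {missing}")
    if credentials["type"] != "service_account":
        raise ValueError(f"Unexpected credential type: {credentials['type']!r}")
    return credentials
```

The returned dictionary can be passed as the credentials argument of GCSLocation in the same way as the literal dictionary above.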
Models
For each dataset, you can list all the models, create a new one, or delete an existing one.
# Get a specific dataset from your workspace by its key
dataset = tenyks.get_dataset("DATASET_KEY")
# List all the models of the dataset
dataset.get_models()
# Create a new empty model in the dataset
new_model = dataset.create_model("NEW_MODEL_KEY")
# Upload predictions to the new model from your local storage; pass the path to a valid COCO annotations file
new_model.upload_predictions("/path/to/predictions_file")
# Trigger the model ingestion process on the Tenyks platform (async operation)
new_model.ingest()
# Delete a model from the dataset by its key
dataset.delete_model("MODEL_KEY")
Categories and Tags
You can list and get categories and tags for a specific dataset. You can use those categories and tags to add or update images in a dataset (more on that later).
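dataset = tenyks.get_dataset("DATASET_KEY")
# Inspect the dataset (a bare expression displays its value in a notebook)
dataset
# All categories and tags defined in the dataset
dataset.categories
dataset.tags
# Look up a category by name or by id
my_category = dataset.get_category_by_name("CATEGORY_NAME")
my_category_2 = dataset.get_category_by_id(0)
my_category
# Look up a tag by name
my_tag = dataset.get_tag_by_name("TAG_NAME")
my_tag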
Search and display images
There are several methods to get and search images in a dataset/model. You can display the images with their bounding boxes.
dataset = tenyks.get_dataset("DATASET_KEY")
# Get the first n images of the dataset (default is 5)
dataset.head()
# Get a specific image by its key and display it
image = dataset.get_image_by_key("IMAGE_KEY")
display_images([image], draw_bboxes=True)
# Display the first ten images of the dataset
first_ten_images = dataset.head(n=10)
display_images(first_ten_images, n_cols=3)
Search Filters
# Images not tagged as daytime that contain pedestrians whose width and height are both at least 50
search_result = dataset.search_images(
    filter="and(annotation_category:[pedestrian],not(or(annotation_width<50,annotation_height<50)), not(image_tag:[daytime_day]))",
)
display_images(search_result, n_images_to_show=10)
# Images sorted by similarity to the text "taxi"; adding a filter such as the
# night-time tag filter above would make this a hybrid search
search_result = dataset.search_images(
    sort_by="vector_text(taxi)",
)
display_images(search_result, n_images_to_show=10)
# You can create a generator to iterate over the search results
search_generator = dataset.images_generator(filter="annotation_category:[pedestrian]")
for i, im in enumerate(search_generator):
    print(f"Image {i}: {im}")
# Number of images with a specific annotation category
number_of_images_with_truck = dataset.count_images("annotation_category:[truck]")
print(f"Number of images with trucks: {number_of_images_with_truck}")
# Number of images with trucks in the top-left quarter of the image
# (the thresholds 640 and 360 assume 1280x720 images)
number_of_images_with_truck = dataset.count_images(
    "and(annotation_category:[truck],annotation_x_location<640,annotation_y_location<360)"
)
print(
    f"Number of images with trucks in the top left quarter of the image: {number_of_images_with_truck}"
)
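Nested filter strings like the ones above are easy to mistype. The sketch below composes them from small string helpers that only reproduce the syntax visible in these examples: and(...), or(...), not(...) and field:[value]. The helper names are hypothetical and not part of the SDK, and joining clauses with a bare comma is an assumption based on the truck example.

```python
def and_(*clauses):
    """Combine clauses as and(a,b,...)."""
    return "and(" + ",".join(clauses) + ")"


def or_(*clauses):
    """Combine clauses as or(a,b,...)."""
    return "or(" + ",".join(clauses) + ")"


def not_(clause):
    """Negate a clause as not(a)."""
    return f"not({clause})"


def category(name):
    """Match annotations of one category, e.g. annotation_category:[truck]."""
    return f"annotation_category:[{name}]"


def top_left_quarter(category_name, image_width, image_height):
    """Filter for a category located in the top-left quarter of the image."""
    return and_(
        category(category_name),
        f"annotation_x_location<{image_width // 2}",
        f"annotation_y_location<{image_height // 2}",
    )


# Rebuild the two filters used in the examples above.
night_pedestrians = and_(
    category("pedestrian"),
    not_(or_("annotation_width<50", "annotation_height<50")),
    not_("image_tag:[daytime_day]"),
)
trucks_top_left = top_left_quarter("truck", 1280, 720)
```

Either string can then be passed as the filter argument of search_images or as the argument of count_images, as in the calls above.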
Model Search
You can search for images with annotations and predictions for a specific model. You can do that with a Model object or by setting the model_key parameter on the dataset's search_images method.
images_filtered_by_model = dataset.search_images(
    n_images=10,
    model_key="MODEL_KEY",
    filter="and(bounding_box_match_failure:[MISPREDICTED], annotation_width>=300)",
)
model = dataset.get_model("MODEL_KEY")
model_search_result = model.search_images(
    n_images=10,
    filter="and(bounding_box_match_failure:[MISPREDICTED], annotation_width>=300)",
)
display_images(model_search_result, n_images_to_show=10, n_cols=2, draw_bboxes=True)
Update/Add image
You can update or add an image in a dataset with new annotations, categories and tags.
You can use categories and tags from the dataset or create new ones.
dataset = tenyks.get_dataset("DATASET_KEY")
Update
boat = Category(name="Boat")
updated_annotations = [
    Annotation(
        coordinates=[0, 0, 100, 280],
        category=boat,
        tags=[
            Tag(name="tag1", values=["value1"]),
        ],
    ),
    Annotation(coordinates=[500, 100, 100, 200], category=boat),
]
updated_tags = [Tag(name="MyNewImageTag", values=["MyValue"])]
dataset.update_image("IMAGE_KEY", updated_annotations, updated_tags)
Add
car = dataset.get_category_by_name("Car")
truck = dataset.get_category_by_name("Truck")
new_image_path = "PATH/TO/NEW_IMAGE"
new_annotations = [
    Annotation(
        coordinates=[0, 0, 100, 280],
        category=car,
        tags=[
            Tag(name="tag1", values=["value1"]),
            Tag(name="tag2", values=["value1", "values2"]),
        ],
    ),
    Annotation(coordinates=[500, 100, 100, 200], category=truck),
]
new_tags = [Tag(name="ImageTagValue", values=["ImageTagValue"])]
dataset.add_image(new_image_path, new_annotations, new_tags)
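The examples above do not state whether coordinates holds two corners (x1, y1, x2, y2) or a corner plus size (x, y, width, height), so check the SDK reference before relying on either reading. If your labels use the other convention, the conversion is simple arithmetic. The helpers below are a hypothetical, SDK-independent sketch.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x1, y1, x2, y2]."""
    x, y, width, height = box
    return [x, y, x + width, y + height]


def xyxy_to_xywh(box):
    """Convert [x1, y1, x2, y2] to [x, y, width, height]."""
    x1, y1, x2, y2 = box
    if x2 < x1 or y2 < y1:
        raise ValueError(f"Second corner precedes the first in {box!r}")
    return [x1, y1, x2 - x1, y2 - y1]
```

Applying xyxy_to_xywh to the output of xywh_to_xyxy returns the original box, which makes a quick round-trip check on your own labels.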
Upload Custom Embeddings
You can upload custom embeddings to a dataset. Currently, custom embeddings must be stored in S3 in JSON or Arrow format.
custom_embedding_location = {
    "type": "aws_s3",
    "s3_uri": "S3_URI_TO_CUSTOM_EMBEDDINGS_FOLDER",
    "credentials": {
        "aws_access_key_id": "XXXXXXXX",
        "aws_secret_access_key": "XXXXXXXXXX",
        "region_name": "XXXXXXXX",
    },
}
dataset = tenyks.get_dataset("new_dataset_from_jupyter")
dataset.upload_custom_embeddings(
    embedding_type="images",
    embedding_name="my_embeddings",
    embedding_location=custom_embedding_location,
    embedding_filename="arrow",
)