Model Adapter Class Methods

This tutorial presents the flowcharts for the Model Adapter functions. There are two main types of functions:

  • Wrapper Functions: These functions come as part of the BaseModelAdapter class and perform most of the auxiliary tasks needed to adapt a machine learning model to the Dataloop format.
  • User Functions: Each wrapper function calls a user function, i.e. a function implemented by the user in their ModelAdapter with the code specific to the model they wish to adapt.

The following sections explain each function in each category.

Wrapper Functions

Here we will see the flowcharts explaining the logic of the wrapper functions. The blocks in green show operations performed by the wrapper function, while blocks in yellow show a call to a user function, which will require implementation.

load_from_model

This function downloads the model artifacts and loads them into the model object so that it is ready for training and inference.

[Flowchart: load_from_model -> download artifacts @ local_path=~/.dataloop/models/{model.name} -> load(local_path)]

The load_from_model function starts by downloading the artifacts, as seen in the download_artifacts block. It sets the variable local_path to ~/.dataloop/models/{model.name}, where {model.name} is replaced with the name of the model as defined during its creation. The model.artifacts are downloaded into this directory. These usually include weight files, but can include any other auxiliary files needed by the model that were uploaded by the user. More information at this page.

Once all files are in local_path the user function load is invoked. Its explanation can be found below.

Assuming that ~/.dataloop is the default DATALOOP_PATH, the directory structure will be:

Directory tree at this stage:  
DATALOOP_PATH  
|-- models  
|   |-- model.name  
|      |-- artifacts  
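
The load_from_model flow described above can be sketched in plain Python. This is a minimal illustration, not the BaseModelAdapter implementation: the SketchAdapter class, its download_artifacts stub, and the weights.txt file name are all hypothetical, and a temporary directory stands in for DATALOOP_PATH.

```python
import os
import tempfile


class SketchAdapter:
    """Hypothetical stand-in for an adapter; not the real BaseModelAdapter."""

    def __init__(self, model_name, dataloop_path):
        self.model_name = model_name
        self.dataloop_path = dataloop_path
        self.weights = None
        self.loaded_from = None

    def download_artifacts(self, local_path):
        # Placeholder for the artifact download: write a fake weights file.
        os.makedirs(local_path, exist_ok=True)
        with open(os.path.join(local_path, "weights.txt"), "w") as f:
            f.write("fake-weights")

    def load(self, local_path):
        # User function: read whatever the artifacts contain.
        with open(os.path.join(local_path, "weights.txt")) as f:
            self.weights = f.read()
        self.loaded_from = local_path

    def load_from_model(self):
        # Wrapper: resolve local_path, download artifacts, then call load().
        local_path = os.path.join(self.dataloop_path, "models", self.model_name)
        self.download_artifacts(local_path)
        self.load(local_path)
        return local_path


# A temporary directory plays the role of DATALOOP_PATH (assumption).
with tempfile.TemporaryDirectory() as root:
    adapter = SketchAdapter("my-model", dataloop_path=root)
    path = adapter.load_from_model()
    expected_path = os.path.join(root, "models", "my-model")
```

The point of the sketch is the ordering: the user's load only runs once every file is already present in local_path.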

save_to_model

This function saves the current model files and uploads them as artifacts.

[Flowchart: save_to_model -> save(local_path) -> upload artifact @ local_path/*]

The save_to_model function invokes the save user function explained below and then uploads all the files in local_path as model artifacts, as described here.
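
As a rough sketch of that sequence, the snippet below calls a stand-in save function and then passes every file found in local_path to an upload callback. The function bodies, the file names, and the upload callback are hypothetical; the real wrapper uploads the files as model artifacts on the platform.

```python
import os
import tempfile


def save(local_path):
    # User function stand-in: dump the model state as files in local_path.
    with open(os.path.join(local_path, "weights.txt"), "w") as f:
        f.write("state")
    with open(os.path.join(local_path, "labels.txt"), "w") as f:
        f.write("cat\ndog")


def save_to_model(local_path, upload):
    # Wrapper stand-in: call save(), then hand each file in local_path to upload().
    save(local_path)
    uploaded = []
    for name in sorted(os.listdir(local_path)):
        full_path = os.path.join(local_path, name)
        if os.path.isfile(full_path):
            upload(full_path)
            uploaded.append(name)
    return uploaded


with tempfile.TemporaryDirectory() as local_path:
    sent = []  # collects the paths that would be uploaded
    uploaded_names = save_to_model(local_path, upload=sent.append)
```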

predict_items

This function receives a list of items over which the model will need to perform predictions and create annotations.

[Flowchart: predict_items -> i_batch = next batch start -> batch = items[i: (i+batch_size)] -> prepare_item_func -> annotation = predict(batch) -> upload batch_collections for batch_items -> Is last batch? (No: continue with the next batch; Yes: return items and annotations list)]

The predict_items function prepares batches of items, with batch_size defined in the model's configuration. For each batch it calls prepare_item_func, which preprocesses the batch's items so they can be used as input to the model in predict(batch); the predictions are stored in annotation. The annotations are then stored in batch_collections and uploaded as annotations for all the items in the batch.

Once the predictions are performed over all the items in all the batches, the function returns a list of items and another list with their respective annotations.
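
The batching loop can be sketched as follows. This is an illustration under simplifying assumptions: items are plain strings, prepare_item_func and predict are passed in as ordinary callables, and the upload step is only marked by a comment.

```python
def predict_items(items, batch_size, prepare_item_func, predict):
    """Sketch of the batching loop; not the actual wrapper code."""
    all_annotations = []
    for i_batch in range(0, len(items), batch_size):
        batch_items = items[i_batch:i_batch + batch_size]
        # Preprocess every item in the batch before prediction.
        batch = [prepare_item_func(item) for item in batch_items]
        batch_annotations = predict(batch)
        # The real wrapper would upload the annotations for batch_items here.
        all_annotations.extend(batch_annotations)
    return items, all_annotations


batch_sizes = []  # records how many items each predict() call received


def fake_predict(batch):
    # Hypothetical model: turns each prepared input into a label string.
    batch_sizes.append(len(batch))
    return ["label-%d" % x for x in batch]


items, annotations = predict_items(
    items=["a", "bb", "ccc", "dddd", "eeeee"],
    batch_size=2,
    prepare_item_func=len,  # hypothetical preprocessing: string length
    predict=fake_predict,
)
```

With five items and a batch size of two, the last batch holds a single item, which the slice handles without any special case.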

train_model

When running a training session from the model adapter, we start by calling the train_model wrapper function so the model can learn according to the data provided to it in its creation.

[Flowchart: train_model -> load_from_model -> prepare_data -> train(data_path, out_path) -> save_to_model -> cleanup]

The train_model function starts by loading the model, then prepares the data by downloading it according to the directory structure shown below. It then invokes the train function implemented by the user and saves the result of training by calling save_to_model. Lastly, it cleans up by deleting all the local copies of the dataset files used for training.
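
The order of these calls can be sketched with a small class that only records which step ran. The SketchTrainer class is hypothetical, and having prepare_data return data_path and out_path is an assumption made to keep the example short.

```python
class SketchTrainer:
    """Hypothetical adapter that records the order of the training steps."""

    def __init__(self):
        self.calls = []

    def load_from_model(self):
        self.calls.append("load_from_model")

    def prepare_data(self):
        self.calls.append("prepare_data")
        # Assumption: the paths are returned here for simplicity.
        return "data_path", "out_path"

    def train(self, data_path, out_path):
        # User function: the actual training code would go here.
        self.calls.append("train")

    def save_to_model(self):
        self.calls.append("save_to_model")

    def cleanup(self):
        self.calls.append("cleanup")

    def train_model(self):
        # Wrapper: same sequence as in the flowchart.
        self.load_from_model()
        data_path, out_path = self.prepare_data()
        self.train(data_path, out_path)
        self.save_to_model()
        self.cleanup()


trainer = SketchTrainer()
trainer.train_model()
```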

When creating the model, train and validation subsets should be defined for the dataset as shown here. The prepare_data function creates the directory structure shown below and downloads the training data into data_path, in the directories seen in this schema. The data is divided according to the model's subset filters: items holds the data items themselves, while json holds the annotation JSONs associated with each item. Keep that in mind when preprocessing the data in the train function.

Directory tree at convert_from_dtlpy (supposing train and validation subsets):  
-- DATALOOP_PATH  
   |-- models  
   |   |-- model.name  
   |      |-- artifacts  
   |-- model_data  
       |-- model.id_model.name  
           |-- timestamp (root_path)  
               |-- output (out_path)  
               |-- datasets  
                   |-- dataset.id (data_path)  
                       |-- train  
                       |   |-- items  
                       |   |   |-- train_dir (from filter)  
                       |   |-- json  
                       |       |-- train_dir (from filter)  
                       |-- validation  
                           |-- items  
                           |   |-- val_dir (from filter)  
                           |-- json  
                               |-- val_dir (from filter)  
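
Inside the train function, a common first step is to pair each item with its annotation JSON. The helper below is a sketch that walks the items directory of one subset and looks for a JSON file at the same relative path under json. The helper name is hypothetical, and the convention that the JSON shares the item's base name (cat.jpg and cat.json) is an assumption.

```python
import os
import tempfile


def pair_items_with_json(data_path, subset):
    """Return (item_path, json_path) pairs for one subset (hypothetical helper)."""
    items_root = os.path.join(data_path, subset, "items")
    json_root = os.path.join(data_path, subset, "json")
    pairs = []
    for dirpath, _dirnames, filenames in os.walk(items_root):
        for filename in sorted(filenames):
            item_path = os.path.join(dirpath, filename)
            rel = os.path.relpath(item_path, items_root)
            # Assumption: the annotation file has the same base name as the item.
            json_path = os.path.join(json_root, os.path.splitext(rel)[0] + ".json")
            if os.path.isfile(json_path):
                pairs.append((item_path, json_path))
    return pairs


# Build a tiny fake data_path with one training item and its annotation file.
with tempfile.TemporaryDirectory() as data_path:
    for kind, filename in (("items", "cat.jpg"), ("json", "cat.json")):
        folder = os.path.join(data_path, "train", kind, "train_dir")
        os.makedirs(folder)
        open(os.path.join(folder, filename), "w").close()
    pairs = pair_items_with_json(data_path, "train")
    relative_pairs = [
        (os.path.relpath(i, data_path), os.path.relpath(j, data_path))
        for i, j in pairs
    ]
```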
  

evaluate_model

This function generates predictions for a whole test set provided to it and then creates metrics that are uploaded to the Dataloop platform.

[Flowchart: evaluate_model -> load_from_model -> predict_dataset -> evaluate]

It starts by invoking load_from_model so it has the latest model artifacts and then calls predict_dataset which in turn calls predict_items for a whole dataset and a filter provided, which should determine a test set for this dataset. Finally, evaluate will compute metrics by comparing the model's predictions with the ground truth present in the test set.
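
To illustrate the comparison step, the sketch below computes a simple accuracy from predicted and ground-truth labels keyed by item id. The metric, the function signature, and the dictionary layout are assumptions chosen for the example; the source does not specify which metrics the real evaluate step computes.

```python
def evaluate(predictions, ground_truth):
    """Hypothetical metric: fraction of items whose predicted label matches."""
    if not ground_truth:
        return {"accuracy": 0.0, "n_items": 0}
    correct = sum(
        1
        for item_id, label in ground_truth.items()
        if predictions.get(item_id) == label
    )
    return {"accuracy": correct / len(ground_truth), "n_items": len(ground_truth)}


metrics = evaluate(
    predictions={"item-1": "cat", "item-2": "dog", "item-3": "cat", "item-4": "dog"},
    ground_truth={"item-1": "cat", "item-2": "dog", "item-3": "dog", "item-4": "dog"},
)
```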

User Functions

These are the functions that users must implement in order to run the model. For prediction only, load and predict must be implemented. For training, train and save are also required.

load

After the wrapper function downloads all the model artifacts to the local directory, this user-implemented function loads and instantiates the model, using the local files and the model config.

save

Users need to implement this function to dump the model state to a local directory, e.g. torch.save(model.state_dict(), PATH).
After that, the wrapper function takes care of the rest: it uploads the files to the platform, updates the model config, and saves everything to the model entity.
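
A minimal sketch of a matching load and save pair is shown below. To keep it runnable without a deep learning framework, the model state is a plain dictionary written with json, which stands in for a call such as torch.save. The TinyModelAdapter class and the weights.json file name are hypothetical.

```python
import json
import os
import tempfile


class TinyModelAdapter:
    """Hypothetical adapter whose 'model' is a plain dictionary."""

    WEIGHTS_FILE = "weights.json"  # assumed file name, for illustration only

    def __init__(self):
        self.model = None

    def save(self, local_path):
        # Dump the model state into local_path; the wrapper uploads it afterwards.
        os.makedirs(local_path, exist_ok=True)
        with open(os.path.join(local_path, self.WEIGHTS_FILE), "w") as f:
            json.dump(self.model, f)

    def load(self, local_path):
        # Rebuild the model from the files the wrapper downloaded to local_path.
        with open(os.path.join(local_path, self.WEIGHTS_FILE)) as f:
            self.model = json.load(f)


with tempfile.TemporaryDirectory() as local_path:
    trained = TinyModelAdapter()
    trained.model = {"bias": 0.5, "weights": [1.0, 2.0]}
    trained.save(local_path)

    restored = TinyModelAdapter()
    restored.load(local_path)
```

Whatever save writes, load should be able to read back from the same directory, since the wrapper only moves the files between the platform and local_path.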

train

This function is called after the wrapper function loads the model and downloads and prepares the data.
At this point everything is ready locally, and this function implements the actual model training.
When training is done, there is nothing more to do: the wrapper takes care of the saving and uploading.

predict

This function is called after the model is loaded, so the model is ready to predict.
Each item goes through prepare_item_func, after which the batch is ready for prediction.
After the model prediction, the user needs to prepare the annotations in the Dataloop format using the DL annotations.

prepare_item_func

Prepares each item for prediction. By default, images are downloaded and loaded into an ndarray as a batch (NHWC).