Extract Analytics
Total Working Time - User Stats Studio Time and Annotation Classify Bulk Time
Total working time is derived by summing the measure types userStatsStudioTime and annotationClassifyBulkTime.
Below is an example of tracking the total working time at the dataset level.
Note: Grouping is not supported for this metric in the payload, and dimensions must be passed in the context itself. If multiple dimensions, such as userId or datasetId, are passed in the context, the API will return aggregated data based on the provided dimensions.
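As a minimal sketch of how the optional dimensions might be assembled, the helper below builds the context dictionary and only adds datasetId or userId when a value is supplied. The name build_context and its parameters are illustrative and not part of the Dataloop SDK; the key names and list-wrapped values follow the payloads used in this section, and the IDs are placeholders.

```python
def build_context(project_id, dataset_id=None, user_id=None):
    """Build the "context" part of an analytics payload (hypothetical helper)."""
    # projectId is always present in the payloads shown in this section.
    context = {"projectId": [project_id]}
    # Optional dimensions are added only when given, so the query
    # aggregates over everything in the project when they are omitted.
    if dataset_id is not None:
        context["datasetId"] = [dataset_id]
    if user_id is not None:
        context["userId"] = [user_id]
    return context


# Placeholder IDs, for illustration only.
print(build_context("project-123"))
print(build_context("project-123", dataset_id="dataset-456"))
```

The returned dictionary can be used as the "context" value of a payload.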
Code Description
The following code is used to extract the total working time shown in the image above. In the payload, datasetId and userId are optional parameters.
- If neither parameter is provided: The aggregated data will be fetched for all datasets and users available in the projectId.
- If both parameters are provided: The aggregated data will be extracted specifically based on the given parameters.
import dtlpy as dl

project = dl.projects.get(project_name='my project')
dataset = project.datasets.get(dataset_name='my dataset')

# datasetId and userId are optional in the context.
payload = {
    "startTime": project.created_at,
    "endTime": None,
    "context": {
        "projectId": [project.id],
        "datasetId": [dataset.id]
    },
    "measures": [
        {
            "measureType": "userStatsStudioTime",
            "pageSize": 0
        },
        {
            "measureType": "annotationClassifyBulkTime",
            "pageSize": 0
        }
    ]
}

success, resp = dl.client_api.gen_request(req_type="post",
                                          path="/analytics/query",
                                          json=payload)
samples = resp.json()

# samples[0] holds userStatsStudioTime, samples[1] holds annotationClassifyBulkTime.
total_time = 0
studio_time = samples[0]
bulk_time = samples[1]
if studio_time['response']:
    total_time += studio_time['response'][0]['activityDuration']
if bulk_time['response']:
    total_time += bulk_time['response'][0]['totalTime']

# Convert the summed duration to whole minutes.
print(f'total_time in minutes: {int(total_time / (1000 * 60))}')
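The example above converts with int(total_time / (1000 * 60)), which truncates to whole minutes, so any total under one minute prints as 0. The sketch below is one possible way to keep hours, minutes, and seconds. It assumes the durations are in milliseconds, as the division by 1000 * 60 implies; format_duration_ms is an illustrative name, not an SDK function.

```python
def format_duration_ms(duration_ms):
    """Format a millisecond duration as hours, minutes, and seconds (hypothetical helper)."""
    # Drop the sub-second part, then split into h/m/s.
    total_seconds = int(duration_ms // 1000)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:d}h {minutes:02d}m {seconds:02d}s"


print(format_duration_ms(5_025_000))  # 1h 23m 45s
```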
Net Annotation Time | Avg Annotation on Item | Avg Annotation Time | Avg Annotation per Items | Total Annotation Time
The annotationCounters measureType is used for tracking the following metrics:
- Net annotation time
- Average item time
- Average annotation time
- Annotations per item (available in UI)
The annotationWholeTime measureType is used for tracking:
- Total annotation time (available in UI)
Annotation Counter and Annotation Whole Time
Note: Grouping is not supported for these metrics in the payload, and dimensions must be passed in the context.
- If multiple dimensions such as userId, datasetId, etc., are passed in the context, the response will return aggregated data for the provided dimensions.
- If the parameters datasetId and userId are not passed, the aggregated data will be fetched for all datasets and users available in the projectId.
The following code is used to extract the metrics shown in the image above. In the payload, datasetId and userId are optional parameters.
- If neither parameter is provided: The aggregated data will be fetched for all datasets and users available in the projectId.
- If both parameters are provided: The aggregated data will be extracted specifically based on the given parameters.
import dtlpy as dl
import math

project = dl.projects.get(project_name='my project')
dataset = project.datasets.get(dataset_name='my dataset')

payload = {
    "startTime": project.created_at,
    "endTime": None,
    "context": {
        "projectId": [project.id],
        "datasetId": [dataset.id]
    },
    "measures": [
        {
            "measureType": "annotationCounters",
            "pageSize": 0
        },
        {
            "measureType": "annotationWholeTime",
            "pageSize": 0
        }
    ]
}

success, resp = dl.client_api.gen_request(req_type="post",
                                          path="/analytics/query",
                                          json=payload)
samples = resp.json()

net_annotation_time = 0
avg_item_time = 0
avg_annotation_time = 0
avg_annotations_per_item = 0
annotation_whole_time = 0

# samples[0] holds annotationCounters, samples[1] holds annotationWholeTime.
annotation_counters = samples[0]
annotation_wholetime = samples[1]
if annotation_counters['response']:
    net_annotation_time += annotation_counters['response'][0]['totalTime']
    avg_item_time += annotation_counters['response'][0]['avgItemAnnotationTime']
    avg_annotation_time += annotation_counters['response'][0]['avgAnnotationTime']
    avg_annotations_per_item += annotation_counters['response'][0]['avgAnnotationCountPerItem']
if annotation_wholetime['response']:
    annotation_whole_time += annotation_wholetime['response'][0]['totalTime']

# Time values are converted to whole minutes; the per-item count is rounded up.
print(f'net_annotation_time in minutes: {int(net_annotation_time / (1000 * 60))} \n'
      f'avg_item_time in minutes: {int(avg_item_time / (1000 * 60))} \n'
      f'avg_annotation_time in minutes: {int(avg_annotation_time / (1000 * 60))} \n'
      f'avg_annotations_per_item: {math.ceil(avg_annotations_per_item)} \n'
      f'annotation_whole_time in minutes: {int(annotation_whole_time / (1000 * 60))}')
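The repeated pattern of checking 'response' and then reading the first row could be wrapped in a small helper. The sketch below is one way to do that, using mock data in place of a real API response. first_row_value is an illustrative name, and the numbers in the mock samples are invented; only the field names totalTime and avgItemAnnotationTime come from the example above.

```python
def first_row_value(sample, key, default=0):
    """Return sample['response'][0][key], or default when no data came back (hypothetical helper)."""
    rows = sample.get("response") or []
    if not rows:
        return default
    return rows[0].get(key, default)


# Mock samples standing in for resp.json(); the values are invented.
mock_counters = {"response": [{"totalTime": 120000, "avgItemAnnotationTime": 60000}]}
mock_empty = {"response": []}

print(first_row_value(mock_counters, "totalTime"))  # 120000
print(first_row_value(mock_empty, "totalTime"))     # 0
```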
Average Annotation Time Per Label
The avgAnnotationTimePerLabel measureType is used for tracking the average annotation time per label.
Below is an example of tracking the average annotation time per label at the dataset level.
Note: Grouping is not supported for this metric in the payload, and dimensions must be passed in the context itself.
Code Description
The following code is used to extract the metrics shown in the image above. In the payload, datasetId and userId are optional parameters.
- If neither parameter is provided: The data will be fetched for all labels available in the datasets for the projectId.
- If both parameters are provided: The data will be extracted specifically based on the given parameters.
import dtlpy as dl
import pandas as pd

project = dl.projects.get(project_name='my project')
dataset = project.datasets.get(dataset_name='my dataset')

payload = {
    "startTime": project.created_at,
    "endTime": None,
    "context": {
        "projectId": [project.id],
        "datasetId": [dataset.id],
        "userId": ["e714acd9f43445e73c0a03752454c262e5d43f7a7a97542988b5d874190635af"]
    },
    "measures": [
        {
            "measureType": "avgAnnotationTimePerLabel",
            "sortDirection": "descending"
        }
    ]
}

success, resp = dl.client_api.gen_request(req_type="post",
                                          path="/analytics/query",
                                          json=payload)
samples = resp.json()

# Load the response rows into a DataFrame when data is returned.
if samples[0]['response']:
    data = samples[0]['response']
    df = pd.DataFrame.from_dict(data=data)
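If pandas is not available, the same list of response rows can be written out with the standard library. The following is a rough sketch under the assumption that 'response' is a list of flat dictionaries, as its use with a DataFrame above suggests. rows_to_csv is an illustrative name, and the "label" and "avgTime" fields in the mock rows are hypothetical, since the actual response fields are not listed here.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of flat dicts to CSV text (hypothetical helper)."""
    if not rows:
        return ""
    # Collect column names in first-seen order across all rows.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Mock rows with hypothetical field names and invented values.
mock_rows = [{"label": "cat", "avgTime": 1200}, {"label": "dog", "avgTime": 800}]
print(rows_to_csv(mock_rows))
```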
Item Annotation Duration
Use the following code to extract the item annotation duration at the dataset level. It uses the itemAnnotationDuration measureType.
import dtlpy as dl
import pandas as pd

project = dl.projects.get(project_name='my project')
dataset = project.datasets.get(dataset_name='my dataset')

payload = {
    "startTime": project.created_at,
    "endTime": None,
    "context": {
        "projectId": [project.id],
        "datasetId": [dataset.id],
        "userId": ["e714acd9f43445e73c0a03752454c262e5d43f7a7a97542988b5d874190635af"]
    },
    "measures": [
        {
            "measureType": "itemAnnotationDuration",
            "sortDirection": "descending"
        }
    ]
}

success, resp = dl.client_api.gen_request(req_type="post",
                                          path="/analytics/query",
                                          json=payload)
samples = resp.json()

# Load the response rows into a DataFrame when data is returned.
if samples[0]['response']:
    data = samples[0]['response']
    df = pd.DataFrame.from_dict(data=data)
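The snippets in this section unpack success from gen_request but do not check it before calling resp.json(). A minimal sketch of a wrapper that does is shown below. run_analytics_query is an illustrative name, and the request function is passed in as an argument so the sketch can run without the SDK; the FakeResponse class and fake_gen_request function exist only for the demonstration.

```python
def run_analytics_query(gen_request, payload):
    """Post a payload to /analytics/query and return the parsed JSON (hypothetical helper)."""
    success, resp = gen_request(req_type="post",
                                path="/analytics/query",
                                json=payload)
    if not success:
        # Stop early instead of parsing an error response as data.
        raise RuntimeError("analytics query failed")
    return resp.json()


class FakeResponse:
    """Stand-in for an HTTP response, for demonstration only."""

    def json(self):
        return [{"response": []}]


def fake_gen_request(req_type, path, json):
    """Stand-in for dl.client_api.gen_request, for demonstration only."""
    return True, FakeResponse()


print(run_analytics_query(fake_gen_request, {"measures": []}))
```

With the SDK, the call would presumably look like run_analytics_query(dl.client_api.gen_request, payload).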
Annotation Timeline
The annotationTimeline measureType is used for tracking the annotation timeline.
Below is an example of tracking the annotation timeline at the dataset level.
Note: Grouping is not supported for this metric in the payload, and dimensions must be passed in the context itself.
The following code is used to extract the metrics shown in the image above. In the payload, datasetId and userId are optional parameters.
- If timeGranularity is not provided: By default, it will pick "hour" as the timeGranularity. In the example code below, "hour" and "day" are passed as timeGranularity, and the response will include both hour-level and day-level data.
- If neither datasetId nor userId is provided: The data will be fetched for all items available in the datasets for the projectId.
- If both parameters are provided: The data will be extracted specifically based on the given parameters.
import dtlpy as dl
import pandas as pd

project = dl.projects.get(project_name='my project')
dataset = project.datasets.get(dataset_name='my dataset')

payload = {
    "startTime": project.created_at,
    "endTime": None,
    "context": {
        "projectId": [project.id],
        "datasetId": [dataset.id]
    },
    "measures": [
        {
            "measureType": "annotationTimeline",
            "sortDirection": "descending",
            "timeGranularity": ["hour", "day"]
        }
    ]
}

success, resp = dl.client_api.gen_request(req_type="post",
                                          path="/analytics/query",
                                          json=payload)
samples = resp.json()

# samples[0] holds the hour-level data, samples[1] the day-level data.
if samples[0]['response']:
    hour_data = samples[0]['response']
    hour_df = pd.DataFrame.from_dict(data=hour_data)
# Guard against a response that contains only one granularity.
if len(samples) > 1 and samples[1]['response']:
    day_data = samples[1]['response']
    day_df = pd.DataFrame.from_dict(data=day_data)
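When several granularities are requested, indexing samples by position can be replaced with a lookup by name. The sketch below assumes the results come back in the same order as the timeGranularity list, which is what the samples[0] and samples[1] handling above implies but is not confirmed here. split_by_granularity is an illustrative name, and the fields in the mock rows are hypothetical.

```python
def split_by_granularity(samples, granularities):
    """Map each requested granularity to its response rows (hypothetical helper)."""
    result = {}
    for index, granularity in enumerate(granularities):
        # A missing or empty entry becomes an empty list rather than an error.
        rows = samples[index].get("response") if index < len(samples) else None
        result[granularity] = rows or []
    return result


# Mock samples standing in for resp.json(); fields and values are invented.
mock_samples = [
    {"response": [{"time": "10:00", "count": 3}]},
    {"response": [{"time": "Monday", "count": 40}]},
]
by_granularity = split_by_granularity(mock_samples, ["hour", "day"])
print(by_granularity["hour"])
print(by_granularity["day"])
```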
Item Status Timeline
The itemStatusTimeline measureType is used for tracking the item status timeline.
Below is an example of tracking the Item Status Timeline at the dataset level.
Note: Grouping is not supported for this metric in the payload, and dimensions must be passed in the context itself.
The following code is used to extract the metrics shown in the image above. In the payload, datasetId and userId are optional parameters.
- If timeGranularity is not provided: By default, it will pick "hour" as the timeGranularity. In the example code below, "hour" and "day" are passed as timeGranularity, and the response will include both hour-level and day-level data.
- If neither datasetId nor userId is provided: The data will be fetched for all items available in the datasets for the projectId.
- If both parameters are provided: The data will be extracted specifically based on the given parameters.
import dtlpy as dl
import pandas as pd

project = dl.projects.get(project_name='my project')
dataset = project.datasets.get(dataset_name='my dataset')

payload = {
    "startTime": project.created_at,
    "endTime": None,
    "context": {
        "projectId": [project.id],
        "datasetId": [dataset.id]
    },
    "measures": [
        {
            "measureType": "itemStatusTimeline",
            "sortDirection": "descending",
            "timeGranularity": ["hour", "day"]
        }
    ]
}

success, resp = dl.client_api.gen_request(req_type="post",
                                          path="/analytics/query",
                                          json=payload)
samples = resp.json()

# samples[0] holds the hour-level data, samples[1] the day-level data.
if samples[0]['response']:
    hour_data = samples[0]['response']
    hour_df = pd.DataFrame.from_dict(data=hour_data)
# Guard against a response that contains only one granularity.
if len(samples) > 1 and samples[1]['response']:
    day_data = samples[1]['response']
    day_df = pd.DataFrame.from_dict(data=day_data)
Average Annotation Time Per Annotator
The avgItemAnnotationTimePerAnnotator measureType is used for tracking the average annotation time per annotator.
Below is an example of tracking the Average Annotation Time Per Annotator at the dataset level.
Note: Grouping is not supported for this metric in the payload, and dimensions must be passed in the context itself.
The following code is used to extract the metrics shown in the image above. In the payload, datasetId and userId are optional parameters.
- If neither datasetId nor userId is provided: The data will be fetched for all items available in the datasets for the projectId.
- If both parameters are provided: The data will be extracted specifically based on the given parameters.
import dtlpy as dl
import pandas as pd

project = dl.projects.get(project_name='my project')
dataset = project.datasets.get(dataset_name='my dataset')

payload = {
    "startTime": project.created_at,
    "endTime": None,
    "context": {
        "projectId": [project.id],
        "datasetId": [dataset.id]
    },
    "measures": [
        {
            "measureType": "avgItemAnnotationTimePerAnnotator",
            "sortDirection": "descending"
        }
    ]
}

success, resp = dl.client_api.gen_request(req_type="post",
                                          path="/analytics/query",
                                          json=payload)
samples = resp.json()

# Load the response rows into a DataFrame when data is returned.
if samples[0]['response']:
    data = samples[0]['response']
    df = pd.DataFrame.from_dict(data=data)
Count Items in Annotation Time Bucket
The countItemInAnnotationTimeBucket measureType is used for tracking the count of items in annotation time buckets.
Below is an example of tracking the Count Items in Annotation Time Bucket at the dataset level.
Note: Grouping is not supported for this metric in the payload, and dimensions must be passed in the context itself.
The following code is used to extract the metrics shown in the image above. In the payload, datasetId and userId are optional parameters.
- If timeGranularity is not provided: By default, it will pick "hour" as the timeGranularity. In the example code below, "hour" and "day" are passed as timeGranularity, and the response will include both hour-level and day-level data.
- If neither datasetId nor userId is provided: The data will be fetched for all items available in the datasets for the projectId.
- If both parameters are provided: The data will be extracted specifically based on the given parameters.
import dtlpy as dl
import pandas as pd

project = dl.projects.get(project_name='my project')
dataset = project.datasets.get(dataset_name='my dataset')

payload = {
    "startTime": project.created_at,
    "endTime": None,
    "context": {
        "projectId": [project.id],
        "datasetId": [dataset.id]
    },
    "measures": [
        {
            "measureType": "countItemInAnnotationTimeBucket",
            "sortDirection": "descending",
            "timeGranularity": ["hour", "day"]
        }
    ]
}

success, resp = dl.client_api.gen_request(req_type="post",
                                          path="/analytics/query",
                                          json=payload)
samples = resp.json()

# samples[0] holds the hour-level data, samples[1] the day-level data.
if samples[0]['response']:
    hour_data = samples[0]['response']
    hour_df = pd.DataFrame.from_dict(data=hour_data)
# Guard against a response that contains only one granularity.
if len(samples) > 1 and samples[1]['response']:
    day_data = samples[1]['response']
    day_df = pd.DataFrame.from_dict(data=day_data)