Result Object Structure + Examples
The AIServerModel and CloudServerModel classes return a result object that contains the inference results from the predict and predict_batch functions.
For example, the result can be structured like this:
{
"result": [
[
{ "category_id": 1, "label": "foo", "score": 0.2 },
{ "category_id": 0, "label": "bar", "score": 0.1 }
],
"frame123"
],
"imageFrame": imageBitmap
}
Accessing the Result Data
- Inference Results: Access the main results using someResult.result[0].
- Frame Info / Number: Get the frame information or frame number using someResult.result[1].
- Original Input Image: Access the original input image with someResult.imageFrame.
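The access pattern above can be sketched as follows. This is a minimal illustration, not SDK code: someResult is built by hand to mirror the structure shown earlier, the string "imageBitmap" stands in for the real image object, and unpackResult is a hypothetical helper name.

```javascript
// Hand-built stand-in for a result object; in real use it would come
// from predict or predict_batch.
const someResult = {
  result: [
    [
      { category_id: 1, label: "foo", score: 0.2 },
      { category_id: 0, label: "bar", score: 0.1 },
    ],
    "frame123",
  ],
  imageFrame: "imageBitmap", // placeholder for the original input image
};

// Hypothetical helper: gives the three parts of the result readable names.
function unpackResult(res) {
  const [inferenceResults, frameInfo] = res.result;
  return { inferenceResults, frameInfo, image: res.imageFrame };
}

const { inferenceResults, frameInfo } = unpackResult(someResult);
for (const item of inferenceResults) {
  console.log(`${frameInfo}: ${item.label} (${item.score})`);
}
```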
Inference Result Types
The inference results can be one of the following types:
- Detection: Contains bbox, category_id, label, score, optional landmarks and mask for instance segmentation and pose estimation.
- Classification: Contains label and score for single-label classification.
- Multi-Label Classification: Contains classifier name and a results array with multiple { label, score } entries.
- Segmentation Mask: Contains shape and data, where data is a flat array (or Uint8Array) representing class IDs per pixel.
- Pose Detection: Contains landmarks array with each landmark having category_id, landmark coordinates, optional score, label, and connectivity information.
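Because the result types differ by which fields are present, one way to tell them apart is to inspect the first entry. The sketch below is only a heuristic based on the field lists above; guessResultKind is a hypothetical helper, and it is an assumption that a segmentation mask may arrive either as a bare object or wrapped in an array.

```javascript
// Heuristic sketch: infer the result kind from the fields that are present.
// Not part of the SDK; field names are taken from the list above.
function guessResultKind(results) {
  const first = Array.isArray(results) ? results[0] : results;
  if (first === null || typeof first !== "object") return "unknown";
  if ("shape" in first && "data" in first) return "segmentation";
  if ("classifier" in first && Array.isArray(first.results)) {
    return "multi-label classification";
  }
  // Pose entries also carry bbox, so landmarks must be checked first.
  if (Array.isArray(first.landmarks)) return "pose detection";
  if ("bbox" in first) return "detection";
  if ("label" in first && "score" in first) return "classification";
  return "unknown";
}

console.log(guessResultKind([{ label: "foo", score: 0.2 }]));
```

Checking landmarks before bbox matters here, since a pose entry would otherwise be reported as a plain detection.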
Example Results
1. Detection Result
Detection results include bounding boxes (bbox) along with category IDs, labels, and confidence scores. Optionally, masks and landmarks may be included for segmentation and pose.
[
{
"category_id": 2,
"label": "person",
"score": 0.98,
"bbox": [0.1, 0.2, 0.4, 0.8]
},
{
"category_id": 3,
"label": "dog",
"score": 0.87,
"bbox": [0.5, 0.3, 0.9, 0.7],
"mask": {
"width": 128,
"height": 128,
"data": "<RLE string>"
}
}
]
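A common first step with detection results is to drop low-confidence entries. The sketch below reuses the example values above; filterDetections and the 0.9 threshold are illustrative choices, and the bbox values are passed through untouched because the coordinate convention is not stated here.

```javascript
// Detection entries copied from the example above.
const detections = [
  { category_id: 2, label: "person", score: 0.98, bbox: [0.1, 0.2, 0.4, 0.8] },
  {
    category_id: 3,
    label: "dog",
    score: 0.87,
    bbox: [0.5, 0.3, 0.9, 0.7],
    mask: { width: 128, height: 128, data: "<RLE string>" },
  },
];

// Hypothetical helper: keep detections at or above minScore, best first.
function filterDetections(dets, minScore) {
  return dets
    .filter((d) => d.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

for (const d of filterDetections(detections, 0.9)) {
  const maskNote = "mask" in d ? "with mask" : "no mask";
  console.log(`${d.label} ${d.score} [${d.bbox.join(", ")}] ${maskNote}`);
}
```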
2. Classification Result
Single-label classification results contain a label, score, and category_id.
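As a rough illustration, picking the best class could look like the sketch below. The values are invented for the example, and it is an assumption that classification entries arrive as an array like the other result types; topClassification is a hypothetical helper.

```javascript
// Hypothetical classification results; the values are illustrative only.
const classifications = [
  { category_id: 3, label: "dog", score: 0.06 },
  { category_id: 7, label: "cat", score: 0.91 },
];

// Return the entry with the highest score, or null for an empty list.
function topClassification(results) {
  if (results.length === 0) return null;
  return results.reduce((best, r) => (r.score > best.score ? r : best));
}

const best = topClassification(classifications);
console.log(`${best.label} (id ${best.category_id}): ${best.score}`);
```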
3. Multi-Label Classification Result
Multi-label classification results include a classifier name and an array of label-score pairs.
[
{
"classifier": "scene_tags",
"results": [
{ "label": "beach", "score": 0.85 },
{ "label": "sunset", "score": 0.65 }
]
}
]
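To turn a multi-label result into a list of tags, the label-score pairs can be thresholded per classifier. This sketch uses the example values above; labelsAbove and the 0.7 cutoff are illustrative assumptions.

```javascript
// Multi-label entries copied from the example above.
const multiLabel = [
  {
    classifier: "scene_tags",
    results: [
      { label: "beach", score: 0.85 },
      { label: "sunset", score: 0.65 },
    ],
  },
];

// Hypothetical helper: map each classifier name to the labels whose
// score is at or above minScore.
function labelsAbove(entries, minScore) {
  const out = {};
  for (const entry of entries) {
    out[entry.classifier] = entry.results
      .filter((r) => r.score >= minScore)
      .map((r) => r.label);
  }
  return out;
}

console.log(labelsAbove(multiLabel, 0.7)); // { scene_tags: [ 'beach' ] }
```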
4. Segmentation Mask Result
Semantic segmentation results provide a shape and a data array, which is typically returned as a Uint8Array.
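Since data holds one class ID per pixel, a per-class pixel count is a simple way to summarize a mask. The sketch below uses a tiny hand-made mask; it assumes the product of the shape entries equals the length of data, which is not confirmed here, and countPixelsPerClass is a hypothetical helper.

```javascript
// Hand-made 2 x 3 mask for illustration; real masks are much larger.
const mask = { shape: [2, 3], data: new Uint8Array([0, 0, 1, 1, 1, 2]) };

// Count how many pixels belong to each class ID.
function countPixelsPerClass(m) {
  // Assumption: shape describes exactly the pixels stored in data.
  const expected = m.shape.reduce((a, b) => a * b, 1);
  if (m.data.length !== expected) {
    throw new Error(`shape implies ${expected} pixels, got ${m.data.length}`);
  }
  const counts = new Map();
  for (const classId of m.data) {
    counts.set(classId, (counts.get(classId) ?? 0) + 1);
  }
  return counts;
}

for (const [classId, n] of countPixelsPerClass(mask)) {
  console.log(`class ${classId}: ${n} pixels`);
}
```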
5. Pose Detection Result
Pose detection results include landmarks for each detected person, where each landmark has coordinates, optional score and label, and connectivity information.
[
{
"category_id": 0,
"label": "person",
"score": 0.98,
"bbox": [0.1, 0.2, 0.4, 0.8],
"landmarks": [
{
"category_id": 0,
"landmark": [0.15, 0.25],
"connect": [1],
"label": "nose",
"score": 0.99
},
{
"category_id": 1,
"landmark": [0.15, 0.35],
"connect": [0],
"label": "left_eye",
"score": 0.97
}
]
}
]
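The connectivity information can be used to derive the line segments of a skeleton. The sketch below assumes that each entry in connect is the category_id of another landmark in the same landmarks array, which the example above suggests but does not state; skeletonEdges is a hypothetical helper.

```javascript
// Pose entry copied from the example above.
const poses = [
  {
    category_id: 0,
    label: "person",
    score: 0.98,
    bbox: [0.1, 0.2, 0.4, 0.8],
    landmarks: [
      { category_id: 0, landmark: [0.15, 0.25], connect: [1], label: "nose", score: 0.99 },
      { category_id: 1, landmark: [0.15, 0.35], connect: [0], label: "left_eye", score: 0.97 },
    ],
  },
];

// Build a de-duplicated list of edges between connected landmarks.
function skeletonEdges(landmarks) {
  const byId = new Map(landmarks.map((lm) => [lm.category_id, lm]));
  const seen = new Set();
  const edges = [];
  for (const lm of landmarks) {
    for (const otherId of lm.connect ?? []) {
      const other = byId.get(otherId);
      if (!other) continue; // connected landmark is missing from this pose
      // Order the pair so A-B and B-A count as the same edge.
      const key = [lm.category_id, otherId].sort((a, b) => a - b).join("-");
      if (seen.has(key)) continue;
      seen.add(key);
      edges.push({
        from: lm.label ?? lm.category_id, // label is optional
        to: other.label ?? other.category_id,
        points: [lm.landmark, other.landmark],
      });
    }
  }
  return edges;
}

for (const pose of poses) {
  console.log(pose.label, skeletonEdges(pose.landmarks));
}
```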