Result Object Structure + Examples
The predict and predict_batch functions of the AIServerModel and CloudServerModel classes return a result object that contains the inference results.
For example, the result can be structured like this:
{
"result": [
[
{ "category_id": 1, "label": "foo", "score": 0.2 },
{ "category_id": 0, "label": "bar", "score": 0.1 }
],
"frame123"
],
"imageFrame": imageBitmap
}
Accessing the Result Data
- Inference Results: Access the main results using someResult.result[0].
- Frame Info / Number: Get the frame information or frame number using someResult.result[1].
- Original Input Image: Access the original input image with someResult.imageFrame.
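The accessors above can be combined with array destructuring. The following is a minimal sketch: the someResult literal is a hand-written stand-in that mirrors the example structure (a real one would come from predict or predict_batch), and unpackResult is a hypothetical helper, not part of the SDK.

```javascript
// Hand-written stand-in for a result object; imageFrame would normally
// hold the original input image rather than null.
const someResult = {
  result: [
    [
      { category_id: 1, label: "foo", score: 0.2 },
      { category_id: 0, label: "bar", score: 0.1 },
    ],
    "frame123",
  ],
  imageFrame: null,
};

// Hypothetical helper: splits a result object into its three parts.
function unpackResult(res) {
  const [inferenceResults, frameInfo] = res.result;
  return { inferenceResults, frameInfo, image: res.imageFrame };
}

const { inferenceResults, frameInfo, image } = unpackResult(someResult);
console.log(frameInfo, inferenceResults.map((r) => r.label));
```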
Inference Result Types
The inference results can be one of the following types:
- Detection: Contains bbox, category_id, label, score, optional landmarks and mask for instance segmentation and pose estimation.
- Classification: Contains label and score for single-label classification.
- Multi-Label Classification: Contains classifier name and a results array with multiple { label, score } entries.
- Segmentation Mask: Contains shape and data, where data is a flat array (or Uint8Array) representing class IDs per pixel.
- Pose Detection: Contains landmarks array with each landmark having category_id, landmark coordinates, optional score, label, and connectivity information.
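Because the types differ in which fields they carry, one way to tell them apart is to check for those fields. The sketch below is a heuristic built only from the field lists above. guessResultKind is a hypothetical helper, not an SDK function, and the SDK may offer a more reliable way to determine the model type.

```javascript
// Hypothetical helper: guesses the result type of a single entry from
// the fields it carries. Order matters: the most specific checks come first.
function guessResultKind(entry) {
  if ("classifier" in entry && Array.isArray(entry.results)) {
    return "multi-label classification";
  }
  if ("shape" in entry && "data" in entry) return "segmentation mask";
  if (Array.isArray(entry.landmarks)) return "pose detection";
  if ("bbox" in entry) return "detection";
  if ("label" in entry && "score" in entry) return "classification";
  return "unknown";
}

// Hand-written sample entries, one per kind.
const kinds = [
  { category_id: 2, label: "person", score: 0.98, bbox: [0.1, 0.2, 0.4, 0.8] },
  { label: "cat", score: 0.9 },
  { classifier: "scene_tags", results: [{ label: "beach", score: 0.85 }] },
  { shape: [1, 2], data: new Uint8Array([0, 1]) },
].map(guessResultKind);

console.log(kinds);
```

Note that a detection entry carrying the optional landmarks field is reported as pose detection by this heuristic.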
Example Results
1. Detection Result
Detection results include bounding boxes (bbox) along with category IDs, labels, and confidence scores. Optionally, masks and landmarks may be included for segmentation and pose.
[
{
"category_id": 2,
"label": "person",
"score": 0.98,
"bbox": [0.1, 0.2, 0.4, 0.8]
},
{
"category_id": 3,
"label": "dog",
"score": 0.87,
"bbox": [0.5, 0.3, 0.9, 0.7],
"mask": {
"width": 128,
"height": 128,
"data": "<RLE string>"
}
}
]
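A common next step with detection results is to keep only confident detections. The sketch below works on a hand-written array shaped like the example above. filterDetections and the 0.9 threshold are illustrative choices, not SDK features.

```javascript
// Hand-written detections mirroring the example structure.
const detections = [
  { category_id: 2, label: "person", score: 0.98, bbox: [0.1, 0.2, 0.4, 0.8] },
  { category_id: 3, label: "dog", score: 0.87, bbox: [0.5, 0.3, 0.9, 0.7] },
];

// Hypothetical helper: keeps detections at or above minScore,
// sorted from most to least confident. The input array is not modified.
function filterDetections(items, minScore) {
  return items
    .filter((d) => d.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

const confident = filterDetections(detections, 0.9);
console.log(confident.map((d) => d.label));
```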
2. Classification Result
Single-label classification results contain a label, score, and category_id.
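As an illustration, the sketch below picks the highest-scoring entry from a classification result. The sample values are invented for this example and assume the result is an array of entries with the fields named above. topClass is a hypothetical helper.

```javascript
// Invented sample data using the documented fields.
const classification = [
  { category_id: 1, label: "dog", score: 0.09 },
  { category_id: 0, label: "cat", score: 0.91 },
];

// Hypothetical helper: returns the entry with the highest score,
// or null when the array is empty.
function topClass(entries) {
  if (entries.length === 0) return null;
  return entries.reduce((best, e) => (e.score > best.score ? e : best));
}

const best = topClass(classification);
console.log(best.label, best.score);
```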
3. Multi-Label Classification Result
Multi-label classification results include a classifier name and an array of label-score pairs.
[
{
"classifier": "scene_tags",
"results": [
{ "label": "beach", "score": 0.85 },
{ "label": "sunset", "score": 0.65 }
]
}
]
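With multi-label results, each classifier usually needs its own thresholding. The following sketch collects, per classifier, the labels whose score reaches a threshold. labelsAbove and the 0.7 threshold are assumptions made for illustration.

```javascript
// Hand-written result mirroring the example structure.
const multiLabel = [
  {
    classifier: "scene_tags",
    results: [
      { label: "beach", score: 0.85 },
      { label: "sunset", score: 0.65 },
    ],
  },
];

// Hypothetical helper: maps each classifier name to the labels
// whose score is at or above the threshold.
function labelsAbove(entries, threshold) {
  const out = {};
  for (const { classifier, results } of entries) {
    out[classifier] = results
      .filter((r) => r.score >= threshold)
      .map((r) => r.label);
  }
  return out;
}

const tags = labelsAbove(multiLabel, 0.7);
console.log(tags);
```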
4. Segmentation Mask Result
Semantic segmentation results provide a shape and a data array, which is typically returned as a Uint8Array.
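As a sketch of how such a mask might be consumed, the code below counts how many pixels belong to each class ID. The sample mask is invented, and the source does not state the order of the dimensions in shape, so the code only assumes that the product of the shape entries equals the length of data.

```javascript
// Invented sample mask: 6 pixels with class IDs 0, 1 and 2.
const mask = {
  shape: [2, 3], // dimension order is an assumption, unused below
  data: new Uint8Array([0, 0, 1, 1, 1, 2]),
};

// Hypothetical helper: returns a Map from class ID to pixel count.
function countClassPixels({ shape, data }) {
  const expected = shape.reduce((a, b) => a * b, 1);
  if (expected !== data.length) {
    // Assumption: data holds exactly one class ID per element of shape.
    throw new Error(`shape implies ${expected} values, got ${data.length}`);
  }
  const counts = new Map();
  for (const classId of data) {
    counts.set(classId, (counts.get(classId) ?? 0) + 1);
  }
  return counts;
}

const pixelCounts = countClassPixels(mask);
console.log([...pixelCounts.entries()]);
```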
5. Pose Detection Result
Pose detection results include landmarks for each detected person, where each landmark has coordinates, optional score and label, and connectivity information.
[
{
"category_id": 0,
"label": "person",
"score": 0.98,
"bbox": [0.1, 0.2, 0.4, 0.8],
"landmarks": [
{
"category_id": 0,
"landmark": [0.15, 0.25],
"connect": [1],
"label": "nose",
"score": 0.99
},
{
"category_id": 1,
"landmark": [0.15, 0.35],
"connect": [0],
"label": "left_eye",
"score": 0.97
}
]
}
]
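The connectivity information can be turned into line segments for drawing a skeleton. The sketch below assumes that each value in connect refers to the category_id of another landmark in the same entry, which matches the example above but is not stated explicitly. skeletonEdges is a hypothetical helper.

```javascript
// Hand-written pose entry mirroring the example structure.
const pose = {
  category_id: 0,
  label: "person",
  score: 0.98,
  bbox: [0.1, 0.2, 0.4, 0.8],
  landmarks: [
    { category_id: 0, landmark: [0.15, 0.25], connect: [1], label: "nose", score: 0.99 },
    { category_id: 1, landmark: [0.15, 0.35], connect: [0], label: "left_eye", score: 0.97 },
  ],
};

// Hypothetical helper: builds one edge per connected landmark pair.
// A pair listed in both directions is emitted only once.
function skeletonEdges(entry) {
  const byId = new Map(entry.landmarks.map((lm) => [lm.category_id, lm]));
  const seen = new Set();
  const edges = [];
  for (const lm of entry.landmarks) {
    for (const targetId of lm.connect ?? []) {
      const target = byId.get(targetId);
      if (!target) continue; // skip links to landmarks that are absent
      const key = [lm.category_id, targetId].sort((a, b) => a - b).join("-");
      if (seen.has(key)) continue;
      seen.add(key);
      edges.push({ from: lm.label, to: target.label, points: [lm.landmark, target.landmark] });
    }
  }
  return edges;
}

const edges = skeletonEdges(pose);
console.log(edges);
```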