layout | category | title | date |
---|---|---|---|
post | deep_learning | Notes On YOLO | 2015-12-14 |
ground-truth:
49 x (1 + 20 + 4) =>
49 x (1 x obj_gt + 20 x classes_gt + 4 x box_gt)
predict-data:
49 x 20 + 49 x (1 x 2) + 49 x (4 x 2) =>
49 x (20 x classes) + 49 x (2 x obj_confidence) + 49 x (2 x predict_boxes)
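
The sizes above can be checked with a little arithmetic. The following is a minimal sketch, assuming the prediction tensor is laid out in the order the terms are listed (all class scores, then all confidences, then all boxes) and the ground truth is laid out per cell as objectness, classes, box. The constants and the `*_offset` helper names are hypothetical and exist only for illustration.

```c
/* Hypothetical constants matching the sizes in the notes:
 * 7 x 7 = 49 grid cells, 20 classes, 2 boxes per cell, 4 coordinates per box. */
enum { SIDE = 7, LOCATIONS = SIDE * SIDE, CLASSES = 20, NUM_BOXES = 2, COORDS = 4 };

/* Floats per image in the ground truth: 49 x (1 + 20 + 4). */
int truth_size(void)
{
    return LOCATIONS * (1 + CLASSES + COORDS);
}

/* Floats per image in the prediction: 49 x 20 + 49 x 2 + 49 x (2 x 4). */
int output_size(void)
{
    return LOCATIONS * CLASSES + LOCATIONS * NUM_BOXES + LOCATIONS * NUM_BOXES * COORDS;
}

/* Ground truth: start of the record (objectness flag) for a cell. */
int truth_offset(int cell)
{
    return cell * (1 + CLASSES + COORDS);
}

/* Prediction, first segment: start of the class scores for a cell. */
int class_offset(int cell)
{
    return cell * CLASSES;
}

/* Prediction, second segment: confidence of one box in a cell. */
int conf_offset(int cell, int box)
{
    return LOCATIONS * CLASSES + cell * NUM_BOXES + box;
}

/* Prediction, third segment: start of the 4 coordinates of one box in a cell. */
int box_offset(int cell, int box)
{
    return LOCATIONS * (CLASSES + NUM_BOXES) + (cell * NUM_BOXES + box) * COORDS;
}
```

Under these assumptions the ground truth holds 1225 floats per image and the prediction holds 1470, and the last box of the last cell ends exactly at the end of the prediction tensor.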
(1) The multi-part loss function in the paper differs from the code implementation:
- $$ \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathcal{1}_{ij}^{obj} \left[ (\sqrt{w_i} - \sqrt{\hat{w}_i})^2 + (\sqrt{h_i} - \sqrt{\hat{h}_i})^2 \right] $$
whereas forward_detection_layer() in detection_layer.c computes this part of the loss differently:
{% highlight cpp %}
*(l.cost) += pow(1-iou, 2);
{% endhighlight %}
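
To make the `pow(1-iou, 2)` term concrete, here is a minimal sketch of how an IoU-based box cost could be computed for one predicted box and one ground-truth box. The `box_t` type and the `overlap_1d`, `iou`, and `iou_cost` names are hypothetical and are not taken from detection_layer.c; boxes are assumed to be given as centre coordinates plus width and height.

```c
/* Hypothetical box type: centre (x, y), width w, height h. */
typedef struct { float x, y, w, h; } box_t;

/* Length of the overlap of two 1-D intervals given as centre and width.
 * A non-positive result means the intervals do not overlap. */
static float overlap_1d(float c1, float w1, float c2, float w2)
{
    float l1 = c1 - w1 / 2, l2 = c2 - w2 / 2;
    float r1 = c1 + w1 / 2, r2 = c2 + w2 / 2;
    float left  = l1 > l2 ? l1 : l2;
    float right = r1 < r2 ? r1 : r2;
    return right - left;
}

/* Intersection over union of two boxes, in [0, 1]. */
float iou(box_t a, box_t b)
{
    float w = overlap_1d(a.x, a.w, b.x, b.w);
    float h = overlap_1d(a.y, a.h, b.y, b.h);
    float inter, uni;
    if (w <= 0 || h <= 0) return 0.0f;
    inter = w * h;
    uni = a.w * a.h + b.w * b.h - inter;
    return inter / uni;
}

/* Box cost in the style of the snippet above: (1 - IoU)^2. */
float iou_cost(box_t pred, box_t truth)
{
    float d = 1.0f - iou(pred, truth);
    return d * d;
}
```

With this sketch, a prediction that matches the ground truth exactly costs 0, and a prediction that does not overlap it at all costs 1, regardless of how far apart the widths and heights are. The square-root term in the paper, by contrast, depends only on the width and height differences.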
- $$ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathcal{1}_{ij}^{obj} (C_i - \hat{C}_i)^2 $$
{% highlight cpp %}
*(l.cost) -= l.noobject_scale * pow(l.output[p_index], 2);
*(l.cost) += l.object_scale * pow(1-l.output[p_index], 2);
{% endhighlight %}
- $$ \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathcal{1}_{ij}^{noobj} (C_i - \hat{C}_i)^2 $$
{% highlight cpp %}
*(l.cost) += l.noobject_scale * pow(l.output[p_index], 2);
{% endhighlight %}
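
One way to read the two confidence snippets together is sketched below for a single grid cell. It assumes the no-object penalty is first added for every predictor, and that the penalty is then undone and replaced by the object term for the one predictor responsible for the ground-truth box. The `confidence_cost` helper and its parameters are hypothetical, and the confidence target of 1 mirrors the snippet rather than the paper.

```c
/* Hypothetical helper: confidence cost for one grid cell.
 * conf  - predicted confidences of the n boxes in this cell
 * best  - index of the responsible predictor, or -1 if the cell has no object */
float confidence_cost(const float *conf, int n, int best,
                      float object_scale, float noobject_scale)
{
    float cost = 0.0f;
    int j;

    /* Every predictor is first penalised as if it saw no object (target 0). */
    for (j = 0; j < n; ++j) {
        cost += noobject_scale * conf[j] * conf[j];
    }

    if (best >= 0) {
        float c = conf[best];
        /* Undo the no-object penalty for the responsible predictor ... */
        cost -= noobject_scale * c * c;
        /* ... and penalise its distance from a confidence of 1 instead. */
        cost += object_scale * (1.0f - c) * (1.0f - c);
    }
    return cost;
}
```

Read this way, the `-=` line is not a negative loss term: it only cancels the no-object contribution that was already added for the same predictor.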
- $$ \sum_{i=0}^{S^2} \mathcal{1}_{i}^{obj} \sum_{c \in classes} (p_i(c) - \hat{p}_i(c))^2 $$
{% highlight cpp %}
*(l.cost) += l.class_scale * pow(state.truth[truth_index+1+j] - l.output[class_index+j], 2);
{% endhighlight %}
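
The class term is the closest match between paper and code. The only visible difference is the extra `l.class_scale` weight, which the formula does not have. A minimal sketch for one grid cell that contains an object, assuming the snippet runs once per class index `j` (the `class_cost` helper and its parameter names are hypothetical):

```c
/* Hypothetical helper: class-probability cost for one grid cell that contains
 * an object. truth_classes and pred_classes each hold `classes` values. */
float class_cost(const float *truth_classes, const float *pred_classes,
                 int classes, float class_scale)
{
    float cost = 0.0f;
    int j;
    for (j = 0; j < classes; ++j) {
        float d = truth_classes[j] - pred_classes[j];
        cost += class_scale * d * d;  /* weighted squared error per class */
    }
    return cost;
}
```

With `class_scale` set to 1 this reduces to the unweighted sum of squared errors in the formula.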