'End-to-End Object Detection with Transformers'

1. Theoretical Background

$$\mathcal{L}_{\text{match}}(y_{i}, \hat{y}_{\sigma(i)}) = -\mathbb{1}_{\{c_{i} \neq \varnothing\}}\,\hat{p}_{\sigma(i)}(c_{i}) + \mathbb{1}_{\{c_{i} \neq \varnothing\}}\,\mathcal{L}_{\text{box}}(b_{i}, \hat{b}_{\sigma(i)})$$

References

[1] https://github.com/gokul-pv/DetectionTransformer
[2] https://github.com/facebookresearch/detr
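
To make the matching cost from the Theoretical Background section concrete, here is a minimal sketch in plain Python. It rests on several assumptions that are not in the text above: the box cost $\mathcal{L}_{\text{box}}$ is simplified to a plain L1 distance, the ground-truth list is padded with a `NO_OBJECT` placeholder standing in for $\varnothing$, and the best permutation $\sigma$ is found by brute force, which is only workable for tiny examples (DETR itself uses the Hungarian algorithm). The names `l1_box_cost`, `match_cost`, and `best_assignment` are hypothetical and do not come from either referenced repository.

```python
from itertools import permutations

NO_OBJECT = None  # placeholder for the empty class in the formula (assumption)


def l1_box_cost(box_a, box_b):
    """Simplified stand-in for L_box: L1 distance between two boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def match_cost(target, pred):
    """Pairwise cost L_match(y_i, y_hat_sigma(i)) for one target and one prediction."""
    if target is NO_OBJECT:
        # Both indicator terms vanish when c_i is the empty class.
        return 0.0
    cls, box = target
    probs, pred_box = pred
    return -probs[cls] + l1_box_cost(box, pred_box)


def best_assignment(targets, preds):
    """Brute-force search for the permutation sigma with the lowest total cost."""
    n = len(targets)
    assert n == len(preds), "targets must be padded to the number of predictions"
    best_sigma, best_total = None, float("inf")
    for sigma in permutations(range(n)):
        total = sum(match_cost(targets[i], preds[sigma[i]]) for i in range(n))
        if total < best_total:
            best_sigma, best_total = sigma, total
    return best_sigma, best_total


# Toy data: two real objects plus one padding slot; boxes are (cx, cy, w, h).
targets = [
    (0, (0.2, 0.2, 0.1, 0.1)),
    (1, (0.7, 0.7, 0.2, 0.2)),
    NO_OBJECT,
]
# Each prediction is (class probabilities, box).
preds = [
    ([0.10, 0.80], (0.7, 0.7, 0.2, 0.2)),
    ([0.40, 0.40], (0.5, 0.5, 0.5, 0.5)),
    ([0.90, 0.05], (0.2, 0.2, 0.1, 0.1)),
]

sigma, total = best_assignment(targets, preds)
print(sigma)  # (2, 0, 1): target 0 -> prediction 2, target 1 -> prediction 0
```

In this toy case the padded target absorbs the one prediction that matches neither object well, at zero cost. This is the role the indicator $\mathbb{1}_{\{c_{i} \neq \varnothing\}}$ plays in the formula.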