Hi!
I am super interested in the pointing functionality but haven't seen anyone ask about this detail:
When you resize the images in the preprocessor, do you also rescale the point coordinates accordingly? That seems like the right approach, but the code runs the formatter before resizing the images:
First, do data formatting:
molmo/olmo/data/model_preprocessor.py, line 836 in 793fa38
messages, formatter_metadata = self.formater(example, self.is_training, self.for_inference, rng)
Then, call the multimodal preprocessor:
molmo/olmo/data/model_preprocessor.py, line 841 in 793fa38
batch = self.mm_preprocessor(
So, unless I'm missing some details, it looks like the points' coordinates are kept at their original values and serialized into text.
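To make the question concrete, here is a minimal sketch of the two behaviours I have in mind. The helper names `rescale_points` and `normalize_points` are hypothetical and are not taken from the molmo code. The sketch also assumes a plain stretch resize with no padding or cropping, which may not match what the preprocessor actually does.

```python
def rescale_points(points, orig_size, new_size):
    """Map (x, y) pixel coordinates from the original image size to the resized one.

    Hypothetical helper: assumes a plain stretch resize (no padding, no cropping).
    """
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    return [(x * new_w / orig_w, y * new_h / orig_h) for x, y in points]


def normalize_points(points, size, scale=100.0):
    """Express (x, y) pixel coordinates relative to image width and height.

    Hypothetical helper: with scale=100.0 the result is a percentage of the
    image size, which does not change when the image is resized.
    """
    w, h = size
    return [(x / w * scale, y / h * scale) for x, y in points]


# A point at the center of a 640x480 image (pixel coordinates).
points = [(320.0, 240.0)]

# Option 1: rescale pixel coordinates together with the image.
resized_points = rescale_points(points, (640, 480), (336, 336))

# Option 2: normalize before serializing, so the resize has no effect on the text.
before_resize = normalize_points(points, (640, 480))
after_resize = normalize_points(resized_points, (336, 336))
```

If the formatter already serializes normalized coordinates (option 2), running it before the resize would be harmless. If it serializes raw pixel coordinates, something like option 1 would seem to be needed after the resize.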
Can you share some more insights on this? Thanks a lot!