TMKL1 generates a different L1 feature vector compared to ThreatExchange TMK native #20

@warbieandyama

I am trying to use the Python library provided for computing TMK, and I noticed that the resulting vector values differ substantially from the TMK L1 features computed by the C++ implementation. Is this expected?

For more details:
hasher = hashers.TMKL1(frame_hasher=PDQHashF())
hash1 = hasher.compute(filepath='../chair-22-sd-grey-bar.mp4', hash_format="vector")
The videos are from here: https://github.com/facebook/ThreatExchange/tree/main/tmk/sample-videos

When I compare this with the normalized feature generated by ThreatExchange TMK, the cosine distance between the two vectors is 1.17 (cosine similarity = -0.17).
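
For reference, the two numbers above are related by cosine distance = 1 - cosine similarity. The following is a minimal sketch of that comparison in plain Python. The function names `cosine_similarity` and `cosine_distance` are hypothetical helpers written for illustration only; they are not part of the Python library or of ThreatExchange, and the sketch assumes both feature vectors have already been flattened to the same length.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of numbers."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """Cosine distance, defined here as 1 - cosine similarity (range 0 to 2)."""
    return 1.0 - cosine_similarity(a, b)
```

Under this definition, a distance above 1 means the similarity is negative, so the two vectors point in partly opposite directions rather than merely differing in scale.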

I am trying to understand whether this is expected, or whether it is because I am not using the correct parameters. Thank you!
