Emoji vs Emoji_Presentation #9

Description

@brodieG

AFAICT, in determining whether a code point should be wide, the code (or at least the code that generates the tables) checks whether the code point is Emoji or Emoji_Presentation (not 100% certain):

# https://www.unicode.org/reports/tr51/#def_basic_emoji_set
emoji = ((emoji_props['Emoji'] - emoji_props['Emoji_Component'])
         | emoji_props['Emoji_Presentation'])

But the current tr51 states:

The emoji code points are those with property values Emoji=Yes, Emoji_Component=No, and Emoji_Presentation=Yes.

It seems the | above is effectively doing an "or", though I do not know Python, so I have no idea whether the code is doing what I think it's doing. However, I do see:

utf8::utf8_width(c('\u2139', '\u2728'))
## [1] 2 2

U+2139 is not in the "Emoji Presentation" section of Emoji_data, but U+2728 is.
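
To make the difference between the two readings concrete, here is a minimal Python sketch. It is not the package's table-generation code: the `emoji_props` dictionary below is a tiny hand-picked subset of the Unicode emoji properties, and the names `union_rule` and `strict_rule` are made up for illustration. In Python, `-` on sets is set difference, `|` is union, and `&` is intersection.

```python
# Hand-picked subset of emoji property data (illustrative only, not the
# real tables): U+0030 DIGIT ZERO, U+2139 INFORMATION SOURCE,
# U+2728 SPARKLES, U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A.
emoji_props = {
    'Emoji': {0x0030, 0x2139, 0x2728, 0x1F1E6},
    'Emoji_Component': {0x0030, 0x1F1E6},
    'Emoji_Presentation': {0x2728, 0x1F1E6},
}

# The expression quoted above: (Emoji minus Emoji_Component), unioned
# with Emoji_Presentation.
union_rule = ((emoji_props['Emoji'] - emoji_props['Emoji_Component'])
              | emoji_props['Emoji_Presentation'])

# The reading of tr51 quoted above: Emoji=Yes, Emoji_Component=No,
# and Emoji_Presentation=Yes, i.e. an intersection instead of a union.
strict_rule = ((emoji_props['Emoji'] - emoji_props['Emoji_Component'])
               & emoji_props['Emoji_Presentation'])

# Code points that only the union-based rule would treat as emoji.
only_in_union = union_rule - strict_rule

for name, cps in [('union_rule', union_rule),
                  ('strict_rule', strict_rule),
                  ('only_in_union', only_in_union)]:
    print(name, sorted('U+%04X' % cp for cp in cps))
```

With this toy data, U+2139 ends up in `union_rule` but not in `strict_rule`, which would be consistent with it being reported as 2-wide under the union-based expression.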

On my system (macOS Mojave, Terminal) this is what I see:

image

I don't pretend that my system is the be-all and end-all in terms of the correct display computation, but it appears to behave as per tr51.

There is additional ambiguity: some emoji with text presentation actually render with a wide-ish glyph:

image

Though clearly the terminal treats it as 1-wide (FWIW, until recently the terminal also treated normal emoji as 1-wide...).
