
Tags: arp242/uni

v2.9.0

arp242 Martin Tournoij
Release 2.9.0

v2.8.0

Release 2.8.0

v2.7.0

Version v2.7.0

- Improve `-format` flag:

  - Add `%name` as an alias for `%(name l:auto)`; this is a lot less typing and
    requires less shell quoting, and >90% of the time this is what you want.

  - Automatically prepend character, codepoint, and name if the format flag
    starts with `+`; for example:

        % uni identify -f +'%unicode %plane' a
                     Name                 Unicode Plane
        'a'  U+0061  LATIN SMALL LETTER A 1.1     Basic Multilingual Plane

  This should make it much faster to print a single property.

- Align and colourize JSON output.

- Update CLDR information, adding significantly more aliases for emojis.

- Add `cells` column, which returns how many cells wide a codepoint is when
  displayed (0, 1, or 2).

- Add `aliases` column, which lists the alias names. Also add this to the
  default output:

      % uni s factorial
           CPoint  Dec    UTF8        HTML       Name  Aliases
      '!'  U+0021  33     21          &excl;     EXCLAMATION MARK [factorial, bang]

- Add `refs` column, which lists other related/similar codepoints:

      % uni p -q U+46 -f '%(name): %(refs)'
      LATIN CAPITAL LETTER F: U+2109, U+2131, U+2132

      % uni p -q U+46 -f '%(refs)' | uni p
           CPoint  Dec    UTF8        HTML       Name  Aliases
      '℉'  U+2109  8457   e2 84 89    &#x2109;   DEGREE FAHRENHEIT
      'ℱ'  U+2131  8497   e2 84 b1    &Fscr;     SCRIPT CAPITAL F [Fourier transform]
      'Ⅎ'  U+2132  8498   e2 84 b2    &#x2132;   TURNED CAPITAL F [Claudian digamma inversum]

- Allow arguments to `print` to start or end with a comma or slash. This comes up
  when copy/pasting some list of codepoints from another source; there's no real
  reason to error out on this.

- Allow listing Unicode versions with `uni list unicode` and planes with
  `uni list planes`.

- `uni list` without arguments now errors instead of listing everything.

- Add `h` format flag to not print the header for this column.
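
The `%name` shorthand described above can be pictured as a simple text rewrite. The Python sketch below is only an illustration of that idea, not how uni implements it: `expand_aliases` is a hypothetical helper, and it assumes that any bare `%word` stands for `%(word l:auto)` while the parenthesized form is left alone.

```python
import re

# Hypothetical helper, not part of uni: matches a bare "%word" placeholder.
# "%(word ...)" is not matched, because "(" is not a word character.
_BARE_PLACEHOLDER = re.compile(r"%(\w+)")


def expand_aliases(fmt: str) -> str:
    """Rewrite every bare %word as %(word l:auto); leave %(...) untouched."""
    return _BARE_PLACEHOLDER.sub(r"%(\1 l:auto)", fmt)


if __name__ == "__main__":
    print(expand_aliases("%unicode %plane"))
    print(expand_aliases("%(name): %(refs)"))
```

Under that assumption, the short form `%unicode %plane` and the long form `%(unicode l:auto) %(plane l:auto)` describe the same format string.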

v2.6.0

Release 2.6.0

- Update to Unicode 15.1.

- Add "script" property – also supported in the list and print commands:

      % uni identify -f '%(script l:auto) %(cpoint) %(name)' 'a Ω'
      Script CPoint Name
      Latin  U+0061 LATIN SMALL LETTER A
      Common U+0020 SPACE
      Greek  U+03A9 GREEK CAPITAL LETTER OMEGA

      % uni list scripts
      Scripts:
      Name                    Assigned
      Adlam                         83
      Ahom                          54
      Anatolian Hieroglyphs        582
      …

      % uni print 'script:linear a'
      Showing script Linear A
           CPoint  Dec    UTF8        HTML       Name (Cat)
      '𐘀'  U+10600 67072  f0 90 98 80 &#x10600;  LINEAR A SIGN AB001 (Other_Letter)
      '𐘁'  U+10601 67073  f0 90 98 81 &#x10601;  LINEAR A SIGN AB002 (Other_Letter)
      '𐘂'  U+10602 67074  f0 90 98 82 &#x10602;  LINEAR A SIGN AB003 (Other_Letter)
      …

- Add "unicode" property, which tells you in which Unicode version a codepoint
  was introduced:

      % uni identify -f '%(unicode l:auto) %(cpoint l:auto) %(name)' a𐘂🫁
      Unicode CPoint  Name
      1.1     U+0061  LATIN SMALL LETTER A
      7.0     U+10602 LINEAR A SIGN AB003
      13.0    U+1FAC1 LUNGS

- Show unprintable control characters as the open box (␣, U+2423) instead of the
  replacement character (�, U+FFFD). It already did that for C1 control
  characters, and U+FFFD looked more like a bug than intentional. The -raw/-r
  flag still overrides this.

- Always print Private Use characters as-is for %(char) instead of using the
  U+FFFD replacement character. It's usually safe to print these, and having to
  use -raw is confusing.

- The `ls` command is now an alias for `list`.
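
The display rule for control and Private Use characters described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not uni's actual code: `display_char` is a hypothetical function, it treats every codepoint in the Unicode category `Cc` (C0, DEL, and C1) the same way, and the `raw` parameter stands in for the -raw/-r flag.

```python
import unicodedata

OPEN_BOX = "\u2423"  # the open box character, U+2423


def display_char(ch: str, raw: bool = False) -> str:
    """Return the text to show for a single character (hypothetical helper)."""
    if raw:
        # Assumption: raw mode prints the character unchanged.
        return ch
    if unicodedata.category(ch) == "Cc":
        # Control characters are shown as the open box instead of U+FFFD.
        return OPEN_BOX
    # Everything else, including Private Use (category "Co"), is shown as-is.
    return ch


if __name__ == "__main__":
    for ch in ["a", "\x07", "\x85", "\ue000"]:
        print(f"U+{ord(ch):04X} -> {display_char(ch)!r}")
```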

v2.5.1

Release 2.5.1

v2.5.0

Cleanup some things

v2.4.0

Release v2.4.0

v2.3.0

Release v2.3.0

Changes:

- Update to Unicode 14.0.

- UTF-16 and JSON are printed as lower case, just like UTF-8 was. Upper-case is
  used only for codepoints (e.g. U+00AC).

- `uni print` can now print from a UTF-8 byte sequence; for example, to print the
  € sign:

      uni p utf8:e282ac
      uni p 'utf8:e2 82 ac'
      uni p 'utf8:0xe2 0x82 0xac'

  Bytes can optionally be separated by any combination of `0x`, `-`, `_`, or spaces.
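
To make the separator rule concrete, here is a minimal Python sketch of how such an argument could be decoded. It is not uni's implementation: `decode_utf8_arg` is a hypothetical name, and the sketch assumes the separators are simply stripped before the remaining hex digits are read as bytes.

```python
import re


def decode_utf8_arg(arg: str) -> str:
    """Decode a 'utf8:...' argument into the text it encodes (hypothetical helper)."""
    prefix = "utf8:"
    if not arg.startswith(prefix):
        raise ValueError(f"expected an argument starting with {prefix!r}")
    # Strip the separators named above: "0x", "-", "_", and spaces.
    hex_digits = re.sub(r"0x|[-_ ]", "", arg[len(prefix):])
    # bytes.fromhex raises ValueError on odd-length or non-hex input;
    # decode raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return bytes.fromhex(hex_digits).decode("utf-8")


if __name__ == "__main__":
    for arg in ["utf8:e282ac", "utf8:e2 82 ac", "utf8:0xe2 0x82 0xac"]:
        print(arg, "->", decode_utf8_arg(arg))
```

All three spellings from the example above decode to the same € sign under this reading.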

v2.2.1

Add 0d20 to get a codepoint by decimal 20

v2.2.0

Need to update import paths too