Tags · arp242/uni

v2.9.0

Release 2.9.0

Dec 16, 2025
f66ba1d

v2.8.0

Release 2.8.0

Sep 10, 2024
1d280da

v2.7.0

Version v2.7.0

- Improve `-format` flag:

  - Add `%name` as an alias for `%(name l:auto)`; this is a lot less typing and
    requires less shell quoting, and >90% of the time this is what you want.

  - Automatically prepend character, codepoint, and name if the format flag
    starts with `+`; for example:

        % uni identify -f +'%unicode %plane' a
                     Name                  Unicode  Plane
        'a'  U+0061  LATIN SMALL LETTER A  1.1      Basic Multilingual Plane

    This should make it a lot quicker to print a single property.

- Align and colourize JSON output.

- Update CLDR information, adding significantly more aliases for emojis.

- Add `cells` column, which returns how many cells a codepoint occupies when
  displayed (0, 1, or 2).

- Add `aliases` column, which lists the alias names. Also add this to the
  default output:

      % uni s factorial
           CPoint  Dec  UTF8  HTML    Name              Aliases
      '!'  U+0021  33   21    &excl;  EXCLAMATION MARK  [factorial, bang]

- Add `refs` column, which lists other related or similar codepoints:

      % uni p -q U+46 -f '%(name): %(refs)'
      LATIN CAPITAL LETTER F: U+2109, U+2131, U+2132

      % uni p -q U+46 -f '%(refs)' | uni p
           CPoint  Dec   UTF8      HTML      Name               Aliases
      '℉'  U+2109  8457  e2 84 89  &#x2109;  DEGREE FAHRENHEIT
      'ℱ'  U+2131  8497  e2 84 b1  &Fscr;    SCRIPT CAPITAL F   [Fourier transform]
      'Ⅎ'  U+2132  8498  e2 84 b2  &#x2132;  TURNED CAPITAL F   [Claudian digamma inversum]

- Allow arguments to `print` to start or end with a comma or slash. This comes up
  when copy/pasting a list of codepoints from another source; there's no real
  reason to error out on this.

- Allow listing Unicode versions with `uni list unicode` and planes with
  `uni list planes`.

- `uni list` without arguments now errors instead of listing everything.

- Add `h` format flag, which suppresses the header for that column.
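
As a rough illustration of what a cell count of 0, 1, or 2 means, the following Python sketch approximates display width using the standard `unicodedata` module. It is not how uni computes its `cells` column: the function name `cells` and the category rules are assumptions made for this example, and real terminals disagree on some edge cases.

```python
import unicodedata


def cells(ch: str) -> int:
    """Approximate how many terminal cells a single codepoint occupies.

    Assumption for this sketch: combining marks and format characters take
    no cell, East Asian Wide/Fullwidth characters take two, the rest one.
    """
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


if __name__ == "__main__":
    for ch in ("a", "\u0301", "\u4f60"):
        print(f"U+{ord(ch):04X} {cells(ch)}")
```

A wcwidth-style lookup table would be more accurate; the category test here only keeps the sketch short.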

May 22, 2024
481d298

v2.6.0

Release 2.6.0

- Update to Unicode 15.1.

- Add "script" property – also supported in the list and print commands:

      % uni identify -f '%(script l:auto) %(cpoint) %(name)' 'a Ω'
      Script CPoint Name
      Latin  U+0061 LATIN SMALL LETTER A
      Common U+0020 SPACE
      Greek  U+03A9 GREEK CAPITAL LETTER OMEGA

      % uni list scripts
      Scripts:
      Name                    Assigned
      Adlam                         83
      Ahom                          54
      Anatolian Hieroglyphs        582
      …

      % uni print 'script:linear a'
      Showing script Linear A
           CPoint  Dec    UTF8        HTML       Name (Cat)
      '𐘀'  U+10600 67072  f0 90 98 80 &#x10600;  LINEAR A SIGN AB001 (Other_Letter)
      '𐘁'  U+10601 67073  f0 90 98 81 &#x10601;  LINEAR A SIGN AB002 (Other_Letter)
      '𐘂'  U+10602 67074  f0 90 98 82 &#x10602;  LINEAR A SIGN AB003 (Other_Letter)
      …

- Add "unicode" property, which tells you in which Unicode version a codepoint
  was introduced:

      % uni identify -f '%(unicode l:auto) %(cpoint l:auto) %(name)' a𐘂🫁
      Unicode CPoint  Name
      1.1     U+0061  LATIN SMALL LETTER A
      7.0     U+10602 LINEAR A SIGN AB003
      13.0    U+1FAC1 LUNGS

- Show unprintable control characters as the open box (␣, U+2423) instead of the
  replacement character (�, U+FFFD). It already did that for C1 control
  characters, and U+FFFD looked more like a bug than intentional. The -raw/-r
  flag still overrides this.

- Always print Private Use characters as-is for %(char) instead of using the
  U+FFFD replacement character. It's usually safe to print these, and having to
  use -raw is confusing.

- The `ls` command is now an alias for `list`.
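
The display rule for control and Private Use characters described above can be sketched in Python as follows. This is an illustration only, not uni's code: `display_char` and its `raw` parameter are hypothetical names, and treating every character in the `Cc` category as unprintable is an assumption.

```python
import unicodedata

OPEN_BOX = "\u2423"  # the open box character, shown in place of control characters


def display_char(ch: str, raw: bool = False) -> str:
    """Return the text to show for one codepoint.

    Assumption for this sketch: "control character" means general
    category Cc, which covers both the C0 and C1 ranges.
    """
    if raw:
        # Mirrors the idea of a raw flag: never substitute anything.
        return ch
    if unicodedata.category(ch) == "Cc":
        return OPEN_BOX
    # Everything else, including Private Use (category Co), is printed as-is.
    return ch


if __name__ == "__main__":
    for ch in ("\x07", "\x85", "\ue000", "a"):
        print(f"U+{ord(ch):04X} -> {display_char(ch)!r}")
```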

Nov 24, 2023
b25052b

v2.5.1

Release 2.5.1

May 9, 2022
f33796f

v2.5.0

Clean up some things

May 3, 2022
4c9e955

v2.4.0

Release v2.4.0

Dec 20, 2021
62da7a3

v2.3.0

Release v2.3.0

Changes:

- Update to Unicode 14.0.

- UTF-16 and JSON are printed as lower case, just like UTF-8 already was.
  Upper case is used only for codepoints (e.g. U+00AC).

- `uni print` can now print from a UTF-8 byte sequence; for example to print the €
  sign:

      uni p utf8:e282ac
      uni p 'utf8:e2 82 ac'
      uni p 'utf8:0xe2 0x82 0xac'

  Bytes can optionally be separated by any combination of `0x`, `-`, `_`, or spaces.
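
To show how such an argument can be interpreted, here is a minimal Python sketch that strips the optional separators and decodes the remaining hex digits as UTF-8. It is not uni's implementation; `decode_utf8_arg` is a hypothetical helper, and the exact error handling is an assumption.

```python
import re


def decode_utf8_arg(arg: str) -> str:
    """Decode an argument such as 'utf8:e2 82 ac' into the text it encodes."""
    prefix = "utf8:"
    if not arg.startswith(prefix):
        raise ValueError(f"expected a {prefix!r} prefix: {arg!r}")
    # Drop the optional separators: "0x", "-", "_", and whitespace.
    hex_digits = re.sub(r"0x|[-_\s]", "", arg[len(prefix):], flags=re.IGNORECASE)
    # bytes.fromhex raises ValueError on odd-length or non-hex input;
    # decode raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return bytes.fromhex(hex_digits).decode("utf-8")


if __name__ == "__main__":
    for arg in ("utf8:e282ac", "utf8:e2 82 ac", "utf8:0xe2 0x82 0xac"):
        print(decode_utf8_arg(arg))
```

All three spellings decode to the same three bytes, so each call returns the € sign.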

Oct 5, 2021
2eb3645

v2.2.1

Add 0d20 to get a codepoint by decimal 20
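
A minimal sketch of the idea in Python, assuming a `0d` prefix marks a decimal codepoint and a `U+` prefix a hexadecimal one. `parse_codepoint` is a hypothetical helper for illustration, not part of uni.

```python
MAX_CODEPOINT = 0x10FFFF  # highest valid Unicode codepoint


def parse_codepoint(arg: str) -> int:
    """Parse '0d20' (decimal) or 'U+0014' (hexadecimal) into a codepoint number."""
    lower = arg.lower()
    if lower.startswith("0d"):
        value = int(lower[2:], 10)
    elif lower.startswith("u+"):
        value = int(lower[2:], 16)
    else:
        raise ValueError(f"unrecognised codepoint notation: {arg!r}")
    if not 0 <= value <= MAX_CODEPOINT:
        raise ValueError(f"codepoint out of range: {arg!r}")
    return value


if __name__ == "__main__":
    print(parse_codepoint("0d20"))       # decimal 20
    print(chr(parse_codepoint("U+46")))  # LATIN CAPITAL LETTER F
```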

Jun 15, 2021
26b7f77

v2.2.0

Need to update import paths too

May 30, 2021
9a30c44
