ISO-8859-8-I is the IANA charset name for the character encoding ISO/IEC 8859-8 used together with the control codes from ISO/IEC 6429 for the C0 (00–1F hex) and C1 (80–9F) parts. The characters are in logical order.
Escape sequences (from ISO/IEC 6429 or ISO/IEC 2022) are not to be interpreted. Most applications only interpret the control codes for LF, CR, and HT. A few applications also interpret VT, FF, and NEL (in C1). Very few applications interpret the other C0 and C1 control codes.
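As a minimal sketch of how the control ranges behave in practice, the following Python snippet decodes a short byte string and splits it into lines. It assumes that Python's built-in `iso8859_8` codec can stand in for ISO-8859-8-I, since the two share the same byte-to-character mapping; the sample bytes are illustrative only. Python's `str.splitlines()` happens to be one of the interfaces that treats NEL (85 hex) as a line boundary.

```python
# Assumption: Python's "iso8859_8" codec is used as a stand-in for
# ISO-8859-8-I (same mapping). The sample bytes are made up for illustration.
CODEC = "iso8859_8"

# Four Hebrew letters, NEL (0x85, in C1), HT (0x09, in C0), one Hebrew letter.
raw = bytes([0xF9, 0xEC, 0xE5, 0xED, 0x85, 0x09, 0xE0])
text = raw.decode(CODEC)

# C0 (00-1F) and C1 (80-9F) bytes decode to code points of the same value.
c0_ok = all(bytes([b]).decode(CODEC) == chr(b) for b in range(0x00, 0x20))
c1_ok = all(bytes([b]).decode(CODEC) == chr(b) for b in range(0x80, 0xA0))

# splitlines() breaks on NEL but leaves HT inside the second line.
lines = text.splitlines()
print(c0_ok, c1_ok, len(lines))
```

Nothing in this sketch interprets escape sequences; the control characters are simply passed through as code points.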
ISO-8859-8 is sometimes in logical order (HTML, XML), and sometimes in visual (left-to-right) order (plain text without any markup). The WHATWG Encoding Standard used by HTML5 treats ISO-8859-8 and ISO-8859-8-I as distinct encodings that share the same mapping, because the ISO-8859-8 name influences the layout direction. It notes that this distinction no longer applies to ISO-8859-6 (Arabic), only to ISO-8859-8.[1]
Logical order for this charset requires bidi processing for display.
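The difference between logical and visual order can be illustrated with a small Python sketch. It assumes Python's `iso8859_8` codec as a stand-in for ISO-8859-8-I, and the helper `naive_visual` is hypothetical: it only reverses a run made up entirely of strong right-to-left characters. Real display requires the full Unicode Bidirectional Algorithm, which also handles numbers, punctuation, and mixed-direction text.

```python
import unicodedata

CODEC = "iso8859_8"  # assumption: same mapping as ISO-8859-8-I

# A Hebrew word stored in logical order (first letter read comes first).
logical_bytes = bytes([0xF9, 0xEC, 0xE5, 0xED])
logical = logical_bytes.decode(CODEC)

# Hebrew letters carry the strong right-to-left bidi class "R".
bidi_classes = [unicodedata.bidirectional(ch) for ch in logical]


def naive_visual(run: str) -> str:
    """Hypothetical helper: reverse a pure right-to-left run.

    This is not a bidi implementation; it rejects anything that is not
    a strong right-to-left character.
    """
    if any(unicodedata.bidirectional(ch) != "R" for ch in run):
        raise ValueError("only pure right-to-left runs are handled here")
    return run[::-1]


# In visual (left-to-right) order the same word is stored reversed.
visual_bytes = naive_visual(logical).encode(CODEC)
print(logical_bytes.hex(), visual_bytes.hex())
```

The byte values are identical in both orders; only their sequence differs, which is why the order cannot be detected from the mapping alone.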
The Microsoft Windows code page for Hebrew, Windows-1255, uses logical order and adds support for vowel points as combining characters, along with some additional punctuation. It is mostly an extension of ISO-8859-8-I without the C1 controls. The exceptions are that it omits the double underscore (‗) and replaces the universal currency sign (¤) with the sheqel sign (₪).
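The differences described above can be checked with a short Python sketch that compares the two mappings byte by byte. It assumes Python's `iso8859_8` codec as a stand-in for ISO-8859-8-I and `cp1255` for Windows-1255; the variable names are illustrative.

```python
import unicodedata

ISO = "iso8859_8"  # assumption: same mapping as ISO-8859-8-I
WIN = "cp1255"     # Python's name for Windows-1255

# The Hebrew letters (E0-FA hex) decode identically in both encodings.
letters_same = all(
    bytes([b]).decode(ISO) == bytes([b]).decode(WIN) for b in range(0xE0, 0xFB)
)

# Byte A4: universal currency sign in ISO, sheqel sign in Windows-1255.
iso_a4 = b"\xa4".decode(ISO)
win_a4 = b"\xa4".decode(WIN)

# Byte DF: double underscore in ISO, unassigned in Windows-1255.
iso_df = b"\xdf".decode(ISO)
try:
    b"\xdf".decode(WIN)
    win_defines_df = True
except UnicodeDecodeError:
    win_defines_df = False

# Byte 80: a C1 control in ISO, a printable character in Windows-1255.
iso_80 = b"\x80".decode(ISO)
win_80 = b"\x80".decode(WIN)

# Byte C0: a vowel point (combining character) in Windows-1255.
win_c0 = b"\xc0".decode(WIN)
win_c0_is_combining = unicodedata.combining(win_c0) != 0

print(letters_same, win_defines_df, win_c0_is_combining)
```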
References
[1] van Kesteren, Anne. "9. Legacy single-byte encodings". Encoding Standard. WHATWG.
Note: ISO-8859-8 and ISO-8859-8-I are distinct encoding names, because ISO-8859-8 has influence on the layout direction. And although historically this might have been the case for ISO-8859-6 and "ISO-8859-6-I" as well, that is no longer true.