[mew-int 01607] Re: windows 1252
Kazu Yamamoto ( 山本和彦 )
kazu at example.com
Mon Nov 10 16:11:23 JST 2003
Hello Handa-san,
Thank you for your explanation.
> (2) ctext (alias of compound-text)
>
> On conversion, it is not fully compatible with the
> specification of X Compound Text because it encodes any
> Emacs characters while using a designation sequence for
> private character sets (please note that all Emacs charsets
> have an iso-final-char). So, Big5 characters are preceded by
> ESC $ ( 0 or 1, mule-unicode-0100-24ff characters are
> preceded by ESC - 1.
              ^^^^^^^
Let me clarify.
Q1) It seems to me that Emacs encodes mule-unicode-0100-24ff with ESC
$ - 1. But the explanation above says ESC - 1. Which one is correct
according to Emacs's spec?
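To make the difference concrete, here is a minimal Python sketch that prints both candidate sequences in the column/row notation used by the ISO 2022 family of documents. The helper name col_row and the two constant names are only illustrative; they are not taken from Emacs. The only difference is the extra 02/04 ("$") byte, which in ISO 2022 marks the designation of a multi-byte character set.

```python
def col_row(data: bytes) -> str:
    """Render bytes in column/row notation, e.g. ESC (0x1B) -> 01/11."""
    return " ".join(f"{b >> 4:02d}/{b & 0x0F:02d}" for b in data)

ESC = b"\x1b"
SHORT_FORM = ESC + b"-1"   # ESC - 1   (as written in the explanation)
LONG_FORM = ESC + b"$-1"   # ESC $ - 1 (as I observe from Emacs)

print(col_row(SHORT_FORM))  # 01/11 02/13 03/01
print(col_row(LONG_FORM))   # 01/11 02/04 02/13 03/01
```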
Q2) I don't think it's a good idea to expose the internal
representation "mule-unicode-0100-24ff" in a file. According to the
spec of ctext provided with XFree86, it has an extension for UTF-8:
---
7. The UTF-8 encoding
Unicode characters that are not contained in one of the
approved standard encodings can be encoded using the UTF-8
encoding. The following escape sequences are used:
01/11 02/05 04/07 switch into UTF-8 mode
01/11 02/05 04/00 return from UTF-8 mode
The first is the ISO registered sequence for UTF-8 (ISO-IR-196),
the second is the ISO-2022 ``standard return'' sequence. While
in UTF-8 mode, the UTF-8 encoding replaces the currently
designated GL and GR encodings. After return from UTF-8 mode,
the previously designated GL and GR encodings are reactivated.
---
How about using this to encode mule-unicode-0100-24ff?
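As a rough illustration of what I mean, the following Python sketch wraps a run of characters in the two escape sequences quoted above. The byte values follow directly from the column/row notation (01/11 = ESC, 02/05 = "%", 04/07 = "G", 04/00 = "@"); the function name wrap_utf8 is hypothetical, and this is not how Emacs implements ctext.

```python
UTF8_ENTER = b"\x1b%G"   # 01/11 02/05 04/07: switch into UTF-8 mode
UTF8_RETURN = b"\x1b%@"  # 01/11 02/05 04/00: return from UTF-8 mode

def wrap_utf8(text: str) -> bytes:
    """Bracket the UTF-8 bytes of `text` with the enter/return sequences."""
    return UTF8_ENTER + text.encode("utf-8") + UTF8_RETURN

# U+0100 is the first code point covered by mule-unicode-0100-24ff.
print(wrap_utf8("\u0100"))
```

With this, a reader that follows the XFree86 spec would never need to know the private final character "1".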
> When it runs under emacs-unicode version, on writing the
> file, if all the characters can be encoded by ctext, keep
> using it. If not (because, in emacs-unicode, some characters
> don't belong to any charset that has iso-final-char), use
> utf-8. And in both cases, add a coding tag. On reading,
> check the coding tag at first. If no coding tag, read by
> ctext, otherwise, read by the coding system specified in the
> tag.
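If I understand the proposal correctly, it can be sketched in Python as below. This is only my reading of it: the function names are hypothetical, the predicate ctext_can_encode stands in for "the character belongs to a charset that has iso-final-char" and must be supplied by the caller, and the regular expression is a simplified match for a "-*- coding: NAME -*-" tag on the first line.

```python
import re
from typing import Callable

# Simplified pattern for a first-line tag such as "-*- coding: utf-8 -*-".
CODING_TAG = re.compile(rb"-\*-.*?coding:\s*([-\w.]+).*?-\*-")

def choose_write_coding(text: str,
                        ctext_can_encode: Callable[[str], bool]) -> str:
    """Keep ctext if every character is encodable by it, else use utf-8."""
    return "ctext" if all(ctext_can_encode(ch) for ch in text) else "utf-8"

def choose_read_coding(first_line: bytes) -> str:
    """Use the coding system named in the tag; with no tag, read as ctext."""
    match = CODING_TAG.search(first_line)
    return match.group(1).decode("ascii") if match else "ctext"
```

In both write cases the coding tag would then be added to the file, so that the read side can find it.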
I remember that, some years ago, Handa-san said to me, "The current
Emacs is using mule-unicode but will migrate to Unicode". But I don't
know what exactly emacs-unicode refers to. Which versions? Or
a different source tree?
--Kazu