[mew-int 01620] Re: windows 1252

Fri Nov 14 11:57:01 JST 2003

In article <87he18grg9.fsf at example.com>, "Stephen J. Turnbull" <stephen at example.com> writes:
> I certainly agree that UTF-8 should be used for encoding.  The
> question is should the DOCS UTF-8 (XFree86 only, I fear) sequence be
> used to invoke it, or should the DOCS private final byte UTF-8 (X11
> standard extended segment) be used.

I don't understand what "DOCS private final byte UTF-8"
means.  Do you mean using the following for UTF-8?

6.  Non-Standard Character Set Encodings
[...]
     01/11 02/05 02/15 03/00 M L   variable number of octets per character

Do you know whether any application sends or receives
such an encoding?  If so, we now have three methods for
transferring UTF-8 in inter-client communication (the above,
the XFree86-only UTF-8 encoding using ESC % G ..., and using
UTF8_STRING instead of CTEXT), and there's no way to know
which receiver accepts which encoding.  Sigh...
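
To make the three methods concrete, here is a minimal Python sketch of
the byte sequences involved.  It is only an illustration under stated
assumptions: the function names are hypothetical, the encoding name
"UTF-8" inside the extended segment is a guess (I don't know what name
a real sender would use), and the closing ESC % @ of the ESC % G form
follows the ISO 2022 convention for returning from UTF-8.

```python
ESC = b"\x1b"
STX = b"\x02"

def extended_segment_utf8(text, name="UTF-8"):
    """CTEXT extended segment: ESC % / 0 M L name STX data.

    F = 03/00 ("0") means a variable number of octets per character.
    M and L carry the length of everything after them (name, STX and
    data) as (M - 128) * 128 + (L - 128).
    The default name "UTF-8" is an assumption, not a registered name.
    """
    body = name.encode("latin-1") + STX + text.encode("utf-8")
    n = len(body)
    if n >= 128 * 128:
        raise ValueError("segment too long for a two-byte length")
    return ESC + b"%/0" + bytes([0x80 | (n // 128), 0x80 | (n % 128)]) + body

def docs_utf8(text):
    """XFree86-style form: ESC % G data ESC % @ (assumed terminator)."""
    return ESC + b"%G" + text.encode("utf-8") + ESC + b"%@"

def utf8_string(text):
    """UTF8_STRING target: raw UTF-8, no escape sequences at all."""
    return text.encode("utf-8")
```

The first two are CTEXT payloads; the third is a separate selection
target, so a receiver has to be asked for it explicitly.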

Kenichi>  Emacs decodes extended segment for ISO-8859-15 correctly,
Kenichi>  but doesn't use it for encoding.  According to Dave,
Kenichi>  Latin-9 (ISO-8859-15) users don't want it.  See this code
Kenichi>  in mule.el.

> I know it violates the CTEXT standard but many Linux apps give it to
> you anyway.

> It's interesting that they happily take the standard codes.  That's
> useful to know.

I've just confirmed that, in an iso-8859-15 locale, an XFree86
client (gnome-terminal) sends iso-8859-15 chars in an extended
segment, not in the standard encoding (i.e. ESC - b ...),
but accepts iso-8859-15 in the standard encoding.
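
For comparison, the two forms of iso-8859-15 mentioned above can be
sketched in Python as follows.  This is an illustration, not what
gnome-terminal or Xlib literally does: the function names are
hypothetical, and the encoding name "iso8859-15" written inside the
extended segment is an assumption about what the sender emits.

```python
ESC = b"\x1b"
STX = b"\x02"

def latin9_standard(text):
    """Standard form: ESC - b designates the Latin-9 right half to G1.

    Assumes G0 still holds ASCII, so bytes below 0x80 pass through
    unchanged and bytes at 0xA0 and above are read from the new set.
    """
    return ESC + b"-b" + text.encode("iso8859_15")

def latin9_extended(text, name="iso8859-15"):
    """Extended segment form: ESC % / 1 M L name STX data.

    F = 03/01 ("1") means one octet per character.  M and L give the
    length of name + STX + data.  The name is an assumed value.
    """
    body = name.encode("ascii") + STX + text.encode("iso8859_15")
    n = len(body)
    if n >= 128 * 128:
        raise ValueError("segment too long for a two-byte length")
    return ESC + b"%/1" + bytes([0x80 | (n >> 7), 0x80 | (n & 0x7F)]) + body
```

Both carry the same data octets (for example 0xA4 for the euro sign);
only the announcement in front of them differs.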

---
Ken'ichi HANDA
handa at example.com