[mew-int 01591] Re: windows 1252

Wed Nov 5 00:55:02 JST 2003

From: Stefan Monnier <monnier at example.com>
Subject: [mew-int 01590] Re: windows 1252

> > (1) Backward compatibility with non-Mule Emacsen.
> 
> > 	Non-Mule Emacsen treat 8-bit bytes as ISO-8859-1. Thus, to
> > 	share the cache among Mule Emacsen and non-Mule Emacsen, we
> > 	need a character set whose 8-bit part is ISO-8859-1.
> 
> utf-8 would be ideal here.

Not correct.

UTF-8 is compatible with US-ASCII but not with ISO-8859-1.

That is, U+0000-U+007F is encoded as the single bytes 0x00-0x7F, while
U+0080-U+00FF is encoded as two 8-bit bytes (110xxxxx 10xxxxxx).
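
To make the difference concrete, here is a small Python sketch (Python
is used only for illustration; it is not part of Mew or Emacs, and the
variable names are made up for this example). It encodes one ASCII
character and one character from the upper half of ISO-8859-1 both
ways:

```python
# Compare how ISO-8859-1 and UTF-8 encode the same characters.
ascii_char = "A"        # U+0041, inside U+0000-U+007F
high_char = "\u00e9"    # U+00E9 (e with acute), inside U+0080-U+00FF

ascii_utf8 = ascii_char.encode("utf-8")
ascii_latin1 = ascii_char.encode("iso-8859-1")
high_utf8 = high_char.encode("utf-8")
high_latin1 = high_char.encode("iso-8859-1")

# The ASCII range is byte-for-byte identical in both encodings.
print(ascii_utf8 == ascii_latin1)

# The upper half differs: one byte in ISO-8859-1, two bytes in UTF-8.
print(len(high_latin1), len(high_utf8))

# The two UTF-8 bytes follow the 110xxxxx 10xxxxxx pattern.
high_utf8_bits = [format(b, "08b") for b in high_utf8]
print(high_utf8_bits)
```

A non-Mule Emacs reading the UTF-8 bytes as ISO-8859-1 would therefore
see two characters where one was intended.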

> > (2) Coexistence of Emacs and XEmacs.
> 
> > 	The 'emacs-mule coding-system is not appropriate since XEmacs
> > 	has a different internal representation from Emacs's. Note
> > 	that Emacsen use different 'emacs-mule coding-systems across
> > 	versions.
> 
> iso-2022 would be the answer here.

Yes, ctext is one instance of the ISO-2022 framework.

> It's unfortunate, but I guess it makes sense.
> It should be possible to make ctext-with-extensions work for your case.

To support a new character set in ctext, we only need to register a
new escape sequence. The new ctext is forward compatible, and it is
backward compatible as long as no character from the new character set
is encoded. So, we don't need a new coding-system name, I think.
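
The general mechanism can be seen with another ISO-2022 instance. The
sketch below uses Python's iso2022_jp codec as a stand-in, since
Python has no ctext codec; this is an assumption made for the example
and the escape sequences of ctext itself may differ. It shows that an
escape sequence appears only when a character from the additional
character set is actually encoded:

```python
# ISO-2022 switches character sets with escape sequences.
# iso2022_jp is used here only as a readily available ISO-2022
# instance; it is not ctext.
plain = "abc".encode("iso2022_jp")
mixed = "a\u3042".encode("iso2022_jp")  # U+3042 is HIRAGANA LETTER A

# Pure ASCII text needs no escape sequence at all.
print(plain)

# Text using the extra character set contains ESC (0x1b) bytes that
# designate that set and later switch back to ASCII.
print(mixed)
has_escape_plain = b"\x1b" in plain
has_escape_mixed = b"\x1b" in mixed
print(has_escape_plain, has_escape_mixed)
```

Data that never uses the extra character set stays byte-for-byte
readable by a decoder that does not know the new escape sequence.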

> BTW, windows-1252 should internally be turned into a mix of chars from
> various charsets and they should (hopefully) all be encodable directly in
> ctext, so I'm not sure what is your exact problem.  Could you describe what
> currently happens with windows-1252 and what you'd like to see instead ?

As I said, I don't know windows-1252 well, and I don't know whether
the current ctext can encode all windows-1252 characters. I would like
accurate information about this.
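
As a starting point, the following Python sketch lists where
windows-1252 differs from ISO-8859-1. It relies on Python's cp1252
codec table, which is an assumption about what counts as windows-1252
here; it says nothing about whether ctext can encode these characters.

```python
import unicodedata

# Bytes 0x80-0x9F are C1 control codes in ISO-8859-1; windows-1252
# assigns printable characters to most of them.
defined = {}
undefined = []
for byte in range(0x80, 0xA0):
    try:
        defined[byte] = bytes([byte]).decode("cp1252")
    except UnicodeDecodeError:
        # Python's cp1252 table leaves a few bytes unassigned.
        undefined.append(byte)

for byte, char in defined.items():
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
print("unassigned:", [f"0x{b:02X}" for b in undefined])

# Outside 0x80-0x9F the two encodings agree.
upper = bytes(range(0xA0, 0x100))
same_upper = upper.decode("cp1252") == upper.decode("iso-8859-1")
print(same_upper)
```

So the question reduces to whether ctext has an escape sequence
covering each of the characters printed for 0x80-0x9F.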

--Kazu