[mew-int 01621] Re: windows 1252

Kenichi Handa handa at example.com
Fri Nov 14 12:39:55 JST 2003


In article <9003-Thu13Nov2003214931+0200-eliz at example.com>, "Eli Zaretskii" <eliz at example.com> writes:
>>  I think Dave is correct because CTEXT spec has this
>>  paragraph.
>>  
>>      Extended segments are not to be used for any character set
>>      encoding that can be constructed from a GL/GR pair of
>>      approved standard encodings. For example, it is incorrect to
>>      use an extended segment for any of the ISO 8859 family of
>>      encodings.

> For the record, when I worked on this code, I added the ISO 8859
> charsets mentioned above because the then official version of the
> CTEXT spec did not include them in the list of approved standard
> encodings.  So, as far as that CTEXT spec was concerned, these
> charsets were not members of the ISO 8859 family.

Hmmm, I didn't read the above paragraph the way you did, but
it seems that you are correct.  Dave, what do you think?

FYI, I found this section in the spec.

------------------------------------------------------------
10.  Extensions

There is no absolute requirement for a parser to deal with
anything but the particular encoding syntax defined in this
specification. However, it is possible that Compound Text
may be extended in the future, and as such it may be
desirable to construct the parser to handle 2022/6429 syntax
more generally.

There are two general formats covering all control sequences
that are expected to appear in extensions:

01/11 {I} F

     For this format, I is always in the range 02/00 to
     02/15, and F is always in the range 03/00 to 07/14.

[...]
If extensions to this specification are defined in the
future, then any string incorporating instances of such
extensions must start with one of the following control
sequences:

     01/11 02/03 V 03/00   ignoring extensions is OK
     01/11 02/03 V 03/01   ignoring extensions is not OK
[...]
------------------------------------------------------------
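
To make the column/row notation in the excerpt concrete, here is a
minimal Python sketch that converts it to byte values and tests whether
a byte string has the shape 01/11 {I} F. The helper names (colrow,
is_general_escape) are hypothetical, and the sketch assumes that {I}
means zero or more intermediate bytes.

```python
ESC = 0x1B  # 01/11 in column/row notation


def colrow(col: int, row: int) -> int:
    """Convert column/row notation (e.g. 02/13) to a byte value."""
    return col * 16 + row


def is_general_escape(seq: bytes) -> bool:
    """Return True if seq has the shape 01/11 {I} F.

    Assumption: {I} stands for zero or more intermediate bytes.
    I must lie in 02/00..02/15 and F in 03/00..07/14, as quoted.
    """
    if len(seq) < 2 or seq[0] != ESC:
        return False
    *intermediates, final = seq[1:]
    return (
        all(colrow(2, 0) <= b <= colrow(2, 15) for b in intermediates)
        and colrow(3, 0) <= final <= colrow(7, 14)
    )
```

For example, ESC - b is the bytes 01/11 02/13 06/02, so
is_general_escape(b"\x1b-b") is true under these assumptions.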

So, designating ISO-8859-15 by ESC - b (i.e. 01/11 {I} F)
without either of the last two ESC sequences at the start of
the string explicitly violates CTEXT, even if CTEXT is
extended in the future.
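
As a rough illustration of that check, the following Python sketch
looks for one of the two announcement sequences at the start of a byte
string. The function name extension_prefix is hypothetical. The quoted
excerpt elides the definition of V, so the range 02/00 to 02/15 used
here is an assumption, not something taken from the text above.

```python
import re

# 01/11 02/03 V 03/00  or  01/11 02/03 V 03/01
# Assumption: V lies in 02/00..02/15 (the excerpt elides its definition).
_PREFIX = re.compile(rb"\x1b\x23([\x20-\x2f])([\x30\x31])")


def extension_prefix(data: bytes):
    """Return info about a leading extension announcement, or None."""
    m = _PREFIX.match(data)
    if m is None:
        return None
    return {
        "v": m.group(1)[0],                 # raw V byte
        "may_ignore": m.group(2) == b"0",   # 03/00: ignoring is OK
    }


# A string that designates a charset with ESC - b but carries no
# announcement yields None, which is the case described above.
bare = b"\x1b-b" + b"text"
announced = bytes([0x1B, 0x23, 0x20, 0x30]) + bare
```

Here extension_prefix(bare) returns None, while
extension_prefix(announced) reports that ignoring extensions is OK.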

---
Ken'ichi HANDA
handa at example.com


