[Mew-dist 1608] one request to RFC 2047
Kazu Yamamoto ( 山本和彦 )
Kazu at example.com
Sun, 24 Aug 1997 21:03:06 JST
Hello Keith,
I tried to find you in Munich to discuss this issue but failed. The
following is a request concerning RFC 2047 which I sent you before and
to which, I think, I have not received a reply. I'm very sorry that I
didn't find this problem while RFC 2047 was still an Internet-Draft.
RFC 2047 encoding for fields defined as *text, such as Subject:, has a
problem, I believe.
To my understanding, RFC2047 says:
(1) An 'encoded-word' that appears in a header field defined as
'*text' MUST be separated from any adjacent 'encoded-word' or 'text'
by 'linear-white-space'.
I believe this means that
Subject: English =?iso-2022-jp?B?encoded-Japanese?=
is illegal and we must (or should) use
Subject: English
 =?iso-2022-jp?B?encoded-Japanese?=
instead.
RFC 2047 also says:
(2) When displaying a particular header field that contains multiple
'encoded-word's, any 'linear-white-space' that separates a pair of
adjacent 'encoded-word's is ignored. However, this rule doesn't apply
to the case between *text and an 'encoded-word'.
Thus,
Subject: English
 =?iso-2022-jp?B?encoded-Japanese?=
is decoded as
Subject: English Japanese.
Note that one space remains between English and Japanese in this
case.
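To make the effect of rule (2) concrete, here is a minimal Python sketch of a decoder for a *text field. It is only an illustration under stated assumptions: the names decode_text_field and b_word are hypothetical, the sample string 日本語 is mine, and error handling is omitted.

```python
import base64
import re

# One RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
EW = r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?="
EW_PARTS = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def _decode_word(match):
    """Decode a single encoded-word match into a str."""
    charset, enc, text = match.group(1), match.group(2).upper(), match.group(3)
    if enc == "B":
        raw = base64.b64decode(text)
    else:
        # Q encoding: "_" stands for a space, "=XX" for a hex-encoded octet.
        raw = re.sub(rb"=([0-9A-Fa-f]{2})",
                     lambda h: bytes([int(h.group(1), 16)]),
                     text.replace("_", " ").encode("ascii"))
    return raw.decode(charset)


def decode_text_field(value):
    """Decode an unstructured field body (hypothetical helper)."""
    # Unfold: a line break followed by whitespace is removed.
    value = re.sub(r"\r?\n(?=[ \t])", "", value)
    # Rule (2): whitespace between two adjacent encoded-words is ignored.
    value = re.sub(rf"({EW})[ \t]+(?={EW})", r"\1", value)
    # Whitespace between plain text and an encoded-word is left alone.
    return EW_PARTS.sub(_decode_word, value)


def b_word(text, charset="iso-2022-jp"):
    """Build a B-encoded word for the demonstration (hypothetical helper)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


jp = b_word("日本語")
kept = decode_text_field(f"English\r\n {jp}")
merged = decode_text_field(f"{jp}\r\n {jp}")
print(kept == "English 日本語")   # True: the space after plain text survives
print(merged == "日本語日本語")    # True: whitespace between encoded-words is dropped
```

With this behavior there is no way to get "English" immediately followed by Japanese unless "English" is itself an encoded-word.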
The problem is how to encode Subject: if the English and Japanese are
contiguous. An example is as follows:
Subject: EnglishJapanese
The only solution I found is
Subject: =?us-ascii?Q?English?=
 =?J?B?encoded-Japanese?=.
(Note that
Subject: English=?J?B?encoded-Japanese?=
is not allowed by rule (1). )
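The workaround can be sketched as an encoder that turns the ASCII run into a Q-encoded word and the Japanese run into a B-encoded word, then folds between them. This is only a rough sketch: q_word, b_word and join_words are hypothetical helper names, 日本語 is a sample string of mine, and the 75-character limit on encoded-words is not enforced.

```python
import base64


def q_word(text, charset="us-ascii"):
    """Q-encode `text` as a single encoded-word (hypothetical helper)."""
    out = []
    for byte in text.encode(charset):
        ch = chr(byte)
        if byte < 128 and ch.isalnum():
            out.append(ch)              # letters and digits pass through
        elif ch == " ":
            out.append("_")             # space is written as underscore
        else:
            out.append(f"={byte:02X}")  # everything else as =XX
    return f"=?{charset}?Q?{''.join(out)}?="


def b_word(text, charset="iso-2022-jp"):
    """B-encode `text` as a single encoded-word (hypothetical helper)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


def join_words(words):
    """Separate encoded-words by folding white space (CRLF + space)."""
    return "\r\n ".join(words)


value = join_words([q_word("English"), b_word("日本語")])
print("Subject: " + value)
```

A decoder that follows rule (2) drops the folding white space between the two encoded-words, so the result reads as one contiguous string.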
However, this is discouraged by the third rule defined in RFC 2047:
(3) Use of 'encoded-word's to represent strings of purely ASCII
characters is allowed, but discouraged.
Some mail utilities modify Subject:. A typical example is the case
where the name of a mailing list is prepended to Subject:.
Subject: [ML name] text
If such utilities simply prepend the ML name to a Japanese-only
Subject:, the result is a violation of RFC 2047:
Subject: [ML name]=?J?B?encoded-Japanese?=
A reply message may contain the following Subject:
Subject: =?US-ASCII?Q?[ML_name]?=
 =?J?B?encoded-Japanese?=
Since Subject: doesn't start with [ML name], they prepend the
unnecessary string again:
Subject: [ML name]=?US-ASCII?Q?[ML_name]?=
 =?J?B?encoded-Japanese?=
<Proposed resolution>
I propose eliminating rule (1). The intention of rule (1) is to
allow RFC 2047 decoders to recognize an 'encoded-word' easily. However,
in my implementation experience, this rule makes it much more difficult
to implement an encoder. Without rule (1), decoders can still find an
'encoded-word', for instance, with the help of a regular expression.
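As a rough illustration of the regular-expression approach, the following Python sketch locates encoded-words even when they are not set off by white space. The pattern and the name find_encoded_words are my own assumptions, not text from RFC 2047, and the payload is just an illustrative ISO-2022-JP string.

```python
import re

# =?charset?encoding?encoded-text?= with no "?" or white space inside a part.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def find_encoded_words(value):
    """Return (start, end, charset, encoding, payload) for each match."""
    return [(m.start(), m.end(), m.group(1), m.group(2).upper(), m.group(3))
            for m in ENCODED_WORD.finditer(value)]


# An encoded-word glued directly to plain text, as in the ML-name case.
subject = "[ML name]=?iso-2022-jp?B?GyRCRnxLXDhsGyhC?="
for start, end, charset, enc, payload in find_encoded_words(subject):
    print(repr(subject[:start]), charset, enc, payload)
```

One trade-off of such a scan is that ordinary text which happens to look like "=?x?Q?y?=" would also be treated as an encoded-word, and a decoder would have to accept that risk.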
Harald told me in Munich that the applications area is planning to
revise RFC 2047 so that it can carry a language tag. I hope that the
proposed resolution will be included in the next spec.
--Kazu