
Composite Unicode (encoded utf-8)

My language uses characters such as á, à, ớ, ế, ề, ờ, ở, ắ, ấ, ầ, ... The latest version of LogicMail cannot display them correctly.

I checked this message and found that it was written in Composite Unicode. Looking into it further, I learned that there are two kinds of Unicode for my language: "Precompound Unicode" and "Composite Unicode".

LogicMail handles "Precompound Unicode" well, but "Composite Unicode" is not displayed correctly.

This is the test message. (Note that the subject is encoded and is displayed incorrectly.)

Delivered-To: xxxxxx@gmail.com
Received: by 10.86.97.2 with SMTP id u2cs486664fgb;
        Tue, 28 Oct 2008 21:45:09 -0700 (PDT)
Received: by 10.110.26.20 with SMTP id 20mr5210873tiz.29.1225255507365;
        Tue, 28 Oct 2008 21:45:07 -0700 (PDT)
Received: by 10.110.16.14 with HTTP; Tue, 28 Oct 2008 21:45:07 -0700 (PDT)
Message-ID: <f863d0ac0810282145y3d581522qfe61202bfa179332@mail.gmail.com>
Date: Wed, 29 Oct 2008 11:45:07 +0700
From: "Gmail Invite" <xxx.xxx@gmail.com>
To: "Gmail Invite" <xxx.xxx@gmail.com>, "xxxx Bui" <xxxxx@gmail.com>
Subject: =?UTF-8?Q?Th=C3=A1ng_8/2008_em_th=E1=BA=A5y?= =?UTF-8?Q?_c=C3=B3_l=E1=BB=8Bch_thi_l=E1=BA=A1i_m=C3=B4n?= =?UTF-8?Q?_Lu=E1=BA=ADt_H=C3=ACnh_s=E1=BB=B1_d=C3=A0nh_c?= =?UTF-8?Q?ho_l=E1=BB=9Bp_h=E1=BB=8Dc_th=E1=BB=A9_3-5-7.?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: base64
Content-Disposition: inline
RGVhciBhbmgvY2jhu4ssCkVtIGLhu4sgb3V0IG3DtG4gTHXhuq10IEjDrG5oIHPhu7EuClRow6Fu
ZyA4LzIwMDggZW0gdGjhuqV5IGPDsyBs4buLY2ggdGhpIGzhuqFpIG3DtG4gTHXhuq10IEjDrG5o
IHPhu7EgZMOgbmggY2hvIGzhu5twIGjhu41jCnRo4bupIDMtNS03LgpWw6wgZW0gaOG7jWMgbOG7
m3AgMi00LTYsIHhlbSB0cm9uZyBkYW5oIHPDoWNoIHRoaSBs4bqhaSB0aMOsIGtow7RuZyB0aOG6
pXkgdMOqbgpuw6puIGVtIGtow7RuZyDEkWkgdGhpLgpOaMawbmcgZ2nhu50gY2jhu50gbMOidSBx
dcOhIGtow7RuZyB0aOG6pXkgbOG7i2NoIHRoaSBs4bqhaSBtw7RuIEx14bqtdCBow6xuaCBz4bux
IGTDoG5oCmNobyBs4bubcCAyLTQtNiBuw6puIGPFqW5nIHRo4bqleSBsbyBsby4KCkVtIGtow7Ru
ZyDEkWkgdGhpIGzhuqFpIMSR4bujdCB0csaw4bubYyBsw6AgY8OzIMSRw7puZyBraMO0bmc/IEhh
eSBsw6AgZW0gxJHDoyBi4buPIGzhu6Ega+G7swp0aGkgbOG6oWkgxJHDsyBy4buTaT8KVsOsIGVt
IMOtdCDEkWkgaOG7jWMgbsOqbiBraMO0bmcgYmnhur90IHLDtS4gQW5oL2No4buLIG7DoG8gYmnh
ur90IGNobyBlbSB4aW4gdGjDtG5nIHRpbiB24bubaS4KCkVtIGPDoW0gxqFuIQpLaW0gSOG7k25n
Cg==
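
For anyone who wants to check which form a header like the Subject above actually uses, here is a rough sketch in desktop Java. It decodes the payload of one "Q" encoded-word (the part between "=?UTF-8?Q?" and "?=") and reports whether the result contains combining marks. The class and method names (SubjectInspector, decodeQ, hasCombiningMarks) are made up for this example and are not part of LogicMail, and the decoder assumes UTF-8 and handles only a single encoded-word.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class SubjectInspector {

    // Decodes the payload of a single RFC 2047 "Q" encoded-word,
    // i.e. the text between "=?UTF-8?Q?" and "?=". Assumes UTF-8.
    static String decodeQ(String payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < payload.length(); i++) {
            char c = payload.charAt(i);
            if (c == '_') {
                bytes.write(' ');   // "_" stands for a space
            } else if (c == '=' && i + 2 < payload.length()) {
                // "=XX" is one byte written as two hex digits
                bytes.write(Integer.parseInt(payload.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);     // plain ASCII character
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    // True if the text contains at least one combining (non-spacing) mark,
    // which is how a decomposed accent appears at the code point level.
    static boolean hasCombiningMarks(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.getType(cp) == Character.NON_SPACING_MARK) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        // First encoded-word of the Subject header quoted above.
        String decoded = decodeQ("Th=C3=A1ng_8/2008_em_th=E1=BA=A5y");
        System.out.println(decoded);
        System.out.println("combining marks present: " + hasCombiningMarks(decoded));
    }
}
```

Running the same check on the other encoded-words, or on the base64 body once decoded, would show which form each part of the message uses.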
  • Message #148

    I did some investigation last night, along with a friend of mine who can read Vietnamese and thus interpret many of the websites I found in my searches.

    Here is what I've found so far:

    • Most text out there on the web is formatted like your sample, and the BlackBerry web browser won't even render it properly.
    • The Java 6 API (J2SE) has a Normalizer class that can convert this text into a format the BlackBerry can display.
    • Fixing this issue would require porting a Unicode normalizer to the embedded Java (J2ME/BlackBerry) version that LogicMail runs on.
    • It is definitely doable, but may require a fair amount of work to implement.
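
To illustrate the Normalizer mentioned in the list above, here is a minimal sketch using java.text.Normalizer from desktop Java 6 or later. It only shows what the two canonical forms look like for one Vietnamese letter. The class NormalizationDemo and its helper methods are made up for this example and are not LogicMail code, and the thread does not say which form the BlackBerry ends up displaying. It will not compile on J2ME, which is why a port would be needed.

```java
import java.text.Normalizer;

class NormalizationDemo {

    // Canonical composition: base letter + combining marks become one code point where possible.
    static String toComposed(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Canonical decomposition: accented letters are split into base letter + combining marks.
    static String toDecomposed(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Renders code points as "U+XXXX" tokens so the two forms can be compared by eye.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String single = "\u1EA5";               // the letter ấ as one code point
        String split = toDecomposed(single);    // the same letter as "a" plus two combining marks
        System.out.println(codePoints(single)); // U+1EA5
        System.out.println(codePoints(split));  // U+0061 U+0302 U+0301
        System.out.println(toComposed(split).equals(single)); // true
    }
}
```

Both strings look identical when rendered correctly, which is why the difference only shows up on devices or fonts that handle one form and not the other.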

    I'll investigate further what is actually involved in doing Unicode normalization inside LogicMail, and let you know how practical a task this turns out to be.

  • Message #214

    Just wanted to let you know that, as of today, I have added an optional "Unicode normalization" feature to LogicMail 1.1 (the "maintenance branch"). If you turn this on, it should fix your problem :-)
