iconv GB18030 problems?

iconv GB18030 problems?

ольга крыжановская
Can anyone say why the following iconv round trip through GB18030 fails
and prints two '?' instead of the Unicode character U+1F000?

printf '\xf0\x9f\x80\x80' | iconv -f 'UTF-8' -t GB18030 | iconv -f GB18030
??

My understanding is that GB18030 supports all Unicode characters with
a GBK-like encoding, right?
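
For reference, here is a minimal sketch of what a correct conversion would be expected to produce. It assumes the four-byte linear mapping that GB18030 uses for supplementary-plane code points (U+10000 maps to 90 30 81 30, and each following code point advances the four-byte counter by one). The helper name gb18030_supp is hypothetical and is not part of iconv.

```shell
# Sketch: compute the four-byte GB18030 sequence for a supplementary-plane
# code point (U+10000..U+10FFFF) from the linear mapping.
# gb18030_supp is a hypothetical helper name, not an iconv feature.
gb18030_supp() {
    cp=$(( $1 ))
    if [ "$cp" -lt 65536 ] || [ "$cp" -gt 1114111 ]; then
        echo "code point outside U+10000..U+10FFFF" >&2
        return 1
    fi
    off=$(( cp - 65536 ))
    b4=$(( off % 10 + 48 ));   off=$(( off / 10 ))    # fourth byte: 0x30..0x39
    b3=$(( off % 126 + 129 )); off=$(( off / 126 ))   # third byte:  0x81..0xFE
    b2=$(( off % 10 + 48 ));   off=$(( off / 10 ))    # second byte: 0x30..0x39
    b1=$(( off + 144 ))                               # first byte:  0x90..0xE3
    printf '%02X %02X %02X %02X\n' "$b1" "$b2" "$b3" "$b4"
}

gb18030_supp 0x1F000    # prints: 94 38 E1 30
```

Comparing this with the output of `printf '\xf0\x9f\x80\x80' | iconv -f 'UTF-8' -t GB18030 | od -An -tx1` should show whether the '?' is introduced by the first iconv (UTF-8 to GB18030) or by the second.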

Olga
--
      ,   _                                    _   ,
     { \/`o;====-    Olga Kryzhanovska   -====;o`\/ }
.----'-/`-/     [hidden email]   \-`\-'----.
 `'-..-| /       http://twitter.com/fleyta     \ |-..-'`
      /\/\     Solaris/BSD//C/C++ programmer   /\/\
      `--`                                      `--`
_______________________________________________
opensolaris-discuss mailing list
[hidden email]

Re: iconv GB18030 problems?

Cedric Blancher
On 20 July 2012 11:02, ольга крыжановская <[hidden email]> wrote:
> Can anyone say why the following iconv round trip through GB18030 fails
> and prints two '?' instead of the Unicode character U+1F000?
>
> printf '\xf0\x9f\x80\x80' | iconv -f 'UTF-8' -t GB18030 | iconv -f GB18030
> ??
>
> My understanding is that GB18030 supports all Unicode characters with
> a GBK-like encoding, right?

Right. My understanding is that GB18030 support is slightly broken (iconv
isn't the only affected part; all Tibetan glyphs come up as '?' as well)
and Sun^WORACLE doesn't care. I think a well-tuned email to the PRC
Ministry of Commerce [english.mofcom.gov.cn] will be the only way to
get that fixed. All software sold in China must conform to GB18030;
software that does not will get banned from gov.cn sales or even
banned from China altogether. And the communists KNOW how to make
ORACLE dance.

Ced
--
Cedric Blancher <[hidden email]>
Institute Pasteur

Re: [i18n-discuss] iconv GB18030 problems?

Alan Coopersmith-2
On 07/20/12 06:18 AM, Cedric Blancher wrote:

> On 20 July 2012 11:02, ольга крыжановская <[hidden email]> wrote:
>> Can anyone say why the following iconv round trip through GB18030 fails
>> and prints two '?' instead of the Unicode character U+1F000?
>>
>> printf '\xf0\x9f\x80\x80' | iconv -f 'UTF-8' -t GB18030 | iconv -f GB18030
>> ??
>>
>> My understanding is that GB18030 supports all Unicode characters with
>> a GBK-like encoding, right?
>
> Right. My understanding is that GB18030 support is slightly broken (iconv
> isn't the only affected part; all Tibetan glyphs come up as '?' as well)
> and Sun^WORACLE doesn't care. I think a well-tuned email to the PRC
> Ministry of Commerce [english.mofcom.gov.cn] will be the only way to
> get that fixed

A customer with a support contract filing an escalation is usually the
easiest way to get a fix and doesn't rely on making vague threats or
trying to involve bureaucrats in other governments.

--
        -Alan Coopersmith-              [hidden email]
         Oracle Solaris Engineering - http://blogs.oracle.com/alanc

Re: iconv GB18030 problems?

Jan Hnatek
In reply to this post by ольга крыжановская
Hi Olga,

I got the following response after forwarding your query:
===
That is because the GB18030<->Unicode conversion table we are currently
using is NOT the latest one.
The character in your input belongs to CJK Unified Ideographs Extension
B, which is defined only in the GB18030-2005 standard.
===
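
As a quick cross-check of the block assignment, here is a minimal sketch that tests a code point against two Unicode block ranges. By these ranges U+1F000 appears to fall in the Mahjong Tiles block on plane 1 rather than in Extension B on plane 2; either way it lies above U+FFFF, so it needs the four-byte supplementary mapping that an older conversion table may lack. The helper name unicode_block is hypothetical, and only the two ranges relevant here are listed.

```shell
# Sketch: report which of two Unicode blocks a code point falls in.
# unicode_block is a hypothetical helper; the ranges are the published
# block boundaries for Mahjong Tiles and CJK Extension B.
unicode_block() {
    cp=$(( $1 ))
    if [ "$cp" -ge $(( 0x1F000 )) ] && [ "$cp" -le $(( 0x1F02F )) ]; then
        echo "Mahjong Tiles (plane 1)"
    elif [ "$cp" -ge $(( 0x20000 )) ] && [ "$cp" -le $(( 0x2A6DF )) ]; then
        echo "CJK Unified Ideographs Extension B (plane 2)"
    else
        echo "other"
    fi
}

unicode_block 0x1F000    # prints: Mahjong Tiles (plane 1)
```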

Regards,
hnhn

On 07/20/12 11:02 AM, ольга крыжановская wrote:

> Can anyone say why the following iconv round trip through GB18030 fails
> and prints two '?' instead of the Unicode character U+1F000?
>
> printf '\xf0\x9f\x80\x80' | iconv -f 'UTF-8' -t GB18030 | iconv -f GB18030
> ??
>
> My understanding is that GB18030 supports all Unicode characters with
> a GBK-like encoding, right?
>
> Olga

--
Jan Hnatek
[hidden email]

Re: iconv GB18030 problems?

Cedric Blancher
On 23 July 2012 09:18, Jan Hnatek <[hidden email]> wrote:
> Hi Olga,
>
> I got the following response after forwarding your query:
> ===
> That is because the GB18030<->Unicode conversion table we are currently
> using is NOT the latest one.

So why does every other Unix operating system with GB18030 support
(tested with AIX, FreeBSD, Linux) already have this, with Solaris the
only one lagging behind? (No, I can't file a support request; we don't
have a support contract anymore. We got rid of those after Oracle was
not able to fulfil its support contracts.)

Ced
--
Cedric Blancher <[hidden email]>
Institute Pasteur