Mailing List archive
[linux-dvb] Re: DVB character coding...
Gerd Knorr wrote:
> "Robert Schlabbach" <robert_s@gmx.net> writes:
>
> > But the codings 0x12 and 0x13 bring up another problem: In KSC5601 and
> > GB2312, the codes 0x80 through 0x9F are used as lead bytes - but DVB
> > defines them as control codes. Obviously you can't have the same byte serve
> > two different meanings. I suppose there simply are no control codes for
> > these character codings?
>
> My code does control code handling for codings < 0x10 only. I don't
> remember why I did it that way, though. Maybe because the specs said
> so, but it might also be because it's unclear how control codes are
> supposed to work with the multibyte encodings.
Hm, for 0x13 I found the following comment in our code:
FIXME: document 595.doc on dvb.org states:
1. If the value of leading byte is "0x13"; then the remaining bytes are
coded in pairs with the Big5 subset of Unicode 3.0. This Big5 subset can
be round-trip transcoded to the Big5 character standard [5] without loss
of information. This Big5 subset of Unicode 3.0 contains all 13,053
characters of Big5 character standard [5].
(I don't know who wrote that or what 595.doc is.)
For double-byte character sets, the control codes are 0xe080 ... 0xe09f.
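
A minimal sketch of what that could look like, assuming the payload after
the leading coding byte is a sequence of fixed-width two-byte codes with the
high byte first. The function names are made up for illustration and are not
taken from dvbscan or any other existing code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero if the pair (hi, lo) lies in the 0xe080..0xe09f control range. */
static int is_dbcs_control(uint8_t hi, uint8_t lo)
{
    return hi == 0xe0 && lo >= 0x80 && lo <= 0x9f;
}

/*
 * Copy two-byte codes from src to dst, dropping control-code pairs.
 * Assumption: src holds the text after the leading coding byte.
 * A trailing odd byte is ignored. dst needs room for len bytes.
 * Returns the number of bytes written to dst.
 */
static size_t strip_dbcs_controls(const uint8_t *src, size_t len, uint8_t *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i + 1 < len; i += 2) {
        if (is_dbcs_control(src[i], src[i + 1]))
            continue;               /* skip the control pair */
        dst[out++] = src[i];
        dst[out++] = src[i + 1];
    }
    return out;
}
```

Since the lead bytes 0x80..0x9f never need to be interpreted on their own
this way, the clash with the single-byte control codes goes away, at least
for codings that really are fixed-width pairs.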
> > <RTL> Television - Long name is "RTL Television", short name is "RTL"
> > <S>uper< RTL> - Long name is "Super RTL", short name is "S RTL"
> >
> > This is something I didn't know before...
>
> Interesting, I didn't know either ;)
I did ;-) However, I was too lazy to implement it in dvbscan
and instead put the following comment there:
/* remove control characters (FIXME: handle short/long name) */
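
Handling it properly might look roughly like the sketch below. It assumes
that the "<" and ">" in the examples above stand for the single-byte control
codes 0x86 (character emphasis on) and 0x87 (character emphasis off), and
that the short name is simply the emphasized characters in order. The
function name is hypothetical, this is not what dvbscan does today:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DVB_EMPH_ON  0x86   /* assumed: character emphasis on  */
#define DVB_EMPH_OFF 0x87   /* assumed: character emphasis off */

/*
 * Split a single-byte-coded service name into long and short forms.
 * long_name receives every non-control character, short_name only the
 * characters between emphasis on and emphasis off.
 * Other control codes in 0x80..0x9f are dropped.
 * Both output buffers must hold at least len + 1 bytes.
 */
static void split_service_name(const unsigned char *src, size_t len,
                               char *long_name, char *short_name)
{
    size_t l = 0, s = 0;
    int emphasized = 0;

    assert(src != NULL && long_name != NULL && short_name != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];

        if (c == DVB_EMPH_ON)  { emphasized = 1; continue; }
        if (c == DVB_EMPH_OFF) { emphasized = 0; continue; }
        if (c >= 0x80 && c <= 0x9f)
            continue;               /* some other control code */
        long_name[l++] = (char)c;
        if (emphasized)
            short_name[s++] = (char)c;
    }
    long_name[l] = '\0';
    short_name[s] = '\0';
}
```

With the two names quoted above this should yield "RTL Television" / "RTL"
and "Super RTL" / "S RTL". A name without any emphasis codes ends up with an
empty short name, so a caller would probably want to fall back to the long
name in that case.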
> BTW: I've seen ^Z (0x1a) in EIT descriptions, what the heck does that
> mean? I've noticed because XML parsers refuse to accept the files if
> you stuff that as-is into an XML file ...
Probably a M$-DOS EOF?
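
Whatever it is, XML 1.0 allows only tab, LF and CR below 0x20, so the
simplest workaround is to drop everything else before writing the file.
A rough sketch, assuming the text has already been converted to UTF-8 or a
single-byte charset (so a byte below 0x20 is always a control character);
the function names are invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Nonzero for C0 control characters that XML 1.0 does not allow. */
static int xml_forbidden_c0(unsigned char c)
{
    return c < 0x20 && c != 0x09 && c != 0x0a && c != 0x0d;
}

/*
 * Remove forbidden control characters from buf in place.
 * Returns the new length; the buffer is not NUL-terminated by this call.
 */
static size_t strip_xml_forbidden(char *buf, size_t len)
{
    size_t out = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (!xml_forbidden_c0((unsigned char)buf[i]))
            buf[out++] = buf[i];
    }
    return out;
}
```

This only covers the range below 0x20; it does not escape "<", ">" or "&",
which still has to happen separately.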
Johannes