UPC is a provider for middle european countries (Czechia, Hungary and Poland). They use iso6937-2 for encoding their EPG data so this looks quite strange in the vdr. The applied patch does a "remapping" to iso8859-2 so that characters are displayed correct. (Currently only tested with Czech and Hungarian, but should also work for Polish)
While testing this with the help of an hungarian user, i also found out that the the codepage for Hungary must be 8859-2, not -1.
The patch is work by Helmut Auer.
cheers, Tim
Thiemo Gehrke wrote:
UPC is a provider for middle european countries (Czechia, Hungary and Poland). They use iso6937-2 for encoding their EPG data so this looks quite strange in the vdr. The applied patch does a "remapping" to iso8859-2 so that characters are displayed correct. (Currently only tested with Czech and Hungarian, but should also work for Polish)
While testing this with the help of an hungarian user, i also found out that the the codepage for Hungary must be 8859-2, not -1.
The patch is work by Helmut Auer.
cheers, Tim
--- vdr-1.4.4-vanilla/epg.c 2006-10-28 11:12:42.000000000 +0200 +++ vdr-1.4/epg.c 2006-11-28 12:39:33.000000000 +0100 @@ -18,6 +18,165 @@
#define RUNNINGSTATUSTIMEOUT 30 // seconds before the running status is considered unknown
+// UPC Direct / HBO strange two-character encoding. 0xC2 means acute, 0xCF caron. +// many thanks to the czechs who helped me while solving this. ...
How is their encoding coded in the first byte of the texts? I can't seem to find an encoding for iso6937-2 in ETSI EN 300 46, section A.2.
Also, what happens if you run such a string through iconv() to convert it from iso6937-2 to iso8859-2 or UTF-8?
I'm asking because this is how VDR will handle character sets in the next version.
Klaus
Hi, I converted iso6937 to iso8859-1 by small patch. It was simplest way, because I only deleted non-spacing characters (diacritical marks). It was fast solution form me (strange letters in EPG and bad recordings file names ), but it is not good because EPG doesn't contents correct Czech, .... , ... letters.
How it working in Czech rep: All DVB-T TVs using 'table 00 - Latin alphabet'. This table is a superset of ISO/IEC 6937. I talked with persons from Ceské radiokomunice (broadcaster) and CzechTV, and wonted explain them that iso8859-2 is better choice :-(. Their opinion is that "table 00" is more complex, they can display non-czech characters without problems.
I am afraid that most Europe DVB using 'table 00 - Latin alphabet'. , but they don't use characters 0xC0 to 0xCF. (non-spacing characters: the character is printed together with next character. Like mechanical type writer. It is crazy).
I haven't any idea how it solve correctly, and how select default character set for VDR.
A few lines from ETSI EN 300 468 V1.7.1 (2005-12)
Annex A.2 If the first byte of the text field has a value in the range "0x20" to "0xFF" then this and all subsequent bytes in the text item are coded using the default character coding table (table 00 - Latin alphabet) of figure A.1.
Notes for picture A.1
Figure A.1: Character code table 00 - Latin alphabet NOTE 1: The SPACE character is located in position 20h of the code table. NOTE 2: NBSP = no-break space. NOTE 3: SHY = soft hyphen. NOTE 4: This table is a superset of ISO/IEC 6937 [24] with addition of the Euro symbol. NOTE 5: All characters in column C are non-spacing characters (diacritical marks).
Milos
----- Original Message ----- From: "Klaus Schmidinger" Klaus.Schmidinger@cadsoft.de To: vdr@linuxtv.org Sent: Sunday, December 10, 2006 11:51 AM Subject: Re: [vdr] [PATCH] Fix EPG for UPC direct
Thiemo Gehrke wrote:
UPC is a provider for middle european countries (Czechia, Hungary and Poland). They use iso6937-2 for encoding their EPG data so this looks quite strange in the vdr. The applied patch does a "remapping" to iso8859-2 so that characters are displayed correct. (Currently only tested with Czech and Hungarian, but should also work for Polish)
While testing this with the help of an hungarian user, i also found out that the the codepage for Hungary must be 8859-2, not -1.
The patch is work by Helmut Auer.
cheers, Tim
--- vdr-1.4.4-vanilla/epg.c 2006-10-28 11:12:42.000000000 +0200 +++ vdr-1.4/epg.c 2006-11-28 12:39:33.000000000 +0100 @@ -18,6 +18,165 @@
#define RUNNINGSTATUSTIMEOUT 30 // seconds before the running status is considered unknown
+// UPC Direct / HBO strange two-character encoding. 0xC2 means acute, 0xCF caron. +// many thanks to the czechs who helped me while solving this. ...
How is their encoding coded in the first byte of the texts? I can't seem to find an encoding for iso6937-2 in ETSI EN 300 46, section A.2.
Also, what happens if you run such a string through iconv() to convert it from iso6937-2 to iso8859-2 or UTF-8?
I'm asking because this is how VDR will handle character sets in the next version.
Klaus
vdr mailing list vdr@linuxtv.org http://www.linuxtv.org/cgi-bin/mailman/listinfo/vdr
I demand that Thiemo Gehrke may or may not have written...
UPC is a provider for middle european countries (Czechia, Hungary and Poland). They use iso6937-2 for encoding their EPG data so this looks quite strange in the vdr.
The applied patch does a "remapping" to iso8859-2 so that characters are displayed correct. (Currently only tested with Czech and Hungarian, but should also work for Polish)
Hmm. I recall seeing similar encoding being used by the BBC, though I don't see any examples of it ATM.
Obviously, the output encoding should be ISO8859-1, not -2...
On Fri, 8 Dec 2006, Thiemo Gehrke wrote:
UPC is a provider for middle european countries (Czechia, Hungary and Poland). They use iso6937-2 for encoding their EPG data so this looks quite strange in the vdr.
Hello the applied patch made no diffrence on my system, you can see the snapshot here: http://sysphere.org/~anrxc/upc.png
Thiemo Gehrke wrote:
UPC is a provider for middle european countries (Czechia, Hungary and Poland). They use iso6937-2 for encoding their EPG data so this looks quite strange in the vdr. The applied patch does a "remapping" to iso8859-2 so that characters are displayed correct. (Currently only tested with Czech and Hungarian, but should also work for Polish)
Here is an iconv version of the patch: http://toms-cafe.de/vdr/download/vdr-epg-conv-iso6937-1.4.5.diff
Tom