VDR 1.5.3: UTF-8 vs. ExchangeChars()


Klaus Schmidinger

12 Jun 2007

3:46 p.m.

The function ExchangeChars() in VDR/recording.c converts characters that can't be used in file names on Windows to "#XX", where XX is the hex code of the character.

This was simple when VDR only worked with single byte character sets, but now that it can handle UTF-8 this needs to be changed, too.

Currently it has a list of characters that can be used "as is", and converts everything else to "#XX". I guess it is better to convert any "disturbing" characters to "#XX" and leave the rest untouched. However, this requires that we know exactly which characters can't be used with Windows.
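To make the "#XX" format concrete, here is a minimal C++ sketch of escaping a single byte and reversing the escape. It is not VDR's code: the helper names EscapeByte and UnescapeAll are hypothetical, and the use of uppercase hex digits is an assumption, since the thread only says that XX is the hex code of the character.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper (not from VDR): render one byte as "#XX".
// Uppercase hex digits are an assumption.
std::string EscapeByte(unsigned char c)
{
    char buf[4];  // '#', two hex digits, terminating NUL
    int n = std::snprintf(buf, sizeof(buf), "#%02X", static_cast<unsigned>(c));
    assert(n == 3);
    (void)n;  // avoids an "unused variable" warning when NDEBUG disables assert
    return std::string(buf);
}

// Hypothetical helper: turn every "#XX" back into the byte it stands for.
// A '#' that is not followed by two hex digits is copied unchanged.
std::string UnescapeAll(const std::string &s)
{
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        bool escaped = s[i] == '#' && i + 2 < s.size()
            && std::isxdigit(static_cast<unsigned char>(s[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(s[i + 2]));
        if (escaped) {
            out += static_cast<char>(std::stoi(s.substr(i + 1, 2), nullptr, 16));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += s[i];
        }
    }
    return out;
}
```

A byte-wise scheme like this turns a multi-byte UTF-8 character into one "#XX" per byte (for example, "ö" is the two bytes C3 B6), which is one reason an approach that leaves such bytes untouched looks attractive.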

Does anybody have that kind of information? Or should this be done completely differently?

Note that I don't use this feature myself, so unless there is some input from others, I won't be changing anything here.

Klaus


Stone

12 Jun

4:28 p.m.

On 6/12/07, Klaus Schmidinger Klaus.Schmidinger@cadsoft.de wrote:

...

I use this FAT feature so I can mount my Linux drive on Windows and stream movies with vdradmin to my Windows machine. I have noticed that VDR currently seems to rename a little more than is required. Invalid characters include:

. " / \ [ ] : ; = ,

http://support.microsoft.com/kb/142982

Regards.

alexander-riedel＠t-online.de

9:09 p.m.

also * ? !


Klaus Schmidinger

9:25 p.m.

On 06/12/07 23:09, alexander-riedel@t-online.de wrote:

...

also * ? !

I was wondering already about "Invalid characters *include*", which to me means that there are probably more than these.

Ok, so far we have

. " / \ [ ] : ; = ,

and

* ? !

However, '!' is in the list of characters that can be used "as is" right now, so where do you take it from that this character is also invalid?

Also: the '.' is perfectly ok, unless at the end of a directory name. So, from my point of view the characters to be exchanged to "#XX" are

" / \ [ ] : ; = , * ? and . at the end of a directory name

Any others?
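The deny-list approach with the characters collected so far can be sketched as follows. This is only an illustration under stated assumptions, not VDR's ExchangeChars(): the name EncodeComponent and its isDirectory parameter are hypothetical, the character set is just the list from this thread (which may be incomplete), and escaping '#' itself is an added assumption so that the mapping stays reversible. Because every byte of a multi-byte UTF-8 sequence is 0x80 or higher, none of the listed ASCII characters can occur inside such a sequence, so those bytes pass through unchanged.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Characters to exchange, taken from the list gathered in this thread.
// Assumption: this list may not be complete.
static const std::string kForbidden = "\"/\\[]:;=,*?";

// Hypothetical sketch: escape a single path component (not a whole path,
// since '/' is escaped too). A '.' is escaped only when it is the last
// character of a directory name.
std::string EncodeComponent(const std::string &name, bool isDirectory)
{
    std::string out;
    for (std::size_t i = 0; i < name.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(name[i]);
        bool trailingDot = isDirectory && c == '.' && i + 1 == name.size();
        // Assumption: '#' is escaped as well, so decoding is unambiguous.
        bool exchange = c == '#' || trailingDot
            || kForbidden.find(static_cast<char>(c)) != std::string::npos;
        if (exchange) {
            char buf[4];  // '#', two hex digits, terminating NUL
            int n = std::snprintf(buf, sizeof(buf), "#%02X", static_cast<unsigned>(c));
            assert(n == 3);
            (void)n;
            out += buf;
        } else {
            out += static_cast<char>(c);  // includes all UTF-8 bytes >= 0x80
        }
    }
    return out;
}
```

With this sketch, EncodeComponent("a:b", false) gives "a#3Ab", while '!' and non-ASCII characters are left as they are.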

Klaus

"The day Microsoft invents something that doesn't suck, it will be a vacuum cleaner."

- anon

...

Oleg Roitburd

9:09 p.m.

On Tue, 2007-06-12 at 17:46 +0200, Klaus Schmidinger wrote:

...

Sorry ... I don't understand this and can't see the sense in it. If you export to Windows, you do it with Samba, and you can configure the share to use UTF-8. From man smb.conf:

unix charset (G)
   Specifies the charset the unix machine Samba runs on uses. Samba needs to know this in order to be able to convert text to the charsets other SMB clients use.

   This is also the charset Samba will use when specifying arguments to scripts that it invokes.

   Default: unix charset = UTF8

So you can drop the VFAT part.

Regards Oleg Roitburd

Klaus Schmidinger

9:51 p.m.

On 06/12/07 23:09, Oleg Roitburd wrote:

...


Well, that would be the optimal solution :-)

Any objections?

Klaus

Marius Heidenstecker

13 Jun

9:23 a.m.

Am Dienstag, 12. Juni 2007 23:51 schrieb Klaus Schmidinger:

...

I use a FAT32 external USB HDD on which I store VDR recordings with the VFAT option enabled. That way I can take the recordings on that HDD to friends' Windows boxes and watch them there, with VLC for example. So I'd like to keep it. Or is there a way to avoid the problem by, for example, using different mount options for the FAT32 file system?

DMH

Oleg Roitburd

1:27 p.m.

Am Mittwoch, 13. Juni 2007 11:23 schrieb Marius Heidenstecker:

...

Take a look at the mount manpage, or burn DVDs for your friends with the burn plugin. Windows recognizes UTF-8 on DVDs very well.

Regards Oleg Roitburd

Klaus Schmidinger

1:45 p.m.

On 06/13/2007 03:27 PM, Oleg Roitburd wrote:

...

I guess the problem is not whether Windows can handle UTF-8. The problem is characters like ':' etc. that can't be used in a Windows file name.

The question is: what happens if a FAT32 partition is mounted on a Linux system (with proper UTF-8 settings) and a program creates a file named "a:b" on that partition? Will it fail? Will Linux see it but Windows won't? Or does it just work magically? In the latter case, I can't really see how this should be possible, because Windows just *can't* handle a file name like "a:b", because "a:" would be (mis-)interpreted as a drive letter.
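One way to answer the first part of that question is simply to try it. The following C++ sketch attempts to create a file and reports what the Linux side says. The helper names TryCreate and Describe are hypothetical, and the sketch makes no claim about the outcome, which depends on the kernel, the file system driver and the mount options. It relies on fopen() setting errno on failure, as it does on POSIX systems.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical probe: try to create an empty file at `path`.
// Returns 0 on success, otherwise the errno value reported by fopen().
int TryCreate(const std::string &path)
{
    assert(!path.empty());
    errno = 0;
    std::FILE *f = std::fopen(path.c_str(), "w");
    if (f == nullptr)
        return errno;
    std::fclose(f);
    return 0;
}

// Human-readable form of the result, for printing.
std::string Describe(int err)
{
    return err == 0 ? std::string("created") : std::string(std::strerror(err));
}
```

Calling Describe(TryCreate("/mnt/fat32/a:b")), with the FAT32 partition mounted at the hypothetical path /mnt/fat32, shows how Linux reacts. What Windows then displays for that entry still has to be checked on the Windows side.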

Klaus

Pertti Kosunen

11:39 a.m.

Oleg Roitburd wrote:

...

Yes you can use Samba and UTF-8, but some characters are still illegal in DOS/Windows filenames.

Oleg Roitburd

1:31 p.m.

Am Mittwoch, 13. Juni 2007 13:39 schrieb Pertti Kosunen:

...

As maintainer of ArVDR (a VDR distribution for Russian users) I would say that in the 1.5 years we have been using the UTF-8 patch without the VFAT part, I haven't heard any complaints about this issue.

Regards Oleg Roitburd

Pertti Kosunen

2:04 p.m.

Oleg Roitburd wrote:

...

touch 'foo:bar'

And dir in Windows shows: 13.06.2007 14:24 0 FF4GBY~Q

You can 'del "FF4GBY~Q"' though.

Stone

4:18 p.m.

On 6/13/07, Pertti Kosunen pertti.kosunen@pp.nic.fi wrote:

...

The problem is not sharing the Linux directory over the network with a Windows box; it is opening a file whose name contains invalid characters. When a file name contains characters that are invalid under DOS, it is displayed incorrectly and Windows will not be able to play the file.

Regards.

Pertti Kosunen

14 Jun

10:08 a.m.

Stone wrote:

...

Yes, apparently I misunderstood Oleg's point.

alexander-riedel＠t-online.de

12 Jun

9:31 p.m.

Klaus Schmidinger

9:40 p.m.

On 06/12/07 23:31, alexander-riedel@t-online.de wrote:

...

Hi,

I think we need a list of INVALID CHARACTERS, not the inverse.

I thought I had asked for that...

Klaus

...
