On Sunday, 10 February 2008, Udo Richter wrote:
What special handling does asprintf do with UTF-8? Is there some example that causes the trouble?
The worst case I can imagine would be an invalid 0 byte inside a UTF-8 multibyte character.
printf and family sometimes have to count characters, so I suppose they have to scan UTF-8 strings.
I know from MySQL and PostgreSQL that they also scan every UTF-8 string passed from the client for illegal characters and abort the transaction if they find any.
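To illustrate what such a scan for illegal characters might look like, here is a minimal sketch. The function name is_valid_utf8 is hypothetical; it is not taken from muggle, glibc, or either database, and it only shows the general idea of rejecting malformed sequences before a string reaches the printf family.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original code): returns true if s is
// well-formed UTF-8. Rejects stray continuation bytes, truncated sequences,
// overlong encodings, surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string &s)
{
    const unsigned char *p = reinterpret_cast<const unsigned char *>(s.data());
    const std::size_t n = s.size();
    std::size_t i = 0;

    while (i < n) {
        const unsigned char c = p[i];
        if (c < 0x80) {          // plain ASCII, including an embedded 0 byte
            ++i;
            continue;
        }

        std::size_t extra;       // number of continuation bytes expected
        unsigned long min;       // smallest code point allowed for this length
        unsigned long cp;        // code point being decoded
        if ((c & 0xE0) == 0xC0)      { extra = 1; min = 0x80;    cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; min = 0x800;   cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; min = 0x10000; cp = c & 0x07; }
        else return false;       // stray continuation byte or invalid lead byte

        if (i + extra >= n)
            return false;        // sequence is cut off at the end of the string

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = p[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;    // expected a continuation byte (10xxxxxx)
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;        // overlong encoding, out of range, or surrogate

        i += extra + 1;
    }
    return true;
}
```

With this sketch, a Latin-1 byte such as 0xe9 followed by an ASCII letter is rejected, because 0xe9 looks like the lead byte of a three-byte sequence and the following byte is not a continuation byte.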
My problem code:
mgDb::Build_cddbid(const mgSQLString& artist) const
{
    char *s;
    asprintf(&s,"%ld-%.9s",random(),artist.original());
This segfaults only if illegal UTF-8 characters appear in artist.original().
In that case asprintf returns -1, so s does not point to anything that can be freed, and the later free() gives a nice backtrace:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1319449712 (LWP 22989)]
0xb7bf57ea in free () from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0  0xb7bf57ea in free () from /lib/tls/i686/cmov/libc.so.6
#1  0xb7986908 in mgDb::Build_cddbid (this=0x86ed8e8, artist=@0xb15aa698) at mg_db.c:1023
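The crash in free() can be avoided by checking the return value of asprintf before touching the pointer. Below is a minimal sketch, not the actual muggle code: the helper name format_cddbid and its parameters are hypothetical, and it takes the number as an argument instead of calling random() so that the result is predictable. It relies on the documented glibc behaviour that asprintf returns -1 on error and leaves the pointer contents undefined.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper (not from the original code): formats "<id>-<artist>",
// using at most 9 bytes of artist, and returns an empty string on failure.
std::string format_cddbid(long id, const char *artist)
{
    assert(artist != nullptr);

    char *s = nullptr;
    const int len = asprintf(&s, "%ld-%.9s", id, artist);
    if (len < 0) {
        // asprintf failed: the contents of s are undefined, so it must not
        // be passed to free(). Report the failure to the caller instead.
        return std::string();
    }

    std::string result(s, static_cast<std::size_t>(len));
    std::free(s);  // safe: asprintf succeeded and allocated the buffer
    return result;
}
```

A caller could then treat an empty result as "could not build the id" and fall back to something else, rather than freeing an invalid pointer.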
If I change %.9s to %s, everything is fine.
I cannot easily reduce this to a simple test case. If I try it like this, it works:
char artist[50];
strcpy(artist,"Celine Dion");
artist[1]=0xe9;
asprintf(&buffer,"%ld-%.9s",random(),artist);
printf(buffer);
free(buffer);