ID | Category | Severity | Reproducibility | Date Submitted | Last Update
0005298 | [git] regular use | major | always | 2017-02-19 18:02 | 2017-02-20 10:21
Reporter | danny
View Status | public
Assigned To | dam
Priority | normal
Resolution | open
Status | assigned
Projection | none
ETA | none
Platform |
OS |
OS Version |
Product Build |
Summary | 0005298: git fast-export producing corrupt output with UTF8 locale on Solaris 11
Description |
On Solaris 11, using git 2.3.1 or 2.4.0 from OpenCSW, when the committer or author name contains multibyte characters, git fast-export writes extra garbage after the author/committer line (note the stray "co" following the author line):

    $ locale
    LANG=en_US.UTF8
    LC_CTYPE="en_US.UTF8"
    [...]
    LC_ALL=
    $ git fast-export --all
    blob
    mark :1
    data 6
    Hello
    reset refs/heads/master
    commit refs/heads/master
    mark :2
    author Füü Bär <foo@example.com> 1487506001 +0100
    co
    committer Füu Bär <foo@example.com> 1487506001 +0100
    data 6
    Hello
    M 100644 :1 foo.txt
    reset refs/heads/master
    from :2

With LC_CTYPE=C the problem disappears. This correct output can be fed to git fast-import to create a test case repository:

    $ LC_CTYPE=C git fast-export --all
    blob
    mark :1
    data 6
    Hello
    reset refs/heads/master
    commit refs/heads/master
    mark :2
    author Füü Bär <foo@example.com> 1487506001 +0100
    committer Füu Bär <foo@example.com> 1487506001 +0100
    data 6
    Hello
    M 100644 :1 foo.txt
    reset refs/heads/master
    from :2
Steps To Reproduce |
Additional Information |
The problem might be that git is not compiled in C99-compliant mode. On Solaris 11, this apparently causes printf() to interpret the precision of a string conversion (the 9 in "%.9s") as a number of characters instead of a number of bytes, which results in additional garbage output when the string contains multibyte UTF-8 characters. The following C program illustrates the difference in behaviour:

    $ cat printf_test.c
    #include <stdio.h>
    #include <strings.h>
    #include <locale.h>

    int main() {
        /* "fööbär@@@" in UTF-8: 9 characters, 12 bytes; the first 9 bytes are "fööbär" */
        char test[] = { 102, -61, -74, -61, -74, 98, -61, -92, 114, 64, 64, 64, 0 };
        setlocale(LC_CTYPE, "");
        printf("%.9s\n", test);
        return 0;
    }
    $ cc -o printf_test printf_test.c
    $ ./printf_test
    fööbär@@@
    $ cc -o printf_test printf_test.c -xc99
    $ ./printf_test
    fööbär

Or with gcc:

    $ gcc -o printf_test printf_test.c
    $ ./printf_test
    fööbär@@@
    $ gcc -o printf_test printf_test.c /usr/lib/32/values-xpg6.o
    $ ./printf_test
    fööbär
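If the character-versus-byte interpretation described above is indeed the cause, two small helpers make the effect easier to reason about. The sketch below is only an illustration and is not taken from git: the function names utf8_continuation_bytes and write_line_bytes are hypothetical. The first counts UTF-8 continuation bytes in a prefix, which is the minimum number of extra bytes printf() would emit if it treated a byte length as a character count. The second writes a line with fwrite(), which always takes a byte count and so does not depend on the locale or the compilation mode.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: count UTF-8 continuation bytes (bit pattern 10xxxxxx)
 * in the first n bytes of s. If a precision of n is treated as a character
 * count, printf() would read at least this many bytes past the intended end. */
static size_t utf8_continuation_bytes(const char *s, size_t n)
{
    size_t i;
    size_t count = 0;

    for (i = 0; i < n; i++) {
        if (((unsigned char)s[i] & 0xC0) == 0x80)
            count++;
    }
    return count;
}

/* Hypothetical helper: write exactly n bytes of s followed by a newline.
 * fwrite() counts bytes, so the result is the same in every locale.
 * Returns 0 on success, -1 on a write error. */
static int write_line_bytes(FILE *out, const char *s, size_t n)
{
    if (fwrite(s, 1, n, out) != n)
        return -1;
    if (fputc('\n', out) == EOF)
        return -1;
    return 0;
}
```

With the test array from the program above, utf8_continuation_bytes(test, 9) yields 3, matching the three extra "@" bytes in the non-C99 output, while write_line_bytes(stdout, test, 9) emits only the nine bytes of "fööbär" plus a newline.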
Tags | No tags attached.
Attached Files |
There are no notes attached to this issue. |