OpenCSW Bug Tracker

Project:


ID: 0005298
Category: [git] regular use
Severity: major
Reproducibility: always
Date Submitted: 2017-02-19 18:02
Last Update: 2017-02-20 10:21
Reporter: danny
View Status: public
Assigned To: dam
Priority: normal
Resolution: open
Platform:
Status: assigned
Projection: none
OS Version:
ETA: none
Product Build:

Summary

0005298: git fast-export producing corrupt output with UTF8 locale on Solaris 11

Description

On Solaris 11, using git 2.3.1 or 2.4.0 from OpenCSW, when the committer or author name contains multibyte characters, git fast-export writes extra garbage after the author/committer line:

$ locale
LANG=en_US.UTF8
LC_CTYPE="en_US.UTF8"
[...]
LC_ALL=

$ git fast-export --all
blob
mark :1
data 6
Hello

reset refs/heads/master
commit refs/heads/master
mark :2
author Füü Bär <foo@example.com> 1487506001 +0100
co
committer Füu Bär <foo@example.com> 1487506001 +0100

data 6
Hello
M 100644 :1 foo.txt

reset refs/heads/master
from :2

With LC_CTYPE=C the problem disappears (this correct output can be used with git fast-import to create a test case repository):

$ LC_CTYPE=C git fast-export --all
blob
mark :1
data 6
Hello

reset refs/heads/master
commit refs/heads/master
mark :2
author Füü Bär <foo@example.com> 1487506001 +0100
committer Füu Bär <foo@example.com> 1487506001 +0100
data 6
Hello
M 100644 :1 foo.txt

reset refs/heads/master
from :2

Steps To Reproduce

Additional Information

The problem might be that git is not compiled in C99-compliant mode. On Solaris 11, this apparently causes printf() to interpret the precision of a string conversion (the 9 in %.9s) as a number of characters instead of a number of bytes, as C99 specifies. When the string contains multibyte UTF-8 characters, the character count covers more bytes than intended, so additional garbage is printed.
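The byte/character arithmetic behind this can be checked with a small sketch. The helper below (utf8_prefix_bytes is a hypothetical name, not part of git or libc) counts how many bytes the first N characters of a string occupy, assuming well-formed UTF-8 input. For the string fööbär@@@ (12 bytes in UTF-8), the first 6 characters take 9 bytes, while 9 characters take all 12 bytes, which would account for three extra bytes of output when a precision of 9 is treated as a character count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: return the number of bytes
 * occupied by the first max_chars characters of the NUL-terminated string s.
 * Assumes s is well-formed UTF-8. */
size_t utf8_prefix_bytes(const char *s, size_t max_chars)
{
    size_t bytes = 0;
    size_t chars = 0;

    assert(s != NULL);
    while (s[bytes] != '\0') {
        unsigned char b = (unsigned char)s[bytes];
        /* Continuation bytes look like 10xxxxxx; any other byte starts
         * a new character. */
        if ((b & 0xC0) != 0x80) {
            if (chars == max_chars)
                break;          /* the next character would exceed the limit */
            chars++;
        }
        bytes++;
    }
    return bytes;
}
```

Called on the test string with max_chars set to 6 this returns 9, and with max_chars set to 9 it returns 12.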

The following C program illustrates the difference in behaviour:

$ cat printf_test.c
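#include <stdio.h>
#include <locale.h>

int main(void) {
    /* UTF-8 bytes of "fööbär@@@": 12 bytes, 9 characters ("fööbär" alone is 9 bytes) */
    char test[] = { 102, -61, -74, -61, -74, 98, -61, -92, 114, 64, 64, 64, 0 };
    setlocale(LC_CTYPE, "");    /* pick up the locale from the environment */
    printf("%.9s\n", test);     /* C99: the precision counts bytes */
    return 0;
}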

$ cc -o printf_test printf_test.c
$ ./printf_test
fööbär@@@

$ cc -o printf_test printf_test.c -xc99
$ ./printf_test
fööbär

Or with gcc:

$ gcc -o printf_test printf_test.c
$ ./printf_test
fööbär@@@

$ gcc -o printf_test printf_test.c /usr/lib/32/values-xpg6.o
$ ./printf_test
fööbär
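Independently of the compiler mode, code that already knows the byte length of a string could avoid the precision-limited %s conversion altogether. The following is only a sketch of that idea, not a description of how git handles it; write_bytes_line is a hypothetical name. It relies on fwrite(), which takes a byte count and does not consult the locale.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, for illustration only: write exactly len bytes of s
 * to out, followed by a newline. Returns 0 on success, -1 on a write error. */
int write_bytes_line(FILE *out, const char *s, size_t len)
{
    assert(out != NULL && s != NULL);
    /* fwrite() counts bytes, so multibyte characters do not change
     * how much of the buffer is written. */
    if (fwrite(s, 1, len, out) != len)
        return -1;
    if (fputc('\n', out) == EOF)
        return -1;
    return 0;
}
```

With the test array from printf_test.c, write_bytes_line(stdout, test, 9) should emit the 9 bytes of fööbär and a newline in any locale.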

Relationships