[fpc-pascal] Weird string behavior

Santiago A. svaa at ciberpiula.net
Thu Jul 21 20:36:22 CEST 2016


Hello:

I'm working on Windows XP with FPC 3.0.0, as shipped with the stable Lazarus 1.6.

I've come across this issue: when I concatenate two strings that hold
UTF-8 data, the result is converted to ANSI (Win-1252).
Is this a bug, or am I missing something?

I have attached a demo.


-- 
Regards

Santiago A.

-------------- next part --------------
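program testconvertstr;

// Assumption: "string" means AnsiString (long strings), as in a default
// Lazarus project. The "absolute" overlay in DisplayBytes relies on it.
{$H+}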

//-----------------------------------

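// Prints each byte of the string it receives as a decimal value.
// The RawByteString overlay reads the bytes exactly as they are stored.
procedure DisplayBytes(aString:String);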
var
  s:RawByteString absolute aString;
  i:Integer;
begin
  Write('  ');
  for i:=1 to length(s) do
    write(ord(s[i]),' ');
  writeln;
end;

//-----------------------------------
// body
//-----------------------------------
var
  AnsiStrA:string;
  AnsiStrB:string;
  Utf8StrA:string;
  Utf8StrB:string;
  Utf8StrConcat:string;
begin
  AnsiStrA:=' ';
  AnsiStrA[1]:=#243; // o with acute accent (U+00F3) in Win-1252
  AnsiStrB:='A';

  Writeln('AnsiStrA: ');DisplayBytes(AnsiStrA); // 243
  Writeln('AnsiStrB: ');DisplayBytes(AnsiStrB); // 65


  Utf8StrA:=AnsiToUtf8(AnsiStrA); // 195 179
  Utf8StrB:=AnsiToUtf8(AnsiStrB); // 65

  writeln;
  Writeln('Utf8StrA:');DisplayBytes(Utf8StrA); // 195 179
  Writeln('Utf8StrB:');DisplayBytes(Utf8StrB); // 65

  // Expected 195 179 65, but this displays 243 65, as if the operands
  // were automatically converted back to Win-1252 during concatenation.
  Writeln('Utf8StrA+Utf8StrB  ???!!!!:');DisplayBytes(Utf8StrA+Utf8StrB);

  writeln;
  Writeln('Utf8StrA again:');DisplayBytes(Utf8StrA); // 195 179
  Writeln('Utf8StrB again:');DisplayBytes(Utf8StrB); // 65


  // Same unexpected result when assigning to an intermediate variable.
  Utf8StrConcat:=Utf8StrA+Utf8StrB;
  writeln;
  Writeln('Utf8StrConcat:');DisplayBytes(Utf8StrConcat);
  Readln;

end.


