[fpc-pascal] rawbytestring

Marco van de Voort marcov at stack.nl
Sat Aug 18 19:53:52 CEST 2012


In past Unicode discussions, rawbytestring was offered as a solution to
mixing utf8 and unicodestring (UTF-16).

I got the impression that rawbytestring was a kind of open array string, to
which strings could be passed without implicit conversion (which would save
a lot of overloading).

A week or two back I saw something in a Delphi component that made me doubt
that hypothesis, so today I tested it with a small program.

It turns out that rawbytestring only behaves that way for 1-byte string
types. Anything else (unicodestring, widestring) gets converted to the
default ANSI (1-byte) encoding, which is Windows-1252 on my system,
resulting in lossy conversions.

FPC 2.7.1 is fully compatible with Delphi in this respect. This makes simply
changing general RTL routines that take strings to take rawbytestring
instead difficult. Or am I missing something?

output:

short 1252 1
ansi 1252 1
utf8 65001 1
unic 1252 1
wide 1252 1

program:

program sometest;

{$ifdef fpc}
{$mode delphi}
{$else}
{$APPTYPE CONSOLE}
{$endif}

uses
  SysUtils;

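// Prints the label s, followed by the code page and the element size
// (bytes per character) of the string as it arrives in x.
procedure testraw(s:string;x:rawbytestring);

begin
  writeln(s,' ',stringcodepage(x),' ',StringElementSize(x));
end;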

var sshort : shortstring;
    sutf8 : UTF8String;
    sunicode : unicodestring;
    swide : widestring;
    sansi : ansistring;

begin
  sshort:='Fiancé';
  sansi:='Fiancé';
  sutf8:='Fiancé';
  sunicode:='Fiancé';
  swide:='Fiancé';
  testraw('short',sshort);
  testraw('ansi',sansi);
  testraw('utf8',sutf8);
  testraw('unic',sunicode);
  testraw('wide',swide);
  readln;
end.
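
The implicit conversion seen above can be contrasted with an explicit
conversion to UTF-8 done before the call. The following is only a minimal
sketch, not a proposal for the RTL: the names rawdemo, ShowRaw, tag and
viaUtf8 are made up for illustration, and it assumes that an explicit
UTF8String() cast of a unicodestring performs a real conversion, as it does
in Delphi 2009+ and should in FPC 2.7.1. The non-ASCII characters are written
as #$ constants so that the result does not depend on the source file
encoding.

```pascal
program rawdemo;

{$ifdef fpc}
{$mode delphi}
{$else}
{$APPTYPE CONSOLE}
{$endif}

uses
  SysUtils;

// Hypothetical helper: reports the code page and the byte length of
// whatever actually arrives in the rawbytestring parameter.
procedure ShowRaw(const tag: string; const x: rawbytestring);
begin
  writeln(tag, ' codepage=', StringCodePage(x), ' bytes=', Length(x));
end;

var
  u: unicodestring;
  viaUtf8: UTF8String;

begin
  // 'Fiance' with e-acute, a space, and a Cyrillic letter (U+0416) that
  // Windows-1252 cannot represent.
  u := 'Fianc'#$00E9' '#$0416;

  // Implicit: the compiler converts the 2-byte string to the default
  // 1-byte code page first, so U+0416 may be lost on a 1252 system.
  ShowRaw('implicit', u);

  // Explicit: convert to UTF-8 ourselves; a UTF8String is a 1-byte type
  // and is passed through to the rawbytestring parameter unchanged.
  viaUtf8 := UTF8String(u);
  ShowRaw('explicit', viaUtf8);
end.
```

If the assumption holds, the second call should report code page 65001, like
the utf8 row in the output above, while the first call reports whatever the
default system code page is. It does not remove the need for the caller to
know about the conversion, which is exactly the problem for general RTL
routines.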


