[fpc-pascal] Feature announcement: Unicode character constants outside of BMP

Sven Barth pascaldragon at googlemail.com
Sun May 27 16:12:28 CEST 2018


Hello everyone!

Small feature announcement this time, but some might like it nevertheless:

FPC now supports Unicode character constants outside of the 
Basic Multilingual Plane (BMP), that is, all values > $FFFF and <= 
$10FFFF (the highest possible Unicode code point). As these are encoded 
as surrogate pairs in UTF-16, they cannot be a single UnicodeChar; 
such a constant is always a UnicodeString constant.
All four bases that FPC supports for specifying character constants are 
covered: binary (#%…), octal (#&…), decimal (#…) and hexadecimal (#$…).
This is also Delphi compatible.
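
To illustrate the four notations, here is a small sketch that writes the 
same code point, U+10437, in each base. The program name and variable 
names are only illustrative, and it assumes a compiler that already 
contains this feature:

```pascal
program tubases;

var
  h, d, o, b: UnicodeString;
begin
  { The same code point, U+10437, written in each supported base }
  h := #$10437;               { hexadecimal }
  d := #66615;                { decimal }
  o := #&202067;              { octal }
  b := #%10000010000110111;   { binary }

  { Each constant is a surrogate pair, so the length should be 2 }
  Writeln(Length(h));
  { All four notations should yield the same string }
  Writeln((h = d) and (d = o) and (o = b));
end.
```

If the constants are handled as described above, the length should be 2 
and the comparison should hold.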

Example (based on the code points given here: 
https://en.wikipedia.org/wiki/UTF-16#Examples ):

=== code begin ===

program tuchar;

procedure DumpStr(const aStr: UnicodeString);
var
   u: UnicodeChar;
begin
   for u in aStr do
     Write(HexStr(Ord(u), 4), ' ');
   Writeln;
end;

var
   s: UnicodeString;
begin
   s := #$10437;
   DumpStr(s);
   s := #$24b62;
   DumpStr(s);
end.

=== code end ===

This will result in the following output:

=== output begin ===

D801 DC37
D852 DF62

=== output end ===
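
For those who want to check the values by hand, the following sketch 
applies the standard UTF-16 surrogate arithmetic to the same two code 
points. DumpSurrogates is a hypothetical helper written for this example, 
not an RTL routine, and this is not necessarily how the compiler does it 
internally:

```pascal
program tusurrogate;

{ Hypothetical helper, not part of the RTL: splits a code point
  above $FFFF into its UTF-16 high and low surrogate. }
procedure DumpSurrogates(aCodePoint: LongWord);
var
  v: LongWord;
begin
  { 20-bit offset into the supplementary planes }
  v := aCodePoint - $10000;
  { High surrogate: $D800 plus the top 10 bits }
  Write(HexStr($D800 + (v shr 10), 4), ' ');
  { Low surrogate: $DC00 plus the bottom 10 bits }
  Writeln(HexStr($DC00 + (v and $3FF), 4));
end;

begin
  DumpSurrogates($10437);
  DumpSurrogates($24B62);
end.
```

Worked through by hand, $10437 gives D801 DC37 and $24B62 gives 
D852 DF62, which matches the pairs shown in the output above.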

Regards,
Sven


