[fpc-pascal] Parse unicode scalar

Hairy Pixels genericptr at gmail.com
Mon Jul 3 09:12:03 CEST 2023



> On Jul 3, 2023, at 2:04 PM, Tomas Hajny via fpc-pascal <fpc-pascal at lists.freepascal.org> wrote:
> 
> No - in this case, the "header" is the highest bit of that byte being 0.

Oh, it's the header BIT. Admittedly, I don't understand how this function tests the highest bit with that case statement, which is what I think he was suggesting.

function UTF8CodepointSizeFast(p: PChar): integer;
begin
 case p^ of
   #0..#191   : Result := 1;
   #192..#223 : Result := 2;
   #224..#239 : Result := 3;
   #240..#247 : Result := 4;
   else Result := 1; // An optimization + prevents compiler warning about uninitialized Result.
 end;
end;
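
One way to read the case ranges is as bit patterns of the lead byte: #0..#127 is 0xxxxxxx (highest bit 0), #128..#191 is 10xxxxxx (a continuation byte, which the function simply counts as 1), #192..#223 is 110xxxxx, #224..#239 is 1110xxxx and #240..#247 is 11110xxx. The sketch below is only my attempt to check that reading. SizeFromHeaderBits is a hypothetical helper I made up for comparison, not something from the FPC or Lazarus sources, and the program just compares it with the function above for every possible byte value.

```pascal
program LeadByteDemo;
{$mode objfpc}{$H+}

// The function quoted above, with the same behaviour.
function UTF8CodepointSizeFast(p: PChar): integer;
begin
  case p^ of
    #0..#191   : Result := 1;
    #192..#223 : Result := 2;
    #224..#239 : Result := 3;
    #240..#247 : Result := 4;
    else Result := 1;
  end;
end;

// Hypothetical helper (an assumption for illustration): the same
// decision written as explicit tests on the leading "header" bits.
function SizeFromHeaderBits(b: Byte): Integer;
begin
  if (b and $80) = $00 then Result := 1        // 0xxxxxxx: highest bit is 0
  else if (b and $E0) = $C0 then Result := 2   // 110xxxxx
  else if (b and $F0) = $E0 then Result := 3   // 1110xxxx
  else if (b and $F8) = $F0 then Result := 4   // 11110xxx
  else Result := 1;                            // 10xxxxxx or 11111xxx
end;

var
  i, mismatches: Integer;
  c: Char;
begin
  mismatches := 0;
  // Compare both functions for all 256 byte values.
  for i := 0 to 255 do
  begin
    c := Chr(i);
    if UTF8CodepointSizeFast(@c) <> SizeFromHeaderBits(Byte(i)) then
      Inc(mismatches);
  end;
  WriteLn('mismatches: ', mismatches);
end.
```

If the reading is right, the loop reports zero mismatches, which would mean the case statement never looks at the highest bit on its own: each ordinal range just happens to cover exactly the bytes that share one header pattern.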

Regards,
Ryan Joseph


