[fpc-devel] for-in-index loop

Fri Jan 25 11:36:47 CET 2013

On 25 January 2013 at 11:22, Michael Schnell <mschnell at lumino.de> wrote:
> On 01/25/2013 11:12 AM, Michael Van Canneyt wrote:
> >
> > Pchar ?
> >
> You seem to miss my point: the n'th printable character in a UTF-8
> coded string (whether it is stored as a PChar or a string) starts at the
> m'th byte (m>=n).
>
> To find m for a given n you need to scan all bytes < m.
>
> Thus a loop such as
>
> for I := 1 to 100000 do begin
>   n := Integer (random(100000));
>   c := myString[n];
> end;
>
> is rather fast with ANSI coded strings.

The same silly loop in UTF-8:

// find random characters in myString
// (n is used as a byte position, not as a character index)
for I := 1 to 100000 do begin
  n := Integer(Random(100000));
  cp := UTF8FindNearestCharStart(PChar(myString), Length(myString), n);
end;
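
To make the difference between the two operations concrete, here is a
minimal sketch that does not depend on any library. NthCodePointStart and
SnapToCodePointStart are hypothetical helpers written only for this
illustration; they are not the LCL implementation, and they assume valid
UTF-8, 1-based byte indices and a bytePos within 1..Length(s). The first
has to scan every byte before the result, the second only looks at the
bytes next to the given position.

```pascal
program Utf8IndexSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: 1-based byte index of the n-th code point (1-based),
// or 0 if the string holds fewer than n code points.
// It must inspect every byte before the result, so the cost grows with m.
function NthCodePointStart(const s: string; n: SizeInt): SizeInt;
var
  i, count: SizeInt;
begin
  count := 0;
  for i := 1 to Length(s) do
    // Continuation bytes look like 10xxxxxx; any other byte starts a code point.
    if (Ord(s[i]) and $C0) <> $80 then
    begin
      Inc(count);
      if count = n then
        Exit(i);
    end;
  Result := 0;
end;

// Hypothetical helper: move a 1-based byte index back to the first byte of
// the code point that contains it. In valid UTF-8 this steps over at most
// three continuation bytes, independent of the string length.
function SnapToCodePointStart(const s: string; bytePos: SizeInt): SizeInt;
begin
  Result := bytePos;
  while (Result > 1) and ((Ord(s[Result]) and $C0) = $80) do
    Dec(Result);
end;

var
  s: string;
begin
  // a, a-umlaut, b, euro sign, c: 5 code points stored in 8 bytes
  s := 'a' + #$C3#$A4 + 'b' + #$E2#$82#$AC + 'c';
  WriteLn(NthCodePointStart(s, 4));     // scans from byte 1; prints 5
  WriteLn(SnapToCodePointStart(s, 6));  // looks only near byte 6; prints 5
end.
```

Picking a random byte position and snapping it is what the loop above
does; asking for the n'th character is what the quoted myString[n]
replacement would have to do.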

> When myString is coded in UTF-8, it obviously provides silly code bytes
> instead of printable characters, and replacing the term myString[n] with a
> straightforward function searching for the n'th printable character
> will be very slow.

Maybe real-world examples would be better for proving a point.

Mattias