[fpc-pascal] Split stream into words

Michael Van Canneyt michael at freepascal.org
Tue Jul 3 10:31:43 CEST 2018


Hi,

What's the easiest way to split a stream into words?
Words are just that: words, but (here is the caveat) the splitting must support Unicode.
So Michael and Michaël are both words.

I tried the regexpr unit (the obvious choice), but that does not seem to do the trick:

{$mode objfpc}
{$H+}
uses cwstring, sysutils, classes, regexpr;

Var
   Split : TStringList;
   S : String;
   R : TRegexpr;

begin
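   Split:=TStringList.Create;
   r := TRegExpr.Create;
   try
     // Read the whole input file into a single string.
     Split.LoadFromFile(ParamStr(1));
     S:=Split.Text;
     Split.Clear;
     // Intended to pick out runs of word characters.
     r.Expression :='[\w]+';
     r.Split (S, Split);
     for S in Split do
       Writeln('Found: ',S);
   finally
     r.Free;
     Split.Free;
   end;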
end.

It simply prints nonsense...
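
For comparison, a minimal sketch of the behaviour I am after, done without regexpr by scanning for runs of letters. Several things here are assumptions: the input file is UTF-8, a "word" is a run of characters for which TCharacter.IsLetter (character unit) returns true, and the text uses precomposed characters inside the Basic Multilingual Plane (combining marks and surrogate pairs would end a word). SplitWords is a hypothetical helper name, not something from the RTL.

```pascal
{$mode objfpc}
{$H+}
// Sketch only: collects runs of Unicode letters instead of using regexpr.
// Assumes UTF-8 input and precomposed BMP characters.
uses cwstring, sysutils, classes, character;

// SplitWords is a hypothetical helper, not part of the RTL.
procedure SplitWords(const U: UnicodeString; Words: TStrings);
var
  I, Start: Integer;
begin
  Start := 0;                       // 0 = not currently inside a word
  for I := 1 to Length(U) do
    if TCharacter.IsLetter(U[I]) then
    begin
      if Start = 0 then
        Start := I;                 // first letter of a new word
    end
    else if Start > 0 then
    begin
      // A non-letter ends the current word: store it as UTF-8.
      Words.Add(UTF8Encode(Copy(U, Start, I - Start)));
      Start := 0;
    end;
  if Start > 0 then                 // word running up to the end of the text
    Words.Add(UTF8Encode(Copy(U, Start, Length(U) - Start + 1)));
end;

var
  Src, Split: TStringList;
  S: String;

begin
  Src := TStringList.Create;
  Split := TStringList.Create;
  try
    Src.LoadFromFile(ParamStr(1));
    // Decode once to UTF-16 so each element is one code unit.
    SplitWords(UTF8Decode(Src.Text), Split);
    for S in Split do
      Writeln('Found: ', S);
  finally
    Split.Free;
    Src.Free;
  end;
end.
```

Digits and underscores are not treated as word characters here; TCharacter.IsLetterOrDigit could be swapped in if they should be.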

Michael.

