[fpc-devel] TMask classes

Michael Van Canneyt michael at freepascal.org
Thu Dec 3 20:57:04 CET 2020


Hello José,

FPC has 2 units with similar functionality, fpmasks (in fpindexer) and maskutils (in fcl-base).

I think one of these units could be extended with extra functions; 
I don't think we should introduce new units.

For maskutils there are already some tests, so if you write tests for your
classes, you could extend tcmaskutils.pp (directory fcl-base/tests)

So if you prepare a patch and upload it to the bugtracker, I'll look at it,
or someone else will.

Michael.

On Thu, 3 Dec 2020, José Mejuto via fpc-devel wrote:

> Hello,
>
> I've written a TMask class family and I would like to know if it could 
> be of any interest for the FPC code base. If there is interest I will 
> write some unit tests, but if there is none I won't write them ;-)
>
> There is a TMask class in LazUtils but it lacks some features and has a 
> speed problem on large strings when 2 or more '*' are involved.
>
> My TMask family is:
>
> TMaskAnsi, TMaskUTF8 and TMaskUnicode:
>
> These three classes hold the logic for each "char" space: ANSI 1 char, 
> UTF8 1 to 6 bytes and Unicode 2 or 4 bytes.
>
> Available wildcards are:
>
> * = Any text.
> ? = Just one character unit.
> [abc] = Any one of "abc"
> [a-c] = Any one of "abc" (range)
> [!ab] = Any character except "a" or "b"
> [!a-c] = Any character outside the range (negated group)
> [?] = One character unit or none.
> \? = Escape character '\' (next char is a literal "?")
>
> So the usual masks, nothing new under the sun.
>
> TMaskAnsi does not allow high char ranges, so "[á-ú]" is not valid, as 
> characters in ANSI usually do not follow the expected order. UTF8 and 
> Unicode allow them, as there they usually are in the "expected order".
>
> Any feature can be toggled. For example, it can be set to ignore ranges, 
> so that "[a-z]" is interpreted as "a" or "-" or "z". The escape char can 
> also be configured, but it must be ASCII < 127.
>
> Derived from these classes there are:
>
> TMaskAnsiWindows, TMaskUTF8Windows and TMaskUnicodeWindows.
>
> These classes implement the special wildcards of Windows, inherited from 
> CP/M: "*." must match all files without extension, "*.*" must match 
> any file, "file.ext" must match "file.extension", and "file??.ext" must 
> match "file.ext", "file1.ext" and "file1a.ext".
>
> These special cases are called Quirks and can be enabled or disabled one 
> by one. For example, ".ext" in the past matched any file with ".ext" as 
> extension, but this does not happen in recent Windows versions, so in the 
> code it is disabled by default.
>
> Thank you for reading such an amount of text, have a nice day.
>
>
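
To make the wildcard list in the quoted message concrete, here is a minimal
sketch of a matcher for those rules. It is written in Python for brevity and
is not the proposed TMask code or the API of fpmasks/maskutils; the names
compile_mask and matches are hypothetical. It assumes code-point "character
units", does not handle escapes inside "[...]", and has no feature toggles or
Windows quirks. It tracks a set of mask positions instead of backtracking,
which is one way to avoid the slowdown with 2 or more '*' on large strings.

```python
def compile_mask(mask, escape="\\"):
    """Turn a mask string into a list of tokens (hypothetical helper)."""
    tokens, i, n = [], 0, len(mask)
    while i < n:
        c = mask[i]
        if c == escape:                      # \? -> literal "?"
            if i + 1 >= n:
                raise ValueError("dangling escape character")
            tokens.append(("lit", mask[i + 1]))
            i += 2
        elif c == "*":                       # any text
            tokens.append(("star",))
            i += 1
        elif c == "?":                       # exactly one character unit
            tokens.append(("any",))
            i += 1
        elif c == "[":
            end = mask.find("]", i + 1)
            if end == -1:
                raise ValueError("unterminated '['")
            body = mask[i + 1:end]
            if body == "?":                  # [?] -> one character unit or none
                tokens.append(("opt",))
            else:
                negate = body.startswith("!")
                if negate:
                    body = body[1:]
                ranges, j = [], 0
                while j < len(body):
                    if j + 2 < len(body) and body[j + 1] == "-":
                        ranges.append((body[j], body[j + 2]))   # a-c
                        j += 3
                    else:
                        ranges.append((body[j], body[j]))       # single char
                        j += 1
                tokens.append(("set", negate, ranges))
            i = end + 1
        else:
            tokens.append(("lit", c))
            i += 1
    return tokens


def _closure(tokens, states):
    """Add positions reachable without consuming input (skip '*' and '[?]')."""
    seen, stack = set(states), list(states)
    while stack:
        k = stack.pop()
        if k < len(tokens) and tokens[k][0] in ("star", "opt") and k + 1 not in seen:
            seen.add(k + 1)
            stack.append(k + 1)
    return seen


def matches(mask, text):
    """Return True if the whole text matches the mask."""
    tokens = compile_mask(mask)
    states = _closure(tokens, {0})
    for ch in text:
        nxt = set()
        for k in states:
            if k == len(tokens):
                continue                     # mask exhausted, text is not
            kind = tokens[k][0]
            if kind == "star":
                nxt.add(k)                   # '*' absorbs ch and stays put
            elif kind in ("any", "opt"):
                nxt.add(k + 1)
            elif kind == "lit":
                if tokens[k][1] == ch:
                    nxt.add(k + 1)
            else:                            # "set"
                _, negate, ranges = tokens[k]
                hit = any(lo <= ch <= hi for lo, hi in ranges)
                if hit != negate:
                    nxt.add(k + 1)
        states = _closure(tokens, nxt)
        if not states:
            return False
    return len(tokens) in states
```

With this sketch, matches("a[?]c", "ac") and matches("a[?]c", "abc") both
succeed while matches("a?c", "ac") does not. The work per input character is
bounded by the mask length, so a mask like "*a*a*a*b" stays cheap on long
strings.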

