[fpc-pascal] Read lines into UnicodeString variable from UCS2 (UTF-16) encoded text file

LacaK lacak at zoznam.sk
Thu Sep 5 09:00:49 CEST 2019



>
>>> You may be able to improve on this using system.BlockRead.
>> Probably yes, but then I must read in local buffer and examine buffer 
>> for CR/LF.
>>
>> And return from my function UCS2ReadLn() only portion of string up to
>> CR/LF and rest of string return on next call to my function.
>> (so I must keep unprocessed part in global buffer)
>>
>>
>>> Also, you are assuming low order byte first which may not be portable.
>>
>> Yes, In my case LE is sufficient as far as I check presence of BOM 
>> $FF$FE
>
> Just as a comment - a contribution allowing ReadLn to read UTF-16 
> files (preferably complete from functional point of view, especially 
> without shortcuts like handling only UCS2 instead of complete Unicode) 
> would be obviously welcome.


Is there consensus on, or demand for, such a solution, and would a patch 
in this direction be accepted?
If yes, we must agree on the implementation details, and IMO someone must 
also check what the situation is in Delphi ... because I guess that if 
Delphi does not support this, FPC will not diverge either?
Question 1: should "SetTextCodePage(CP_UTF16)" and 
"SetTextCodePage(CP_UTF16BE)" be supported?
Question 2: is this supported in Delphi?
If the answer to both questions is YES, then I will file a bug report as 
a starting point.
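
To make Question 1 concrete, this is roughly what the calling code would 
look like if the proposal were accepted. It is only a sketch: the file name 
is hypothetical, and although this should compile today, the RTL, as far as 
I can tell from the comment in the sources, does not decode the bytes as 
UTF-16, so the result is not what one would want.

```pascal
program ProposedUsage;
{$mode objfpc}{$H+}

var
  F: Text;
  Line: UnicodeString;
begin
  Assign(F, 'input-utf16.txt');    // hypothetical file name
  // Must come after Assign, because Assign initialises the whole TextRec.
  SetTextCodePage(F, CP_UTF16);    // or CP_UTF16BE for big-endian input
  {$I-} Reset(F); {$I+}
  if IOResult <> 0 then
  begin
    WriteLn('cannot open input file');
    Halt(1);
  end;
  while not Eof(F) do
  begin
    // Proposed behaviour: Line holds one decoded line, surrogates included.
    ReadLn(F, Line);
    WriteLn(Length(Line));
  end;
  Close(F);
end.
```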

As I wrote, there is an explicit comment in the sources: "// all standard 
input is assumed to be ansi-encoded", which will no longer be true if we 
add UTF-16 support.

I can imagine that we could add a check for TextRec(T).CodePage = CP_UTF16 
or CP_UTF16BE and handle these two cases specially (in both the read and 
the write procedures for text files).
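
A minimal sketch of that check, under stated assumptions: IsUtf16Text and 
BytesToUnicode are hypothetical helpers, not RTL routines, and the real 
change would live inside the RTL read/write procedures rather than in user 
code. The sample bytes stand in for data taken from the file buffer.

```pascal
program Utf16CodePageCheck;
{$mode objfpc}{$H+}

// Hypothetical helper: reports whether T is tagged as UTF-16 and, if so,
// which byte order was requested.
function IsUtf16Text(var T: Text; out BigEndian: Boolean): Boolean;
begin
  BigEndian := TextRec(T).CodePage = CP_UTF16BE;
  Result := BigEndian or (TextRec(T).CodePage = CP_UTF16);
end;

// Hypothetical helper: combines Count raw bytes (Count must be even) into
// UTF-16 code units. Surrogate pairs need no special handling here, since
// UnicodeString itself stores UTF-16 code units.
function BytesToUnicode(const Buf: array of Byte; Count: SizeInt;
  BigEndian: Boolean): UnicodeString;
var
  I: SizeInt;
  HiByte, LoByte: Word;
begin
  Result := '';
  SetLength(Result, Count div 2);
  for I := 0 to Count div 2 - 1 do
  begin
    if BigEndian then
    begin
      HiByte := Buf[2 * I];
      LoByte := Buf[2 * I + 1];
    end
    else
    begin
      LoByte := Buf[2 * I];
      HiByte := Buf[2 * I + 1];
    end;
    Result[I + 1] := WideChar(Word((HiByte shl 8) or LoByte));
  end;
end;

const
  // 'A' followed by U+20AC, little-endian, without BOM
  Sample: array[0..3] of Byte = ($41, $00, $AC, $20);
var
  F: Text;
  BE: Boolean;
  S: UnicodeString;
begin
  Assign(F, 'unused.txt');         // the file is never opened in this sketch
  SetTextCodePage(F, CP_UTF16);
  if IsUtf16Text(F, BE) then
  begin
    S := BytesToUnicode(Sample, Length(Sample), BE);
    WriteLn('code units: ', Length(S));
  end;
end.
```

A complete solution would also have to find CR/LF on code-unit boundaries 
and keep an odd trailing byte for the next buffer refill; neither is shown 
here.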

But since Read[Ln]/Write[Ln] is core functionality, I think that one of 
the core developers should look at it ... ;-)

-Laco.