[fpc-pascal] Read lines into UnicodeString variable from UCS2 (UTF-16) encoded text file

LacaK lacak at zoznam.sk
Fri Sep 6 07:24:09 CEST 2019


From the user's point of view, the situation is this:
- on one side there is an input text file encoded as UTF-16 (either LE
or BE)
- on the other side there is FPC, where RTL procedures like AssignFile, 
SetTextCodePage, Reset, Read(Ln) and Write(Ln) are available.

My original intention was simply to call the existing procedure 
SetTextCodePage with the parameter CP_UTF16, which in my opinion would 
simply signal to the RTL that the input/output text file is (or should 
be) encoded as UTF-16.
Then any subsequent call to ReadLn with any destination variable 
(AnsiString, UnicodeString, Integer, etc.) would do something like:
- read a byte sequence from the file and interpret it as UTF-16, so 
that on input we have a UnicodeString
- convert this UnicodeString to the requested destination variable (as 
FPC has implicit conversions between UnicodeString and AnsiString, this 
should be no problem)

(For Write(Ln) the same would happen in reverse order: source variable 
-> UnicodeString -> write to file.)
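
Until the RTL offers something like this, the two read steps above can 
be done by hand. The following is only a minimal sketch of that idea, 
not a proposal for the RTL implementation: LoadUtf16File is a 
hypothetical helper name, the whole file is loaded into memory (so it 
assumes files well below 2 GB), the byte order is taken from the BOM 
with little-endian assumed when there is none, and the output is 
written as UTF-8 via UTF8Encode.

```pascal
program ReadUtf16Lines;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

{ Hypothetical helper: loads a UTF-16 (LE or BE) file into a UnicodeString.
  Byte order comes from the BOM; without a BOM, little-endian is assumed. }
function LoadUtf16File(const FileName: string): UnicodeString;
var
  Stream: TFileStream;
  Buf: array of Byte;
  ByteCount, CharCount, Start, I: Integer;
  BigEndian: Boolean;
  Code: Word;
begin
  Result := '';
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    ByteCount := Integer(Stream.Size);
    SetLength(Buf, ByteCount);
    if ByteCount > 0 then
      Stream.ReadBuffer(Buf[0], ByteCount);
  finally
    Stream.Free;
  end;

  { Detect and skip the byte order mark, if present. }
  BigEndian := False;
  Start := 0;
  if ByteCount >= 2 then
  begin
    if (Buf[0] = $FE) and (Buf[1] = $FF) then
    begin
      BigEndian := True;
      Start := 2;
    end
    else if (Buf[0] = $FF) and (Buf[1] = $FE) then
      Start := 2;
  end;

  { Combine each byte pair into one UTF-16 code unit.
    A trailing odd byte, if any, is ignored. }
  CharCount := (ByteCount - Start) div 2;
  SetLength(Result, CharCount);
  for I := 0 to CharCount - 1 do
  begin
    if BigEndian then
      Code := (Word(Buf[Start + 2 * I]) shl 8) or Buf[Start + 2 * I + 1]
    else
      Code := (Word(Buf[Start + 2 * I + 1]) shl 8) or Buf[Start + 2 * I];
    Result[I + 1] := WideChar(Code);
  end;
end;

var
  Content, Line: UnicodeString;
  LineStart, P: Integer;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: ReadUtf16Lines <file>');
    Halt(1);
  end;

  Content := LoadUtf16File(ParamStr(1));

  { Split on LF and drop a preceding CR, so both LF and CRLF files work. }
  LineStart := 1;
  for P := 1 to Length(Content) + 1 do
    if (P > Length(Content)) or (Content[P] = #10) then
    begin
      Line := Copy(Content, LineStart, P - LineStart);
      if (Line <> '') and (Line[Length(Line)] = #13) then
        SetLength(Line, Length(Line) - 1);
      { Skip the empty remainder after a final line terminator. }
      if (P <= Length(Content)) or (Line <> '') then
        WriteLn(UTF8Encode(Line));  { UnicodeString -> UTF-8 for output }
      LineStart := P + 1;
    end;
end.
```

Inside the loop, Line is an ordinary UnicodeString, so it can be 
assigned to an AnsiString or parsed further as needed; that is the 
second step described above.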

If SetTextCodePage(CP_UTF16) is not appropriate, then IMO we must 
introduce a new procedure that lets the user signal "I have a UTF-16 
encoded text file" or "I want all writes to my text file to be encoded 
as UTF-16".
(Personally I see no reason to introduce a new procedure, as 
SetTextCodePage fits perfectly for me.)

So first we need a design/proposal that will be accepted.
(This probably requires deeper knowledge of RTL internals, which is 
why other core developers should also step in.)

L.


> On 2019-09-05 13:04, Joost van der Sluis wrote:
>> Op 05-09-19 om 12:06 schreef Tomas Hajny:
>>> On 2019-09-05 09:00, LacaK wrote:
>>>> Is there consensus/demand on such solution and any patch in this
>>>> direction will be accepted?
>>>
>>> I'm not aware of potential discussion about this so far, thus I 
>>> cannot talk about any existing consensus (let's hear others), but I 
>>> believe that such a consensus could be reached.
>>
>>> Yes, that's for sure. There's at least one person from the core team 
>>> list already involved. ;-)
>>
>> I think that this question from LacaK was not that strange. For people
>> outside the core team, it is not always clear who is member of core.
>  .
>  .
>
> Absolutely, the question was perfectly valid, sorry if my response 
> sounded differently. In any case, I also explicitly mentioned people 
> I'd like to be involved in reaching the consensus. I will make sure to 
> get their opinion (either here or elsewhere) and provide the summary 
> here for LacaK and others as appropriate.
>
> Tomas
> _______________________________________________
> fpc-pascal maillist  -  fpc-pascal at lists.freepascal.org
> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

