[fpc-pascal] Read lines into UnicodeString variable from UCS2 (UTF-16) encoded text file
LacaK
lacak at zoznam.sk
Fri Sep 6 07:24:09 CEST 2019
From the user's point of view we have this situation:
- on one side there is an input text file encoded in UTF-16 (either LE or BE)
- on the other side there is FPC, where RTL procedures such as AssignFile,
SetTextCodePage, Reset, Read(Ln) and Write(Ln) are available.
My original intention was simply to call the existing procedure
SetTextCodePage with the parameter CP_UTF16, which in my opinion would simply
signal to the RTL that the input/output text file is (or should be) encoded
in UTF-16.
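
To make the intended usage concrete, here is a minimal sketch of the call
sequence described above. The file name 'input.txt' is hypothetical, and
placing SetTextCodePage between AssignFile and Reset is an assumption. The
comments describe the behaviour being proposed, not what the RTL is
guaranteed to do today.

```pascal
program Utf16CodePageProposal;
{$mode objfpc}{$H+}

var
  F: TextFile;
  Line: UnicodeString;
begin
  // 'input.txt' is a hypothetical UTF-16 encoded text file.
  AssignFile(F, 'input.txt');
  // Proposed meaning: signal to the RTL that the file content is UTF-16.
  // Whether the RTL honours this for UTF-16 is exactly the open question.
  SetTextCodePage(F, CP_UTF16);
  Reset(F);
  while not EOF(F) do
  begin
    // Desired behaviour: Line receives the decoded text of one line.
    ReadLn(F, Line);
    WriteLn(Length(Line));
  end;
  CloseFile(F);
end.
```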
Then any subsequent call to ReadLn with any destination variable
(AnsiString, UnicodeString, Integer, etc.) would simply do something like this:
- read a byte sequence from the file and interpret it as UTF-16, so
that we have a UnicodeString as input
- convert this UnicodeString to the requested
destination variable (as FPC has implicit conversions between
UnicodeString and AnsiString, this would be no problem)
(for Write(Ln) the same would happen, only in reverse order: source variable
-> UnicodeString -> write to file)
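
The two read steps above can be sketched by hand as follows. This is only an
illustration of the idea, not RTL code: DecodeUtf16 is a hypothetical helper,
the byte buffer stands in for data read from the file, and BOM detection and
line splitting are left out.

```pascal
program Utf16ReadPipelineSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: interpret a raw byte buffer as UTF-16 code units.
// BigEndian selects UTF-16BE, otherwise UTF-16LE is assumed.
function DecodeUtf16(const Bytes: TBytes; BigEndian: Boolean): UnicodeString;
var
  I: Integer;
  LoByte, HiByte: Word;
begin
  SetLength(Result, Length(Bytes) div 2);
  for I := 0 to Length(Result) - 1 do
  begin
    if BigEndian then
    begin
      HiByte := Bytes[2 * I];
      LoByte := Bytes[2 * I + 1];
    end
    else
    begin
      LoByte := Bytes[2 * I];
      HiByte := Bytes[2 * I + 1];
    end;
    Result[I + 1] := WideChar((HiByte shl 8) or LoByte);
  end;
end;

var
  Raw: TBytes;
  U: UnicodeString;
  A: AnsiString;
  N: Integer;
begin
  // The text '42' as UTF-16LE bytes, standing in for one line of the file.
  SetLength(Raw, 4);
  Raw[0] := $34; Raw[1] := $00;
  Raw[2] := $32; Raw[3] := $00;

  // Step 1: byte sequence -> UnicodeString.
  U := DecodeUtf16(Raw, False);

  // Step 2: UnicodeString -> requested destination type.
  A := AnsiString(U);   // string destination
  N := StrToInt(A);     // integer destination
  WriteLn(A, ' ', N);
end.
```

For Write(Ln) the same helper would be mirrored: convert the source value to
a UnicodeString first, then emit its code units in the chosen byte order.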
If SetTextCodePage(CP_UTF16) is not appropriate, then IMO we must
introduce a new procedure that lets the user signal
"I have a UTF-16 encoded text file" or "I want all writes to my
text file to be encoded in UTF-16".
(but personally I do not see a reason to introduce a new procedure, as
SetTextCodePage fits perfectly for me)
So first we need a design/proposal that will be accepted.
(this probably requires deeper knowledge of RTL internals, which is
why other core developers should also step in)
L.
> On 2019-09-05 13:04, Joost van der Sluis wrote:
>> Op 05-09-19 om 12:06 schreef Tomas Hajny:
>>> On 2019-09-05 09:00, LacaK wrote:
>>>> Is there consensus/demand on such solution and any patch in this
>>>> direction will be accepted?
>>>
>>> I'm not aware of potential discussion about this so far, thus I
>>> cannot talk about any existing consensus (let's hear others), but I
>>> believe that such a consensus could be reached.
>>
>>> Yes, that's for sure. There's at least one person from the core team
>>> list already involved. ;-)
>>
>> I think that this question from LacaK was not that strange. For people
>> outside the core team, it is not always clear who is member of core.
> .
> .
>
> Absolutely, the question was perfectly valid, sorry if my response
> sounded differently. In any case, I also explicitly mentioned people
> I'd like to be involved in reaching the consensus. I will make sure to
> get their opinion (either here or elsewhere) and provide the summary
> here for LacaK and others as appropriate.
>
> Tomas
> _______________________________________________
> fpc-pascal maillist - fpc-pascal at lists.freepascal.org
> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal