<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Apologies: when I typed "FTP" below I meant "FPC" :( I'm
currently drowning in acronym soup.<br>
</p>
<div class="moz-cite-prefix">On 05/09/2019 09:24, Tony Whyman wrote:<br>
</div>
<blockquote type="cite"
cite="mid:de4e7617-411c-5950-3707-1ac7fb6077bc@mccallumwhyman.com">
<p>A few points:</p>
<p>1. IMHO: This is currently a Windows problem, where the console
buffer is UCS2. On Linux (and probably everywhere else) it's UTF8,
though that is still to be verified.</p>
<p>2. The following Microsoft blog post is interesting background
on where MS are going with this:</p>
<p><a class="moz-txt-link-freetext"
href="https://devblogs.microsoft.com/commandline/windows-command-line-unicode-and-utf-8-output-text-buffer/"
moz-do-not-send="true">https://devblogs.microsoft.com/commandline/windows-command-line-unicode-and-utf-8-output-text-buffer/</a></p>
<p>3. The current Windows API includes "SetConsoleCP" which should
(I haven't tested this) allow you to set transliteration to
UTF-8 when you call the Windows ReadConsoleInput API function.
This seems to imply that FTP can be a consistent UTF8
environment even when the Windows Console buffer is UCS2.</p>
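<p>A minimal, untested sketch of what point 3 might look like from a
Free Pascal program. It assumes the Windows unit's "SetConsoleCP"
binding and the RTL's "SetTextCodePage"; the program name and the
variable are made up for illustration. Whether a given Windows
console then really delivers non-ASCII input as valid UTF-8 is
exactly the part that still needs testing.</p>
<pre>
program utf8input;
{$mode objfpc}{$H+}
{$ifdef windows}
uses
  Windows;
{$endif}
var
  s: UTF8String;
begin
  {$ifdef windows}
  { Ask Windows to hand console input to the program as UTF-8. }
  if not SetConsoleCP(CP_UTF8) then
    writeln('SetConsoleCP failed, error ', GetLastError);
  {$endif}
  { Tell the RTL that the bytes arriving on Input are UTF-8 encoded. }
  SetTextCodePage(Input, CP_UTF8);
  readln(s);
  writeln('Read ', Length(s), ' bytes');
end.
</pre>
<p>On other OSs the Windows-specific call is compiled out, so the
same source would give the common UTF8 input path described
above.</p>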
<p>4. Because console input is buffered, you probably cannot have
a situation where readln changes the console code page to fit
the type (unicode or ansistring) of the variable that you are
reading into.</p>
<p>5. You could change FTP so that under Windows, the console is
always read using UCS2 with transliteration to ansistring
happening when required and depending on the type of the
variable that you are reading into. I think that is probably
what you are asking for under Windows:</p>
<p>- The console code page is always UCS2.</p>
<p>- Console input is read into unicodestrings in native mode.</p>
<p>- Console input is read into ansistrings with transliteration
from UCS2 after the input buffer has been parsed.</p>
<p>- Conversion to integers, floats, etc. occurs after
transliteration to ansistring in order to avoid too many changes
to the RTL.</p>
<p>- Under other OSs, Console input is UTF8 (or a supported ANSI
code page). Transliteration to unicodestrings occurs after
parsing the input buffer.</p>
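<p>The ordering in point 5 (parse first, transliterate second,
convert to numbers last) can be sketched roughly as follows. This is
only an illustration of the sequence, not RTL code: the literal
stands in for a line already parsed out of the UCS2 buffer, and all
names are hypothetical.</p>
<pre>
program parseorder;
{$mode objfpc}{$H+}
var
  wide: UnicodeString;   { a line already parsed out of the UCS2 buffer }
  narrow: AnsiString;
  n: LongInt;
  err: Word;
begin
  wide := '1234';               { hypothetical input, stands in for console data }
  narrow := AnsiString(wide);   { transliterate to the default ANSI code page }
  Val(narrow, n, err);          { numeric conversion works on the ansistring }
  if err = 0 then
    writeln('integer read: ', n)
  else
    writeln('not a number at position ', err);
end.
</pre>
<p>Keeping the numeric conversion on the ansistring side means the
existing conversion routines are reused unchanged, which is the
reason given above for doing it in that order.</p>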
<p>6. The question is: is it worth having a different approach for
Windows when Windows allows you to set the console input buffer
to UTF8 and hence have a common input environment for all OSs?<br>
</p>
<div class="moz-cite-prefix">On 05/09/2019 08:00, LacaK wrote:<br>
</div>
<blockquote type="cite"
cite="mid:cec4938d-663f-0a34-599d-55efdf51d863@zoznam.sk">Is
there consensus on, or demand for, such a solution, and will a patch
in this direction be accepted?<br>
If yes, we must agree on the implementation details, and IMO
someone must also check what the situation is in Delphi, because I
guess that if Delphi does not support this, FPC will
not diverge either?<br>
Question1: should "SetTextCodePage(CP_UTF16)" and
"SetTextCodePage(CP_UTF16BE)" be supported?<br>
Question2: is this supported in Delphi?<br>
If the answer to both questions is YES, then I will file a bug report
as a starting point.</blockquote>
<br>
<pre class="moz-quote-pre" wrap="">_______________________________________________
fpc-pascal maillist - <a class="moz-txt-link-abbreviated" href="mailto:fpc-pascal@lists.freepascal.org">fpc-pascal@lists.freepascal.org</a>
<a class="moz-txt-link-freetext" href="https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal">https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal</a>
</pre>
</blockquote>
</body>
</html>