<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">On 07/30/2013 04:29 AM, Noah Silva

      wrote:<br>

    </div>

    <br>

    <blockquote

cite="mid:CANC_V0Mac5tdBrAnqBjaJoKSwY=1Ts2E8r62r=kKtqiAWrK-jw@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div>No, UTF16 only needs more memory if most of the text is

              ASCII.  It actually uses less than UTF8 in the average

              case for Japanese, for example. <br>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    Of course you are right here. <br>
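    A quick way to check this is to encode the same text both ways and

    count the bytes. This is only a sketch in Python; the two sample

    strings and the helper name encoded_sizes are my own assumptions,

    not measurements from real data. <br>

    <pre>
```python
# Compare the encoded size of the same text in UTF-8 and UTF-16.
# The sample strings below are illustrative assumptions only.

def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, without a byte order mark."""
    utf8_size = len(text.encode("utf-8"))
    # "utf-16-le" avoids the 2-byte BOM that plain "utf-16" would prepend.
    utf16_size = len(text.encode("utf-16-le"))
    return utf8_size, utf16_size


# 12 ASCII characters: 1 byte each in UTF-8, 2 bytes each in UTF-16.
ascii_sample = "Hello, world"

# 7 Japanese characters (written as escapes to keep this mail ASCII-only):
# 3 bytes each in UTF-8, 2 bytes each in UTF-16.
japanese_sample = "\u3053\u3093\u306b\u3061\u306f\u4e16\u754c"

for label, sample in (("ASCII", ascii_sample), ("Japanese", japanese_sample)):
    utf8_size, utf16_size = encoded_sizes(sample)
    print(label, "UTF-8:", utf8_size, "bytes, UTF-16:", utf16_size, "bytes")
```
    </pre>

    Real Japanese text usually mixes in some ASCII (digits, markup,

    spaces), so the actual ratio depends on the data. <br>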

    <blockquote

cite="mid:CANC_V0Mac5tdBrAnqBjaJoKSwY=1Ts2E8r62r=kKtqiAWrK-jw@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000"> Linux OS API in

                most cases is 8 Bit,<br>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div>I assume by 8bit, you mean variable byte encoding like

              UTF8.</div>

          </div>

        </div>

      </div>

    </blockquote>

    Yep.<br>

    <blockquote

cite="mid:CANC_V0Mac5tdBrAnqBjaJoKSwY=1Ts2E8r62r=kKtqiAWrK-jw@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div> </div>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000"> Conversions are

                very expensive. <br>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div>This is not as bad as some people make it out to be.

               You have to be converting a *lot* of data for it to be

              noticeable.</div>

          </div>

        </div>

      </div>

    </blockquote>

    That is why I pointed out that the choice of encoding depends on

    how many "calculations" are done on the strings. <br>

    <br>

    But in fact I tend to agree, even though the cost of conversions

    was exactly the argument why the Lazarus team, when converting to

    Unicode, chose UTF-8 for the LCL API (while MSE chose UTF-16 for

    the same purpose). I never felt comfortable with that, BTW. <br>

    <br>
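    To get a feeling for what conversions cost, one can simply time

    them. The following is a rough Python sketch, not a benchmark of

    fpc or the LCL; the function names, the sample text and the

    repetition count are assumptions of mine, and the numbers will

    differ per machine and runtime. <br>

    <pre>
```python
import time


def round_trip(utf8_data):
    """Convert UTF-8 bytes to UTF-16 bytes and back to UTF-8 bytes."""
    utf16_data = utf8_data.decode("utf-8").encode("utf-16-le")
    return utf16_data.decode("utf-16-le").encode("utf-8")


def time_round_trips(text, repetitions):
    """Return the seconds spent on `repetitions` UTF-8/UTF-16 round trips."""
    utf8_data = text.encode("utf-8")
    start = time.perf_counter()
    for _ in range(repetitions):
        round_trip(utf8_data)
    return time.perf_counter() - start


# Hypothetical workload: a 1000-character caption converted 10000 times.
sample_text = "Caption " * 125
elapsed = time_round_trips(sample_text, 10000)
print("10000 round trips took", round(elapsed, 4), "seconds")
```
    </pre>

    Whether the result matters depends on how often the application

    crosses the API boundary compared with the work it does on the

    strings themselves. <br>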

    <blockquote

cite="mid:CANC_V0Mac5tdBrAnqBjaJoKSwY=1Ts2E8r62r=kKtqiAWrK-jw@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote"><br>

            <div bgcolor="#FFFFFF" text="#000000">> I suppose this is

              bound to change once fpc has completed the move to "new

              Delphi Strings". <br>

            </div>

            <div><br>

            </div>

            <div>I really don't think so, the reasons are even well

              detailed in the Wiki. <br>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    I was always told that Delphi compatibility is the primary driving

    force for any modifications. This necessarily suggests this move

    (which is not possible before fpc provides "new Delphi Strings").

    But there might be multiple opinions. <br>

    <br>

    In fact my primary intentions with Lazarus / fpc are not to do my

    own generic projects, but to help my colleagues to move their huge

    Delphi XE program system to Linux. This in fact needs complete

    support for "new Delphi Strings". <br>

    <br>

    <blockquote

cite="mid:CANC_V0Mac5tdBrAnqBjaJoKSwY=1Ts2E8r62r=kKtqiAWrK-jw@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div>From what I understand, the plan is for strings to

              store their codepage as an attribute internally along with

              their length, and since the compiler/runtime library will

              know their codepage, it can convert as necessary. </div>

          </div>

        </div>

      </div>

    </blockquote>

    That is already usable in svn. It is exactly the said "new Delphi

    Strings" and, when activated, completely compatible with Delphi

    XE. It's rather nice and fast, but Delphi lacks a

    _completely_dynamic_encoding_ type with auto-conversion only when

    necessary. (IMHO rather easily doable by compiler magic, but

    "forgotten" in Delphi XE)  <br>

    <blockquote

cite="mid:CANC_V0Mac5tdBrAnqBjaJoKSwY=1Ts2E8r62r=kKtqiAWrK-jw@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div> Either way, you can make your own StringList variants

              for each type easily enough.  <br>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    Not without compiler support (if you want auto-conversion when

    necessary). <br>
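    To illustrate what I mean by conversion only when necessary, here

    is a small Python sketch. The classes EncodedString and

    DynamicStringList are hypothetical names of mine, not fpc or Delphi

    types. Note that the conversion here is an explicit call; in Pascal

    it would have to be triggered implicitly by the compiler on

    assignment, which is exactly the part a library cannot provide on

    its own. <br>

    <pre>
```python
class EncodedString:
    """Hypothetical string that remembers which encoding its bytes are in."""

    def __init__(self, data, encoding):
        self.data = data          # raw bytes
        self.encoding = encoding  # e.g. "utf-8" or "utf-16-le", compared literally

    def as_encoding(self, target):
        """Return self if already in `target`, otherwise a converted copy."""
        if target == self.encoding:
            return self  # nothing to do: no conversion, no copy
        text = self.data.decode(self.encoding)
        return EncodedString(text.encode(target), target)


class DynamicStringList:
    """Hypothetical list that keeps each string in the encoding it arrived in."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def get(self, index, target):
        """Return item `index` in encoding `target`, converting only if needed."""
        return self.items[index].as_encoding(target)


names = DynamicStringList()
names.add(EncodedString("Lazarus".encode("utf-8"), "utf-8"))
names.add(EncodedString("MSE".encode("utf-16-le"), "utf-16-le"))

same = names.get(0, "utf-8")       # stored as UTF-8: returned untouched
converted = names.get(1, "utf-8")  # stored as UTF-16: converted on request
print(same is names.items[0], converted.data)
```
    </pre>

    The stored items keep their original encoding, so a caller that

    asks for the stored encoding never pays for a conversion. <br>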

    <blockquote

cite="mid:CANC_V0Mac5tdBrAnqBjaJoKSwY=1Ts2E8r62r=kKtqiAWrK-jw@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote"><br>

            <div>In fact, I am fine with manual conversions, so long as

              99% of everything "just works" with UTF8 and/or UTF16.  <br>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    I'm not fine with TStringList and friends forcing any predefined

    encoding. This in fact does work rather nicely without the

    application programmer even noticing it. But IMHO a cross-platform

    system like fpc can be expected to do better, doing away with

    Windows-ish remnants from Delphi whenever possible. <br>

    <br>

    -Michael<br>

  </body>

</html>