<div dir="ltr">Hello everyone!<div><br></div><div>Success! Writeln was indeed the main culprit.</div><div><br></div><div>The following works:</div><div><div> WritelnUTF8('あいうえお秋葉原');</div></div><div> </div><div>
(Where: </div><div>1. WritelnUTF8 is a procedure that calls the Win32 API function WriteConsole directly, bypassing whatever Writeln is doing.</div><div>2. The source code is saved as UTF-8 with a BOM.</div><div>3. I don't set the console output code page to UTF-8.)</div>
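<div><br></div><div>A minimal sketch of a procedure along these lines is shown below. It is illustrative only and not necessarily the exact procedure used above: it assumes a recent Free Pascal compiler targeting Windows, a source file saved as UTF-8, and the wide variant WriteConsoleW from the Windows unit. The body of WritelnUTF8 and the program name are assumptions made for this example.</div>
<pre>
program ConsoleUtf8Demo;

{$mode objfpc}{$H+}
{$codepage utf8}  // assumption: tells the compiler the source literals are UTF-8

uses
  Windows;

// Hypothetical implementation: decode UTF-8 to UTF-16 and hand the
// result straight to the console, bypassing Writeln entirely.
procedure WritelnUTF8(const S: UTF8String);
var
  Wide: UnicodeString;
  Written: DWORD;
begin
  // The console API works in UTF-16, so convert first and append CR/LF.
  Wide := UTF8Decode(S) + #13#10;
  Written := 0;
  // Length(Wide) is the number of UTF-16 code units to write.
  WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), PWideChar(Wide),
    Length(Wide), Written, nil);
end;

begin
  WritelnUTF8('あいうえお秋葉原');
end.
</pre>
<div>Because the text reaches the console as UTF-16, the console output code page should not matter for this path. One caveat: WriteConsoleW only works on a real console handle, so it fails when standard output is redirected to a file or a pipe.</div>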
<div><br></div><div>But I'm outputting UTF-8 to an ANSI console and it works?!</div><div>Presumably, somewhere the RTL sees that my source code is UTF-8 and converts the string to ANSI; that is what gets output, and it works since the console code page is ANSI.</div>
<div><br></div><div>A cleaner path would be to set the output code page to ANSI and tell the compiler not to touch my UTF-8, but that doesn't seem to work.</div><div><br></div><div>My guess is that Writeln is somehow trying to convert the already-converted text a second time, which would almost certainly result in garbage!</div>
<div><br></div><div>Once I got the idea to use WriteConsole, I searched and found this:</div><div><a href="http://forum.lazarus.freepascal.org/index.php?topic=17548.0">http://forum.lazarus.freepascal.org/index.php?topic=17548.0</a><br>
</div><div><br></div><div>(svn: 37432)</div><div><br></div><div>Thank you,</div><div> Noah Silva</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2013/7/9 Noah Silva <span dir="ltr">&lt;<a href="mailto:shiruba@galapagossoftware.com" target="_blank">shiruba@galapagossoftware.com</a>&gt;</span><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi,<div><br></div><div>I deal with Japanese (and sometimes other languages) in a lot of my programs, and nothing I do seems to work consistently on Windows systems. (OS X is no problem.)</div>
<div><br></div>
<div>I have followed the steps in the Wiki, etc., but to little avail, so I have some questions for anyone who knows more than I do:</div><div>1. What encoding "should" I be writing to the terminal? From experimenting with text files using the cat command in PowerShell, it seems that the local ("ANSI") encoding should be used. This makes sense, since older versions of Windows only supported local encodings.</div>
<div>2. Is there any reason why data written out in the local encoding (with write statements, etc.) should get corrupted? For example, is some level of the RTL assuming something about the encoding? (I don't think so, but...)</div>
<div>3. Is there a way to set the output to UTF-8 so I can just write out UTF-8 and be done with it?</div><div><br></div><div>Just to give an example:</div><div>1. I read in an SJIS CSV file and display it on the screen, and it's corrupted.</div>
<div>2. I convert it to UTF-8 before displaying it, and it's still corrupted.</div><div>3. I cat the file to the screen and it's OK.</div><div>4. I write the output to a file instead of the console, and it's OK.</div>
<div><br></div><div>Something seems odd.</div><div><br></div><div>Does anyone else have these issues?</div><div><br></div><div>Thank you,</div><div> Noah Silva</div><div>
<br></div><div>p.s.: I know that the actual data in my programs isn't broken, because if I write it to a file or database, there is no problem with corruption.</div></div>
</blockquote></div><br></div>