[fpc-devel] Possible issue with 2.7.1 string encodings
Martin
lazarus at mfriebe.de
Sun Aug 25 14:44:34 CEST 2013
I suspect this is an issue with the new string encoding in 2.7.1. Could
someone please review?
Some background first.
I was looking into a user report in which the IDE (Lazarus) would not
show the debug line info (blue dots in the gutter) for some files, while
it worked for others.
> fpc svn 25364
> lazarus svn 42490
> kubuntu 13.04,
From what I can deduce, a Turkish locale.
I then narrowed it down as follows:
- To get the blue dots, the filename for which they are needed is stored
in a stringlist.
- Then the info is added to list.Objects.
- Then the info is looked up via list.IndexOf(filename).
The list is Sorted, with CaseSensitive = False.
*** The problem: IndexOf returns -1 for strings that are in the list.
At one point even the following happened:
  if list.IndexOf(s) >= 0 then exit;
  list.Add(s);
raised an exception: duplicates not allowed. (I seriously doubt that the
above code has much potential to be wrong.)
- However, that was no longer reproducible, so I collected evidence in
other ways.
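The setup described above can be sketched as a small standalone program. This is only an illustration under assumptions: the two paths are made up, dupError is assumed as the Duplicates setting (the "duplicates not allowed" exception suggests it, but the real list setup is not shown here), and the integers stored in Objects merely stand in for the real line info. It is not claimed to reproduce the failure.

```pascal
program SortedListLookup;
{$mode objfpc}{$H+}
uses
  Classes;

const
  FileA = '/tmp/example/unit_a.inc';  // made-up path, for illustration only
  FileB = '/tmp/example/main.lpr';    // made-up path, for illustration only

var
  List: TStringList;
  Idx: Integer;
begin
  List := TStringList.Create;
  try
    List.Sorted := True;           // IndexOf then searches the sorted list
    List.CaseSensitive := False;   // comparisons ignore case
    List.Duplicates := dupError;   // assumption: duplicates raise an exception

    // Store each filename with an integer payload in Objects.
    List.AddObject(FileA, TObject(PtrInt(10)));
    List.AddObject(FileB, TObject(PtrInt(20)));

    // Look a filename up again; a string that was added should be found.
    Idx := List.IndexOf(FileA);
    if Idx >= 0 then
      WriteLn('found at ', Idx, ', payload ', PtrInt(List.Objects[Idx]))
    else
      WriteLn('not found: ', FileA);

    // The guard-then-add pattern: Add is only reached if IndexOf says "absent".
    if List.IndexOf(FileA) < 0 then
      List.Add(FileA);
  finally
    List.Free;
  end;
end.
```

If IndexOf and Add ever disagree about whether a string is present, the last statement is where the "duplicates not allowed" exception would surface.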
-----
What I found while debugging.
There is the following function:
  function TGDBMILineInfo.IndexOf(const ASource: String): integer;
  begin
    Result := FSourceIndex.IndexOf(ASource);
    if Result <> -1
    then Result := PtrInt(FSourceIndex.Objects[Result]);
  end;
Only the first line is of interest. It already returns -1 for existing
strings, as far as I can tell.
To get more information I added debugln calls as follows.
Note the part:
PInteger(ASource)[0], // just some part of the string, for verification
PInteger(ASource)[1], // on 2.7.1 Encoding ? // on 2.6.2 length
PInteger(ASource)[-2] // on 2.7.1 length // on 2.6.2 ref count.
  function TGDBMILineInfo.IndexOf(const ASource: String): integer;
  var
    i: Integer;
  begin
    Result := FSourceIndex.IndexOf(ASource);
    debugln(['TGDBMILineInfo.IndexOf (A) res=', Result, ' ', ASource,
      ' // ', DbgStr(ASource),
      ', #', PInteger(ASource)[0], ', #', PInteger(ASource)[1],
      ', #', PInteger(ASource)[-2]]);
    for i := 0 to FSourceIndex.Count - 1 do
      debugln(['TGDBMILineInfo.IndexOf (B) pos=', i, ' ', FSourceIndex[i],
        ' // ', DbgStr(FSourceIndex[i]),
        ', #', PInteger(FSourceIndex[i])[0],
        ', #', PInteger(FSourceIndex[i])[-1],
        ', #', PInteger(FSourceIndex[i])[-2]]);
    if Result <> -1
    then Result := PtrInt(FSourceIndex.Objects[Result]);
    debugln(['TGDBMILineInfo.IndexOf (C) res=', Result, ' ', ASource,
      ' // ', DbgStr(ASource)]);
  end;
And the result:
TGDBMILineInfo.IndexOf (A) res=-1
/home/lazarus/projeler/TiB5651/Gunici_biriktir.inc //
/home/lazarus/projeler/TiB5651/Gunici_biriktir.inc, #1836017711,
#1634479973, #50
TGDBMILineInfo.IndexOf (B) pos=0
/home/lazarus/projeler/TiB5651/Gunici_biriktir.inc //
/home/lazarus/projeler/TiB5651/Gunici_biriktir.inc, #1836017711, #0, #50
TGDBMILineInfo.IndexOf (B) pos=1
/home/lazarus/projeler/TiB5651/UGS_tib5651.lpr //
/home/lazarus/projeler/TiB5651/UGS_tib5651.lpr, #1836017711, #0, #46
TGDBMILineInfo.IndexOf (C) res=-1
/home/lazarus/projeler/TiB5651/Gunici_biriktir.inc //
/home/lazarus/projeler/TiB5651/Gunici_biriktir.inc
The result of
  Result := FSourceIndex.IndexOf(ASource);
is -1,
but the string is at index 0.
Only something changed its encoding. I have no idea what...
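A caution on reading the numbers in the dump: a non-negative index into PInteger(s) addresses the character data itself, not the string header. The sketch below renders the first two numbers printed in (A) as the four bytes each occupies in memory. The helper name FourChars is made up for illustration, and a little-endian machine is assumed.

```pascal
program DecodeDump;
{$mode objfpc}{$H+}

// Render a 32-bit value as the four bytes it occupies in memory,
// lowest byte first (assumption: little-endian byte order).
function FourChars(V: LongWord): String;
var
  i: Integer;
begin
  Result := '';
  for i := 0 to 3 do
  begin
    Result := Result + Chr(Byte(V and $FF));
    V := V shr 8;
  end;
end;

begin
  WriteLn(FourChars(1836017711));  // value printed for index [0]; prints "/hom"
  WriteLn(FourChars(1634479973));  // value printed for index [1] in (A); prints "e/la"
end.
```

Under that assumption, #1634479973 in (A) is simply bytes 4 to 7 of the path, read via index [1], whereas the #0 in (B) was read via index [-1]. The middle column of (A) and (B) would then come from different offsets, so the difference between them may not say anything about the encoding.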
However:
All strings come from one and the same variable in one and the same
object. They are passed as parameters, or stored temporarily in other
Fields (all variables and fields are "String" / all units have {$mode
objfpc}{$H+}, some functions have "const name: string" )
Running on 2.6.2 shows that the string passed as argument to the above
function and the string in the list have the same ref-count
(ref-count=10). This makes it very likely that it is indeed the same
string (not just the same content).
So I have no idea what could change the encoding, since the string is
(as far as I can tell) NOT edited in any way.
Most scary is that IndexOf and Add seem to have different opinions on
string equality.
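The same header fields can also be read without depending on the header layout. A minimal sketch, assuming a compiler whose RTL provides StringCodePage and StringRefCount (FPC 3.0 and later do; 2.6.2 does not). The path is made up, and comparing the data pointers is offered here as one way to check "same string, not just same content".

```pascal
program InspectString;
{$mode objfpc}{$H+}
var
  S, T: String;
begin
  S := '/tmp/example/unit_a.inc';  // made-up path, for illustration only
  UniqueString(S);   // force a heap-allocated, reference-counted copy
  T := S;            // second reference to the same string data

  WriteLn('length    = ', Length(S));
  WriteLn('code page = ', StringCodePage(S));
  WriteLn('ref count = ', StringRefCount(S));
  // Identical data pointers mean both variables share one string instance.
  WriteLn('same data = ', Pointer(S) = Pointer(T));
end.
```

Printing these values for both the argument and the list entry would show directly whether the two differ in code page, which is harder to misread than raw PInteger offsets.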