[fpc-pascal]Strings vs. Ansistrings

Jonas Maebe jonas at zeus.rug.ac.be
Fri Sep 22 13:28:38 CEST 2000


>I *have* read the reference guide section on ansistrings, many times
>actually (I've never used them before, so I wanted to familiarize myself
>before writing any code). That's where I got the impression that short
>strings and ansistrings could be used in a similar fashion. To paraphrase
>the guide: "Whatever the actual type, ansistrings and short strings can be
>used interchangeably". This statement made me think that there weren't any
>glaring differences, and that I could use one just like the other.

This is only true when you use the standard functions and operators (pos, 
length, copy, delete, "+" for concatenation, etc.). Once you start working 
around the standard constructs (e.g. by using 'move' instead of a normal 
assignment), it no longer holds.

The reason is that an ansistring is a pointer to a record that contains 
the current length, the number of references to this record, the current 
maximum length and an array of char (whose size equals the current 
maximum length) holding the actual string.

More precisely, the ansistring pointer points to this array of char, and 
the length, maxlength and refcount are stored as longints in the 12 
preceding bytes (since they aren't meant to be accessed directly).

Suppose you have something like

var
  s1, s2: ansistring;
begin
  readln(s1);
  s2 := s1
end.

After the readln is executed, s1 has a refcount of 1 (s1 is the only 
ansistring that points to the string), a maxlength of at least 255 (the 
default), a length equal to the actual length of the string, and an array 
of char that contains the string you typed in plus a terminating #0 char 
(since the ansistring points to the array of char, this lets you typecast 
an ansistring to a pchar).
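
As a small illustration of the pchar typecast and of the empty string being nil, here is a minimal sketch. The program name and variable names are made up for the example, and it only relies on the behaviour described above.

```pascal
program PCharCast;
{ Hypothetical example: names are illustrative only. }
var
  s: ansistring;
  p: pchar;
begin
  s := '';
  { An empty ansistring is a nil pointer. }
  writeln('empty is nil: ', pointer(s) = nil);

  s := 'hello';
  { The ansistring points at its characters, so the typecast is valid. }
  p := pchar(s);
  writeln(p);
  { The array of char ends with a terminating #0. }
  writeln('terminated: ', p[length(s)] = #0);
end.
```

Note that p is only valid as long as s keeps pointing to the same string.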

Now, when you assign s1 to s2, the following happens:

a) it is checked whether s2 already points to a string. In this case, 
that obviously isn't so, but otherwise the reference count of the string 
s2 points to would be decreased by one (since after the assignment, s2 
would no longer point to that string), and if the reference count of that 
string were zero afterwards, the memory it occupied would be freed.
b) it is checked whether s1 points to a string (an empty 
ansistring = nil). That's the case here, so the reference count of the 
string s1 points to is increased by one, since after the assignment 
there are two ansistrings pointing to that string.
c) the pointer in s1 is copied to s2.
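
The effect of steps a) to c) can be observed with a sketch like the one below. It assumes a Free Pascal version whose system unit provides StringRefCount and StringOfChar. StringOfChar is used instead of readln only to get a string that is allocated at run time without needing input. The absolute counts may depend on compiler temporaries, so only the difference between the two printed values matters.

```pascal
program RefCountDemo;
{ Hypothetical example; assumes StringRefCount is available in the system unit. }
var
  s1, s2: ansistring;
begin
  { Build the string at run time so it gets a real reference count. }
  s1 := StringOfChar('x', 5);
  writeln('before assignment: ', StringRefCount(s1));

  { Normal assignment: the compiler updates the reference count. }
  s2 := s1;
  writeln('after assignment:  ', StringRefCount(s1));

  { Both variables now hold the same pointer. }
  writeln('same string: ', pointer(s1) = pointer(s2));
end.
```

The second count should be one higher than the first.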

Now, if you'd do something like

var
  s1, s2: ansistring;
begin
  readln(s1);
  move(s1,s2,sizeof(s1));
end.

then the compiler no longer sees that you are assigning one ansistring to 
another (the compiler doesn't know what the move procedure does exactly, 
it just sees a procedure with three parameters, of which the first 
two are untyped and the last one is a longint). This means the 
decreasing/increasing of the reference counts doesn't happen, so 
afterwards you have two ansistrings (s1 and s2) pointing to the same 
string (the one read using readln), while the refcount is only 1.

This means that if you afterwards assign a new value to s1 or s2 (e.g. by 
using readln again), the reference count is decreased and becomes 0! The 
program then thinks that no other ansistring points to this string 
anymore and releases/reuses the memory for a new string.
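
The missing refcount update can be made visible without crashing the program. The sketch below again assumes StringRefCount and StringOfChar from the system unit. It clears s2 through a pointer typecast at the end, which is itself a way of bypassing the reference counting and is only done here to undo the move before the program finalizes its strings.

```pascal
program MoveDemo;
{ Hypothetical example; do not use this pattern in real code. }
var
  s1, s2: ansistring;
  before: longint;
begin
  s1 := StringOfChar('x', 5);
  before := StringRefCount(s1);

  { Raw copy of the pointer: the compiler cannot update the refcount. }
  move(s1, s2, sizeof(s1));

  writeln('same string: ', pointer(s1) = pointer(s2));
  writeln('refcount unchanged: ', StringRefCount(s1) = before);

  { Drop the untracked alias again, so the string is only freed once. }
  pointer(s2) := nil;
end.
```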

In your program, you're doing something like

move(s1,array_var[index]^,length(s1));

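The variable s1 itself is only a pointer, so this copies length(s1) bytes 
starting at the pointer variable rather than at the characters of the 
string. That is far more data than s1 occupies (in addition to the 
program not updating the reference count).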
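
If you really need the raw characters in a buffer of your own, a minimal sketch could look like this. The buffer buf is a stand-in for array_var[index] from your program, whose declaration I haven't seen, so treat the names as assumptions.

```pascal
program CopyChars;
{ Hypothetical example: buf stands in for array_var[index]. }
var
  s1: ansistring;
  buf: pchar;
begin
  s1 := 'hello';

  { Room for the characters plus a terminating #0. }
  getmem(buf, length(s1) + 1);

  { s1[1] is the first character, so this copies the string data itself. }
  if length(s1) > 0 then
    move(s1[1], buf^, length(s1));
  buf[length(s1)] := #0;

  writeln(buf);
  freemem(buf);
end.
```

The length check is there because an empty ansistring is nil, so there is no s1[1] to copy from.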

>> Ansistrings provide some very nice features (like the reference
>> counting, automatic (de)allocation of memory etc.), but for them
>> to work you have to play by the rules.
>
>I'd love to 'play by the rules', if that's what makes ansistrings work as
>advertised. Where can I find these rules?

They are not explicitly in the manual, but in general you could say: 
don't confuse the compiler by removing the type of a variable before 
modifying or copying its contents, unless you know how things work 
internally. For example, the move() procedure takes two untyped var 
parameters, so inside move() the compiler no longer knows it is working 
with ansistrings.
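
In practice, playing by the rules mostly means keeping the ansistring type everywhere and using plain assignments. A minimal sketch, with made-up names, that stores strings in an array this way:

```pascal
program KeepTypes;
{ Hypothetical example: names are illustrative only. }
var
  lines: array[1..3] of ansistring;
  s1: ansistring;
  i: longint;
begin
  s1 := 'first';
  lines[1] := s1;   { typed assignment: reference counts stay correct }

  s1 := 'second';   { lines[1] still refers to 'first' }
  lines[2] := s1;

  lines[3] := lines[1] + ' and ' + lines[2];

  for i := 1 to 3 do
    writeln(lines[i]);
end.
```

No getmem, move or freemem is needed, the memory is managed for you.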

>> There's probably a bug in the windows crt unit. I don't have
>> windows though, so I can't check it.
>
>Anything I can do to help track it down?

I don't know. Here's the current clreol routine from the windows crt 
unit, but I don't see anything wrong in it at first sight:

***
procedure ClrEol;
{
  Clear from current position to end of line.
}
var Temp: Dword;
    CharInfo: Char;
    Coord: TCoord;
    X,Y: Longint;
begin
  GetScreenCursor(x,y);

  CharInfo := #32;
  Coord.X := X - 1;
  Coord.Y := Y - 1;

  FillConsoleOutputCharacter(OutHandle, CharInfo, WinMaxX - (X + 01),
    Coord, @Temp);
end;
***


Jonas



