program begin end. // comments?: Weird code snippet #2: Generic Double Linked List

Thursday, July 14, 2011

Weird code snippet #2: Generic Double Linked List

They say Generics and pointers don't mix.

type
  PMyThing<T> = ^TMyThing<T> // [DCC Error] E2508 type parameters not allowed on this type
  TMyThing<T> = record
     Thing: T;
  end;

Ok, they don't. But there is a loophole!

type
  TMyThing<T> = record
    Thing: T;
    NextThing: ^TMyThing<T>;
  end;

Why this is allowed, I don't know. Just like I don't really understand why the first one is forbidden. There is probably some good explanation for it.

Still - it can be fun breaking the rules!

unit GenericDoubleLinkedList;

interface
uses
  Classes, Generics.Defaults;

type
  TLinkVisitor<T> = reference to procedure(const Item: T);

  TDoubleLinked<T> = record
    Value: T;
    PrevLink: ^TDoubleLinked<T>; // Hey, it compiles!
    NextLink: ^TDoubleLinked<T>;
    constructor Create(aValue:T);
    function Add(aValue:T): TDoubleLinked<T>;
    function HasNext:Boolean;
    function Next: TDoubleLinked<T>;
    function HasPrev:Boolean;
    function Prev: TDoubleLinked<T>;
    function First: TDoubleLinked<T>;
    function Last: TDoubleLinked<T>;
    procedure ForEach(const Proc: TLinkVisitor<T>);
  end;

  procedure Test(const Log:TStrings);

implementation

{ TDoubleLinked<T> }

constructor TDoubleLinked<T>.Create(aValue: T);
begin
  Value := aValue;
  NextLink := nil;
  PrevLink := nil;
end;

function TDoubleLinked<T>.Add(aValue: T): TDoubleLinked<T>;
var
  p: ^TDoubleLinked<T>; // But this one is not assignment compatible
begin
  p := AllocMem(SizeOf(TDoubleLinked<T>)); // Make space
  p^ := Self;      // Copy current value to allocated block
  Value := aValue; // Set self to new value
  p.NextLink := @Self;
  if Assigned(p.PrevLink) // Fix up previous nextlink
   then Pointer(p.PrevLink.NextLink) := Pointer(p);
  Pointer(PrevLink) := Pointer(p);  // Point back to old value
  Result := Self;
end;

function TDoubleLinked<T>.HasPrev: Boolean;
begin
  Result := PrevLink <> nil;
end;

function TDoubleLinked<T>.Prev: TDoubleLinked<T>;
begin
  Result := TDoubleLinked<T>(PrevLink^)
end;

function TDoubleLinked<T>.HasNext: Boolean;
begin
  Result := NextLink <> nil;
end;

function TDoubleLinked<T>.Next: TDoubleLinked<T>;
begin
  Result := TDoubleLinked<T>(NextLink^)
end;

function TDoubleLinked<T>.First: TDoubleLinked<T>;
begin
  Result := Self;
  while Result.HasPrev
   do Result := Result.Prev;
end;

function TDoubleLinked<T>.Last: TDoubleLinked<T>;
begin
  Result := Self;
  while Result.HasNext
   do Result := Result.Next;
end;

procedure TDoubleLinked<T>.ForEach(const Proc: TLinkVisitor<T>);
var
  Node: TDoubleLinked<T>;
begin
  Node := First;
  Proc(Node.Value);
  while Node.HasNext
  do begin
    Node := Node.Next;
    Proc(Node.Value);
  end;
end;

procedure Test(const Log:TStrings);
var
  List, Node : TDoubleLinked<String>;
begin
  List.Create('One');
  List.Add('Two');
  List.Add('Three');
  Node := List;  // Bad idea
  List.Add('Four');
  Node.Add('ThreeAndAHalf');

  List.ForEach(
    procedure(const Value:String)
    begin
      Log.Add('List: ' + Value)
    end);

  Node.ForEach(
    procedure(const Value:String)
    begin
      Log.Add('Node: ' + Value)
    end);
end;

end.

The problem is that "List" is not a pointer, but the tail item of the list. Hence, a Delete procedure needs to take this into consideration.

Even worse, if you add a second Node variable to point to something in the list, that reference will not be fixed up after adding 'Four', and hence it will take the tail place of the list - for both references, effectively forgetting the 'Four' item.

So, although this was somewhat entertaining, a mix of Generics and pointers probably isn't something we should make use of.

Exercise for the reader: Implement the TDoubleLinked<T>.Delete; procedure.

End question: Why are we not allowed to declare pointers to generic types?

20 comments:

Stefan Glienke14 July, 2011 15:27
This is a compiler bug, as Barry mentioned here: http://stackoverflow.com/questions/793472/pointer-to-generic-type
ReplyDelete
Replies
Lars Fosdal14 July, 2011 15:37
Thanks, Stefan! Good to know!
There is a QC issue on it as well: http://qc.embarcadero.com/wc/qcmain.aspx?d=66584
ReplyDelete
Replies
Chris14 July, 2011 20:25
I don't understand the Delete issue, or more exactly, the reason List and Node are declared as TDoubleLinked rather than ^TDoubleLinked. Could you explain...?
ReplyDelete
Replies
Warren14 July, 2011 22:04
My eyes, they are bleeding.

W
ReplyDelete
Replies
Lars Fosdal14 July, 2011 22:28
Chris - the ^TDoubleLinked<T> will not be assignment compatible as the pointer type cannot be declared, hence the non-pointer ref.

The Delete issue is that since the tail end is a non-pointer, it cannot be disposed, so you would have to shuffle data from the "neighbour" pointer reference into the List, and dispose that pointer reference instead.

Warren - I know... would be a lot nicer if generic pointers actually were allowed.
ReplyDelete
Replies
Chris15 July, 2011 00:42
I realise what the Delete issue is, but it only arises due to how you've decided to code the list itself, which is weird in itself IMO (e.g., in having Add change the contents of Self 'in place' and returning a pointer to what was Self, rather than keeping Self as it is and returning a pointer to the new node). If you were to write the list more straightforwardly, Delete becomes very easy.

As I wasn't sure how to post code in a Blogger comments box, I've blogged about the topic myself instead: http://delphihaven.wordpress.com/2011/07/14/weird-in-more-ways-than-one/
ReplyDelete
Replies
Lars Fosdal15 July, 2011 01:24
Chris - as commented on your blog - only by sacrificing type safety using raw pointers - which remove the reason for using generics alltogether.
ReplyDelete
Replies
Corrie Engelbrecht15 July, 2011 01:45
Only structured types can be genericized. A pointer is not a structured type.

When you declare:
type
TGenRec = record x:T; end;

nothing will happen. The compiler does accept it, because there is a parameter T and a usage x:T;

But when you then declare:
var
rec : TGenRec;

a new type is created called TGenRec;

When you try:
type
PRec = ^TGenRec;

There is no instance usage of T, so the parameter can never be applied in response to a type argument.

This is why you cannot declare a pointer to a generic record.
ReplyDelete
Replies
Chris15 July, 2011 09:48
Lars - as I say over there, if type safety is your utter concern, you shouldn't be using pointers at all.
ReplyDelete
Replies
Lars Fosdal15 July, 2011 10:39
I didn't. At least not on the outside...
ReplyDelete
Replies
Lars Fosdal15 July, 2011 10:46
Corrie - I think Blogger ate your <T> so it is a bit hard to follow your example.

But consider
type
  PLink<T> = ^TLink<T>;
  TLink<T> = record
    value: T;
  end;

Are you saying that the compiler would not be able to understand how to interpret the referential integrity of a variable of the type PLink<T>?
ReplyDelete
Replies
Corrie Engelbrecht15 July, 2011 13:13
Hi Lars,

I was tried to correct my post, but I had network issues:( Yes, Blogger ate the angle brackets. I will try again, and explain in another way.

Consider this:
type
TRec1<T> = record x:T; end; // no type exists yet
TRec2<T> = record x:T; end; // no type exists yet

1. No types exist yet. This is important.
2. Also the T in TRec1 does not in any way relate to the T in TRec2.

var
r1 : TRec1<Integer>; // Only now is a type created

There is nothing to link the following pointer declaration attempt to:
type
PRec1<T> = ^TRec1<T>; // cannot compile

This is why the following has to be done to compile:

type
TRec3 = TRec1<Integer>; // Only now, when there is actually a type argument, is a type TRec1<Integer> created

So now only can this compile:
type
PRec3 = ^TRec3; // Declaration can precede TRec3 as well, it will still work

PS: The preview showed all of the angle brackets, and all of the types correctly. I hope the actual final post will still look correct!
ReplyDelete
Replies
Lars Fosdal15 July, 2011 14:20
Right - I get that this doesn't work now - but I don't see how it can't be made to work technically.

It might be complicated for a single pass compiler, but basically it is a matter of unwrapping nested declarations. Nested declarations in general are not allowed and probably for a good reason. It would quickly become possible to construct a huge dependency tree that way, but if the compiler applied the same model for records as for classes - we would get a very powerful, flexible, and type safe system for extendable memory structures.

For a class:
type
  TBaseClass<T> = class
    property Value:T;
end;
you can't do
  TExtClass<T> = TBaseClass<T>;
but you can do
  TExtClass<T> = class(TBaseClass<T>);

Staying with that model, it would work to allow
type
  RBaseRec<T> = record
    Value:T;
  end;
  PBaseRec = ^record(RBaseRec);

The ^ is simply notation for how the instance will be referenced.

Aggregated records would be really handy for protocol buffers.
ReplyDelete
Replies
Lars Fosdal15 July, 2011 14:21
bah... forgot to escape the brackets again in that last ^record(RBaseRec<T>);
ReplyDelete
Replies
Lars Fosdal15 July, 2011 14:24
And again...
PBaseRec<T> = ^record(RBaseRec<T>);
ReplyDelete
Replies
Corrie Engelbrecht15 July, 2011 14:38
No, I don't think you understand.

It is impossible to compile this in Pascal:
type
  PRec<T> = ^TRec<T>;
  TRec<T> = record x:T; end;

There would have to be a rule that the T must be unique across all generic type parameters in at least unit scope for this to work.

So there would have to be a rule that this is illegal:
<pre>
type
  TRec1<T> = record v:T; end; // the usual
  TRec2<X> = record v:X; end; // still ok
  TRec3<T> = record v:T; end; // Not ok under hypothetical new rule. T has been used already.
</pre>
ReplyDelete
Replies
Lars Fosdal15 July, 2011 14:55
Hmm. I really don't follow you here.

AFAIK, the left side T is declarative, and right side T is referential, and the scope is limited to the construct, hence you can reuse T as much as you like across declarations?
ReplyDelete
Replies
Corrie Engelbrecht15 July, 2011 15:10
Yes, but every type declaration is a new beginning. You cannot use T in a record definition, and later on refer to the SAME T in another record definition.

See?

In the same way, you cannot use T in a record definition, and then later refer to the SAME T in a pointer definition.

So, in the following code:

type
  TRec1<T> = record v:T; end;
  TRec2<X> = record v:X; end;
  TRec3<T> = record v:T; end;

The first T is only for TRec1. You cannot reference that T in any of the subsequent record declarations.

Exactly like the T <> X in the above snippet, and no record can refer to the X except for TRec2.

And the first T (of TRec1) is NOT the same T as TRec3.

And for that very same reason, no other type can refer to the X (or the T).

That is why, in the following snippet:

type
  TRec1<T> = record v:T; end;
  PRec1<T> = ^TRec1<T>

...the two T's are not the same T.

And they can never be the same T under current Object Pascal syntactical rules.

The addition of requiring that all generic params (the T) be unique would make it possible to correctly associate PRec1<T> with TRec1<T>, but otherwise it's impossible.

You see, a key concept to note is that the following declaration:
type
  TRec1<T> = record v:T; end;

DOES NOT cause a type to be created.
ReplyDelete
Replies
Corrie Engelbrecht15 July, 2011 15:13
Did you not read "There would have to be a rule" and " hypothetical new rule" ?

It is obvious that I'm suggesting a way for your generic pointers to work with a new syntax rule.

Of course it is true that "the scope is limited to the construct, hence you can reuse T as much as you like across declarations"

But that is one of the two things that disallow the generic record pointers your article suggests.

The other thing is that a generic type is not a real type until the T becomes an actual type argument.
ReplyDelete
Replies
Lars Fosdal15 July, 2011 16:01
Ok. Got it. Not possible with todays rules and type generation. For it to work - the ^ would have to trigger a new rule, creating one type with two referential models when the parametric type is known after an instance declaration.

Duplicate references would need to resolve to same type instance, I guess.
ReplyDelete
Replies

Add comment

program begin end. // comments?

Pages

Delphi Programming

Thursday, July 14, 2011

Weird code snippet #2: Generic Double Linked List

20 comments:

Programming Related Blogs