Extension lowering #7771

Open
MadsTorgersen opened this issue Dec 16, 2023 · 46 comments
@MadsTorgersen
Contributor

Lowering of extensions

Extensions are "transparent wrappers" that allow types to be augmented with additional members and (eventually) interfaces.

This outlines how we can implement extensions by lowering them to structs, and applying an erasure approach to signatures and generic instantiation.

In this document, base extensions and interfaces are ignored, except for brief consideration at the end.

Extension declarations

An extension declaration like this:

public extension E for U
{
    public static U Create() { ... }
    public static U operator +(U e1, U e2) { ... }
    public int M() { ... this ... }
    public string this[int i] { ... this ... }
}

is lowered to a struct declaration, which contains a private field of the underlying type, as well as the function members from the extension declaration, modified to indirect through the field as necessary:

public struct E
{
    private U __this;
    public static U Create() { ... }
    public static U operator +(U e1, U e2) { ... }
    public int M() { ... __this ... }
    public string this[int i] { ... __this ... }
}

In addition, an attribute or other marker may be used to designate that the struct represents an extension.

Extensions in generic instantiations

Extensions used as type arguments or array element types are erased to their underlying type:

List<E>

is lowered to

List<U>

This is the main mechanism ensuring that extensions and their underlying types are interchangeable, even through generic instantiation. In this way, a collection of the underlying type can be freely reinterpreted in terms of an extension, and the elements thus gain the extra members afforded by the extension.

Within member bodies, the compiler can keep track of the fact that a List<U> is "really" a List<E>, and provide appropriate conversions.
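
As a rough illustration of that bookkeeping, the sketch below hand-writes what the compiler might emit for a method whose source-level parameter is an array of an extension type. Celsius, TemperatureDemo and AllToFahrenheit are hypothetical names, and reinterpreting each element with Unsafe.As is an assumption borrowed from the member-access section of this proposal; the proposal does not spell out the exact code emitted for arrays.

```csharp
using System;
using System.Runtime.CompilerServices;

#pragma warning disable CS0649 // __this is only ever populated via reinterpretation

// Hypothetical lowered form of: public extension Celsius for double { ... }
public struct Celsius
{
    private double __this;
    public double ToFahrenheit() => __this * 9 / 5 + 32;
}

public static class TemperatureDemo
{
    // Hypothetical source-level signature: double[] AllToFahrenheit(Celsius[] temps)
    // After erasure, the parameter is a plain double[].
    public static double[] AllToFahrenheit(double[] temps)
    {
        var result = new double[temps.Length];
        for (int i = 0; i < temps.Length; i++)
        {
            // View the element in place as the extension struct; no copy of the array is made.
            ref Celsius c = ref Unsafe.As<double, Celsius>(ref temps[i]);
            result[i] = c.ToFahrenheit();
        }
        return result;
    }
}
```

Because the array itself stays a double[], callers holding the underlying type and callers holding the extension type share the same object.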

Extensions in signatures

Extensions are erased from signatures, and are instead communicated through metadata (attributes).

public E M(E e, E[] a, List<E> l) { ... }

is lowered to something like

public U M([Extension(...)] U e, [Extension(...)] U[] a, [Extension(...)] List<U> l) { ... }

The exact encoding scheme for the attributes is TBD, but will probably resemble those for nullable reference types and tuple element names, which are similarly type elements that are erased by the compiler.

This encoding scheme means that methods cannot be overloaded by different extensions for the same underlying type.
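
Tuple element names, mentioned above as a likely model, already behave this way today, so they make a reasonable stand-in for seeing what "erased but recorded in an attribute" looks like. The sketch below is only an analogy: ErasureDemo, Sum and Describe are hypothetical names, and the real extension encoding is still TBD.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class ErasureDemo
{
    // The names "left" and "right" exist only in source and in metadata attributes.
    public static int Sum((int left, int right) pair) => pair.left + pair.right;

    public static string Describe()
    {
        ParameterInfo p = typeof(ErasureDemo).GetMethod(nameof(Sum)).GetParameters()[0];

        // The runtime parameter type is the plain, unnamed ValueTuple<int, int>.
        bool erased = p.ParameterType == typeof(ValueTuple<int, int>);

        // The compiler records the names in a TupleElementNamesAttribute on the parameter.
        var names = p.GetCustomAttribute<TupleElementNamesAttribute>().TransformNames;

        return erased ? string.Join(",", names) : "not erased";
    }
}
```

For the same reason, two methods that differ only in tuple element names cannot be overloads of each other, which mirrors the overloading restriction described above.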

Extension member access

In order to provide access to extension members, the compiler is able to freely convert between the extension type and its underlying type.

This conversion relies on the fact that an extension always has exactly the same shape in memory as the underlying type. This makes it safe for the compiler to utilize the Unsafe.As(...) method.

The method

void M(E e)
{
    var i = e.M();
}

is lowered to

void M([Extension(...)] U e)
{
    ref var __e = ref Unsafe.As<U, E>(ref e); // Creates a ref of type E to e
    var i = __e.M();
}
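
To make the lowering above concrete, here is a minimal hand-written sketch with string standing in for U. WordExt, WordCount and MemberAccessDemo are hypothetical names chosen for illustration; the struct is what the proposal's lowering might produce, written out manually since the extension syntax itself does not exist yet.

```csharp
using System;
using System.Runtime.CompilerServices;

#pragma warning disable CS0649 // __this is only ever populated via reinterpretation

// Hypothetical lowered form of: public extension WordExt for string { public int WordCount() { ... } }
public struct WordExt
{
    private string __this;

    public int WordCount() =>
        __this.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
}

public static class MemberAccessDemo
{
    // Hypothetical source-level signature: int CountWords(WordExt e)
    public static int CountWords(string e)
    {
        // Reinterpret the string reference as a WordExt, relying on identical layout.
        ref WordExt __e = ref Unsafe.As<string, WordExt>(ref e);
        return __e.WordCount();
    }
}
```

This relies on the struct having exactly one field of the underlying type, which is the layout guarantee the proposal depends on.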

Conversions to and from extensions

The bi-directional implicit identity conversion between extensions and their underlying type can likewise be supported through the use of Unsafe.As:

U u = ...;
E e = u; // identity conversion
e.M();

which lowers to

U u = ...;
E e = Unsafe.As<U, E>(ref u);
e.M();

If the underlying type is a value type, the user may need to make e a ref to u, so that the value isn't copied in the conversion, and mutations occurring within extension members apply back to the underlying value:

U u = ...;
ref E e = ref u; // identity conversion
e.M(); // Mutations apply to u

which lowers to

U u = ...;
ref E e = ref Unsafe.As<U, E>(ref u); // identity conversion
e.M(); // Mutations apply to u
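
The difference between the copying conversion and the ref conversion can be seen in a small hand-lowered sketch with int as the underlying type. CounterExt, Increment and RefConversionDemo are hypothetical names; the struct mimics what the lowering of an extension with a mutating member might look like.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical lowered form of: public extension CounterExt for int { public void Increment() { ... } }
public struct CounterExt
{
    private int __this;
    public void Increment() => __this++;
}

public static class RefConversionDemo
{
    public static (int afterCopy, int afterRef) Run()
    {
        int u = 0;

        // Non-ref conversion: the value is copied, so the mutation is lost.
        CounterExt copy = Unsafe.As<int, CounterExt>(ref u);
        copy.Increment();
        int afterCopy = u;

        // Ref conversion: e aliases u, so the mutation applies to u.
        ref CounterExt e = ref Unsafe.As<int, CounterExt>(ref u);
        e.Increment();
        int afterRef = u;

        return (afterCopy, afterRef);
    }
}
```

Run() returns (0, 1): the first increment happens on a copy, the second on u itself.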

Implicit extensions

Implicit extensions are automatically used as "fallback" extensions for their underlying type in a given static scope. The compiler uses lookup machinery similar to today's extension methods to find where extension members are invoked, and implicitly inserts a conversion to the appropriate extension type. This is described in detail in the Extensions specification.

From there, the lowering proceeds as described above.

Base extensions

It is TBD how base extensions are encoded in the lowered extension declaration. When converting to base extensions, the compiler can use the same approach as between extensions and underlying types, since the underlying representation in memory is unchanged.

Extensions with interfaces

The lowering of an extension declaration which implements interfaces is in and of itself simple: add the interfaces to the lowered struct declaration.

However, semantics become complicated. In particular, the implicit conversions between generic instantiations over extensions vs underlying types may no longer apply when the extension implements interfaces, and those interfaces are material to satisfying constraints in the generic instantiation itself.

A compiler-only approach to this would be to simply not have such conversions when an extension implements an interface, but that means adding an interface to an extension is highly breaking. A more permissive and situational approach, however, would likely require significant new runtime feature work.

@SunnieShine

This comment was marked as outdated.

@KennethHoff

KennethHoff commented Dec 16, 2023

This encoding scheme means that methods cannot be overloaded by different extensions for the same underlying type.

That means you can't do something like this, which is unfortunate:

public extension TeacherId for Guid;
public extension StudentId for Guid;

public Person? Load(TeacherId id)
{
    return dbContext.Teachers.FirstOrDefault(x => x.Id == id);
}

public Person? Load(StudentId id)
{
    return dbContext.Students.FirstOrDefault(x => x.Id == id);
}

Instead you'd have to do this:

public extension TeacherId for Guid;
public extension StudentId for Guid;

public Person? LoadTeacher(TeacherId id)
{
    return dbContext.Teachers.FirstOrDefault(x => x.Id == id);
}

public Person? LoadStudent(StudentId id)
{
    return dbContext.Students.FirstOrDefault(x => x.Id == id);
}

(I'm also assuming public extension X for A in this context is an explicit extension, meaning explicit is the default and you have to explicitly state if you want an implicit extension. I agree with that, but it can be confusing to developers at first glance.)

The thing I expect to use explicit extensions for the most is typed values like this, where I want something to be functionally identical to a backing type but with a different meaning in my domain. Currently I do this with something like the following, but that has a lot of problems - especially with IO (serialization/deserialization):

public readonly record struct PersonId(Guid Value)
{
    public override string ToString()
    {
        return Value.ToString();
    }

    // Override `Parse`, `TryParse`, and add some Json serialization attributes, and so on...
}

@En3Tho

En3Tho commented Dec 16, 2023

I really wonder how this can be properly done without runtime support.
For example:

public void Do<T, U>(Dictionary<T, U> dict) where T : ISomeInterface1 where U : ISomeInterface2
{
    typeof(T/U) // - what is it here? Is it int or a wrapper?
    typeof(T/U).GetMethod("MyMethodFromInterface") // - and how will this work?
    dict[default(T)] = default(U); // - what happens here if Resize is called for example? From runtime point of view Wrapper[] and int[] are different types.
}
// extend int with ISomeInterface1 and ISomeInterface2
Do(new Dictionary<int, int>());
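
The runtime's view that a wrapper and its underlying type are distinct can be checked directly against a hand-written struct of the lowered shape. IntExt and IdentityDemo are hypothetical names; this sketch only shows today's runtime behavior and says nothing about how the final design would answer the questions above.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

#pragma warning disable CS0169 // __this is never touched in this sketch

// Hypothetical lowered form of an extension for int.
public struct IntExt
{
    private int __this;
}

public static class IdentityDemo
{
    // Same size in memory, which is what makes reinterpretation possible.
    public static bool SameSize() => Unsafe.SizeOf<IntExt>() == sizeof(int);

    // Distinct runtime types, for arrays as well as for generic instantiations.
    public static bool SameArrayType() => typeof(IntExt[]) == typeof(int[]);
    public static bool SameListType() => typeof(List<IntExt>) == typeof(List<int>);
}
```

SameSize() is true while the two type comparisons are false, which is why the proposal erases extensions to the underlying type in generic instantiations rather than instantiating over the struct.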

@KennethHoff

KennethHoff commented Dec 17, 2023

Adding to what @En3Tho and I said yesterday. I consider being unable to overload based on extensions to be a fairly huge problem, so I really hope this will be considered when working out the implementation details.

I think it's time to consider updating the IL metadata. Alongside this issue (extensions) - which I presume would be able to bypass the "According to IL it is the underlying type, so you can't overload based on them" problem with an updated IL representation - there are a few other language features that I believe would benefit from an update to the IL. dotnet/runtime#89730 being the first one that came to mind, but there are a few others that might, like dotnet/runtime#94620 and #5556, and I'm sure many more.

I can't really think of an example of explicit extensions that don't fit into the use-case mentioned above¹ (#7771 (comment)), and because of that I think the conceptual model of "TeacherId is a type that coincidentally has the exact same underlying data-structure as a Guid" is more useful than "TeacherId is a Guid with a different compiler-enforced name" (which seemingly is what this lowering strategy will give us). While I'm sure the latter has some use-cases, I consider the former to be much more useful.

Personally - and this might be taking it to the extreme - I would want to be able to disable implicit upcasting of explicit extensions (say that three times fast..), such that the following code would be illegal (I believe it currently is legal):

public extension TeacherId for Guid;

public Person? LoadPerson(Guid id)
{
    return dbContext.Persons.FirstOrDefault(x => x.Id == id);
}

TeacherId teacherId = ...;
LoadPerson(teacherId); // Personally I want this to error.

Guid guid = (Guid)teacherId; // Explicit casting should be allowed though.
LoadPerson(guid);

I realize this can be achieved through Roslyn Analyzers, but I want the language designers to be aware of this use-case.

Additionally, I hope the following is illegal by default, because if it isn't, then this feature is practically worthless for my use-case:

public extension TeacherId for Guid;
public extension StudentId for Guid;

public Person? LoadPerson(Guid id)
{
    return dbContext.Persons.FirstOrDefault(x => x.Id == id);
}

public Person? LoadStudent(StudentId id)
{
    return dbContext.Students.FirstOrDefault(x => x.Id == id);
}

TeacherId teacherId = ...;
LoadStudent(teacherId); // No "sideways casting" allowed.

Guid guid = ...;
LoadStudent(guid); // No downcasting allowed.

¹ I am in no way saying there aren't any, just that I can't think of any.

@CyrusNajmabadi
Member

CyrusNajmabadi commented Dec 17, 2023

I consider being unable to overload based on extensions to be a fairly huge problem

At that point, just wrap the type manually with a new nominal type. This can also be done easily and cheaply

The point of an extension is that it's new functionality, but the same type (just like an extension method). It's just broadening that to all members.

If you actually want new types, that's the easy thing that is already supported today :-)

@KennethHoff

KennethHoff commented Dec 17, 2023

I consider being unable to overload based on extensions to be a fairly huge problem

At that point, just wrap the type manually with a new nominal type. This can also be done easily and cheaply

The point is that that brings a lot of baggage, like requiring custom JsonSerializers, ToString overloads, Value Converters (EF Core), IParseable implementations, and so on..

The point of an extension is that it's new functionality, but the same type (just like an extension method). It's just broadening that to all members.

I agree that implicit extensions offer a lot of new useful functionality in this regard, but explicit extensions seemingly don't (but I'd love to be proven wrong).

@CyrusNajmabadi
Member

The point is that that brings a lot of baggage, like requiring custom JsonSerializers, ToString overloads, Value Converters (EF Core), IParseable implementations, and so on..

All that could be done with generators. :-)

If you want distinct types, that's the easy case. The challenge has always been in adding functionality to existing types.

@CyrusNajmabadi
Member

but explicit extensions seemingly don't (but I'd love to be proven wrong)

It definitely does. It just doesn't allow things like overloading. But so what? Just name the methods differently. :-)

@KennethHoff

KennethHoff commented Dec 17, 2023

It definitely does. It just doesn't allow things like overloading. But so what? Just name the methods differently. :-)

Assuming you still need nominal types to do real differentiation, then I fail to see what explicit extensions add to my repertoire that my previous solution and/or implicit extensions don't.

@HaloFour
Contributor

As extensions are themselves nominal types, it seems unfortunate to not be able to treat them as nominal types and to do stuff like overloading on them. Not being able to do so I think makes it awkward to use explicit extensions as a mechanism for type-safe aliases. Having to use two distinct names and cast to the explicit extension seems quite redundant:

foo.LoadStudent((StudentId) id);
foo.LoadPerson((PersonId) id);

The above proposal doesn't mention implicit/explicit extensions at all, either because that aspect of extensions remains up in the air or because it's assumed that it would not impact encoding. I'd like to posit the argument that explicit extensions should further blur the lines between aliases and types, and the encoding should bear that out. Otherwise, I don't understand the purpose of having the extension be a nominal type at all, or of allowing types to be declared as that extension in signatures or generic type arguments.

@agocke
Member

agocke commented Dec 17, 2023

This discussion has me convinced that people want two different features and are calling both of them "extensions."

If you want the thing called "type classes", then the concept of being able to "overload" on an extension is nonsensical. The definition of a type class is to define an interface or signature and then define how a type conforms to that interface. Giving a name to that particular implementation has some meaning, although now you are defining something closer to ML modules, but defining a new type is nonsensical. An implementation is, by definition, not a type. A type class is a type (or constraint bound). To say that a consumer of a type class can overload on implementation definitions is as nonsensical as saying that a method taking an interface gets to decide which interface implementation gets chosen. The whole point of the abstraction is that the caller has the flexibility, not the callee.

If a new type is defined, then the extensions feature isn't really about type classes at all, it's about subtyping. That is, an extension is really just a new type that is inheriting the components of the base type.

While we could have a single base syntax that allows for both features, I'm not sure that's a good idea.

@KennethHoff

KennethHoff commented Dec 17, 2023

I think the term extension makes little sense for the concept that I was describing earlier in this conversation. It's not extending anything, it's just repackaging something else, giving it a new name and pretending like they are not the same thing.

In my domain, TeacherId, StudentId, and Guid should never be used interchangeably. It is, for the purpose of my domain, a coincidence that they have the same underlying implementation.

I don't think anybody would call what I just described as "an extension to Guid", as I'm more likely to change the for Guid part than the extension StudentId part in a refactor.

I don't know what I would call what I want to achieve however¹, but extension ain't it.

Still confused as to what an explicit extension is though - if the use-case I've described is invalid - except for having yet another level of obfuscation; Needing both a using directive and a cast operator to use it.

¹ Type classes? @agocke mentioned that, but my googling gave me the most abstract answer I've ever seen, so I've no idea if that's it or not.

@HaloFour
Contributor

@agocke

If these aren't intended to be actual types, it also makes absolutely no sense to be able to define variables or parameters of them either, yet that's exactly how you're expected to use explicit extensions. Given the enormous amount of overlap I don't think it makes any sense to make them two completely different language features either. That feels like making them separate features for the virtue of adhering to some strict textbook definition of "type class" rather than providing solutions to solve problems.

@HaloFour
Contributor

I think the problem is that "explicit extensions" are already a gray area.

I'm not super versed on typeclasses, what little understanding I have of them is through Scala which probably takes some liberties in order to make them make sense in the JVM. In Scala, the given implementation or witness is not a nominal type and thus can't be used as a nominal type. As long as the given implementation is in scope, the members of the typeclass are available on the underlying type directly, and via generics you'd only ever reference the witness implementation through the typeclass itself. This also appears to be the case with Rust traits and Swift protocols, where the implementation is either only named for the sake of scoping, or not named at all, respectively.

Explicit extensions feel like something completely different in that the witness is a nominal type that you can use as a type in local variables and method signatures. If they aren't a part of the type system, I would argue that it doesn't make sense to allow that. If it's unpalatable to make them a part of the type system because they're called "extensions", then I would argue that maybe we should change "implicit extension" to just "extension" and "explicit extension" to "type".

The recent changes to aliasing in C# generated a lot of feedback from people wanting a way to define strict aliases in the language that behaved like types. It feels like extensions gets us something like 95% of the way there, both in terms of syntax and capabilities.

@alrz
Contributor

alrz commented Dec 19, 2023

I have to say that I find this confusing: struct E { private U __this; } why wrap then erase? that looks like a "newtype" imo. Is it only to track the type throughout the code? I can imagine the compiler could do that without actually emitting a wrapper type.

This issue is already apparent in the declaration: public extension TeacherId for Guid;. That example is probably best represented by something like type TeacherId = Guid;, because no actual "extension" is being defined. For me extension is like a container for extension members on a particular type aka "extension everything" which would be lowered to simple static methods, and so it wouldn't be useful to be empty.

Aside: even with that definition I'm not sure how the extension name would be used, e.g. how to call an instance extension property where it would be otherwise ambiguous? The fact that it can also implement interfaces is where things get conflated in my mind. I'm looking at impl .. for declarations in Rust, those would never need a name because it just adds a "conversion".

@Perksey
Member

Perksey commented Dec 20, 2023

that allow types to be augmented with additional members and (eventually) interfaces.

Is the "for" syntax a placeholder? It feels like it wouldn't be too hard to just reuse the existing base class syntax, i.e.

public extension MyExt : ThingImExtending

this is also later expandable to interfaces without it looking alien i.e.

public extension MyExt : ThingImExtending, IMyExtraThingImImplementing

which in my opinion looks more familiar than this

public extension MyExt for ThingImExtending : IMyExtraThingImImplementing

as this could be read as "Implement MyExt for ThingImExtending where ThingImExtending implements IMyExtraThingImImplementing". That isn't a concept expressible now or in the near future in the C# language, but from the perspective of someone who spends a lot of time in Rust, it is a misreading that people coming from languages with the "impl ... for" pattern could easily make, and it will definitely be confusing once C# has a richer type system more akin to these languages.

Obviously I recognise this isn't a work item now, but some agreement on how this could be done in the future is required to ensure the syntax doesn't clash or create confusion with future features.

@CyrusNajmabadi
Member

@Perksey all syntax is a strawman placeholder for now.

@KennethHoff

KennethHoff commented Dec 20, 2023

@Perksey The reason for the for infix is because you can extend extensions, like this:

public extension BaseExt for Guid;
public extension SuperExt for Guid : BaseExt;

Add in interfaces and they're more confusing

public extension BaseExt for Guid : IFirst;
public extension SuperExt for Guid : BaseExt, ISecond;

Add in partial implementation and they're even more confusing.

public extension BaseExt for Guid;
public partial extension SuperExt for Guid : BaseExt, ISecond;
public partial extension SuperExt for Guid : IThird; // I believe you're allowed to omit the base-class in partial declarations.

How would this third case work without the for keyword?

public extension BaseExt : Guid;
public partial extension SuperExt : Guid, BaseExt, ISecond; // Is this implementing an interface `BaseExt`?
public partial extension SuperExt : IThird; // Is this extending `IThird` all of a sudden? I thought it was extending `Guid`?

.. but it is as @CyrusNajmabadi said; Syntax is secondary, but this is the reason for the current choice, and for is an existing keyword (so no breaking change) and it "coincidentally" also sounds nice.

@Perksey
Member

Perksey commented Dec 20, 2023

public extension SuperExt for Guid : BaseExt;

Surely the SuperExt will be for the type of the BaseExt? I don't see a world in which you can extend an extension for a type for which the original extension isn't extending.

How would this third case work without the for keyword?

Likely the same way that partial works today, provided my above comment rings true.

@CyrusNajmabadi
Member

Surely the SuperExt will be for the type of the BaseExt

Definitely not:

Consider something as simple as:

extension ObjectExtensions for object { ... }

Then you do:

extension StringExtensions for string : ObjectExtensions

I would commonly expect this sort of extension hierarchy to occur.

@Perksey
Member

Perksey commented Dec 20, 2023

Ahhhhhhh damn, foiled by OOP again!

@Perksey
Member

Perksey commented Dec 20, 2023

Perhaps

public extension<string> StringExtensions : ObjectExtensions

as this reads as "public extension of string extending the ObjectExtensions extension". This is noisier, but is also akin to the existing type declaration syntax.

The only downside with this one is that it looks extremely strange with generics in the loop.

The main reason I'm interested in the syntax is that I have absolutely no doubt that in its current form, most of the discussions around this proposal will be around the intricacies of how it's lowered and expressed whereas I only care about how it looks in my code and how I use it. In its current form, with its current constraints, I don't see a world in which I'll be surprised by whatever functional decisions are made. I've been surprised before, but this is why I'm asking about the syntax (which is sort of like the final product) rather than the lowering itself (which is sort of the steps along the way).

@CyrusNajmabadi
Member

CyrusNajmabadi commented Dec 20, 2023

@Perksey #7771 (comment)

whereas I only care about how it looks in my code and how I use it

Hammering out syntax will come after we're happy with semantics.

rather than the lowering itself

That's literally this discussion though :)

@KennethHoff

@Perksey

most of the discussions around this proposal will be around the intricacies of how it's lowered

That is what this thread is about after all. Syntax-related discussions feel like they should be in #5497 (or maybe a separate issue? Discussion?)

@Perksey
Member

Perksey commented Dec 20, 2023

Ah missed that one (just went straight to the latest issue), thanks.

@agocke
Member

agocke commented Dec 20, 2023

@HaloFour You may not be very well versed in type classes but you summarized fairly well.

Let me give a quick run down of what problem they're meant to solve, and how.

Type classes were created to help with "ad-hoc polymorphism". Ad-hoc polymorphism was defined as follows by Wadler[1]:

Ad-hoc polymorphism occurs when a function is defined over several different types, acting in a different way for each type. A typical example is overloaded multiplication: the same symbol may be used to denote multiplication of integers (as in 3*3) and multiplication of floating point values (as in 3.14*3.14)

One historical problem with ad-hoc polymorphism solutions is that they don't integrate well with parametric polymorphism (generics). You can see this in pre-generic-math C#, where multiplication couldn't be used on generic type parameters because multiplication was not defined for anything that could be used as a constraint. Even now, if you were to try to define a new operation for Int32, you couldn't do it because that functionality would have to be added to the definition of Int32.

Existing ad-hoc polymorphism solutions also don't interact well with generic types. Let's say you wanted to implement deep-equals functionality for C#. You would start by defining an interface

interface IDeepEquatable<in T> {
    bool DeepEquals(T t);
}

Now let's say you want to define it for List<int>. First you'll need to implement IDeepEquatable for int. Let's say you can get that into the framework. That still leaves List<T>. As it is, you can't implement IDeepEquatable without knowing that T implements IDeepEquatable. But if you add the constraint to T, you can't create Lists of non-IDeepEquatable types, which is not correct.

So you need some mechanism for "conditional implementation." Type classes are one such option. And it appears that wrapping could be one such implementation. The problem is the composite types. If int doesn't implement IDeepEquatable, then you have to wrap it. Same with List -- but now you end up with nested wrappers, e.g.

public struct IntWrap(int value) : IDeepEquatable<int> {
  public bool DeepEquals(int other) => value == other;
}
public struct ListWrap<T>(List<T> value) : IDeepEquatable<List<T>>
  where T : IDeepEquatable<T>
{
   public bool DeepEquals(List<T> other) => ...;
}

So to construct a ListWrap you need a List<IntWrap>, which means you need to copy the list. Even worse, imagine if someone wrote code like

public void M(List<int> list) {
  if (list is null) { ... }
}

Now that always returns false because the wrapper is never null. Repeat for all other pattern checks ad nauseam. The problem is that type classes are meant to describe functions on existing types. They're not meant to introduce new types. Introducing new types introduces new semantics.
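
The null-check pitfall can be reproduced with a small standalone sketch. ListWrap here is a simplified, hypothetical variant of the wrapper above (no interface, explicit constructor), and NullCheckDemo and IsNull are illustrative names standing in for any code that guards against null in the usual way.

```csharp
using System;
using System.Collections.Generic;

// Simplified, hypothetical wrapper struct around a List<T>.
public struct ListWrap<T>
{
    private readonly List<T> _value;
    public ListWrap(List<T> value) { _value = value; }
    public bool UnderlyingIsNull => _value is null;
}

public static class NullCheckDemo
{
    // Generic code that checks for null the usual way.
    public static bool IsNull<T>(T item) => item is null;

    public static (bool list, bool wrapper, bool underlying) Run()
    {
        List<int> list = null;
        var wrapped = new ListWrap<int>(list);

        // The raw list is null, the struct wrapper never is,
        // even though the list it wraps is still null.
        return (IsNull(list), IsNull(wrapped), wrapped.UnderlyingIsNull);
    }
}
```

Run() returns (true, false, true): the same data gives different answers to the same check depending on whether it is seen through the wrapper, which is the change in semantics described above.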

[1] Wadler, Blott. How to make ad-hoc polymorphism less ad hoc. 1988

@TahirAhmadov

Hmmm.
What would be the roadblock if we were to make a different turn and use the actual struct type in metadata? For simple scenarios like invocations etc., it just changes where the Unsafe.As is done. For generics, though, if we have:

var list = new List<E>();
var list2 = (List<U>)list; // can this work somehow?

Fundamentally List<U> and List<E> are represented equally in memory anyway. Is this the runtime support "wall" that we would hit in this direction?

@HaloFour
Contributor

HaloFour commented Dec 21, 2023

@agocke

You may not be very well versed in type classes but you summarized fairly well.

I'll take that as a win. 😁

I can see where generics makes this a problem, and it's probably why type classes in Scala have to adhere to a particular convention of trait. They don't define the behavior of the type, they define the behavior of the witness working with the type. The witness is entirely separate, and needs to be passed as an implicit parameter. Scala 3 hides a lot of that complexity, but I think it ultimately desugars down to the same thing. Either way, neither Int32 nor List<T> can be IDeepEquatable<T>, but a method can use a given implementation of IDeepEquatable<List<int>> in order to determine if two List<int>s have deep equality.

Trying to bridge that with normal interface implementation sounds like a runtime conundrum. Type erasure may simplify the result, but it still feels like a lot of runtime work to really make this work. Trouble is, if you have a method with the signature bool TheyAreDeeplyEquals<T>(List<T> a, List<T> b) where T : IDeepEquatable<T>, where the heck do you pass a witness? Is that where Unsafe.As comes in? You sneak the List<int> in as a List<Int32DeepEquatable> where Int32DeepEquatable is a struct implementation of IDeepEquatable<int>?

Admittedly I'm a bit out of my league and can see where trying to get this to work at all can create warts and friction points elsewhere.

Just, wow: SharpLab

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

namespace TestTypeClasses {

    public interface IDeeplyEquatable<in T> {
        bool DeepEquals(T other);
    }

    public struct Int32DeeplyEquatable : IDeeplyEquatable<Int32DeeplyEquatable> {
        private int _val;
        bool IDeeplyEquatable<Int32DeeplyEquatable>.DeepEquals(Int32DeeplyEquatable other) => _val == other._val;
    }

    public struct ListDeeplyEquatable<T> : IDeeplyEquatable<ListDeeplyEquatable<T>> where T : IDeeplyEquatable<T> {
        private List<T> _val;
        bool IDeeplyEquatable<ListDeeplyEquatable<T>>.DeepEquals(ListDeeplyEquatable<T> other) {
            if (_val == other._val) {
                return true;
            }
            if (_val == null || other._val == null) {
                return false;
            }
            int count = _val.Count;
            if (other._val.Count != count) {
                return false;
            }
            for (int i = 0; i < count; i++) {
                T left = _val[i];
                T right = other._val[i];
                if (left is null) {
                    if (right is not null) {
                        return false;
                    }
                }
                else if (!left.DeepEquals(right)) {
                    return false;
                }
            }
            return true;
        }
    }

    internal class Program {
        static void Main(string[] args) {
            List<int> list1 = [1, 2, 3];
            List<int> list2 = [1, 2, 3];

            ref ListDeeplyEquatable<Int32DeeplyEquatable> tc1 = ref Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list1);
            ref ListDeeplyEquatable<Int32DeeplyEquatable> tc2 = ref Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list2);

            bool deeplyEquals = AreDeeplyEquals(tc1, tc2);
            Console.WriteLine($"Equals: {deeplyEquals}");
        }

        static bool AreDeeplyEquals<T>(T left, T right) where T : IDeeplyEquatable<T> {
            return left.DeepEquals(right);
        }
    }
}

@FaustVX

FaustVX commented Dec 21, 2023

Hi @HaloFour
If you change list2 to [4, 5, 6], AreDeeplyEquals still returns true, but if the two lists have different lengths, it returns false.
I don't know why, but I think IDeeplyEquatable<ListDeeplyEquatable<T>> works well, but IDeeplyEquatable<Int32DeeplyEquatable> doesn't.
On each loop inside IDeeplyEquatable<ListDeeplyEquatable<T>>.DeepEquals, left and right have the same value.
sharplab.io

@HaloFour
Contributor

HaloFour commented Dec 21, 2023

@FaustVX

Oops, that's a bug in my implementation. I'll update my comment to reflect the working version:

            for (int i = 0; i < count; i++) {
                T left = _val[i];
                T right = _val[i]; // oops!  should be: T right = other._val[i];
                if (left is null) {
                    if (right is not null) {
                        return false;
                    }
                }
                else if (!left.DeepEquals(right)) {
                    return false;
                }
            }

The approach works, although there are warts. The type of T is ListDeeplyEquatable<Int32DeeplyEquatable> and it's a runtime error to attempt to cast it to List<int>. I understand that this approach also can create problems for GC. I assume that runtime changes are going to be required in order to support this in general, so it's not particularly surprising that we can't cleanly manage this today.

@FaustVX

FaustVX commented Dec 21, 2023

@HaloFour
Haha, I didn't even see that.
I also reproduced it in VS Code and debugged through it,
and still didn't see that 😄

By the way, I'm impressed that the C# type system and memory management allow "casting" a class to a struct with an interface implemented, and that everything works.

@FaustVX

FaustVX commented Dec 21, 2023

@HaloFour

The type of T is ListDeeplyEquatable<Int32DeeplyEquatable> and it's a runtime error to attempt to cast it to List<int>

I don't understand what you wrote. I can cast it: ref var list3 = ref Unsafe.As<ListDeeplyEquatable<Int32DeeplyEquatable>, List<int>>(ref tc2); with no runtime error.
sharplab.io

Also, why the ref variables here?

ref var tc1 = ref Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list1);
ref var tc2 = ref Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list2);

Why not simply:

var tc1 = Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list1);
var tc2 = Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list2);

@HaloFour
Contributor

@FaustVX

I don't understand what you wrote. I can cast it: ref var list3 = ref Unsafe.As<ListDeeplyEquatable<Int32DeeplyEquatable>, List<int>>(ref tc2); with no runtime error.

Right, I think you'd use the ref version for a struct in order to avoid an unnecessary copy. That's not necessary here.

@TahirAhmadov

TahirAhmadov commented Dec 22, 2023

Wait, so everything is already supported? I'm confused.
The below demonstrates "composition" - language automatically generating a struct from 2 extensions, each for its own interface.
SharpLab

PS. Updated SharpLab. @FaustVX, the bug you hit was an incorrect assignment to T right.

@HaloFour
Contributor

Wait, so everything is already supported? I'm confused.

Possible through unsafe hacks that exploit implementation details but likely to cause things to explode at runtime? Yes. Supported? Definitely not.

@hez2010

hez2010 commented Dec 22, 2023

Trying to bridge that with normal interface implementation sounds like a runtime conundrum. Type erasure may simplify the result, but it still feels like a lot of runtime work to really make this work. Trouble is, if you have a method with the signature bool TheyAreDeeplyEquals<T>(List<T> a, List<T> b) where T : IDeepEquality<T>, where the heck do you pass a witness? Is that where Unsafe.As comes in? You sneak the List<int> in as a List<Int32DeepEquatable> where Int32DeepEquatable is a struct implementation of IDeepEquatable<int>?

Admittedly I'm a bit out of my league and can see where trying to get this to work at all can create warts and friction points elsewhere.

Just, wow: SharpLab

    internal class Program {
        static void Main(string[] args) {
            List<int> list1 = [1, 2, 3];
            List<int> list2 = [1, 2, 3];

            ref ListDeeplyEquatable<Int32DeeplyEquatable> tc1 = ref Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list1);
            ref ListDeeplyEquatable<Int32DeeplyEquatable> tc2 = ref Unsafe.As<List<int>, ListDeeplyEquatable<Int32DeeplyEquatable>>(ref list2);

            bool deeplyEquals = AreDeeplyEquals(tc1, tc2);
            Console.WriteLine($"Equals: {deeplyEquals}");
        }

        static bool AreDeeplyEquals<T>(T left, T right) where T : IDeeplyEquatable<T> {
            return left.DeepEquals(right);
        }
    }
}

A problem here is that tc1.GetType() will now return ListDeeplyEquatable<Int32DeeplyEquatable> instead of List<int> (SharpLab), which makes the wrapper non-transparent. We definitely need some runtime support here to make the type (i.e. ListDeeplyEquatable<Int32DeeplyEquatable>) compatible with List<int> while still behaving exactly like List<int>.

Ideally, the following things should be supported:

  1. tc1.GetType() yields List<int>, not ListDeeplyEquatable<Int32DeeplyEquatable>
  2. The result of tc1.GetType().GetInterfaces() contains IDeeplyEquatable<List<int>>, not IDeeplyEquatable<ListDeeplyEquatable<Int32DeeplyEquatable>>
  3. The result of tc1.GetType().GetMethods() contains DeepEquals
  4. An API that can indicate IDeeplyEquatable<List<int>> here is derived from IDeeplyEquatable<ListDeeplyEquatable<Int32DeeplyEquatable>>
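The gap between what reflection reports today and the wish list above can be sketched without Unsafe.As. This is only an illustration: ListOfIntWrapper and ReflectionDemo are hypothetical names, and the wrapper is a hand-written stand-in for a lowered extension, not anything the compiler emits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IDeeplyEquatable<T> { bool DeepEquals(T other); }

// Hypothetical hand-written stand-in for a lowered extension over List<int>.
public struct ListOfIntWrapper : IDeeplyEquatable<ListOfIntWrapper>
{
    private readonly List<int> _val;
    public ListOfIntWrapper(List<int> val) => _val = val;

    public bool DeepEquals(ListOfIntWrapper other)
        => _val is not null && other._val is not null && _val.SequenceEqual(other._val);
}

public static class ReflectionDemo
{
    // True when the runtime type of 'value' implements IDeeplyEquatable<TArg>.
    public static bool ImplementsFor<TArg>(object value)
        => typeof(IDeeplyEquatable<TArg>).IsAssignableFrom(value.GetType());

    public static void Main()
    {
        var wrapped = new ListOfIntWrapper(new List<int> { 1, 2, 3 });

        // Reflection sees the wrapper, not the underlying List<int>.
        Console.WriteLine(wrapped.GetType() == typeof(List<int>));    // False
        // The interface is instantiated over the wrapper type...
        Console.WriteLine(ImplementsFor<ListOfIntWrapper>(wrapped));  // True
        // ...and not over the underlying type, as item 2 would like.
        Console.WriteLine(ImplementsFor<List<int>>(wrapped));         // False
    }
}
```

Items 1 and 2 would flip the first and third results, which is why runtime support seems to be needed.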

@ds5678

ds5678 commented Dec 22, 2023

I like the idea of allowing overloads for extensions, but only if the null checking problem is solved.

static bool Method(Extension e)
{
    return e is null; //This must be able to return true.
}

There could be special casing (similar to Nullable<T>), but I doubt all the edge cases can be handled.
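To make the concern concrete, here is a minimal sketch assuming the extension is lowered to a struct over a reference type (string here). The IsNull helper and the NullCheckDemo class are hypothetical and not part of the proposal. With a plain struct, `e is null` does not compile, so today the check has to be routed through the wrapped field.

```csharp
using System;

// Hypothetical hand-lowered form of `extension Extension for string`.
public struct Extension
{
    private string __this;   // underlying value; may be null

    public Extension(string value) => __this = value;

    // Hypothetical helper; the proposal does not define this member.
    public bool IsNull => __this is null;
}

public static class NullCheckDemo
{
    // `return e is null;` would be a compile-time error here, because
    // Extension is a non-nullable value type. The check is forwarded instead.
    public static bool Method(Extension e) => e.IsNull;

    public static void Main()
    {
        Console.WriteLine(Method(new Extension(null)));   // True
        Console.WriteLine(Method(new Extension("abc")));  // False
        Console.WriteLine(Method(default));               // True: default wraps a null reference
    }
}
```

Special casing in the compiler would presumably rewrite `e is null` into something like the forwarded check, which is where the edge cases would come from.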

Edit: I'm very excited for the extensions feature. Huge thanks to the LDT for working on it. Happy holidays!

@agocke
Member

agocke commented Dec 22, 2023

@HaloFour Right, so let's dive into type witnesses.

From the previous description of type classes you can see that the module system basically creates a dichotomy between types and classes. Types are data. Classes are behavior (functions). If you start messing with the data, you need a different type. But behavior is fluid, flexible, context-dependent. The entire concept is predicated on the idea that types are fundamental, but how you view a type changes depending on the context. Sometimes a List can be compared with another List, sometimes it can't. Sometimes a List is serializable, sometimes it isn't. You want to be able to swap the available behavior on types in and out, automatically, depending on the context.

This starts to suggest certain implementation decisions. And this is where type witnesses come in. In type theory, a witness just means that the system gains evidence that a certain evaluation is sound, where it might not be so automatically. Implementation-wise, we might as well also carry the implementation of said evaluation in the witness, since we're carrying it around anyway. And this whole thing implies a fundamental design decision: types and classes (read: interfaces) are stored separately. If your function takes an argument with some sort of interface constraint, the type and the interface implementation would be stored separately. That allows the caller to swap out the appropriate implementation based on context. This is what languages like Scala call a "type witness."

The trouble with this approach in C# is, like you've noted, the interface definitions in C# implicitly produce instance invocations. Even though there's nominally an implicit this parameter for the function, for a C# type witness this would have the type of the witness, not the original type.

The other observation is that the witness doesn't carry any state. As established, state is part of the type. Only the behavior is part of the class. So all witnesses are function-definitions-only, no state. This is important in two ways. First, it explains my hesitancy at mixing "new type" with "type class." In conventional type class implementations, there's an important divide between the two. Second, it carries some implementation considerations.

When I mentioned that the type witness is an extra argument, one way to do that would be an actual extra function argument. This is the way Scala implements things with implicit parameters. The other way is as a type argument, which is how "type witness" is usually constructed in type theory. Since we know the witness is stateless, there's no need to carry around an extra parameter. We only need type information (dispatch information). You could imagine this as being an empty struct in C#. Consider,

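// The witness is an empty struct: it carries dispatch information only, no state.
interface IDeepEquals<in T> { bool Equals(T self, T other); }
struct IntWitness : IDeepEquals<int> { public bool Equals(int self, int other) => self == other; }
// default(TWitness) costs nothing to create, so no extra argument is passed at run time.
bool M<TEq, TWitness>(TEq left, TEq right) where TWitness : struct, IDeepEquals<TEq>
  => default(TWitness).Equals(left, right);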

Now we're carrying the witness along as a type argument instead of a function argument. But we still have the problem that the signatures are being defined in terms of the witness type.
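For comparison, the two ways of passing a witness can be put side by side. This is a sketch only: the member is named AreEqual rather than Equals to avoid any overlap with object.Equals, and WitnessDemo, ViaArgument and ViaTypeArgument are hypothetical names.

```csharp
using System;

public interface IDeepEquals<in T> { bool AreEqual(T self, T other); }

// Stateless witness: an empty struct that only supplies behavior for int.
public struct IntWitness : IDeepEquals<int>
{
    public bool AreEqual(int self, int other) => self == other;
}

public static class WitnessDemo
{
    // Witness passed as an extra function argument (the implicit-parameter style).
    public static bool ViaArgument<T>(T left, T right, IDeepEquals<T> witness)
        => witness.AreEqual(left, right);

    // Witness passed as a type argument; default(TWitness) materializes it on demand.
    public static bool ViaTypeArgument<T, TWitness>(T left, T right)
        where TWitness : struct, IDeepEquals<T>
        => default(TWitness).AreEqual(left, right);

    public static void Main()
    {
        Console.WriteLine(ViaArgument(1, 1, new IntWitness()));     // True
        Console.WriteLine(ViaTypeArgument<int, IntWitness>(1, 2));  // False
    }
}
```

In both forms the data type stays int; only the behavior is swapped in by the caller. The type-argument form cannot infer TWitness, so the call site has to spell out both type arguments.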

So maybe some sort of runtime-transparent wrappers are the answer here. Or maybe some specialty type arguments that let you override the implicit receiver.

But it all still comes down to what problem we're trying to solve, and what kind of module system we want to attach. Some of these implementation choices are perfectly fine if we want to narrowly solve some specific type-class-focused problems. But once the scope starts expanding, potential problems appear quickly.

@HaloFour
Contributor

But it all still comes down to what problem we're trying to solve, and what kind of module system we want to attach.

Not to mention trying to make it work with a 22 year old ecosystem.

Either way, I think the only real concern I have is what "explicit extensions" are intended to mean, given that they are nominal types in some way. There have been numerous requests for type-safe aliases in the language and it's been suggested that explicit extensions might be a way to accomplish that (if only by me?). Perhaps type erasure doesn't preclude that functionality, but it might create warts. It's not terribly uncommon to run into this with Java and generics, and it's super annoying.

@Sergio0694

Posting another scenario here too, in case anyone is able to share how something like this would (hypothetically speaking) work. I'd be interested in understanding what possible ways there could be to make this work when lowered. Consider this:

struct Foo : IFoo
{
    public void DoStuff()
    {
    }
}

interface IFoo
{
    void DoStuff();
}

interface IBar<T>
{
    static abstract void Baz(ref T x);
}

implicit extension FooExtension for Foo : IBar<Foo>
{
    public static void Baz(ref Foo x)
    {
    }
}

static void Test<T>(in T x)
    where T : IFoo, IBar<T>
{
    T.Baz(ref x);
    x.DoStuff();
}

And then you'd call it like so:

Test(new Foo());

The part I'm not really following is:

  • Test has absolutely no idea about any extension methods for type arguments for T. There is nothing in its signature that indicates any of that: it just has some T type parameter that's constrained to both IFoo and IBar<T>.
  • From the callsite, I'm calling Test with Foo as type argument. So I fully expect that my T will be exactly Foo. Meaning that eg. if Test did anything like typeof(T), that would return Foo. Eg. say, if it used that type for caching or whatever.
  • For simplicity, I'm only asking about cases where the interface being added via extensions is static (ie. only static members).

How would that new interface and implementation for IBar<Foo> be wired up through Test in cases like this one? 🤔


For context, this is exactly the kind of setup I have in ComputeSharp, and the idea would be that once implicit extensions became a thing, assuming this "static interface implementation through an extension" scenario was possible, I could have my source generator emit these implicit extension types for all user-defined types, so that they would then be able to call all APIs constrained to both those interfaces (but users would only manually implement the first one). This is pretty much the same kind of setup that, e.g., @eiriktsarpalis mentioned he'd like to have in System.Text.Json as well. I'm just trying to get a better understanding of what possible lowering strategies for this case would look like 😄

@BlinD-HuNTeR

How would that new interface and implementation for IBar<Foo> be wired up through Test in cases like this one? 🤔

I suppose one way of doing that could be the following:

  • As specified in the first post, an extension that implements an interface is lowered to a struct implementing that interface. So you would have a struct FooExtension that implements IBar<Foo>
  • When caller uses Foo as the type argument of the Test<T> method, the struct will be used instead. For this to be possible, however, the generated struct must also implement IFoo, so the compiler should generate this implementation with stub methods that forward the call to the original Foo object.
  • Finally, if the called method ever does typeof(T), it should return Foo instead of FooExtension. This would require runtime changes, but I suppose it shouldn't be hard to make the JIT recognize the ldtoken instruction that references an extension type, and generate native code that references the original type instead.
    Also, since FooExtension is a struct, any generic types/methods that are called with that struct as a type argument will have exclusive implementations generated, so there will be no conflict with other extensions, nor with calls involving the original Foo.
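The first two bullets can be approximated by hand in today's C# (static abstract interface members need C# 11 / .NET 7). This is a rough sketch under several assumptions, none of which come from the proposal: the wrapper implements IBar<FooExtension> rather than IBar<Foo>, because the constraint IBar<T> is checked against the substituted T; Test takes `ref T` so that `T.Baz(ref x)` has a writable reference; and the Calls counter, the Underlying property and LoweringDemo exist only so the effect is observable.

```csharp
using System;

public interface IFoo { void DoStuff(); }
public interface IBar<T> { static abstract void Baz(ref T x); }

public struct Foo : IFoo
{
    public int Calls;                  // hypothetical counter, for observation only
    public void DoStuff() => Calls++;
}

// Hypothetical lowered form of `implicit extension FooExtension for Foo`.
public struct FooExtension : IFoo, IBar<FooExtension>
{
    private Foo __this;
    public FooExtension(Foo value) => __this = value;
    public Foo Underlying => __this;   // hypothetical accessor, returns a copy

    // Forwarding stub so the wrapper also satisfies the IFoo constraint.
    public void DoStuff() => __this.DoStuff();

    // The member contributed by the extension.
    public static void Baz(ref FooExtension x) => x.__this.DoStuff();
}

public static class LoweringDemo
{
    public static void Test<T>(ref T x) where T : IFoo, IBar<T>
    {
        T.Baz(ref x);   // static dispatch through the type argument
        x.DoStuff();    // instance dispatch through the forwarding stub
    }

    public static int Run()
    {
        var wrapped = new FooExtension(new Foo());
        Test(ref wrapped);             // the call site substitutes the wrapper for Foo
        return wrapped.Underlying.Calls;
    }

    public static void Main() => Console.WriteLine(Run());  // 2
}
```

Inside Test, typeof(T) is FooExtension in this sketch, which is exactly the observable difference the third bullet would need the runtime to hide.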

@hez2010

hez2010 commented Jan 9, 2024

  • Finally, if the called method ever does typeof(T), it should return Foo instead of FooExtension. This would require runtime changes, but I suppose it shouldn't be hard to make the JIT recognize the ldtoken instruction that references an extension type, and generate native code that references the original type instead.

This would regress the runtime performance for all ldtoken calls. Even if we ignore the performance issue here, how would you deal with object.GetType?

interface IBar { }
extension FooExtension for Foo : IBar { }

F(new Foo());
void F(IBar bar) => Console.WriteLine(bar.GetType());

There's no way for the compiler to distinguish whether an argument passed into F is an extension wrapper or not.

@TahirAhmadov

TahirAhmadov commented Jan 9, 2024

@hez2010 I think they hope to be able to do some magic which allows this to work at no cost.
However my perspective was that when a method receives IBar, the only promise there is that IBar can be invoked; yet, many others insist on it being possible to simulate the actual instance being passed in, which means GetType and also object.ReferenceEquals and what not.
I'm still not sure what the use case is for this, but oh well - if they can pull it off using "magic", I'm all for it - but in the meantime it does seem to be complicating the feature - I can see how it could have been delivered sooner without this added requirement.
PS. IMO the "pay to play" approach would be new object.ReferenceEqualsWithExtension and Type.GetTypeWithExtension methods...

@Sergio0694

Just to clarify, in my example I was specifically only talking about extensions implementing static interfaces (ie. only static members), because my understanding is that could be simpler to do than having extensions implementing interfaces with instance members as well. Eg. in the case of just static members, you should be able to get all the info you need just via eg. some special generic type context. Not saying this makes it easy, but just saying it should at least be a subset of the more generalized "some interface" case, and possibly (hopefully?) with a simpler implementation 🤔

@BlinD-HuNTeR

how would you deal with object.GetType?

This question involves a broader scope. Basically there are two different scenarios for extensions with interfaces: one is calling a non-generic method that simply takes the interface as a parameter, and the other is calling a generic method with a type parameter constrained to the interface.

For the first scenario, there are still many questions unanswered. First of all, boxing is unavoidable here, which means that the extension and the underlying instance are going to be two different objects. And if they are different objects, what should object.GetType return? What about ReferenceEquals? And also, what should be the behaviour of isinst/castclass instructions? Should the receiver be able to transparently check the type of underlying object and cast to it? I suppose that would increase the complexity a little bit.

For the generic method scenario, it can be done without boxing, which means that extension and underlying object have the same memory layout. But the behaviour of object.GetType is still to be decided.
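The difference between the two scenarios can be observed with a hand-written wrapper. This is a sketch under assumptions: Foo is taken to be a class, and FooExtension is a hypothetical struct standing in for the lowered extension; BoxingDemo is likewise made up.

```csharp
using System;

public interface IBar { }
public sealed class Foo { }

// Hypothetical lowered wrapper for `extension FooExtension for Foo : IBar`.
public struct FooExtension : IBar
{
    private readonly Foo __this;
    public FooExtension(Foo value) => __this = value;
}

public static class BoxingDemo
{
    // Scenario 1: interface-typed parameter. A struct wrapper has to be boxed.
    public static Type ViaInterface(IBar bar) => bar.GetType();

    // Scenario 2: constrained generic. The wrapper is passed without boxing.
    public static Type ViaGeneric<T>(T bar) where T : IBar => typeof(T);

    public static void Main()
    {
        var foo = new Foo();
        IBar boxed1 = new FooExtension(foo);   // boxes a copy of the wrapper
        IBar boxed2 = new FooExtension(foo);   // a second, distinct box

        Console.WriteLine(ViaInterface(boxed1) == typeof(Foo));      // False
        Console.WriteLine(object.ReferenceEquals(boxed1, boxed2));   // False
        Console.WriteLine(object.ReferenceEquals(boxed1, foo));      // False
        Console.WriteLine(ViaGeneric(new FooExtension(foo)) == typeof(FooExtension)); // True
    }
}
```

Every boxed wrapper is a new object with its own identity, so both GetType and ReferenceEquals expose the wrapper unless the runtime special-cases it.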

@hamarb123

hamarb123 commented Jan 29, 2024

Not sure if it's been asked yet, but is there any reason we're not using custom modifiers to enable overloading by extension type (for explicit extensions anyway)?

public extension StudentId for Guid
{
}
public extension TeacherId for Guid
{
}

public static string GetName(StudentId id) => ...;
public static string GetName(TeacherId id) => ...;

->

.method public hidebysig static string GetName(valuetype [System.Runtime]System.Guid modopt(valuetype StudentId) id) cil managed
{
    ...
}
.method public hidebysig static string GetName(valuetype [System.Runtime]System.Guid modopt(valuetype TeacherId) id) cil managed
{
    ...
}

or similar.
