Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal]: Extending patterns to "as" #8210

Open
1 of 4 tasks
IS4Code opened this issue Jun 12, 2024 · 20 comments
Open
1 of 4 tasks

[Proposal]: Extending patterns to "as" #8210

IS4Code opened this issue Jun 12, 2024 · 20 comments
Assignees

Comments

@IS4Code
Copy link

IS4Code commented Jun 12, 2024

Extending patterns to "as"

  • Proposed
  • Prototype: Not Started
  • Implementation: Not Started
  • Specification: Not Started

Summary

This proposal seeks to extend pattern matching to the as operator in a manner analogous to is, to allow x as <pattern> achieve an effect similar to x is <pattern> y ? y : null, or to explore ways to arrive at a similar outcome.

Motivation

A lot of the language is built around pattern matching and option-like null propagation or guards (?. and ??), yet these two systems don't go that well together. For example:

Class(string? a, string? b)
{
    this.a = a != null ? a : "<unspecified>";
    if (b == null) throw new ArgumentNullException(nameof(b));
    this.b = b;

    // can be easily simplified as

    this.a = a ?? "<unspecified>";
    this.b = b ?? throw new ArgumentNullException(nameof(b));
}

Yet in situations where null is not the only "exceptional" value, there is no syntax to simplify to:

Class(string? a, int b)
{
    this.a = a is not null or "" ? a : "<unspecified>";
    if (b < 0) throw new ArgumentOutOfRangeException(nameof(b));
    this.b = b;
    // both "a" and "b" still need to be referenced twice
}

In many situations, there are special values outside of null that need to be checked ‒ empty strings or collections, 0, -1, or SomeEnum.None, or "invalid" objects, but there is no easy way of discarding these values other than using conditions.

What if we were instead able to leverage the existing null-centered syntax to "coerce" the unwanted values to be null? Is there any piece of syntax with the meaning "make sure this is X, otherwise be null"? Actually, yes ‒ as!

Detailed design

This proposed syntax aims to extend as, consistently with its original relation to is, to support (non-assigning) patterns. Roughly speaking:

x as P

would be more or less equivalent to

x is P y ? y : null

This is a natural extension for the original syntax x as T, now reinterpreted as x is T y ? y : null. The operation corresponds to "match x to P, or produce null if unmatched".

Concretely, this would parallel is in the following situations:

is syntax is semantics as syntax as semantics
x is T type check x as T unchanged (cast to T?, or null on failure), possibly extended to allow non-nullable value type T
x is T { ... } type and properties check x as T { ... } cast to T?, ensuring the properties, or null on failure
x is { ... } properties check x as { ... } cast to the nullable equivalent of the type of x, ensuring the properties, or null on failure
x is T and P type and P check x as T and P cast to T? (or a more specific applicable type from P), ensuring P, or null on failure
x is P P check x as P cast to the nullable equivalent of the type of x (or a more specific applicable type from P), ensuring P, or null on failure

The pattern may not create any new variables, since otherwise their use would require the check that the result is not null.

Examples of semantics

object? o;
string s;
byte b;
  • s as { Length: > 10 } ‒ results in s if s is { Length: > 10 }, or null (type string?).
  • o as >= 0 ‒ results in a value of type int? as either >= 0 or null.
  • b as not 0 ‒ results in a value of type byte? with the value of b if non-0, or null.
  • o as not null ‒ just equivalent to o.

Examples of usage

The code above could be easily simplified to:

Class(string? a, int b)
{
    this.a = a as not "" ?? "<unspecified>";
    this.b = b as >= 0 ?? throw new ArgumentOutOfRangeException(nameof(b));
}

Other possibilities include:

  • (timeout as not 0) ?? DefaultTimeout ‒ treat 0 as unspecified timeout. Can be read as "timeout as something non-0, otherwise DefaultTimeout".
  • (myEnumValue as not MyEnum.Unspecified) ?? DefaultMyEnumValue ‒ same for an enum value:
  • numerator / (denominator as not 0)null if division would cause a black hole.
  • (collection as not [])?[0] ‒ access first element if there is one.
  • (entity as { Exists: true })?.Property ‒ access entity only if it still exists, otherwise null.
  • stream.ReadByte() as not -1 ‒ useful even alone so that you don't mistakenly use an invalid byte.
  • IndexOf(...) as not -1 ‒ likewise.
  • obj as not null ‒ not useful much, just results in obj (but makes the type nullable). Could be a warning.

Drawbacks

The new interpretation of as might need some taking used to, to communicate the intended meaning.

Alternatives

There are possibilities for an alternative syntax, not depending on as, with a similar effect:

Class(string? a, int b)
{
    this.a = a when not "" else "<unspecified>";
    this.b = b when >= 0 else throw new ArgumentOutOfRangeException(nameof(b));
}

In this case, one would need to be explicit about the introduction of null, e.g. by writing b when not >= 0 else null.

Unresolved questions

When T is a non-nullable value type, x as T is currently prohibited. While it possible to achieve a similar effect whilst keeping this restriction, it might make sense to allow producing Nullable<T> instead in such a situation. In any case, i as >= 0 for an int i would need to produce int? to be usable, unless an alternative syntax is picked that requires one to specify the "otherwise" value upfront.

Design meetings

@333fred 333fred self-assigned this Jun 12, 2024
@julealgon
Copy link

How would you reconcile this with all the guard methods that were recently introduced, such as ArgumentOutOfRangeException.ThrowIfNegative and such?

The reason they exist is to avoid the whole nameof(param) thing (which by itself is error prone), and to centralize the various message variations (there are many for ArgumentOutOfRangeException).

By using the old ?? throw new... throw expression syntax, you just lose all the benefits of the native guard methods.

I don't disagree with what you are proposing in essence (I've built something similar using .As extension methods actually before), but at the same time I don't want to lose the great benefits that the guard methods provide.

I've actually completely replaced all of these in our codebase with the guard methods recently. For example:

this.user = user ?? throw new ArgumentNullException(nameof(user));

With:

ArgumentNullException.ThrowIfNull(user);
this.user = user;

I don't think approving this with the intent of using it for argument validation is a great idea without also reconsidering all the guard methods: an ideal solution would handle both things.

The ultimate approach to me would be to properly add contracts to the language and allow this (which has been discussed a lot in multiple issues by now):

public MyClass(string user, int value)
    where user is not null
    where value is >= 0 and < 100
{
    this.user = user;
    this.value = value;
}

@CyrusNajmabadi
Copy link
Member

By using the old ?? throw new... throw expression syntax, you just lose all the benefits of the native guard methods.

That seems like a runtime issue. This pattern woudl be fairly trivial/idiomatic. So it could be detected and converted to the same IL that would be emitted for calling into the helpers.

@IS4Code
Copy link
Author

IS4Code commented Jun 12, 2024

It's a good point that there are these methods to use, but there also situations where you might not want to produce an exception, or the existing guard methods don't cover all patterns you might want.

There is also a potential of another neat way of parameters checking, using the proposed syntax:

this.x = ArgumentOutOfRangeException.ThrowIfUnmatched(x as >= 0 and <= 255);

@julealgon
Copy link

By using the old ?? throw new... throw expression syntax, you just lose all the benefits of the native guard methods.

That seems like a runtime issue. This pattern woudl be fairly trivial/idiomatic. So it could be detected and converted to the same IL that would be emitted for calling into the helpers.

I don't follow you here @CyrusNajmabadi . How would code like:

this.b = b as >= 0 ?? throw new ArgumentOutOfRangeException(nameof(b));

Avoid having to pass in the explicit exception type, the nameof(b) and the specific error message for the >= validation (all of which are benefits of using the guard methods)?

Or perhaps you are suggesting something like this (which would be very interesting if possible)?

this.b = b as >= 0 ?? throw;

In this case, the compiler could see the various trivial comparisons and pick the correct exception type and message. Is this what you are alluding to?

@CyrusNajmabadi
Copy link
Member

By using the old ?? throw new... throw expression syntax, you just lose all the benefits of the native guard methods.

That presumes that the specific corresponding native guard method exists. And, if they do, that means they have a specific code pattern in them. For example if arg is null throw new ArgNullException(...).

I'm saying, if your callsite has effective the identical pattern that calling the guard method would emit, then the runtime can/should treat them uniformly. There should be no penalization that you happened to call one vs the other.

@julealgon
Copy link

By using the old ?? throw new... throw expression syntax, you just lose all the benefits of the native guard methods.

That presumes that the specific corresponding native guard method exists. And, if they do, that means they have a specific code pattern in them. For example if arg is null throw new ArgNullException(...).

I'm saying, if your callsite has effective the identical pattern that calling the guard method would emit, then the runtime can/should treat them uniformly. There should be no penalization that you happened to call one vs the other.

Oh I see what you mean now @CyrusNajmabadi , but that was not my point. I wasn't saying one would miss "functional" benefits by using a different syntax, but maintenance-related ones.

For example, having to provide the nameof(argument) to the exception is something that is not needed when using the guard methods as they rely on [CallerMemberExpression] to automatically grab the argument name. Not having to provide that is good, as it reduces the chance of mistakes happening (using incorrect values, or introducing a typo when using raw strings etc).

Similarly, each guard method has a built-in message for the ArgumentException and ArgumentOutOfRangeException types, based on the method that is being called. By using those exceptions manually, one needs to then provide a message on every call site to get "equivalent" behavior as the guard methods. Maintaining those messages is also an additional burden, and another source for mistakes to happen.

And lastly, the guard methods also take care of choosing the more appropriate exception type and message based on the underlying checks being done. For example, using ArgumentException.ThrowIfNullOrWhiteSpace will throw 2 distinct exceptions each with their own bespoke message, depending on whether the value passed in is null (ArgumentNullException), or empty, whitespace, etc (ArgumentException + custom message).

This kind of validation is harder to capture using a single expression, especially if using the syntax proposed here, as there is no way to throw different exceptions using a single throw expression after a coalescing operator. So one would need to go down into a switch statement instead, which brings its own issues such as exhaustiveness checks.

My overall point is that any new feature that influences how argument validation is done should seriously consider how the existing guard methods will play with them, even if the answer is to deprecate the guard methods altogether.

I would love to have something like I mentioned above though, which would be a combination of both worlds:

this.b = b as >= 0 ?? throw;

If the compiler can understand the validation pattern being applied, it would pick the correct exception and message to throw automatically. If it cannot understand it, it would signal this with a compilation error.

The only caveat with an approach like this of course is how to make it extensible (and if that's even desirable in the first place). This would be a bit less complex (and less flexible) than full-blown contracts, but still much better than what we currently have.

@HaloFour
Copy link
Contributor

Given this feature is not specific to argument validation I don't think it makes sense to try to bake that much "magic" into the language specification for this feature.

@julealgon
Copy link

Given this feature is not specific to argument validation I don't think it makes sense to try to bake that much "magic" into the language specification for this feature.

I don't disagree (what I said above would likely become a separate proposal), but at the same time, the main arguments presented in favor of this proposal here are tied directly to argument validation.

I think they should be considered in conjunction to avoid disrupting existing argument validation patterns and creating clashing standards.

For a very long time, argument validation wasn't strongly standardized. With the introduction of the native guard methods, this has now changed, as most common validations are covered there. With the introduction of extension types in .NET9, it will also be possible to introduce extra guard methods using the exact same pattern as the native ones.

Using this as extended proposal for going back to manual throw expressions would be a regression on that effort IMHO.

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Jun 12, 2024

I eprsonally disagree. Using throw is normal and idiomatic. If there is something common enough it could benefit from a lib, then put in a lib. But that seems like the very narrow case. The runtime seems to only have a tiny handful of these, just for the cases where they cared about perf enough for thsi to be relevant. I would not extend that out to any sort of pattern beyond that.

@hez2010
Copy link

hez2010 commented Jun 14, 2024

x is T and P type and P check x as T and P cast to T? (or a more specific applicable type from P), ensuring P, or null on failure

This doesn't follow today's is semantic where it allows you casting x to both T and P:

if (x is T t and P p) {
  // use t and p here
}

Today we are not allowed to do

if (x is (T and P) tp) // error, waiting for union/intersection types coming in future

Allowing as T and P today would block the future of union/intersection types in C#, as ideally x as T and P should result in type T and P (T & P).

@IS4Code
Copy link
Author

IS4Code commented Jun 15, 2024

@hez2010 Good point! It would be better to disable P from changing the type of the result in that case.

@IS4Code
Copy link
Author

IS4Code commented Jun 16, 2024

Actually, to provide more thoughts, the reasoning behind as T and P was intended primarily to make something like as IComparable and > 10 turn into int? (since > 10 in this situation is an implicit int pattern), i.e. the first type should not determine the result alone.

To refine on this, I think it is sufficient if the pattern is restricted (at the moment) to always result in an identifiable type. Practically, this means that as T1 and T2 is fine only if T1T2 or T2T1, and T1 or T2 is fine only if T1 = T2 (so it is pretty much useless, but is allows it too, so why not). In the future, the other cases could be turned into T1 & T2 or T1 | T2.

@TahirAhmadov
Copy link

I think these type unions and intersections should be prohibited in the proposed as syntax for now. It's always easier to prohibit now, then allow later once proper design is done for it.

@julealgon
Copy link

It's always easier to prohibit now, then allow later once proper design is done for it.

Not only that but it also allows for faster development of the rest of the design and gets some useful additions out of the door without having to develop (and especially discuss/define) the whole thing completely which as we know can take years sometimes.

I particularly like when it is possible to implement something incrementally because of that benefit. It is basically what happened (and is still happening) with pattern matching.

@bbarry
Copy link
Contributor

bbarry commented Jun 27, 2024

Is there a discussion open for this item? I feel like most of the comments here would benefit from the formatting and associations that a discussion provides.

@IS4Code
Copy link
Author

IS4Code commented Jun 27, 2024

@bbarry Yes, #8207 is what this started as.

@Thaina
Copy link

Thaina commented Jun 29, 2024

Interesting idea
But I don't like the way that we use as >= 0 and expect int? because the syntax should match float and any object that implement interface IEqualityOperators<int>

@IS4Code
Copy link
Author

IS4Code commented Jun 29, 2024

@Thaina I don't like it either, but that is how patterns work in general ‒ x is >= 0 checks for int.

@hez2010
Copy link

hez2010 commented Jun 29, 2024

@Thaina I don't like it either, but that is how patterns work in general ‒ x is >= 0 checks for int.

It's using targeted type. If x is float it will match against float as well.

@glen-84
Copy link

glen-84 commented Jul 16, 2024

My down-vote is because I don't find:

this.a = a as not "" ?? "<unspecified>";

... very readable, and it also feels weird to use the null-coalescing operator in this way.

I prefer the alternative syntax. I even wonder if it's technically possible to shorten the 2nd example to:

this.b = b when >= 0;

... and automatically throw an ArgumentException when the pattern doesn't match.

Although in this case, when might not be the right keyword.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants