
[WIP] Entity Framework Core 7 Providers #8654

Open · galvesribeiro wants to merge 6 commits into main from efcore-providers
Conversation

galvesribeiro
Member

@galvesribeiro galvesribeiro commented Oct 3, 2023

This PR adds Entity Framework Core 7 providers.

The goal is to provide base providers that are storage agnostic and rely only on Microsoft.EntityFrameworkCore, such as Microsoft.Orleans.XXXXX.EntityFrameworkCore.

We can provide database-specific packages like Microsoft.Orleans.XXXXX.EntityFrameworkCore.SqlServer, Microsoft.Orleans.XXXXX.EntityFrameworkCore.Postgresql, etc., which would depend on, for example, Microsoft.EntityFrameworkCore.SqlServer.

Developers can also extend the underlying DbContext (i.e. add extra indexes, constraints, etc.) and/or provide custom migrations.

  • Clustering
  • Persistence
  • Reminders
  • Directory
  • SQL Server support
  • MySQL support
  • PostgreSQL support

@galvesribeiro galvesribeiro changed the title [WIP] EFCore Provider [WIP] Entity Framework Core 7 Provider Oct 3, 2023
@galvesribeiro galvesribeiro changed the title [WIP] Entity Framework Core 7 Provider [WIP] Entity Framework Core 7 Providers Oct 5, 2023
@galvesribeiro galvesribeiro force-pushed the efcore-providers branch 5 times, most recently from 74d392a to 719fe27 on October 6, 2023 at 16:53
@m3nax
Contributor

m3nax commented Oct 24, 2023

Any ETA for this? I'm interested

@galvesribeiro
Member Author

@m3nax I'm trying to figure out the MySQL and PostgreSQL parts. Since they don't have a timestamp/etag primitive, it will be hard to do without locking the tables. The current ADO.NET providers take a lot of aggressive locks, which caused issues under high load; the providers here don't do that.

The problem is having a stable timestamp/etag mechanism outside SQL Server.
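
For context, one database-agnostic way to get an etag without a native timestamp primitive is an application-managed concurrency token in EF Core. A minimal sketch, assuming a hypothetical GrainStateRecord row (none of these names come from the PR):

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Hypothetical state row: the Guid ETag is regenerated by the provider on every
// write instead of relying on a database-specific timestamp/rowversion type.
public class GrainStateRecord
{
    public string GrainId { get; set; } = default!;
    public string Payload { get; set; } = default!;
    public Guid ETag { get; set; }
}

public class GrainStateDbContext : DbContext
{
    public GrainStateDbContext(DbContextOptions<GrainStateDbContext> options) : base(options) { }

    public DbSet<GrainStateRecord> States => Set<GrainStateRecord>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<GrainStateRecord>(b =>
        {
            b.HasKey(e => e.GrainId);
            // EF adds "WHERE ETag = @original" to UPDATE/DELETE statements, so a
            // conflicting writer fails with a DbUpdateConcurrencyException.
            b.Property(e => e.ETag).IsConcurrencyToken();
        });
    }
}
```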

Also, the team is busy with the .NET 8 release, so I would say this should be reviewed only after that.

@koenbeuk
Contributor

koenbeuk commented Dec 5, 2023

I've been implementing grains with EF-backed storage for a while now and my strategy is fundamentally different. Given that a DbContext should be short-lived, I essentially use this pattern:

  1. Register a DbContext which has all of the state types configured as entities. Each root entity needs to have a primary key that maps to the GrainId (I use Grain so I only deal with one state).
  2. Upon reading state, I fetch the entity from a new DbContext (including any navigated entities). I then make a deep clone of this entity so that I have an original and a copy.
  3. Upon writing state, I attach the original to a new DbContext and then apply the copy, essentially letting EF do the diff for me. This causes writes where nothing has changed to be no-ops and otherwise only updates what needs updating.

The benefits here are:

  1. Writes become much cheaper, in particular with grains whose state is large (as before I would deliberately defer or omit commits, since serializing the entire object graph to JSON and storing that would be too expensive).
  2. I can migrate grain states through EF migrations.

I would suggest considering an implementation similar to the one described above.
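
For readers following along, here is a rough sketch of the read/clone/attach/diff flow described above. It is not the PR's implementation; OrderState, its Lines navigation, OrderDbContext and DeepClone are hypothetical placeholders:

```csharp
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public sealed class OrderStateStore
{
    private readonly IDbContextFactory<OrderDbContext> _factory;

    public OrderStateStore(IDbContextFactory<OrderDbContext> factory) => _factory = factory;

    public async Task<(OrderState Original, OrderState Working)> ReadAsync(string grainId)
    {
        await using var db = await _factory.CreateDbContextAsync();
        var original = await db.Set<OrderState>()
            .Include(o => o.Lines)   // pull navigated entities too
            .AsNoTracking()
            .SingleAsync(o => o.GrainId == grainId);

        // Keep a pristine snapshot plus a working copy the grain can mutate.
        return (original, DeepClone(original));
    }

    public async Task WriteAsync(OrderState original, OrderState working)
    {
        await using var db = await _factory.CreateDbContextAsync();
        db.Attach(original);                                   // tracked, unmodified snapshot
        db.Entry(original).CurrentValues.SetValues(working);   // EF marks only changed scalars
        // Navigation properties would need similar treatment; omitted for brevity.
        await db.SaveChangesAsync();                           // no-op when nothing changed
    }

    private static OrderState DeepClone(OrderState state) =>
        JsonSerializer.Deserialize<OrderState>(JsonSerializer.Serialize(state))!;
}
```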

@AndrewBabbitt97

> I've been implementing grains with EF-backed storage for a while now and my strategy is fundamentally different. Given that a DbContext should be short-lived, I essentially use this pattern:
>
>   1. Register a DbContext which has all of the state types configured as entities. Each root entity needs to have a primary key that maps to the GrainId (I use Grain so I only deal with one state).
>   2. Upon reading state, I fetch the entity from a new DbContext (including any navigated entities). I then make a deep clone of this entity so that I have an original and a copy.
>   3. Upon writing state, I attach the original to a new DbContext and then apply the copy, essentially letting EF do the diff for me. This causes writes where nothing has changed to be no-ops and otherwise only updates what needs updating.
>
> The benefits here are:
>
>   1. Writes become much cheaper, in particular with grains whose state is large (as before I would deliberately defer or omit commits, since serializing the entire object graph to JSON and storing that would be too expensive).
>   2. I can migrate grain states through EF migrations.
>
> I would suggest considering an implementation similar to the one described above.

When I saw this PR I figured this is what was going to be implemented. I'm currently evaluating Orleans for an upcoming project, and this is the type of route I want to go down. This type of implementation makes entities much more flexible.

A hard-set schema for storage like this isn't great: 20231005033501_InitialPersistenceSchema.cs

I'd much rather see something similar to what ASP.NET Core Identity uses with its IdentityUser class: IdentityUser.cs. You can inherit from it in your own data model to add additional fields as needed.

Make it so users can implement their own data-model entities; that way they can benefit from the full power of Entity Framework, add indexes to their own columns, use JSON columns, and use LINQ to query data from the context when needed in places like stateless workers.
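
For illustration, the IdentityUser-style shape could look roughly like this (all names are hypothetical, not existing Orleans types):

```csharp
// Hypothetical base entity, analogous to IdentityUser: the provider owns the
// identifying/concurrency members, and the application inherits and extends it.
public class GrainStateEntity
{
    public string GrainType { get; set; } = default!;
    public string GrainId { get; set; } = default!;
    public string ETag { get; set; } = default!;
}

// The application adds its own columns and can index them, map JSON columns,
// and query them with LINQ from its own DbContext.
public class ShoppingCartState : GrainStateEntity
{
    public string CustomerId { get; set; } = default!;
    public decimal Total { get; set; }
}
```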

@galvesribeiro
Member Author

@koenbeuk / @AndrewBabbitt97 thank you both for the feedback.

The approach @koenbeuk suggested is more or less what I had before in production in a private project. Yes, the benefits you mention are clear. However, the developer is then forced to deal with EF directly: defining the DbContext, registering all state types, dealing with migrations, and so on. This introduces a level of friction we tried to avoid in this implementation.

This implementation (changes coming soon) aims to behave the same way as Azure Tables, Cosmos, and the other providers that read/write the whole state at once. The grain state is supposed to be owned by the grain itself, not read/written elsewhere; that can cause serious consistency problems which would potentially be hard to debug/fix.

That being said, I do agree that allowing the developer to customize the DbContext is a good thing, provided they are willing to take on the responsibility of dealing with it.

What about this:

  1. Have an "easy mode", which is what is currently in the PR. It essentially serializes any state class to its JSON representation and stores it in a pre-defined table with the provider's own DbContext. This is precisely the out-of-the-box behavior of the current ADO.NET provider.
  2. Have an "advanced mode", which allows the developer to register a custom DbContext type to be used by the provider. In this model the only requirement is that the state type has some pre-defined properties such as the grain type, grain id, eTag, etc.

Both "modes" can be registered on a per-provider basis, since storage providers are named/keyed registrations in the system, so you can mix and match.

Also, this would only be implemented in the storage provider; there is no point in having that complexity for the other providers.
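
Roughly, the registration could look something like the sketch below. The AddEntityFrameworkCoreGrainStorage method names and ShopDbContext are hypothetical, only meant to illustrate the per-provider (named) shape:

```csharp
using Microsoft.Extensions.Hosting;

var host = Host.CreateDefaultBuilder(args)
    .UseOrleans(silo =>
    {
        // clustering and other silo configuration omitted for brevity

        // "easy mode": state is serialized to JSON into a pre-defined table.
        silo.AddEntityFrameworkCoreGrainStorage("carts");

        // "advanced mode": a user-supplied DbContext models the state directly.
        silo.AddEntityFrameworkCoreGrainStorage<ShopDbContext>("orders");
    })
    .Build();

await host.RunAsync();
```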

What do you guys think?

@AndrewBabbitt97

AndrewBabbitt97 commented Dec 6, 2023

> @koenbeuk / @AndrewBabbitt97 thank you both for the feedback.
>
> The approach @koenbeuk suggested is more or less what I had before in production in a private project. Yes, the benefits you mention are clear. However, the developer is then forced to deal with EF directly: defining the DbContext, registering all state types, dealing with migrations, and so on. This introduces a level of friction we tried to avoid in this implementation.
>
> This implementation (changes coming soon) aims to behave the same way as Azure Tables, Cosmos, and the other providers that read/write the whole state at once. The grain state is supposed to be owned by the grain itself, not read/written elsewhere; that can cause serious consistency problems which would potentially be hard to debug/fix.
>
> That being said, I do agree that allowing the developer to customize the DbContext is a good thing, provided they are willing to take on the responsibility of dealing with it.

I'm aware of the consistency caveats if, say, someone updates records of grains that have active activations and how this could cause data inconsistencies. Keep in mind that the big value I see in a more advanced setup is being able to use EF for read-type operations and for grain relationships / indexing for search. While there have been attempts at this (like Orleans.Indexing), such projects haven't really been looked at much over the past few years and have mostly been research / pet projects.

From what I can tell, the recommended approach to things like searching all grains has been to search the underlying data store, and this makes sense. An EF provider opens up opportunities to do a lot more for state management and for indexing fields in grain state, along with things like migration support.
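
As an illustration, a read-side search over persisted grain state from a stateless worker could then be plain LINQ against the user's own context (ShopDbContext and ShoppingCartState are hypothetical types):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public static class CartQueries
{
    // Read-only search over persisted grain state; the grain itself still owns writes.
    public static Task<List<ShoppingCartState>> FindLargeCartsAsync(ShopDbContext db) =>
        db.Set<ShoppingCartState>()
            .AsNoTracking()
            .Where(c => c.Total > 100m)          // a column the user chose to index
            .OrderByDescending(c => c.Total)
            .Take(20)
            .ToListAsync();
}
```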

There are footguns that come with this approach, plus the additional work of managing things like migrations; documenting the risks and the situations users could end up in should be considered.

> What about this:
>
>   1. Have an "easy mode", which is what is currently in the PR. It essentially serializes any state class to its JSON representation and stores it in a pre-defined table with the provider's own DbContext. This is precisely the out-of-the-box behavior of the current ADO.NET provider.
>   2. Have an "advanced mode", which allows the developer to register a custom DbContext type to be used by the provider. In this model the only requirement is that the state type has some pre-defined properties such as the grain type, grain id, eTag, etc.
>
> Both "modes" can be registered on a per-provider basis, since storage providers are named/keyed registrations in the system, so you can mix and match.
>
> Also, this would only be implemented in the storage provider; there is no point in having that complexity for the other providers.
>
> What do you guys think?

Huge fan of having two modes for this; I think it's very appropriate.

I would just like your thoughts on whether, for the "advanced mode", rather than using a "grain type" field on an entity, you could instead associate types with a specific DbSet<TEntity>? I'm not entirely sure what this would look like from a configuration perspective at the moment, but that's at least how I initially envisioned it working.
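
Roughly what I'm picturing (type names are hypothetical):

```csharp
using Microsoft.EntityFrameworkCore;

// Each grain state type gets its own table via a dedicated DbSet<TEntity>,
// instead of one shared table discriminated by a "grain type" column.
public class ShopDbContext : DbContext
{
    public ShopDbContext(DbContextOptions<ShopDbContext> options) : base(options) { }

    public DbSet<ShoppingCartState> Carts => Set<ShoppingCartState>();
    public DbSet<OrderState> Orders => Set<OrderState>();
}
```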

No other real concerns or thoughts on it beyond that, and I could see myself using both the easy and expert modes in different situations 😊

@galvesribeiro
Member Author

galvesribeiro commented Dec 6, 2023

> I'm aware of the consistency caveats if, say, someone updates records of grains that have active activations and how this could cause data inconsistencies. Keep in mind that the big value I see in a more advanced setup is being able to use EF for read-type operations and for grain relationships / indexing for search. While there have been attempts at this (like Orleans.Indexing), such projects haven't really been looked at much over the past few years and have mostly been research / pet projects.

Yeah, indexing was a good experiment within MSR that proved it is possible to do with Orleans. I hope at some point we get back to it and implement it as a first-class citizen.

If properly implemented, it would avoid the risks of direct access to the underlying storage. But again, this is a discussion for another forum, and yes, I understand it would help a lot of different scenarios with regard to querying for grains by leveraging the storage's indexing capabilities.

> I would just like your thoughts on whether, for the "advanced mode", rather than using a "grain type" field on an entity, you could instead associate types with a specific DbSet<TEntity>? I'm not entirely sure what this would look like from a configuration perspective at the moment, but that's at least how I initially envisioned it working.

This approach is easy to implement if we have only one underlying database and only one context type, like in the private project I had back in the day. The tricky part is how to allow the user-provided DbContext type to be used with the correct storage provider registration for the correct grain type in a generic fashion, as this should work for any arbitrary grain/state. It would be insane to have one DbContext per state type, and using reflection to figure out the correct DbSet<T> property is also not an option, as it would hurt performance.

I will have to experiment a bit to find a solution.

@ReubenBond
Member

Have the "advanced mode" which allow the developer to register custom DbContext type which will be used by the provider. In this model the only requirement is that the state type have to have some pre-defined properties like the grain type, grain id, eTag, etc.

I especially like EF Core for this. Even if the EF Core integration were not a typical IGrainStorage implementation, and were instead something else, this could be the best approach. What are the biggest challenges here?

@galvesribeiro
Member Author

No challenge actually.

Just trying to find the best way to map and use the custom DbContext properly in the provider, so we can support whatever structure the developer defines without having to rely on reflection to figure out the correct DbSet<TGrainState> property.

Also trying to define a simple abstraction which enforces the GrainType/Id/eTag so the provider can work properly.
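
One possible shape for that abstraction, purely as a sketch (names are hypothetical; the generic DbContext.Set<TEntity>() is what would let the provider reach the right set without reflecting over the user's DbSet properties):

```csharp
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical constraint the provider could require on every state entity.
public interface IGrainStateEntity
{
    string GrainType { get; set; }
    string GrainId { get; set; }
    string ETag { get; set; }
}

public static class GrainStateAccess
{
    // DbContext.Set<TState>() resolves the mapped set from the EF model, so no
    // reflection over the user's DbSet properties is needed.
    public static Task<TState?> ReadAsync<TState>(DbContext db, string grainType, string grainId)
        where TState : class, IGrainStateEntity =>
        db.Set<TState>()
            .SingleOrDefaultAsync(s => s.GrainType == grainType && s.GrainId == grainId);
}
```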

@veikkoeeva
Contributor

veikkoeeva commented Dec 9, 2023

> The current ADO.NET providers take a lot of aggressive locks, which caused issues under high load; the providers here don't do that.

They can do whatever is needed. Aggressive locking is needed for the case where the grain does not exist and two grains are creating a state (if I recall correctly), but lighter-weight locking could be used, and is used, when the lighter route is taken. Just to point this out.

@AndrewBabbitt97

> Just trying to find the best way to map and use the custom DbContext properly in the provider, so we can support whatever structure the developer defines without having to rely on reflection to figure out the correct DbSet<TGrainState> property.

Not sure how much you've been thinking about this given the holidays (Happy New Year!).

Thinking about this, however: if the goal is to avoid reflection, perhaps a source generator would be a better approach?
