Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add .select() method #5

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Add .select() method #5

wants to merge 12 commits into from

Conversation

DaddyWarbucks
Copy link
Collaborator

@DaddyWarbucks DaddyWarbucks commented Jan 19, 2023

This PR makes the chainable methods more extendable and includes a new method select().

The main goal of this PR is to offer better performance when the user is "shaping" results, which is done primarily through the $select keyword. When using the query.$select the different shaped results are stored in the cache according to their params.

// This creates 3 cache entries because the params change for each.
const user1 = loader.service('users').load(1);
const user2 = loader.service('users').load(1, { query: { $select: ['name'] } });
const user3 = loader.service('users').load(1, { query: { $select: ['role'] } });

But, if the loader took over the select/shape of results, results could be cached more efficiently.

// This creates 1 cache entry, the selecting is done after caching
const user1 = loader.service('users').load(1);
const user2 = loader.service('users').select(['name']).load(1);
const user3 = loader.service('users').select(['role']).load(1);

The loader uses the same select algorithm as @feathersjs/adapter-commons for its default selectFn. And the user can now pass a new configuration option selectFn that will be used if the call has a .select(). And the select() can be used on ALL methods.

Note that this does not attempt to remove query.$select or alter the cache key in any way. You should not use select() and query.$select in conjunction.

// Works as usual, but poor for caching
const user = loader.service('users').load(1, { query: { $select: ['name'] } });

// Works partially but could break caching
const user = loader.service('users').select(['name']).load(1, { query: { $select: ['name'] } });

// Creates 2 cache entries because params changed even though select didn't
// This is expected and correct.
const user1 = loader.service('users').select(['name']).load(1, { myParam: true });
const user2 = loader.service('users').select(['name']).load(1, { myParam: false });

You can also pass a selectFn argument to the configs. This is a sync function that is called for each result like select(selection). For example:

const selectFn = (selection, result) => {
  // selection can be anything, it doesn't have to be an array
  const { $select, $resolve } = selection;
  // Do some magic
  return result;
}

const result = await loader
  .service('users')
  .select({ $select: ['name'], $resolve: ['posts'] })
  .load(1);

We should probably update the selectFn to handle a promise. This would allow the method to handle basically any kind of "shaping" of results after the cache. It can also help decoupled nested loaders from this loader's cache. If the developer handled nested loaders in the generic select() method instead of the nested service hooks, when the nested loader is cleared the developer doesn't have to know to clear this loader too.
Perhaps select() is not a great name for the method at that point either.

@DaddyWarbucks
Copy link
Collaborator Author

DaddyWarbucks commented Jan 19, 2023

Furthermore, this does make the chainable methods more flexible. The chaining mechanism is basically overwriting some internal state on the class. So the user can chain things in any order and multiple times. The latest state wins.

const loader = loader.service('users');

If (something) {
  loader.multi('myKey')
} else {
  loader.key('myKey')
}

If (somethingElse) {
  loader.select(['name'])
}

await loader.load(1)

But, we may want to lock this down...because the user can chain methods that don't make sense on some other methods.

// .key('myKey') has no effect here
const users = loader.service('users').key('myKey').find()

@marshallswain
Copy link
Member

This looks good.

I don't think the risks of chaining flexibility are material enough to merit writing code to keep users from accidentally doing something dumb.

@DaddyWarbucks
Copy link
Collaborator Author

I wonder if we should move the cacheParamsFn and cacheKeyFn to the chainable methods too? The new selectFn directly maps to the chainable select() method, and the cacheKeyFn maps to the key() and multi() methods.
Right now, all core methods take a last argument that is the an override to the default cacheParamsFn.
For example

// This can be set at the global level and at the service level.
// But you can currently also set it per method call too.
const myCacheParams = (params) => {
  return {
    ...some custom params
  }
}

const result = await loader.service('users').load(1, params, myCacheParams);

There currently is not a way to override the cacheKeyFn or selectFn per call. So I wonder if we could ditch the cacheParamsFn as the last argument and move both of these to chainable methods. And add an override to the selectFn

const result = await loader.service('users')
   // function as second arg?
  .key('myKey', myCacheKeyFn)
  // Same for multi
  .multi('myKey', myCacheKeyFn)
   // better name? Note it does not take a first arg. it gets params from the load()
  .params(myCacheParamsFn)
   // function as second arg?
  .select(['name', 'email'], mySelectFn)
  .load(1, params)

@DaddyWarbucks
Copy link
Collaborator Author

DaddyWarbucks commented Jan 27, 2023

I take most of that back actually. The cacheKeyFn is an underlying DataLoader config option and should not change between calls. And that kinda matches normal service config, you don't change the service's id option from call to call.

But we could still do

const result = await loader.service('users')
  .key('myKey')
   // better name? Note it does not take a first arg. it gets params from the load()
  .params(myCacheParamsFn)
   // function as second arg?
  .select(['name', 'email'], mySelectFn)
  .load(1, params)

@daffl
Copy link
Member

daffl commented Jan 27, 2023

There may be some challenges introduced with this that relate to some of the more complicated set logic that @marshallswain may know about, too (can-connect). For example when loaded in a different order like

const user2 = loader.service('users').select(['name']).load(1);
const user3 = loader.service('users').select(['role']).load(1);
const user1 = loader.service('users').load(1);

I probably won't get any caching right?

One thing that I see could improve performance here would be to always strip out the $select and load (and cache) the entire object assuming that it contains all the fields that I can $select.

Another question is how frequently $select on the client is actually used. I rarely do but I can see that it depends a lot on the use case but it's always really difficult to tell.

@DaddyWarbucks
Copy link
Collaborator Author

@daffl You would still get caching with

const user2 = loader.service('users').select(['name']).load(1);
const user3 = loader.service('users').select(['role']).load(1);
const user1 = loader.service('users').load(1);

Because you did not use $select in the params, then the whole record was loaded. But the select() method shaped the result after it was returned from the cache.

And I use the $select mainly when loading associated records on the server. Many times I just want a simple representation of that object and don't want all of its createdAt, updatedAt, etc.

@DaddyWarbucks
Copy link
Collaborator Author

The select() offers another value as well. The loader returns null if an id is not found for load(1). The user can use a .get(1) if they want it to throw an error, but will sacrifice some potential performance.Its more complicated with load([1, 2, 3])
The result of this may be [object1, null, object3]. So the select() method offers the user the ability to handle those however they would like. Perhaps they would just filter out the nulls, perhaps they would throw an error. Because this library took over so much of the caching, options, config from the underlying DataLoader, we abstracted away the ability for the user to handle when an item is not found.

@DaddyWarbucks
Copy link
Collaborator Author

@daffl I'm thinking more on removing params.query.$select totally. I initially wanted to keep things as similar to a feathers service as possible and affect the query as little as possible. But, I've actually already removed the $limit keyword and set paginate: false in a manner that cannot be overwritten. This is because the number of keys given has to be the same number of results. Similarly today I have realized that I should removed the $sort too. The order of the results has to be in the same order as the keys given (we don't rely on the DB for this and handle it manually).
So, now I am thinking its feels right to also remove $select. We didn't want the user messing with $limit and $sort, but they can handle that by providing the keys in the length/order they want. And now with the select() method it feels safe to remove the $select. I also just realized I need to remove $skip too.
Yea...lets remove $select.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants