Skip to content
This repository has been archived by the owner on Jul 22, 2021. It is now read-only.

[Question] Can the schema for the migrations (tables) be configurable instead of using public? #10

Open
RMISCG opened this issue Feb 26, 2020 · 9 comments
Labels
enhancement New feature or request

Comments

@RMISCG
Copy link

RMISCG commented Feb 26, 2020

Description

If I'm not mistaken, it seems like the schema for the migrations (tables) is hardcoded to public. Is it possible to make this configurable so that it can be overridden? Our team doesn't want to create tables under the public schema as our app runs under it's own schema and we want to use that instead.

Thanks

@Mythra Mythra added the enhancement New feature or request label Feb 26, 2020
@Mythra
Copy link
Owner

Mythra commented Feb 26, 2020

Hey,

Yes right now the jobs tables are put into the public schema always. I'm totally fine with making that not the case though. I can take a look at changing this, this weekend. However if you wanted to tackle it yourself the general idea is:

  • We'd need to remove public. from the schema files.
  • We'd need to mention in the documentation markdown files that you can apply the files in whichever schema you'd like (which by default is public).
  • We'd need to add a configuration option for: schema_name and use that instead of public. (for the few places it's in code).

@RMISCG
Copy link
Author

RMISCG commented Feb 27, 2020

Hi,

Thank you so much for the prompt response. Yes, I completely agree with the steps you mentioned. However, unfortunately I'm not a Kotlin developer and I don't feel comfortable applying those changes myself. If this is something that you and/or your team can tackle for us sometime soon that would be much appreciated. Otherwise, I might have to look around to see if there's a Kotlin engineer that can help my team with this.

Thanks

@Mythra
Copy link
Owner

Mythra commented Feb 29, 2020

Hey,

tl;dr: this change is going to require us to go to 2.0.0, and have some significant work

This is going to be a bit bigger of a breaking change than I was expecting. We're going to need to add an extra argument to every job constructor (which means we'll need to go to 2.0.0). This is because LISTEN/NOTIFY is not actually scoped to the schema level:

postgres=# CREATE SCHEMA workers_2;
CREATE SCHEMA
postgres=# SET search_path = workers_2;
SET
# in another tab run:
# postgres=# CREATE SCHEMA workers_1;
# CREATE SCHEMA
# postgres=# SET search_path = workers_1;
# SET
postgres=# NOTIFY workers, 'This is the payload';
NOTIFY
# in another tab run listen, you'll see the message.

This would mean if two people ran coworker on the same database (in two seperate schemas) many errors would crop up. As such we'll need to set the channel names to include the schema name like: NOTIFY workers_cw_${schema_name}, 'data';. Listening isn't that hard, however jobs need to notify to yield to new work that's coming up. Which means they'll need to get it passed in through the constructor which is a pretty massive breaking change. We won't need to requeue work or anything, but someone will have to go add the schema argument to all of their jobs. So this is going to be quite big.

Still for it, but it will be big.

Thanks,
Cynthia

@Mythra
Copy link
Owner

Mythra commented Mar 1, 2020

Okay I've got a local branch going, that just needs a little bit more work before it's ready for testing. It changes the jobs quite a bit so we can have much less breaking changes in the future (the jobs not get passed a "WorkCreationInformation" class that can have fields added without it being a breaking change). Not to mention it can successfully run in a multiple schemas, at the same time (I'll add tests for this scenario as well).

If I were to push the code to sonatype's staging repository would you be able to check it out, and ensure it works for you all (The code would also be on github, and documentation would be updated)?

@RMISCG
Copy link
Author

RMISCG commented Mar 2, 2020

Hi,

Thanks for all your work on this. Sure. Just let me know where I can get the new code from including the documentation and i will try to test it out as soo a I can.

No rush, but when do you anticipate this being ready for our testing? Just want to know how/when to account for this work into our sprint planning.

Thanks

P.S. Just curious as to what went into the decision of making the new schema variable a part of each job constructor. I would've thought that you would want to keep the schema as an environment config variable and the job(s) wouldn't know anything about it. Yes, that would mean that we would only have 1 schema to work with. I might be missing something, but I'm not sure if having the ability to associate jobs with different schemas adds much. Maybe you're aware of scenarios where that might be beneficial. But I think it might be better to keep the schema as an environment config variable and eliminate the coupling between that and the jobs(s). Please let me know what you think.

@RMISCG
Copy link
Author

RMISCG commented Mar 10, 2020

Hi,

Any updates on this?

Thanks

@Mythra
Copy link
Owner

Mythra commented Mar 10, 2020

Hey,

Sorry I missed your previous comment here. I'm currently going through on the weekend, and doing what I can to make sure all works smoothly.

The purpose of passing the schema in the constructors isn't so jobs can use a different schema (in fact it's specifically marked: val, and actually isn't even the schema name (it's actually called "channel name" since it's the full string of what to notify on)). The problem comes as I mentioned above with "notifying" for work.

Coworker uses LISTEN/NOTIFY to keep track of when jobs are queued primarily (this means it doesn't have to query the database as much, and take up as much resources, it does still need to query sometimes since LISTEN/NOTIFY aren't guaranteed but it can be much much more infrequent).

Unfortunately LISTEN/NOTIFY are not scoped to a particular schema, they're global. So they need the schema name encoded inside of the channel name. Now I could just hardcode this inside of the connection manager unfortunately there'd be two problems with this:

  1. It would blur the lines between connection manager, and coworker manager (ideally connection manager is generic enough it could be used outside of coworker easily). By adding in something so coworker specific it becomes just a little bit less general purpose.
  2. Anytime I want to change the way the data that gets sent inside of LISTEN/NOTIFY (which has never happened) It'd be a massive pain to upgrade (as it is now, which is why it hasn't changed). Because any workers not using the new notification string would throw exceptions and die trying to parse it. By encoding the channel name to use as a variable I can change it, and the new hosts will start using a new channel name, while the old hosts continue using the old one. Meaning old workers can still work on stuff they know about without error'ing out trying to parse a notification.

@OleksiiM
Copy link

Hi,
Is there a chance you will kindly find time to fix the issue in the nearest future?
Thanks

@Mythra
Copy link
Owner

Mythra commented Mar 31, 2020

Hey @OleksiiM ,

I am still working on it, I will update it if something changes. Unfortunately this is quite a large change, and I have many priorities right now. (Even if you exclude the craziness going on in the world).

If you’d like to see this sooner, let me know, and I can give you what I’ve done so far. That way you could take it to completion. Otherwise please realize this is a free project, and not my only one. I can only do so much, so fast.

Thanks,
Cynthia

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants