Coroutines #11582

nnsgmsone · 2019-06-25T06:22:02Z

nnsgmsone
Jun 25, 2019

I found that the coroutine of v is directly called pthread_create. I think it is possible to add user-level coroutines, so that more energy can be injected. Perhaps can learn from golang's approach and add a runtime. . .

spy16 · 2019-06-25T07:31:56Z

spy16
Jun 25, 2019

Implementing coroutines (similar to Go approach) has always been the plan and is in the roadmap. See https://vlang.io/docs#concurrency

0 replies

nnsgmsone · 2019-06-27T09:17:32Z

nnsgmsone
Jun 27, 2019
Author

ok

0 replies

spytheman · 2019-09-05T08:16:07Z

spytheman
Sep 5, 2019
Collaborator

@nnsgmsone what does 'more energy can be injected' mean in this context?

0 replies

nnsgmsone · 2019-09-09T06:33:27Z

nnsgmsone
Sep 9, 2019
Author

@spytheman make the routine more useful

0 replies

joe-conigliaro · 2019-09-09T06:54:22Z

joe-conigliaro
Sep 9, 2019
Collaborator

As far as I know this was the intention all along, they were Implemented that way to begin with to have something working.

0 replies

gslicer · 2019-09-09T11:39:34Z

gslicer
Sep 9, 2019

I think the use of subroutines is very limited in contrast to what threads can offer (as long as implented e.g. with the "actor" paradigm, without any semaphores/locks)... so threads shall be still supported.

See this statement:

Why create threads when there are coroutines?

Coroutine methods can be executed piece by piece over time, but all processes are still done by a single main Thread. If a Coroutine attempts to execute a time-consuming operation, the whole application freezes for the time being.

Threads are different. The execution of separate Threads is managed by the operating system. If you have more than one logical CPU, many threads are executed on different CPUs. Thanks to that, any expensive operation will not freeze your application.

0 replies

dumblob · 2019-09-09T12:59:58Z

dumblob
Sep 9, 2019

@gslicer don't worry, there will always be the unsafe package with all primitives to create your own threads with your own condition variables, mutexes, locks etc. I think the topic here is not about this though. I think it's about having a builtin primitive for concurrency and that's fully covered in #1868 .

Thus I think this topic can be closed as it got fully superseded by #1868 .

0 replies

gslicer · 2019-09-09T13:02:57Z

gslicer
Sep 9, 2019

@gslicer don't worry, there will always be the unsafe package with all primitives to create your own threads with your own condition variables, mutexes, locks etc.

As long it's "unsafe" I'm clearly worrying :)

0 replies

nnsgmsone · 2019-09-16T01:34:49Z

nnsgmsone
Sep 16, 2019
Author

@gslicer I think it's easy to achieve the effect of an actor with the routine and channel.For example, a library I wrote myself is such an effect.

0 replies

crthpl · 2020-12-08T17:43:20Z

crthpl
Dec 8, 2020
Collaborator

Will there be a compiler flag to make go start a new thread like it does now?

0 replies

atomkirk · 2020-12-30T02:08:45Z

atomkirk
Dec 30, 2020

When this is implemented in V, would it be possible to implement as pre-emptive (like erlang) instaed of cooperative (like go)?

0 replies

dumblob · 2020-12-30T09:53:33Z

dumblob
Dec 30, 2020

@atomkirk so far V has built-in "go routines" which are fully preemptive (and I think the consensus is, that it should stay so). This GitHub issue seems to be about a different thing - namely about standard library offering simple pure coroutines (which are by definition non-preemptive).

0 replies

atomkirk · 2020-12-30T14:05:30Z

atomkirk
Dec 30, 2020

@dumblob they can be. Erlang processes are user-level AND preemptive. They are very robust.

its preemptive now because it uses kernel threads which are bulky and expensive.

0 replies

dumblob · 2020-12-30T18:11:41Z

dumblob
Dec 30, 2020

Erlang processes are user-level AND preemptive.

That depends on how the Erlang VM is being executed. If it runs on bare hardware and no non-Erlang SW is being called, then you're right. In any other case Erlang processes are only partially preemptive (i.e. one Erlang process can starve indefinitely leading to stopping the whole Erlang VM). But I digress.

My point was different. V community seems to incline to have built-in support (in the form of V's go routines) for fully preemptive execution while offering non-preemptive alternative (refered to as coroutines) in the standard library (i.e. not built into the language).

0 replies

ntrel · 2021-01-22T10:44:52Z

ntrel
Jan 22, 2021
Collaborator

If a Coroutine attempts to execute a time-consuming operation, the whole application freezes for the time being.

This is not true since Go 1.14:
https://medium.com/a-journey-with-go/go-asynchronous-preemption-b5194227371c

0 replies

dumblob · 2021-08-25T13:41:43Z

dumblob
Aug 25, 2021

how would this preemption work? at which points will you yield a coroutine? usually this is done actively by the coroutine entering an async call and "waiting" (i.e. yielding) for the result. thats also the most efficient way. also how will you do this in a library? to do real context switching you need assembly. or for fully integrated coroutines like you have in c++ now you need compiler support.
you could use the c coroutine support actually. afaik the c++ corutine plumbing can also be used in c

Let me reiterate - please read the whole thread #1868 incl. all links recursively (depth 3 should be enough).

go lang most definitely will do coopoerative multitasking by having coroutines yield on channel action and async io (usually just networking, unless they also implemented async file io). so if you say "like go" you will have to do that, i.e. actively yield coroutines on these statements (channel actions, network io, mutexes, I think those are the important ones).

Partially yes - IMHO yielding under the hood will be done less aggressively than Go lang does, because there'll be the full preemptiveness, so presumably the inserted yields will be put only on critical places chosen based on true performance profiling analysis of representative apps (unlike in Go lang where they have no choice and have to put them really everywhere to make the language kind of work).

0 replies

maddanio · 2021-08-25T13:44:14Z

maddanio
Aug 25, 2021

what is "agressive" about cooperative yielding? its simply efficient because it will yield at exactly the points the thread would end up sleeping anyway

0 replies

maddanio · 2021-08-25T13:57:27Z

maddanio
Aug 25, 2021

also isnt the fact that just about any other language (even c) now has cooperative coroutines an indication that they are a good idea?

0 replies

dumblob · 2021-08-25T14:01:11Z

dumblob
Aug 25, 2021

the way to have efficient networking resolves around using epoll/select inside the coroutine scheduler and have it know what coroutine to wake up when a certain socket becomes readable, similarly with channels and mutexes

Well, this supposes that programmers are dumb and will use 1 go routine per 1 request (be it a network request or any other sample from a high-rate stream). Which is one of the dumbest things one can do. Nobody from the Go lang nor Erlang world does this because Go routines in Go (and Erlang processes as well) are still extremely expensive (you can have only smaller millions of them which is by far not enough for a scalable app).

Therefore I'd say V shall actually not make the scheduler this smart. But let's see, maybe someone will provide some measurements and data and in V 1.1 (which is years ahead IMHO) there'll be such a smart scheduler. But definitely not now for V 1.0 because it's a nonsense from my point of view.

what is "agressive" about cooperative yielding? its simply efficient because it will yield at exactly the points the thread would end up sleeping anyway

At some point (if you have too many yields) it becomes less efficient than less frequent preemptying (and it has also other downsides - it increases code size a bit, it disallows good CPU-bound performance optimization, etc.).

Please just finally find few hours to read the thread #1868 recursively (and maybe wait one more day to let the brain calmly absorb it all before proposing other concepts which actually are quite aging already). We'll be here, we won't run away 😉. This topic is not urgent and very old and many people smarter than me have put their thoughts in it - most of it documented in the #1868 thread and recursive links.

also isnt the fact that just about any other language (even c) now has cooperative coroutines an indication that they are a good idea?

Again - there is a plan for coroutines (as part of the standard library, maybe even with some intrinsics). But IMHO it's lower priority. Feel free to make a PR with a potential API (some people already worked on that, but I can't find the links now quickly - just search for them yourself and ask e.g. on Discord).

0 replies

maddanio · 2021-08-26T07:40:44Z

maddanio
Aug 26, 2021

At some point (if you have too many yields) it becomes less efficient than less frequent preemptying (and it has also other downsides - it increases code size a bit, it disallows good CPU-bound performance optimization, etc.).

I am not convinced. How can it be more efficient to let a routine sleep (on a network select or waiting on a mutex or...) than yield it? The thread will idle and eventually be swapped out by the os kernel. Unless you are talking about spin locking. But even that you can model with proper co-routines. Also see this talk where Gor Nishanov applies the new co-routines to micro-optimizations to mask cache line latencies, so I beleive co-routines, at least the new ones that have compiler support, have been driven "all the way down". Also bear in mind that with the new compiler support for co-routines in later c compiler the compiler can even inline across and through yield points.

0 replies

dumblob · 2021-08-26T08:50:52Z

dumblob
Aug 26, 2021

I am not convinced. How can it be more efficient to let a routine sleep (on a network select or waiting on a mutex or...) than yield it? The thread will idle and eventually be swapped out by the os kernel. Unless you are talking about spin locking. But even that you can model with proper co-routines.

Please really devote several hours to reading the whole thread #1868 incl. links. You'll learn among other things about Weave which nicely shows what the performance differences are - but don't forget we're talking about maxing out performance of multiple processing units, not just a single core.

And btw. as I said, V will insert yields internally on important places (preferably according to measurements and not human guesses) like selects, mutexes, etc. But I suppose it'll be on a (much) lower number of places than the recent Go versions started to do (you can read about this also in one of the linked resources from the #1868 thread).

Also see this talk where Gor Nishanov applies the new co-routines to micro-optimizations to mask cache line latencies, so I beleive co-routines, at least the new ones that have compiler support, have been driven "all the way down".

Thanks for the link. Gor explains a cool idea how to implement very lightweight coroutines which highly efficiently leverage modern CPU cache hierarchy. These nano-coroutines are unfortunately something V can't care about. Simply because they don't support scheduling across multiple processing units. In other words they do support only a single processing unit (i.e. only one thread) and based on Gor's explanation this can't be changed without losing some of their benefits.

I'd even guess that e.g. Weave (which offers tasks which is just a slightly different abstraction for the very same coroutine concept) is about as fast as Gor's nano coroutines even if run only on a single core (despite Weave being designed to max out performance of many processing units with different processing powers). Feel free to test it and post your results here to let us reproduce them on more machines.

Also bear in mind that with the new compiler support for co-routines in later c compiler the compiler can even inline across and through yield points.

Inlining micro-optimization sound like a patch to a wrong abstraction. But yes, thanks to this I'd guess it'll catch up with "normal function calls" when it comes overhead on one core.

0 replies

maddanio · 2021-08-26T12:32:33Z

maddanio
Aug 26, 2021

I had a quick look at weave, i dont see how it applies to async operations like networking where you have to keep on juggling tasks because most of them simply cannro make progress at any given time due to waiting for io. I think you are fundamentally mixing up concurreny with parallelism.

1 reply

dumblob Oct 14, 2021

@maddanio async is just a pure event loop, nothing else (i.e. also no multicore support whatsoever). Weave is also built around an event loop but with the twist of a built-in support for multicore processing without sacrificing performance.

In other words, I'm not mixing concurrency with parallelism. I'm just emphasizing that V being fundamentally parallel under the hood can not provide async-like semantics without sacrificing it's parallel nature. If async, then without parallelism - but that's silly IMHO (though possible and many use V like that).

So why should we even consider going down that path? The only reason I know of is compatibility (for bindings, for porting of existing solutions, etc.). With the reference to Weave I've refuted your performance concerns. So there seems to be really no other (than compatibility with the outside world) existing reason to have coroutines/async.

Feel free to come up with other reasons - I'll be all ears to learn something new.

maddanio · 2021-10-14T19:13:46Z

maddanio
Oct 14, 2021

Hi. I am not sure what exactly you mean with async as a word. It is absolutely possibly to do async dispatch in the face of paralellism. Go perfectly integrates parallelism with async dispatch. I c++ with coroutines you can also combine it easily, and rusts modern networking libraries do the same. As far as I understand it basically the event loop becomes a multithreaded reactor. So I dont really see the contradiction here. Async as a keyword as used in javascript is inherently single threaded, but that is because js is inherently single threaded. nothing stops a language/runtime/library from simply executing the continuation of an async call (which could well be a part of a coroutine) in another thread, parallel to other continuations.

17 replies

maddanio Oct 16, 2021

the userspace thread thing i dont understand really.

maddanio Oct 16, 2021

ok, deeply digging into lthread. its a good exercise, as featurewise it has exactly what is needed for go-routines. well, imho its missing an explicit yield. and yes, it has a per-thread scheduler without work-stealing, that would obviously need to be improved. other than that I think it is quite nice in being rather compact. yes, probably can be tuned, but I think the principles are sound. I mean we will basically be needing a context switch, which libcontext likely does in the most efficient way (its the basis of boost context), but can also be tuned. and beyond that its basically intrusive lists of waiting coroutines and event polling, for which it also has quite a few backends already (kevent, epol, etc).

maddanio Oct 16, 2021

regarding your references again, I think the only thing that can be taken from userspace-events is that it is signal based, something that go only uses for pre-emption, and thats probably well thought out. so if we want to add pre-emption (and I suggest doing that later as the low hanging fruit is having the context swiching and co-routine aware io and locking) we probably have to look into that.
regarding weave I am simply overwhelmed and have so far failed to see what exactly it does. what exactly is it that you think it would gain?

maddanio Oct 19, 2021

Ok, dug in I have. I am largely overhauling lthread and preparing it to be a go-like concurrent runtime:
https://github.com/maddanio/lthread_concurrent
It uses libcontext for switching (which supports osx fully and should also support windows) and I also optimized stack memory allocation. In my simple benchmark my lthreads are about 100 times fater than pthreads, but I guess it largely depends on the usage scenario. Next I will try to make it multithreading aware including work stealing and make it support window. the platform specific stuff is quite localized, so it should not be too hard to do.

maddanio Oct 19, 2021

since I now do all work dispatching from a single queue (per thread) work stealing should be quite easy, I would follow the wonderfully simple explanation of sean parent here: https://sean-parent.stlab.cc/presentations/2016-08-08-concurrency/2016-08-08-concurrency.pdf (page 31ff).

maddanio · 2021-10-14T21:16:58Z

maddanio
Oct 14, 2021

I am not sure, arguing with you feels like trying to chase a rabbit to me, and I think I am about to give up.

I don't really unerstand what you intend to build. RIght now in v I have a synchronous networking library and a server would basically built using the good old fork approach, only that I would be using threads instead of exec. And you seem to want to automatically reduce the huge amount of threads this can amount to in something really large scale by some magic automatic pre-emptive runtime which I dont iunderstand and honestly I am to lazy to read all those references. but good luck with that.

I like c++ coroutines because they make things more readable and are effectively a negative overhead abstraction as they give the compiler the chance to inline through what would traditionally have been a context switch. and I like the reactor pattern as it avoids huge amounts of threads which feel ineficcient (i have a hard time beleiving there is no downside to having 1000 threads running), but maybe I just grew used to that.

Once your scheme works I might have a look at it. Right now v doesnt seem usable for my purposes though. I would have offered to help out, but since I dont even understand the direction you are going in I dont feel in a position to do though. May be I am just too old.

Hmm, I just went through it again, and end up confused. I mean, if I can use your clone of the go-routine syntax (that is the idea, right?) and it will pre-empt "cleverly", the cleverest to me being to pre-empt at obvious bloking points such as select and lock, then the result somehow seems the same as go-routines (or equivalently c++ coroutines with some reactor implementaion such as asio). what else will your pre-emption scheme acheive? some kind of better loadbalancing?

28 replies

maddanio Jan 10, 2022

it will not make any sense for the js backend anyhow. for js I would suggest exploiting their existing async stuff.

for native I would expect it to be easy to call into c libs, no? or you really want to throw c-interop out of the window? that seems like it would lose a lot of libraries, not least things like openssl, libpng, etc

maddanio Jan 10, 2022

regarding js, is there any way to emit co-routines in the js backend? In that case it could be nice to somehow build something that hast the same v-code, but for non-js uses something like my parallel lthreads, and for js emits things like await

maddanio Jan 10, 2022

a v-rewrite could be interesting though. does v (for non-js backends) allow inline assembly? Some of that is needed for stack switching (for non-js backends, on js, as said above, all the yielding etc likely should be replaced by corresponding co-routine code...).

dumblob Jan 10, 2022

Also need a windows port of os mutexes/conds and threads and overlapped io.

It's already built in V in a fully cross-platform and moderately well tested form. It's documented in V doc 😉.

Btw, my work is on m:n scheduling, i.e. m lightweight threads to n kernel threads.

I know. My comments still apply (there is no disagreement).

For userspace 1:1 scheduling I think one should wait for googles work (or similar) to come through to win/osx: https://youtu.be/KXuZi9aeGTw

No, this would actually be counterproductive. Even if we had such far reaching kernel support for user space threads, they'd still be significantly more bulky (maybe 2x or more big only in size) than plain go routines. We don't need them. We just need a highly efficient interrupt timer. That's the most basic functionality every microcontroller out there has. Unfortunately POSIX nor Linux nor Windows do not offer such interface (there is of course similar interfaces but they're still quite inefficient for our simple use case) if not assuming eBPF.

it will not make any sense for the js backend anyhow. for js I would suggest exploiting their existing async stuff.

That's what the JS generator backend does (but it's still immature - so judge yourself by the comments & TODOs in the sources).

for native I would expect it to be easy to call into c libs, no? or you really want to throw c-interop out of the window? that seems like it would lose a lot of libraries, not least things like openssl, libpng, etc

Sure, V is "just" a safe & easier to read C with automated memory management (I've been criticized by @JalonSolov to say this in other places but I'm convinced this is the right way to explain V to C programmers 😉). But for a cooperative scheduler this doesn't sound like a good idea as developing V is one problem and developing an external library would be yet another (second) problem (and actually with a lot of duplicated effort - think of multiplatform locking, threading, ... shim layers, etc.). So why not to develop it in V - it'll be significantly smaller and will pose no issues for further development 😉.

regarding js, is there any way to emit co-routines in the js backend? In that case it could be nice to somehow build something that hast the same v-code, but for non-js uses something like my parallel lthreads, and for js emits things like await

This is not yet being done from what I know. But if you wrote the cooperative scheduler in V, then this would all get translated to JS and only with minor modifications would "magically" do the right thing.

a v-rewrite could be interesting though. does v (for non-js backends) allow inline assembly? Some of that is needed for stack switching (for non-js backends, on js, as said above, all the yielding etc likely should be replaced by corresponding co-routine code...).

Sure, V really is not a toy. Assembler is even built-in (i.e. not just a second class citizen like assembler in C) - just read the V documentation 😉.

maddanio Jan 10, 2022

No, this would actually be counterproductive. Even if we had such far reaching kernel support for user space threads, they'd still be significantly more bulky (maybe 2x or more big only in size) than plain go routines. We don't need them. We just need a highly efficient interrupt timer. That's the most basic functionality every microcontroller out there has. Unfortunately POSIX nor Linux nor Windows do not offer such interface (there is of course similar interfaces but they're still quite inefficient for our simple use case) if not assuming eBPF.

Hmm, not sure there. Google actually implemented that interrupt thing, and its a huge dance since you have no idea where you interrupted. To me it seems they want this to replace exactly that interrupt dance. Think interrupting in the middle of a mutex or anything that assumes tls like errno, malloc, etc.

it will not make any sense for the js backend anyhow. for js I would suggest exploiting their existing async stuff.

That's what the JS generator backend does (but it's still immature - so judge yourself by the comments & TODOs in the sources).

It gerenates async code? Do you have a pointer for that? Kinda surprising given that v has no real concept of async right now.

for native I would expect it to be easy to call into c libs, no? or you really want to throw c-interop out of the window? that seems like it would lose a lot of libraries, not least things like openssl, libpng, etc

Sure, V is "just" a safe & easier to read C with automated memory management (I've been criticized by @JalonSolov to say this in other places but I'm convinced this is the right way to explain V to C programmers 😉). But for a cooperative scheduler this doesn't sound like a good idea as developing V is one problem and developing an external library would be yet another (second) problem (and actually with a lot of duplicated effort - think of multiplatform locking, threading, ... shim layers, etc.). So why not to develop it in V - it'll be significantly smaller and will pose no issues for further development 😉.

Hmm, i dont really follow you. The scheduler will be complex in any language. It's one of those things where you have to mull over every line. Not sure it would be much easier to do in c++ or v. also I like doing it in c as that means I can use it also in c++ or python or....

regarding js, is there any way to emit co-routines in the js backend? In that case it could be nice to somehow build something that hast the same v-code, but for non-js uses something like my parallel lthreads, and for js emits things like await

This is not yet being done from what I know. But if you wrote the cooperative scheduler in V, then this would all get translated to JS and only with minor modifications would "magically" do the right thing.

I doubt that magic part. lthread, and my parallel fork of it, basically depends on stack switching, something I dont see possible in js. or is it? it is implemented in os and processor specific asm right now.

a v-rewrite could be interesting though. does v (for non-js backends) allow inline assembly? Some of that is needed for stack switching (for non-js backends, on js, as said above, all the yielding etc likely should be replaced by corresponding co-routine code...).

Sure, V really is not a toy. Assembler is even built-in (i.e. not just a second class citizen like assembler in C) - just read the V documentation 😉.

right, yah, another day. right now I still need to pluck out race bugs and confusions around kqueue/epoll....

cih-y2k · 2021-12-15T07:53:01Z

cih-y2k
Dec 15, 2021

https://github.com/sustrik/libmill
This lib seems appropriate for V's concurrency with Golang's style.

2 replies

maddanio Dec 15, 2021

it would seem so. but it does not support using multiple threads, thereby leaving a huge amount of performance on the floor on current many-core systems. indeed doing this in a single thread is relatively easy and can be considered an good programming exercise, but the multithreaded version becomes quite hard. i am close with my lig, but recently haven't found time to squeeze out the last races.

dumblob Dec 15, 2021

Exactly as @maddanio says. We utterly need perfect seamless and ~linearly scalable multicore support.

i am close with my lig, but recently haven't found time to squeeze out the last races.

I'm sorry for keeping you alone at that - unfortunately my availability didn't turn out better yet contrary to what I thought. Either way Run ~~Forrest~~Daniel, run!, I'll definitely keep it in mind and try to take a look at it at some point (at the latest please ping me from a PR discussion 😉).

maddanio · 2022-01-02T21:49:32Z

maddanio
Jan 2, 2022

quick heads up: I seem to be able to implement synchronization primitives now. I basically use a combined lock/condition variable as my lowest primitive. With that I implemented a concurrent generator/consumer pattern (i.e. push/pull from a fifo queue) successfully in the presence of a multithreaded scheduler.

as a side note I seem to be able to do about 500k context switches/s/thread whereas even creating and detaching 1M threads takes >10s.

next up will be playing with async i/o.

4 replies

ylluminate Mar 15, 2022
Collaborator

@maddanio, @larpon mentioned an interesting library and was curious if you've seen it and compared it with your present effort: https://github.com/edubart/minicoro

dumblob Mar 15, 2022

@ylluminate thanks for the link - yes, the library is very nice (and I mean it!). Though I'm afraid there is one rather major downside - it seems prone to indefinite starvation if used as generic go routine backend because it seemingly can not move coroutines across os-level threads (in my vision of a V go routine scheduler & runtime moving coroutines across os-level threads to maintain real-time guarantees is the major selling point!). See readme:

The mco_coro object is not thread safe, you should lock each coroutine into a thread.

and related discussion edubart/minicoro#9 .

maddanio Mar 18, 2022

If these coroutines are not thread movable that is indeed a nono. Also it does not contain a scheduler, which is really the core feature that go routines provide and that we would like to replicate

maddanio Mar 18, 2022

Unfortunately i dont really find the time to work on this, and the error cases that remain are really tricky to debug.

dumblob · 2022-02-12T16:37:13Z

dumblob
Feb 12, 2022

I'll just throw this here as I can't devote much time to this now.

Coroutines upside down as "native async" in V

It seems (some) people expect V to be usable also for "special" things like hard real-time guarantees or low-power embedded systems or systems with poor threading support running on single-core CPUs or systems with extreme I/O pressure (i.e. processing of shortest imaginable data lengths in the highest frequency the given HW permits).

For such cases it'd be inefficient to use any synchronization and thus to use multiple cores. This is important to realize and stress. Therefore from now on I'll assume in this write up only single-core processing (of course, this can be seamlessly combined with go routines but that's not the point of this write up).

One idea how this could be achieved is to create and/or use coroutines as sort of a "type". Thus basically offering seamless async (yes, you heard it) support mostly without the disadavantages discussed in other threads in this issue tracker.

See https://qed-lang.org/article/2019/06/27/coroutines.html which explains the ideas with detailed code samples how to implement (or emulate) the desired behavior using an imperative language with interfaces.

I hope this can be modeled using existing V mechanisms (as a module in standard library). But if that wouldn't be convenient enough, then contemplate some minimal extension of current semantics or in the worst case even some tiny syntax addition.

A crazy related question is whether this "smart coroutines" functionality could "blend" with the functionality of go routines somehow? Basically "telling the compiler more information" than current go routines alone do. Just to allow much better optimizations (ideally totally getting rid of any synchronization between threads in some scenarios - like ones which might suffer from greater contention could be automatically internally converted to these "smart coroutines").

20 replies

maddanio Mar 23, 2022

I am sorry, but I still fail to see how one can be turned into the other. Stackless really puts it all in the ast, while stackful basically ignores the ast. Also the link you gave and the google query did not bring me much further. also I fail to see the point of "stackless under the hood": the whole idea around stackless is that you don't put it "under the hood" but make it explicit. in stackless as far as I see you always get the state machine (or an interface to it) as a return type and work with it explicitly, while a stackful co-routine simply doesn't return, the whole stack is frozen and you have to continue with another one.

dumblob Mar 23, 2022

Try to look at it like this: stackless vs. stackful is almost an implementation detail. Stackful is easier to implement - assign frame (making the code 100% synchronous), trap (and act) if more is needed. Stackless is harder to implement but also less powerful (due to depth of the stack - or storage for the state graph if you want - needed to be known in compile time; yes, stackless also needs stack-like structure the very same way as stackful but with an optimal bound). Behavior of stackful vs stackless follows from the data structures they use and despite the behavior seems on the first sight different (because it boils down to sync vs async), it actually does the same thing and is therefore quite irrelevant (feel free to read on an example how to trivially interleave sync vs async).

From this follows that you can convert all stackless graphs into a stackful stack representations. You can do it also the other way round but you need to prove that the bound is known in compile time. And that's possible in many (most?) cases though I didn't check the literature if it's possible to prove this in compile time in the context of V for all stackless-able cases (if it's not possible as of now, V would need to be extended to provide the compiler with more knowledge - that's what I was writing about in the previous comments). From this it follows that the pure stackless/stackful representation is kind of an "implementation detail mainly depending on the level of optimization".

Therefore I'm focusing rather on the "how to describe this syntactically" - at least how to not close the door to a pleasant syntax & semantics of such "smart coroutines" when releasing V 1.0.

Even if "smart coroutines" won't exist in V 1.0 as API, the language shall allow to always either prove or disprove that the given go routine is compilable to stackless. This is a very strong requirement, but it'll allow potential introduction of a stackless API in V 1.1 as a low-level more constrained counterpart to go routines.

maddanio Mar 23, 2022

I am still not convinced :). Stackless makes it explicit and allows for much better control. It introduces a well defined semantic, which makes a lot of difference in terms of software architecture.

maddanio Mar 23, 2022

Also i am not sure how well go routines will mix with stackless. Even though, a v with go routines, ie a better go, would be a great tool in my book.

metakeule Oct 25, 2024

@dumblob @maddanio

See https://qed-lang.org/article/2019/06/27/coroutines.html which explains the ideas with detailed code samples how to implement (or emulate) the desired behavior using an imperative language with interfaces.

I hope this can be modeled using existing V mechanisms (as a module in standard library). But if that wouldn't be convenient enough, then contemplate some minimal extension of current semantics or in the worst case even some tiny syntax addition.

This is basically like a WaitGroup in Go:
https://pkg.go.dev/sync#example-WaitGroup

maddanio · 2022-03-21T07:14:55Z

maddanio
Mar 21, 2022

small heads up: I managed to fix the networking code and now have a fully concurrent webserver example running. now I still have to fixx some issues around synchronization primitives.

4 replies

ylluminate Mar 21, 2022
Collaborator

That's great news - so basically the networking code aspect was what you were referring to a few days ago? (eg: #11582 (reply in thread))

maddanio Mar 22, 2022

part of it. but I am now more confident I can find the problem, since I have managed to isolate components, like the scheduler, the network polling mechanism, and the synchronization mechanisms, much better.

ylluminate Jun 30, 2022
Collaborator

@maddanio anything we can help with to help move this forward? Any blockers for you that need particular attention?

maddanio Jun 30, 2022

I kinda stopped working on it, but it passes all my tests. Its a pure c library. It needs porting to windows (linux should work, need to check), but the relevant code is quite isolated. So if anyone is up to it i gladly take prs

medvednikov · 2023-05-30T16:44:05Z

medvednikov
May 30, 2023
Maintainer

The first version of coroutines is live! Check out the examples/simple_coroutines.v example.

15 replies

Avey777 Mar 7, 2024

Is the coroutine of v language currently stackable or non stackable?
If there is a stack, what should be done for subsequent autofree?

The purpose of using autofree is to have no GC, which is very important

Wajinn Mar 8, 2024

The purpose of using autofree is to have no GC, which is very important

I'm not sure or clearly understanding where this is coming from?

The website and documentation states flexible memory management and different ways to manage memory. So, GC should be staying with V for a long time, as one of the options. That's great and convenient for many users.

I have not read any intentions to completely divorce autofree from using GC (or possibly RC for that matter). It appears that GC (or RC) would be used for backup and to do otherwise (at this time), would be to clearly create an issue for oneself.

"The second way is autofree... the compiler inserts necessary free calls automatically during compilation. Remaining small percentage of objects is freed via GC (from documentation)."

Is the coroutine of v language currently stackable or non stackable?

"Photon is a stackful coroutine implementation".

"stackful coroutine implementation does not depend on compiler features (such as async and await in C++20), the switching point is encapsulated in the IO operation or event engine, so it's less intrusive..."

As for PhotonLibOS (Photon), it appears that it's being steadily worked on, from different sides:

PhotonLibOS/issues/40
PhotonLibOS/issues/150
PhotonLibOS/pull/185
vlang/photonbin

Avey777 Mar 8, 2024

-gc none autofre will be a compilation choice for many people

sredrv Jun 8, 2024

https://github.com/zelang-dev/c-coroutine
This library currently represent a fully C implementation of GoLang Go routine.

This has panic() and recover() implementations too. Windows supported, in addition to Linux and MacOS.
How does this compare with PhotonLibOS ? to help advancing the state of coroutine integration/implementation in V.

maddanio Jun 8, 2024

What this seems to miss, and could be important to avoid certain scenarios, is preemption. Its quite hard to do unfortunately, but go does have this, and u think for good reasons

metakeule · 2024-10-25T17:28:23Z

metakeule
Oct 25, 2024

@maddanio , @dumblob

Hi,

coming from Go and having read (and probably not fully understood) the whole thread:

I always missed in Go some more control over the goroutines.
Since there is this obvious difference between threads (heavyweighted, cpu-affinity, preempted)
and coroutines (lightweight, coordinated), my naive approach and wish would be something like this:

// pseudocode

// have a constant telling us the available cpu cores

th1 := thread.New() or {panic('no threads left') }
th2 := thread.New() or {panic('no threads left') }

cr1 := th1.NewCoroutine()
cr2 := th1.NewCoroutine()
cr3 := th2.NewCoroutine()

ch := chan(int, 3)
fn doStuff(c chan bool, cr *CoRoutine) {
 defer   cr.Close()
 
 // have other defers to release resources
  c <- cr.Num
}

// we can only run coroutines, no threads
go cr1.run(doStuff)
go cr2.run(doStuff)
go cr3.run(doStuff)

for {
  i := <- ch  or { break}
  println(i)
}

th1.WaitToFinish()
th2.WaitToFinish()

The implications being:

a coroutine could only run one a thread, therefor, a thread must be created first
a thread can't run code, just a coroutine based on it can
a coroutine would be an object (with a stack) that can be cancelled (its defers should be called/resources released), waited on, asked for its number (for debugging)
a thread would be an object (without a stack, because the stack is part of the coroutines, but behind the scenes all the coroutines have their stack "on the thread"), that can be cancelled/closed (then all the coroutines would just be cancelled), waited on (until all its coroutines are finished) , asked for its number (for debugging)

Then on top of this abstractions could be schedulers, queues etc. build for several scenarios and use for io etc. within the standard library.

Why is such a way not discussed? It should be simple to implement (you don't need to have heuristics at that point) and all the optimizations can be done later on top of the fundamentals...
am I missing something?

0 replies

Coroutines #11582

Replies: 38 comments · 91 replies

nnsgmsone Jun 27, 2019 Author

spytheman Sep 5, 2019 Collaborator

nnsgmsone Sep 9, 2019 Author

joe-conigliaro Sep 9, 2019 Collaborator

nnsgmsone Sep 16, 2019 Author

crthpl Dec 8, 2020 Collaborator

ntrel Jan 22, 2021 Collaborator

ylluminate Mar 15, 2022 Collaborator

Coroutines upside down as "native async" in V

ylluminate Mar 21, 2022 Collaborator

ylluminate Jun 30, 2022 Collaborator

medvednikov May 30, 2023 Maintainer

Replies: 38 comments 91 replies

nnsgmsone
Jun 27, 2019
Author

spytheman
Sep 5, 2019
Collaborator

nnsgmsone
Sep 9, 2019
Author

joe-conigliaro
Sep 9, 2019
Collaborator

nnsgmsone
Sep 16, 2019
Author

crthpl
Dec 8, 2020
Collaborator

ntrel
Jan 22, 2021
Collaborator

ylluminate Mar 15, 2022
Collaborator

ylluminate Mar 21, 2022
Collaborator

ylluminate Jun 30, 2022
Collaborator

medvednikov
May 30, 2023
Maintainer