TempResultsManager deletes results prematurely if multiple top-level variables point to the same DataBag #226

Open
ggevay opened this issue Sep 6, 2016 · 2 comments

Comments

@ggevay
Contributor

ggevay commented Sep 6, 2016

For example, the following code fails with Flink:

var v = DataBag()
val r = v
v = DataBag()
r

The problem is that the TempResultsManager garbage-collects the temp result created on line 1 as soon as line 3 has executed. However, r still points to that result, so line 4 then looks for a file that has already been deleted.
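
To make the failure mode concrete, here is a minimal toy model of it. It is only a sketch: `PrematureGcSketch`, `assignNew`, `alias`, and `canRead` are hypothetical names for illustration and do not mirror the real TempResultsManager API. It assumes a manager that treats a variable's old result as garbage the moment the variable is overwritten.

```scala
import scala.collection.mutable

// Toy model only: all names here are hypothetical and do not
// correspond to Emma's actual TempResultsManager.
object PrematureGcSketch {
  // Stands in for the files that hold materialized temp results.
  val files = mutable.Set.empty[String]

  // Maps each top-level variable to the temp result it refers to.
  val owner = mutable.Map.empty[String, String]

  // Models `v = DataBag()`: the variable receives a fresh result.
  def assignNew(variable: String, file: String): Unit = {
    // Naive rule: the previous result of this variable is deleted
    // immediately, without checking whether another variable uses it.
    owner.get(variable).foreach(old => files.remove(old))
    files += file
    owner(variable) = file
  }

  // Models `val r = v`: the alias shares the same underlying file.
  def alias(newVar: String, oldVar: String): Unit =
    owner(newVar) = owner(oldVar)

  // A read succeeds only if the variable's file still exists.
  def canRead(variable: String): Boolean =
    owner.get(variable).exists(files.contains)
}
```

Replaying the four lines above as `assignNew("v", "tmp-1")`, `alias("r", "v")`, `assignNew("v", "tmp-2")` leaves `canRead("r")` false, because the file that `r` shares with the old `v` was removed by the reassignment.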

(A real-life example of similar code is the inner loop of KMeans, whose last line resembles line 2 here. If the solution = ... line used centroids as a TempSource rather than from the closure, the problem would occur there.)

A solution would be to translate the val r = v line into a TempSource and an immediate TempSink.
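
Roughly, the idea is that the alias gets its own copy of the result instead of sharing the file. The following is a sketch under that assumption, again as a toy model: `CopyOnAliasSketch` and its methods are hypothetical, and a `Vector[Int]` stands in for the contents of a temp file.

```scala
import scala.collection.mutable

// Toy model only: hypothetical names, not the real Emma backend.
object CopyOnAliasSketch {
  // File name -> contents of the materialized temp result.
  val files = mutable.Map.empty[String, Vector[Int]]

  // Maps each top-level variable to the temp result it owns.
  val owner = mutable.Map.empty[String, String]

  private var counter = 0
  private def freshName(): String = { counter += 1; s"tmp-$counter" }

  // Models assigning a freshly computed result to a variable.
  // Deleting the old result is now safe, because no other
  // variable ever shares a file.
  def assignNew(variable: String, data: Vector[Int]): Unit = {
    owner.get(variable).foreach(old => files.remove(old))
    val name = freshName()
    files(name) = data
    owner(variable) = name
  }

  // Models `val r = v` as a source followed by an immediate sink.
  def alias(newVar: String, oldVar: String): Unit = {
    val data = files(owner(oldVar)) // "source": read the existing result
    assignNew(newVar, data)         // "sink": write it under a new name
  }

  def read(variable: String): Option[Vector[Int]] =
    owner.get(variable).flatMap(files.get)
}
```

With this rule, reassigning `v` after `alias("r", "v")` deletes only the file owned by `v`, and `read("r")` still returns the original data. The cost is one extra write per alias.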

I guess we don't want to fix this for the old backend. Instead, we will close this issue once the backend for the new IR is done and we have confirmed that the problem doesn't occur there.

ggevay added a commit to ggevay/emma-1 that referenced this issue Sep 6, 2016
…ed to the driver."

This reverts commit cedbe42.

There were two problems:
 - The eval field should also be transient
 - This exposes emmalanguage#226 in
   KMeans
@joroKr21
Member

Is this still relevant? What happens with temp results in FlinkDataSet currently?

@aalexandrov
Contributor

The current upstream does not garbage-collect temp results, but it should. It makes sense to keep tracking this issue so that we avoid the problem once garbage collection is added.
