libstore: fix data race in getFileTransfer() singleton replacement #14094
de2a941 to abad972
I think it would be better to use the |
https://en.cppreference.com/w/cpp/thread/mutex/~mutex.html and https://en.cppreference.com/w/c/thread/mtx_destroy.html imply to me that my suggestion above and this are both wrong, because we shouldn't destroy the mutex while other threads own it (my suggestion) or are waiting to lock it (current PR and status quo). I think the right thing to do is then:
|
What do you mean by re-creating a file transfer in-place? Having some sort of restart method that duplicates destructor/constructor? |
2dfb906 to 33b27cc
33b27cc to 4d0bbff
Yes I do mean that, but it doesn't need to be duplicative. Since the method is non-virtual, you can safely call it from the constructor. |
It now actually does a bit less than the constructor. I abstracted it away reasonably. Have a look.
Can we do it like this, with no second mutex?
This looks like a deadlock to me, though, because we wait for the worker thread to finish and the worker thread needs this lock. Quit can also be set from outside the worker.
I don't think so? Do you mean waiting for the old worker thread or something else? The comment says the old worker thread should be exiting, and thus releasing the lock, right?
It has to access state to check the quit flag, unless it has set the quit flag itself. I updated the comment. |
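The deadlock concern raised here can be illustrated with a minimal sketch. It assumes a simplified model, not the PR's code: the `Worker` type, `stateMutex`, and `stopAndJoin` are hypothetical names. The point is only that if the worker must take the state lock to read `quit`, the joining thread has to release that lock before it joins.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the file transfer object; not the Nix code.
struct Worker
{
    std::mutex stateMutex; // protects `quit`
    bool quit = false;
    std::thread thread;

    void start()
    {
        thread = std::thread([this] {
            for (;;) {
                {
                    // The worker needs the lock to read the quit flag.
                    std::lock_guard<std::mutex> lock(stateMutex);
                    if (quit) return;
                }
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
        });
    }

    // Set the flag under the lock, release the lock, and only then join.
    // Joining while still holding `stateMutex` could deadlock, because the
    // worker may be blocked waiting for that same mutex.
    void stopAndJoin()
    {
        {
            std::lock_guard<std::mutex> lock(stateMutex);
            quit = true;
        } // lock released here
        if (thread.joinable()) thread.join();
    }
};
```

Calling `start()` followed by `stopAndJoin()` terminates because the lock scope closes before `join()` is reached.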
4d0bbff to 77ba049
But restart will only be called if the quit flag was set, right? Once quit is set, I thought no one else needs to acquire the lock, except for the thread which calls restart. |
The Deadlock Sequence:
Thread 1 (Main Thread):
Thread 2 (Worker Thread - in
|
@Mic92 How about lets just have |
There is also https://en.cppreference.com/w/cpp/thread/mutex/try_lock.html we can use to make |
The actual code looks more like this:
So we really need to do this in the worker thread. |
maybe it would be good for |
I think we can end up with
void workerThreadEntry()
{
    try {
        workerThreadMain(); // Main loop has exited because quit == true
    } catch (nix::Interrupted & e) {
        quit = true; // atomic
    } catch (std::exception & e) {
        quit = true; // atomic
        printError("unexpected error in download thread: %s", e.what());
    }
    // The only way to leave `workerThreadMain` besides exception unwinding is for `quit` to already be set.
    assert(quit);
}
and then pop loop is in |
I had the following idea for
void restart()
{
    // The worker thread will exit if quit has been set
    workerThread.join();
    // Check if we need to restart
    {
        auto state(state_.lock());
        if (!state->quit) {
            return;
        }
        resetCurl();
        state->quit = false;
    }
    startWorkerThread();
}
@Ericson2314 what are we trying to avoid / optimize by getting rid of |
Feels like a bigger refactor that is not so easy to backport. |
77ba049 to 90f7bed
90f7bed to b0b9391
It is allowed to read it, and to set it to `false`, but not to set it to `true`.
Whoever first calls `quit` now empties the queue, instead of waiting for the worker thread to do it. (Note that in the unwinding case, the worker thread is still the first to call `quit`, though.)
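One way to picture "whoever first calls `quit` empties the queue" is an atomic exchange on the flag. This is a sketch under assumptions: `TransferQueue`, `quitFlag`, and the `int` queue items are hypothetical and not taken from the Nix code.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <queue>

// Hypothetical model; names are illustrative, not the Nix code.
struct TransferQueue
{
    std::atomic<bool> quitFlag{false};
    std::mutex mutex; // protects `incoming`
    std::queue<int> incoming;

    void enqueue(int item)
    {
        std::lock_guard<std::mutex> lock(mutex);
        incoming.push(item);
    }

    // Returns true only for the first caller, which also drains the queue.
    // exchange() atomically sets the flag and reports its previous value,
    // so exactly one caller observes `false`.
    bool quit()
    {
        if (quitFlag.exchange(true)) return false; // someone else was first
        std::lock_guard<std::mutex> lock(mutex);
        while (!incoming.empty()) incoming.pop();
        return true;
    }
};
```

With this shape, it does not matter whether the worker thread or an outside thread calls `quit()` first; the queue is emptied exactly once.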
Will be useful in a moment
Multiple threads could simultaneously observe that the singleton FileTransfer instance has quit and attempt to replace it without synchronization. This caused undefined behavior as destroying the old object while other threads might still be accessing its mutex is not allowed. Fixed by implementing in-place restart of the FileTransfer object instead of replacing it. This avoids destroying mutexes that other threads might be waiting on or holding.
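The difference between replacing the singleton and restarting it in place can be sketched as below. This is a simplified illustration, not the actual change: `Transfer`, `restartIfQuit`, `getTransfer`, and the `generation` counter are hypothetical, and the real restart also has to deal with the worker thread.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for FileTransfer; the real class differs.
struct Transfer
{
    std::mutex mutex;   // lives as long as the object; never destroyed on restart
    bool quit = false;
    int generation = 0; // counts restarts, for illustration only

    // Re-arm the object in place if it has quit. Threads that race here
    // serialize on the mutex, and only the first one performs the restart.
    void restartIfQuit()
    {
        std::lock_guard<std::mutex> lock(mutex);
        if (!quit) return; // already running, or another thread restarted it
        quit = false;
        ++generation;
    }
};

// The singleton is created once and never replaced, so its address and
// its mutex stay valid for every thread that holds a reference.
Transfer & getTransfer()
{
    static Transfer instance; // initialization is thread-safe since C++11
    instance.restartIfQuit();
    return instance;
}
```

Because the object is never destroyed, no thread can end up waiting on or holding a mutex that is being torn down.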
b0b9391 to 41f3226