-
Notifications
You must be signed in to change notification settings - Fork 386
Avoid using a raw pointer in the proactor queue to avoid use after free. #2468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Avoid using a raw pointer in the proactor queue to avoid use after free. #2468
Conversation
void reset () | ||
{ | ||
acquire (); | ||
this->handler_ = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use a guard for the acquire/release, no need for public acquire/release, use nullptr,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use std::atomic instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered using a guard here, but I ran into problems, when doing the actual callback to the ACE_Handler for the timeout. A possible solution was to allow the code that actually calls the ACE_Handler to lock the mutex.
This results in only two possibilities. Either I get the timeout callback, or I got to set back the handler pointer to the nullptr.
ACE_Handler::Proxy *p = this->proxy_.get (); | ||
if (p) | ||
p->reset (); | ||
deregister_callback (); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why move this to a new public method which is only used here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be able to use it from derived classes. It could therefore be made protected.
The order of operations in destructors would only execute this after the destructor of the derived class had already been executed, enabling callbacks to a semy-destructed object.
What about a test extension as reproducer? |
this->handler_ = 0; | ||
release (); | ||
} | ||
ACE_Handler *handler () { return this->handler_; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is reading itself thread safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why I added the acquire/release methods. Reading the pointer does not completely solve the problem, since I just end up with another raw pointer and the other core could again be in the process of destructing the object.
The timeout code looks something like this:
proxy->acquire();
if (proxy->handler() != nullptr)
proxy->handler()->callback();
proxy->release();
I'm very open to a better solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is risky, keeping a lock during a callback. In taox11 we use shared/weak pointers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess replacing the ACE_Handler* completely with some kind of shared_pointer could also solve the problem. I'd have to test it out with our code base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would probably be the bigger change for users, since it requires them to use shared_pointers to manage the life time of their ACE_Handlers, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the public API is pretty impossible given the amount of users there. Your current solution doesn't seem to be complete, so not something to merge.
A long time ago to ACE_Event_Handler
reference counting was added in such a way that the user had to enable it, maybe something for ACE_Handler? It has been some time ago that I worked on the proactor and I haven't dived into your issue in detail, it was just some high level comments, so not sure whether that is the correct path forward. Any possibility you could solve this race condition in application code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding reference counting similar to ACE_Event_Handler
sounds like a much better idea than mine.
I'll have a look at the existing tests and try to add one to reproduce my scenario. |
The timer queues in the ACE_Proactor work on raw pointer of ACE_Handler. In our multicore environment this can lead to situations, where the timer element is already being worked on, while the ACE_Handler is destructed on a different core.
These situations need to be handled correctly by every single user of the timer system, which has proven itself to be error prone.
For example one core could execute ACE_Proactor_Handle_Timeout_Upcall::timeout and access the ACE_Handler::Proxy_Ptr via the ACE_Handler*, while the ACE_Handler was destructed concurrently on another core. This resulted in a double free of the memory managed via the ACE_Handler::Proxy_Ptr.
ACE_Handler::Proxy_Ptr is already a smart pointer and the situation could be improved by storing and ACE_Handler::Proxy_Ptr instead of an ACE_Handler* in the queues of the ACE_Proactor.
Applications could then savely delete the ACE_Handler, while the memory of the ACE_Handler::Proxy would still be valid. Centra application code could could then try to retrieve the ACE_Handler* in a thread-safe way and the code using the timer system could free ACE_Handler at any time, without having to care for potentially outstanding callbacks.