@dm-vodopyanov dm-vodopyanov commented Nov 19, 2020

This patch fixes a hang caused by a deadlock in the
Scheduler::deallocateStreamBuffers function. The deadlock
occurred while StreamBuffersPool[Impl] was being deleted.
StreamBuffersPool[Impl] is a pointer to a StreamBuffers struct
object that has a buffer object as a field, so the destructor of the
buffer class was called. From there the call stack is:
~buffer_impl() -> SYCLMemObjT::updateHostMemory() ->
Scheduler::removeMemoryObject -> deallocateStreams ->
Scheduler::deallocateStreamBuffers -> deadlock in
lock_guard, because the mutex was already locked by the same thread.

Changed std::mutex to std::recursive_mutex, so that the
same thread can re-enter a critical section it has already
locked.
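
The re-entrant locking pattern described above can be sketched in isolation. This is a minimal illustration, not the runtime's code: `ToyScheduler`, `addBuffer`, `removeObject`, `deallocateBuffers`, and `demoReentrantLock` are hypothetical names chosen for the example. It assumes an outer function that holds the lock and, on the same thread, reaches an inner function that takes the same lock again. With `std::mutex` the second lock on the same thread is undefined behavior and typically hangs; with `std::recursive_mutex` it succeeds.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a scheduler; not the SYCL runtime's class.
class ToyScheduler {
public:
  void addBuffer(int Value) {
    std::lock_guard<std::recursive_mutex> Lock(MMutex);
    MBuffers.push_back(Value);
  }

  // Inner function: takes the lock itself because it may be called directly.
  std::size_t deallocateBuffers() {
    std::lock_guard<std::recursive_mutex> Lock(MMutex);
    std::size_t Freed = MBuffers.size();
    MBuffers.clear();
    return Freed;
  }

  // Outer function: holds the lock, then reaches the inner function on the
  // same thread. A plain std::mutex here would be locked twice by one
  // thread, which is undefined behavior and usually shows up as a hang.
  std::size_t removeObject() {
    std::lock_guard<std::recursive_mutex> Lock(MMutex);
    return deallocateBuffers();
  }

  std::size_t size() {
    std::lock_guard<std::recursive_mutex> Lock(MMutex);
    return MBuffers.size();
  }

private:
  std::recursive_mutex MMutex;
  std::vector<int> MBuffers;
};

// Exercises the nested lock path and returns how many buffers were freed.
inline std::size_t demoReentrantLock() {
  ToyScheduler S;
  S.addBuffer(1);
  S.addBuffer(2);
  std::size_t Freed = S.removeObject();
  assert(S.size() == 0);
  return Freed;
}
```

A recursive mutex only removes the self-deadlock; it does not by itself protect against other threads modifying shared state that is accessed outside the lock.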

This patch fixes a hang caused by a data race in the `deallocateStreams`
function in `scheduler.cpp`: the contents of the `StreamsToDeallocate`
vector were corrupted by another thread, which led to a hang in the
`Scheduler::deallocateStreamBuffers` function while
`StreamBuffersPool[Impl]` was being deleted.
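
One common way to avoid this kind of race on a shared vector is sketched below. It is an assumption about the general technique, not a description of what this patch does: `ToyStreamRegistry`, `markForDeallocation`, `deallocateAll`, and `demoConcurrentDeallocation` are hypothetical names. Every writer appends under a mutex, and the deallocating thread swaps the shared vector into a local one under the same mutex, then processes the local copy with the lock released, so cleanup code that calls back into the registry cannot block on it.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical registry of stream ids waiting to be deallocated.
class ToyStreamRegistry {
public:
  // Writers append under the mutex, so concurrent push_back calls
  // cannot corrupt the vector.
  void markForDeallocation(int StreamId) {
    std::lock_guard<std::mutex> Lock(MMutex);
    MToDeallocate.push_back(StreamId);
  }

  // Moves the pending ids into a local vector while holding the lock,
  // then handles them with the lock released.
  std::size_t deallocateAll() {
    std::vector<int> Local;
    {
      std::lock_guard<std::mutex> Lock(MMutex);
      Local.swap(MToDeallocate);
    }
    // Real cleanup would happen here, outside the critical section.
    return Local.size();
  }

private:
  std::mutex MMutex;
  std::vector<int> MToDeallocate;
};

// Four threads register 1000 ids each; returns how many were drained.
inline std::size_t demoConcurrentDeallocation() {
  ToyStreamRegistry Registry;
  std::vector<std::thread> Workers;
  for (int T = 0; T < 4; ++T)
    Workers.emplace_back([&Registry, T] {
      for (int I = 0; I < 1000; ++I)
        Registry.markForDeallocation(T * 1000 + I);
    });
  for (auto &W : Workers)
    W.join();
  return Registry.deallocateAll();
}
```

Because the swap leaves the shared vector empty, a second call to `deallocateAll` on the same registry returns 0 until new ids are registered.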
@dm-vodopyanov dm-vodopyanov requested a review from a team as a code owner November 19, 2020 13:35
vladimirlaz previously approved these changes Nov 19, 2020
@dm-vodopyanov
Contributor Author

Some LIT tests unexpectedly fail with this change. I'm working on resolving them.

@dm-vodopyanov
Contributor Author

@vladimirlaz, I fixed the patch and updated the description; all LIT tests pass now.

@bader bader merged commit bd5893a into intel:sycl Nov 24, 2020