Description
Search before asking
- I searched in the issues and found nothing similar.
Version
Pulsar version 3.3.
OS - Red Hat Enterprise Linux 8.9 and RHEL 9.3 and Linux 5.10.0-26-amd64
Minimal reproduce step
Our test observes the issue when the client cannot connect to the server immediately on startup, but the server becomes reachable after a while. We attempt to subscribe; if the subscribe fails, we try again.
Sometimes when the subscribe eventually succeeds all is well.
Sometimes the subscribe succeeds but the exception is still thrown afterwards.
Sometimes the exception is thrown even before the subscribe call returns.
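// Rough setup. serviceURL and subscriberName are defined elsewhere.
// Needs <pulsar/Client.h>, <chrono>, <memory>, <string>, <thread>, <vector>.
pulsar::ClientConfiguration clientConfig;
// The consumer options are set on a ConsumerConfiguration, which is what subscribe() takes.
pulsar::ConsumerConfiguration consumerConfig;
consumerConfig.setReceiverQueueSize(1000);
consumerConfig.setUnAckedMessagesTimeoutMs(10000);
consumerConfig.setConsumerType(pulsar::ConsumerType::ConsumerExclusive);
auto client = std::make_shared<pulsar::Client>(serviceURL, clientConfig);
std::vector<std::string> topics = {"A topic", "Another topic"};
pulsar::Result result = pulsar::ResultRetryable;
pulsar::Consumer consumer;
// Retry until the subscribe succeeds.
while (result != pulsar::ResultOk) {
    result = client->subscribe(topics, subscriberName, consumerConfig, consumer);
    if (result != pulsar::ResultOk) {
        std::this_thread::sleep_for(std::chrono::seconds(5));
    }
}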
What did you expect to see?
No exception.
What did you see instead?
std::system_error thrown and not handled.
Anything else?
std::system_error is thrown during or after client->subscribe(...) when setUnAckedMessagesTimeoutMs is set in the configuration.
When we subscribe with a value already set for setUnAckedMessagesTimeoutMs in the configuration, we observe std::system_error being thrown on a library background thread. It is not handled or caught in ExecutorService, so it terminates the application. setUnAckedMessagesTimeoutMs is set to 10000 because this is the minimum; we have not experimented with other values.
Without setUnAckedMessagesTimeoutMs set in the configuration no exception is seen on or after subscribe(..).
We suspect the exception is thrown by std::recursive_mutex when trying to acquire the lock in UnAckedMessageTrackerEnabled::timeoutHandlerHelper() in pulsar-client-cpp/lib/UnAckedMessageTrackerEnabled.cc.
- Please can this be investigated for the cause of the exception.
- Please can exceptions thrown in your library background threads be caught and not terminate the application.
stderr:
terminate called after throwing an instance of 'std::system_error'
what(): Invalid argument
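For anyone trying to reproduce this, the message "Invalid argument" is the usual text for errno EINVAL. The sketch below shows how a caught std::system_error could be inspected to confirm that; it assumes the exception carries EINVAL in the generic category, which the message suggests but the report does not confirm. `isInvalidArgument` and `describe` are hypothetical helper names.

```cpp
#include <cerrno>
#include <string>
#include <system_error>

// True when the exception carries the portable "invalid argument" condition (EINVAL).
bool isInvalidArgument(const std::system_error& e) {
    return e.code() == std::errc::invalid_argument;
}

// Formats the category name, numeric value and message for logging.
std::string describe(const std::system_error& e) {
    return std::string(e.code().category().name()) + ":" +
           std::to_string(e.code().value()) + " " + e.code().message();
}
```

Logging `describe(e)` from a catch block around the suspected call would show the exact error code rather than only the what() string.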
backtrace:
Program terminated with signal SIGABRT, Aborted.
0 0x00007fdef2329acf in raise () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7fddd1253700 (LWP 1875868))]
0 0x00007fdef2329acf in raise () from /lib64/libc.so.6
1 0x00007fdef22fcea5 in abort () from /lib64/libc.so.6
2 0x00007fdef68e69e3 in ::coreHandler(int, siginfo_t*, void*) () from /libapclient.so.10.15
3 <signal handler called>
4 0x00007fdef2329acf in raise () from /lib64/libc.so.6
5 0x00007fdef22fcea5 in abort () from /lib64/libc.so.6
6 0x00007fdef2eea09b in __gnu_cxx::__verbose_terminate_handler() [clone .cold.1] () from /lib64/libstdc++.so.6
7 0x00007fdef2ef054c in __cxxabiv1::__terminate(void (*)()) () from /lib64/libstdc++.so.6
8 0x00007fdef2ef05a7 in std::terminate() () from /lib64/libstdc++.so.6
9 0x00007fdef2ef0808 in __cxa_throw () from /lib64/libstdc++.so.6
10 0x00007fdef2eec235 in std::__throw_system_error(int) [clone .cold.28] () from /lib64/libstdc++.so.6
11 0x00007fddd380f508 in pulsar::UnAckedMessageTrackerEnabled::timeoutHandlerHelper() () from /libconnectivity-pulsar-client.so
12 0x00007fddd380f5a9 in pulsar::UnAckedMessageTrackerEnabled::timeoutHandler() () from /libconnectivity-pulsar-client.so
13 0x00007fddd38111e2 in boost::asio::detail::wait_handler<pulsar::UnAckedMessageTrackerEnabled::timeoutHandler()::{lambda(boost::system::error_code const&) # 1}, boost::asio::any_io_executor>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) () from */libconnectivity-pulsar-client.so
14 0x00007fddd374ac38 in boost::asio::detail::scheduler::run(boost::system::error_code&) () from */libconnectivity-pulsar-client.so
15 0x00007fddd3743f92 in pulsar::ExecutorService::start()::{lambda() # 1}::operator()() const [clone .isra.334] () from */libconnectivity-pulsar-client.so
16 0x00007fdef2f1cb23 in execute_native_thread_routine () from /lib64/libstdc++.so.6
17 0x00007fdef26a81ca in start_thread () from /lib64/libpthread.so.0
18 0x00007fdef2314e73 in clone () from /lib64/libc.so.6
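The backtrace shows the lock being taken from a timer callback on the ExecutorService thread. One possible cause, which is an assumption and not confirmed by this report, is that the callback runs after the object owning the mutex has been destroyed. A common guard against that is to capture a weak_ptr in the callback. The sketch below uses a hypothetical `Tracker` class and is not the pulsar-client-cpp implementation.

```cpp
#include <functional>
#include <memory>
#include <mutex>

// Hypothetical tracker that owns a mutex and hands out a deferred callback.
class Tracker : public std::enable_shared_from_this<Tracker> {
public:
    // Returns a callback that is safe to invoke after the tracker is gone.
    // The callback returns true if it ran, false if it was skipped.
    std::function<bool()> makeTimeoutCallback() {
        std::weak_ptr<Tracker> weakSelf = shared_from_this();
        return [weakSelf]() -> bool {
            auto self = weakSelf.lock();
            if (!self) {
                return false;  // owner destroyed: do not touch its mutex
            }
            std::lock_guard<std::recursive_mutex> guard(self->lock_);
            ++self->timeouts_;
            return true;
        };
    }

    int timeouts() const { return timeouts_; }

private:
    std::recursive_mutex lock_;
    int timeouts_ = 0;
};
```

The tracker must be owned by a std::shared_ptr for shared_from_this() to work. While the callback holds the locked shared_ptr, the tracker and its mutex stay alive for the duration of the handler.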
Are you willing to submit a PR?
- I'm willing to submit a PR!