Version 2 (modified by ming, 10 years ago) (diff) |
---|
This wiki describes the issue from ticket #1820 (Helgrind thread error detector test and analysis)
- Data race over possible concurrent read/write access for thread's quit flag.
Description: One thread tries to write the quit flag while another tries to read it.
Status: false positive, this is intentional behaviour.
==26641== ---Thread-Announcement------------------------------------------ ==26641== ==26641== Thread #6 was created ==26641== at 0x5953FCE: clone (clone.S:74) ==26641== by 0x5352199: do_clone.constprop.3 (createthread.c:75) ==26641== by 0x53538BA: pthread_create@@GLIBC_2.2.5 (createthread.c:245) ==26641== by 0x4C30C90: pthread_create_WRK (hg_intercepts.c:269) ==26641== by 0x56704B: pj_thread_create (os_core_unix.c:616) ==26641== by 0x49CC57: pjmedia_endpt_create (endpoint.c:169) ==26641== by 0x43102A: pjsua_media_subsys_init (pjsua_media.c:80) ==26641== by 0x42B576: pjsua_init (pjsua_core.c:1058) ==26641== by 0x40800A: app_init (pjsua_app.c:1346) ==26641== by 0x408F1A: pjsua_app_init (pjsua_app.c:1881) ==26641== by 0x405AA9: main_func (main.c:108) ==26641== by 0x568914: pj_run_app (os_core_unix.c:1930) ==26641== ==26641== Possible data race during write of size 4 at 0x8182E8 by thread #1 ==26641== Locks held: none ==26641== at 0x42AA5E: pjsua_stop_worker_threads (pjsua_core.c:730) ==26641== by 0x42C753: pjsua_destroy2 (pjsua_core.c:1548) ==26641== by 0x42CDC9: pjsua_destroy (pjsua_core.c:1775) ==26641== by 0x409291: app_destroy (pjsua_app.c:2011) ==26641== by 0x409304: pjsua_app_destroy (pjsua_app.c:2035) ==26641== by 0x405ADF: main_func (main.c:116) ==26641== by 0x568914: pj_run_app (os_core_unix.c:1930) ==26641== by 0x405B32: main (main.c:129) ==26641== ==26641== This conflicts with a previous read of size 4 by thread #7 ==26641== Locks held: none ==26641== at 0x42A978: worker_thread (pjsua_core.c:691) ==26641== by 0x566E30: thread_main (os_core_unix.c:523) ==26641== by 0x4C30E26: mythread_wrapper (hg_intercepts.c:233) ==26641== by 0x5353181: start_thread (pthread_create.c:312) ==26641== by 0x595400C: clone (clone.S:111)
- Lock order inversion involving rwmutex_lock_read()
Both threads try to acquire read locks, which should be ok. However, on some platforms, read write mutex may be implemented as some regular mutex, and deadlock can occur.
==26641== Thread #7: lock order "0x5EDC798 before 0x5E25638" violated (lock read should be ok) ==26641== ==26641== Observed (incorrect) order is: acquisition of lock at 0x5E25638 ==26641== at 0x4C2FE45: pthread_rwlock_rdlock_WRK (hg_intercepts.c:1549) ==26641== by 0x567DA5: pj_rwmutex_lock_read (os_core_unix.c:1448) ==26641== by 0x46D6A5: pjsip_endpt_process_rx_data (sip_endpoint.c:853) ==26641== by 0x46DAD0: endpt_on_rx_msg (sip_endpoint.c:1036) ==26641== by 0x476652: pjsip_tpmgr_receive_packet (sip_transport.c:1789) ==26641== by 0x4774DC: udp_on_read_complete (sip_transport_udp.c:173) ==26641== by 0x563CFC: ioqueue_dispatch_read_event (ioqueue_common_abs.c:591) ==26641== by 0x565E20: pj_ioqueue_poll (ioqueue_select.c:963) ==26641== by 0x46D47C: pjsip_endpt_handle_events2 (sip_endpoint.c:741) ==26641== by 0x42CE6D: pjsua_handle_events (pjsua_core.c:1833) ==26641== by 0x42A964: worker_thread (pjsua_core.c:694) ==26641== by 0x566E30: thread_main (os_core_unix.c:523) ==26641== ==26641== followed by a later acquisition of lock at 0x5EDC798 ==26641== at 0x4C32145: pthread_mutex_lock (hg_intercepts.c:518) ==26641== by 0x5679DB: pj_mutex_lock (os_core_unix.c:1243) ==26641== by 0x56ED6F: pj_lock_acquire (lock.c:180) ==26641== by 0x56EFED: grp_lock_acquire (lock.c:290) ==26641== by 0x56F471: pj_grp_lock_acquire (lock.c:437) ==26641== by 0x483370: pjsip_tsx_recv_msg (sip_transaction.c:1764) ==26641== by 0x481802: mod_tsx_layer_on_rx_response (sip_transaction.c:872) ==26641== by 0x46D7E6: pjsip_endpt_process_rx_data (sip_endpoint.c:894) ==26641== by 0x46DAD0: endpt_on_rx_msg (sip_endpoint.c:1036) ==26641== by 0x476652: pjsip_tpmgr_receive_packet (sip_transport.c:1789) ==26641== by 0x4774DC: udp_on_read_complete (sip_transport_udp.c:173) ==26641== by 0x563CFC: ioqueue_dispatch_read_event (ioqueue_common_abs.c:591) ==26641== ==26641== Required order was established by acquisition of lock at 0x5EDC798 ==26641== at 0x4C32145: pthread_mutex_lock (hg_intercepts.c:518) ==26641== by 0x5679DB: pj_mutex_lock (os_core_unix.c:1243) ==26641== by 0x56ED6F: pj_lock_acquire (lock.c:180) ==26641== by 0x56EFED: grp_lock_acquire (lock.c:290) ==26641== by 0x56F471: pj_grp_lock_acquire (lock.c:437) ==26641== by 0x48326E: pjsip_tsx_send_msg (sip_transaction.c:1721) ==26641== by 0x486252: pjsip_endpt_send_request (sip_util_statefull.c:117) ==26641== by 0x44E1FB: pjsip_regc_send (sip_reg.c:1410) ==26641== by 0x41DA10: pjsua_acc_set_registration (pjsua_acc.c:2527) ==26641== by 0x418489: pjsua_acc_add (pjsua_acc.c:485) ==26641== by 0x408CC7: app_init (pjsua_app.c:1809) ==26641== by 0x408F1A: pjsua_app_init (pjsua_app.c:1881) ==26641== ==26641== followed by a later acquisition of lock at 0x5E25638 ==26641== at 0x4C2FE45: pthread_rwlock_rdlock_WRK (hg_intercepts.c:1549) ==26641== by 0x567DA5: pj_rwmutex_lock_read (os_core_unix.c:1448) ==26641== by 0x46DB92: endpt_on_tx_msg (sip_endpoint.c:1066) ==26641== by 0x474740: pjsip_transport_send (sip_transport.c:802) ==26641== by 0x470CCF: stateless_send_transport_cb (sip_util.c:1243) ==26641== by 0x471017: stateless_send_resolver_callback (sip_util.c:1344) ==26641== by 0x4727F6: pjsip_resolve (sip_resolve.c:306) ==26641== by 0x46DDB8: pjsip_endpt_resolve (sip_endpoint.c:1148) ==26641== by 0x471132: pjsip_endpt_send_request_stateless (sip_util.c:1388) ==26641== by 0x4840E7: tsx_send_msg (sip_transaction.c:2119) ==26641== by 0x484875: tsx_on_state_null (sip_transaction.c:2349) ==26641== by 0x4832A3: pjsip_tsx_send_msg (sip_transaction.c:1727)
- Pulse Audio related
Status: outside PJSIP's scope
TODO: create a suppression file for this
==26641== Possible data race during read of size 4 at 0x6CEBC28 by thread #1 ==26641== Locks held: none ==26641== at 0x6AB5FE1: pa_once_begin (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so) ==26641== by 0x6AB616A: pa_run_once (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so) ==26641== by 0x6ACAE43: pa_thread_self (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so) ==26641== by 0x666146B: pa_threaded_mainloop_lock (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.16.2) ==26641== by 0x8F573AF: pulse_connect (in /usr/lib/x86_64-linux-gnu/alsa-lib/libasound_module_pcm_pulse.so) ==26641== by 0x8F56C9B: _snd_pcm_pulse_open (in /usr/lib/x86_64-linux-gnu/alsa-lib/libasound_module_pcm_pulse.so) ==26641== by 0x55B36FC: ??? (in /usr/lib/x86_64-linux-gnu/libasound.so.2.0.0) ==26641== by 0x55B3CF5: ??? (in /usr/lib/x86_64-linux-gnu/libasound.so.2.0.0) ==26641== by 0x559982: OpenPcm (pa_linux_alsa.c:544) ==26641== by 0x559A34: FillInDevInfo (pa_linux_alsa.c:580) ==26641== by 0x55A9B8: BuildDeviceList (pa_linux_alsa.c:853) ==26641== by 0x558E8B: PaAlsa_Initialize (pa_linux_alsa.c:266) ==26641== ==26641== This conflicts with a previous write of size 4 by thread #2 ==26641== Locks held: 1, at address 0x5E41130 ==26641== at 0x6AB6093: pa_once_end (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so) ==26641== by 0x6ACAEEC: ??? (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-4.0.so) ==26641== by 0x4C30E26: mythread_wrapper (hg_intercepts.c:233) ==26641== by 0x5353181: start_thread (pthread_create.c:312) ==26641== by 0x595400C: clone (clone.S:111)
- Data race over access to g_last_thread variable.
Status: won't fix. The variable is only used for information purpose only, while mutex locking for logging will be quite expensive.
==26641== Lock at 0x5E24F98 was first observed ==26641== at 0x4C31DDA: pthread_mutex_init (hg_intercepts.c:443) ==26641== by 0x567730: init_mutex (os_core_unix.c:1139) ==26641== by 0x567909: pj_mutex_create (os_core_unix.c:1192) ==26641== by 0x56798B: pj_mutex_create_recursive (os_core_unix.c:1220) ==26641== by 0x42AD68: pjsua_create (pjsua_core.c:823) ==26641== by 0x407E0C: app_init (pjsua_app.c:1288) ==26641== by 0x408F1A: pjsua_app_init (pjsua_app.c:1881) ==26641== by 0x405AA9: main_func (main.c:108) ==26641== by 0x568914: pj_run_app (os_core_unix.c:1930) ==26641== by 0x405B32: main (main.c:129) ==26641== ==26641== Lock at 0x5EDC798 was first observed ==26641== at 0x4C31DDA: pthread_mutex_init (hg_intercepts.c:443) ==26641== by 0x567730: init_mutex (os_core_unix.c:1139) ==26641== by 0x567909: pj_mutex_create (os_core_unix.c:1192) ==26641== by 0x56EB58: create_mutex_lock (lock.c:75) ==26641== by 0x56EBE5: pj_lock_create_recursive_mutex (lock.c:96) ==26641== by 0x56F3D5: pj_grp_lock_create (lock.c:413) ==26641== by 0x481C13: tsx_create (sip_transaction.c:1011) ==26641== by 0x48258F: pjsip_tsx_create_uac2 (sip_transaction.c:1306) ==26641== by 0x482427: pjsip_tsx_create_uac (sip_transaction.c:1270) ==26641== by 0x4861C2: pjsip_endpt_send_request (sip_util_statefull.c:103) ==26641== by 0x44E1FB: pjsip_regc_send (sip_reg.c:1410) ==26641== by 0x41DA10: pjsua_acc_set_registration (pjsua_acc.c:2527) ==26641== ==26641== Possible data race during write of size 8 at 0x8100E0 by thread #7 ==26641== Locks held: 2, at addresses 0x5E24F98 0x5EDC798 ==26641== at 0x570214: pj_log (log.c:421) ==26641== by 0x570675: pj_log_3 (log.c:515) ==26641== by 0x41BB3B: acc_check_nat_addr (pjsua_acc.c:1674) ==26641== by 0x41C996: regc_tsx_cb (pjsua_acc.c:2093) ==26641== by 0x44D7BF: regc_tsx_callback (sip_reg.c:1092) ==26641== by 0x486100: mod_util_on_tsx_state (sip_util_statefull.c:81) ==26641== by 0x4822A9: tsx_set_state (sip_transaction.c:1210) ==26641== by 0x4858B5: tsx_on_state_proceeding_uac (sip_transaction.c:3008) ==26641== by 0x484C09: tsx_on_state_calling (sip_transaction.c:2492) ==26641== by 0x48338B: pjsip_tsx_recv_msg (sip_transaction.c:1765) ==26641== by 0x481802: mod_tsx_layer_on_rx_response (sip_transaction.c:872) ==26641== by 0x46D7E6: pjsip_endpt_process_rx_data (sip_endpoint.c:894) ==26641== ==26641== This conflicts with a previous read of size 8 by thread #1 ==26641== Locks held: none ==26641== at 0x5701E8: pj_log (log.c:419) ==26641== by 0x56CF92: invoke_log (errno.c:223) ==26641== by 0x56D062: pj_perror_imp (errno.c:244) ==26641== by 0x56D127: pj_perror (errno.c:252) ==26641== by 0x4059EE: on_app_started (main.c:32) ==26641== by 0x409040: pjsua_app_run (pjsua_app.c:1915) ==26641== by 0x405ABC: main_func (main.c:110) ==26641== by 0x568914: pj_run_app (os_core_unix.c:1930)
- Data race over cp->used_size variable in pool_caching.
Status: won't fix, the variable is for statistics purpose only.
==26641== Lock at 0x817F90 was first observed ==26641== at 0x4C31DDA: pthread_mutex_init (hg_intercepts.c:443) ==26641== by 0x567730: init_mutex (os_core_unix.c:1139) ==26641== by 0x567909: pj_mutex_create (os_core_unix.c:1192) ==26641== by 0x56EB58: create_mutex_lock (lock.c:75) ==26641== by 0x56EBB3: pj_lock_create_simple_mutex (lock.c:89) ==26641== by 0x571AAE: pj_caching_pool_init (pool_caching.c:81) ==26641== by 0x42ACF9: pjsua_create (pjsua_core.c:815) ==26641== by 0x407E0C: app_init (pjsua_app.c:1288) ==26641== by 0x408F1A: pjsua_app_init (pjsua_app.c:1881) ==26641== by 0x405AA9: main_func (main.c:108) ==26641== by 0x568914: pj_run_app (os_core_unix.c:1930) ==26641== by 0x405B32: main (main.c:129) ==26641== ==26641== Possible data race during write of size 8 at 0x817DA0 by thread #1 ==26641== Locks held: 1, at address 0x817F90 ==26641== at 0x57224E: cpool_on_block_free (pool_caching.c:332) ==26641== by 0x577841: default_block_free (pool_policy_malloc.c:67) ==26641== by 0x571625: reset_pool (pool.c:254) ==26641== by 0x5716C6: pj_pool_destroy_int (pool.c:292) ==26641== by 0x571EDA: cpool_release_pool (pool_caching.c:238) ==26641== by 0x571137: pj_pool_release (pool_i.h:92) ==26641== by 0x409281: app_destroy (pjsua_app.c:2007) ==26641== by 0x409304: pjsua_app_destroy (pjsua_app.c:2035) ==26641== by 0x405ADF: main_func (main.c:116) ==26641== by 0x568914: pj_run_app (os_core_unix.c:1930) ==26641== by 0x405B32: main (main.c:129) ==26641== ==26641== This conflicts with a previous write of size 8 by thread #7 ==26641== Locks held: none ==26641== at 0x57224E: cpool_on_block_free (pool_caching.c:332) ==26641== by 0x577841: default_block_free (pool_policy_malloc.c:67) ==26641== by 0x571625: reset_pool (pool.c:254) ==26641== by 0x5716A7: pj_pool_reset (pool.c:276) ==26641== by 0x477597: udp_on_read_complete (sip_transport_udp.c:221)
- Data race over tcp->pending_connect variable
Status: should be safe. A modification to suppress this would be desirable.
==5950== Lock at 0x6CAE988 was first observed ==5950== at 0x4C2DDF1: pthread_mutex_init (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==5950== by 0x50AEE1: init_mutex (os_core_unix.c:1157) ==5950== by 0x50B789: pj_mutex_create (os_core_unix.c:1211) ==5950== by 0x427C06: pjsua_create (pjsua_core.c:817) ==5950== by 0x408553: app_init (pjsua_app.c:1292) ==5950== by 0x40A780: pjsua_app_init (pjsua_app.c:1885) ==5950== by 0x407A0B: main_func (main.c:108) ==5950== by 0x64F476C: (below main) (libc-start.c:226) ==5950== ==5950== Lock at 0x6D42E58 was first observed ==5950== at 0x4C2DDF1: pthread_mutex_init (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==5950== by 0x50AEE1: init_mutex (os_core_unix.c:1157) ==5950== by 0x50B789: pj_mutex_create (os_core_unix.c:1211) ==5950== by 0x5112BE: create_mutex_lock (lock.c:75) ==5950== by 0x5117DB: pj_grp_lock_create (lock.c:438) ==5950== by 0x511B40: pj_grp_lock_create_w_handler (lock.c:463) ==5950== by 0x46D575: tsx_create (sip_transaction.c:1014) ==5950== by 0x46E2B5: pjsip_tsx_create_uac2 (sip_transaction.c:1309) ==5950== by 0x46F144: pjsip_endpt_send_request (sip_util_statefull.c:103) ==5950== by 0x440F73: pjsip_regc_send (sip_reg.c:1424) ==5950== by 0x418408: pjsua_acc_set_registration (pjsua_acc.c:2579) ==5950== by 0x41C164: pjsua_acc_add (pjsua_acc.c:487) ==5950== ==5950== Lock at 0x6D45728 was first observed ==5950== at 0x4C2DDF1: pthread_mutex_init (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==5950== by 0x50AEE1: init_mutex (os_core_unix.c:1157) ==5950== by 0x50B789: pj_mutex_create (os_core_unix.c:1211) ==5950== by 0x5112BE: create_mutex_lock (lock.c:75) ==5950== by 0x5117DB: pj_grp_lock_create (lock.c:438) ==5950== by 0x463DE7: tcp_create.constprop.3 (sip_transport_tcp.c:683) ==5950== by 0x46441E: lis_create_transport (sip_transport_tcp.c:1021) ==5950== by 0x4606A4: pjsip_tpmgr_acquire_transport2 (sip_transport.c:2040) ==5950== by 0x4165C1: pjsua_acc_get_uac_addr.part.6 (pjsua_acc.c:3151) ==5950== by 0x417FB7: pjsua_acc_create_uac_contact (pjsua_acc.c:3233) ==5950== by 0x418670: pjsua_acc_set_registration (pjsua_acc.c:2334) ==5950== by 0x41C164: pjsua_acc_add (pjsua_acc.c:487) ==5950== ==5950== Possible data race during read of size 4 at 0x6D43E20 by thread #1 ==5950== Locks held: 2, at addresses 0x6CAE988 0x6D42E58 ==5950== at 0x463652: tcp_send_msg (sip_transport_tcp.c:1246) ==5950== by 0x45F93A: pjsip_transport_send (sip_transport.c:833) ==5950== by 0x45B0FD: stateless_send_transport_cb (sip_util.c:1243) ==5950== by 0x45B439: stateless_send_resolver_callback (sip_util.c:1344) ==5950== by 0x45E09E: pjsip_resolve (sip_resolve.c:306) ==5950== by 0x45D16F: pjsip_endpt_send_request_stateless (sip_util.c:1388) ==5950== by 0x46C087: tsx_send_msg (sip_transaction.c:2129) ==5950== by 0x46C223: tsx_on_state_null (sip_transaction.c:2360) ==5950== by 0x46ED35: pjsip_tsx_send_msg (sip_transaction.c:1737) ==5950== by 0x46F1AE: pjsip_endpt_send_request (sip_util_statefull.c:117) ==5950== by 0x440F73: pjsip_regc_send (sip_reg.c:1424) ==5950== by 0x418408: pjsua_acc_set_registration (pjsua_acc.c:2579) ==5950== ==5950== This conflicts with a previous write of size 4 by thread #7 ==5950== Locks held: 1, at address 0x6D45728 ==5950== at 0x46380A: on_connect_complete (sip_transport_tcp.c:1423) ==5950== by 0x509C05: ioqueue_dispatch_write_event (ioqueue_common_abs.c:280) ==5950== by 0x50A82E: pj_ioqueue_poll (ioqueue_select.c:969) ==5950== by 0x459D8A: pjsip_endpt_handle_events2 (sip_endpoint.c:741) ==5950== by 0x426946: pjsua_handle_events (pjsua_core.c:1833) ==5950== by 0x426EB9: worker_thread (pjsua_core.c:696) ==5950== by 0x50BD89: thread_main (os_core_unix.c:541) ==5950== by 0x4C2DC3D: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==5950== ==5950== Address 0x6D43E20 is 416 bytes inside a block of size 6144 alloc'd ==5950== at 0x4C2B87D: malloc (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so) ==5950== by 0x51BEAE: default_block_alloc (pool_policy_malloc.c:46) ==5950== by 0x5130B5: pj_pool_allocate_find (pool.c:60) ==5950== by 0x5131EC: pj_pool_calloc (pool_i.h:69) ==5950== by 0x463BC8: tcp_create.constprop.3 (pool.h:476) ==5950== by 0x46441E: lis_create_transport (sip_transport_tcp.c:1021) ==5950== by 0x4606A4: pjsip_tpmgr_acquire_transport2 (sip_transport.c:2040) ==5950== by 0x4165C1: pjsua_acc_get_uac_addr.part.6 (pjsua_acc.c:3151) ==5950== by 0x417FB7: pjsua_acc_create_uac_contact (pjsua_acc.c:3233) ==5950== by 0x418670: pjsua_acc_set_registration (pjsua_acc.c:2334) ==5950== by 0x41C164: pjsua_acc_add (pjsua_acc.c:487) ==5950== by 0x408DCF: app_init (pjsua_app.c:1813)
- Destroying a locked mutex in sip dialog
Status: won't fix. When calling pj_mutex_destroy() there is actually no requirement that the mutex should be unlocked. If the mutex destroy fails, pj_mutex_destroy() will automatically retry several times by unlocking the mutex first.
==27298== Thread #9 unlocked an invalid lock at 0x6B1B408 ==27298== at 0x4C325C0: pthread_mutex_unlock (hg_intercepts.c:632) ==27298== by 0x578334: pj_mutex_destroy (os_core_unix.c:1413) ==27298== by 0x4F8D7A: destroy_dialog (sip_dialog.c:114) ==27298== by 0x4FA4C9: unregister_and_destroy_dialog (sip_dialog.c:760) ==27298== by 0x4FA94B: pjsip_dlg_dec_lock (sip_dialog.c:896) ==27298== by 0x4FCD4A: pjsip_dlg_on_tsx_state (sip_dialog.c:2030) ==27298== by 0x4FD674: mod_ua_on_tsx_state (sip_ua_layer.c:178) ==27298== by 0x4F4771: tsx_set_state (sip_transaction.c:1210) ==27298== by 0x4F8103: tsx_on_state_completed_uas (sip_transaction.c:3150) ==27298== by 0x4F4591: tsx_timer_callback (sip_transaction.c:1153) ==27298== by 0x58808F: pj_timer_heap_poll (timer.c:643) ==27298== by 0x4DF50A: pjsip_endpt_handle_events2 (sip_endpoint.c:712)