Opened 8 months ago

Closed 8 months ago

Last modified 8 months ago

#2264 closed defect (fixed)

Potential deadlock between pjsua lock and sip transport's lock

Reported by: ming
Owned by: ming
Priority: normal
Milestone: release-2.10
Component: pjsip
Version: trunk
Keywords: helgrind
Cc:
Backport to 1.x milestone:
Backported: no

Description

The issue was found using Helgrind.

Although not identical, this issue resembles tickets #2260 and #1247: holding a lock while calling pjsip_regc_send() may lead to a deadlock, because any send operation will later acquire the transport's lock. In the other direction, when a message is received, the transport's lock is held while the upper layer's callbacks are invoked to process the message, and those callbacks may in turn need to acquire the upper layer's lock. The two paths therefore take the same pair of locks in opposite order, which violates the lock ordering and can cause a deadlock between rx and tx.

In this case, the two locks involved are the pjsua lock and the SIP transport's lock:
tx: PJSUA_LOCK() -> pjsip_regc_send() -> acquire transport's lock
rx: ioqueue_dispatch_read_event() -> acquire transport's lock -> regc_tsx_callback() -> PJSUA_LOCK()
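
The two acquisition orders above, and one common way to avoid the inversion, can be sketched with plain pthreads. This is only an illustration under stated assumptions, not the change made in the PJSIP source: app_lock, tp_lock, transport_send(), tx_thread(), rx_thread() and run_lock_order_demo() are hypothetical names standing in for the pjsua lock, the transport's lock and the two code paths. The tx path releases the upper-layer lock before sending, so no thread ever waits for the transport's lock while holding the upper-layer lock.

```c
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 1000

/* Hypothetical stand-ins: app_lock plays the role of the pjsua lock,
 * tp_lock the role of the SIP transport's lock. */
static pthread_mutex_t app_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t tp_lock  = PTHREAD_MUTEX_INITIALIZER;
static int sent_count;     /* guarded by tp_lock  */
static int handled_count;  /* guarded by app_lock */

/* Stand-in for a send operation: it takes the transport's lock internally. */
static void transport_send(void)
{
    pthread_mutex_lock(&tp_lock);
    sent_count++;
    pthread_mutex_unlock(&tp_lock);
}

/* tx path: do the bookkeeping under app_lock, then release it BEFORE
 * sending. Calling transport_send() with app_lock still held would
 * establish the order app_lock -> tp_lock, the reverse of the rx path. */
static void *tx_thread(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&app_lock);
        /* ... prepare the request while the state is protected ... */
        pthread_mutex_unlock(&app_lock);
        transport_send();
    }
    return NULL;
}

/* rx path: the transport's lock is held while the upper-layer callback
 * runs, and the callback takes app_lock (order: tp_lock -> app_lock). */
static void *rx_thread(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&tp_lock);
        pthread_mutex_lock(&app_lock);
        handled_count++;
        pthread_mutex_unlock(&app_lock);
        pthread_mutex_unlock(&tp_lock);
    }
    return NULL;
}

/* Runs both paths concurrently; returns 0 if both ran to completion. */
static int run_lock_order_demo(void)
{
    pthread_t tx, rx;

    sent_count = 0;
    handled_count = 0;
    if (pthread_create(&tx, NULL, tx_thread, NULL) != 0)
        return -1;
    if (pthread_create(&rx, NULL, rx_thread, NULL) != 0) {
        pthread_join(tx, NULL);
        return -1;
    }
    pthread_join(tx, NULL);
    pthread_join(rx, NULL);
    return (sent_count == ITERATIONS && handled_count == ITERATIONS) ? 0 : -1;
}
```

Calling run_lock_order_demo() from a test program should return 0. Releasing the upper-layer lock early means the protected state can change while the send is in progress, so anything read before the send has to be revalidated once the lock is taken again.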

Below is the stack trace reported by Helgrind:

Thread #4: lock order "0x645E888 before 0x64E3B88" violated

Observed (incorrect) order is: acquisition of lock at 0x64E3B88
   by 0x5A154D: pj_grp_lock_tryacquire (lock.c:483)
   by 0x5960CF: pj_ioqueue_trylock_key (ioqueue_common_abs.c:1366)
   by 0x594CC9: ioqueue_dispatch_read_event (ioqueue_common_abs.c:439)
   by 0x597314: pj_ioqueue_poll (ioqueue_select.c:1069)
   by 0x479D40: pjsip_endpt_handle_events2 (sip_endpoint.c:745)
   by 0x433808: pjsua_handle_events (pjsua_core.c:2156)

 followed by a later acquisition of lock at 0x645E888
   by 0x418DD9: PJSUA_LOCK (pjsua_internal.h:593)
   by 0x41F02F: regc_tsx_cb (pjsua_acc.c:2221)
   by 0x4592EA: regc_tsx_callback (sip_reg.c:1104)
   by 0x49A71C: mod_util_on_tsx_state (sip_util_statefull.c:81)
   by 0x496816: tsx_set_state (sip_transaction.c:1272)
   by 0x499EBC: tsx_on_state_proceeding_uac (sip_transaction.c:3115)
   by 0x4991D4: tsx_on_state_calling (sip_transaction.c:2599)
   by 0x4978F7: pjsip_tsx_recv_msg (sip_transaction.c:1832)
   by 0x495CA5: mod_tsx_layer_on_rx_response (sip_transaction.c:893)
   by 0x47A11C: pjsip_endpt_process_rx_data (sip_endpoint.c:938)

Required order was established by acquisition of lock at 0x645E888
   by 0x418DD9: PJSUA_LOCK (pjsua_internal.h:593)
   by 0x420188: pjsua_acc_set_registration (pjsua_acc.c:2689)

 followed by a later acquisition of lock at 0x64E3B88
   by 0x5A1533: pj_grp_lock_acquire (lock.c:478)
   by 0x596092: pj_ioqueue_lock_key (ioqueue_common_abs.c:1358)
   by 0x595D35: pj_ioqueue_connect (ioqueue_common_abs.c:1220)
   by 0x59D915: pj_activesock_start_connect (activesock.c:936)
   by 0x489832: lis_create_transport (sip_transport_tcp.c:1053)
   by 0x484CEB: pjsip_tpmgr_acquire_transport2 (sip_transport.c:2457)
   by 0x47A7CD: pjsip_endpt_acquire_transport2 (sip_endpoint.c:1246)
   by 0x47D3D0: stateless_send_transport_cb (sip_util.c:1181)

 Lock at 0x645E888 was first observed
   by 0x598F8C: pj_mutex_create_recursive (os_core_unix.c:1258)
   by 0x4310E7: pjsua_create (pjsua_core.c:950)

 Lock at 0x64E3B88 was first observed
   by 0x5A1435: pj_grp_lock_create (lock.c:438)
   by 0x5A14D2: pj_grp_lock_create_w_handler (lock.c:463)
   by 0x4889EC: tcp_create (sip_transport_tcp.c:684)
   by 0x4897D9: lis_create_transport (sip_transport_tcp.c:1045)
   by 0x484CEB: pjsip_tpmgr_acquire_transport2 (sip_transport.c:2457)

Change History (2)

comment:1 Changed 8 months ago by ming

  • Owner set to ming
  • Resolution set to fixed
  • Status changed from new to closed

In 6142:

Fixed #2264: Potential deadlock between pjsua lock and sip transport's lock

comment:2 Changed 8 months ago by ming

In 6160:

Re #2264: Fixed crash if pjsua_var.acc[acc_id].regc is NULL
Note that the regc instance itself has not been destroyed, because its refcount has been incremented, but acc->regc may already have been set to NULL.
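
The situation described in this comment can be illustrated with a small sketch. It is not the code from changeset 6160: reg_client, account, acc_lock, send_registration(), drop_regc() and run_regc_demo() are hypothetical names, and the reference count is simply guarded by the same mutex. The point it shows is that a reference taken before the lock is released keeps the object alive, while the account's own pointer to it must still be re-checked after the lock is re-acquired.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

enum { SEND_OK = 0, SEND_NO_REGC = 1, SEND_REGC_GONE = 2 };

/* Hypothetical stand-in for a reference-counted registration client. */
typedef struct reg_client {
    int ref_cnt;            /* guarded by acc_lock in this sketch */
} reg_client;

/* Hypothetical stand-in for an account entry. */
typedef struct account {
    reg_client *regc;       /* may be set to NULL while acc_lock is released */
    int reg_updated;        /* post-send state, only valid if regc != NULL */
} account;

static pthread_mutex_t acc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Drops one reference; the caller must hold acc_lock. */
static void release_ref_locked(reg_client *rc)
{
    if (--rc->ref_cnt == 0)
        free(rc);
}

/* Simulates another code path clearing the account's registration. */
static void drop_regc(account *acc)
{
    pthread_mutex_lock(&acc_lock);
    if (acc->regc != NULL) {
        reg_client *rc = acc->regc;
        acc->regc = NULL;
        release_ref_locked(rc);
    }
    pthread_mutex_unlock(&acc_lock);
}

/* during_send is a test hook invoked while acc_lock is NOT held. */
static int send_registration(account *acc, void (*during_send)(account *))
{
    reg_client *rc;
    int status;

    pthread_mutex_lock(&acc_lock);
    rc = acc->regc;
    if (rc == NULL) {
        pthread_mutex_unlock(&acc_lock);
        return SEND_NO_REGC;
    }
    rc->ref_cnt++;                    /* keep the instance alive */
    pthread_mutex_unlock(&acc_lock);  /* never send with the lock held */

    /* ... the actual send would happen here, using rc ... */
    if (during_send != NULL)
        during_send(acc);

    pthread_mutex_lock(&acc_lock);
    if (acc->regc == NULL) {
        /* rc is still valid memory, but the account no longer owns it. */
        status = SEND_REGC_GONE;
    } else {
        acc->reg_updated = 1;
        status = SEND_OK;
    }
    release_ref_locked(rc);
    pthread_mutex_unlock(&acc_lock);
    return status;
}

/* Exercises the three outcomes; returns 0 if each one is as expected. */
static int run_regc_demo(void)
{
    account acc = { NULL, 0 };

    acc.regc = calloc(1, sizeof(*acc.regc));
    if (acc.regc == NULL)
        return -1;
    acc.regc->ref_cnt = 1;            /* the account's own reference */

    if (send_registration(&acc, NULL) != SEND_OK)
        return -1;
    if (send_registration(&acc, drop_regc) != SEND_REGC_GONE)
        return -1;
    if (send_registration(&acc, NULL) != SEND_NO_REGC)
        return -1;
    return 0;
}
```

In the second call the instance is freed only when send_registration() drops its own reference, after drop_regc() has already cleared acc->regc; dereferencing acc->regc at that point without the NULL check would be the kind of crash this comment describes.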
