Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#1617 closed defect (fixed)

Major synchronization fixes in PJNATH

Reported by: bennylp Owned by: bennylp
Priority: normal Milestone: release-2.1
Component: pjnath Version: trunk
Keywords: Cc:
Backport to 1.x milestone: Backported: no

Description (last modified by bennylp)

Overview

Many fixes and workarounds have been applied in PJNATH in an attempt to address synchronization issues, such as:

  • #1610 (v2.0.5): Workaround for reported crash on stun_sock's on_data_recvfrom() callback
  • #1604 (v2.0.5): Crash caused by double destructions of ICE stream transport
  • #1594 (v2.0.5): Deadlock between TURN and ioqueue locks
  • #1557 (v2.0.5): Assertion when TURN session is already destroyed
  • #1551 (v2.0.5): Assertion in TURN code when shutdown or destroy is called more than once
  • #1548 (v2.0.5): Crash due to race condition in timer when call is disconnected quickly

This ticket covers the work to fix various synchronization issues in PJNATH. It depends on #1616.

An overview of the work done (or being done) under this ticket is as follows:

  • Modify all STUN objects to use group lock (see Group Lock).
  • Use a single group lock for all objects in the same group (e.g. within the ICE strans object). This includes the ICE strans, ICE session, STUN socket and session, TURN socket and session, down to the ioqueue key and timer heap entry.
  • Carefully construct the proper shutdown routine for each object. For some objects, this means splitting the destroy process into two stages: the shutdown routine that stops the operation of the object (done by the object's destroy() API), and the actual release of the pool (done in the group lock's destructor handler).
  • Create new concurrency stress test routine in PJNATH-TEST program.
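
The two-stage destroy described above can be illustrated with a minimal, self-contained sketch. This is not PJLIB's group lock implementation: the types and functions below (grp_lock, member_obj, member_destroy(), run_demo() and so on) are hypothetical stand-ins, and real locking is omitted. It only shows the ordering: destroy() stops the object and drops its reference, while memory is released by a destructor handler once the last reference in the group is gone.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_HANDLERS 8

/* Hypothetical stand-in for a group lock: a reference counter plus a list of
 * destructor handlers. Actual mutex handling is left out of this sketch. */
typedef struct grp_lock {
    int ref_cnt;
    int handler_cnt;
    struct {
        void *member;
        void (*fn)(void *member);
    } handlers[MAX_HANDLERS];
} grp_lock;

/* Hypothetical group member (think: a STUN or TURN socket). */
typedef struct member_obj {
    const char *name;
    grp_lock   *lock;
    int         is_shut_down;   /* set by stage 1 */
    int        *released_flag;  /* set to 1 by stage 2 */
} member_obj;

static int grp_lock_add_handler(grp_lock *gl, void *member,
                                void (*fn)(void *member))
{
    if (gl->handler_cnt >= MAX_HANDLERS)
        return -1;
    gl->handlers[gl->handler_cnt].member = member;
    gl->handlers[gl->handler_cnt].fn = fn;
    gl->handler_cnt++;
    return 0;
}

/* Drop one reference; when the last one goes away, run every handler. */
static void grp_lock_dec_ref(grp_lock *gl)
{
    int i;
    if (--gl->ref_cnt > 0)
        return;
    for (i = 0; i < gl->handler_cnt; i++)
        gl->handlers[i].fn(gl->handlers[i].member);
    free(gl);
}

/* Stage 2: actual release of the member's memory. */
static void member_on_destroy(void *user)
{
    member_obj *m = (member_obj *)user;
    printf("%s: memory released\n", m->name);
    *m->released_flag = 1;
    free(m);
}

static member_obj *member_create(const char *name, grp_lock *gl,
                                 int *released_flag)
{
    member_obj *m = calloc(1, sizeof(*m));
    if (!m)
        return NULL;
    m->name = name;
    m->lock = gl;
    m->released_flag = released_flag;
    if (grp_lock_add_handler(gl, m, member_on_destroy) != 0) {
        free(m);
        return NULL;
    }
    gl->ref_cnt++;              /* the member holds one reference */
    return m;
}

/* Stage 1: stop operating and give up the reference. A repeated call is a
 * no-op as long as another member still keeps the group alive. */
static void member_destroy(member_obj *m)
{
    if (m->is_shut_down)
        return;
    m->is_shut_down = 1;
    grp_lock_dec_ref(m->lock);
}

/* Returns 0 when the observed ordering matches the two-stage scheme. */
static int run_demo(void)
{
    int a_released = 0, b_released = 0;
    grp_lock *gl = calloc(1, sizeof(*gl));
    member_obj *a, *b;

    if (!gl)
        return -1;
    a = member_create("stun_sock", gl, &a_released);
    b = member_create("turn_sock", gl, &b_released);
    if (!a || !b)
        return -1;              /* allocation failure; cleanup skipped */

    member_destroy(a);          /* stage 1 for a */
    member_destroy(a);          /* second call: ignored */
    if (a_released || b_released)
        return 1;               /* nothing may be freed yet */
    if (gl->ref_cnt != 1)
        return 2;

    member_destroy(b);          /* last reference: stage 2 runs for both */
    if (!a_released || !b_released)
        return 3;
    return 0;
}
```

Calling run_demo() from a main() function exercises the sequence. In this sketch, a callback that still holds a pointer to the first member after its destroy() call would find the memory intact, because the second member keeps the group alive.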

API Changes

Some APIs have been "enhanced" and some have been changed. These changes were considered necessary to make group lock usage more explicit. However, since they occur only in PJNATH, they should only affect applications that use PJNATH directly (typical PJSUA-LIB applications will not be affected). The details are as follows.

STUN Session

  • pj_stun_session_create() has been changed; new group lock parameter added.


STUN Socket

  • New group lock field in pj_stun_sock_cfg. If the STUN socket is created by ICE, ICE will fill in this value itself.

STUN Transaction

  • pj_stun_client_tsx_create() has been changed; new group lock parameter added.
  • pj_stun_client_tsx_destroy() was renamed to pj_stun_client_tsx_stop() to better reflect the fact that the operation doesn't destroy anything but rather just stops the transaction from operating.

TURN Socket

  • New group lock field in pj_turn_sock_cfg. If the TURN socket is created by ICE, ICE will fill in this value itself.

Change History (6)

comment:1 Changed 7 years ago by bennylp

  • Resolution set to fixed
  • Status changed from new to closed

In 4360:

Fixed #1617: major synchronization fixes in PJNATH with incorporation of group lock to avoid deadlock and crashes due to race conditions

comment:2 Changed 7 years ago by bennylp

  • Description modified (diff)

comment:3 Changed 7 years ago by bennylp

In 4368:

Re #1617: prevent TURN session from sending anything once it is in deallocating state

comment:4 Changed 7 years ago by bennylp

  • Description modified (diff)

comment:5 Changed 7 years ago by riza

In 4372:

Re #1617: added concur_test.c to visual studio pjnath_test project

comment:6 Changed 7 years ago by ming

In r4413:

Re #1617: Fixed assertion trying to release group lock when STUN transaction is already destroyed in the callback
