Opened 2 years ago

Closed 2 years ago

#1969 closed defect (fixed)

Crash on using an already destroyed SSL socket

Reported by: riza
Owned by: bennylp
Priority: normal
Milestone: release-2.6
Component: pjlib
Version: trunk
Keywords:
Cc:
Backport to 1.x milestone:
Backported: no

Description (last modified by riza)

On a heavily loaded system using TLS,
one thread can destroy the SSL socket on SSL_ERROR_SYSCALL
while another thread is still using it. The second thread then
accesses memory that was already freed, which causes a segfault.

Stack trace:

Stack trace of thread 6110:
#0  0x00007f2f6497a914 __memcpy_sse2_unaligned (
#1  0x00007f2f6601adc6 mem_write (
#2  0x00007f2f66019d6c BIO_write (
#3  0x00007f2f6638b652 ssl3_write_pending (
#4  0x00007f2f6638d833 ssl3_dispatch_alert (
#5  0x00007f2f66389432 ssl3_shutdown (
#6  0x00007f2ed6b154cf destroy_ssl (
#7  0x00007f2ed6b169f7 asock_on_data_read (
#8  0x00007f2ed6b0c018 ioqueue_on_read_complete (
#9  0x00007f2ed6b07af2 ioqueue_dispatch_read_event (
#10 0x00007f2ed6b08ee0 pj_ioqueue_poll (
#11 0x00007f2ed86551d5 pjsip_endpt_handle_events2 (
#12 0x00007f2ed46206c8 monitor_thread_exec (
#13 0x00007f2ed6b09e06 thread_main (
#14 0x00007f2f656a261a start_thread (
#15 0x00007f2f649de59d __clone (

#0  0x00007f66058ca1c0 in ?? ()
#1  0x00007f66b10307bb in BIO_write () from /lib64/
#2  0x00007f66b1363142 in ssl3_write_pending () from /lib64/
#3  0x00007f66b1363a20 in ssl3_write_bytes () from /lib64/
#4  0x00007f664df7d806 in ssl_write (ssock=ssock@entry=0x7f663187a858, send_key=send_key@entry=0x7f6630543040, 
    data=data@entry=0x7f662c4e0768, size=421, flags=flags@entry=0) at ../src/pj/ssl_sock_ossl.c:2499
#5  0x00007f664df7f62d in pj_ssl_sock_send (ssock=0x7f663187a858, send_key=send_key@entry=0x7f6630543040, data=0x7f662c4e0768, 
    size=size@entry=0x7f6646f0b5d8, flags=flags@entry=0) at ../src/pj/ssl_sock_ossl.c:2643
#6  0x00007f664f04d410 in tls_send_msg (transport=0x7f66304b9348, tdata=0x7f6630542fe8, rem_addr=<optimized out>, 
    addr_len=<optimized out>, token=<optimized out>, callback=<optimized out>) at ../src/pjsip/sip_transport_tls.c:1460
#7  0x00007f664f047b8a in pjsip_transport_send (tr=0x7f66304b9348, tdata=tdata@entry=0x7f6630542fe8, 
    addr=addr@entry=0x7f66305431d8, addr_len=addr_len@entry=16, token=token@entry=0x7f6630543c10, 
    cb=cb@entry=0x7f664f043614 <stateless_send_transport_cb>) at ../src/pjsip/sip_transport.c:839
#8  0x00007f664f04395d in stateless_send_transport_cb (token=token@entry=0x7f6630543c10, tdata=tdata@entry=0x7f6630542fe8, 
    sent=<optimized out>, sent@entry=-70002) at ../src/pjsip/sip_util.c:1251
#9  0x00007f664f043b91 in stateless_send_resolver_callback (status=<optimized out>, token=0x7f6630543c10, addr=<optimized out>)
    at ../src/pjsip/sip_util.c:1352
#10 0x00007f664f046883 in pjsip_resolve (resolver=<optimized out>, pool=<optimized out>, target=target@entry=0x7f6646f0b9f0, 
    token=token@entry=0x7f6630543c10, cb=cb@entry=0x7f664f0439a0 <stateless_send_resolver_callback>)
    at ../src/pjsip/sip_resolve.c:348
#11 0x00007f664f0430b7 in pjsip_endpt_resolve (endpt=endpt@entry=0x1c0b5c8, pool=<optimized out>, 
    target=target@entry=0x7f6646f0b9f0, token=token@entry=0x7f6630543c10, 
    cb=cb@entry=0x7f664f0439a0 <stateless_send_resolver_callback>) at ../src/pjsip/sip_endpoint.c:1158
#12 0x00007f664f04537f in pjsip_endpt_send_request_stateless (endpt=0x1c0b5c8, tdata=tdata@entry=0x7f6630542fe8, 
    token=token@entry=0x0, cb=cb@entry=0x0) at ../src/pjsip/sip_util.c:1396
#13 0x00007f664f056dc3 in pjsip_dlg_send_request (dlg=0x7f66a02f3b18, tdata=0x7f6630542fe8, mod_data_id=mod_data_id@entry=-1, 
    mod_data=mod_data@entry=0x0) at ../src/pjsip/sip_dialog.c:1290
#14 0x00007f664f48fbb1 in inv_send_ack (inv=inv@entry=0x7f66a02f4b68, e=e@entry=0x7f6646f0bb60) at ../src/pjsip-ua/sip_inv.c:442
#15 0x00007f664f491eae in inv_on_state_early (inv=0x7f66a02f4b68, e=0x7f6646f0bb60) at ../src/pjsip-ua/sip_inv.c:4392
#16 0x00007f664f48cf79 in mod_inv_on_tsx_state (tsx=0x7f66a030f4f8, e=0x7f6646f0bb60) at ../src/pjsip-ua/sip_inv.c:677
#17 0x00007f664f0574bd in pjsip_dlg_on_tsx_state (dlg=0x7f66a02f3b18, tsx=0x7f66a030f4f8, e=0x7f6646f0bb60)
    at ../src/pjsip/sip_dialog.c:2056
#18 0x00007f664f05833a in mod_ua_on_tsx_state (tsx=<optimized out>, e=<optimized out>) at ../src/pjsip/sip_ua_layer.c:178
#19 0x00007f664f052a0c in tsx_set_state (tsx=tsx@entry=0x7f66a030f4f8, state=state@entry=PJSIP_TSX_STATE_TERMINATED, 
    event_src_type=event_src_type@entry=PJSIP_EVENT_RX_MSG, event_src=0x7f6605933e28, flag=flag@entry=0)
    at ../src/pjsip/sip_transaction.c:1233
#20 0x00007f664f053f30 in tsx_on_state_proceeding_uac (tsx=0x7f66a030f4f8, event=0x7f6646f0bc20)
    at ../src/pjsip/sip_transaction.c:2930
#21 0x00007f664f0552ac in pjsip_tsx_recv_msg (tsx=tsx@entry=0x7f66a030f4f8, rdata=rdata@entry=0x7f6605933e28)
    at ../src/pjsip/sip_transaction.c:1787

There are some issues we identified:

  • Race condition: the write method was called on an already destroyed SSL socket.
  • Race condition: write_mutex was destroyed before the call to send.

These issues can be resolved by using a group lock and moving the cleanup code (destroy_ssl() and pj_lock_destroy()) into the group lock's destroy handler, so that cleanup runs only after the last reference to the socket has been released.

Note that the close_sockets() operation cannot be moved to the destroy handler, because pj_grp_lock_dec_ref() needs to be called from pj_activesock_close(). In addition, ssock->asock cannot be set to NULL afterwards, since that would trigger the asock == NULL assertion.
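
The idea of deferring cleanup to a destroy handler can be illustrated with a minimal, self-contained sketch. This is not the pjlib code or the actual patch: ref_group, demo_sock and every function below are hypothetical stand-ins, and a plain C11 atomic counter takes the place of the group lock. It only shows the ordering: closing marks the socket and drops one reference, while the memory is released by the handler when the last holder lets go.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a group lock: a reference count plus a
 * destroy handler that runs when the last reference is dropped. */
typedef struct ref_group {
    atomic_int ref_cnt;
    void (*on_destroy)(void *member);
    void *member;
} ref_group;

/* Hypothetical stand-in for the SSL socket object. */
typedef struct demo_sock {
    ref_group grp;
    atomic_int closing;   /* set once shutdown has begun */
    int *destroyed_flag;  /* lets the caller observe when cleanup ran */
} demo_sock;

static void grp_add_ref(ref_group *g)
{
    atomic_fetch_add(&g->ref_cnt, 1);
}

/* Returns 1 if this call dropped the last reference and ran the handler. */
static int grp_dec_ref(ref_group *g)
{
    if (atomic_fetch_sub(&g->ref_cnt, 1) == 1) {
        g->on_destroy(g->member);
        return 1;
    }
    return 0;
}

/* All cleanup lives here, so it cannot run while a reference is held. */
static void demo_sock_on_destroy(void *member)
{
    demo_sock *s = member;
    *s->destroyed_flag = 1;
    free(s);
}

/* Creates a socket holding one reference owned by the creator. */
static demo_sock *demo_sock_create(int *destroyed_flag)
{
    demo_sock *s = malloc(sizeof(*s));
    if (!s)
        return NULL;
    atomic_init(&s->grp.ref_cnt, 1);
    atomic_init(&s->closing, 0);
    s->grp.on_destroy = demo_sock_on_destroy;
    s->grp.member = s;
    s->destroyed_flag = destroyed_flag;
    return s;
}

/* Error path: mark the socket as closing and drop the creator's
 * reference. The object is freed here only if nobody else holds one. */
static int demo_sock_close(demo_sock *s)
{
    atomic_store(&s->closing, 1);
    return grp_dec_ref(&s->grp);
}

/* Send path: the caller must hold a reference for the whole call. */
static int demo_sock_send(demo_sock *s)
{
    if (atomic_load(&s->closing))
        return -1;  /* refuse to write, but the memory is still valid */
    return 0;
}
```

In this sketch a sender that took its own reference before the close can still safely call demo_sock_send(); it gets an error instead of touching freed memory, and the object is released when the sender drops its reference.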

Thanks to Alexei Gradinari for the report and original patch.

Change History (3)

comment:1 Changed 2 years ago by riza

In 5459:

Re #1969: Fix crash on using an already destroyed SSL socket.

comment:2 Changed 2 years ago by riza

  • Description modified (diff)

comment:3 Changed 2 years ago by riza

  • Resolution set to fixed
  • Status changed from new to closed