Opened 12 years ago

Last modified 11 years ago

#1573 closed defect

Deadlock in SSL socket (thanks to Jeff Anderson for detailed report) — at Version 1

Reported by: ming Owned by: bennylp
Priority: normal Milestone: release-2.1
Component: pjlib Version: trunk
Keywords: Cc:
Backport to 1.x milestone: release-1.16 Backported: yes

Description (last modified by ming)

Thread 1 is reading data from the SSL socket with this sequence of events:

  1. ioqueue_dispatch_read_event() in ioqueue_common_abs.c locks the ioqueue key’s mutex and calls on_read_complete()
    /* Lock the key. */
    pj_mutex_lock(h->mutex);
    …
    …
    if (h->cb.on_read_complete && !IS_CLOSING(h)) {
        (*h->cb.on_read_complete)…
    }
    
  2. on_data_read calls asock_on_data_read() in ssl_sock_ossl.c. Here it locks the socket’s write_mutex:
    pj_lock_acquire(ssock->write_mutex);
    size_ = SSL_read(ssock->ossl_ssl, data_, size_);
    pj_lock_release(ssock->write_mutex);
    

Thread 2 is writing data to the SSL socket with this sequence of events:

  1. pj_ssl_sock_send() is called which acquires the lock to the socket’s write mutex:
    pj_lock_acquire(ssock->write_mutex);
    
  2. Eventually pj_ioqueue_send() is called from ioqueue_common_abs.c which locks the ioqueue key’s mutex:
    pj_mutex_lock(key->mutex);
    

So in thread 1 we lock the ioqueue key mutex followed by the socket write_mutex. In thread 2 we lock the write_mutex followed by the ioqueue key mutex.

Another deadlock scenario:
It’s the same problem with one thread locking the ioqueue key’s mutex and the ssock.write_mutex in opposite orders:

Thread 1:

  1. In pj_ssl_sock_send() ssl_write is called and the ssl sock’s write_mutex is locked before the call:
    pj_lock_acquire(ssock->write_mutex);
    …
    …
    status = ssl_write(ssock, send_key, data, *size, flags);
    
  2. Call to ssl_write eventually leads to pj_ioqueue_send() being called and the ioqueue key’s mutex is locked (and deadlocked here):
    pj_mutex_lock(key->mutex);
    

Thread 2:

  1. A write event leads to ioqueue_dispatch_write_event being called which locks the ioqueue key’s mutex and calls the write callback:
    pj_mutex_lock(h->mutex);
    …
    if (h->cb.on_write_complete && !IS_CLOSING(h)) {
        (*h->cb.on_write_complete)(h, (pj_ioqueue_op_key_t*)write_op,
        write_op->written);
    }
    
  2. asock_on_data_sent is called which locks the ssl sock’s write_mutex, and here we deadlock on the acquire:
    /* Update write buffer state */
    pj_lock_acquire(ssock->write_mutex);
    

Note that SSL sock creates activesock with whole_data setting set as PJ_TRUE, so this causes the concurrency to be disabled in activesock.c line 219:

    if (asock->whole_data) {
	/* Must disable concurrency otherwise there is a race condition*/
	pj_ioqueue_set_concurrency(asock->key, 0);
    }

Change History (1)

comment:1 Changed 12 years ago by ming

  • Description modified (diff)
Note: See TracTickets for help on using tickets.