Opened 12 years ago

Last modified 11 years ago

#1573 closed defect

Deadlock in SSL socket (thanks to Jeff Anderson for detailed report) — at Initial Version

Reported by: ming Owned by: bennylp
Priority: normal Milestone: release-2.1
Component: pjlib Version: trunk
Keywords: Cc:
Backport to 1.x milestone: release-1.16 Backported: yes

Description

Thread 1 is reading data from the SSL socket with this sequence of events:

  1. ioqueue_dispatch_read_event() in ioqueue_common_abs.c locks the ioqueue key’s mutex and calls on_read_complete()

/* Lock the key. */
pj_mutex_lock(h->mutex);


if (h->cb.on_read_complete && !IS_CLOSING(h)) {

(*h->cb.on_read_complete)…

}

  1. on_data_read calls asock_on_data_read() in ssl_sock_ossl.c. Here it locks the socket’s write_mutex:

pj_lock_acquire(ssock->write_mutex);
size_ = SSL_read(ssock->ossl_ssl, data_, size_);
pj_lock_release(ssock->write_mutex);

Thread 2 is writing data to the SSL socket with this sequence of events:

  1. pj_ssl_sock_send() is called which acquires the lock to the socket’s write mutex:

pj_lock_acquire(ssock->write_mutex);

  1. Eventually pj_ioqueue_send() is called from ioqueue_common_abs.c which locks the ioqueue key’s mutex:

pj_mutex_lock(key->mutex);

So in thread 1 we lock the ioqueue key mutex followed by the socket write_mutex. In thread 2 we lock the write_mutex followed by the ioqueue key mutex.

Another deadlock scenario:
It’s the same problem with one thread locking the ioqueue key’s mutex and the ssock.write_mutex in opposite orders:

Thread 1:

  1. In pj_ssl_sock_send() ssl_write is called and the ssl sock’s write_mutex is locked before the call:

pj_lock_acquire(ssock->write_mutex);


status = ssl_write(ssock, send_key, data, *size, flags);

  1. Call to ssl_write eventually leads to pj_ioqueue_send() being called and the ioqueue key’s mutex is locked (and deadlocked here):

pj_mutex_lock(key->mutex);

Thread 2:

  1. A write event leads to ioqueue_dispatch_write_event being called which locks the ioqueue key’s mutex and calls the write callback:

pj_mutex_lock(h->mutex);

if (h->cb.on_write_complete && !IS_CLOSING(h)) {

(*h->cb.on_write_complete)(h, (pj_ioqueue_op_key_t*)write_op,
write_op->written);

}

  1. asock_on_data_sent is called which locks the ssl sock’s write_mutex, and here we deadlock on the acquire:

/* Update write buffer state */
pj_lock_acquire(ssock->write_mutex);


Note that SSL sock creates activesock with whole_data setting set as PJ_TRUE, so this causes the concurrency to be disabled in activesock.c #219:

if (asock->whole_data) {

/* Must disable concurrency otherwise there is a race condition*/
pj_ioqueue_set_concurrency(asock->key, 0);

Change History (0)

Note: See TracTickets for help on using tickets.