Ticket #1974 (closed defect: fixed)
Various fixes for DNS, primarily for IPv6
|Reported by:||ming||Owned by:||bennylp|
|Backport to 1.x milestone:||Backported:||no|
Description (last modified by riza) (diff)
- There is a race condition introduced with the IPv6 DNS patch.
*) Thread A sending a SIP request that needs DNS resolution.
*) Thread B handling DNS query responses in pjsip_endpt_handle_events().
The call to pjsip_resolve() happens under Thread A which sets a dummy query:
query->object6 = (pj_dns_async_query*)0x1;and then calls pj_dns_resolver_start_query() with the dns_a_callback().
However, the dummy query protection is undone because the first thing that pj_dns_resolver_start_query() with the dns_a_callback() does is to set query->object6 to NULL when it is called.
Then in thread B, the dns_a_callback() sees that there isn't an IPv6 query and attempts to send the request immediately even if there may not be any A records in the response. The pending dns_aaaa_callback() can now attempt to use resources that may be destroyed.
- In pjsip_resolve() the pj_dns_resolver_start_query() for the AAAA records should be guarded by a status == PJ_SUCCESS check in case the A records query start fails.
- In both dns_a_callback() & dns_aaaa_callback():
if (srv->count > 0) (*query->cb)(PJ_SUCCESS, query->token, &query->server); else (*query->cb)(query->last_error, query->token, NULL);However, query->last_error is never set, thus in sip_util.c stateless_send_resolver_callback() it will hit the assertion:
pj_assert(tdata->dest_info.addr.count != 0);If assertion in disabled, later in stateless_send_transport_cb() the code line:
cont = (sent > 0) ? PJ_FALSE : (tdata->dest_info.cur_addr<tdata->dest_info.addr.count-1);can result in an underflow.
- Fixed bug of premature app callback invocation for SRV resolver by applying the same method in r5369.
- Remove DNS cache entry from resolver's hash table when app callback has a reference.
In resolver.c update_res_cache(), if app callback has a reference, we only decrement the reference count of the cache entry. However, since it's not removed from the hash table, later the call to pj_hash_set_np() will overwrite this old entry (instead of using the newly allocated cache->hbuf), which will later be freed.
*) A previous cached A record DNS resolution that contains either no answer records or had an error returned in the query response (PJ_STATUS_FROM_DNS_RCODE() is not PJ_SUCCESS).
*) The next attempt to send a SIP request under Thread A will cause pj_dns_resolver_start_query() to find the cached entry and immediately call the cb callback. When the callback returns pj_dns_resolver_start_query() cannot update *p_query because where it points to may no longer be valid because the memory could be freed.
Fixed in r5477
Thank you to Richard Mudgett (Digium) for the report.
- Description modified (diff)
- Summary changed from Various fixes for DNS IPv6 to Various fixes for DNS, primarily for IPv6
Note: See TracTickets for help on using tickets.