= Using APS-Direct and VAS-Direct in PJMEDIA =

'''Table of Contents'''

[[PageOutline(2-3,,inline)]]

'''APS-Direct''' and '''VAS-Direct''' are our codenames for the functionality that uses the hardware codecs supported by sound devices (e.g. Nokia APS and/or VAS) directly, bypassing media processing in PJMEDIA. These features will be introduced gradually beginning in PJSIP version 1.1.

APS stands for '''Audio Proxy Server''', and it is available as a plug-in for Nokia S60 3rd Edition up to Feature Pack 2. It has been deprecated for FP2 devices and above, and is being replaced by '''VoIP Audio Services (VAS)''', which is available as a plug-in for S60 FP1/FP2 devices and will be built into later S60 versions.

[[BR]]

== Introduction ==

The Nokia APS and VAS support codecs such as G.711 (PCMA and PCMU), G.729, iLBC, and AMR-NB, though the availability of these codecs may vary by handset type. Using these codecs instead of the software codecs (in PJMEDIA-CODEC) has significant benefits, chiefly performance (hardware vs software codecs, latency) and codec licenses/royalties. Because of these benefits, the ability to use these codecs in PJSIP applications is very desirable.

Note that non-APS codecs can still be used as usual, e.g. GSM, Speex/8000.

[[BR]]

== Concepts ==

Before starting to work with APS-Direct, please make sure you understand the concepts behind it so that you can design the application appropriately.

The whole point of APS-Direct is to enable an end-to-end '''encoded audio format media flow''', that is, from the microphone device down to the network/socket and from the network/socket to the speaker device. This may sound obvious, but it has the following serious implications which will affect your application design.

=== What APS-Direct is really ===

To use APS-Direct means that you are opening the sound device in codec (i.e. non-PCM) mode.
You '''still have the choice''', at run-time, to open the sound device in PCM mode, for example to make use of the PCM features in PJMEDIA such as the tone generator, or to use the software codecs such as Speex or GSM in PJMEDIA. Note that if you use PJSUA-LIB, closing and re-opening the sound device with the correct codec is done by PJSUA-LIB automatically.

Using APS-Direct also means that you must select the [#switchboard audio switchboard] at compile time (the audio switchboard is explained later). This means that you lose the capability of mixing audio with PJSUA-LIB, and several other restrictions apply to your audio routing arrangements. Among other things, you can't have two calls active and connected to the audio device at the same time. You can have more than one call, but only one can be active (we recommend putting the other call on hold in this case).

The sound device can only handle one format at a time. For example, if it is currently opened with G.729 format, you can't reconnect it to any media port working with a different format (e.g. a stream or other pjmedia_port set to work with PCM). You must first close it, then re-open it using the same format as the target port's format. Note that if you are using PJSUA-LIB, this is handled automatically (i.e. PJSUA-LIB will close/reopen the sound device with the correct format depending on what is connected to port zero of the [#switchboard audio switchboard]). You are still limited to connecting only one media port to the sound device at a time (e.g. you cannot hear the call and play a ringback tone to the sound device simultaneously).

=== When APS-Direct is activated ===

APS-Direct is used when a passthrough codec is used (and vice versa):
 - when APS-Direct is used, the sound device emits and takes encoded audio frames (as opposed to PCM audio frames), and the stream needs a special codec which handles this. This special codec is called a passthrough codec, since it only does (de)packetization and not the actual encoding/decoding.
 - similarly, when a passthrough codec is selected in the stream, the stream will emit and take encoded audio frames (rather than PCM frames), hence it needs APS-Direct on the other side.

One important thing to note: '''you may still use software codecs such as Speex and GSM even when your application is compiled with APS-Direct support'''. When one of these software codecs is selected by the stream, the stream works as usual (i.e. emitting and taking PCM audio frames), so the audio device '''must''' be opened in normal/PCM mode (i.e. non-APS-Direct mode). If you are using PJSUA-LIB, then again this is handled automatically.

[[BR]]

== Using APS-Direct or VAS-Direct ==

Currently only APS-Direct is implemented, and here are the steps to build the application with the APS-Direct feature.

'''Update:''' VAS-Direct is now implemented (currently it is only available in SVN trunk; it will be in 1.4 and newer).

 1. Enable the appropriate sound device implementation. Currently the [wiki:APS APS], [wiki:VAS VAS], and native WMME backends support the APS-Direct feature.
 1. Enable the audio switchboard, i.e. in config_site.h:
    {{{
#define PJMEDIA_CONF_USE_SWITCH_BOARD 1
    }}}
 1. Selectively enable/disable which software codecs are supported, for example to disable all software codecs:
    {{{
#define PJMEDIA_HAS_G711_CODEC 0
#define PJMEDIA_HAS_L16_CODEC 0
#define PJMEDIA_HAS_GSM_CODEC 0
#define PJMEDIA_HAS_SPEEX_CODEC 0
#define PJMEDIA_HAS_ILBC_CODEC 0
#define PJMEDIA_HAS_G722_CODEC 0
#define PJMEDIA_HAS_INTEL_IPP 0
    }}}
 1. Enable passthrough codecs, and selectively enable/disable which passthrough codecs are supported.
    The passthrough codecs available depend on which codecs are supported by the sound device backend that you choose to use:
    {{{
#define PJMEDIA_HAS_PASSTHROUGH_CODECS 1

// Disable all passthrough codecs except PCMA and PCMU
#define PJMEDIA_HAS_PASSTHROUGH_CODEC_PCMU 1
#define PJMEDIA_HAS_PASSTHROUGH_CODEC_PCMA 1
#define PJMEDIA_HAS_PASSTHROUGH_CODEC_AMR 0
#define PJMEDIA_HAS_PASSTHROUGH_CODEC_G729 0
#define PJMEDIA_HAS_PASSTHROUGH_CODEC_ILBC 0
    }}}
    The following table shows which formats/codecs are supported by the various sound device backends:
    {{{
Format     APS  VAS  WMME  Symbian-MMF  PortAudio
-------------------------------------------------
linear      1    1    1        1            1
PCMA        1    1    1
PCMU        1    1    1
AMR-NB      1    1
G.729       1    1
iLBC        1    1
    }}}
    Notes:
     - VAS is to be supported in a future release ('''Update:''' it is now supported in SVN trunk).
     - the WMME backend refers to the native wmme_dev.c implementation and not the WMME implementation from !PortAudio. We only use this WMME implementation to test the APS-Direct framework on PC.
 1. If you are using PJSUA-LIB, that is essentially all that is needed to make use of APS-Direct. Please note that the application logic must ensure that there is only one source transmitting to any destination in the switchboard.

[[BR]]

== Changes ==

The use of APS-Direct and VAS-Direct is very different from traditional PJMEDIA media processing, the main difference being that the audio frames returned by/given to the sound device are now in encoded format rather than in raw PCM format. The following changes have been made in order to support this.

=== Support for non-PCM format ===

Media ports may now support non-PCM media, and this is signaled by a new "format" field in the {{{pjmedia_port_info}}}.

{{{
typedef enum pjmedia_format_id
{
    PJMEDIA_FORMAT_PCM,
    PJMEDIA_FORMAT_ALAW,
    PJMEDIA_FORMAT_ULAW,
    PJMEDIA_FORMAT_G729,
    PJMEDIA_FORMAT_AMR_NB,
    ..
} pjmedia_format_id;

/** Media format information. */
typedef struct pjmedia_format
{
    pjmedia_format_id id;
    pj_uint32_t       bitrate;
    pj_bool_t         vad;
} pjmedia_format;
}}}

We also need to support passing non-PCM frames around in PJMEDIA. We added support for a new frame type (in {{{enum pjmedia_frame_type}}}): '''{{{PJMEDIA_FRAME_TYPE_EXTENDED}}}'''. When the frame's type is set to this type, the {{{pjmedia_frame}}} structure can be cast to the (new) '''{{{pjmedia_frame_ext}}}''' struct:

{{{
typedef struct pjmedia_frame_ext {
    pjmedia_frame base;          /**< Base frame info */
    pj_uint16_t   samples_cnt;   /**< Number of samples in this frame */
    pj_uint16_t   subframe_cnt;  /**< Number of (sub)frames in this frame */

    /* Zero or more (sub)frames follow immediately after this,
     * each represented by pjmedia_frame_ext_subframe
     */
} pjmedia_frame_ext;

typedef struct pjmedia_frame_ext_subframe {
    pj_uint16_t bitlen;   /**< Number of bits in the data */
    pj_uint8_t  data[1];  /**< The encoded data */
} pjmedia_frame_ext_subframe;
}}}

The stream must also support non-PCM audio frames in its {{{get_frame()}}} and {{{put_frame()}}} port interface. The stream will set the correct format in its '''pjmedia_port_info''' structure depending on the codec being used (i.e. if a passthrough codec is being used, the format will contain non-PCM format information).

=== Passthrough Codecs ===

While the actual codec encoding/decoding is done by the sound device, "dummy" codec instances still need to be created in PJMEDIA because:
 - PJMEDIA needs the list of supported codecs to be offered/negotiated in SDP,
 - some codecs have special framing requirements which are not handled by the hardware codecs, for example the framing rules of AMR codecs ([http://tools.ietf.org/html/rfc3267 RFC 3267]).

Passthrough codecs will be implemented for: PCMA, PCMU, iLBC, G.729, and AMR-NB.

=== Sound Device API ===

A '''new''' [wiki:Audio_Dev_API Audio Device API] has been introduced. Please see the link for more information.
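
The extended frame layout described under "Support for non-PCM format" (a small header followed by zero or more variable-length subframes) can be illustrated with the following sketch. It is not PJMEDIA code: the {{{demo_}}} names are hypothetical stand-ins, the {{{pjmedia_frame}}} base member is left out, and the sketch assumes that subframes are packed back to back as a 16-bit bit length followed by the payload rounded up to whole bytes. Check the PJMEDIA headers for the actual types and helpers before relying on any of this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the extended frame header (base frame omitted). */
typedef struct demo_frame_ext {
    uint16_t samples_cnt;   /* number of samples in this frame */
    uint16_t subframe_cnt;  /* number of subframes that follow the header */
} demo_frame_ext;

/* Number of bytes needed to hold `bitlen` bits, rounded up. */
static size_t demo_bits_to_bytes(uint16_t bitlen)
{
    return ((size_t)bitlen + 7u) / 8u;
}

/* Write an empty header into buf. Returns bytes used, or 0 if buf is too small. */
static size_t demo_frame_init(uint8_t *buf, size_t cap)
{
    demo_frame_ext hdr = { 0, 0 };

    if (cap < sizeof(hdr))
        return 0;
    memcpy(buf, &hdr, sizeof(hdr));
    return sizeof(hdr);
}

/* Append one subframe after the `used` bytes already in buf.
 * Returns the new number of bytes used, or 0 if there is no room. */
static size_t demo_append_subframe(uint8_t *buf, size_t cap, size_t used,
                                   const uint8_t *data, uint16_t bitlen)
{
    demo_frame_ext hdr;
    size_t nbytes = demo_bits_to_bytes(bitlen);

    assert(used >= sizeof(hdr));    /* the header must already be present */
    if (cap < used || cap - used < sizeof(bitlen) + nbytes)
        return 0;

    /* memcpy avoids unaligned access to the 16-bit length field */
    memcpy(buf + used, &bitlen, sizeof(bitlen));
    memcpy(buf + used + sizeof(bitlen), data, nbytes);

    memcpy(&hdr, buf, sizeof(hdr));
    hdr.subframe_cnt++;
    memcpy(buf, &hdr, sizeof(hdr));

    return used + sizeof(bitlen) + nbytes;
}

/* Find subframe n by skipping over the n subframes before it.
 * Returns a pointer to its payload and stores its bit length in *bitlen,
 * or returns NULL if n is out of range. */
static const uint8_t *demo_get_subframe(const uint8_t *buf, unsigned n,
                                        uint16_t *bitlen)
{
    demo_frame_ext hdr;
    const uint8_t *p = buf + sizeof(hdr);
    uint16_t len;
    unsigned i;

    memcpy(&hdr, buf, sizeof(hdr));
    if (n >= hdr.subframe_cnt)
        return NULL;

    for (i = 0; i < n; ++i) {
        memcpy(&len, p, sizeof(len));
        p += sizeof(len) + demo_bits_to_bytes(len);
    }

    memcpy(&len, p, sizeof(len));
    *bitlen = len;
    return p + sizeof(len);
}
```

Because each subframe carries its own bit length, reaching subframe ''n'' means walking past the ones before it. The sketch accepts that cost on the assumption that a frame holds only a handful of subframes.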

=== New Audio Switchboard (the non-mixing conference bridge) === #switchboard

Since audio frames are forwarded back and forth in encoded format, the traditional conference bridge would not be able to handle them. A new object will be added, which we call the audio switchboard ({{{conf_switch.c}}}), and its API will be compatible with the existing conference bridge API to maintain compatibility with existing applications/PJSUA-LIB.

Understandably, some conference bridge features will not be available:
 - audio mixing feature (no conferencing feature),
 - audio level adjustment and query (only when the port is using non-PCM format),
 - passive ports.

Some of the features of the switchboard:
 - uses the conference bridge API to control it (so it is compile-time compatible),
 - one source may transmit to more than one destination (though obviously one destination cannot take more than one source, since the switchboard cannot mix audio). This is useful, for example, to implement a call recording feature in the application.
 - it is optimized for low latency to the sound device (no ''delaybuf'' buffering for the microphone),
 - much more lightweight (footprint and performance),
 - supports routing audio from ports with different ''ptime'' settings.

[[BR]]

== References ==

Internal documentation:
 1. [wiki:APS Using APS in PJSIP]
 1. [wiki:VAS Using VAS in PJSIP]
 1. '''New''' [wiki:Audio_Dev_API Audio Device API]
 1. [wiki:FAQ#symbian Known problems of PJSIP with APS and Symbian target in general]
 1. [http://www.pjsip.org/pjmedia/docs/html/index.htm PJMEDIA and PJMEDIA-CODEC Documentation]

External links:
 1. [http://www.forum.nokia.com/info/sw.nokia.com/id/baae1f23-214a-42e0-96fc-aca65d86bcee/Audio_Proxy_Server_and_VoIP_Audio_Services_v1_0_en.pdf.html Audio Proxy Server and VoIP Audio Services] (PDF presentation)
 1. [http://wiki.forum.nokia.com/index.php/Audio_Proxy_Server Audio Proxy Server] - Forum Nokia
 1. [http://wiki.forum.nokia.com/index.php/VoIP_Audio_Service_API Nokia VoIP Audio Service API] - Forum Nokia