Chapter 2 Session Initiation Protocol (SIP)

According to the definition in RFC 3261[72], the Session Initiation Protocol (SIP) is “an application-layer control (signaling) protocol for creating, modifying, and terminating sessions with one or more participants. These sessions include Internet telephone calls, multimedia distribution, and multimedia conferences.”

SIP is an application-layer protocol: this means that all the logic is inside the endpoints, requiring no modifications to the already consolidated lower-level protocols.

SIP is a session signaling protocol: it aims at managing sessions, intended as active connections between two or more endpoints.

What makes this protocol a convenient choice is that it is simple, extensible and an open standard, thus allowing easy interoperability. Its simplicity derives from HTTP, with which it shares its design principles: it is human readable and based on request/response message exchanges.

SIP signaling is also described as easy to implement and parse. This was definitely true at the beginning of its history; however, its complexity has grown over time, due to the various extensions that have been added to it.

[...] University) and Mark Handley (UCL) starting in 1996. It was developed inside the IETF¹ and the first proposed standard version (SIP 2.0) was defined in RFC 2543[39] (1999). In November 2000, SIP was accepted as a 3GPP² signaling protocol and permanent element of the IMS architecture. The protocol was further clarified in RFC 3261[72] (2002). This is the reference document for SIP, even though many convenient features are described in later RFCs and drafts.

INVITE sip:bob@biloxi.com SIP/2.0
Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds
Max-Forwards: 70
To: Bob <sip:bob@biloxi.com>
From: Alice <sip:alice@atlanta.com>;tag=1928301774
Call-ID: a84b4c76e66710@pc33.atlanta.com
CSeq: 314159 INVITE
Contact: <sip:alice@pc33.atlanta.com>
Content-Type: application/sdp
Content-Length: 142

Figure 2.1: Example of SIP message
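Since SIP messages are plain text, a message like the one in figure 2.1 can be split into its parts with very little code. The following is only a minimal sketch: the function name `parse_sip_request` and the address `bob@example.com` are illustrative assumptions, and header line folding and compact header names are deliberately ignored.

```python
def parse_sip_request(raw: str) -> dict:
    """Split a SIP request into start-line fields, headers and body."""
    # Headers and body are separated by an empty line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, request_uri, version = lines[0].split(" ", 2)
    headers = []  # a list, because some headers (e.g. Via) may repeat
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return {
        "method": method,
        "request_uri": request_uri,
        "version": version,
        "headers": headers,
        "body": body,
    }


# Hypothetical message, shortened for illustration.
EXAMPLE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
msg = parse_sip_request(EXAMPLE)
print(msg["method"], msg["request_uri"])
```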

2.1 Purpose of SIP

SIP is not just a protocol for VoIP. As its name suggests, its aim is to manage sessions through signaling. As already stated, a session is an active application-level connection between two or more endpoints. The purpose and meaning of this connection depend on what flows inside it: examples are a chat session, a gaming session, or a shared blackboard.

SIP supports five aspects of establishing and terminating multimedia communications:

1. user location: determination of the end system to be used for communication;

2. user availability: determination of the willingness of the called party to engage in communications;

3. user capabilities: determination of the media and media parameters to be used;

4. session setup: establishment of session parameters at both called and calling party;

5. session management: including transfer and termination of sessions, modifying session parameters, and invoking services.

¹ Internet Engineering Task Force

² 3rd Generation Partnership Project: a co-operation that aims to produce a globally applicable third generation (3G) mobile phone system specification within the scope of the ITU’s IMT-2000 project

SIP dictates no protocol to be used inside a session. However, the most common use (and the one considered in this document) is to describe audio and video sessions: SIP packets convey SDP (Session Description Protocol, [38]) payloads, which describe RTP (see 1.5.1) and RTCP (see 1.5.2) flows.

This design is typical of the IETF, which gives top priority to the reuse of already defined protocols; this is also a major difference between SIP and H.323, which was instead designed by the ITU.

2.2 Transport Protocol

SIP is independent of the underlying transport protocol: it can run on UDP, TCP, TLS or SCTP³, but it is not restricted to them. Because it makes no assumption about the transport protocol, SIP implements its own retransmission mechanism to recover from lost packets. IANA-assigned port numbers are 5060 for UDP, TCP and SCTP, and 5061 for TCP over TLS.

In most cases, however, UDP is desirable. Being a connectionless protocol, it does not pay the cost of TCP connection establishment and needs no persistent state to be maintained inside network nodes; this results in faster response times, allows deploying stateless SIP proxies and reduces the load on intermediate network elements (in particular NAT devices).

³ SCTP (Stream Control Transmission Protocol)[81] is a reliable transport protocol operating on top of a connectionless packet network such as IP.

Nevertheless, for the cases where TCP is needed (e.g. TLS), an extension has been proposed[46] to allow reuse of already established connections.

It is worth noting that the recent RFC 4347[60] defines a protocol called Datagram Transport Layer Security (DTLS), based on TLS but applying to UDP, which provides communication privacy also for datagram traffic.

2.3 SIP resources

Resources that can be addressed with SIP are of different types. Examples include the following:

• a user of an on-line service;

• an appearance on a multi-line phone;

• a mailbox on a messaging system;

• a PSTN number at a gateway service;

• a group (such as “sales” or “helpdesk”) in an organization.

Such communication resources are identified by a URI (Uniform Resource Identifier), defined in RFC 2396[6]. The general form of a SIP URI is:

sip:user:password@host:port;uri-parameters?headers

The user part identifies the particular resource at host, which is usually a fully qualified domain name, but can also be an IP address (a port can be specified). Additional parameters can be included, separated by a semicolon, in the form:

parameter-name "=" parameter-value

An example of a common URI parameter is transport=udp. Typical SIP URIs can be:

sip:[email protected]
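The general form given above can be decomposed mechanically. The sketch below is a simplified illustration rather than a full RFC-compliant parser: the function name `parse_sip_uri` and the example addresses are assumptions, escaped characters are not decoded, and bracketed IPv6 hosts are not handled.

```python
def parse_sip_uri(uri: str) -> dict:
    """Decompose sip:user:password@host:port;uri-parameters?headers."""
    scheme, _, rest = uri.partition(":")
    rest, _, headers = rest.partition("?")
    hostpart, *params = rest.split(";")
    # Everything before the last "@" is the optional user information.
    userinfo, _, hostport = hostpart.rpartition("@")
    user, _, password = userinfo.partition(":")
    host, _, port = hostport.partition(":")
    return {
        "scheme": scheme,
        "user": user or None,
        "password": password or None,
        "host": host,
        "port": int(port) if port else None,
        # Each parameter has the form parameter-name "=" parameter-value.
        "params": dict(p.partition("=")[::2] for p in params),
        "headers": headers or None,
    }


full = parse_sip_uri("sip:alice:secret@example.com:5060;transport=udp?subject=hi")
short = parse_sip_uri("sip:example.org")
print(full["host"], full["params"])
```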

2.3.1 Address Of Record

At different times, a SIP user can be available on different devices at different IP addresses. For this reason, resources are advertised with a generic SIP URI called Address Of Record (AOR). Such a URI is usually easy to remember and points to a domain with a location service that can map it to another URI where the user might be available at that specific time. An AOR is frequently thought of as the “public address” of the user.

The association between the AOR and the real URI of the user is usually stored in the location service through a REGISTER request (see 2.7.1).

2.4 SIP architecture

SIP is designed as a layered protocol: each layer has some processing functions and is loosely coupled with the other layers. Those layers are, starting from the bottom one:

1. syntax and encoding: describes the structure of a SIP message, specified in augmented Backus-Naur Form;

2. transport layer: defines how a client sends requests and receives responses and how a server receives requests and sends responses over the network; all SIP elements contain a transport layer;

3. transaction user (TU): is aware of transactions (see 2.6) and creates/destroys them; all SIP entities except stateless proxies are TUs.

2.4.1 SIP entities

What differentiates each SIP entity from the others is the core, which is a Transaction User (except for the stateless proxy).

User Agent (UA)

User Agents (UAs) are endpoints that use the SIP protocol to find each other and to negotiate session characteristics. UAs can be physical devices (like desk phones, mobile phones, PDAs, etc.) or software applications (which run, for example, on a PC) that interact with a human user, but also services like a PSTN gateway, a voice message box, an IVR⁵ and so on.

Each UA acts as two different logical entities known as:

• User Agent Client (UAC): it creates request messages, uses the transaction state machinery to send them and waits for the response;

• User Agent Server (UAS): it accepts requests from a UAC and generates a response, which can accept, reject, or redirect the request.

Two particular kinds of SIP UA are:

• Back-to-back User Agent (B2BUA): it receives requests as a UAS and forwards them as a UAC; unlike a proxy server, it maintains dialog state and must participate in all requests sent on the dialogs it has established; it is typically used to hide sensitive data from inside a LAN (such as IP addresses) or to implement ALGs⁶;

• Gateway: its purpose is to interface SIP with another protocol (like H.323) or another infrastructure (like a PSTN).

Proxy Server

A proxy server receives SIP requests and forwards them on behalf of the requester. It is used primarily for routing purposes: its aim is to forward the message to another entity “closer” to the targeted user. A proxy often has access to the location database of its domain; in this case, it is able to contact a UA that has previously registered⁷.

Moreover, a proxy can be used to enforce some policy (user authentication, security checks, call barring, etc.).

There are three kinds of SIP proxies:

• call stateful proxy: it maintains the state of each dialog until the dialog is destroyed;

• transaction stateful proxy: it maintains the client and server transaction state machines; this is the most common type of SIP proxy server; it is able to handle retransmissions and redirections; moreover it is able to fork requests to multiple target UAs;

• stateless proxy: it is little more than a message forwarder and can be used to implement some form of load balancing or simple message rewriting; as it does not keep any state, it is the most scalable type of proxy.

⁵ Interactive Voice Response

⁶ Application Level Gateway

⁷ For more details on the user location process, see the INVITE method description in [...]

Redirect Server

It receives request messages and sends back a list of alternative URIs in a 3xx-class response (for example: 301 Moved Permanently or 302 Moved Temporarily). Those URIs can be used by the UAC to get closer to the target UAS. A redirect server can point the UAC to another proxy or directly to the target user, in case it has access to the location database.

It is used mainly to reduce the load of routing requests, pushing routing information back to the requester. It improves scalability and signaling path robustness.

Registrar

A registrar is the front-end to the user location service. This entity receives registration messages from a UA and stores a binding between the AOR of the user and its contact URI in the location database. The contact URI represents the actual user location (e.g. IP address and port).

2.5 SIP messages

Like HTTP, all SIP messages are either requests from a client to a server or responses to a request. The messages are formatted according to RFC 2822[61]. For all messages, the general format is:

1. start line (request or response line)

2. some header fields

3. an empty line

4. an optional message body

Each line must end with a carriage return-line feed (CRLF).

2.5.1 Requests

The first line of a SIP request is in the form:

METHOD Request-URI SIP/2.0

RFC 3261 defines six types (methods) of request (listed in table 2.1); however, it allows extensions to specify new methods.

Method     Description
INVITE     Indicates that a user or service is being invited to participate in a call session
ACK        Confirms that the client has received a final response to an INVITE request
BYE        Terminates a call and can be sent by either the caller or the callee
CANCEL     Cancels any pending searches but does not terminate a call that has already been accepted
OPTIONS    Queries the capabilities of servers
REGISTER   Registers the address listed in the To header field with a SIP server

Table 2.1: SIP methods

Request-URI is a generic URI that indicates the user or service to which this request is being addressed.

2.5.2 Responses

The first line of a SIP response is in the form:

SIP/2.0 Status-Code Reason-Phrase

RFC 3261 defines six classes of status codes (listed in table 2.2).

Status code   Description
1xx           Informational Responses
2xx           Successful Responses
3xx           Redirection Responses
4xx           Client Failure Responses
5xx           Server Failure Responses
6xx           Global Failure Responses

Table 2.2: SIP responses

2.5.3 Header fields

Among all the SIP headers that can follow the request line, To, From, Call-ID, CSeq, Max-Forwards and Via are mandatory. Table 2.3 is a review of the headers that are relevant for this work.

Header         Description
To             specifies the logical recipient of the request, usually the AOR of the user or resource that is the target of this request
From           indicates the logical identity of the initiator of the request, possibly the user’s AOR
Call-ID        is a unique identifier to group together a series of messages; it is generated by the UAC
CSeq           used to order the transactions and identify retransmitted messages; it consists of a sequence number and a method name
Contact        consists of a SIP URI that can be used to contact that specific instance of the User Agent for subsequent requests; it must be unique inside requests
Via            indicates the transport used for transmitting the request and identifies the address to which the response is to be sent; it is used to ensure that the response follows the reverse of the path of the request; see routing of SIP messages in section 2.6
Max-Forwards   maximum number of times a message can be forwarded: used to prevent infinite loops
Supported      list of tags identifying the extensions the sender supports
Require        list of tags identifying the extensions required for processing the request

Table 2.3: Major SIP headers

2.6 Message routing

2.6.1 Transactions

Interaction between SIP components takes place in a series of independent message exchanges, called transactions. A transaction starts with a request, may include some provisional responses (1xx) and ends with a final response. INVITE transactions also include a final ACK message from the client to the server, if the final response is not of class 2xx.

SIP requires that responses follow the reverse path of the request, i.e. they must traverse the same SIP network elements the request traversed, but in reverse order. To achieve this, at every hop of the request a Via header is added to the message before the previous ones; in this way, when the request arrives at the UAS, it contains a log of all the SIP network elements it traversed. When the UAS sends the response, it copies the Via lines from the request, preserving the order; then it sends the response to the address in the first line. At each hop, the top Via is popped from the list and the packet is sent to the next address.
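The Via mechanism described above behaves like a stack. The following sketch models it with plain lists, under simplifying assumptions: the function names and host names are hypothetical, each Via is reduced to a host name, and real Via parameters (branch, received, rport) are left out.

```python
def add_via(via_stack: list[str], hop: str) -> list[str]:
    """A hop that sends or forwards the request puts its own Via on top."""
    return [hop] + via_stack


def route_response(via_stack: list[str]) -> list[str]:
    """Return the hops visited by the response, in order.

    The UAS sends the response to the address in the top Via; each
    element then removes the top Via and forwards to the next address.
    """
    stack = list(via_stack)  # the UAS copies the Via lines, preserving order
    path = []
    while stack:
        path.append(stack.pop(0))
    return path


# Hypothetical request path: the UAC, then two proxies, then the UAS.
request_path = ["uac.example", "proxy-a.example", "proxy-b.example"]
vias: list[str] = []
for hop in request_path:
    vias = add_via(vias, hop)

response_path = route_response(vias)
print(response_path)
```

In this model the response path is simply the request path reversed, which is the property the Via header is meant to guarantee.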

2.6.2 Dialogs

Transactions may flow inside dialogs. According to RFC 3261, “a dialog represents a peer-to-peer SIP relationship between two user agents that persists for some time”.

[...] of messages with a particular session and proper routing of them between endpoints. Dialogs are established by dialog-creating requests, such as INVITE. The UAC adds a tag parameter to the From header field in the request message; on the other side, the UAS adds a tag parameter to the To header in the response message. Such a newly created dialog is uniquely identified by the Call-ID value and the two tags, which together form the dialog ID. Subsequent messages in this dialog will always carry these three values.

Each dialog has a state, which is composed of:

• dialog ID;

• local sequence number, used to order requests from the UA to its peer;

• remote sequence number, used to order requests from its peer to the UA;

• local URI, set on the basis of the From (for the UAC) or To (for the UAS) header fields;

• remote URI, set on the basis of the To (for the UAC) or From (for the UAS) header fields;

• remote target, set to the value of the remote Contact header;

• secure flag, set to true if the dialog-creating transaction was sent inside an encrypted connection;

• route set, an ordered set of URIs taken from the Record-Route header (explained below).
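As an illustration of how the dialog ID is derived, the sketch below extracts the Call-ID and the two tags from the headers of a dialog-creating response. It is only a sketch: the class and function names are assumptions, as are the header values, and the local/remote distinction follows the RFC 3261 convention that the local tag of the UAC is the From tag while the local tag of the UAS is the To tag.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DialogID:
    call_id: str
    local_tag: str
    remote_tag: str


def tag_of(header_value: str) -> str:
    """Return the tag parameter of a From/To header value, or ''."""
    match = re.search(r";tag=([^;\s]+)", header_value)
    return match.group(1) if match else ""


def dialog_id_for_uac(headers: dict) -> DialogID:
    # For the UAC, the From tag is local and the To tag is remote.
    return DialogID(headers["Call-ID"], tag_of(headers["From"]), tag_of(headers["To"]))


def dialog_id_for_uas(headers: dict) -> DialogID:
    # For the UAS, the roles of the two tags are swapped.
    return DialogID(headers["Call-ID"], tag_of(headers["To"]), tag_of(headers["From"]))


# Hypothetical headers of a 200 OK answering an INVITE.
ok_headers = {
    "Call-ID": "a84b4c76e66710",
    "From": "Alice <sip:alice@example.com>;tag=1928301774",
    "To": "Bob <sip:bob@example.org>;tag=a6c85cf",
}
uac_id = dialog_id_for_uac(ok_headers)
uas_id = dialog_id_for_uas(ok_headers)
print(uac_id)
```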

2.6.3 Direct connectivity

Usually a UAC does not know how to reach a UAS associated with a specific AOR. For this reason a UAC contacts a proxy server⁸ in order to deliver the dialog-creating request. Such a proxy has access to the database that stores the associations between AOR and actual URI, and forwards the request to the target UA.

After the message exchange of the dialog-creating transaction, both UAs know a valid transport address (from the Contact header) to reach their peer, so they should send subsequent messages directly, without traversing SIP proxies.

⁸ Procedures for locating SIP servers are described in RFC 3263[70].

Figure 2.2: SIP “trapezoid” (message flow among UA A, proxy A, proxy B and UA B: INVITE, 100 Trying, 180 Ringing and 200 OK traverse the proxies; ACK and media flow directly between the UAs)

Typically UAs are not able to perform the complex DNS queries defined in RFC 3263; for this reason, they rely on a default outbound proxy to locate the proxy of the target user. The resulting message flow is the so-called SIP “trapezoid” in figure 2.2. The messages of the first transaction flow through the proxies, while all subsequent messages, starting from the ACK, are delivered directly.

2.6.4 Record-Routing

On the contrary, a SIP proxy may want all signaling to pass through it; this could be for reasons of security or call logging/billing, but also in cases where direct connectivity between peers is not available (as described in section 2.9).

In this case, the SIP proxy can add itself to the top of the list in the Record-Route header in the request message of the dialog-creating transaction. This will force the UAS to store this list in the route set of the dialog state, and to copy the Record-Route header in the response message. Then, the UAC will store the same route set, but in reverse order.

When a UA needs to send a request inside a dialog and the route set is not empty, it must place the list of those hosts in the Route header, forcing the request message to traverse the specified hops.
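The construction of the two route sets can be sketched as follows. The function names and proxy URIs are hypothetical, and the model only tracks the ordered list of Record-Route entries.

```python
def add_record_route(record_route: list[str], proxy_uri: str) -> list[str]:
    """A proxy that wants to stay on the path adds itself on top of the list."""
    return [proxy_uri] + record_route


def route_sets(record_route: list[str]) -> tuple[list[str], list[str]]:
    """Return (uas_route_set, uac_route_set) for a Record-Route list.

    The UAS keeps the list in the order received; the UAC, which reads
    the same list from the response, stores it in reverse order.
    """
    return list(record_route), list(reversed(record_route))


# The request traverses p1 first, then p2 (hypothetical proxies).
rr: list[str] = []
for proxy in ["<sip:p1.example;lr>", "<sip:p2.example;lr>"]:
    rr = add_record_route(rr, proxy)

uas_routes, uac_routes = route_sets(rr)
print(uac_routes)  # hops for later requests sent by the UAC
```

With this ordering, each UA lists first the proxy that is closest to itself, so in-dialog requests from either side traverse the recorded proxies in the right sequence.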

2.7 SIP requests

2.7.1 REGISTER

The REGISTER method is used to store and fetch bindings in the location service for a particular domain. Those bindings associate an AOR URI with one or more contact addresses.

When a UA wants to register itself, it must contact the registrar SIP server, which has write access to the location database. The REGISTER request is formed by placing the AOR to be registered in the To header, the AOR of the registering UA in the From header and the contact URI in the Contact header.

Usually To and From URIs are the same, except in third-party registrations. Moreover, the UA may indicate how long the binding should be considered valid, specifying a duration in seconds in the Expires header or in the expires parameter of the Contact header. The default value for binding duration is 3600 seconds. To extend the validity of the binding, the UA should send another REGISTER request before the previous one expires.

In response to a valid REGISTER request the registrar should answer with 200 OK and include all the current bindings for that AOR, along with their updated expiration time. Note that this could result in multiple Contact lines in the response message.

Typical failure responses are:

• 401 Unauthorized: the registrar requires authentication;

• 403 Forbidden: the user is not authorized to modify that binding;

• 404 Not Found: the specified AOR is not valid for that domain.

A UA can also use a REGISTER request without the Contact header to just fetch the status of the bindings stored in the location database.
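A REGISTER request built according to the rules above could look like the output of the following sketch. Everything specific here is an assumption made for illustration: the function name, the domain and host names, the Call-ID, and the fixed branch and tag values, which a real UA would generate randomly.

```python
def build_register(aor: str, contact: str, registrar_domain: str,
                   via_host: str, call_id: str, cseq: int,
                   expires: int = 3600,
                   branch: str = "z9hG4bK-example",
                   from_tag: str = "example-tag") -> str:
    """Build a first-party REGISTER request (To and From carry the same AOR)."""
    lines = [
        f"REGISTER sip:{registrar_domain} SIP/2.0",
        f"Via: SIP/2.0/UDP {via_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <{aor}>",
        f"From: <{aor}>;tag={from_tag}",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} REGISTER",
        f"Contact: <{contact}>",
        f"Expires: {expires}",  # binding lifetime in seconds
        "Content-Length: 0",
    ]
    # Every line ends with CRLF; an empty line closes the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_register(
    aor="sip:bob@example.com",
    contact="sip:bob@192.0.2.4",
    registrar_domain="example.com",
    via_host="192.0.2.4",
    call_id="843817637684230@example",
    cseq=1,
)
print(request)
```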

2.7.2 INVITE, ACK and BYE

The INVITE method is used to establish a session between two UAs. This request initiates a dialog-creating transaction, and is routed according to section 2.6.

An INVITE request can receive the following common provisional answers:

• 100 Trying: from intermediate SIP nodes, in order to stop retransmissions of the INVITE;

• 180 Ringing: from the remote UA, to indicate that the user is being notified of the incoming call, but has not answered yet.

Provisional messages are not sent reliably. An extension has been defined in RFC 3262[69] in order to allow reliable delivery of provisional messages.

The transaction is ended with a final response. Among all the possible responses there are:

• 200 OK: the call has been accepted; the dialog has been created;

• 404 Not Found: the requested user does not exist;

• 486 Busy Here: the remote user is busy;

• 487 Request Terminated: the request was terminated by a BYE or CANCEL request.

The UAC acknowledges the reception of a final response by sending an ACK message to the UAS. Note that this message is a new transaction (even though without response) if the final response indicates success (2xx), otherwise it is part of the INVITE transaction.

A session is terminated by a BYE request. If there are no more dialog usages¹⁰ in that dialog, its state is deleted.

2.7.3 SUBSCRIBE and NOTIFY

In RFC 3265[63] a SIP extension is defined that allows a node to request notifications from remote nodes when certain events occur. For this purpose two new methods have been defined: SUBSCRIBE to request notifications, and NOTIFY to report a change of state. An example of the usage of this mechanism is presence subscription, defined in RFC 3856[64].

A SUBSCRIBE message is sent by a SIP node that wants to request the current state and state updates from a remote node. The duration of the subscription must be indicated in the Expires header. The event type the sending node is interested in is specified in the Event header.

When a node receives a subscription and wants to honor it, it replies with a 200 OK response (or 202 Accepted if it cannot activate the subscription immediately); then, it sends a NOTIFY message, containing the current state that was requested.

Each NOTIFY message contains a Subscription-State header that can assume one of the following values:

• active: the subscription is active;

• pending: the subscription has been accepted, but is not effective yet;

• terminated: the subscription has been deactivated; for example when the subscription expires, a NOTIFY with this state is sent, with the additional parameter reason=timeout.

¹⁰ Dialog usages are discussed in section 2.9.1.

Figure 2.3: SUBSCRIBE/NOTIFY flow (between UA A and UA B. Subscription creation: SUBSCRIBE with Expires: 3600, 200 OK, NOTIFY, 200 OK; state updates: NOTIFY, 200 OK; subscription termination: SUBSCRIBE with Expires: 0, 200 OK, NOTIFY, 200 OK)

2.8 Authentication

When terminals interact in an open network, one of the major issues is security. In fact, if no authentication procedures are defined, a user could provide a false identity.

Authentication is important for:

• [...] calls for another user;

• proxies and redirect servers: a provider may want to apply access control to its services;

• UAs: terminals want to be sure of the identity of their counterpart.

SIP defines a stateless, challenge-based authentication mechanism similar to the HTTP authentication described in RFC 2617[29].

In detail, when a registrar receives a request that it wants to authenticate, it responds with a 401 Unauthorized message, conveying the challenge in the WWW-Authenticate header field. Then the UAC re-issues the request, adding an Authorization header which contains the response to the challenge, giving proof of its identity.

UA A → registrar:  REGISTER
registrar → UA A:  401 Unauthorized
                   WWW-Authenticate: Digest realm="biloxi.com",
                     nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093"
UA A → registrar:  REGISTER
                   Authorization: Digest username="bob", realm="biloxi.com",
                     nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
                     response="6629fae49393a05397450978507c4ef1"
registrar → UA A:  200 OK

Figure 2.4: Authentication message flow

The same procedure applies to proxies, except that the authentication request message is 407 Proxy Authentication Required and the involved SIP headers are Proxy-Authenticate and Proxy-Authorization.

RFC 3261 recommends usage of the Digest Authentication Scheme instead of the Basic Authentication Scheme specified in RFC 2543. In fact, the latter consists in sending the credentials encoded in Base-64¹¹ form.

A client using the Digest Authentication Scheme, on the other hand, does not send its credentials over the network, but sends an MD5¹² hash computed from them together with the challenge provided by the server. In this way, the server, which also knows the credentials of the UA, can perform the same calculations and verify the response.
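The calculation performed by both sides can be sketched as follows. This assumes the simplest variant of the digest scheme in RFC 2617 (MD5, no qop parameter, so no client nonce or nonce count); the function names, the password and the request URI are illustrative assumptions.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Digest response without qop: MD5(HA1 ":" nonce ":" HA2)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # secret part
    ha2 = md5_hex(f"{method}:{uri}")                 # request part
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical credentials; the nonce would come from the 401 challenge.
nonce = "dcd98b7102dd2f0e8b11d0f600bfb0c093"
client = digest_response("bob", "biloxi.com", "secret",
                         "REGISTER", "sip:biloxi.com", nonce)
# The server knows the same credentials and repeats the calculation.
server = digest_response("bob", "biloxi.com", "secret",
                         "REGISTER", "sip:biloxi.com", nonce)
print(client == server)
```

Because the nonce is part of the hash input, a response captured for one challenge is of no use against a different challenge.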

2.9 SIP issues

Despite being widely used, SIP still suffers from some issues that do not yet have an adequate solution. These include some ambiguity about dialog usages and the lack of proper support for multi-homing (as for dual-stack IPv4/IPv6 nodes) and NAT traversal.

Those issues will be briefly introduced here; then, in chapter 6, they will be analyzed in depth and an effective solution will be proposed.

2.9.1 Dialog usages

In the first specifications of SIP, the only method that could create a dialog was INVITE. Later extensions added new dialog-creating requests, like SUBSCRIBE. However, when a dialog already exists between two endpoints, a dialog-creating transaction can be sent inside such dialog, leading to the creation of a so-called dialog usage instead of a new dialog.

Such behavior is convenient, as it can exploit the existing connection between the UAs and does not require traversing a SIP proxy again for user location purposes. However, it gives rise to ambiguous interpretations of the standard. In particular, it is not well defined how failure responses, timeouts and refresh messages on one dialog usage should affect the other usages.

IETF Internet draft [79] analyzes such ambiguous situations in depth and proposes a reasonable interpretation of the standard. However, the authors suggest avoiding the creation of multiple dialog usages as much as possible.

Nevertheless, in this work, in chapter 6, a solution will be proposed and analyzed which, as a side effect, also overcomes this problem by creating a new kind of dialog-wrapping association between endpoints.

¹¹ A format for encoding data in a non human-readable form, proposed in RFC 2045.

¹² Message Digest algorithm 5, defined in RFC 1321.

2.9.2 Multi-homed hosts

Multi-homed hosts are hosts that have more than one valid IP address to connect to the network. These IP addresses could be assigned to different physical network interfaces, for redundancy or because the host is acting as a gateway between two or more LANs. Multiple IP addresses could also derive from IP aliasing or connection to a VPN. One particular case of multi-homed host is a dual-stack IPv4/IPv6 node.

SIP has no support for multi-homed UAs. In fact, even though multiple URIs can be registered in association with the same AOR, only one Contact header can be inserted in a request message.

IPv6 support

Even though SIP claims to support IPv6, it does not actually support dual-stack IPv4/IPv6 hosts in the best way.

RFC 3484[27] proposes an algorithm to select the best IPv6/IPv4 address to use when contacting a proxy, but SIP still allows only one Contact header in request messages; in this way, a UA can advertise either its IPv4 or its IPv6 address, without prior knowledge of whether the remote party supports that IP version or whether communication using that address will be possible.

A later draft [17] specifies how UAs and proxies should behave in a mixed IPv4/IPv6 environment. However, it suggests having a proxy which performs the necessary address rewriting and Record-Routes all the requests; this solution loses end-to-end connectivity between UAs, increases the load on the proxy and reduces the robustness of the communication (the failure of that proxy would compromise the dialog).

2.9.3 NAT support

Network Address Translation (NAT, also called Masquerading) is the process of rewriting the IP source and/or destination addresses of IP packets as they pass through a router or firewall.

UAs insert IP addresses in SIP messages, but those addresses may not be valid if the packet traverses a NAT device. In particular, responses, which are routed according to the Via header, may not be sent to the correct address. This problem was solved by RFC 3581[71], which ensures “Symmetric Response Routing”. The source IP address and port, taken from the received packet, are stored in the Via header (received and rport parameters) of the request, and used later to route the response message (see paragraph 4.3.1).

Moreover, draft [42] addresses the problem of ensuring the reachability of a UA behind NAT, by always keeping alive a connection between the UA and an outbound proxy. However, also in this case, the solution requires Record-Routing messages and forcing them through the outbound proxy (see paragraph 4.3.2).

A more detailed analysis of NAT issues is done in chapter 4.

2.10 SIP and VoIP

Although SIP is a general purpose session establishment protocol, its most straightforward and most widely deployed use is for VoIP calls (or, more generally, for multimedia sessions).

In relation to VoIP, SIP has the role of delivering the offer and the answer containing the description of the multimedia session the two UAs would like to establish. To encode such information about the session, the Session Description Protocol (SDP) is used.

2.10.1 SDP

Session Description Protocol (SDP) is an offer/answer format intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. It is defined in RFC 4566[38], while examples of SDP exchanges are provided in RFC 4317[43].

SDP provides a textual description of the session to be established, independently of the actual transport protocol; however, it is usually employed to describe RTP and RTCP flows (section 1.5).

Although SDP was designed with multicast support, here we are interested only in its support for unicast sessions. Description of such a session includes the following information:

• session name and purpose;

• time(s) the session is active;

• the media comprising the session;

• information needed to receive those media (addresses, ports, formats, etc.);

• information about the bandwidth to be used by the session;

• contact information for the person responsible for the session.

SDP is a textual protocol; hence its sessions are described using the ISO 10646 character set in UTF-8 encoding, while field and attribute names are specified using the US-ASCII subset of UTF-8. An SDP session description consists of a number of lines of text of the form type=value, where type is always exactly one character long and case-significant, while value is structured text.

An SDP description is divided into two sections:

1. session-level description: contains parameters which apply to the whole session; it starts with a v= line;

2. media-level description: contains parameters which apply to a single media stream; each media-level description starts with an m= line and can override some session parameters.

The session-level description lines we are interested in are the following:

• v=<protocol version>: current version is 0;

• o=<username> <session id> <version> <network type> <address type> <address>: this line specifies the name of the user who is creating the session and a randomly generated id for the session; <version> is the sequence number of the session announcement; <network type> is usually IN, while <address type> can be IP4 or IP6; <address> is the address of the UA participating in the session;

• s=<session name>: an arbitrary name for the session;

• c=<network type> <address type> <address> (optional if specified in all media descriptions): specifies the address used to connect to this UA; the fields of this line follow the same convention as the o= line;

• a=<attribute>[:<value>] (optional): specifies some additional attributes that depend on the session and media characteristics; one common attribute is used to specify the direction of flows: sendonly, recvonly or sendrecv;

• t=<start> <stop>: specifies start and stop times; usually it is set to 0 0 to indicate a permanent session.

The media-level description lines we are interested in are the following:

• m=<media> <port> <protocol> <format>: <media> can currently be audio, video, text, application or message; <port> and <protocol> specify the reception port and the transport protocol used, which is usually RTP/AVP (RTP with Audio-Video Profile); <format> depends on the transport and, for RTP, it is the list of supported codecs¹⁴;

• c=<network type> <address type> <address> (optional): it is the same as the session-level c= line and can be used to override it for a particular media flow;

• a=<attribute>[:<value>] (optional): it is the same as the session-level a= line; a common media-level attribute is rtpmap, which is used to specify the mapping between the codec code and the codec name.

¹⁴ Numeric codes for codecs are specified at http://www.iana.org/assignments/

v=0
o=- 1177879684 1177879684 IN IP4 192.168.181.21
s=Opal SIP Session
c=IN IP4 192.168.181.21
t=0 0
m=audio 5000 RTP/AVP 0 8 115 3 114 107 110 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:115 iLBC/8000
a=rtpmap:3 GSM/8000
a=rtpmap:114 SPEEX/16000
a=rtpmap:107 MS-GSM/8000
a=rtpmap:110 SPEEX/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
m=video 5002 RTP/AVP 31
a=rtpmap:31 H261/90000

Figure 2.5: SDP offer example
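The division into a session-level section and one media-level section per m= line can be recovered with a few lines of code. The sketch below is illustrative only: the function name `parse_sdp` is an assumption, the input is a shortened version of the offer in figure 2.5, and no validation of line order or content is attempted.

```python
def parse_sdp(text: str) -> dict:
    """Split an SDP description into session-level and media-level lines."""
    session: list[tuple[str, str]] = []
    media: list[list[tuple[str, str]]] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("=")  # every line is type=value
        if key == "m":
            media.append([("m", value)])     # an m= line opens a new media section
        elif media:
            media[-1].append((key, value))   # belongs to the latest media section
        else:
            session.append((key, value))     # before any m= line: session level
    return {"session": session, "media": media}


OFFER = "\r\n".join([
    "v=0",
    "o=- 1177879684 1177879684 IN IP4 192.168.181.21",
    "s=Opal SIP Session",
    "c=IN IP4 192.168.181.21",
    "t=0 0",
    "m=audio 5000 RTP/AVP 0 8",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
    "m=video 5002 RTP/AVP 31",
    "a=rtpmap:31 H261/90000",
]) + "\r\n"

sdp = parse_sdp(OFFER)
print(len(sdp["media"]), "media sections")
```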

2.10.2 Session Negotiation

In section 2.7.2 the mechanism for creating a SIP session is described. During the creation of such a session, SDP messages are exchanged as payloads in SIP messages between the two parties. In this way, session characteristics are also negotiated during session creation.

The calling UA places its SDP payload in the INVITE or ACK message, while the called party adds it to the 200 OK reply, or to a reliably-sent provisional response (1xx).

An SDP payload specifies which flows and codecs (in order of preference) the UA is willing and able to accept. An answer to an SDP offer must contain at least one codec that is present also in the offer.

A UA can refuse a flow by setting the corresponding port to 0 in the m= line.

Session parameters can be re-negotiated during a call, by sending a new INVITE request (called re-INVITE) inside the dialog; a new SDP offer is attached to this request.
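The negotiation rule described above (the answer keeps only codecs that also appear in the offer, and a flow is refused by setting its port to 0) can be sketched for a single m= line. The function name and the port values are hypothetical, and keeping the offered order in the answer is a simplifying assumption of this sketch, not a requirement stated in the text.

```python
def answer_media_line(offer_m: str, supported: set[str], local_port: int) -> str:
    """Build the value of the answer's m= line from the offer's m= line."""
    media, _offered_port, proto, *formats = offer_m.split()
    # Keep only the formats present in the offer that we also support.
    common = [fmt for fmt in formats if fmt in supported]
    if not common:
        # No codec in common: refuse the flow by setting the port to 0.
        return f"{media} 0 {proto} {' '.join(formats)}"
    return f"{media} {local_port} {proto} {' '.join(common)}"


# Payload types are those of figure 2.5; the supported set is hypothetical.
supported_formats = {"0", "8", "101"}
audio_answer = answer_media_line("audio 5000 RTP/AVP 0 8 115 3", supported_formats, 6000)
video_answer = answer_media_line("video 5002 RTP/AVP 31", supported_formats, 6002)
print("m=" + audio_answer)
print("m=" + video_answer)
```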

2.10.3 ANAT

An SDP session description allows specifying a set of media codecs per stream but only one network address. The ANAT (Alternative Network Address Types, [19]) semantics for the SDP grouping framework overcomes this limitation and allows specifying different groups of network addresses (e.g. IPv4 or IPv6) for a particular media stream. In this fashion, for example, it is possible to define a default IPv6 network address and a default IPv4 network address. If a user agent does not support a particular realm (e.g., IPv6) then it can use the default network address specified for the IPv4 realm.

ANAT introduces a new attribute within the SDP grouping framework [18] by which it is possible to provide alternative network addresses of different types for a single logical media stream (figure 2.6).

v=0
o=bob 280744730 28977631 IN IP4 154.32.12.75
s=a simple example of ANAT semantics
t=0 0
a=group:ANAT 1 2
m=audio 25000 RTP/AVP 0
c=IN IP6 2001:DB8::1
a=mid:1
m=audio 22334 RTP/AVP 0
c=IN IP4 192.0.2.1
a=mid:2

Figure 2.6: ANAT offer example

An offerer that wants to use SIP to send its offer has to place the sdp-anat option-tag in the Require header field. An answerer receiving an ANAT session description should use the address with the highest priority and set to zero the ports of the remaining m-lines of the group.