The popularity of Instant Messaging (IM) and videoconferencing is growing rapidly both on a commercial level, in which organizations use these technologies to conduct business, and on a casual level, in which individuals increasingly rely on IM and personal WebCams to communicate with friends and family. As a result, Microsoft has decided to align its strategic approach to these realtime collaborative technologies with new and evolving standards. One such standard that Microsoft is helping to shape and define is the Internet Engineering Task Force's (IETF's) Session Initiation Protocol (SIP, which is pronounced as the word "sip"). Microsoft is releasing the Real Time Communications (RTC) Server, which uses SIP to provide IM and videoconferencing capabilities. (For information about how Microsoft intends to release the RTC Server, see the Web-exclusive sidebar "Packaging of the RTC Server," http://www.winnetmag.com, InstantDoc ID 27398.) Microsoft will ship the RTC Server around the same time it ships Windows .NET Server (Win.NET Server) 2003. At the time of this writing, the RTC Server is about to enter beta testing and is generally referred to by its code name, Greenwich.
To understand how RTC Server and similar products work, you need to know about the protocol that underlies the server's functionality. Here's a look at the basic concepts behind and the components of SIP.
SIP Basics
SIP is an end-to-end, client/server protocol that facilitates the creation, modification, and termination of communications sessions between one or more participants. These communications sessions can include different forms of interaction: basically any form of peer-to-peer or multipoint communication, including multimedia conferences and telephone calls. The participants can be either humans (who use endpoints such as SIP-enabled telephones or videoconferencing clients) or automated components (e.g., a voicemail server, a media-archiving server).
Although the telephony industry conceived SIP, this protocol is designed to simplify Internet-based communication. (For information about SIP's beginnings, see the Web-exclusive sidebar "A Short History of SIP," http://www.winnetmag.com, InstantDoc ID 27399.) Any SIP-based communications session typically involves at least three separate activities and protocols:
- SIP provides the basic signaling between participants to set up the session.
- SIP uses the Session Description Protocol (SDP) to define the nature of the communication used within the session, including the type of media (e.g., video, audio), transport protocol (e.g., IP, UDP, Real-Time Transport Protocol, or RTP), and media format (e.g., H.261 video, Moving Picture Experts Group, or MPEG, video).
- SIP uses the appropriate protocol to transfer information in the session. For example, SIP uses RTP to transfer realtime information and Real Time Streaming Protocol (RTSP) to deliver streaming media.
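To make the second activity concrete, here is a minimal Python sketch of the kind of SDP body a client might attach to its signaling. The function name build_sdp, the session name, and the port numbers are assumptions chosen for illustration; the field layout follows the general SDP text format (v=, o=, s=, c=, t=, m=), and the payload types 0 (PCMU audio) and 31 (H.261 video) are the standard static RTP assignments. Treat it as a sketch, not as the output of any particular product.

```python
def build_sdp(user, host, streams):
    """Build a minimal SDP body (illustrative sketch only).

    streams is a list of (media, port, payload_type) tuples, for example
    ("audio", 49170, 0) for PCMU audio or ("video", 51372, 31) for H.261.
    The port numbers are arbitrary example values.
    """
    lines = [
        "v=0",                          # SDP protocol version
        f"o={user} 0 0 IN IP4 {host}",  # origin: who created the session
        "s=Example session",            # session name (hypothetical)
        f"c=IN IP4 {host}",             # connection address for the media
        "t=0 0",                        # unbounded session time
    ]
    for media, port, payload_type in streams:
        # One m= line per media stream, carried over RTP.
        lines.append(f"m={media} {port} RTP/AVP {payload_type}")
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(build_sdp("george", "gpc.yankees.com",
                    [("audio", 49170, 0), ("video", 51372, 31)]))
```

Each m= line names one media stream, its transport, and its format, which corresponds to the three things the list above says SDP describes.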
SIP is defined in IETF Request for Comments (RFC) 2543, which you can retrieve from the IETF repository (http://www.ietf.org/rfc.html). RFC 2543 wasn't published until March 1999, so the IETF working groups had the opportunity to incorporate concepts from other Internet protocol architectures that were well established and successful. The SIP architecture borrows many concepts from SMTP. For example, in SIP, users are designated with a SIP address (i.e., a SIP URL) that's similar to an SMTP address (i.e., a mailto URL). SIP also borrows concepts from HTTP. Information about the communications session that SIP is controlling is similar to the information that you would expect to see with HTTP. For example, a SIP packet might look something like the one that Figure 1, page 26, shows. In this packet, user George on the PC named gpc.yankees.com invites user Jerry to a session.
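The resemblance to SMTP addresses and HTTP messages is easier to see in code. The following Python sketch splits a SIP URL into user and host, much as you would a mailto address, and assembles a simplified, HTTP-style INVITE request. It is not a reproduction of Figure 1: the helper names, Jerry's address (sip:jerry@example.com), and the Call-ID value are assumptions, and a real request carries additional headers and usually an SDP body.

```python
def parse_sip_uri(uri):
    """Split a simple 'sip:user@host' URL into (user, host)."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "sip" or "@" not in rest:
        raise ValueError(f"not a simple SIP URL: {uri!r}")
    user, _, host = rest.partition("@")
    return user, host


def build_invite(from_user, from_host, to_uri, call_id):
    """Return a simplified SIP INVITE request as text (illustrative only)."""
    lines = [
        f"INVITE {to_uri} SIP/2.0",         # request line, as in HTTP
        f"Via: SIP/2.0/UDP {from_host}",    # where responses should go
        f"From: <sip:{from_user}@{from_host}>",
        f"To: <{to_uri}>",
        f"Call-ID: {call_id}@{from_host}",  # identifies this session
        "CSeq: 1 INVITE",                   # sequence number and method
        "Content-Length: 0",                # no SDP body in this sketch
    ]
    # Headers end with a blank line, again as in HTTP.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Jerry's address and the Call-ID are hypothetical example values.
    print(build_invite("george", "gpc.yankees.com",
                       "sip:jerry@example.com", "12345"))
```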
SIP provides four key functions. Two functions (name mapping and redirection, and capabilities negotiation) occur during a session's setup. The other two functions (participant management and capabilities management) occur during the session.
Name mapping and redirection. SIP translates participants' descriptive naming information to SIP location information that's consistent with directory or other services. SIP facilitates personal mobility so that users can establish a SIP session when they're on the move (e.g., moving from their connected desktop PCs to their cars), thereby making their mobile telephone or wirelessly connected PDA their preferred communications device.
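One way to picture name mapping is as a lookup table that ties a user's stable SIP address to whichever device that user currently prefers. The Python sketch below is a toy model under that assumption; the class name LocationService, its methods, and the example addresses are hypothetical and don't correspond to any actual server API.

```python
class LocationService:
    """Toy mapping from a public SIP address to the current contact."""

    def __init__(self):
        self._bindings = {}

    def register(self, address, contact):
        # A later registration replaces the earlier one, which models
        # a user moving from one device to another.
        self._bindings[address] = contact

    def resolve(self, address):
        # Returns None when the user has no registered device.
        return self._bindings.get(address)


if __name__ == "__main__":
    # All addresses below are made-up example values.
    service = LocationService()
    service.register("sip:george@yankees.com", "sip:george@gpc.yankees.com")
    # George leaves his desk and registers his mobile device instead.
    service.register("sip:george@yankees.com", "sip:george@mobile.example.com")
    print(service.resolve("sip:george@yankees.com"))
```

Callers keep using the same address; only the binding behind it changes as the user moves.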
Capabilities negotiation. SIP determines the various media capabilities of all the participants in a session and agrees on the media facilities to be used during the session. For example, if two participants in a session have video capability but a third participant has only audio capability, SIP determines that a video stream can be used in the session but only the audio stream should be transmitted to the nonvideo participant.
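The three-participant example can be modeled in a few lines of Python. This is a deliberately simplified sketch: in practice the endpoints exchange session descriptions rather than calling a central function, and the function name negotiate, the participant names, and the rule that a stream is used when at least two participants support it are assumptions made to mirror the example.

```python
def negotiate(capabilities):
    """Decide which media streams each participant receives.

    capabilities maps a participant name to the set of media types that
    participant supports. A media type is used in the session if at least
    two participants support it; each participant then receives only the
    session streams it can handle.
    """
    offered = set().union(*capabilities.values())
    session_media = {
        media for media in offered
        if sum(media in caps for caps in capabilities.values()) >= 2
    }
    return {name: caps & session_media
            for name, caps in capabilities.items()}


if __name__ == "__main__":
    # Participant names are placeholders.
    result = negotiate({
        "first": {"audio", "video"},
        "second": {"audio", "video"},
        "third": {"audio"},
    })
    for name, media in result.items():
        print(name, sorted(media))
```

With these inputs, the first two participants receive both audio and video, and the third receives audio only.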
Participant management. During a session, SIP lets participants bring new participants into a session or terminate or suspend connections with existing participants.
Capabilities management. During a session, SIP monitors the media capabilities and makes adjustments if necessary. For example, suppose a session consists of two participants, both of whom have only audio capability. If another participant joins the session and has video capability, SIP adds a video stream for the new participant.
The SIP Components
SIP consists of five components: user agent client (UAC), user agent server (UAS), proxy server, redirect server, and registrar server. The UAC and UAS are client-side components, whereas the proxy, redirect, and registrar servers are server-side components.