Multiparty Jingle (Muji)

This specification defines an XMPP protocol extension for initiating and managing multiparty voice and video conferences within an XMPP MUC.

0272 Experimental Standards Track Standards Council XMPP Core XEP-0045 XEP-0166 muji

Sjoerd Simons sjoerd.simons@collabora.co.uk
Dafydd Harries dafydd.harries@collabora.co.uk

0.1 2009-09-11 psa

Initial published version as accepted for publication by the XMPP Council.

0.0.0.2 2009-06-09 sjoerd

Second rough draft.

Jingle (XEP-0166) is used to negotiate peer-to-peer media sessions. Muji (short for Multiparty Jingle) is a way to coordinate Jingle sessions between a group of people. Muji conferences are held in Multi-User Chat (XEP-0045) rooms. A Muji conference has a number of contents, each of which has a unique name, a content type, and an encoding. Each participant may provide a stream for each content, and communicates in its MUC presence which contents it is willing to provide streams for, along with encoding information. This serves two purposes. First, each participant knows which contents every other participant provides. Second, there is a global payload type (PT) mapping for the various contents, so that clients need to encode and payload each content that they provide only once. Participants are not required to participate in all the contents that are available. For example, a Muji client might choose to request only audio streams.

Joining a conference is done in two stages. The first step is to declare that preparations are being made to either join or start a Muji session inside the MUC. The client indicates this by sending a presence stanza to the MUC with a preparing element in the muji section.
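As an illustration of this first step, the following Python sketch builds such a presence stanza with the standard library. The namespace URI, the room JID and the nickname are assumptions made for the example and should be checked against the schema of the specification; the presence element is also left unqualified for brevity.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the muji element; verify against the schema.
MUJI_NS = "http://telepathy.freedesktop.org/muji"


def build_preparing_presence(room_jid: str, nick: str) -> str:
    """Return a MUC presence stanza carrying a muji section with a preparing element."""
    # The stanza is addressed to the sender's own occupant JID (room/nick).
    presence = ET.Element("presence", {"to": f"{room_jid}/{nick}"})
    muji = ET.SubElement(presence, f"{{{MUJI_NS}}}muji")
    ET.SubElement(muji, f"{{{MUJI_NS}}}preparing")
    return ET.tostring(presence, encoding="unicode")


# Hypothetical room and nickname, for illustration only.
stanza = build_preparing_presence("conference@muc.example.org", "alice")
```

The serializer picks its own namespace prefix, so the output is equivalent to, but not textually identical with, a stanza that declares the namespace as the default on the muji element.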

When a client adds a payload ID to a content description, it MUST have the same codec name and receiving parameters as the corresponding entries in other participants' payload maps for that content. For instance, if Alice defines a payload type with ID 98, codec Speex and a clock rate of 8000 for a content called “voice0”, then Bob must define payload type 98 identically or not at all for that content.
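A client could check this rule before advertising a payload map. The sketch below is a simplified model, not part of the protocol: it assumes that a payload type is fully described by its codec name and clock rate, and the names `PayloadType` and `conflicting_ids` are hypothetical.

```python
from typing import Dict, List, NamedTuple


class PayloadType(NamedTuple):
    name: str       # codec name, e.g. "speex"
    clockrate: int  # receiving clock rate in Hz


# Maps a payload ID to its definition for a single content.
PayloadMap = Dict[int, PayloadType]


def conflicting_ids(mine: PayloadMap, theirs: PayloadMap) -> List[int]:
    """Return the payload IDs that both maps define, but with different definitions."""
    shared = mine.keys() & theirs.keys()
    return sorted(pt_id for pt_id in shared if mine[pt_id] != theirs[pt_id])


# Alice's map for the content "voice0", as in the example above.
alice = {98: PayloadType("speex", 8000)}
# Bob reuses ID 98 identically and adds a new ID: no conflict.
bob_ok = {98: PayloadType("speex", 8000), 99: PayloadType("speex", 16000)}
# Bob redefines ID 98 with another codec: this violates the rule.
bob_bad = {98: PayloadType("PCMU", 8000)}
```

An empty result means the two maps can coexist for that content; any returned ID has to be redefined or dropped before the presence is sent.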

Furthermore, each content description MUST include at least one payload type that every other participant supports. In other words, the intersection of payload type mappings in descriptions for a content must not be the empty set. This avoids clients having to encode the same stream multiple times, which can be very costly, and also allows sending the encoded data only once where the transport makes this possible (e.g. IP multicast).
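The non-empty intersection requirement can be expressed directly with sets. This is a minimal sketch that assumes each participant's description for one content has already been reduced to a set of payload IDs; the function name is hypothetical.

```python
from typing import Iterable, Set


def common_payload_ids(id_sets: Iterable[Set[int]]) -> Set[int]:
    """Return the payload IDs present in every participant's description of one content."""
    sets = [set(s) for s in id_sets]
    if not sets:
        # No participants yet, so there is nothing to intersect.
        return set()
    return set.intersection(*sets)


# Three participants describing the same content (illustrative IDs).
descriptions = [{96, 98}, {98, 99}, {98}]
shared = common_payload_ids(descriptions)
```

A joining client would compute this over the existing participants' descriptions plus its own, and only advertise its description if the result is non-empty.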

Once a client has constructed content descriptions and advertised them in its MUC presence, it MUST initiate a Jingle session with every other participant. The requirement that it is the joining participant that initiates sessions avoids race conditions.

Jingle sessions are initiated between the MUC JIDs of participants. That is, the Jingle session-initiate stanza is sent from one MUC JID to another. This allows participants to easily identify sessions as belonging to a Muji conference. Content names inside Muji-related Jingle sessions always refer to the content with the same name inside the Muji conference.
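Because sessions run between MUC JIDs, a client can recognise a Muji-related session-initiate from its sender address alone. The helper below is a hypothetical sketch: it compares JIDs with a plain lowercase match rather than full JID normalisation, and it does not check that the sender actually advertises Muji information in its presence.

```python
def is_muji_session(initiator_jid: str, room_jid: str) -> bool:
    """Return True if the initiator is an occupant JID (room/nick) of the given room."""
    bare, _, nick = initiator_jid.partition("/")
    # Simplification: real JID comparison requires proper normalisation.
    return bare.lower() == room_jid.lower() and nick != ""


# Hypothetical room JID, for illustration only.
ROOM = "conference@muc.example.org"
from_occupant = is_muji_session("conference@muc.example.org/bob", ROOM)
from_real_jid = is_muji_session("bob@example.org/laptop", ROOM)
```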

To leave a conference, a participant MUST first remove the Muji information from its presence; subsequently it SHOULD terminate all Jingle sessions related to that conference. Updating the presence first reduces the likelihood of situations where new participants initiate sessions with participants who are leaving the conference.

Adding a stream follows a process similar to joining a conference. As a first step, an updated presence stanza MUST be sent which contains a preparing element as part of the muji section. The client MUST then wait until the MUC rebroadcasts its presence, after which it MUST wait for all other participants that had a preparing element in their presence to finish their changes. Afterwards the client should add the new content to the muji section of its presence and add the content to all the Jingle sessions it has with participants with which it shares the content.
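The waiting rule can be modelled as a small state tracker. The sketch below rests on one interpretation that is an assumption: the set of participants to wait for is fixed at the moment the client's own preparing presence is rebroadcast, so participants who start preparing later are not waited for. The class and method names are hypothetical.

```python
class PreparingTracker:
    """Tracks when a client may proceed after announcing that it is preparing."""

    def __init__(self, own_nick: str) -> None:
        self.own_nick = own_nick
        self.preparing = set()   # other nicks currently advertising preparing
        self.waiting_for = None  # stays None until our own presence is echoed

    def on_presence(self, nick: str, is_preparing: bool) -> None:
        """Process a presence update from the room (including our own echo)."""
        if nick == self.own_nick:
            if is_preparing and self.waiting_for is None:
                # Snapshot who was already preparing before us.
                self.waiting_for = set(self.preparing)
            return
        if is_preparing:
            self.preparing.add(nick)
        else:
            self.on_finished(nick)

    def on_finished(self, nick: str) -> None:
        """A participant finished its changes or left the room."""
        self.preparing.discard(nick)
        if self.waiting_for is not None:
            self.waiting_for.discard(nick)

    def may_proceed(self) -> bool:
        """True once our presence was echoed and everyone ahead of us is done."""
        return self.waiting_for is not None and not self.waiting_for


tracker = PreparingTracker("alice")
tracker.on_presence("bob", True)     # bob was preparing before alice
tracker.on_presence("alice", True)   # the room echoes alice's own presence
tracker.on_presence("carol", True)   # carol starts later and is not waited for
blocked = tracker.may_proceed()
tracker.on_presence("bob", False)    # bob finishes his changes
ready = tracker.may_proceed()
```

Once `may_proceed()` returns true, the client would publish the new content in its presence and add it to the relevant Jingle sessions.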

To remove a content type, the participant SHOULD first send an updated presence without the content in its muji section. Afterwards it MUST remove the content from all the Jingle sessions it has open.

When scaling to conferences with a large number of participants, or when clients have limited bandwidth or processing power, it is no longer viable for all participants to have direct connections. On connections where upstream bandwidth is the limiting factor, an RTP relay can be used, which relays the client's stream to multiple participants on behalf of the client and forwards the streams of other participants back to the client. If the limiting factor is either CPU or downstream bandwidth, then a mixer can be used, which receives the media streams from other participants and mixes them on behalf of the client, so that the client only has to deal with receiving and decoding a single stream for each media type. On the sending side a mixer acts like a relay and relays the client's stream to all other participants. Both of these can be provided either by dedicated services or by other clients.
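The effect of each arrangement on a single client can be illustrated by counting media streams. The following sketch assumes that every participant provides a stream for the content in question; the topology labels and the function name are hypothetical.

```python
from typing import Tuple


def stream_counts(participants: int, topology: str) -> Tuple[int, int]:
    """Return (upstream, downstream) stream counts per client for one content."""
    others = participants - 1
    if topology == "mesh":
        # Direct connections: send to and receive from every other participant.
        return (others, others)
    if topology == "relay":
        # One stream up to the relay; all other streams are forwarded back.
        return (1, others)
    if topology == "mixer":
        # One stream up; a single mixed stream comes back.
        return (1, 1)
    raise ValueError(f"unknown topology: {topology}")


mesh = stream_counts(10, "mesh")
relay = stream_counts(10, "relay")
mixer = stream_counts(10, "mixer")
```

In a ten-party conference the relay reduces the upstream load from nine streams to one, while only the mixer also reduces the number of streams to receive and decode.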