11.3 Protocol Conversion Minimization Requirements

3GPP TS 22.228, Release 16: Service requirements for the Internet Protocol (IP) multimedia core network subsystem (IMS); Stage 1

The IMS shall provide a mechanism to support data communication between WebRTC IMS clients without requiring bearer level protocol conversions between WebRTC and IMS protocols.

The IMS shall be able to support data communication between WebRTC IMS clients using WebRTC end to end security mechanisms.

Note: Implementation of this capability is subject to operator policy and regulatory constraints.

Annex A (informative):
Example IP multimedia application scenarios

The following example scenarios describe the personalised handling of individual media in multimedia applications (note that this list is neither complete nor exhaustive):

1) The user is in a voice communication, and receives an incoming IP video communication. The user decides not to accept the communication, but diverts the incoming video to a messaging system. Further, the user is given an indication that there is a video message in his mailbox.

2) The user is in a voice communication, and receives an incoming video communication. The user decides to accept the communication but wishes to switch between the two communications.

3) The user is idle in a network and not involved in a communication. The user modifies his user profile to divert all voice communications other than those from high priority, pre-identified callers (e.g. his boss). In this scenario all emails and text messages continue to be received regardless of the sender.

4) On receiving a communication, the calling party’s identity is displayed (if not restricted) and the user shall be able to decide whether to accept the communication, or divert it to a messaging system. The user shall be able to request media handling of the communication (e.g. media splitting to different destinations, media conversion).

5) The user is busy in a communication when receiving an incoming communication, but responds to the originating party that he will respond later. The user may request that the originating party’s details (if not restricted) are stored with a reminder in the user’s profile.

6) Hi-fi sound (nuances, character of voice)
Person(s): Marketing Manager, Rita
Situation: She is at a launch party for some customers in London. In the break she listens to her messages and one from another customer in Tokyo gets her attention. He just wants her to call, but doesn’t say if it is urgent or not.
Solution: Due to the excellent sound quality of the terminals involved and the messaging system, she picks up the faint irritation in his voice and decides to call him immediately. It was urgent and she could remedy the situation easily by emailing the information from her built-in PDA storage. The customer was relieved as he was just going into a very important meeting.
Benefit(s): Good sound quality gives more information to base judgements on, i.e. emulates real life meetings better.

7) Stereo sound (nuances, character of voice plus positions, sound-scapes)
Person(s): Purchase Officer, Gustavo
Situation: Gustavo participates in a conference to discuss the purchase of a new kind of steel for the factory in Rio. As he is on the road he calls from his hotel room in Sydney. The conference is in the head office in Rio. The local department has invited the two final contenders to have them argue their cases. The two companies are positioned at the different ends of the table. One of the groups is presenting and mentions something about deliveries. A side remark is barely audible, "we can’t deliver that quality and that quantity this year!" Who made this remark?
Solution: The excellent sound quality together with the stereoscopic sound gives Gustavo the information he needed. It was the other group that made the remark. The decision was made for him at that point. He gave the order to the presenting group right after they finished a very good presentation that told him everything he wanted to hear. The setup at the head office was done with two synchronized UEs at each end of the table.
Benefit(s): Stereoscopic sound gives even more information than just hi-fi sound to base judgements on, i.e. emulates real life meetings better.

8) Conference/chat with "private rooms"
Person(s): A project team at an IT company: Rick, Diana, Ted, Sven and Liu
They are based in different cities.
Situation: The project team has one of their weekly reporting meetings using their mobile communicators. In the middle of the meeting, Rick and Diana get lost in a lengthy argument on some detailed design matters that bores the rest of the team. Ted, the moderator, finds that it is nevertheless necessary to give Rick and Diana some minutes to finish their discussion, so he decides not to interrupt them. At the same time Sven remembers that he needs to remind Liu to send him a report on the latest findings from her research work.
Solution: The team uses a conference/chat service with the new facility "private rooms". This allows Sven to direct a few words in privacy to Liu. Sven easily activates this feature through the GUI of his communicator. Liu is immediately notified by the GUI of her communicator that Sven is now talking privately with her (this is necessary to avoid embarrassing misunderstandings that could occur if Liu were to answer Sven in the "common room" instead of in the new "private room" that Sven has created).
Since the voices of all conference members are synthetically mapped in a stereophonic projection, Liu is able to hear what Sven is saying, even though he speaks simultaneously with the other team members (the communicator will not automatically adjust the sound volume of the "common room", since it cannot know if Liu is more interested in Sven’s comments or in continuing to listen to the other team members).
Benefit(s): This service emulates virtual presence in a conference room in the best possible way without adding more visionary technologies like holographic projections, etc. The synthetic stereophonic sound projection provides good possibilities for a conference member to discriminate unwanted voices even if the meeting situation is informal and spontaneous and everyone is talking at the same time. The flexible possibilities to create one or more "private rooms" make it easy to make private comments to selected colleagues. The easy-to-use and fast-responding GUI makes the end-user effort needed to create a new "room" so low that it feels natural to use the function even for exchanging just a few quick words.
Alternative use: Replace the IT project team with a gang of teenagers who are planning what to do at the weekend. The service works equally well in that scenario and provides the same benefits.
Additional features: Easy GUI-controlled addition of new participants (can be initiated by any of the participants), including addressing, notification/invitation, etc. (cf. "outgoing call" in PSTN). GUI notification of new incoming session invitations (cf. "incoming call" in PSTN) and possibility to choose action as desired (incorporating the "calling party" in the existing conference session, creating a new separate session, rejecting the invitation, diverting it to a messaging system, etc.). Whiteboarding and/or application sharing.

9) Multiplayer mobile gaming with voice channel
Person(s): Joe (age 15), Blenda (age 14), Fredric (age 15) and all their "cyber friends" in the Shoot-n-Shout v.14.0 community
Situation: In the legendary multiplayer game Shoot-n-Shout v.14.0 the most popular game mode is a team competition. The idea is simply to shoot down the members of the competing teams. There are always a lot of active game sessions in CyberSpace. At a web/WAP service operated by the game application provider, interested potential players can choose a game session and also find other gamers to form a team with. There is a text chat service where potential team-mates can get to know each other.
Joe, Blenda and Fredric meet on the web/WAP chat and decide to form a team to take up the fight in one of the Shoot-n-Shout sessions. They prepare a game strategy in advance through the text chat service, but once the battle has started it takes too long to type text, so they will need another way to communicate with each other.
Solution: The game application provider makes use of a conference/chat service with "private rooms" in order to provide a multi-player voice service to the players of Shoot-n-Shout. When a game starts there is one "common room" where all players can talk (or rather shout) to each other and one "private room" for each team. Players in a team can also dynamically create more "private rooms" if they want to talk to only one (or a few) of their friends. (See the conference/chat scenario for details.)
The volume (and stereophonic position) of the players’ voices when they are using the "common room" is controlled so that it matches the virtual surroundings in the game environment. As an example, players that are behind a wall will only be heard as a vague whisper in the distance.
Benefit(s): A voice channel will enhance the gaming experience for several popular network games.

10) Application sharing with voice commentary
Person(s): Marketing Manager, Rita and Media expert, Jones
Situation: The launch of a new campaign for some customers in London. Last minute feedback is that one of the customers is expecting the latest gadget to be included, even if it’s only a prototype. Rita knows it’s not included in the presentation and she has no information with her.
Solution: Rita calls Jones, the media guru they employed for design of their important presentations. He has the information and some pictorials. He sends them over into Rita’s PowerPoint application and they edit the new slide together as they discuss the textual information to be included.
Benefit(s): The process is extremely interactive and the session takes only 5 minutes thanks to the broadband connection and the fact that they don’t need to Ping-Pong the pictures and the text back and forth. (Emphasize mobile or fixed access as required). The customer is happy and a Letter of Intent is signed.
Comments: By adding voice and pictures in an interactive session we achieve both effectiveness and interaction, two desired components.

11) Emergency location with voice conversation, navigation and picture transfer
Person(s): Ma Beth, her children and the pet dog Bobby
Situation: The family is out driving in the countryside and they take a turn on the slippery country road a bit too fast. They slide down into the ditch. Bobby the dog in the back of the van gets a heavy box of books on top of his left paw. It may be broken, and you can tell it certainly hurts from the loud yelps that come out in a rushed stream. The rest of the family is ok. They were all buckled up.
Solution: Ma Beth reaches for her communicator as soon as she has recovered from the initial shock. She calls 112 (911 or similar). The answer comes after 23 seconds and the operator immediately confirms the identity and the location of the van. Ma Beth is a bit taken aback by this quick information and has to think for a while, then confirms the location as possibly correct. She then states the problem and she gets connected to a vet who asks a few pertinent questions. She can show a close-up picture of the dog’s left paw and the vet confirms a possible (95%) broken leg just above the paw. He gives a few quick instructions and sends her a map of the closest emergency animal hospital. The map shows her current position and soon displays the quickest way to get to the hospital. Once there, Bobby is taken care of and things are looking up. Even the kids are smiling now that the dog is calm and free from pain, and he looks so funny with his little cast.
Benefit(s): The initial call transfers emergency information to the operator automatically. This ensures minimum delay to correct action. The Communicator transfers the picture that gives enough information to make a very accurate and fast assessment of the situation. Then the map transfer and display on the terminal together with the current position gives clear information and directions for Ma Beth to drive and make the right turns at every corner. In her still half-shocked state she can drive to the hospital without hesitation about where to go. This is very reassuring for all parties, including the dog, who gets the fastest possible help.
Comments: The call is initially just a voice call but evolves with the best of positioning in emergency situations and navigational aid together with picture and graphics transfer.

12) The Real Virtual Theatre and Foyer Chat room – Fixed Network example
Person(s): Theatre going "cultural" group with one member (Bob) in a hospital bed.
Situation: The group is watching the play and is utterly fascinated by the first act. When they come out into the foyer in the break they remember Bob. They really want to share this first act with him since they know Shakespeare’s Midsummer Night’s Dream is his favorite.
Solution: Bob uses the theatre’s online streaming service via the hospital network (at only half the price of a theatre ticket!). The play displays in color and stereo surround sound on his bedside TV set. In the break his friends call him up from the theatre chat room. The chat room is equipped with 3D sound pick up and local display screens with streaming facilities. They set up the streaming from one of the screens to be synchronized with Bob’s bedside equipment. Their voices are also mixed into the sound streams as they talk. Bob now gets both the playbacks from the first act and his friends’ voices in 3D surround sound. Bob’s voice is projected close to the screen as if he were standing leaning on the bench right there. His voice is very clear and full of emotions as he speaks to the various playbacks. Both parties can control the playbacks and watch their own selection in a second window on the screen.
Benefit(s): Bob can pick up every nuance in the lively discussion, including the whispered comments from Greta in the back. The group is almost feeling Bob’s presence because of the emotional clarity and distinct position of his voice. As both parties have control and visibility of the streaming sessions, it is very effective and very interactive.
Comments: Experiential services are sought after. This one can be a bit exclusive because of the equipment requirements, but the uses are many.

13) Mobile synchronized MM container
Person(s): The married couple Bill and Christine and their daughter Linda
Situation: Bill is on a business trip to Spain. He calls his wife Christine every night using his MMM terminal. Often Christine answers at home using her Screenphone, but this particular evening Christine has arranged a baby-sitter for their children so she could go to a restaurant with some friends. When Bill calls, she is sitting on the commuter train on her way home. Bill often shows some pictures during his calls (both live pictures showing the environment where he is at the moment and pictures that he has taken during the day with his separate digital camera).
Today, their talk starts off as a common voice conversation. After a while Bill wants to show Christine the lovely sunset view that he can see from his hotel room, so he takes some snapshots with the built-in camera of his terminal and sends them in real-time mode to Christine. Christine would like to show one of them to their little daughter Linda when she comes home.
Solution: With a quick gesture on the touchscreen of Christine’s MMM terminal, she instantly moves the selected picture from the real-time session window to the "multimedia container" icon. All the contents of the "container" are automatically mirrored between the MMM terminal and her home server. In this way, Christine can easily pick up the picture from her Screenphone at home. If Linda is asleep when Christine comes home, she can wait until tomorrow.
Benefit(s): The "multimedia container" can be used for every type of MM content that one likes to have available both at home and at another location. This "container paradigm" is very intuitive and stimulates the use of images, video clips etc. for a multitude of purposes. The "container" can be used both for transferring content from the MMM terminal to the home server (as in this scenario) and in the opposite direction.

Annex B (Informative): Business models use cases

The IMS supports agreements between the access network operator and the network operator providing IMS services (IMS operator).

The IMS shall be able to offer services to users that are attached to access networks owned by another operator.

The service offering may be restricted by the capabilities of the access network and the agreement between the access network operator and the IMS operator.

The IMS shall support at least the following operator’s domain relationships:

a) Access network to IMS relationships

a.1) Access network and the IMS it connects to, belong to the same operator as shown in figure B.1.

Figure B.1

a.2) Access network and the IMS it connects to, belong to different operators having an interconnection as shown in figure B.2.

Figure B.2

b) IMS level relationships

b.1) The IMS (e.g. 3GPP or NGN) to which the access network connects and the Home IMS (e.g. NGN or 3GPP) which provides the IMS services belong to different operators as shown in figure B.3.

Figure B.3

b.2) The IMS (e.g. 3GPP or NGN) to which the access network connects and the Home IMS (e.g. NGN or 3GPP) which provides the IMS services are the same as shown in figure B.4.

Figure B.4

b.3) The IMS (e.g. 3GPP or NGN) to which the access network connects and the Home IMS (e.g. NGN or 3GPP) which provides the IMS services belong to the same operator as shown in figure B.5.

Figure B.5
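As an illustration only, the domain relationships listed above can be expressed as a small classification over operator identities. The sketch below is not part of the requirements: the `Network` type and both function names are hypothetical, and the distinction drawn between b.2 (one and the same IMS) and b.3 (two distinct IMSs of one operator) is an assumed reading of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    """Hypothetical descriptor for an access network or an IMS."""
    name: str      # illustrative network identifier
    operator: str  # illustrative operator identifier


def access_relationship(access: Network, connected_ims: Network) -> str:
    """Classify the access network to IMS relationship (cases a.1 and a.2)."""
    if access.operator == connected_ims.operator:
        return "a.1"  # same operator
    return "a.2"      # different operators with an interconnection


def ims_relationship(connected_ims: Network, home_ims: Network) -> str:
    """Classify the IMS level relationship (cases b.1 to b.3)."""
    if connected_ims == home_ims:
        return "b.2"  # assumed reading: one and the same IMS
    if connected_ims.operator == home_ims.operator:
        return "b.3"  # assumed reading: distinct IMSs, same operator
    return "b.1"      # different operators
```

For example, under these assumptions an access network of operator "OpA" attached to an IMS of operator "OpB" would be classified as a.2.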

An IMS operator shall be capable of connecting to other network operators via:

– an interconnect model where agreements are established between two operators;

– an interconnect model where intermediate network(s) can provide interconnect on behalf of multiple operators (and may be based on an agreement between the operators and their intermediate network provider).

A single IMS operator shall be able to choose to support either of the interconnect models, or both of the interconnect models simultaneously.

Annex C (Informative):
Basic communication cases for IMS networks

A basic communication case can be described on a per IMS basis by stating the IMS entry point and an exit point for the communication as shown in figure C.1.

The following general types of entry/exit point can be identified:

– Access (for communication to/from terminals);

– Interconnect to non-IMS network;

– Interconnect to other IMS;

– Internal network resource (e.g. a conference bridge for conferencing services).

As a general rule a network based on IMS shall support the following basic communication cases on a per network basis, as shown in Table C.1.

Table C.1

| From (entry point) \ To (exit point) | Access Network | Interconnect to other IMS | Interconnect to non-IMS network | Internal Network Resource |
| Access Network | | | | |
| Interconnect to other IMS | | | | |
| Interconnect to non-IMS network | | | | |
| Internal Network Resource | | | | |

It is not precluded that other, more complex communication cases may be provided by service level concatenation of basic communication cases, e.g. by means of call diversion services.

Figure C.1: Graphical representation of supported basic communication cases

Annex D (normative):
Access to IMS via non-3GPP access or via NPNs using alternative authentication

This annex defines additional specific requirements for access to IMS via non-3GPP access or via NPNs using alternative authentication.

These requirements shall not apply to terminals having a 3GPP access and not accessing IMS via NPN using alternative authentication.

For non-3GPP-only terminals or terminals accessing IMS via an NPN using alternative authentication with neither ISIM nor USIM, the IMC may be used to access the IMS via a non-3GPP access technology or via the NPN. However, if ISIM [21] is present, it shall be used to access IMS; if ISIM is not present but USIM [24] is present, USIM shall be used to access IMS.
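The precedence stated above (ISIM first, then USIM, with the IMC usable only when neither is present) can be sketched as a simple selection function. This is a minimal, non-normative illustration; the names `Credential` and `select_ims_credential` and the boolean inputs are hypothetical and are not defined by this specification.

```python
from enum import Enum
from typing import Optional


class Credential(Enum):
    ISIM = "ISIM"
    USIM = "USIM"
    IMC = "IMC"


def select_ims_credential(has_isim: bool, has_usim: bool,
                          has_imc: bool) -> Optional[Credential]:
    """Return the credential used for IMS access, or None if none is available."""
    if has_isim:
        return Credential.ISIM  # ISIM, if present, shall be used
    if has_usim:
        return Credential.USIM  # otherwise USIM, if present, shall be used
    if has_imc:
        return Credential.IMC   # IMC may be used only with neither ISIM nor USIM
    return None                 # no usable credential (case not covered by the text)
```

Note that the IMC branch reflects a "may" in the text, so a real terminal could legitimately decline to use it; the sketch simply returns it as the only remaining option.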

Annex E (informative):
Example use cases for IMS Inter UE Transfer (i.e. transfer/replication/sharing)

Users of communication services will increasingly use different devices with different communication capabilities to meet their communication needs. Users may wish to initiate, transfer, manipulate, or otherwise maintain the simultaneous real time streaming of multimedia components (e.g. video, speech, audio) between multiple devices for a variety of reasons. For example, users may wish to control:

– the coordinated delivery of simultaneous multimedia streams to multiple devices owned by the same user (e.g. to take advantage of the different video and audio capabilities of the different devices: high definition television, surround sound stereo, home speaker phone, etc.)

– shared multimedia content in real time across multiple devices owned by others (who may be in different locations)

This annex defines some of the use cases for IMS Inter UE Transfer (i.e. transfer/replication/sharing), examples of which are given below (note that this list is neither complete nor exhaustive):

1) Shared control of session across multiple IMS users

Jill calls Jane to share and discuss a video.

– Jane and Jill’s controls are synchronized so they share the seamless video experience. Both UEs are capable of shared control.

Either Jill or Jane can pause the shared video to provide comments. Either of them can resume the video when they are ready to do so.

– Jane and Jill need to be authorized by the network after Jill allows Jane to share control.

2) Transfer of service control

Jane is having a multimedia call (audio/video stream) on her mobile with Jill while coming home. After arriving at home, Jane transfers the video stream and service control to another device that belongs to her (e.g. a PC that is IMS capable and on which she is logged in), but keeps the audio on her mobile to keep the conversation private from the other people in the room.

3) Media Replication

Jane is watching a video clip on her cell phone, which another family member (Jane’s Dad) starts to watch, so they are watching together and talking about it. Jane’s Dad wants to continue watching the video clip on another device (e.g. Jane’s PC that is IMS capable and on which she is logged in), so he requests replication of the video component to Jane’s PC.

4) Video replication

Steve, president of a wine-tasting club, schedules a video documentary to start at an agreed time for all club members, so that they can watch it from their homes and chat together about it while watching. In this case, all sessions are synchronized, and Steve is able to control the video playing: he can pause the video for all members at the same time.
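The shared-control behaviour in use cases 1 and 4 (authorized users pausing and resuming one synchronized playback for every participant) can be sketched as follows. This is an illustrative model only, under the assumption that authorization is a simple per-user grant; the class `SharedSession` and its methods are hypothetical and do not correspond to any IMS procedure or interface.

```python
class SharedSession:
    """Hypothetical model of one synchronized playback shared by several users."""

    def __init__(self, participants):
        self.participants = set(participants)
        self.controllers = set()  # users the network has authorized to control playback
        self.playing = True       # a single state, so all participants stay in sync

    def authorize(self, user):
        """Grant playback control to a participant (stands in for network authorization)."""
        if user not in self.participants:
            raise ValueError(f"{user} is not a participant")
        self.controllers.add(user)

    def _require_control(self, user):
        if user not in self.controllers:
            raise PermissionError(f"{user} is not authorized to control the session")

    def pause(self, user):
        self._require_control(user)
        self.playing = False      # pauses playback for every participant

    def resume(self, user):
        self._require_control(user)
        self.playing = True       # resumes playback for every participant
```

In use case 1 both Jill and Jane would be authorized, so either can pause or resume; in use case 4 only Steve would be authorized, while all club members remain participants.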

Annex F (Informative): Void

Annex G (Informative):
Example use cases for IMS-Based Telepresence

This annex provides a use case for IMS-Based Telepresence.

In the scenario below, a project team has one of their weekly reporting meetings using a Telepresence conference. Steven, John and Marc are in a meeting room in San Francisco. Fred and Liu are in another room in Paris. Ted is on the road and joins using his mobile phone. Bill is at home and joins using his PC. Both meeting rooms are equipped with multiple cameras and large display monitors. Three cameras and screens are arranged to provide a panoramic view of the room. Additional cameras and screens are used to share presentations among the participants.

Multiple video streams are shared along with additional information (e.g. spatial information, video resolution, and environmental information), so that the user experience is as if they are in the same location. Audio information (e.g. spatial information) is exchanged to facilitate the rendering of the audio in accordance with the rendering of the video. Users in the meeting enjoy a strong sense of realism and presence between all participants.

Figure G.1 – Telepresence

In the above scenario participants may be from different operators’ networks, or from enterprise networks. In such cases, IMS-based Telepresence has interconnection with Telepresence in other networks.

Figure G.2 – Interconnection of Telepresence

Annex H (Informative): Support of WebRTC client access to IMS