A voice call on a 4G or 5G network is not a feature of the radio. It is a tiny, fragile SIP session riding on top of a packet bearer that the network has to build, on demand, in the few hundred milliseconds between you pressing dial and the far phone ringing. When a VoLTE or VoNR call drops, mutes, or takes four seconds to connect, the fault is almost never "the signal" — it is one specific message, in one specific interface, that did not arrive or did not say what the next node expected. This article traces every one of those messages, at the level of detail you would see in a Wireshark or a QXDM capture, and then turns that knowledge into a repeatable field method for finding the broken message fast.
Who this is for. Core/IMS engineers, RF and drive-test engineers, NOC L2/L3, and anyone preparing for a 4G/5G voice interview. We assume you know what a SIP transaction and an EPS bearer are; we do not hand-wave the parts that actually break.
Why voice is hard on a packet network
In 2G and 3G, voice had its own circuit. A timeslot was reserved end to end; if you had the timeslot, you had the call. LTE deleted the circuit-switched domain entirely — there is no CS core in an LTE-only deployment. Voice therefore had to become an application that runs over IP, exactly like web traffic, but with three constraints web traffic never has:
- Bounded delay. Conversational speech tolerates roughly 150 ms mouth-to-ear one way before it feels like a satellite call. The radio scheduler must therefore give voice packets a guaranteed, low-latency path — not best-effort.
- Guaranteed bitrate. An AMR-WB codec needs ~24 kbit/s reserved. If the cell is congested, the voice packets cannot simply queue behind a video download.
- Session control. Something has to ring the other phone, negotiate the codec, handle hold/resume/transfer, generate the bill, and do lawful intercept. That "something" is the IMS — the IP Multimedia Subsystem.
VoLTE is the GSMA's profile (GSMA IR.92) of the 3GPP IMS for voice over LTE. VoNR is the same IMS voice service carried over a 5G Standalone radio and 5G Core; its GSMA profile is NG.114, which mirrors IR.92 at the SIP and media level. The application layer (SIP, SDP, RTP, the MMTel telephony service) barely changes between them. What changes is the machinery underneath that builds the bearer and guarantees the QoS. That is why this article spends most of its time on the seams: the points where the IMS asks the access network for a guaranteed pipe, because that is where calls break.
VoLTE is not "voice over the LTE radio." It is a SIP session that politely asks the EPS to build it a guaranteed-bitrate bearer, and then streams RTP through it.
(The one sentence to remember.)
The IMS architecture & its interfaces
The IMS is a small SIP network that sits behind the packet core. You must know the named nodes, because every troubleshooting log refers to them by name and every interface has a 3GPP reference point you can grep for.
The interfaces you will actually see in logs
| Reference point | Between | Protocol | Carries |
|---|---|---|---|
| Gm | UE ↔ P-CSCF | SIP | All UE SIP: REGISTER, INVITE, etc. Secured by IPSec. |
| Mw | CSCF ↔ CSCF | SIP | P↔I↔S routing inside the IMS. |
| ISC | S-CSCF ↔ AS | SIP | Service invocation to the TAS via iFC. |
| Cx | I/S-CSCF ↔ HSS | Diameter | UAR/UAA, MAR/MAA (auth vectors), SAR/SAA (registration state). |
| Sh | AS ↔ HSS | Diameter | Service data (e.g. divert numbers) for the TAS. |
| Rx | P-CSCF ↔ PCRF | Diameter | AAR/AAA — media authorisation that triggers the QCI 1 bearer. 4G. |
| Gx | PCRF ↔ PGW | Diameter | RAR/RAA — installs the PCC rule that builds the dedicated bearer. 4G. |
| N5 | P-CSCF ↔ PCF | HTTP/2 | Npcf_PolicyAuthorization — the 5G equivalent of Rx. 5G. |
| N7 | PCF ↔ SMF | HTTP/2 | Npcf_SMPolicyControl — installs the 5QI 1 PCC rule. 5G. |
Specs to keep open. IMS stage-2 TS 23.228; SIP/IMS stage-3 TS 24.229; MMTel media TS 26.114 (and GSMA IR.92); policy TS 29.214 (Rx) / TS 29.514 (N5) / TS 29.512 (N7); EPS bearers TS 23.401; 5GS TS 23.501/502 and NAS TS 24.501; EPS Fallback TS 23.502 §4.13.6.1.
Bearers & QoS — the part that actually breaks
A VoLTE call uses two EPS bearers on the IMS APN, never one:
- A default bearer, QCI 5 (non-GBR, priority 1) — carries the SIP signalling. It comes up at attach and stays up as long as you are registered.
- A dedicated bearer, QCI 1 (GBR, priority 2) — carries the RTP voice media. It is built per call and torn down when the call ends. Its existence is the single best health indicator of a VoLTE call.
VoNR mirrors this exactly with QoS flows on the IMS PDU session: a non-GBR 5QI 5 flow for signalling and a GBR 5QI 1 flow for media. The standardised characteristics are identical by design so the two can interwork.
| QCI / 5QI | Resource | Priority | PDB | PELR | Used for |
|---|---|---|---|---|---|
| 1 | GBR | 2 | 100 ms | 10⁻² | Conversational voice (RTP). VoLTE/VoNR media. |
| 2 | GBR | 4 | 150 ms | 10⁻³ | Conversational video (ViLTE/ViNR). |
| 5 | non-GBR | 1 | 100 ms | 10⁻⁶ | IMS SIP signalling (highest priority in this table). |
| 9 | non-GBR | 9 | 300 ms | 10⁻⁶ | Default internet APN (for contrast). |
Two more attributes decide whether a call survives congestion:
- ARP (Allocation & Retention Priority) — a level 1–15 plus pre-emption capability/vulnerability. The QCI 1 voice bearer is given a strong ARP so it can pre-empt a data bearer when the cell is full. A mis-provisioned ARP is why "calls fail only at busy hour."
- TFT (Traffic Flow Template) — the packet filters (source/dest IP + port range + protocol) that bind the RTP 5-tuple onto the QCI 1 bearer. If the uplink TFT is wrong, your uplink RTP falls back onto the default bearer and you get one-way audio. Remember this for Chapter 13.
Signalling rides QCI 5 / 5QI 5. Media rides QCI 1 / 5QI 1. The TFT/packet filter is the glue that puts the right RTP packets on the right bearer — in both directions. Most "audio" faults are really "the media bearer or its filter is wrong in one direction."
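The filter-to-bearer binding can be sketched in a few lines of Python. This is a simplified model rather than how a modem evaluates real packet filters: the `PacketFilter`, `Bearer` and `select_bearer` names are hypothetical, filters match only on direction, protocol and ports, and the port numbers are borrowed from the call example later in this article.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class PacketFilter:
    direction: str    # "uplink" or "downlink"
    protocol: int     # IP protocol number, 17 = UDP
    local_port: int   # UE-side port
    remote_port: int  # far-end port


@dataclass(frozen=True)
class Bearer:
    name: str
    qci: int
    filters: Tuple[PacketFilter, ...] = ()


def select_bearer(direction: str, protocol: int, local_port: int, remote_port: int,
                  dedicated: Sequence[Bearer], default: Bearer) -> Bearer:
    """Return the first dedicated bearer with a matching filter, else the default bearer."""
    packet = PacketFilter(direction, protocol, local_port, remote_port)
    for bearer in dedicated:
        if packet in bearer.filters:
            return bearer
    return default


signalling = Bearer("ims-default", qci=5)
uplink = PacketFilter("uplink", 17, 50020, 50040)
downlink = PacketFilter("downlink", 17, 50020, 50040)
voice_ok = Bearer("voice", qci=1, filters=(uplink, downlink))
voice_no_uplink = Bearer("voice", qci=1, filters=(downlink,))

# With both filters installed, uplink RTP rides the QCI 1 bearer.
print(select_bearer("uplink", 17, 50020, 50040, [voice_ok], signalling).qci)
# With the uplink filter missing, the same packet lands on the QCI 5 default bearer.
print(select_bearer("uplink", 17, 50020, 50040, [voice_no_uplink], signalling).qci)
```

The second call is the one-way-audio case in miniature: the downlink still matches the voice bearer, but uplink RTP silently shares the signalling bearer and loses its guaranteed bitrate.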
VoLTE step 1 — attach & P-CSCF discovery
Before any SIP exists, the UE must (a) attach to LTE, (b) get a default bearer on the IMS APN, and (c) learn the IP address of its P-CSCF. All three happen inside the LTE Attach. The clever bit is P-CSCF discovery: the UE asks for it inside the Protocol Configuration Options (PCO) of the PDN Connectivity Request, and the PGW returns the P-CSCF address(es) in the PCO of the response.
    Request Type = initial request
    PDN Type = IPv4v6
    Access Point Name = ims              # the dedicated IMS APN, not "internet"
    ESM Information transfer flag = 1
    Protocol Config Options (PCO):
      ┌─ P-CSCF IPv6 Address Request      (container 0001)
      ├─ P-CSCF IPv4 Address Request      (container 000C)
      ├─ DNS Server IPv6 Address Request  (container 0003)
      └─ IP Address Allocation via NAS    (container 000A)
    EPS Bearer Identity = 5
    EPS QoS = QCI 5                      # non-GBR, IMS signalling bearer
    APN = ims
    PDN Address = 2001:db8:ca1:5::23 / 10.45.0.23
    Protocol Config Options (PCO):
      ┌─ P-CSCF IPv6 Address = 2001:db8:ca1:f::1      # <-- the P-CSCF the UE will REGISTER to
      ├─ P-CSCF IPv6 Address = 2001:db8:ca1:f::2      # secondary, for failover
      └─ DNS Server IPv6 Address = 2001:db8:ca1:d::53
First failure class. If the operator forgot to provision the IMS APN in the HSS subscription, or the PGW returns no P-CSCF in the PCO, the UE never registers and there is no voice at all — yet data works perfectly. Always confirm the QCI 5 default bearer on APN ims exists before blaming SIP.
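That first check can be expressed as a small predicate. The sketch below assumes the PCO has already been decoded into `(container_id, value)` pairs by whatever capture tool is in use; the function names are hypothetical, and only the container identifiers shown in the exchange above are used.

```python
from typing import List, Sequence, Tuple

# PCO container identifiers as shown in the PDN Connectivity exchange above.
P_CSCF_IPV6 = 0x0001
P_CSCF_IPV4 = 0x000C


def pcscf_addresses(containers: Sequence[Tuple[int, str]]) -> List[str]:
    """Collect P-CSCF addresses from already-decoded (container_id, value) pairs."""
    return [value for cid, value in containers if cid in (P_CSCF_IPV6, P_CSCF_IPV4)]


def ims_signalling_ready(apn: str, qci: int, containers: Sequence[Tuple[int, str]]) -> bool:
    """True when the default bearer is on APN ims, is QCI 5, and the PCO carries a P-CSCF."""
    return apn == "ims" and qci == 5 and bool(pcscf_addresses(containers))


accept_pco = [
    (0x0001, "2001:db8:ca1:f::1"),
    (0x0001, "2001:db8:ca1:f::2"),
    (0x0003, "2001:db8:ca1:d::53"),  # DNS server, not a P-CSCF
]

print(pcscf_addresses(accept_pco))
print(ims_signalling_ready("ims", 5, accept_pco))
# A PCO with only a DNS server reproduces the "data works, no voice" failure class.
print(ims_signalling_ready("ims", 5, [(0x0003, "2001:db8:ca1:d::53")]))
```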
VoLTE step 2 — IMS registration with AKA & IPSec
Now the UE has an IP on the IMS APN and a P-CSCF address. It registers. IMS registration is a two-pass challenge–response: the first REGISTER is unauthenticated and draws a 401 carrying an AKA challenge; the UE/ISIM computes the response and the keys, then sends a second REGISTER protected by freshly established IPSec security associations.
    REGISTER sip:ims.mnc010.mcc234.3gppnetwork.org SIP/2.0
    Via: SIP/2.0/UDP [2001:db8:ca1:5::23]:5060;branch=z9hG4bK-524287-1
    Max-Forwards: 70
    P-Access-Network-Info: 3GPP-E-UTRAN-FDD; utran-cell-id-3gpp=2340100001A2D3E4
    From: <sip:234010123456789@ims.mnc010.mcc234.3gppnetwork.org>;tag=8a7d2
    To: <sip:234010123456789@ims.mnc010.mcc234.3gppnetwork.org>
    Call-ID: 1-524287@2001:db8:ca1:5::23
    CSeq: 1 REGISTER
    Contact: <sip:[2001:db8:ca1:5::23]:5060>;+sip.instance="<urn:gsma:imei:35395806-012345-0>";
        +g.3gpp.icsi-ref="urn%3Aurn-7%3A3gpp-service.ims.icsi.mmtel"
    Authorization: Digest username="234010123456789@ims.mnc010.mcc234.3gppnetwork.org",
        realm="ims.mnc010.mcc234.3gppnetwork.org",nonce="",uri="sip:ims.mnc010...",
        response="",algorithm=AKAv1-MD5
    Security-Client: ipsec-3gpp; alg=hmac-sha-1-96; ealg=null;
        spi-c=12345; spi-s=12346; port-c=6000; port-s=6001
    Require: sec-agree
    Proxy-Require: sec-agree
    Supported: path, gruu, sec-agree
    Expires: 600000
    Content-Length: 0
The I-CSCF queries the HSS (Cx: UAR/UAA) to pick an S-CSCF; the S-CSCF fetches an auth vector (Cx: MAR/MAA — RAND, AUTN, XRES, CK, IK) and returns the challenge. Critically, the P-CSCF strips the CK/IK from the 401 before forwarding to the UE, and uses them to set up the IPSec SAs — the UE derives the same keys from the ISIM.
    SIP/2.0 401 Unauthorized
    WWW-Authenticate: Digest realm="ims.mnc010.mcc234.3gppnetwork.org",
        nonce="QnJhbmRvbVJBTkR8QVVUTiBiYXNlNjQgZW5jb2RlZA==",   # base64( RAND || AUTN )
        algorithm=AKAv1-MD5, qop="auth",
        ck="...", ik="..."                                      # present on Cx/Mw; REMOVED by P-CSCF toward UE
    Security-Server: ipsec-3gpp; q=0.1; alg=hmac-sha-1-96; ealg=null;
        spi-c=98765; spi-s=98766; port-c=6100; port-s=6101
The UE runs AKA in the ISIM: it checks AUTN (network authentication + sequence freshness), computes RES, CK and IK, brings up the IPSec security associations, and re-sends the REGISTER (#2, now protected) with the computed response. The S-CSCF compares RES to XRES, accepts, writes registration state to the HSS (Cx: SAR/SAA) and returns 200 OK.
    SIP/2.0 200 OK
    P-Associated-URI: <sip:+447700900123@ims...>, <tel:+447700900123>, <sip:234010123456789@ims...>
    Service-Route: <sip:scscf1.ims...:5060;lr;orig>        # UE must route future requests via this S-CSCF
    Contact: <sip:[2001:db8:ca1:5::23]:5060>;expires=600000;pub-gruu="..."
    P-Charging-Function-Addresses: ccf=[2001:db8:ca1:c::1]
After 200 OK the UE typically SUBSCRIBEs to its own reg event package (to learn of network-initiated de-registration) and publishes presence if used. Only now is the UE "VoLTE registered" — the green phone icon. Everything in Chapter 6 assumes this state.
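When reading a registration trace it helps to pull RAND and AUTN out of the 401 nonce by hand. The sketch below assumes the nonce layout annotated on the 401 above, base64(RAND || AUTN), with 16 bytes each; the function names and sample values are made up. `res_matches` models the "RES versus XRES" comparison as described in this article; in AKAv1-MD5 the RES is not sent in the clear but acts as the Digest password, so a real S-CSCF compares Digest responses instead.

```python
import base64
import hmac
from typing import Tuple


def split_aka_nonce(nonce_b64: str) -> Tuple[bytes, bytes]:
    """Return (RAND, AUTN) from a nonce laid out as base64(RAND || AUTN ...)."""
    raw = base64.b64decode(nonce_b64)
    if len(raw) < 32:
        raise ValueError("nonce too short to hold a 16-byte RAND and a 16-byte AUTN")
    return raw[:16], raw[16:32]


def res_matches(res: bytes, xres: bytes) -> bool:
    """Constant-time comparison of the UE's RES against the expected XRES."""
    return hmac.compare_digest(res, xres)


# Made-up values for illustration only.
rand = bytes(range(16))
autn = bytes(range(16, 32))
nonce = base64.b64encode(rand + autn).decode("ascii")

print(split_aka_nonce(nonce) == (rand, autn))
print(res_matches(b"\x01\x02\x03\x04", b"\x01\x02\x03\x04"))
print(res_matches(b"\x01\x02\x03\x04", b"\xff\x02\x03\x04"))
```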
VoLTE step 3 — the mobile-originated call, message by message
This is the heart of it. The MO call interleaves three conversations that must stay in lock-step: the SIP offer/answer, the precondition handshake (RFC 3312) that says "don't ring until the media bearer is up," and the Rx→Gx policy exchange that actually builds the QCI 1 bearer. Get the ordering wrong and you ring before there is a pipe — the classic "ghost ring then dead air."
    INVITE tel:+447700900456 SIP/2.0
    Via: SIP/2.0/UDP [2001:db8:ca1:5::23]:5060;branch=z9hG4bK-inv-1
    Route: <sip:scscf1.ims...:5060;lr;orig>          # from the Service-Route learned at registration
    P-Preferred-Identity: <tel:+447700900123>
    P-Access-Network-Info: 3GPP-E-UTRAN-FDD; utran-cell-id-3gpp=2340100001A2D3E4
    Accept-Contact: *;+g.3gpp.icsi-ref="urn%3Aurn-7%3A3gpp-service.ims.icsi.mmtel"
    P-Preferred-Service: urn:urn-7:3gpp-service.ims.icsi.mmtel
    Supported: 100rel, precondition, gruu
    Require: sec-agree
    Content-Type: application/sdp
    CSeq: 1 INVITE

    v=0
    o=- 4090 4090 IN IP6 2001:db8:ca1:5::23
    m=audio 50020 RTP/AVP 104 105 96
    a=rtpmap:104 AMR-WB/16000/1
    a=fmtp:104 mode-set=0,1,2;max-red=220
    a=rtpmap:105 AMR/8000/1
    a=rtpmap:96 telephone-event/16000      # DTMF (RFC 4733)
    a=ptime:20
    a=curr:qos local none                  # my media bearer is NOT yet up
    a=des:qos mandatory local sendrecv     # I MUST have it before media
    a=des:qos optional remote sendrecv
When this INVITE reaches the P-CSCF, the P-CSCF does the thing that makes voice "carrier-grade": it inspects the SDP and fires an Rx AAR at the PCRF to authorise exactly this media (codec, bandwidth, 5-tuple). The PCRF translates it into a PCC rule and pushes it to the PGW over Gx RAR, which makes the PGW initiate the dedicated QCI 1 bearer toward the UE.
    AAR (Auth-Application-Id = 16777236   // 3GPP Rx)
      Media-Component-Description:
        Media-Component-Number = 1
        Media-Type = AUDIO
        Flow-Status = ENABLED
        Max-Requested-Bandwidth-UL/DL = 42000 / 42000    # AMR-WB + headroom (bps)
        Codec-Data = "uplink offer ... AMR-WB/16000"
        Flow-Description = permit out 17 from 2001:db8:ca1:5::23 50020 to any
      Specific-Action = INDICATION_OF_SUCCESSFUL_RESOURCES_ALLOCATION
      AF-Charging-Identifier = 0x5a1f...
    RAR
      Charging-Rule-Install:
        Charging-Rule-Name = "VoLTE-AMRWB-1"
        QoS-Information:
          QCI = 1; ARP = {priority 2, pre-empt-cap ENABLED, pre-empt-vuln DISABLED}
          GBR-UL/DL = 23850 / 23850; MBR-UL/DL = 42000 / 42000
        Flow-Information (TFT):
          permit RTP 2001:db8:ca1:5::23:50020 ↔ remote:50040
The PGW now runs the EPS dedicated-bearer activation down to the eNB and UE. On S1-AP the eNB sees an E-RAB Setup Request; over the air the UE gets an RRC reconfiguration adding the DRB:
    E-RAB ID = 6
    E-RAB Level QoS Parameters:
      QCI = 1; ARP = {2, can-preempt, not-preemptable}
      GBR (UL/DL) = 23.85 / 23.85 kbps; MBR (UL/DL) = 42 / 42 kbps
    Linked EPS Bearer Identity = 5          # tied to the QCI 5 signalling bearer
    Traffic Flow Template (TFT):
      packet filter (uplink)   = proto UDP, local :50020, remote <far>:50040
      packet filter (downlink) = proto UDP, remote <far>:50040, local :50020
      # <-- BOTH directions present? If uplink filter missing → one-way audio (Ch.13)
Only once the bearer is up does the UE confirm preconditions are met and the call is allowed to ring. The full SIP ladder, with the precondition state machine, looks like this:
    UE                      P-CSCF/IMS                        far-end
    │ INVITE (offer, curr:none)  │                               │
    │ ─────────────────────────▶ │ INVITE                        │
    │                            │ ────────────────────────────▶ │
    │                            │ 100 Trying                    │
    │ 100 Trying                 │ ◀──────────────────────────── │
    │ ◀───────────────────────── │                               │
    │                            │ 183 Session Progress          │ (answer, curr:none, 100rel)
    │ 183 Session Progress       │ ◀──────────────────────────── │
    │ ◀───────────────────────── │                               │
    │ PRACK                      │                               │
    │ ─────────────────────────▶ │ ───── PRACK ────────────────▶ │
    │ 200 (PRACK)                │ ◀──── 200 (PRACK) ─────────── │
    │ ◀───────────────────────── │                               │
    │          «QCI 1 dedicated bearer activates here»           │
    │ UPDATE (curr:qos local sendrecv)                           │
    │ ─────────────────────────▶ │ ───── UPDATE ───────────────▶ │
    │ 200 (UPDATE)               │ ◀──── 200 (UPDATE) ────────── │
    │ ◀───────────────────────── │                               │
    │ 180 Ringing                │ ◀──── 180 Ringing ─────────── │ ← phone rings NOW, not before
    │ ◀───────────────────────── │                               │
    │ 200 OK (INVITE)            │ ◀──── 200 OK ──────────────── │ ← callee answered
    │ ◀───────────────────────── │                               │
    │ ACK                        │ ───── ACK ──────────────────▶ │
    │ ═══════════════ RTP / AMR-WB media (QCI 1) ═══════════════ │
The precondition handshake guarantees the media bearer exists before 180 Ringing. That is why a correct VoLTE call never has "dead air after pickup" — if you see it, suspect the bearer or TFT, not the codec.
MT call is the mirror image: the terminating S-CSCF runs the callee's iFC (TAS supplementary services such as call-divert are evaluated here) and forwards the INVITE toward the UE; if the UE is idle, the downlink INVITE on the signalling bearer is what triggers paging. The UE answers the INVITE, and the dedicated bearer is built on the terminating side by the same Rx/Gx (or N5/N7) trigger. Same precondition logic, reversed.
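The precondition gate can be read straight out of the SDP. The sketch below is a deliberately simplified take on RFC 3312, assuming only `qos` preconditions and treating a mandatory precondition as met when the current direction equals the desired one; the function names are hypothetical, and the `#` stripping exists only because this article annotates its SDP that way.

```python
from typing import Dict, Tuple


def parse_preconditions(sdp: str) -> Tuple[Dict[str, str], Dict[str, Tuple[str, str]]]:
    """Extract current and desired QoS status per side ('local' / 'remote') from SDP text."""
    current: Dict[str, str] = {}
    desired: Dict[str, Tuple[str, str]] = {}
    for raw_line in sdp.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop article-style annotations
        if line.startswith("a=curr:qos "):
            _, side, direction = line.split()
            current[side] = direction
        elif line.startswith("a=des:qos "):
            _, strength, side, direction = line.split()
            desired[side] = (strength, direction)
    return current, desired


def mandatory_preconditions_met(sdp: str) -> bool:
    """Simplified rule: every mandatory desired direction must equal the current one."""
    current, desired = parse_preconditions(sdp)
    return all(
        current.get(side, "none") == direction
        for side, (strength, direction) in desired.items()
        if strength == "mandatory"
    )


offer = """a=curr:qos local none
a=des:qos mandatory local sendrecv
a=des:qos optional remote sendrecv"""

# What the UE sends in UPDATE once the dedicated bearer is up.
update = offer.replace("a=curr:qos local none", "a=curr:qos local sendrecv")

print(mandatory_preconditions_met(offer))
print(mandatory_preconditions_met(update))
```

Running it against the SDP of the INVITE and then of the UPDATE in a capture shows whether the far end was entitled to send 180 Ringing at the moment it did.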
The media plane — AMR, RTP, ROHC
With the bearer up and the SDP agreed, voice is just RTP. But the details decide quality and are where MOS complaints originate.
- Codec. VoLTE/VoNR mandate AMR (narrowband) and AMR-WB (wideband, "HD Voice", 50–7000 Hz). EVS (Enhanced Voice Services, super-wideband/fullband) is optional but increasingly default. The mode-set in the SDP fmtp restricts which AMR rates are allowed; a mismatch causes garbled audio even though the call connects.
- Packetisation. ptime:20 — one 20 ms speech frame per RTP packet (50 packets/s). max-red allows redundancy (re-sending prior frames) to survive loss.
- DTX / CN. During silence the encoder stops sending speech and emits occasional SID (comfort-noise) frames, roughly one every 160 ms, so the packet rate drops from 50/s to about 6/s. This is normal and saves radio resources; do not mistake it for packet loss.
- ROHC (Robust Header Compression). The IPv6+UDP+RTP header is 60 bytes (40 bytes over IPv4), which is larger than a 32-byte AMR-WB payload. PDCP runs ROHC (profile 0x0001) to crush it to 1–3 bytes. A stalled ROHC context after a handover is a real and nasty cause of one-way audio.
    IPv6  src 2001:db8:ca1:5::23  dst <far>                  # compressed to ~1B by ROHC over the air
    UDP   sport 50020  dport 50040
    RTP   V=2 PT=104(AMR-WB) seq=7421 ts=+320 ssrc=0x9af2    # ts step 320 = 20ms @16kHz
    AMR   CMR=8 (23.85k) F=0 FT=8 Q=1 [ 61 bytes speech ]
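The payload size and timestamp step in that packet follow from simple arithmetic. The sketch below assumes the bandwidth-efficient RTP payload format with one frame per packet (the article does not say which format is negotiated); the per-frame bit counts are the standard AMR-WB frame sizes, and the function names are hypothetical.

```python
import math

# Speech bits per 20 ms AMR-WB frame, indexed by frame type (FT); 9 is the SID frame.
AMR_WB_FRAME_BITS = {0: 132, 1: 177, 2: 253, 3: 285, 4: 317,
                     5: 365, 6: 397, 7: 461, 8: 477, 9: 40}


def payload_bytes(frame_type: int) -> int:
    """RTP payload size for one frame: 4-bit CMR + 6-bit ToC + speech bits, padded to bytes."""
    return math.ceil((4 + 6 + AMR_WB_FRAME_BITS[frame_type]) / 8)


def rtp_timestamp_step(clock_hz: int = 16000, ptime_ms: int = 20) -> int:
    """Timestamp increment between consecutive packets."""
    return clock_hz * ptime_ms // 1000


def packets_per_second(ptime_ms: int = 20) -> int:
    """Packet rate during active speech."""
    return 1000 // ptime_ms


print(payload_bytes(8))        # FT=8 is the 23.85 kbit/s mode
print(payload_bytes(2))        # FT=2 is the 12.65 kbit/s mode
print(rtp_timestamp_step())
print(packets_per_second())
```

A payload length that does not match any frame type for the negotiated format is a quick hint that the two ends disagree on the payload format or mode-set.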
VoNR — the 5G SA architecture delta
VoNR is VoLTE with the EPS swapped for the 5G System (5GC). The IMS is the same IMS. Here is precisely what changes:
| Concept | VoLTE (4G) | VoNR (5G SA) |
|---|---|---|
| Radio / core | E-UTRAN + EPC | NG-RAN (gNB) + 5GC |
| Signalling bearer | Default QCI 5 EPS bearer, APN ims | QoS flow 5QI 5 on IMS PDU session, DNN ims |
| Media bearer | Dedicated QCI 1 GBR bearer | GBR QoS flow 5QI 1 → mapped to a DRB |
| P-CSCF discovery | PCO in PDN Connectivity | PCO in PDU Session Establishment (same idea) |
| Policy from IMS | P-CSCF → PCRF over Rx (Diameter) | P-CSCF → PCF over N5 (HTTP/2) |
| Policy into core | PCRF → PGW over Gx | PCF → SMF over N7; SMF → gNB over N2/NGAP |
| Mobility brain | MME | AMF (+ SMF for sessions) |
| Voice fallback | SRVCC to 2G/3G | EPS Fallback to VoLTE; 5G-SRVCC (Rel-16) |
The deep change is the QoS model. In EPS, a "bearer" is the granularity of QoS. In 5GS, the QoS flow (identified by a QFI) is the finest QoS granularity, and the NG-RAN decides how to map QoS flows onto Data Radio Bearers. The 5QI 1 voice flow is what the IMS asks for; the gNB's flow-to-DRB mapping is an extra step that did not exist in 4G — and is part of why VoNR setup has more moving parts (Chapter 11).
VoNR call flow — NGAP & the QoS flow
Prerequisite: the UE has registered to 5GS (NAS Registration), established an IMS PDU session (DNN=ims) carrying a 5QI 5 QoS flow, learned its P-CSCF from the PCO, and completed IMS registration exactly as in Chapter 5 — the SIP is RAT-agnostic. The new part is how the 5QI 1 media flow is built.
    PDU Session ID = 5
    DNN = ims
    S-NSSAI = SST=1, SD=000001            # often a dedicated IMS slice
    PDU Session Type = IPv4v6
    Authorized QoS rules:
      QFI = 5 → 5QI = 5 (non-GBR)         # IMS SIP signalling
    Extended PCO:
      P-CSCF IPv6 = 2001:db8:5c1:f::1
Now the call is dialled. SIP INVITE flows over the 5QI 5 flow. The P-CSCF authorises the media over N5 to the PCF; the PCF installs an SM policy over N7 to the SMF; the SMF triggers a PDU Session Modification that adds the GBR 5QI 1 QoS flow, and over N2/NGAP the gNB sets up the corresponding DRB:
    POST /npcf-policyauthorization/v1/app-sessions
    {
      "medComponents": { "1": {
          "medType": "AUDIO",
          "fStatus": "ENABLED",
          "marBwUl": "42 Kbps", "marBwDl": "42 Kbps",
          "medSubComps": { "1": { "fDescs": ["permit out 17 from <ue> 50020 to any"] } }
      }},
      "afAppId": "IMS-MMTel"
    }
    PDUSessionResourceModifyRequest
      PDU Session ID = 5
      QosFlowAddOrModifyRequestList:
        QoS Flow Identifier (QFI) = 1
        QoS Characteristics: 5QI = 1 (GBR, delay-critical=no)
        Alloc/Retention Priority = {priority 2, pre-empt-cap may, pre-empt-vuln not}
        GBR (UL/DL) = 23.85 / 23.85 kbps; MBR (UL/DL) = 42 / 42 kbps
    # gNB now maps QFI 1 → a new Data Radio Bearer via RRCReconfiguration,
    # and returns PDUSessionResourceModifyResponse with the DRB / NG-U TEID.
The corresponding RRCReconfiguration over the air adds the DRB and the SDAP/PDCP config; the SDAP header is what carries the QFI so the gNB and UE agree which packets are the voice flow. From the SIP perspective the precondition/PRACK/UPDATE/180/200 ladder is identical to Chapter 6 — the only difference is that "bearer activates here" becomes "QoS flow + DRB activate here." That symmetry is the whole point of the 3GPP design: one IMS, two access flavours.
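The extra flow-to-DRB step can be pictured as a lookup table held by the gNB. The sketch below is a toy model only: the real mapping policy is gNB implementation-specific, and the "one DRB per distinct 5QI" rule, the class name and the DRB numbering are all assumptions made for illustration.

```python
from typing import Dict


class FlowToDrbMapper:
    """Toy gNB-side table: one DRB per distinct 5QI (a hypothetical policy)."""

    def __init__(self) -> None:
        self._drb_by_5qi: Dict[int, int] = {}
        self._drb_by_qfi: Dict[int, int] = {}

    def add_flow(self, qfi: int, five_qi: int) -> int:
        """Bind a QoS flow to a DRB, creating the DRB if this 5QI has none yet."""
        if five_qi not in self._drb_by_5qi:
            self._drb_by_5qi[five_qi] = len(self._drb_by_5qi) + 1
        self._drb_by_qfi[qfi] = self._drb_by_5qi[five_qi]
        return self._drb_by_qfi[qfi]

    def drb_for(self, qfi: int) -> int:
        """Look up the DRB for the QFI carried in a packet's SDAP header."""
        return self._drb_by_qfi[qfi]


mapper = FlowToDrbMapper()
print(mapper.add_flow(qfi=5, five_qi=5))   # IMS signalling flow, set up with the PDU session
print(mapper.add_flow(qfi=1, five_qi=1))   # voice flow, added at call time
print(mapper.drb_for(1))
```

If `drb_for` has no entry for the voice QFI, the flow was authorised in the core but never reached the radio, which is exactly the kind of gap the NGAP modify response cause will reveal.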
VoLTE: P-CSCF → Rx → PCRF → Gx → PGW → S1-AP → eNB. VoNR: P-CSCF → N5 → PCF → N7 → SMF → N2/NGAP → gNB → (QFI→DRB). One more hop and one more mapping. Every extra hop is a place a 5QI 1 flow can fail to appear — and a call to silently fall back (Ch.10) or sound bad.
EPS Fallback & SRVCC — when 5G/IMS can't keep the call
Most live "5G" networks in 2026 do not actually carry VoNR everywhere — NR voice coverage is patchier than NR data coverage, and many gNBs are configured to fall back to LTE for voice. This is EPS Fallback, and it is the single most common 5G voice behaviour in the field.
EPS Fallback adds 1–2 s to setup because the redirect/handover happens mid-INVITE. With N26 present the core context is transferred (seamless); without N26 the UE re-attaches on LTE (slower, but works). For the user it is invisible except for the delay and the radio bar dropping from 5G to 4G during the call.
SRVCC vs EPS Fallback — don't confuse them. EPS Fallback happens at call setup (5G→4G, both IMS). SRVCC (Single Radio Voice Call Continuity) happens during an active call when the user leaves IMS coverage entirely and the call must be handed to the legacy CS domain (LTE→UTRAN/GERAN over the Sv interface to an MSC; 5G-SRVCC NR→UTRAN added in Rel-16). EPS Fallback keeps you on packet voice; SRVCC drops you to a circuit.
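The distinction reduces to two questions: when did the move happen, and did the call leave the packet domain. A minimal sketch, assuming RAT changes are logged as `(call_phase, source_rat, target_rat)`; the function name, the phase labels and the `"other"` bucket are hypothetical.

```python
# RAT pairs that hand an active call to the CS domain, per the description above.
SRVCC_PAIRS = {("LTE", "UTRAN"), ("LTE", "GERAN"), ("NR", "UTRAN")}


def classify_voice_move(call_phase: str, source_rat: str, target_rat: str) -> str:
    """Label a RAT change seen on a voice call."""
    if call_phase == "setup" and (source_rat, target_rat) == ("NR", "LTE"):
        return "EPS Fallback"   # still IMS voice, now over LTE
    if call_phase == "active" and (source_rat, target_rat) in SRVCC_PAIRS:
        return "SRVCC"          # call handed to a circuit
    return "other"              # e.g. an ordinary packet handover


print(classify_voice_move("setup", "NR", "LTE"))
print(classify_voice_move("active", "LTE", "GERAN"))
print(classify_voice_move("active", "NR", "LTE"))
```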
Why VoNR setup looks slower than VoLTE
A frequent field and interview question. With both technologies well-tuned, a VoLTE call sets up in ~1.5–2.0 s and VoNR in ~1.8–2.5 s. The extra few hundred milliseconds are structural, not a bug:
- An extra policy hop. 4G: P-CSCF → Rx → PCRF → Gx → PGW. 5G inserts the SMF: P-CSCF → N5 → PCF → N7 → SMF → N2 → gNB. The N5 (PCF↔P-CSCF) interface is a deliberate SBA design choice that simply did not exist in EPC.
- QoS-flow-to-DRB mapping. In EPS each bearer maps one-to-one onto a radio bearer. In 5GS the gNB must additionally map the 5QI 1 QoS flow onto a DRB and configure SDAP, an extra RAN step per call.
- EPS Fallback, when configured. If the cell isn't VoNR-enabled, the redirect/handover to LTE adds 1–2 s — and to the user this is "5G voice," so it gets blamed on VoNR.
- Beam management & SA paging. On the terminating side, NR paging + beam acquisition can add latency versus LTE's simpler procedure.
The mitigations operators use: pre-establish the IMS PDU session at registration (so only a Modify, not an Establish, is needed at call time); pre-provision PCF policy so N5/N7 is a lookup not a negotiation; and enable VoNR per-cell aggressively so fallback is the exception. Done well, VoNR setup is within ~300 ms of VoLTE and the call quality (EVS) is better.
The 10-minute field troubleshooting method
When a voice complaint lands, do not start from the radio. Start from the layer that proves the call exists and walk down. Six checks, in order — each one localises the fault to a different node, so you stop at the first that fails.
Check 1: Is the UE IMS-registered?
No registration ⇒ no voice at all (data fine). Check for the QCI 5 / 5QI 5 signalling bearer on APN/DNN ims and a SIP 200 OK to REGISTER.
check: default bearer on "ims" · REGISTER → 200 OK · IPSec SA up
If it fails → P-CSCF reachability, IMS APN provisioning, ISIM/AKA (check 401 → RES≠XRES = auth reject).
Check 2: Did the dedicated media bearer come up?
The call connects but quality is bad / no audio ⇒ the QCI 1 / 5QI 1 GBR bearer likely never built. This is the highest-yield single check.
check: QCI 1 (VoLTE) or 5QI 1 QoS flow + DRB (VoNR) present for the call
If missing → Rx/Gx (4G) or N5/N7/NGAP (5G) policy failure; or EPS Fallback misfired. Look at the AAR/AAA and RAR/RAA, or the PDUSessionResourceModify cause.
Check 3: Did codec negotiation succeed?
Connects but audio is garbled/robotic ⇒ SDP offer/answer mismatch or wrong AMR mode-set. A hard mismatch returns 488 Not Acceptable Here.
check: SDP answer has a common codec (AMR-WB/EVS) · ptime 20 · same mode-set
If mismatch → align IMS-AGW/transcoder config and the UE's allowed codecs; verify EVS↔AMR-WB transcoding where one side lacks EVS.
Check 4: Is RTP flowing both ways?
One-way audio ⇒ RTP present in one direction only. Capture at UE, IMS-AGW and far end to find the dead leg.
check: uplink RTP seq advancing AND downlink RTP seq advancing
If one-way → TFT/packet-filter missing in that direction (Ch.6); SBC/NAT pinhole; SDP direction attribute (sendonly/recvonly); stalled ROHC context after handover.
Check 5: What does the media quality say?
Both ways but choppy ⇒ loss/jitter on the radio or transport. Read RTCP / drive-test KPIs.
check: packet loss < 1% · jitter < 30 ms · MOS ≥ 3.5
If poor → radio (poor CQI, HARQ retx), missing GBR admission (bearer built but not actually guaranteed), or transport congestion. Confirm the GBR is enforced, not just signalled.
Check 6: Did it fall back or hand over cleanly?
5G call slow or dropped ⇒ EPS Fallback / SRVCC behaviour. Confirm expected RAT.
check: EPS Fallback occurred? N26 present? SRVCC to CS mid-call?
If broken → N26/Sv configuration, target-LTE QCI 1 admission, neighbour relations, or a VoNR-disabled cell forcing constant fallback.
The discipline. Each check maps to exactly one part of the chain: 1=registration, 2=policy/bearer, 3=IMS-AGW/codec, 4=TFT/NAT, 5=radio/transport, 6=mobility. You never "look everywhere" — the first failing check is the domain that owns the fault.
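The stop-at-the-first-failure rule is easy to encode. The sketch below assumes each check has already been reduced to a boolean observation; the dictionary keys and function names are hypothetical, the domain labels are the ones listed above, and a missing observation is treated as a failure so that an unverified step cannot be skipped.

```python
from typing import List, Mapping, Optional, Tuple

# (observation key, owning domain), in the order the checks are run.
CHECKS: List[Tuple[str, str]] = [
    ("ims_registered", "registration"),
    ("media_bearer_up", "policy/bearer"),
    ("codec_agreed", "IMS-AGW/codec"),
    ("rtp_both_ways", "TFT/NAT"),
    ("quality_ok", "radio/transport"),
    ("mobility_clean", "mobility"),
]


def quality_ok(loss_pct: float, jitter_ms: float, mos: float) -> bool:
    """Thresholds from the media-quality check: loss < 1 %, jitter < 30 ms, MOS >= 3.5."""
    return loss_pct < 1.0 and jitter_ms < 30.0 and mos >= 3.5


def first_failing_check(observations: Mapping[str, bool]) -> Optional[Tuple[int, str]]:
    """Return (check number, owning domain) of the first failed check, or None if all pass."""
    for number, (key, domain) in enumerate(CHECKS, start=1):
        if not observations.get(key, False):
            return number, domain
    return None


one_way_audio = {
    "ims_registered": True,
    "media_bearer_up": True,
    "codec_agreed": True,
    "rtp_both_ways": False,
    "quality_ok": quality_ok(loss_pct=0.4, jitter_ms=12.0, mos=4.1),
    "mobility_clean": True,
}
print(first_failing_check(one_way_audio))
```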
Failure catalogue & the SIP error codes that name them
Symptom → most-likely root cause → the exact message or IE to inspect. This is the table to keep on the second monitor.
| Symptom | Most-likely root cause | Where to look |
|---|---|---|
| No voice at all, data OK | Not IMS-registered (no IMS APN, P-CSCF unreachable, AKA reject) | QCI5 bearer? REGISTER→200? 401 RES==XRES? |
| Dead air after answer | QCI 1 / 5QI 1 bearer never built; rang before media | Rx AAR/AAA · Gx RAR · NGAP ModifyResp cause |
| One-way audio | TFT/packet filter missing one direction; NAT pinhole; ROHC stall | TFT UL+DL filters · IMS-AGW legs · ROHC ctx |
| Robotic / garbled audio | Codec/mode-set mismatch; transcoding gap (EVS↔AMR-WB) | SDP fmtp mode-set · IMS-AGW transcoder |
| Choppy / low MOS | Radio loss/jitter; GBR signalled but not enforced | RTCP loss/jitter · CQI/HARQ · admission |
| Slow 5G setup / 4G icon mid-call | EPS Fallback (often a VoNR-disabled cell) | NGAP fallback cause · N26 · target QCI1 |
| Call drops at cell edge | SRVCC / handover failure; QCI1 not admitted in target | Sv (SRVCC) · X2/Xn HO · target ARP |
| Busy-hour-only failures | ARP too weak — voice can't pre-empt data | ARP priority & pre-emption flags on QCI1 |
The SIP response codes you will see — and what they mean for voice
| Code | Meaning | Typical voice cause |
|---|---|---|
| 403 | Forbidden | Subscriber not provisioned for MMTel / barred service |
| 404 | Not Found | Dialled number not routable (ENUM/number translation fail at TAS) |
| 408 | Request Timeout | Far end / S-CSCF not responding — transport or node down |
| 480 | Temporarily Unavailable | Callee not registered / out of coverage |
| 486 | Busy Here | Callee on another call (no call-waiting) — normal |
| 487 | Request Terminated | Caller hung up before answer (CANCEL) — normal |
| 488 | Not Acceptable Here | SDP/codec mismatch — no common codec/precondition |
| 503 | Service Unavailable | IMS node overload / congestion control |
| 580 | Precondition Failure | QoS bearer could not be established — policy/RAN refused QCI1/5QI1 |
The two codes that mean "bearer/QoS problem," not "people problem": 488 (codec/SDP, including precondition negotiation) and 580 (precondition/QoS could not be met). When you see these, go straight to checks 2 and 3 of the field method — the radio is almost never the cause.
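For log triage the table collapses into a lookup. The sketch below copies the codes and meanings from the table above and flags the two bearer/QoS codes; the `triage_hint` name and the wording of the hints are illustrative assumptions.

```python
# Codes and reason phrases from the table above.
SIP_VOICE_CODES = {
    403: "Forbidden",
    404: "Not Found",
    408: "Request Timeout",
    480: "Temporarily Unavailable",
    486: "Busy Here",
    487: "Request Terminated",
    488: "Not Acceptable Here",
    503: "Service Unavailable",
    580: "Precondition Failure",
}

# The two codes that point at the bearer/QoS seam rather than at the subscriber.
BEARER_QOS_CODES = {488, 580}


def triage_hint(code: int) -> str:
    """Turn a final SIP response code into a one-line pointer for the field method."""
    reason = SIP_VOICE_CODES.get(code, "Unknown")
    if code in BEARER_QOS_CODES:
        return f"{code} {reason}: bearer/QoS problem, go to checks 2 and 3"
    return f"{code} {reason}: not a bearer/QoS code, see the typical cause in the table"


print(triage_hint(580))
print(triage_hint(486))
```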
Putting it together
VoLTE and VoNR are the same IMS voice service wearing two different access coats. The application — registration with AKA, the precondition-guarded INVITE, AMR-WB/EVS over RTP — is shared. The differences are all in the seam where the IMS reaches into the packet core to demand a guaranteed pipe: Rx/Gx + QCI 1 in 4G, N5/N7/NGAP + 5QI 1 QoS flow → DRB in 5G, with EPS Fallback bridging the two when NR voice coverage runs out. Master that seam — and the six-step method that walks it — and you can localise any voice fault to a single node in under ten minutes, with the capture to prove it.