COURSE HANDBOOK · CONFIGURATION + KPI + SCRIPTS + 3 OPTIMIZATION DESIGNS

Optimization Design

The engineering handbook of the whole course on one page: every voice-relevant parameter with its live value, default, tuning range and rationale · the full KPI formula library with exact counters · ready-to-run moshell/kget scripts · and three complete optimization designs — ① Configuration, ② Voice-quality improvement, ③ Mute / one-way-call improvement — each written as an executable program: detect → decide → tune → verify → roll back.

Nothing here is invented. Every value in the “Live” columns is from the NYC tri-band node kget (25.Q3, 25,224 MOs); defaults are from the LTE RAN 25.Q4.4 MOM; policy values are from the operator golden file (6,027 rows). Counter names are from the live PM inventory. Tune on your network only through the Module-8 trial protocol.

D1 · Configuration optimization design

Configuration optimization is a control loop, not an event: photograph → policy → deviation → verdict → change-process → re-photograph. The tables below are the loop's substance — the voice-critical parameter surface with live values and tuning guidance.

1.1 · The bearer & scheduler surface (QciProfilePredefined, Module 1+3)

| MO.attribute | Live | MOM default | Range / step | Tuning rationale |
|---|---|---|---|---|
| qci1.priority | 1 | 2 (per 23.203 order) | 1–15 | Deadline beats persistence: voice above SIP at congestion. Network-wide policy, never a per-KPI lever. |
| qci1.pdb | 80 ms | 100 | 50–300 | Radio-leg slice of the 280 ms corridor. Lower = tighter scheduler deadline weighting; only with measured transport headroom. |
| qci1.pdbOffset | 50 ms | 0 | 0–100 | Reserves corridor headroom: scheduler sees pdb−offset. Raise where transport legs are long (×2 operator calls). |
| qci1.aqmMode | MODE2 | OFF | OFF/1/2 | Deadline-respecting AQM for voice. Pair with pdbOffset: MODE2 drops only frames that cannot make the deadline. |
| qci5.aqmMode | OFF | OFF | | Never drop SIP for latency: RLC AM retransmits anyway; you'd pay twice. |
| qci9.aqmMode | MODE1 | OFF | | Early-drop for TCP: loss is the signal TCP understands. |
| qci1.dscp | 40 | 46 (EF common) | 0–63 | Voice marking; must match transport queue maps end-to-end (Design 3, cause 4). |
| qci5.dscp / qci9.dscp | 26 / 10 | | 0–63 | Assured for SIP, best-effort intent for data. Verify with marked TWAMP, never assume. |
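
The pdb and pdbOffset rows reduce to one subtraction. The following is a minimal sketch, assuming the scheduler's working deadline is exactly pdb minus pdbOffset as the rationale column states; the function name is illustrative and is not a node API.

```python
def effective_deadline_ms(pdb_ms: int, pdb_offset_ms: int) -> int:
    """Deadline the scheduler works against: pdb minus the reserved offset."""
    if pdb_offset_ms < 0 or pdb_offset_ms > pdb_ms:
        raise ValueError("pdbOffset must lie between 0 and pdb")
    return pdb_ms - pdb_offset_ms


# Live and MOM-default values taken from the table above.
live_deadline = effective_deadline_ms(80, 50)
default_deadline = effective_deadline_ms(100, 0)
print(live_deadline, default_deadline)
```

Raising pdbOffset without touching pdb shrinks the radio-leg deadline one for one, which is why the table asks for measured transport evidence before moving either value.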

1.2 · Admission & access (Module 2)

| MO.attribute | Live | Default | Range | Tuning rationale |
|---|---|---|---|---|
| dlAdmDifferentiationThr | 800 | | 0–1000 ‰ | Soft gate: low-ARP GBR refused. Lower = earlier protection, more refusals. Move only after a capacity calendar review. |
| dlAdmOverloadThr | 850 | 950 | 0–1000 ‰ | Hard wall. This operator stops 100 ‰ earlier than default: headroom for fading + mobility of admitted calls. |
| preemptInactTimerMin | 15 s | 4 | 0–60 | Don't evict the briefly silent (a mute is not inactivity). Raise where pre-emption victims show recent voice. |
| paArpOverride (HPA seats) | 20 + override 6 | | per cell | Reserved admission seats for priority ARP. Size to busy-hour priority population + margin. |
| acBarringFactor / Time | 95 / 4 s (staged, AUTO) | | 0–95 / 4–512 | Owned by Load-Based barring automation. Manual edits fight the robot: change the automation's inputs, not its outputs. |
| acBarringSkipForMmtelVoice | false | false | bool | The deliberate position: skipped attempts still cost PRACH/PDCCH and fail deeper. Flip only for event cells, with barring active. |
| pagingDiscardTimer / maxNoOfPagingRecords | 3 / 7 | | 1–10 / 1–16 | Stale pages die; MME re-pages fresh. Idle-mode dials move whole tracking areas: the least casual surface in the book. |
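
The soft gate and the hard wall can be read as a two-threshold decision. The sketch below is a deliberate simplification: it assumes load is a single value in ‰, models "low-ARP GBR" as a boolean flag, and treats the hard wall as refusing every new request. Real admission control weighs more inputs, and the function name is hypothetical.

```python
def admission_verdict(load_permille: int, low_priority_gbr: bool,
                      diff_thr: int = 800, overload_thr: int = 850) -> str:
    """Return 'admit' or 'refuse' for a new GBR request (simplified model)."""
    if load_permille >= overload_thr:
        return "refuse"   # hard wall (dlAdmOverloadThr) in this sketch
    if load_permille >= diff_thr and low_priority_gbr:
        return "refuse"   # soft gate (dlAdmDifferentiationThr): low-ARP GBR only
    return "admit"


for load, low_prio in [(790, True), (820, True), (820, False), (860, False)]:
    print(load, low_prio, admission_verdict(load, low_prio))
```

The 50 ‰ band between the two live thresholds is where priority decides the outcome; lowering dlAdmDifferentiationThr widens that band and produces more refusals for low-priority requests.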

1.3 · Latency & uplink (Module 3)

| MO.attribute | Live | Default | Range | Tuning rationale |
|---|---|---|---|---|
| voltePreschedulingEnabled | false (staged) | false | bool | The latency weapon. Profile already tuned (next row). Activate where SR-to-grant dominates the UL delay tail. |
| PreschedulingProfile (size/period/duration) | 86 B / 5 ms / 200 ms | 100 / - / - | B/ms/ms | 86 B = AMR-WB 23.85 frame + ROHC + MAC/RLC. Wrong size wastes PUSCH (too big) or splits frames (too small). |
| SR period (PUCCH) | 10 ms | 10/20 | 5–80 | Worst-case grant-request wait = half a voice frame. Longer periods eat the radio-leg budget invisibly. |
| noOfPucchSrUsers / CqiUsers | 320 / 320 | | dim. | SR budget vs busy-hour connected UEs. Saturation = setup latency no downstream feature recovers (D1 case 3, M3 theory). |
| pucchOverdimensioning | 50 | 0 | 0–100 % | Absorbs handover surges into the PUCCH region. |

1.4 · Link robustness (Module 4)

| MO.attribute | Live | Default | Range | Tuning rationale |
|---|---|---|---|---|
| TTI bundling (CXC4011253) | DEACTIVATED · golden: ACTIVATED | | state | +4 dB UL for edge talkers, thresholds already tuned, arming-on-HO configured. The crown governance finding: resolve via trial, not assumption. |
| ulHarqVolteBlerTarget | 5 % | 10 | 1–30 | Voice's own gambling policy: half the data working point. Verify on mean-HARQ ≤ 1.15; raise only if PUSCH efficiency is genuinely scarce (it isn't, for 30 kbps). |
| pdcchOuterLoop (init/up/down) | −70 / 6 / 20 | | 0.1 dB units | Asymmetric: back off fast on NACK, creep back slow. Touch only with DCI-miss evidence per aggregation level. |
| pdcchPowerBoostMax | 0 (armed) | 0 | 0–6 dB | The loaded magazine. Raise one step on edge DCI-miss evidence; cost is PDSCH power budget. |
| pZeroNominalPusch / Pucch · alpha | −103 / −117 · 1 | | dBm · 0–1 | The floor under everything. α=1 = full pathloss compensation (edge UEs shout; neighbors pay). Market decision, modeled first, never per-complaint. |
| enableServiceSpecificHARQ | true ×9 | false | bool | Voice's own max-HARQ ladder. 7 rungs × 8 ms = 56 ms of the 80 ms PDB: exactly one ladder per frame. |
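
The ladder arithmetic in the last row is worth keeping as a check whenever pdb or the max-HARQ setting changes. A minimal sketch, assuming the 7 transmissions and 8 ms round trip quoted in the table; the helper name is illustrative.

```python
def harq_ladder_budget(pdb_ms: int, max_tx: int = 7, rtt_ms: int = 8):
    """Return (ladder duration, headroom left in the PDB, whether it fits)."""
    ladder_ms = max_tx * rtt_ms
    return ladder_ms, pdb_ms - ladder_ms, ladder_ms <= pdb_ms


ladder, headroom, fits = harq_ladder_budget(80)
print(ladder, headroom, fits)
```

With the live 80 ms PDB the ladder leaves 24 ms of headroom; at a 50 ms PDB (the bottom of the range in section 1.1) the same ladder would no longer fit.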

1.5 · DRX & release (Module 5)

| MO.attribute | Live | Default | Range | Tuning rationale |
|---|---|---|---|---|
| DrxProfile=1 longDrxCycle | SF40 | | SF10–SF2560 | The 40 ms contract: two voice frames per wake, one full HARQ ladder of margin inside the PDB. The cycle is the latency contract; tune onDuration first. |
| onDurationTimer / drxInactivityTimer / drxRetransmissionTimer | PSF10 / PSF8 / PSF2 | | PSF1–200 | Listen window / post-activity wake / HARQ patience. The retrans timer must outlast the HARQ RTT or edge ladders silently halve (M5 theory case 2). |
| shortDrxCycleTimer | 0 | | 0–16 | One rhythm, no second gear: deliberate simplicity for the voice profile. |
| inactivityTimerOffset | 35 s | 0 | 0–60 | Voice connections earn extra patience: post-call SIP/redial clusters in the next 25 s. Defended with the trace histogram, not folklore. |
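
Two numbers follow directly from the DRX rows: voice frames delivered per wake and the minimum listening duty cycle. The sketch below assumes a 20 ms speech frame interval and counts only the on-duration window, ignoring extensions from drxInactivityTimer and retransmissions, so the duty cycle is a floor rather than a measurement.

```python
VOICE_FRAME_MS = 20  # speech frame interval assumed for AMR / AMR-WB


def frames_per_wake(long_drx_cycle_ms: int) -> float:
    """Voice frames that accumulate during one long DRX cycle."""
    return long_drx_cycle_ms / VOICE_FRAME_MS


def min_duty_cycle(on_duration_sf: int, long_drx_cycle_sf: int) -> float:
    """Fraction of subframes the UE must listen, on-duration only."""
    return on_duration_sf / long_drx_cycle_sf


print(frames_per_wake(40))      # live SF40 cycle
print(min_duty_cycle(10, 40))   # live PSF10 on-duration
print(min_duty_cycle(8, 40))    # a PSF8 on-duration at the same cycle
```

Shortening the on-duration lowers the duty-cycle floor while leaving the 40 ms cycle, and with it the frames-per-wake figure, unchanged.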

1.6 · Mobility (Module 6)

| MO.attribute | Live | Default | Range | Tuning rationale |
|---|---|---|---|---|
| VoLTE quality detectors UL/DL | DISABLED, thresholds 5/6 · 4/10 | | ratio | Mobility on delivery quality, not signal, for strong-signal asymmetric deaths. Wake on autopsy evidence. |
| hoOptQci1 | 5 / 50 / 200 | | | Voice's own self-optimization triplet. FREEZE on trial relations: the optimizer learns from statistics that trials perturb. |
| sCellHandlingAtVolteCall | DECONF_UL_SUPPRESS_DL_SCELLS ×9 | NO_ACTION | enum | CA stands down during calls: deconfigure UL SCells, suppress DL activation. The ×9 consistency is itself audited. |
| endcSplitAllowedMoVoice | false | | bool | Voice never rides the split bearer: it would pay PDCP reordering jitter for capacity that voice doesn't need. |
| srvccDelayTimer | 0 | 0 | 0–10 s | SRVCC-at-setup caching: 0 on metro (geography is a parameter); corridor networks raise it. |

1.7 · Integrity & codec (Module 7)

| MO.attribute | Live | Default | Range | Tuning rationale |
|---|---|---|---|---|
| tReorderingUl/Dl (qci1) | 60 ms | | 0–200 | Sized to outlast the 7×8 ms HARQ ladder, barely. Shorter starves HARQ; longer just delays the inevitable discard. |
| RLC SN / PDCP SN (voice) | 10 / 12 bit | | | Headers matter at 30-byte payloads. Never set voice to RLC AM: late frames are dead frames (the wrong-ruler case). |
| bitRateRecommendationEnabled | false | false | bool | The codec-rate loop (FAJ 121 5014, license-gated). Open on edge-concentrated loss evidence; downshifting is a quality feature. |
| dscpArpMap[15] | all −1 | −1 | −1–63 | Per-ARP marking refinement, staged unused: emergency could mark differently. |

The audit machine (run monthly · post-upgrade · pre-trial)

1. Photograph: kget on every node → parse to (MO class, attribute, struct member, instance, value).
2. Join the golden file on canonical keys; normalize enums and user-vs-internal forms.
3. Evaluate exceptions before flagging; audit stale exceptions in reverse.
4. Output three populations: matches · deviations · ungoverned (report it — policy gaps hide here).
5. Verdict each deviation: fix (drift → change process) / challenge (golden stale → file with evidence) / investigate (KPI-correlated → playbooks). Never silent mass reversion.
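
Steps 2 to 4 can be sketched as a keyed comparison. This is an illustration under stated assumptions, not the course's audit tool: the key is reduced to (MO class, attribute), normalization is only case and whitespace folding, and a "stale exception" is read here as an approved exception whose value now matches the golden file anyway. Only the TTI bundling pair mirrors the tables above; the other sample rows are invented for the example.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (MO class, attribute): a simplified canonical key


def normalize(value: str) -> str:
    """Fold case and whitespace so 'Activated ' and 'ACTIVATED' compare equal."""
    return str(value).strip().upper()


def audit(live: Dict[Key, str], golden: Dict[Key, str],
          exceptions: Iterable[Key] = ()) -> Dict[str, List]:
    """Split the live snapshot into matches, deviations and ungoverned keys."""
    approved = set(exceptions)
    result: Dict[str, List] = {"matches": [], "deviations": [], "ungoverned": [],
                               "excepted": [], "stale_exceptions": []}
    for key in sorted(live):
        if key not in golden:
            result["ungoverned"].append(key)            # no policy row: report it
        elif normalize(live[key]) == normalize(golden[key]):
            result["matches"].append(key)
            if key in approved:
                result["stale_exceptions"].append(key)  # exception no longer needed
        elif key in approved:
            result["excepted"].append(key)              # evaluated before flagging
        else:
            result["deviations"].append((key, live[key], golden[key]))
    return result


# Illustrative data only; not a dump from the node or the golden file.
live = {("FeatureState", "TtiBundling"): "DEACTIVATED",
        ("QciProfilePredefined", "qci1.pdb"): "80",
        ("EUtranCellFDD", "exampleUngovernedAttr"): "1"}
golden = {("FeatureState", "TtiBundling"): "ACTIVATED",
          ("QciProfilePredefined", "qci1.pdb"): "80"}
report = audit(live, golden, exceptions=[("QciProfilePredefined", "qci1.pdb")])
for population, rows in report.items():
    print(population, rows)
```

The function only sorts rows into populations. The verdict in step 5 (fix, challenge or investigate) stays a human decision made per deviation.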

KPI · The KPI formula library

Exact counters, the formulas this course trusts, and the reading law attached (volume before ratio · definition before comparison · bins before aggregates).

| KPI | Formula (QCI-1 bin) | Reading discipline |
|---|---|---|
| VoLTE Accessibility (added) | pmErabEstabSuccAddedQci[1] ÷ pmErabEstabAttAddedQci[1] | Always beside the attempt volume sparkline; a “better” ratio on collapsed attempts is barring or paging loss upstream. |
| Initial accessibility | pmErabEstabSuccInitQci[1] ÷ pmErabEstabAttInitQci[1] (× RRC + S1 legs for the full chain) | Initial vs added separates attach-time voice from mid-session voice. |
| VoLTE Drop (percentage) | pmErabRelAbnormalEnbActQci[1] ÷ (pmErabRelAbnormalEnbQci[1] + pmErabRelNormalEnbQci[1] + pmErabRelMme…) | “Act” qualifier = data in flight (a real conversation died). HO-ongoing exclusions keep it surgical. |
| VoLTE Drop (per exposure) | pmErabRelAbnormalEnbActQci[1] ÷ pmSessionTimeDrbQci[1] (drops per session-minute) | The honest twin: immune to call-length mix. Dashboard carries both side by side. |
| UL air loss | pmPdcpPktLostUlQci[1] ÷ (pmPdcpPktLostUlQci[1] + pmPdcpPktReceivedUlQci[1]) | Surface 1 of 3. Suspects: power, adaptation, interference. |
| DL in-node discard | pmPdcpPktDiscDlPelrQci[1] ÷ DL packets | Surface 2: queues/AQM/congestion; comes with timestamps (load calendar). |
| DL air loss | pmPdcpPktDiscDlPelrUuQci[1] ÷ DL packets | Surface 3: transmitted, never acknowledged, the radio truth-teller. |
| Leading indicator | mean HARQ transmissions per voice TB (per-QCI HARQ stats) | Moves days before loss. Amber at 1.3, red at 1.6 on this node's history. |
| SRVCC health | per-phase: preparation SR × execution SR | Prep failures = core/neighbor config; exec failures = radio timing at the edge. |
| Paging health | discards ÷ received (cell) + records-per-occasion distribution | The stealth accessibility killer: lost pages never become attempts. |
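
A few of the formulas above, written so that the "volume before ratio" rule is enforced in code. This is a sketch under assumptions: counter values are passed in as plain numbers already extracted for the QCI-1 bin, the minimum-attempt guard of 50 is a hypothetical default, and session time is assumed to be converted to minutes by the caller.

```python
from typing import Optional


def ratio(num: float, den: float, min_volume: float = 0) -> Optional[float]:
    """num/den, or None when the denominator is zero or below min_volume."""
    if den <= 0 or den < min_volume:
        return None
    return num / den


def accessibility_added(succ: float, att: float,
                        min_attempts: float = 50) -> Optional[float]:
    """pmErabEstabSuccAddedQci[1] / pmErabEstabAttAddedQci[1], volume-guarded."""
    return ratio(succ, att, min_attempts)


def drops_per_session_minute(abnormal_act: float,
                             session_minutes: float) -> Optional[float]:
    """pmErabRelAbnormalEnbActQci[1] per minute of QCI-1 session time."""
    return ratio(abnormal_act, session_minutes)


def ul_air_loss(lost: float, received: float) -> Optional[float]:
    """pmPdcpPktLostUlQci[1] / (lost + pmPdcpPktReceivedUlQci[1])."""
    return ratio(lost, lost + received)


def harq_status(mean_tx: float, amber: float = 1.3, red: float = 1.6) -> str:
    """Leading-indicator colour using this node's amber and red levels."""
    if mean_tx >= red:
        return "red"
    return "amber" if mean_tx >= amber else "green"


print(accessibility_added(990, 1000))  # enough attempts: ratio reported
print(accessibility_added(9, 10))      # too few attempts: None, not a headline 90 %
print(ul_air_loss(5, 995), harq_status(1.2), harq_status(1.4))
```

Returning None instead of a ratio on thin volume keeps a quiet hour from showing up as a KPI swing on the dashboard.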

SCR · The script library

moshell/AMOS against the live node. Read-only first — every set goes through change process with a backup (kget before, always).

## 0 · Photograph the node (the audit's input — run before ANY change)
moshell NODE
lt all                                  // load full MO tree
kget                                    // full config dump → archive dated copy
inv                                     // HW/license inventory

## 1 · The voice bearer surface in one pass
get qciprofilepredefined=qci1           // priority/pdb/pdbOffset/aqm/dscp/...
get . drx                               // DrxProfile timers (SF40/PSF10/PSF8/PSF2)
get . preschedulingprofile              // 86B / 5ms / 200ms — staged
get . scellhandlingatvoltecall          // expect DECONF_UL_SUPPRESS_DL_SCELLS ×9
get . ulharqvolteblertarget             // expect 5
get . pdcchpowerboostmax                // expect 0 (armed magazine)

## 2 · Feature governance — the three truths per feature
get featurestate=.* featurestate        // ACTIVATED / DEACTIVATED
get featurestate=.* licensestate        // ENABLED?
get featurestate=.* servicestate        // OPERABLE? — all three or it's decoration
lpr                                     // license keys + expiry (the licensing pass)

## 3 · KPI pull — the four questions, hourly bins
pmx . pmErabEstabAttAddedQci.1|pmErabEstabSuccAddedQci.1 -m 24      // can calls start
pmx . pmErabRelAbnormalEnbActQci.1|pmSessionTimeDrbQci.1 -m 24      // do they survive
pmx . pmPdcpPktLostUlQci.1|pmPdcpPktDiscDlPelrQci.1|pmPdcpPktDiscDlPelrUuQci.1 -m 24   // was it good (3 surfaces)
pmx . pmRrcConnEstabFailDynUeAdmCtrlMoVoice -m 24                   // did machinery engage

## 4 · A guarded change (example: the boost magazine, one step)
kget > pre-change-$(date).kget          // backup FIRST
lset EUtranCellFDD=CELL pdcchPowerBoostMax 1     // one cell, one step
get EUtranCellFDD=CELL pdcchPowerBoostMax        // verify the write
// then: engagement counters within the hour, KPI delta vs control cells for a week,
// rollback = lset back to 0 (rehearsed before the change, not after)
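
The "KPI delta vs control cells" step can be computed as a difference-in-differences. That estimator is a choice made for this sketch, not something the handbook prescribes, and the sample numbers are invented. Each list holds one KPI value per day for the trial cell or for the control group.

```python
from statistics import mean
from typing import Sequence


def delta_vs_control(trial_before: Sequence[float], trial_after: Sequence[float],
                     control_before: Sequence[float],
                     control_after: Sequence[float]) -> float:
    """Change on the trial cell minus the change on the control cells."""
    trial_change = mean(trial_after) - mean(trial_before)
    control_change = mean(control_after) - mean(control_before)
    return trial_change - control_change


# Illustrative drop-rate values in percent; not measurements.
effect = delta_vs_control(trial_before=[2.0, 2.2], trial_after=[1.5, 1.7],
                          control_before=[2.1, 2.1], control_after=[2.0, 2.0])
print(round(effect, 3))
```

A negative result on a drop KPI means the trial cell improved by more than the control group moved on its own, which is the part of the change that can be credited to the parameter.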

D2 · Voice-quality improvement design

A complete program, executed as a loop. Stage A — locate (which question fails: start / survive / sound / engage). Stage B — the matrix (symptom row → parameter column). Stage C — one change through the trial protocol. Stage D — verify on the formula library, leading indicators first.

| Symptom (measured) | First parameter / feature lever | Second lever | Verify on |
|---|---|---|---|
| UL delay tail > PDB, SR-heavy cells | voltePreschedulingEnabled=true (profile staged) | SR period ↓ / Prioritized SR (already on) | UL delay %iles, SR volume, PUSCH cost |
| Edge UL loss, PHR≈0 UEs | TTI bundling trial (thresholds staged; golden says ACTIVATED) | Single-codeword for VoLTE (rank-1) | edge-bin pmPdcpPktLostUlQci[1], mean-HARQ |
| Loss + healthy CQI + DCI misses | pdcchPowerBoostMax 0→1 step | outer-loop floor review (−70/6/20) | DCI miss by aggregation level, drop delta |
| Mean-HARQ drifting up, loss still flat | interference hunt (the leading-indicator drill) | ulHarqVolteBlerTarget guard (keep 5) | mean-HARQ back ≤ 1.15 |
| In-node DL discards stepping at busy hour | admission corridor review (800/850 calendar) | AQM MODE2+offset50 sanity (never AM, never qci5 drop) | pmPdcpPktDiscDlPelrQci[1] vs load |
| Drops clustered in HO windows | relation-scoped hysteresis/TTT (never cell-wide first) | cascade threshold walk (no missing stair) | per-relation HO fail, ping-pong rate |
| Strong-signal asymmetric deaths | wake quality detectors (5/6 UL · 4/10 DL staged) | service-triggered mobility steering | quality-triggered HO counters, cluster drops |
| Edge “breaking up” before SRVCC | bitRateRecommendationEnabled=true (license first) | SRVCC B2 placement from drive data | edge loss at constant traffic, complaint text |
| Battery complaints, latency fine | onDurationTimer PSF10→PSF8 (never the cycle first) | adaptive DRX scoped to data QCIs only | voice delay %iles unchanged, duty cycle |
| Drops + attempts spiking together | release policy review (tInactivity chain, offset 35) | none: it's not radio; stop | the twin spikes collapse together |

The program rules

One lever per population per window · neighbors are guard rails · engagement counters prove the machinery fired before any KPI is credited to it · accepted changes graduate into the golden file through process — or the next audit reverts your improvement as drift.
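
The first rule, one lever per population per window, is easy to check mechanically before a trial is approved. A minimal sketch, assuming each trial is described by a lever name, a set of cells and an inclusive date window; the `Trial` record, the cell names and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Trial:
    lever: str
    cells: FrozenSet[str]
    start: date
    end: date  # inclusive


def conflicts(trials: List[Trial]) -> List[Tuple[str, str, List[str]]]:
    """Pairs of trials that share cells while their windows overlap."""
    found = []
    for i, a in enumerate(trials):
        for b in trials[i + 1:]:
            windows_overlap = a.start <= b.end and b.start <= a.end
            shared = a.cells & b.cells
            if windows_overlap and shared:
                found.append((a.lever, b.lever, sorted(shared)))
    return found


plan = [
    Trial("pdcchPowerBoostMax", frozenset({"CELL_A", "CELL_B"}),
          date(2025, 1, 6), date(2025, 1, 12)),
    Trial("onDurationTimer", frozenset({"CELL_B", "CELL_C"}),
          date(2025, 1, 10), date(2025, 1, 16)),
    Trial("voltePreschedulingEnabled", frozenset({"CELL_A"}),
          date(2025, 1, 13), date(2025, 1, 19)),
]
print(conflicts(plan))
```

Here only the first two trials collide, on CELL_B between 10 and 12 January; the third starts after the first has closed, so it passes.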

D3 · Mute & one-way-call improvement design

The trickster failure: signalling healthy, call connected, someone hears silence — mid-call mute, one-way audio, or clipped onsets. The control plane cannot see it; the design below can. Direction first, surface second, escalate with evidence.

3.1 · The six-cause tree

| # | Cause | Detection (exact check) | Fix lever |
|---|---|---|---|
| 1 | UL air death (interference / power limit): “they can't hear me” | pmPdcpPktLostUlQci[1] elevated on the complainer's cell; traces: UL SINR collapse at healthy DL (glass-tower asymmetry); PHR≈0 | Interference hunt → TTI bundling / boost / quality-detector wake (D2 rows 2–3, 7) |
| 2 | DL air death: “I can't hear them” | pmPdcpPktDiscDlPelrUuQci[1] on the listener's cell; CQI + DCI-miss pairing decides traffic vs control channel | Coverage/interference path, or boost magazine if control-side |
| 3 | RTP blackhole in transport/core (one direction's stream dies; radio surfaces clean) | All three loss surfaces clean both ends + complaint persists → marked TWAMP per segment; core media-path trace; firewall/NAT asymmetry review | Transport/core escalation with the radio exonerated by evidence: timestamps, not suspicion |
| 4 | DSCP demotion (voice queues behind bulk under load; intermittent mute at busy hour) | TWAMP 40-vs-10 differential collapses under load on one segment; in-node discards rise while air is clean | Restore transport queue maps; stand the differential canary as a permanent war-room tile |
| 5 | ROHC context damage (bursty mutes on one site, lossy backhaul) | Compression state counters: repeated IR fallbacks; backhaul microburst loss on TWAMP | Fix the transport segment. ROHC stays on; its resync is the symptom, not the disease |
| 6 | Sleep/release misfires (clipped first words after silence; calls “going quiet” then dropping) | Post-silence onset delay %iles ≫ one DRX cycle; adaptive DRX lengthening cycles during SID-only periods; preemptInactTimer evicting muted calls | Pin QCI-1 to DrxProfile 1 (scope adaptation to data) · preemptInactTimerMin=15 honored · drxRetransmissionTimer ≥ HARQ RTT |

3.2 · The mute-call runbook (executable order)

## Step 1 — direction discipline (the complaint text tells you)
"they can't hear me"  → complainer's UPLINK     → causes 1, 3, 4
"I can't hear them"   → complainer's DOWNLINK   → causes 2, 3, 4
"first words clipped" → onset latency           → cause 6
"goes silent at rush hour" → load-correlated     → cause 4 first

## Step 2 — surface check (one pmx, both ends' cells)
pmx CELL pmPdcpPktLostUlQci.1|pmPdcpPktDiscDlPelrQci.1|pmPdcpPktDiscDlPelrUuQci.1 -m 24
// surfaces dirty → radio path (causes 1/2/5/6): trace CQI+SINR at failure timestamps
// surfaces CLEAN → transport path (causes 3/4): marked TWAMP per segment, under load

## Step 3 — the burst dimension (the ear hears patterns)
// isolated single frames = concealment absorbs (routine); 3+ consecutive = audible mute (urgent)

## Step 4 — one fix from the tree, through the trial protocol; verify on the SAME check that found it
## Step 5 — postmortem: would the dashboard see this faster next time? (tile or threshold added)
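
Steps 1 to 3 of the runbook can be sketched as three small helpers. The complaint-to-cause mapping and the three-frame mute threshold are copied from the steps above; the function names, the exact-match lookup on complaint text and the per-frame loss flags (one boolean per voice frame, from a trace) are assumptions of this example.

```python
from typing import Dict, List, Sequence, Tuple

# Step 1: complaint text -> (direction, candidate causes), as in the runbook.
DIRECTION: Dict[str, Tuple[str, List[int]]] = {
    "they can't hear me": ("uplink", [1, 3, 4]),
    "I can't hear them": ("downlink", [2, 3, 4]),
    "first words clipped": ("onset latency", [6]),
    "goes silent at rush hour": ("load-correlated", [4]),
}


def next_path(surfaces_dirty: bool) -> str:
    """Step 2: dirty loss surfaces point at radio, clean ones at transport."""
    return "radio" if surfaces_dirty else "transport"


def longest_loss_run(lost: Sequence[bool]) -> int:
    """Length of the longest run of consecutive lost frames."""
    best = run = 0
    for frame_lost in lost:
        run = run + 1 if frame_lost else 0
        best = max(best, run)
    return best


def burst_severity(lost: Sequence[bool], mute_run: int = 3) -> str:
    """Step 3: isolated losses are routine; mute_run or more in a row is urgent."""
    longest = longest_loss_run(lost)
    if longest >= mute_run:
        return "urgent"
    return "routine" if longest > 0 else "clean"


direction, causes = DIRECTION["they can't hear me"]
print(direction, causes, next_path(surfaces_dirty=False))
print(burst_severity([False, True, False, True, False]))
print(burst_severity([False, True, True, True, False]))
```

Both sample traces lose frames, but only the second contains three in a row, so the same loss count can be routine or urgent depending on the pattern.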
The two rules that save weeks

Never escalate a mute to transport without the surfaces printed — “radio exonerated by evidence” is what makes the transport team move. Never “fix” a mute by setting voice to RLC AM — the loss KPI will improve and the mute will get worse (frames arriving embalmed; the wrong-ruler case is module 7's law).

References