Abstract
Scope covers production-sound capture in Indian field conditions, location mix practices, metadata and deliverables, risk factors (climate/noise/power), decision criteria for ADR vs. salvage, and handoff into post (dialogue edit, mixing, M&E). Output includes stepwise workflows, checklists, and tables for fast deployment.
1) Terminology & Scope
- Production Sound: Dialogue and synchronous effects recorded on set.
- Location Mix: On-set mono/stereo mix derived from ISO sources for editorial reference and, when quality allows, for final mix use.
- ISO Tracks: Discrete, unprocessed mic channels.
- Wild Lines / Room Tone: Wild lines are dialogue captured off-camera; room tone is 30–120 s of ambience recorded per space.
- ADR: Automated Dialogue Replacement; in-studio re-record of lines; includes loop group (walla) and voice-over (VO).
- Sync Backbone: Timecode + slate + sound reports tying picture and sound.
2) India Field Conditions: Operating Realities
- Acoustic floor: Dense traffic, horns, construction, market ambience; intermittent PA systems; urban birds at dawn; coastal wind.
- Climate & seasonality: Monsoon corridors (roughly Jun–Sep west coast; Jul–Oct east coast; variable in mountains); humidity spikes; hot-season cicada/ambient insects in forests.
- Power: 50 Hz mains; generator placement influences low-frequency contamination; earthing quality varies.
- Language matrix: Hindi + major regional languages (Tamil, Telugu, Malayalam, Kannada, Bengali, Marathi, etc.); dialect and code-switching common.
- Permits & crowd control: Urban exteriors frequently imply spill noise; railway, airport, and heritage sites impose strict windows and safety marshals. For international productions, access to these environments is governed through filming compliance systems.
3) Pre-Production Workflows
3.1 Script & Breakdown
- Scene taxonomy: Interior vs. exterior; night vs. day; controlled vs. public; crowd density; SFX overlap (vehicles, rain rigs, water).
- Dialogue risk tags: High-risk markers where SNR and intelligibility are likely compromised (e.g., street markets, moving vehicles, rain towers, waterfalls, boat decks).
- Deliverable spec lock: Frame rate (film 24; broadcast legacy 25; streaming may be 23.976/24/25), sample rate (48 kHz baseline), bit depth (24-bit), polyWAV, iXML fields, track naming conventions.
Large-scale productions typically integrate these sound workflows within the broader line production execution framework in India where departments coordinate technical planning, location approvals, and production logistics before principal photography begins.
3.2 System Plan
- Timecode & sync: One master clock; jam-sync at call and after battery swaps; visual slate with sticks; 2nd sticks for multi-cam.
- I/O topology: Boom + bodyworn lavaliers; plant mics for vehicles/tables; camera scratch feed for dailies; IFB for script/continuity.
- RF planning: Frequency coordination chart; intermod spacing; scan at tech recce and at call; separation from walkie blocks.
- Contingency inventory: Weather covers, wind protection stages (foam → softie → blimp), shock isolation, rain rigs for transmitters/recorders, spare media.
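The intermod spacing item in the RF plan can be roughed out in a few lines. The sketch below is illustrative only: it assumes carriers are supplied as integers in kHz, checks only two-signal third-order products (2*f1 - f2), and uses a hypothetical 100 kHz guard band. The function names are invented for this example; dedicated coordination software also accounts for three-signal products, local TV channels, and transmitter behaviour.

```python
from itertools import permutations


def third_order_products(freqs_khz):
    """Return the sorted set of two-signal third-order products 2*f1 - f2 (kHz)."""
    return sorted({2 * f1 - f2 for f1, f2 in permutations(freqs_khz, 2)})


def intermod_conflicts(freqs_khz, guard_khz=100):
    """List (carrier, product) pairs where a product falls within guard_khz of a carrier.

    guard_khz is an assumed value for illustration, not a recommendation.
    """
    products = third_order_products(freqs_khz)
    return [(f, p) for f in freqs_khz for p in products if abs(f - p) < guard_khz]


if __name__ == "__main__":
    evenly_spaced = [600000, 600400, 600800]   # hypothetical carriers, kHz
    staggered = [600000, 600400, 601300]       # hypothetical carriers, kHz
    print(intermod_conflicts(evenly_spaced))   # evenly spaced carriers collide
    print(intermod_conflicts(staggered))       # uneven spacing avoids these products
```

Evenly spaced carriers are the classic failure case: each outer carrier sits exactly on a product of the other two, which is why coordination charts use irregular spacing.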

4) On-Set Capture: Step Sequence
4.1 Daily Start
- TC sync: Master to recorders, smart slates, cameras; confirm frame rate.
- Slate & naming test: Scene/shot/take format; confirm iXML showcode.
- RF scan: Log occupied/notch; set transmit power and spacing.
- Noise map: Identify LF sources (gensets, AC compressors), HF sources (inverters, fans), intermittent sources (horn clusters, temple bells, azaan, construction).
- Tone capture (per location): 60 s min; 120 s in reverberant spaces.
4.2 Take Workflow
- Microphone coverage:
  - Boom: Primary harmonic/air; super/hypercardioid interiors; shotgun exteriors.
  - Lavaliers: Clothing, concealment, and rustle control; plant mics where blocking constrains boom.
- Location mix:
  - MX1 (mono or L): Dialogue-dominant, limited dynamics.
  - MX2 (R): Alternative emphasis (e.g., wider bed, safety comp).
- ISOs: Each principal on discrete track; plants labeled by set position (e.g., “Plant-Table-N”, “Plant-Dashboard”).
- Metadata per take: Scene–shot–take, mic map, notes (noise intrusions, line flubs), circle-takes flagged.
4.3 Inserts & Wilds
- Wild lines: Immediately post-take while emotional state matches.
- Futz/prop mics: Phones, radios, PA systems captured clean and futzed in post if needed.
- Effects snippets: Doors, footsteps on unique floors, vehicles at that location.
4.4 End-of-Day
- Media handoff: Two verified copies (primary editorial SSD; backup drive).
- Reports: CSV/PDF sound report; circle takes; noise notes; RF anomalies.
- Slate photo: Quick photo of slate + mic map for editorial reference.

5) Location Mix & ISO Deliverables
5.1 Recording Specs
- File container: Broadcast WAV (PolyWAV).
- Sample/bit: 48 kHz / 24-bit (96 kHz for SFX beds if pre-agreed).
- Timecode: Linear or embedded; matches camera master.
- Track order (illustrative): 1–2: MX1/MX2; 3: Boom; 4–N: Lav-A, Lav-B, Plants…
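Because the recorder timecode has to match the camera master, it helps to see how a timecode stamp maps to a sample position at 48 kHz. The following is a minimal sketch under stated assumptions: non-drop timecode at an integer frame rate (24 or 25), with a hypothetical function name. Fractional rates such as 23.976 need pull-down handling that is not shown here.

```python
def timecode_to_samples(tc, fps=24, sample_rate=48000):
    """Convert non-drop 'HH:MM:SS:FF' timecode to a sample offset from 00:00:00:00.

    Assumes an integer frame rate; 23.976 and drop-frame are out of scope.
    """
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    if not 0 <= ff < fps:
        raise ValueError(f"frame count {ff} is invalid at {fps} fps")
    total_frames = ((hh * 60 + mm) * 60 + ss) * fps + ff
    # Integer division keeps the result in whole samples.
    return total_frames * sample_rate // fps


if __name__ == "__main__":
    print(timecode_to_samples("00:12:13:10", fps=24))
    print(timecode_to_samples("00:12:13:10", fps=25))
```

The same stamp lands on different samples at 24 and 25 fps, which is one reason the frame rate is confirmed at every jam-sync.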
5.2 Naming & iXML
- Roll/Media: Prod_Sxx_RYY (shoot day/roll).
- File: Scene_Shot_Take → 012A_1_03.wav.
- iXML fields: Project, scene, take, notes, speed, track names, mic assignments, TC start.
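The Scene_Shot_Take convention is easy to enforce mechanically. The sketch below builds and parses names like 012A_1_03.wav. The pattern is an assumption inferred from that single example (digits plus optional uppercase letters for the scene, a zero-padded take of at least two digits), and both function names are hypothetical.

```python
import re

# Assumed pattern, inferred from the example 012A_1_03.wav.
TAKE_FILE = re.compile(r"^(?P<scene>\d+[A-Z]*)_(?P<shot>\d+)_(?P<take>\d{2,})\.wav$")


def build_take_filename(scene, shot, take):
    """Build a Scene_Shot_Take file name with a zero-padded take number."""
    return f"{scene}_{shot}_{take:02d}.wav"


def parse_take_filename(name):
    """Split a Scene_Shot_Take file name into its parts, or raise ValueError."""
    match = TAKE_FILE.match(name)
    if match is None:
        raise ValueError(f"not a Scene_Shot_Take file name: {name!r}")
    return {
        "scene": match["scene"],
        "shot": int(match["shot"]),
        "take": int(match["take"]),
    }


if __name__ == "__main__":
    name = build_take_filename("012A", 1, 3)
    print(name)
    print(parse_take_filename(name))
```

Running a check like this at the daily naming test catches malformed names before they reach editorial.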
5.3 Reports (minimum fields)
- Per take: File, TC start, duration, tracks armed, circle, notes (noise, rustle, overlap, ad-libs), SNR estimate (low/med/high).
- Per location: Room tone length, ambience IDs, special SFX, RF chart.
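As an illustration, the per-take minimum fields can be written as a CSV sound report with the standard library. The column names and the example row below are assumptions made for this sketch (the timecode and duration values are invented); they are not a fixed report format.

```python
import csv
import io

# Hypothetical column names mirroring the per-take minimum fields.
TAKE_FIELDS = ["file", "tc_start", "duration_s", "tracks_armed",
               "circle", "notes", "snr_estimate"]


def write_take_report(rows, stream):
    """Write per-take rows to an open text stream as CSV with a header line."""
    writer = csv.DictWriter(stream, fieldnames=TAKE_FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Invented example data for illustration only.
example_rows = [
    {
        "file": "012A_1_03.wav",
        "tc_start": "00:12:01:00",
        "duration_s": "42",
        "tracks_armed": "MX1;MX2;BOOM;LAV_A;LAV_B",
        "circle": "yes",
        "notes": "horn spike @00:12:13:10; rustle LAV_B",
        "snr_estimate": "med",
    }
]

buffer = io.StringIO()
write_take_report(example_rows, buffer)

if __name__ == "__main__":
    print(buffer.getvalue())
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation advises, so line endings are not doubled on Windows.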
6) Noise & Risk Controls (Fact Set)
- LF contamination: 50 Hz mains/generator; distance and isolation directly impact floor noise.
- Wind & rain: Western Ghats/coastal belts show higher gust/spray; blimps and rain covers are standard countermeasures.
- Transit shots: Road horns and tire hiss dominate mid-high bands; plant mics with vibration isolation reduce structure-borne noise.
- Crowd & festivals: High-SPL bursts near temples, stadiums, and parade routes; schedule windows determine baseline SNR.
These acoustic risk factors are often evaluated during early location scouting and production planning. International productions frequently assess environments with structured location-scoring models before committing to large-scale shoots.

7) ADR Strategy & Decision Framework
7.1 Objective Triggers (indicative technical ranges)
- SNR threshold: Production dialogue with an SNR of roughly 18–20 dB or better is often usable after noise reduction; below 15 dB it is frequently marked for ADR.
- Masking content: Broadband continuous noise (rain rigs, generators) vs. transient spikes (horns, bells).
- Performance & diction: Fluency, dialect accuracy, and emotional alignment.
- Textual changes: Script rewrites, TV compliance, language versions.
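A rough SNR figure for the first trigger can be estimated by comparing the RMS level of a dialogue segment with the RMS level of room tone from the same setup. The sketch below assumes both are plain lists of float samples and treats the dialogue segment as "signal" even though it also contains the noise bed, so it slightly overstates SNR at low levels. The band thresholds follow the High/Med/Low bands used in this document; the function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(dialogue, room_tone):
    """Approximate SNR in dB: dialogue RMS relative to room-tone RMS."""
    noise = rms(room_tone)
    if noise == 0:
        raise ValueError("room tone is digital silence; SNR is undefined")
    return 20.0 * math.log10(rms(dialogue) / noise)


def snr_band(value_db):
    """Map an SNR value to the report bands: high >= 20 dB, med 15-20 dB, low < 15 dB."""
    if value_db >= 20:
        return "high"
    if value_db >= 15:
        return "med"
    return "low"


if __name__ == "__main__":
    dialogue = [0.1, -0.1] * 100     # synthetic stand-in for a speech segment
    room_tone = [0.01, -0.01] * 100  # synthetic stand-in for the noise floor
    print(round(snr_db(dialogue, room_tone), 2), "dB")
```

This is a coarse gauge for the sound report, not a substitute for listening: a transient horn can ruin a line whose average SNR looks healthy.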
7.2 ADR Decision Matrix (simplified)
| Condition | Production salvage | Partial ADR | Full ADR |
|---|---|---|---|
| SNR ≥ 20 dB, minimal transients | ✔ | — | — |
| SNR 15–20 dB, intermittent spikes | ✔ (strip-silence, de-noise, fill) | ✔ (problem words) | — |
| SNR < 15 dB, continuous masking | — | — | ✔ |
| Accented/dub language version | — | — | ✔ |
| Performance strong, single rustle | ✔ | ✔ (line) | — |
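The SNR rows of the matrix can be expressed as a small lookup. This sketch is a deliberate simplification: it collapses the masking type (transient vs. continuous) into the SNR figure alone, treats a dub or language version as always requiring full ADR, and does not model the performance row. The function name and return labels are assumptions for illustration.

```python
def adr_recommendation(snr_db, dub_version=False):
    """Return the options ticked in the simplified matrix for a given SNR.

    Only the SNR and language-version rows are modelled; masking type and
    performance quality still need a human decision.
    """
    if dub_version:
        return ["full ADR"]
    if snr_db >= 20:
        return ["production salvage"]
    if snr_db >= 15:
        return ["production salvage", "partial ADR"]
    return ["full ADR"]


if __name__ == "__main__":
    for value in (22.0, 17.0, 12.0):
        print(value, adr_recommendation(value))
```

A helper like this is best used to pre-sort takes for review; the final call on each line stays with the dialogue editor and director.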
7.3 ADR Workflow
- Cueing: From locked picture or change list; per-line timecodes, text, note of mouth shapes.
- Pre-roll: 3-beep or visual streamer; 0.5 s handle.
- Booth & chain: Quiet room; neutral early reflections; mic at consistent distance/axis; 48 kHz/24-bit WAV; mic family matched to the production mics if the ADR is intended to intercut with production sound.
- Takes: Multiple passes; best-of or comped; note lexical stress and breath.
- Sync & conform: Slip to plosive/consonant impacts; verify lip-flap close-ups; print in-session metadata.
- Loop group/walla: Language-appropriate layers for markets (Hindi/English/regional); proximity and perspective variants.
7.4 Deliverables (ADR)
- Files: One WAV per cue or consolidated reels; embedded TC; naming Reel_Scene_Line_Actor_Take.
- Sheets: CSV/EDL of cues; take ratings; notes.
- Alt language packs: Separate stems per language variant.

8) Editorial & Post Handoff
Dialogue recording, post-production coordination, and location-based audio workflows typically operate within broader film production services that integrate technical departments across production, post, and delivery pipelines.
8.1 Picture Editorial
- Ingest: Location mix to A/V bins for offline; ISOs parked for salvage; sound report integrated as markers or notes.
- Conform: AAF/EDL relinks to production WAVs by timecode; circled takes are preferred.
8.2 Dialogue Edit & Mix Prep
- Tracklay: DX-Prod (boom/lavs matched), ADR, Walla, FX, Foley, MX.
- Cleanup: Broadband reduction, spectral repair of transients, rustle mitigation.
- Fill: Room tone and production ambience for continuity.
8.3 Final Mix & Masters
- Stems: DX, ADR, Walla, FX, Foley, MX; printmaster (5.1/2.0 per deliverable spec).
- M&E: Dialogue-free version for language re-versioning; production effects retained if clean.
9) Risk Register (India Context)
| Risk | Likely regions/times | Impact | Mitigation assets to stage |
|---|---|---|---|
| Monsoon rainfall noise | West coast Jun–Sep; East coast Jul–Oct | Continuous broadband masking | Weather covers, secondary shoot windows, longer tone |
| Traffic horns & PA | Urban cores, markets, political rallies | Transient spikes | Slate noise notes; wild lines; tighter coverage |
| Generator/50 Hz hum | Remote units; heritage where mains blocked | LF floor contamination | Isolation distance; ground lift options; alternate power |
| Coastal wind | Mumbai, Goa, Kerala shores | Mid-HF modulation | Full blimps; windjammers; mic orientation |
| Wildlife/insects | Forests, Ghats, NE hills | Tonal/chorus bed | Longer clean tone beds for fill; off-time lines |
| Crowd spill | Festivals, stadium perimeters | Overlaps, loss of diction | Marshals; immediate wild lines post-take |
(Impacts and assets listed as factual categories; activation depends on schedule and permissions.)
10) Checklists
10.1 Daily Sound Start
- Master TC set → cameras/slates jammed
- Frame rate/sample/bit verified
- RF scan saved; walkie channel map noted
- Mic map printed; lav concealment kit ready
- Room tone captured per set; file tagged
10.2 Take-Level
- Slate verified; scene/shot/take in iXML
- MX1/MX2 meters checked; ISOs armed
- Noise intrusions logged; circle takes flagged
- Wilds logged immediately after key setups
10.3 Day End
- Dual media copies verified (hash)
- CSV/PDF reports exported; slate photo archived
- Battery/media rollover plan for next day
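The hash verification step in the day-end checklist can be sketched with the standard library. The example below assumes the primary and backup copies share the same folder layout and that SHA-256 is an acceptable checksum; the function names and the `*.wav` pattern are assumptions for illustration, and dedicated offload tools typically add logging and manifest files on top of this.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(primary_dir, backup_dir, pattern="*.wav"):
    """Return relative paths that are missing from the backup or differ by hash."""
    problems = []
    for src in sorted(Path(primary_dir).rglob(pattern)):
        relative = src.relative_to(primary_dir)
        dst = Path(backup_dir) / relative
        if not dst.is_file() or file_sha256(src) != file_sha256(dst):
            problems.append(str(relative))
    return problems
```

An empty list from `verify_copies` means every matching file on the primary media has an identical counterpart on the backup; note that the glob pattern is case-sensitive on most Linux file systems.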
11) Tables & Templates
11.1 Mic Coverage (usage classes)
| Class | Use case | Notes |
|---|---|---|
| Super/Hypercardioid | Interiors, reflective rooms | Off-axis rejection of sidewalls |
| Shotgun | Exteriors, DS/WS | Higher directivity; wind protection essential |
| Lavalier (omni/cardioid) | Dialogue under wardrobe | Manage rustle; maintain consistent distance |
| Plant | Vehicles, tables, lamps | Fixed perspective capture |
| Ambience pair | Room/space tone | Matched pair when needed; same SR/bit depth |
11.2 File & Track Naming (illustrative)
- Showcode: PRD
- Roll: PRD_S05_R03
- File: 012A_1_03.wav
- Tracks: MX1, MX2, BOOM, LAV_A, LAV_B, PLANT_DASH, PLANT_TABLE
- iXML notes: Noise: horn spike @00:12:13:10; Wardrobe: rustle LAV_B.
12) Data Artifacts (Outputs Generated on Set)
- PolyWAV files (48 kHz/24-bit, TC accurate).
- Sound reports (CSV + PDF).
- Room tone libraries per location (tagged).
- Wild lines catalog per scene.
- RF scan snapshots (image or CSV).
- Slate/mic-map photo references.
13) Language & Dubbing Considerations (India)
- Primary capture language: As per script; bilingual exchanges common.
- Regional re-versions: Separate ADR passes generate language-specific dialogue stems; walla recorded per language; M&E maintained for delivery.
- Dialects: City-specific prosody (e.g., Mumbai/Delhi variants) logged at breakdown to preserve continuity in ADR.
14) Outcome Alignment
- Editorial: Clean, well-labeled location mix accelerates offline and reduces relink friction.
- Post: ISO integrity + robust metadata enable targeted salvage; ADR decisioning guided by SNR and masking profiles; final mix supports international M&E without reconstruction losses.
Noisy Outdoor Areas — India (Analytical Note)
Dominant sub-environments and noise profiles
| Environment | Primary sources | SPL @ ~3 m (dBA)* | Spectral signature | Temporal pattern | Seasonality / time windows | SNR vs close-mic** | ADR risk band |
|---|---|---|---|---|---|---|---|
| Urban arterial roads | Traffic flow, horns, buses, wet roads | 70–85; spikes 90–105 | Broad; horn energy 1–4 kHz; tire hiss 500 Hz–2 kHz | Continuous bed + transient spikes | Weekdays 08:00–21:00 peaks; rain ↑ hiss | Low–Med | Med–High |
| Markets/bazaars | Human crowd, vendors’ PAs, generators | 75–90 | Mid-high 1–5 kHz; PA 500 Hz–8 kHz | Dense transients, overlapping speech | Evenings, weekends | Low | High |
| Rail vicinity (outside stations) | Diesel/electric rumble, braking squeal | 75–95 | LF 30–120 Hz (engines) + 2–6 kHz (squeal) | Bursty with long events | Scheduled peaks | Low | High |
| Under flight paths | Aircraft overpass | 80–100 | LF-dominant 20–200 Hz + broadband | 20–60 s events | Fixed by runway use | Low | High |
| Construction zones | Concrete cutters, hammers, mixers | 85–100 | Wideband; strong 200 Hz–4 kHz | Intermittent, long duty cycles | Daylight hours | Low | High |
| Coastal/beaches | Surf, wind, tourists | 60–75; gusts modulate | LF surf 50–300 Hz; wind random | Quasi-continuous | Windier afternoons; monsoon ↑ | Med (in lulls) | Med |
| Hill/forest roads | Insects, birds, stream noise, scooters | 50–70; spikes from bikes | Tonal insects 3–8 kHz; water 100 Hz–1 kHz | Continuous bed + occasional spikes | Post-sunset insects ↑ | Med | Med |
| Religious/procession routes | Drums, loudspeakers, crowd | 85–100 | Percussive lows + PA mid-highs | Bursty blocks | Festival calendars | Low | High |
* Indicative ranges; site-specific.
** Close-mic = lav at ~15 cm; SNR bands: High ≥20 dB, Med 15–20 dB, Low <15 dB.
Outdoor acoustic constraints (summarized)
- Low-frequency beds: generators/traffic/aircraft (20–200 Hz) mask fundamentals of baritone/alto speech.
- Transient dominance: horns, PA calls, percussive tools create non-stationary interference that resists broadband NR.
- Wind modulation: adds random low-mid energy and capsule turbulence; severity scales with gusts and exposure.
- Wet surfaces: increase broadband tire hiss (500 Hz–2 kHz).
- Crowd speech overlap: dense 1–4 kHz content directly competes with consonant intelligibility.
Noisy Indoor Areas — India (Analytical Note)
Space types and noise/reverb characteristics
| Space type | Constant bed sources | Intermittent sources | Floor SPL (dBA) | RT60 (s) typical | Electrical artifacts (50 Hz) | External intrusion | SNR: Boom vs Lav |
|---|---|---|---|---|---|---|---|
| Apartment living room | Split/window AC, ceiling fans, refrigerators | Plumbing, lift motors, corridor voices | 35–55 (split AC lower) | 0.4–0.8 | Possible (grounding/earthing variance) | Medium via windows | Boom: Med; Lav: Med–High |
| Hotel room | HVAC, minibars | Doors, corridor carts | 35–45 | 0.3–0.6 | Low–Med | Low–Med | Boom: Med–High; Lav: High |
| Office/meeting room | HVAC, air purifiers | Keyboards, chairs, footsteps | 45–55 | 0.5–1.0 | Low | Medium (glazing) | Boom: Med; Lav: Med–High |
| Classroom/hall | Ceiling fans, old tube lights | Chairs, PA spill | 45–60 | 0.8–1.5 | Med (older ballasts) | Medium | Boom: Low–Med; Lav: Med |
| Warehouse/shed | Large fans, machinery | Vehicle beeps, door slams | 55–70 | 1.5–3.0 | Low–Med | Medium (roller doors) | Boom: Low; Lav: Med |
| Restaurant/café | HVAC, fridges | Crockery, music, crowd | 55–70 | 0.7–1.2 | Low | Medium | Boom: Low; Lav: Med |
| Soundstage (treated) | Low HVAC | — | 25–35 | | | | |
Indoor acoustic constraints (summarized)
- Mechanical beds: fans/HVAC produce steady 120–400 Hz components that mask vowel energy.
- Room reverberation: longer RT60 increases syllabic smearing; large untreated volumes (halls/warehouses) elevate ADR likelihood.
- Structure-borne noise: lifts/plumbing transmit LF through floors/walls; appears as non-stationary low rumbles.
- Electrical hum risk: 50 Hz fundamental + harmonics (100/150 Hz) in poorly grounded environments contaminate the low mids.
- External leakage: traffic/horn ingress through single-glazed windows adds mid-high spikes.
Comparative indices
- ADR propensity (median cases): Outdoor urban/rail/flight paths High; indoor warehouses/halls Med–High; hotel rooms/soundstages Low.
- Primary masking bands: Outdoor LF beds (20–200 Hz) + transient mid-high spikes (1–6 kHz); Indoor mechanical lows (120–400 Hz) + reverberant smear (≥500 ms).
- Time-of-day stability: Indoor generally stable; outdoor shows strong diurnal peaks and festival/event-driven variance.
