* Async attributes on participant.
How is it different from existing participant attributes?
1. Async attributes can be added one at a time.
2. They are not included in `ParticipantInfo`.
3. An attribute is fetched by participant identity and async attribute ID as
and when needed.
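A minimal sketch of the access pattern described above, assuming a simple in-memory store. The `asyncAttrStore` type and its methods are hypothetical names for illustration, not the server's actual API:

```go
package main

import "sync"

// asyncAttrKey identifies one async attribute by participant identity
// and attribute ID (hypothetical type, for illustration only).
type asyncAttrKey struct {
	Identity string
	ID       string
}

// asyncAttrStore is a hypothetical in-memory store. Attributes live outside
// any participant info snapshot and are looked up on demand.
type asyncAttrStore struct {
	mu    sync.RWMutex
	attrs map[asyncAttrKey][]byte
}

func newAsyncAttrStore() *asyncAttrStore {
	return &asyncAttrStore{attrs: make(map[asyncAttrKey][]byte)}
}

// Set adds or replaces a single attribute without touching any others.
func (s *asyncAttrStore) Set(identity, id string, value []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	// Copy so later mutation of the caller's slice does not leak in.
	s.attrs[asyncAttrKey{identity, id}] = append([]byte(nil), value...)
}

// Get fetches one attribute by participant identity and attribute ID.
func (s *asyncAttrStore) Get(identity, id string) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.attrs[asyncAttrKey{identity, id}]
	if !ok {
		return nil, false
	}
	return append([]byte(nil), v...), true
}
```

Keying on the (identity, ID) pair is what allows one attribute to be written or read without shipping the full attribute set.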
* clean up
* get full definitions, not just ids
* listener OnDataTrackSchema
* name length config
* data blob
* deps
* static check
* Add missing request ID
* Update protocol commit
* Wire up StoreDataBlobResponse
* Pass request ID through in GetDataBlobResponse
* deps
* atomic
* sctp at 1.9.5
* remove proto clone
---------
Co-authored-by: Jacob Gelman <3182119+ladvoc@users.noreply.github.com>
In single peer connection mode, when the server answers a subscriber's
offer, configureSenderAudio set the sender codec preferences from the
server MediaEngine's payload types. The answer could therefore advertise
Opus on a payload type the offerer never offered (server PT 111 vs
offered PT 109). Chrome tolerates this; Firefox decodes 0 samples
(silence) -- packets are received but never decoded. The forwarded RTP
already uses the offered PT, so only the answer SDP was inconsistent.
This regressed in v1.12.0 once the single-PC MediaEngine became a union
of publish+subscribe codecs.
Parse the remote offer's audio rtpmap and remap the sender audio codec
preferences to echo the offered payload types (RFC 3264 6.1) before
SetCodecPreferences.
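The remapping step can be sketched roughly as follows, using only the standard library. The `codec` type and both function names are assumptions made for illustration; the real change operates on the server's codec preference types before calling SetCodecPreferences:

```go
package main

import (
	"strconv"
	"strings"
)

// offeredPayloadTypes scans an SDP offer and returns, for each audio codec
// name (lower-cased, e.g. "opus"), the first payload type the offerer
// assigned to it. Only a=rtpmap lines inside m=audio sections are used.
func offeredPayloadTypes(sdp string) map[string]uint8 {
	pts := make(map[string]uint8)
	inAudio := false
	for _, line := range strings.Split(sdp, "\n") {
		line = strings.TrimSpace(line) // also drops the trailing \r
		switch {
		case strings.HasPrefix(line, "m="):
			inAudio = strings.HasPrefix(line, "m=audio ")
		case inAudio && strings.HasPrefix(line, "a=rtpmap:"):
			// Format: a=rtpmap:<pt> <name>/<clock>[/<channels>]
			fields := strings.Fields(strings.TrimPrefix(line, "a=rtpmap:"))
			if len(fields) != 2 {
				continue
			}
			pt, err := strconv.ParseUint(fields[0], 10, 8)
			if err != nil {
				continue
			}
			name := strings.ToLower(strings.SplitN(fields[1], "/", 2)[0])
			if _, seen := pts[name]; !seen {
				pts[name] = uint8(pt)
			}
		}
	}
	return pts
}

// codec is a hypothetical stand-in for one codec preference entry.
type codec struct {
	Name        string
	PayloadType uint8
}

// remapToOffer returns a copy of prefs in which every codec the offer also
// names carries the offered payload type instead of the local one.
func remapToOffer(prefs []codec, offered map[string]uint8) []codec {
	out := make([]codec, len(prefs))
	for i, c := range prefs {
		if pt, ok := offered[strings.ToLower(c.Name)]; ok {
			c.PayloadType = pt
		}
		out[i] = c
	}
	return out
}
```

With an offer carrying `a=rtpmap:109 opus/48000/2`, a local Opus preference on PT 111 would come back on PT 109, so the answer echoes what was offered.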
Fixes #4599
Co-authored-by: laosun <14806343+cnvipstar@users.noreply.github.com>
* Prometheus metric for join latency.
Also includes a couple of other failures in the signal connection path
and moves signal connected to after all of that.
Not adding counters for the new signal failure paths. I should not have
added them for the other two I added a little while ago either
(validation failure and start participant failure), as it is not
scalable to keep adding those to node stats. Will probably remove those two
from node stats later. Can add those counters if they are useful.
* deprecate signal failed counters
Previously it was anchored to the participant transitioning to `ACTIVE` if
the add track request happened before that. But that has a few issues:
1. `ACTIVE` is for the primary peer connection, which could be the subscriber
peer connection.
2. `ACTIVE` also includes data channel establishment.
Switch to the first connected time of the publisher peer connection to
get a more accurate measure of track publish time.
Noticed a config in the deploy config while cleaning up some other unused
config. Small clean up. There is probably a bunch more that can be
cleaned up, but doing a quick one as I noticed this.
* feat: acquire requested video layer directly at HIGH quality by default
Two changes that together remove the visible low->high quality ramp for a new
subscriber (both publisher-first and subscriber-first join orders):
1. Default a subscriber's initial video quality to HIGH on bind instead of LOW
for adaptive stream, so the subscribed max layer is the top layer. Adaptive
stream clients can still scale down afterwards based on viewport.
2. On initial layer acquisition the forwarder/selector latch directly onto the
allocator's target (the requested top layer) instead of opportunistically
latching onto the first lower key frame that arrives. A short
initial-acquisition grace aims the target at the requested layer; if it does
not show up in time, the target falls back to the highest layer seen so
acquisition never stalls.
Always on - no configuration flag.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat: gate start-at-desired-quality behind EnableStartAtDesiredQuality flag
Put the "acquire requested video layer directly at HIGH quality" behavior
behind a per-subscriber EnableStartAtDesiredQuality flag (default off, so
the original low->high ramp-up is restored unless enabled).
Plumbed from config.RTC.EnableStartAtDesiredQuality through ParticipantParams
-> SubscribedTrack/DownTrack -> Forwarder -> simulcast selector, gating all
three behavior changes: the HIGH default on bind, the forwarder's
initial-acquisition grace, and the selector's direct-latch-onto-target.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* remove config.
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Fix skipped packets accounting.
No need to copy unskipped packet RTP header to skipped packet.
That was causing padding bytes to be counted.
Also use Header.PaddingSize as base PaddingSize is deprecated.
* PaddingSize in header in utils
* service: enforce metadata size limit in CreateRoom, bump default to 512 KiB
CreateRoom previously accepted any metadata size; only UpdateRoomMetadata
rejected oversized payloads. Mirror the same CheckMetadataSize check at
the CreateRoom API boundary so both entrypoints are bounded.
Default MaxMetadataSize moves from 64000 to 512 * 1024 to match the
practical needs of customers using room metadata for richer state. The
limit remains configurable via the existing limits.max_metadata_size knob.
* service: split room vs. participant metadata limit, enforce on join + agent dispatch
LimitConfig.MaxMetadataSize was shared between room metadata and
participant metadata. Last commit's bump to 512 KiB lifted both ceilings;
this restores the participant ceiling to 64 KB and introduces a separate
MaxRoomMetadataSize (default 512 KiB) for room metadata.
Additional enforcement:
- RoomManager.StartSession rejects joins whose JWT-grants metadata or
attributes exceed the participant/attributes limits. The check was
missing entirely from this path.
- AgentDispatchService.CreateDispatch and the embedded
CreateRoomRequest.Agents path now validate metadata and attributes
against the common 64 KB ceilings (previously unbounded).
NewAgentDispatchService gains a LimitConfig parameter; the two wire_gen
callsites are updated.
* service: collapse metadata size limit to single 512 KiB knob
Reverts the LimitConfig split introduced in the previous commit:
MaxRoomMetadataSize, CheckRoomMetadataSize, and the max_room_metadata_size
yaml key are removed. MaxMetadataSize moves back to 512 * 1024 and gates
all metadata uniformly — room (CreateRoom, UpdateRoomMetadata), participant
(UpdateParticipant, signal UpdateMetadata, JWT grants on join), and agent
dispatch (CreateDispatch + embedded RoomAgentDispatch).
MaxAttributesSize stays at 64 KB and continues to gate participant and
agent-dispatch attributes separately.
Test cases consolidated under the single knob.
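As a sketch of the end state, with one metadata limit and a separate attributes limit. The lower-cased type and method names are hypothetical, and measuring attributes as the sum of key and value lengths is an assumption, not something the commits above state:

```go
package main

import "errors"

var errTooLarge = errors.New("metadata or attributes exceed configured limit")

// limitConfig is a hypothetical mirror of the limits described above.
// A zero value disables the corresponding check in this sketch.
type limitConfig struct {
	MaxMetadataSize   int // bytes, applies to room, participant and dispatch metadata
	MaxAttributesSize int // bytes, applies to participant and dispatch attributes
}

// checkMetadataSize reports whether metadata fits within the limit.
func (l limitConfig) checkMetadataSize(metadata string) bool {
	return l.MaxMetadataSize == 0 || len(metadata) <= l.MaxMetadataSize
}

// checkAttributesSize reports whether the combined size of all keys and
// values fits within the limit (assumed way of measuring attributes).
func (l limitConfig) checkAttributesSize(attrs map[string]string) bool {
	if l.MaxAttributesSize == 0 {
		return true
	}
	total := 0
	for k, v := range attrs {
		total += len(k) + len(v)
	}
	return total <= l.MaxAttributesSize
}

// validate applies both checks, as an API entrypoint would before
// accepting a request.
func validate(l limitConfig, metadata string, attrs map[string]string) error {
	if !l.checkMetadataSize(metadata) || !l.checkAttributesSize(attrs) {
		return errTooLarge
	}
	return nil
}
```

Calling the same `validate` helper at every entrypoint is what keeps the paths from drifting apart again.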
* kb -> kib
There are several places where the participant can drop off after initiating
a connection attempt. Count those as cancellations, including when the
participant is closed for specific reasons.
Cancels should be discounted when determining RTC/ICE connectivity
success/failure percentage.
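A small sketch of that discounting, assuming the three counters are available as plain integers (the function name and signature are hypothetical):

```go
package main

// successRate returns the connectivity success ratio with cancelled
// attempts removed from the denominator. The second result is false when
// no non-cancelled attempts remain.
func successRate(attempts, successes, cancels uint64) (float64, bool) {
	if cancels >= attempts {
		return 0, false
	}
	return float64(successes) / float64(attempts-cancels), true
}
```

For example, 10 attempts with 4 successes and 2 cancels gives 4 out of 8, so a rate of 0.5 rather than 0.4.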
* agent: thread simulation flag from dispatch to job
Reads simulation from AgentDispatch / RoomAgentDispatch and copies it
onto Job in agent.LaunchJob and the inline room-agent path so workers
see the flag.
Stacked on top of livekit/protocol#1629.
* agent: replace simulation bool with attributes map
Threads the renamed attributes map (was bool simulation) from dispatch
to job and bumps the protocol pseudo-version.
* deps
* rtc: add RestartSessionTimer to re-anchor participant session duration
Exposes ParticipantImpl.RestartSessionTimer so the session timer can be
re-anchored to the actual join time. Duration is only ever emitted once
the participant becomes active, so re-anchoring at join keeps pre-join
wall-clock out of the reported/billed duration. Adds the method to the
LocalParticipant interface (fake regenerated) and a local protocol
replace to pick up SessionTimer.Reset.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* tidy
* update protocol
* report ended at for inactive sessions
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Paul Wells <paulwe@gmail.com>
* Add prom metrics for peer connection state.
By direction (PUBLISHER vs SUBSCRIBER) and state ("started" ->
"connected"). This gives a way to track peer connections failing to
finish establishment.
The RTC active count can be useful for primary peer connection, but not
for non-primary. This counter can be used to track any and can generally
be used to understand success/failure rate of peer connection
establishment.
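To show the shape of the metric without pulling in the Prometheus client, here is a standard-library sketch of a counter labelled by direction and state, plus a per-connection reporter that emits each state at most once. All type and function names are hypothetical; the real change would use a labelled Prometheus counter:

```go
package main

import "sync"

// pcKey is the label pair: direction (e.g. "PUBLISHER") and state.
type pcKey struct{ Direction, State string }

// pcStateCounter is a stdlib stand-in for a labelled counter.
type pcStateCounter struct {
	mu     sync.Mutex
	counts map[pcKey]uint64
}

func newPCStateCounter() *pcStateCounter {
	return &pcStateCounter{counts: make(map[pcKey]uint64)}
}

func (c *pcStateCounter) Inc(direction, state string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[pcKey{direction, state}]++
}

func (c *pcStateCounter) Get(direction, state string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[pcKey{direction, state}]
}

// pcReporter belongs to one peer connection and reports each state at
// most once, so reconnect churn does not inflate the counts.
type pcReporter struct {
	direction string
	counter   *pcStateCounter
	mu        sync.Mutex
	reported  map[string]bool
}

func newPCReporter(direction string, counter *pcStateCounter) *pcReporter {
	return &pcReporter{direction: direction, counter: counter, reported: make(map[string]bool)}
}

func (r *pcReporter) Report(state string) {
	r.mu.Lock()
	if r.reported[state] {
		r.mu.Unlock()
		return
	}
	r.reported[state] = true
	r.mu.Unlock()
	r.counter.Inc(r.direction, state)
}

// establishmentRate is connected/started for one direction; the second
// result is false when nothing has started.
func establishmentRate(c *pcStateCounter, direction string) (float64, bool) {
	started := c.Get(direction, "started")
	if started == 0 {
		return 0, false
	}
	return float64(c.Get(direction, "connected")) / float64(started), true
}
```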
* add a couple of more states
* clean up and avoid duplicate reporting fully established
* staticcheck
* Update go deps to v4
Generated by renovateBot
* update dockertest to v4
* fix
---------
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: David Zhao <dz@livekit.io>
MoveToRoom resets the participant reporter resolver to receive new
(room, participant_session) keys for the destination, but the source
room's participant_session row never gets an end_time — the periodic
duration scrape only emits one once disconnectedAt is set, and a move
doesn't transition the participant to DISCONNECTED. Report end_time
immediately before the reset so the row is closed out cleanly.
* Metrics for participant active, i.e. fully established.
- Egress stub for v2 API
- Fix the participant canceled counter 🤦
- Add active counter -> this is incremented when a participant becomes
active, i.e. primary peer connection established. Can be used to
monitor node-wise connection establishment issues.
- Add signalling validation fail counter.
With this, on the signalling side we have
- signalling validation fail
- signalling failed -> this is when the `startSession` fails
- signalling connected -> signalling is successful and can send back
joinResponse to client
and on the media connection side
- rtc_init -> start
- rtc_connected -> participant session created (joined)
- rtc_active -> primary peer connection established
- rtc_canceled -> could not proceed with RTC connection due to not being
able to resume.
* signalling counters deps
* revert pion/webrtc to 4.2.12 to get SCTP without interleaving
* go back to pion/webrtc 4.2.11 and sctp 1.9.5
* telemetry: split webhook-processed hook registration out of NewTelemetryService
NewTelemetryService used to register a notifier processed-hook on the inner
*telemetryService directly. That made it impossible for downstream wrappers
(e.g. cloud's TelemetryService that overrides Webhook to fan out to a v3
observability pipeline) to intercept webhook events without double-firing
the legacy emission.
Lift the registration into a new exported helper RegisterWebhookHook, and
have the standalone server's wire provider createTelemetryService call it
right after construction so behavior is unchanged for callers that don't
wrap the service.
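The motivation can be sketched as follows: once registration is separate from construction, a wrapper can be registered in place of the inner service so each event fires once. Every name below is a lower-cased, hypothetical stand-in, and the real RegisterWebhookHook signature may differ:

```go
package main

// notifier is a hypothetical stand-in for the webhook notifier that calls
// registered hooks once an event has been processed.
type notifier struct{ hooks []func(event string) }

func (n *notifier) RegisterProcessedHook(h func(event string)) {
	n.hooks = append(n.hooks, h)
}

func (n *notifier) Processed(event string) {
	for _, h := range n.hooks {
		h(event)
	}
}

// webhookReporter is anything that can record a processed webhook.
type webhookReporter interface{ Webhook(event string) }

// telemetryService is the inner service with the legacy emission.
type telemetryService struct{ legacy []string }

// newTelemetryService only constructs the service; it registers no hook.
func newTelemetryService() *telemetryService { return &telemetryService{} }

func (t *telemetryService) Webhook(event string) { t.legacy = append(t.legacy, event) }

// registerWebhookHook wires a reporter to the notifier as a separate step,
// so the caller chooses whether the inner service or a wrapper receives events.
func registerWebhookHook(n *notifier, r webhookReporter) {
	n.RegisterProcessedHook(r.Webhook)
}

// wrappedTelemetry overrides Webhook to send events somewhere else.
type wrappedTelemetry struct {
	*telemetryService
	v3 []string
}

func (w *wrappedTelemetry) Webhook(event string) { w.v3 = append(w.v3, event) }
```

A standalone caller would pass the inner service to `registerWebhookHook` right after construction; a downstream wrapper would pass itself instead.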
When a client hits /rtc/v[01]/validate with a base64 WrappedJoinRequest
whose embedded JoinRequest.ClientInfo is unset, validateInternal called
AugmentClientInfo with a nil *ClientInfo and panicked at ci.Address =
GetClientIP(req). The non-wrapped branch already allocates via
ParseClientInfo; do the same here so pi.Client always gets at least the
resolved client Address.
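The guard amounts to allocating before augmenting. A minimal sketch with hypothetical lower-cased stand-ins for the request types (the real code works with the protocol's JoinRequest and ClientInfo):

```go
package main

// clientInfo and joinRequest are hypothetical stand-ins for the protocol
// messages named above.
type clientInfo struct {
	SDK     string
	Address string
}

type joinRequest struct {
	ClientInfo *clientInfo // may be nil when the client did not set it
}

// augmentClientInfo fills in server-resolved fields. It dereferences ci,
// so passing nil would panic, which is the bug described above.
func augmentClientInfo(ci *clientInfo, addr string) {
	ci.Address = addr
}

// resolveClientInfo allocates an empty clientInfo when the request carries
// none, so the result always has at least the resolved address.
func resolveClientInfo(jr *joinRequest, addr string) *clientInfo {
	ci := jr.ClientInfo
	if ci == nil {
		ci = &clientInfo{}
	}
	augmentClientInfo(ci, addr)
	return ci
}
```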