The core feature of Reach is connecting hospital staff with interpreters. A nurse opens the app, selects a language, and within seconds is connected to an interpreter who can facilitate communication with a patient. This happens at all hours, across time zones, on devices ranging from the latest iPhone to three-year-old Android phones on slow networks.
The matching system is the most complex piece of our architecture. Here's how it works.
The WebSocket layer
Every active user maintains a WebSocket connection to the matching service. We use raw WebSocket (ws library on the server, React Native's built-in WebSocket on the client) rather than Socket.io, because we need precise control over reconnection behaviour and message timing.
The connection lifecycle on the client:
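class MatchingConnection {
  private ws: WebSocket | null = null;
  private reconnectAttempt = 0;
  // ReturnType<typeof setInterval> type-checks under both React Native and Node typings
  private heartbeatInterval: ReturnType<typeof setInterval> | null = null;

  // handleMessage() and sendAvailabilityStatus() are omitted for brevity.

  connect() {
    this.ws = new WebSocket(MATCHING_WS_URL);

    this.ws.onopen = () => {
      // A successful connection resets the backoff
      this.reconnectAttempt = 0;
      this.startHeartbeat();
      this.sendAvailabilityStatus();
    };

    this.ws.onmessage = (event) => {
      const message = JSON.parse(event.data);
      this.handleMessage(message);
    };

    this.ws.onclose = () => {
      this.stopHeartbeat();
      this.scheduleReconnect();
    };
  }

  private scheduleReconnect() {
    const delay = Math.min(
      1000 * Math.pow(2, this.reconnectAttempt),
      5000 // Cap at 5 seconds for time-sensitive matching
    );
    this.reconnectAttempt++;
    setTimeout(() => this.connect(), delay);
  }

  private startHeartbeat() {
    this.heartbeatInterval = setInterval(() => {
      if (this.ws?.readyState === WebSocket.OPEN) {
        this.ws.send(JSON.stringify({ type: 'heartbeat' }));
      }
    }, 15000);
  }

  // Called from onclose; the body was not shown originally, so this is the assumed implementation
  private stopHeartbeat() {
    if (this.heartbeatInterval !== null) {
      clearInterval(this.heartbeatInterval);
      this.heartbeatInterval = null;
    }
  }
}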
The reconnection cap is 5 seconds, not the typical 30 seconds used in non-critical applications. An interpreter who disconnects and reconnects needs to be back in the available pool quickly.
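To make the effect of the cap concrete, here is a small sketch that pulls the delay calculation out into a standalone function. The name `reconnectDelay` and its default parameters are illustrative, not part of the client class above; the arithmetic is the same as in `scheduleReconnect`.

```typescript
// Hypothetical helper: same formula as scheduleReconnect, extracted for illustration.
function reconnectDelay(attempt: number, baseMs = 1000, capMs = 5000): number {
  return Math.min(baseMs * Math.pow(2, attempt), capMs);
}

// Delays for the first five consecutive failed attempts.
const schedule = [0, 1, 2, 3, 4].map((attempt) => reconnectDelay(attempt));
console.log(schedule); // [1000, 2000, 4000, 5000, 5000]
```

The cap is reached on the fourth attempt, so a client that keeps failing retries every 5 seconds from then on.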
The availability state machine
Each interpreter has an availability status that follows a strict state machine:
OFFLINE → AVAILABLE → MATCHING → IN_SESSION → AVAILABLE
MATCHING → DECLINED → AVAILABLE
MATCHING → TIMEOUT → AVAILABLE
AVAILABLE → OFFLINE
IN_SESSION → OFFLINE (app killed during session)
States:
- OFFLINE: WebSocket disconnected or app in background for more than 5 minutes
- AVAILABLE: connected, app in foreground, ready to accept sessions
- MATCHING: the system is attempting to connect this interpreter with a request
- IN_SESSION: actively in a session
- DECLINED: interpreter declined the match request
- TIMEOUT: interpreter didn't respond within the timeout window
The state machine is enforced server-side. The client sends status updates, but the server validates every transition and rejects any that the table does not allow. An interpreter can't go from OFFLINE to IN_SESSION without passing through AVAILABLE and MATCHING first.
// For each status, the statuses the server will accept next
const validTransitions: Record<Status, Status[]> = {
  OFFLINE: ['AVAILABLE'],
  AVAILABLE: ['MATCHING', 'OFFLINE'],
  MATCHING: ['IN_SESSION', 'DECLINED', 'TIMEOUT', 'OFFLINE'],
  IN_SESSION: ['AVAILABLE', 'OFFLINE'],
  DECLINED: ['AVAILABLE', 'OFFLINE'],
  TIMEOUT: ['AVAILABLE', 'OFFLINE'],
};

function validateTransition(current: Status, next: Status): boolean {
  return validTransitions[current]?.includes(next) ?? false;
}
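As a rough illustration of what server-side enforcement could look like, the sketch below applies a requested status change only if the table allows it. The in-memory `interpreterStatus` map, the `applyTransition` function, and the `TransitionResult` shape are assumptions made for this example; a production service would presumably keep this state somewhere more durable.

```typescript
type Status = 'OFFLINE' | 'AVAILABLE' | 'MATCHING' | 'IN_SESSION' | 'DECLINED' | 'TIMEOUT';

const validTransitions: Record<Status, Status[]> = {
  OFFLINE: ['AVAILABLE'],
  AVAILABLE: ['MATCHING', 'OFFLINE'],
  MATCHING: ['IN_SESSION', 'DECLINED', 'TIMEOUT', 'OFFLINE'],
  IN_SESSION: ['AVAILABLE', 'OFFLINE'],
  DECLINED: ['AVAILABLE', 'OFFLINE'],
  TIMEOUT: ['AVAILABLE', 'OFFLINE'],
};

// Hypothetical in-memory store keyed by interpreter id (assumption for this sketch).
const interpreterStatus = new Map<string, Status>();

interface TransitionResult {
  ok: boolean;
  status: Status;  // the status after the call
  reason?: string; // set only when the transition was rejected
}

function applyTransition(interpreterId: string, next: Status): TransitionResult {
  // Interpreters the server has never seen are treated as OFFLINE.
  const current = interpreterStatus.get(interpreterId) ?? 'OFFLINE';
  if (!validTransitions[current].includes(next)) {
    return { ok: false, status: current, reason: `${current} -> ${next} is not allowed` };
  }
  interpreterStatus.set(interpreterId, next);
  return { ok: true, status: next };
}

// A client claiming to be in a session straight from OFFLINE is rejected.
console.log(applyTransition('interpreter-1', 'IN_SESSION').ok); // false
console.log(applyTransition('interpreter-1', 'AVAILABLE').ok);  // true
```

Returning the unchanged status on rejection gives the server something to send back so that a confused client can resynchronise.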
The matching sequence
When a hospital staff member requests an interpreter:
- Request received: the server creates a match request with the required language and priority level
- Pool query: the server queries available interpreters who speak the requested language, ordered by: language proficiency rating, response time history, and time since last session
- Match attempt: the server sends a match request to the top-ranked interpreter via WebSocket
- Response window: the interpreter has 20 seconds to accept or decline
- Accept: the server creates a session and connects both parties
- Decline or timeout: the server moves to the next interpreter in the pool
- Exhaustion: if no interpreters accept, the request enters a queue and interpreters are notified as they become available
The entire sequence targets completion in under 30 seconds for the common case where an interpreter is available.
async function matchRequest(request: MatchRequest): Promise<Session | null> {
  // Candidates arrive already ranked by the pool query
  const candidates = await getAvailableInterpreters(request.language);

  for (const interpreter of candidates) {
    // Offer to one interpreter at a time, with a 20-second response window
    const accepted = await offerMatch(interpreter.id, request, 20000);
    if (accepted) {
      return createSession(request, interpreter);
    }
    // If declined or timed out, try the next candidate
  }

  // No immediate match available
  await enqueueRequest(request);
  return null;
}
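The ordering used by the pool query could be sketched as a comparator. This is an assumption-heavy illustration: the `Candidate` fields, the `rankCandidates` name, the sort directions (higher proficiency first, faster responders next, longest-idle last as a tie-breaker), and the strict tie-breaking order are all guesses at one reasonable reading of "ordered by". A weighted score would be an equally plausible design.

```typescript
// Hypothetical candidate shape; field names are not taken from the real service.
interface Candidate {
  id: string;
  proficiency: number;        // higher is better (assumed scale)
  avgResponseMs: number;      // historical response time; lower is better
  lastSessionEndedAt: number; // epoch ms; smaller means idle for longer
}

function rankCandidates(candidates: Candidate[]): Candidate[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...candidates].sort(
    (a, b) =>
      b.proficiency - a.proficiency ||             // 1. proficiency, descending
      a.avgResponseMs - b.avgResponseMs ||         // 2. response time, ascending
      a.lastSessionEndedAt - b.lastSessionEndedAt  // 3. longest idle first
  );
}

const ranked = rankCandidates([
  { id: 'a', proficiency: 4, avgResponseMs: 3000, lastSessionEndedAt: 1000 },
  { id: 'b', proficiency: 5, avgResponseMs: 8000, lastSessionEndedAt: 2000 },
  { id: 'c', proficiency: 5, avgResponseMs: 4000, lastSessionEndedAt: 3000 },
]);
console.log(ranked.map((c) => c.id)); // ['c', 'b', 'a']
```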
Push notification fallback
WebSocket connections aren't reliable on mobile. The app might be in the background. The operating system might have killed the WebSocket connection to save battery. The network might have switched from Wi-Fi to cellular.
When the WebSocket delivery fails (no pong response within 3 seconds), the server immediately sends a push notification:
async function offerMatch(
  interpreterId: string,
  request: MatchRequest,
  timeoutMs: number
): Promise<boolean> {
  // Try WebSocket first
  const wsDelivered = await sendViaWebSocket(interpreterId, {
    type: 'match_offer',
    requestId: request.id,
    language: request.language,
    priority: request.priority,
  });

  if (!wsDelivered) {
    // Fallback to push notification within 3 seconds
    await sendPushNotification(interpreterId, {
      title: 'Session Request',
      body: `${request.language} interpreter needed`,
      data: { requestId: request.id },
    });
  }

  // Wait for response regardless of delivery method
  return waitForResponse(interpreterId, request.id, timeoutMs);
}
The push notification opens the app and establishes a WebSocket connection. The interpreter can then accept the match through the normal flow.
This fallback path adds 2-3 seconds to the matching time. We track the ratio of WebSocket vs push notification deliveries to monitor connection health.
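The `waitForResponse` call is not shown above. One way it might work is a registry of pending offers, each with its own timer, so that a reply arriving over either delivery path settles the same promise. The sketch below is an assumption about the shape of such a helper: the `PendingOffers` class, the `settle` method, and the decision to treat a timeout as a decline are all illustrative.

```typescript
type Resolver = (accepted: boolean) => void;

class PendingOffers {
  private pending = new Map<
    string,
    { resolve: Resolver; timer: ReturnType<typeof setTimeout> }
  >();

  // One request may be offered to several interpreters in turn, so key by both ids.
  private key(interpreterId: string, requestId: string): string {
    return `${interpreterId}:${requestId}`;
  }

  // Resolves true on accept, false on decline or when the response window closes.
  waitForResponse(interpreterId: string, requestId: string, timeoutMs: number): Promise<boolean> {
    const key = this.key(interpreterId, requestId);
    return new Promise<boolean>((resolve) => {
      const timer = setTimeout(() => {
        this.pending.delete(key);
        resolve(false); // assumption: a timeout is handled like a decline
      }, timeoutMs);
      this.pending.set(key, { resolve, timer });
    });
  }

  // Called by the message handler when an accept or decline arrives.
  // Returns false if no offer is pending (already answered, or timed out).
  settle(interpreterId: string, requestId: string, accepted: boolean): boolean {
    const key = this.key(interpreterId, requestId);
    const entry = this.pending.get(key);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(key);
    entry.resolve(accepted);
    return true;
  }
}
```

Because `settle` returns `false` for an unknown key, a late accept from an interpreter whose window has already closed can be ignored rather than creating a second session for a request that has moved on to the next candidate.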
Monitoring
We track:
- Matching latency: time from request to session creation (p50, p95, p99)
- Match rate: percentage of requests that find an interpreter within 60 seconds
- WebSocket delivery rate: percentage of match offers delivered via WebSocket vs push notification
- Interpreter response time: how long interpreters take to accept or decline
- Queue depth: number of unmatched requests waiting
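For readers unfamiliar with the p50/p95/p99 notation, the sketch below computes percentiles from a batch of latency samples using the nearest-rank method. It is only an illustration: the `percentile` function is hypothetical, and a production metrics pipeline would more likely rely on its monitoring system's histogram support than on sorting raw samples.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error('percentile requires at least one sample');
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Ten matching latencies in milliseconds (made-up values for illustration).
const latencies = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000];
console.log(percentile(latencies, 50)); // 5000
console.log(percentile(latencies, 95)); // 10000
```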
Alerts fire when:
- p95 matching latency exceeds 45 seconds
- Match rate drops below 85%
- WebSocket delivery rate drops below 70% (indicates a systemic connection issue)
- Queue depth exceeds 10 for more than 5 minutes
These metrics are on a real-time dashboard that the engineering team monitors. When matching latency degrades, we can usually identify the cause within minutes: a region with no available interpreters, a spike in requests from a specific hospital, or a WebSocket connectivity issue.
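The alert thresholds listed above can be expressed as a small pure function over a metrics snapshot. This is a sketch under stated assumptions: the `MetricsSnapshot` shape, its field names, and the `evaluateAlerts` function are invented for illustration, and in practice rules like these usually live in the alerting system's own configuration rather than in application code.

```typescript
// Hypothetical snapshot of the metrics described above.
interface MetricsSnapshot {
  p95MatchingLatencyMs: number;
  matchRate: number;              // fraction of requests matched within 60 s (0..1)
  webSocketDeliveryRate: number;  // fraction of offers delivered via WebSocket (0..1)
  queueDepth: number;
  minutesQueueAboveLimit: number; // how long queue depth has stayed above 10
}

function evaluateAlerts(m: MetricsSnapshot): string[] {
  const alerts: string[] = [];
  if (m.p95MatchingLatencyMs > 45000) alerts.push('p95 matching latency above 45 s');
  if (m.matchRate < 0.85) alerts.push('match rate below 85%');
  if (m.webSocketDeliveryRate < 0.7) alerts.push('WebSocket delivery rate below 70%');
  if (m.queueDepth > 10 && m.minutesQueueAboveLimit > 5) {
    alerts.push('queue depth above 10 for more than 5 minutes');
  }
  return alerts;
}

const fired = evaluateAlerts({
  p95MatchingLatencyMs: 52000,
  matchRate: 0.91,
  webSocketDeliveryRate: 0.64,
  queueDepth: 3,
  minutesQueueAboveLimit: 0,
});
console.log(fired);
// ['p95 matching latency above 45 s', 'WebSocket delivery rate below 70%']
```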
What we learned
Real-time matching on mobile is a systems problem, not a feature. The WebSocket connection management, the state machine, the fallback mechanisms, the monitoring: these aren't optional additions to a matching algorithm. They're the matching system.
The matching algorithm itself (rank interpreters by language proficiency, response time history, and time since last session) is the simplest part. The complexity is in making it work reliably across unreliable networks, diverse devices, and all hours of the day.
The state machine diagram for interpreter availability is something I've been trying to figure out how to model for a similar matching system. This is the most concrete treatment of the problem I've found. Thank you.
The push notification fallback to WebSocket pattern is clever. We've been debating whether to implement something similar. Useful to see it described in a production context rather than as a hypothetical.