How would you design the data model for a chat application UI in React?
Normalize: `messages: Map<id, Message>`, `channels: Map<id, Channel>`, `participants: Map<id, User>`, ordered `messageIdsByChannel: Map<channelId, id[]>`. Per-channel pagination cursors. Pending outbox keyed by tempId. Drafts per channel. Realtime updates merge into the normalized stores. UI selectors compute lists on demand. React Query for fetch + WebSocket for push.
Cross-link to [[design-the-send-message-feature-like-slack-ui-client-logic]] for the action layer.
Normalized state
```ts
type User = { id: string; name: string; avatarUrl?: string }; // minimal assumed shape

type OutboxEntry = { tempId: string; channelId: string; body: string; attempts: number }; // minimal assumed shape

type Message = {
  id: string;
  tempId?: string;
  channelId: string;
  authorId: string;
  body: string;
  status: 'sent' | 'pending' | 'failed';
  createdAt: string;
  editedAt?: string;
};

type Channel = {
  id: string;
  name: string;
  memberIds: string[];
  unreadCount: number;
  lastReadAt: string;
};

type ChatState = {
  channels: Record<string, Channel>;
  messages: Record<string, Message>; // by id
  messageIdsByChannel: Record<string, string[]>; // ordered, paginated
  users: Record<string, User>;
  outbox: OutboxEntry[];
  drafts: Record<string, string>; // keyed by channelId
  pagination: Record<string, { cursor: string | null; hasMore: boolean; loading: boolean }>;
};
```

Normalization makes updates and lookups fast and avoids array scans.
Why normalize
- An edit hits one record, not every list that contains it.
- Joining ("who wrote this message") = O(1) lookup.
- Deduping realtime updates is trivial — set by id.
Adding messages
```ts
function applyIncoming(state: ChatState, msg: Message) {
  if (state.messages[msg.id]) return; // dedup: already applied by id
  state.messages[msg.id] = msg;
  // Server echo of our own optimistic send: drop the temp entry
  if (msg.tempId) delete state.messages[msg.tempId];
  const ids = (state.messageIdsByChannel[msg.channelId] ?? []).filter((id) => id !== msg.tempId);
  // upperBound: binary search for the first index whose createdAt exceeds msg.createdAt
  const i = upperBound(ids, msg.createdAt, (id) => state.messages[id].createdAt);
  state.messageIdsByChannel[msg.channelId] = [...ids.slice(0, i), msg.id, ...ids.slice(i)];
}
```

Pagination
Cursor-based, tracked per channel. `GET /messages?channel=X&before=<cursor>` returns the next N older messages. The channel's ordered id array is prepended with the new page, and the response's cursor is stored for the next fetch.
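The merge step can be sketched as a pure function over trimmed-down types (the `applyOlderPage` name and `Page` shape are assumptions, not a fixed API; it assumes the server returns the page in chronological order and older than anything cached):

```ts
type Message = { id: string; channelId: string; body: string; createdAt: string };
type Pagination = { cursor: string | null; hasMore: boolean; loading: boolean };
type Page = { messages: Message[]; nextCursor: string | null };
type State = {
  messages: Record<string, Message>;
  messageIdsByChannel: Record<string, string[]>;
  pagination: Record<string, Pagination>;
};

// Merge one older page: write each message into the id-keyed store,
// prepend its ids to the channel's order array, save the next cursor.
function applyOlderPage(state: State, channelId: string, page: Page): void {
  for (const m of page.messages) state.messages[m.id] = m;
  const existing = state.messageIdsByChannel[channelId] ?? [];
  const olderIds = page.messages.map((m) => m.id).filter((id) => !existing.includes(id));
  state.messageIdsByChannel[channelId] = [...olderIds, ...existing];
  state.pagination[channelId] = {
    cursor: page.nextCursor,
    hasMore: page.nextCursor !== null, // null cursor = history exhausted
    loading: false,
  };
}
```

Because ids are deduped against what's already cached, refetching an overlapping page after a reconnect is harmless.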
Optimistic send
```ts
const tempId = uuid();
state.messages[tempId] = { id: tempId, tempId, channelId, authorId, body, status: 'pending', createdAt: new Date().toISOString() };
state.messageIdsByChannel[channelId].push(tempId);

api.send({ tempId, channelId, body }).then((real) => {
  delete state.messages[tempId];
  state.messages[real.id] = real;
  // Swap tempId for real.id in the order array
  const ids = state.messageIdsByChannel[channelId];
  ids[ids.indexOf(tempId)] = real.id;
}, () => {
  state.messages[tempId].status = 'failed'; // keep visible for retry
});
```

Drafts
Store `drafts[channelId] = inputValue`. A draft survives channel switches, and survives reloads too if persisted (e.g. to localStorage).
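A minimal persistence sketch, assuming a `Storage`-like object (localStorage in the browser) and a `draft:<channelId>` key scheme of our own choosing:

```ts
// Storage-like shim type so the same code works against localStorage
// in the browser or any in-memory substitute in tests.
type StorageLike = {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
};

// Persist the draft for one channel (key scheme is an assumption).
function saveDraft(storage: StorageLike, channelId: string, text: string): void {
  storage.setItem(`draft:${channelId}`, text);
}

// Restore a channel's draft; empty string when none was saved.
function loadDraft(storage: StorageLike, channelId: string): string {
  return storage.getItem(`draft:${channelId}`) ?? '';
}
```

In practice the save would be debounced from the composer's onChange rather than called on every keystroke.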
Realtime layer
WebSocket subscription:
```ts
ws.onmessage = (e) => {
  const evt = JSON.parse(e.data);
  switch (evt.type) {
    case 'message:new': applyIncoming(state, evt.payload); break;
    case 'message:edit': applyEdit(state, evt.payload); break;
    case 'message:delete': applyDelete(state, evt.payload); break;
  }
};
```

On reconnect, fetch the messages missed since the last known timestamp and merge them through the same handlers.
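The `applyEdit` and `applyDelete` handlers referenced in the switch are not defined above; a minimal sketch, with types trimmed to the relevant fields:

```ts
type Message = { id: string; channelId: string; body: string; editedAt?: string };
type State = {
  messages: Record<string, Message>;
  messageIdsByChannel: Record<string, string[]>;
};

// Edit: patch the single id-keyed record. If the message isn't cached,
// ignore the event; it will be fetched with the right body on demand.
function applyEdit(state: State, patch: { id: string; body: string; editedAt: string }): void {
  const existing = state.messages[patch.id];
  if (!existing) return;
  state.messages[patch.id] = { ...existing, body: patch.body, editedAt: patch.editedAt };
}

// Delete: hard-remove from the store and the channel's order array
// (a soft-delete tombstone is the alternative, per the edge cases below).
function applyDelete(state: State, payload: { id: string; channelId: string }): void {
  if (!state.messages[payload.id]) return;
  delete state.messages[payload.id];
  const ids = state.messageIdsByChannel[payload.channelId] ?? [];
  state.messageIdsByChannel[payload.channelId] = ids.filter((id) => id !== payload.id);
}
```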
Selectors
UI components select from the normalized state:
`useMessages(channelId)` → `ids.map(id => state.messages[id])`
`useMessageWithAuthor(id)` → `{ ...message, author: users[authorId] }`

Memoize derived selectors (e.g. with reselect) for expensive joins.
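A hand-rolled reselect-style memo shows why this keeps renders cheap: the joined array is rebuilt only when an input it reads actually changes, so unchanged channels keep referential equality (the `makeSelectMessages` factory name is ours, not a library API):

```ts
type Message = { id: string; authorId: string; body: string };
type User = { id: string; name: string };
type State = {
  messages: Record<string, Message>;
  users: Record<string, User>;
  messageIdsByChannel: Record<string, string[]>;
};

// One memo cell per component instance (reselect-style, hand-rolled).
function makeSelectMessages() {
  let lastIds: string[] | undefined;
  let lastMessages: Record<string, Message> | undefined;
  let lastResult: Message[] = [];
  return (state: State, channelId: string): Message[] => {
    const ids = state.messageIdsByChannel[channelId] ?? [];
    // Cache hit: both inputs are the same references as last time
    if (ids === lastIds && state.messages === lastMessages) return lastResult;
    lastIds = ids;
    lastMessages = state.messages;
    lastResult = ids.map((id) => state.messages[id]);
    return lastResult;
  };
}
```

The stable result reference is what lets `React.memo` / virtualized rows skip re-rendering when an unrelated part of the store updates.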
Render layer
- Virtualized message list (TanStack Virtual).
- Composer state separate from list (per-channel draft).
- Sticky day-dividers computed by walking the visible range.
Read receipts / unread counts
`channel.unreadCount` is computed server-side; the client tracks `lastReadAt`. When the user scrolls to the bottom of a channel, send a read receipt.
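The client-side half can be sketched as a pure update that also tells the caller whether a receipt is worth sending, so duplicate scroll events don't spam the server (function name and return convention are assumptions):

```ts
type Channel = { id: string; unreadCount: number; lastReadAt: string };
type State = { channels: Record<string, Channel> };

// Zero the local unread count and record lastReadAt. Returns true only
// when something changed, i.e. when a read receipt should be POSTed.
function markChannelRead(state: State, channelId: string, now: string): boolean {
  const ch = state.channels[channelId];
  if (!ch || (ch.unreadCount === 0 && ch.lastReadAt >= now)) return false;
  ch.unreadCount = 0;
  ch.lastReadAt = now;
  return true;
}
```

The hook-up point is the virtualized list's "reached bottom" callback, typically debounced.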
Edge cases
- Out-of-order WebSocket events (insert at correct position).
- Same user on multiple devices (dedup by id).
- Edits to messages not in cache (apply if the message is present; otherwise ignore, since it will be fetched on demand with the edit already applied).
- Deletes (soft delete with tombstone or hard remove from ordered array).
Why not denormalize
channel.messages = [...full message objects] looks simpler but:
- Edits to a message must traverse every list that holds a copy (rare for plain chat, but common once messages are shared: quotes, threads).
- Memory growth from duplicated payloads.
- Realtime reconciliation gets hairy.
Interview framing
"Normalize into id-keyed Maps: messages, channels, users, plus an ordered messageIdsByChannel for pagination. An edit hits one record; selectors join on demand. Pending sends live in an outbox keyed by tempId; on success replace the temp with the real id. WebSocket events merge into the normalized stores with dedup by id. Drafts per channel. Cursor pagination. Virtualized list for render. Edits and out-of-order events are easy because we have id-keyed data, not nested arrays. React Query for the cold fetches; WebSocket for push. The big anti-pattern is denormalized arrays of message objects per channel — every edit becomes a scan."
Follow-up questions
- Why normalize over nested?
- How do you handle out-of-order WebSocket events?
- What's an outbox and why?
Common mistakes
- Denormalized nested structures.
- No dedup on incoming events.
- Drafts in component state, lost on channel switch.
Performance considerations
- Normalization makes updates O(1). Virtualize the list. Memoize selectors.
Edge cases
- Out-of-order events.
- Edits to off-screen messages.
- Multiple devices.
- Long disconnect + resync.
Real-world examples
- Slack, Discord, Linear comments, Notion comments, Replit chat.