Back to JavaScript
JavaScript
easy
mid

How would you implement a DOM like data structure in JavaScript?

A node tree: `{ tag, attrs, children, parent }`. Operations: `createNode`, `append`, `remove`, `find` (by tag/id/predicate via DFS), `traverse`, `toHTML` (serialize). Mirror the DOM's parent/child invariants — appending a node detaches it from its previous parent. Useful for templating engines, virtual DOM exercises, and tree manipulation problems.

4 min read·~30 min to think through

Building a DOM-like data structure is a tree problem with invariants — parent/child references must stay consistent. The bugs people make are about those invariants, not the API.

The node shape

js
class Node {
  constructor(tag, attrs = {}) {
    this.tag = tag;
    this.attrs = attrs;
    this.children = [];
    this.parent = null;
  }
}

Append — detach first

js
append(child) {
  if (child.parent) child.parent.remove(child);   // detach from old parent
  child.parent = this;
  this.children.push(child);
  return child;
}

This mirrors real DOM behavior — appendChild moves a node if it already has a parent.

Remove

js
remove(child) {
  const i = this.children.indexOf(child);
  if (i !== -1) {
    this.children.splice(i, 1);
    child.parent = null;
  }
}

Insert before / after

js
insertBefore(child, ref) {
  if (child.parent) child.parent.remove(child);
  const i = this.children.indexOf(ref);
  this.children.splice(i, 0, child);
  child.parent = this;
}

Traversal (DFS)

js
traverse(visit) {
  visit(this);
  for (const c of this.children) c.traverse(visit);
}

find(predicate) {
  if (predicate(this)) return this;
  for (const c of this.children) {
    const f = c.find(predicate);
    if (f) return f;
  }
  return null;
}

findAll(predicate, out = []) {
  if (predicate(this)) out.push(this);
  for (const c of this.children) c.findAll(predicate, out);
  return out;
}

Queries — by tag / id / class

js
getElementById(id)            { return this.find((n) => n.attrs.id === id); }
getElementsByTagName(tag)     { return this.findAll((n) => n.tag === tag); }
getElementsByClassName(cls)   {
  return this.findAll((n) => (n.attrs.class || "").split(" ").includes(cls));
}

Serialization — toHTML

js
toHTML() {
  const attrs = Object.entries(this.attrs)
    .map(([k, v]) => ` ${k}="${escapeAttr(v)}"`)
    .join("");
  if (VOID_TAGS.has(this.tag)) return `<${this.tag}${attrs}/>`;
  const inner = this.children.map((c) =>
    c instanceof TextNode ? escapeText(c.text) : c.toHTML()
  ).join("");
  return `<${this.tag}${attrs}>${inner}</${this.tag}>`;
}

Don't forget to escape attribute values and text<, >, &, " — or you've built an XSS-prone templating engine.

Text nodes

A separate TextNode class with a text field — same parent pointer, no children. Mixing strings and Node objects in children is also fine.

Helper: createNode / h

A factory feels like JSX/h:

js
const h = (tag, attrs, ...children) => {
  const n = new Node(tag, attrs);
  for (const c of children.flat()) {
    n.append(typeof c === "string" ? new TextNode(c) : c);
  }
  return n;
};

const tree = h("div", { class: "wrap" },
  h("h1", {}, "Hello"),
  h("p", {}, "World")
);

Invariants to maintain

  • A node has exactly one parent at a time. Append detaches first.
  • parent.children and child.parent agree — never set one without the other.
  • No cycles — appending a node to its descendant should throw or be checked.
  • Cloning copies the subtree without reparenting the original.

Interview framing

"Each node has tag, attrs, children, parent. The invariant is parent/child consistency: append detaches from the previous parent first; remove nulls the child's parent pointer. DFS powers find, findAll, and the various getElement* queries. Serialization is recursive with proper escaping — otherwise the templating engine is XSS-prone. A small h(tag, attrs, ...children) factory makes construction ergonomic. The subtle correctness work is maintaining the parent/child invariant — that's where bugs hide."

Follow-up questions

  • Why does append detach first?
  • How would you prevent cycles (appending an ancestor)?
  • What's the difference between an HTML serializer that escapes and one that doesn't?
  • How would you implement a basic CSS selector engine on top?

Common mistakes

  • Setting parent without removing from old parent.
  • Iterating children while mutating them.
  • No escaping on toHTML — XSS.
  • No detach in append — node appears twice.

Performance considerations

  • O(n) traversal. For frequent id/class lookups, maintain an index. Avoid O(n) array splices in hot insert/remove paths.

Edge cases

  • Append node to itself / its descendant.
  • Void tags (img, br) shouldn't render children.
  • Empty children, empty attrs.

Real-world examples

  • Virtual DOM trees (React fiber).
  • Cheerio/JSDOM server-side DOM.
  • Email/HTML templating engines.

Senior engineer discussion

Seniors get the parent/child invariant right (detach-on-append, paired parent/children updates), escape on serialize, and reach for a real library (jsdom, parse5) for production rather than rolling one.

Related questions