Back to JavaScript
JavaScript
easy
mid

How would you implement a polyfill for JSON.parse?

A recursive descent parser: tokenize the JSON string, then parse values by type (object, array, string, number, true/false/null). Handle whitespace, escape sequences, nesting, and throw SyntaxError on malformed input. The realistic answer is explaining the parser structure, not a flawless full implementation.

5 min read·~20 min to think through

JSON.parse is a recursive descent parser — and the interview is mostly about whether you can describe the parser structure clearly, not produce a bug-free 200-line implementation.

The structure: tokenize, then parse recursively

You walk the string with a position cursor, and have a function per JSON value type, each recursing into the others:

js
function jsonParse(str) {
  let i = 0; // cursor

  function skipWhitespace() {
    while (i < str.length && /\s/.test(str[i])) i++;
  }

  function parseValue() {
    skipWhitespace();
    const ch = str[i];
    if (ch === "{") return parseObject();
    if (ch === "[") return parseArray();
    if (ch === '"') return parseString();
    if (ch === "-" || (ch >= "0" && ch <= "9")) return parseNumber();
    if (str.startsWith("true", i))  { i += 4; return true; }
    if (str.startsWith("false", i)) { i += 5; return false; }
    if (str.startsWith("null", i))  { i += 4; return null; }
    throw new SyntaxError(`Unexpected token at position ${i}`);
  }

  function parseObject() {
    const obj = {};
    i++; // consume '{'
    skipWhitespace();
    if (str[i] === "}") { i++; return obj; }
    while (true) {
      skipWhitespace();
      const key = parseString();          // keys are JSON strings
      skipWhitespace();
      if (str[i] !== ":") throw new SyntaxError("Expected ':'");
      i++;
      obj[key] = parseValue();            // recurse
      skipWhitespace();
      if (str[i] === ",") { i++; continue; }
      if (str[i] === "}") { i++; return obj; }
      throw new SyntaxError("Expected ',' or '}'");
    }
  }

  function parseArray() {
    const arr = [];
    i++; // consume '['
    skipWhitespace();
    if (str[i] === "]") { i++; return arr; }
    while (true) {
      arr.push(parseValue());             // recurse
      skipWhitespace();
      if (str[i] === ",") { i++; continue; }
      if (str[i] === "]") { i++; return arr; }
      throw new SyntaxError("Expected ',' or ']'");
    }
  }

  // parseString — handle escape sequences (\", \\, \n, \t, \uXXXX...)
  // parseNumber — handle sign, integer, fraction, exponent

  const result = parseValue();
  skipWhitespace();
  if (i < str.length) throw new SyntaxError("Unexpected trailing characters");
  return result;
}

The details / gotchas

  • Recursive descentparseValue dispatches by the next char; parseObject/parseArray call parseValue for their contents → handles arbitrary nesting.
  • Whitespace — skip it between every token.
  • Strings — the fiddly part: escape sequences (\", \\, \n, \t, \uXXXX).
  • Numbers — sign, integer part, optional fraction, optional exponent.
  • Strict JSON — no trailing commas, no comments, no single quotes, keys must be double-quoted strings — malformed input must throw SyntaxError.
  • Trailing characters after the root value → error.
  • The real JSON.parse also takes a reviver function — mention it; usually out of scope.

The framing

"It's a recursive descent parser. I walk the string with a cursor and have a parseValue that dispatches on the next character to parseObject, parseArray, parseString, parseNumber, or the literal keywords — and object/array parsers recurse back into parseValue, which is how arbitrary nesting works. The fiddly bits are string escape sequences and number formats, skipping whitespace between tokens, and being strict — no trailing commas or comments, throw SyntaxError on anything malformed, including trailing characters after the root value. I'd describe that structure clearly rather than claim a flawless full implementation."

Follow-up questions

  • What is recursive descent parsing?
  • What makes parsing strings the tricky part?
  • What inputs are valid JSON.stringify output but invalid for a lenient parser?
  • What does the reviver function argument do?

Common mistakes

  • Just calling eval() — unsafe and not really 'parsing'.
  • Not skipping whitespace between tokens.
  • Mishandling string escape sequences.
  • Accepting invalid JSON (trailing commas, comments, single quotes).
  • Not erroring on trailing characters after the root value.

Performance considerations

  • O(n) over the input string. Deeply nested JSON can hit recursion limits; the native JSON.parse is implemented in C++ and far faster — the polyfill is an exercise in understanding parsing, not a production replacement.

Edge cases

  • Deeply nested structures (recursion depth).
  • Escape sequences and unicode \uXXXX in strings.
  • Numbers with exponents and negative signs.
  • Empty objects/arrays.
  • Malformed input — must throw SyntaxError, not return garbage.

Real-world examples

  • Understanding why a trailing comma breaks JSON.parse.
  • Recursive descent is the same structure used for many real parsers.

Senior engineer discussion

Seniors describe the recursive-descent structure (cursor + per-type parse functions + dispatch), identify strings/numbers and strictness as the hard parts, never reach for eval, and mention the reviver argument.

Related questions