Inside unjs/destr

destrUnJS 團隊發佈的套件,為 JSON.parse alternative,其本質還是基於 JSON.parse 之上,新增開發友善的功能,像是:

下方簡單展示 destrJSON.parse 使用上的差異,這裡參考版本為 v2.0.3

// fallback to original value if parsing fails
JSON.parse('foo');  // Uncaught SyntaxError: Unexpected token 'o', "foo" is not valid JSON
destr('foo');       // 'foo'
 
// known string values
JSON.parse('NaN'); // Uncaught SyntaxError: "NaN" is not valid JSON
destr('NaN');      // NaN
 
// prototype pollution prevention
const input = '{ "user": { "__proto__": { "isAdmin": true } } }';
JSON.parse(input); // { user: { __proto__: { isAdmin: true } } }
destr(input);      // { user: {} }

Supporting known string values

目前支援的字串格式有:true, false, undefined, null, nan, infinity, -infinity

下方程式碼的第 8-32 行的判斷,是基於效能考量。在上述支援的字串中,最大長度是 '-infinity'.length 即為 9,透過此判斷可以避免對 long string 調用不必要的 toLowerCase()

相關討論可以參考:issue#80, pull#81

function destr<T = unknown>(value: any, options: Options = {}): T {
  if (typeof value !== "string") {
    return value;
  }
 
  const _value = value.trim();
  // skip
  if (_value.length <= 9) {
    const _lval = _value.toLowerCase();
    if (_lval === "true") {
      return true as T;
    }
    if (_lval === "false") {
      return false as T;
    }
    if (_lval === "undefined") {
      return undefined as T;
    }
    if (_lval === "null") {
      // eslint-disable-next-line unicorn/no-null
      return null as T;
    }
    if (_lval === "nan") {
      return Number.NaN as T;
    }
    if (_lval === "infinity") {
      return Number.POSITIVE_INFINITY as T;
    }
    if (_lval === "-infinity") {
      return Number.NEGATIVE_INFINITY as T;
    }
  }
  // skip
}

Prototype pollution prevention

檢查流程大致如下:

  1. 透過正規表達式檢查 "__proto__":"constructor": 字串是否存在 (包含可能的 unicode 組合)
  2. 如果存在,回傳 JSON.parse(value, jsonParseTransform) 的結果,透過 reviver function 來清除不合法的 property,反之回傳 JSON.parse(value),節省 reviver 檢查 property 的開銷
const suspectProtoRx =
  /"(?:_|\\u0{2}5[Ff]){2}(?:p|\\u0{2}70)(?:r|\\u0{2}72)(?:o|\\u0{2}6[Ff])(?:t|\\u0{2}74)(?:o|\\u0{2}6[Ff])(?:_|\\u0{2}5[Ff]){2}"\s*:/;
const suspectConstructorRx =
  /"(?:c|\\u0063)(?:o|\\u006[Ff])(?:n|\\u006[Ee])(?:s|\\u0073)(?:t|\\u0074)(?:r|\\u0072)(?:u|\\u0075)(?:c|\\u0063)(?:t|\\u0074)(?:o|\\u006[Ff])(?:r|\\u0072)"\s*:/;
 
export function destr<T = unknown>(value: any, options: Options = {}): T {
  // skip
  try {
    if (suspectProtoRx.test(value) || suspectConstructorRx.test(value)) {
      if (options.strict) {
        throw new Error("[destr] Possible prototype pollution");
      }
      return JSON.parse(value, jsonParseTransform);
    }
    return JSON.parse(value);
  } catch (error) {
    if (options.strict) {
      throw error;
    }
    return value as T;
  }
}
 
function jsonParseTransform(key: string, value: any): any {
  if (
    key === "__proto__" ||
    (key === "constructor" &&
      value &&
      typeof value === "object" &&
      "prototype" in value)
  ) {
    warnKeyDropped(key);
    return;
  }
  return value;
}
 
function warnKeyDropped(key: string): void {
  console.warn(`[destr] Dropping "${key}" key to prevent prototype pollution.`);
}

Regular Expression

const suspectProtoRx =
  /"(?:_|\\u0{2}5[Ff]){2}(?:p|\\u0{2}70)(?:r|\\u0{2}72)(?:o|\\u0{2}6[Ff])(?:t|\\u0{2}74)(?:o|\\u0{2}6[Ff])(?:_|\\u0{2}5[Ff]){2}"\s*:/;

JSON.Parse() reviver

If the reviver function returns undefined (or returns no value — for example, if execution falls off the end of the function), the property is deleted from the object. Otherwise, the property is redefined to be the return value.

調用 jsonParseTransform 如果得到 early return 的 undefined,會把不合法的 property 給清除

function jsonParseTransform(key: string, value: any): any {
  if (
    key === "__proto__" ||
    (key === "constructor" &&
      value &&
      typeof value === "object" &&
      "prototype" in value)
  ) {
    warnKeyDropped(key);
    return;
  }
  return value;
}

Fallback to original value if parsing fails

除了對 JSON.parse,加一層 try catch 之外,在 JSON.parse 前,先透過 JsonSigRx 做簡易的判斷

const JsonSigRx = /^\s*["[{]|^\s*-?\d{1,16}(\.\d{1,17})?([Ee][+-]?\d+)?\s*$/;
 
function destr<T = unknown>(value: any, options: Options = {}): T {
  const _value = value.trim();
  if (_value.length <= 9) {
    // skip
  }
  if (!JsonSigRx.test(value)) {
    if (options.strict) {
      throw new SyntaxError("[destr] Invalid JSON");
    }
    return value as T;
  }
 
   try {
    if (suspectProtoRx.test(value) || suspectConstructorRx.test(value)) {
      if (options.strict) {
        throw new Error("[destr] Possible prototype pollution");
      }
      return JSON.parse(value, jsonParseTransform);
    }
    return JSON.parse(value);
  } catch (error) {
    if (options.strict) {
      throw error;
    }
    return value as T;
  }
  // skip
}

JsonSigRx

/^\s*["[{]|^\s*-?\d{1,16}(\.\d{1,17})?([Ee][+-]?\d+)?\s*$/

先簡單分成兩個部分:

再拆解 ^\s*-?\d{1,16}(\.\d{1,17})?([Ee][+-]?\d+

parsing number 相關可參考:issue#93, pull#94

Reference