Inside unjs/destr
destr 是 UnJS 團隊發佈的套件,為 JSON.parse
alternative,其本質還是基於 JSON.parse
之上,新增開發友善的功能,像是:
- supporting known string values (case insensitive)
- prototype pollution prevention
- fallback to original value if parsing fails
下方簡單展示 destr
與 JSON.parse
使用上的差異,這裡參考版本為 v2.0.3
// fallback to original value if parsing fails
JSON.parse('foo'); // Uncaught SyntaxError: Unexpected token 'o', "foo" is not valid JSON
destr('foo'); // 'foo'
// known string values
JSON.parse('NaN'); // Uncaught SyntaxError: "NaN" is not valid JSON
destr('NaN'); // NaN
// prototype pollution prevention
const input = '{ "user": { "__proto__": { "isAdmin": true } } }';
JSON.parse(input); // { user: { __proto__: { isAdmin: true } } }
destr(input); // { user: {} }
Supporting known string values
目前支援的字串格式有:true
, false
, undefined
, null
, nan
, infinity
, -infinity
下方程式碼的第 8-32 行的判斷,是基於效能考量。在上述支援的字串中,最大長度是 '-infinity'.length
即為 9
,透過此判斷可以避免對 long string 調用不必要的 toLowerCase()
function destr<T = unknown>(value: any, options: Options = {}): T {
if (typeof value !== "string") {
return value;
}
const _value = value.trim();
// skip
if (_value.length <= 9) {
const _lval = _value.toLowerCase();
if (_lval === "true") {
return true as T;
}
if (_lval === "false") {
return false as T;
}
if (_lval === "undefined") {
return undefined as T;
}
if (_lval === "null") {
// eslint-disable-next-line unicorn/no-null
return null as T;
}
if (_lval === "nan") {
return Number.NaN as T;
}
if (_lval === "infinity") {
return Number.POSITIVE_INFINITY as T;
}
if (_lval === "-infinity") {
return Number.NEGATIVE_INFINITY as T;
}
}
// skip
}
Prototype pollution prevention
檢查流程大致如下:
- 透過正規表達式檢查
"__proto__":
或"constructor":
字串是否存在 (包含可能的 unicode 組合) - 如果存在,回傳
JSON.parse(value, jsonParseTransform)
的結果,透過 reviver function 來清除不合法的 property,反之回傳JSON.parse(value)
,節省 reviver 檢查 property 的開銷
const suspectProtoRx =
/"(?:_|\\u0{2}5[Ff]){2}(?:p|\\u0{2}70)(?:r|\\u0{2}72)(?:o|\\u0{2}6[Ff])(?:t|\\u0{2}74)(?:o|\\u0{2}6[Ff])(?:_|\\u0{2}5[Ff]){2}"\s*:/;
const suspectConstructorRx =
/"(?:c|\\u0063)(?:o|\\u006[Ff])(?:n|\\u006[Ee])(?:s|\\u0073)(?:t|\\u0074)(?:r|\\u0072)(?:u|\\u0075)(?:c|\\u0063)(?:t|\\u0074)(?:o|\\u006[Ff])(?:r|\\u0072)"\s*:/;
export function destr<T = unknown>(value: any, options: Options = {}): T {
// skip
try {
if (suspectProtoRx.test(value) || suspectConstructorRx.test(value)) {
if (options.strict) {
throw new Error("[destr] Possible prototype pollution");
}
return JSON.parse(value, jsonParseTransform);
}
return JSON.parse(value);
} catch (error) {
if (options.strict) {
throw error;
}
return value as T;
}
}
function jsonParseTransform(key: string, value: any): any {
if (
key === "__proto__" ||
(key === "constructor" &&
value &&
typeof value === "object" &&
"prototype" in value)
) {
warnKeyDropped(key);
return;
}
return value;
}
function warnKeyDropped(key: string): void {
console.warn(`[destr] Dropping "${key}" key to prevent prototype pollution.`);
}
Regular Expression
const suspectProtoRx =
/"(?:_|\\u0{2}5[Ff]){2}(?:p|\\u0{2}70)(?:r|\\u0{2}72)(?:o|\\u0{2}6[Ff])(?:t|\\u0{2}74)(?:o|\\u0{2}6[Ff])(?:_|\\u0{2}5[Ff]){2}"\s*:/;
(?:)
為 non-capturing group,相關解釋可以參考 What is a non-capturing group in regular expressions?- 以
suspectProtoRx
第一組 non-capturing group` 為例:(?:_|\\u0{2}5[Ff])
其中\\u0{2}5[Ff]
是要比對_
在 unicode 中的代碼 (\u005F
) - 最後面的
"\s*:
表示在 json key field 結尾的"
與:
之間,可以接受任意個空白字元,例如:"__proto__" :
JSON.Parse() reviver
If the reviver function returns undefined (or returns no value — for example, if execution falls off the end of the function), the property is deleted from the object. Otherwise, the property is redefined to be the return value.
調用 jsonParseTransform
如果得到 early return 的 undefined
,會把不合法的 property 給清除
function jsonParseTransform(key: string, value: any): any {
if (
key === "__proto__" ||
(key === "constructor" &&
value &&
typeof value === "object" &&
"prototype" in value)
) {
warnKeyDropped(key);
return;
}
return value;
}
Fallback to original value if parsing fails
除了對 JSON.parse
,加一層 try catch 之外,在 JSON.parse
前,先透過 JsonSigRx
做簡易的判斷
const JsonSigRx = /^\s*["[{]|^\s*-?\d{1,16}(\.\d{1,17})?([Ee][+-]?\d+)?\s*$/;
function destr<T = unknown>(value: any, options: Options = {}): T {
const _value = value.trim();
if (_value.length <= 9) {
// skip
}
if (!JsonSigRx.test(value)) {
if (options.strict) {
throw new SyntaxError("[destr] Invalid JSON");
}
return value as T;
}
try {
if (suspectProtoRx.test(value) || suspectConstructorRx.test(value)) {
if (options.strict) {
throw new Error("[destr] Possible prototype pollution");
}
return JSON.parse(value, jsonParseTransform);
}
return JSON.parse(value);
} catch (error) {
if (options.strict) {
throw error;
}
return value as T;
}
// skip
}
JsonSigRx
/^\s*["[{]|^\s*-?\d{1,16}(\.\d{1,17})?([Ee][+-]?\d+)?\s*$/
先簡單分成兩個部分:
/^\s*["[{]
:是否為任意空白字元開頭,接著[
或{
的字串^\s*-?\d{1,16}(\.\d{1,17})?([Ee][+-]?\d+)
:不超過安全範圍位元的數字
再拆解 ^\s*-?\d{1,16}(\.\d{1,17})?([Ee][+-]?\d+
:
^\s*-?\d{1,16}
:\s*
: 任意空白字元-?
:零或一次-
(for optional negative numbers)\d{1,16}
:比對小數點前,最多 16 位
(\.\d{1,17})?
:比對小數點後,最多 17 位?([Ee][+-]?\d+
):比對表示指數的語法,e.g.123E10
中的E10
parsing number 相關可參考:issue#93, pull#94