# ES Module Lexer

[![Build Status][actions-image]][actions-url]

A JS module syntax lexer used in [es-module-shims](https://github.com/guybedford/es-module-shims).

Outputs the list of exports and locations of import specifiers, including dynamic import and import meta handling.

A very small single JS file (4KiB gzipped) that includes inlined WebAssembly for very fast source analysis of ECMAScript module syntax only.

For an example of the performance, Angular 1 (720KiB) is fully parsed in 5ms, whereas Acorn, the fastest JS parser, takes over 100ms.

_Comprehensively handles the JS language grammar while remaining small and fast: ~10ms per MB of JS cold and ~5ms per MB of JS warm; [see benchmarks](#benchmarks) for more info._

> [Built with](https://github.com/guybedford/es-module-lexer/blob/main/chompfile.toml) [Chomp](https://chompbuild.com/)

### Usage

```
npm install es-module-lexer
```

For use in CommonJS:

```js
const { init, parse } = require('es-module-lexer');

(async () => {
  // either await init, or call parse asynchronously
  // this is necessary for the WebAssembly boot
  await init;

  const source = 'export var p = 5';
  const [imports, exports] = parse(source);

  // Returns "p"
  source.slice(exports[0].s, exports[0].e);
  // Returns "p"
  source.slice(exports[0].ls, exports[0].le);
})();
```

An ES module version is also available:

```js
import { init, parse } from 'es-module-lexer';

(async () => {
  await init;

  const source = `
    import { name } from 'mod\\u1011';
    import json from './json.json' assert { type: 'json' }
    export var p = 5;
    export function q () {

    };
    export { x as 'external name' } from 'external';

    // Comments provided to demonstrate edge cases
    import /*comment!*/ ( 'asdf', { assert: { type: 'json' } });
    import /*comment!*/.meta.asdf;
  `;

  const [imports, exports] = parse(source, 'optional-sourcename');

  // Returns "modထ"
  imports[0].n
  // Returns "mod\u1011"
  source.slice(imports[0].s, imports[0].e);
  // "s" = start
  // "e" = end

  // Returns "import { name } from 'mod\u1011'"
  source.slice(imports[0].ss, imports[0].se);
  // "ss" = statement start
  // "se" = statement end

  // Returns "{ type: 'json' }"
  source.slice(imports[1].a, imports[1].se);
  // "a" = assert, -1 for no assertion

  // Returns "external"
  source.slice(imports[2].s, imports[2].e);

  // Returns "p"
  source.slice(exports[0].s, exports[0].e);
  // Returns "p"
  source.slice(exports[0].ls, exports[0].le);
  // Returns "q"
  source.slice(exports[1].s, exports[1].e);
  // Returns "q"
  source.slice(exports[1].ls, exports[1].le);
  // Returns "'external name'"
  source.slice(exports[2].s, exports[2].e);
  // Returns -1
  exports[2].ls;
  // Returns -1
  exports[2].le;

  // Dynamic imports are indicated by imports[3].d > -1
  // In this case the "d" index is the start of the dynamic import bracket
  // Returns true
  imports[3].d > -1;

  // Returns "asdf" (only for string literal dynamic imports)
  imports[3].n
  // Returns "import /*comment!*/ ( 'asdf', { assert: { type: 'json' } })"
  source.slice(imports[3].ss, imports[3].se);
  // Returns "'asdf'"
  source.slice(imports[3].s, imports[3].e);
  // Returns "( 'asdf', { assert: { type: 'json' } })"
  source.slice(imports[3].d, imports[3].se);
  // Returns "{ assert: { type: 'json' } }"
  source.slice(imports[3].a, imports[3].se - 1);

  // For non-string dynamic import expressions:
  // - n will be undefined
  // - a is currently -1 even if there is an assertion
  // - e is currently the character before the closing )

  // For nested dynamic imports, the se value of the outer import is -1 as end tracking does not
  // currently support nested dynamic imports

  // import.meta is indicated by imports[4].d === -2
  // Returns true
  imports[4].d === -2;
  // Returns "import /*comment!*/.meta"
  source.slice(imports[4].s, imports[4].e);
  // ss and se are the same for import meta
})();
```
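
The `s` / `e` offsets above are enough to rewrite specifiers in place. The following is a minimal sketch rather than part of the library: `rewriteSpecifiers` and `mapSpecifier` are hypothetical names, and the record shape (`n`, `s`, `e`, `d`) is assumed to match what the example above documents. It relies on the behavior shown there: static specifiers exclude their quotes from `s`..`e`, while string-literal dynamic imports include them.

```js
// Hypothetical helper, not part of es-module-lexer's API.
// `imports` is assumed to be the first element of the array returned by parse(source).
function rewriteSpecifiers(source, imports, mapSpecifier) {
  let out = source;
  // Walk backwards so that earlier offsets stay valid after each replacement.
  for (let i = imports.length - 1; i >= 0; i--) {
    const { n, s, e, d } = imports[i];
    // Skip import.meta (d === -2) and non-string dynamic imports (n undefined).
    if (d === -2 || n === undefined) continue;
    const next = mapSpecifier(n);
    if (next === n) continue;
    const quoted = JSON.stringify(next);
    // Dynamic string imports: s..e covers the quotes, so emit a full string literal.
    // Static imports: s..e excludes the quotes, so emit only the escaped contents
    // (also escaping single quotes, since the original quote style is unknown).
    const replacement = d > -1 ? quoted : quoted.slice(1, -1).replace(/'/g, "\\'");
    out = out.slice(0, s) + replacement + out.slice(e);
  }
  return out;
}
```

Once `init` has resolved, this could be called as `rewriteSpecifiers(source, parse(source)[0], n => n.replace(/\.js$/, '.mjs'))`. Iterating from the last record to the first avoids having to track how much each replacement shifts the later offsets.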

### CSP asm.js Build

The default version of the library uses Wasm and (safe) eval usage for performance and a minimal footprint.

Neither of these represents a security escalation possibility, since there are no execution string injection vectors, but they can still violate existing CSP policies for applications.

For a version that works with CSP eval disabled, use the `es-module-lexer/js` build:

```js
import { parse } from 'es-module-lexer/js';
```

Instead of WebAssembly, this uses an asm.js build which is almost as fast as the Wasm version ([see benchmarks below](#benchmarks)).

### Escape Sequences

To handle escape sequences in specifier strings, the `.n` field of imported specifiers will be provided where possible.

For dynamic import expressions, this field will be `undefined` if the specifier is not a valid JS string literal.
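
As an illustration of how a consumer might handle both cases, here is a small sketch. `specifierOf` is a hypothetical helper, and the record fields (`n`, `s`, `e`) are assumed to behave as described in the usage example: `n` holds the unescaped name when available, and `s`..`e` always delimits the raw source text.

```js
// Hypothetical helper, not part of es-module-lexer's API.
// `imp` is assumed to be one import record from parse(source)[0].
function specifierOf(source, imp) {
  // Prefer the unescaped name when the lexer provides it.
  if (imp.n !== undefined) return { value: imp.n, unescaped: true };
  // Otherwise fall back to the raw source text between the s and e offsets,
  // e.g. the expression of a non-string dynamic import.
  return { value: source.slice(imp.s, imp.e), unescaped: false };
}
```

The `unescaped` flag lets the caller tell a resolved specifier apart from a raw expression that still needs to be evaluated or reported as dynamic.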

### Facade Detection

Facade modules that only use import / export syntax can be detected via the third return value:

```js
const [,, facade] = parse(`
  export * from 'external';
  import * as ns from 'external2';
  export { a as b } from 'external3';
  export { ns };
`);

facade === true;
```

### ESM Detection

Modules that use ESM syntax can be detected via the fourth return value:

```js
const [,,, hasModuleSyntax] = parse(`
  export {}
`);

hasModuleSyntax === true;
```

Dynamic imports are ignored since they can be used in non-ESM files.

```js
const [,,, hasModuleSyntax] = parse(`
  import('./foo.js')
`);

hasModuleSyntax === false;
```
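
The third and fourth return values can be combined into a single classification. This is only a sketch: `classifySource` and its category names are hypothetical, and the input is assumed to be the array returned by `parse(source)` in the order shown above.

```js
// Hypothetical helper; the category names are illustrative, not part of the library.
// `result` is assumed to be [imports, exports, facade, hasModuleSyntax].
function classifySource(result) {
  const [, , facade, hasModuleSyntax] = result;
  // No static import / export syntax: could be a classic script or CommonJS.
  if (!hasModuleSyntax) return 'script-or-commonjs';
  // Module syntax only, with no other code, is a facade.
  return facade ? 'facade' : 'module';
}
```

Checking `hasModuleSyntax` first keeps a source that contains only dynamic imports out of the module categories, in line with the note above.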

### Environment Support

Node.js 10+, and [all browsers with WebAssembly support](https://caniuse.com/#feat=wasm).

### Grammar Support

* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.

### Limitations

The lexing approach is designed to deal with the full language grammar, including RegEx / division operator ambiguity, through backtracking and paren / brace tracking.

The only limitation of the reduced parser is that the "exports" list may not correctly gather all export identifiers in the following edge cases:

```js
// Only "a" is detected as an export, "q" isn't
export var a = 'asdf', q = z;

// "b" is not detected as an export
export var { a: b } = asdf;
```

The above cases are handled gracefully: the lexer keeps going, but it does not detect the export names noted above.

### Benchmarks

Benchmarks can be run with `npm run bench`.

Current results for a high spec machine:

#### Wasm Build

```
Module load time
> 5ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 18ms

Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 3ms
test/samples/angular.min.js (188 KiB)
> 1ms
test/samples/d3.js (508 KiB)
> 3ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 4.32ms
test/samples/rollup.min.js (429 KiB)
> 2.16ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 14.16ms
```

#### JS Build (asm.js)

```
Module load time
> 2ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 34ms

Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 3ms
test/samples/angular.min.js (188 KiB)
> 1ms
test/samples/d3.js (508 KiB)
> 3ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 5ms
test/samples/rollup.min.js (429 KiB)
> 3.04ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 17.12ms
```
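
To compare these totals with a per-megabyte throughput figure, the "All Samples" rows can be normalized by size. The snippet below is a small arithmetic sketch, not part of the benchmark suite: `msPerMiB` is a hypothetical helper and the numbers are copied from the two result listings above.

```js
// Convert a benchmark result (size in KiB, time in ms) into ms per MiB.
function msPerMiB(sizeKiB, timeMs) {
  return timeMs / (sizeKiB / 1024);
}

// Figures copied from the "All Samples" rows of both builds.
const rows = [
  ['Wasm cold', 3123, 18],
  ['Wasm warm', 3123, 14.16],
  ['asm.js cold', 3123, 34],
  ['asm.js warm', 3123, 17.12],
];

for (const [label, sizeKiB, timeMs] of rows) {
  console.log(`${label}: ${msPerMiB(sizeKiB, timeMs).toFixed(1)} ms/MiB`);
}
```

Results will vary by machine, so rerunning `npm run bench` and substituting local figures gives a more meaningful comparison.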

### Building

This project uses [Chomp](https://chompbuild.com) for building.

With Chomp installed, download the WASI SDK 12.0 from https://github.com/WebAssembly/wasi-sdk/releases/tag/wasi-sdk-12.

- [Linux](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-linux.tar.gz)
- [Windows (MinGW)](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-mingw.tar.gz)
- [macOS](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-macos.tar.gz)

Locate the WASI SDK as a sibling folder, or customize the path via the `WASI_PATH` environment variable.

The Emscripten emsdk is also assumed to be a sibling folder, or its path can be set via the `EMSDK_PATH` environment variable.

Example setup:

```
git clone https://github.com/guybedford/es-module-lexer
git clone https://github.com/emscripten-core/emsdk
cd emsdk
git checkout 1.40.1-fastcomp
./emsdk install 1.40.1-fastcomp
cd ..
wget https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-linux.tar.gz
gunzip wasi-sdk-12.0-linux.tar.gz
tar -xf wasi-sdk-12.0-linux.tar
mv wasi-sdk-12.0-linux.tar wasi-sdk-12.0
cargo install chompbuild
cd es-module-lexer
chomp test
```

For the `asm.js` build, a git clone of `emsdk` is assumed to be a sibling folder as well.

### License

MIT

[actions-image]: https://github.com/guybedford/es-module-lexer/actions/workflows/build.yml/badge.svg
[actions-url]: https://github.com/guybedford/es-module-lexer/actions/workflows/build.yml