Spaces:
Sleeping
Sleeping
File size: 9,330 Bytes
1df763a |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 |
# ES Module Lexer
[![Build Status][actions-image]][actions-url]
A JS module syntax lexer used in [es-module-shims](https://github.com/guybedford/es-module-shims).
Outputs the list of exports and locations of import specifiers, including dynamic import and import meta handling.
A very small single JS file (4KiB gzipped) that includes inlined Web Assembly for very fast source analysis of ECMAScript module syntax only.
For an example of the performance, Angular 1 (720KiB) is fully parsed in 5ms, in comparison to the fastest JS parser, Acorn which takes over 100ms.
_Comprehensively handles the JS language grammar while remaining small and fast. - ~10ms per MB of JS cold and ~5ms per MB of JS warm, [see benchmarks](#benchmarks) for more info._
> [Built with](https://github.com/guybedford/es-module-lexer/blob/main/chompfile.toml) [Chomp](https://chompbuild.com/)
### Usage
```
npm install es-module-lexer
```
For use in CommonJS:
```js
const { init, parse } = require('es-module-lexer');
(async () => {
// either await init, or call parse asynchronously
// this is necessary for the Web Assembly boot
await init;
const source = 'export var p = 5';
const [imports, exports] = parse(source);
// Returns "p"
source.slice(exports[0].s, exports[0].e);
// Returns "p"
source.slice(exports[0].ls, exports[0].le);
})();
```
An ES module version is also available:
```js
import { init, parse } from 'es-module-lexer';
(async () => {
await init;
const source = `
import { name } from 'mod\\u1011';
import json from './json.json' assert { type: 'json' }
export var p = 5;
export function q () {
};
export { x as 'external name' } from 'external';
// Comments provided to demonstrate edge cases
import /*comment!*/ ( 'asdf', { assert: { type: 'json' }});
import /*comment!*/.meta.asdf;
`;
const [imports, exports] = parse(source, 'optional-sourcename');
// Returns "modထ"
imports[0].n
// Returns "mod\u1011"
source.slice(imports[0].s, imports[0].e);
// "s" = start
// "e" = end
// Returns "import { name } from 'mod'"
source.slice(imports[0].ss, imports[0].se);
// "ss" = statement start
// "se" = statement end
// Returns "{ type: 'json' }"
source.slice(imports[1].a, imports[1].se);
// "a" = assert, -1 for no assertion
// Returns "external"
source.slice(imports[2].s, imports[2].e);
// Returns "p"
source.slice(exports[0].s, exports[0].e);
// Returns "p"
source.slice(exports[0].ls, exports[0].le);
// Returns "q"
source.slice(exports[1].s, exports[1].e);
// Returns "q"
source.slice(exports[1].ls, exports[1].le);
// Returns "'external name'"
source.slice(exports[2].s, exports[2].e);
// Returns -1
exports[2].ls;
// Returns -1
exports[2].le;
// Dynamic imports are indicated by imports[2].d > -1
// In this case the "d" index is the start of the dynamic import bracket
// Returns true
imports[2].d > -1;
// Returns "asdf" (only for string literal dynamic imports)
imports[2].n
// Returns "import /*comment!*/ ( 'asdf', { assert: { type: 'json' } })"
source.slice(imports[3].ss, imports[3].se);
// Returns "'asdf'"
source.slice(imports[3].s, imports[3].e);
// Returns "( 'asdf', { assert: { type: 'json' } })"
source.slice(imports[3].d, imports[3].se);
// Returns "{ assert: { type: 'json' } }"
source.slice(imports[3].a, imports[3].se - 1);
// For non-string dynamic import expressions:
// - n will be undefined
// - a is currently -1 even if there is an assertion
// - e is currently the character before the closing )
// For nested dynamic imports, the se value of the outer import is -1 as end tracking does not
// currently support nested dynamic immports
// import.meta is indicated by imports[3].d === -2
// Returns true
imports[4].d === -2;
// Returns "import /*comment!*/.meta"
source.slice(imports[4].s, imports[4].e);
// ss and se are the same for import meta
})();
```
### CSP asm.js Build
The default version of the library uses Wasm and (safe) eval usage for performance and a minimal footprint.
Neither of these represent security escalation possibilities since there are no execution string injection vectors, but that can still violate existing CSP policies for applications.
For a version that works with CSP eval disabled, use the `es-module-lexer/js` build:
```js
import { parse } from 'es-module-lexer/js';
```
Instead of Web Assembly, this uses an asm.js build which is almost as fast as the Wasm version ([see benchmarks below](#benchmarks)).
### Escape Sequences
To handle escape sequences in specifier strings, the `.n` field of imported specifiers will be provided where possible.
For dynamic import expressions, this field will be empty if not a valid JS string.
### Facade Detection
Facade modules that only use import / export syntax can be detected via the third return value:
```js
const [,, facade] = parse(`
export * from 'external';
import * as ns from 'external2';
export { a as b } from 'external3';
export { ns };
`);
facade === true;
```
### ESM Detection
Modules that uses ESM syntaxes can be detected via the fourth return value:
```js
const [,,, hasModuleSyntax] = parse(`
export {}
`);
hasModuleSyntax === true;
```
Dynamic imports are ignored since they can be used in Non-ESM files.
```js
const [,,, hasModuleSyntax] = parse(`
import('./foo.js')
`);
hasModuleSyntax === false;
```
### Environment Support
Node.js 10+, and [all browsers with Web Assembly support](https://caniuse.com/#feat=wasm).
### Grammar Support
* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.
### Limitations
The lexing approach is designed to deal with the full language grammar including RegEx / division operator ambiguity through backtracking and paren / brace tracking.
The only limitation to the reduced parser is that the "exports" list may not correctly gather all export identifiers in the following edge cases:
```js
// Only "a" is detected as an export, "q" isn't
export var a = 'asdf', q = z;
// "b" is not detected as an export
export var { a: b } = asdf;
```
The above cases are handled gracefully in that the lexer will keep going fine, it will just not properly detect the export names above.
### Benchmarks
Benchmarks can be run with `npm run bench`.
Current results for a high spec machine:
#### Wasm Build
```
Module load time
> 5ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 18ms
Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 3ms
test/samples/angular.min.js (188 KiB)
> 1ms
test/samples/d3.js (508 KiB)
> 3ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 4.32ms
test/samples/rollup.min.js (429 KiB)
> 2.16ms
Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 14.16ms
```
#### JS Build (asm.js)
```
Module load time
> 2ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 34ms
Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 3ms
test/samples/angular.min.js (188 KiB)
> 1ms
test/samples/d3.js (508 KiB)
> 3ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 5ms
test/samples/rollup.min.js (429 KiB)
> 3.04ms
Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 17.12ms
```
### Building
This project uses [Chomp](https://chompbuild.com) for building.
With Chomp installed, download the WASI SDK 12.0 from https://github.com/WebAssembly/wasi-sdk/releases/tag/wasi-sdk-12.
- [Linux](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-linux.tar.gz)
- [Windows (MinGW)](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-mingw.tar.gz)
- [macOS](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-macos.tar.gz)
Locate the WASI-SDK as a sibling folder, or customize the path via the `WASI_PATH` environment variable.
Emscripten emsdk is also assumed to be a sibling folder or via the `EMSDK_PATH` environment variable.
Example setup:
```
git clone https://github.com:guybedford/es-module-lexer
git clone https://github.com/emscripten-core/emsdk
cd emsdk
git checkout 1.40.1-fastcomp
./emsdk install 1.40.1-fastcomp
cd ..
wget https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-linux.tar.gz
gunzip wasi-sdk-12.0-linux.tar.gz
tar -xf wasi-sdk-12.0-linux.tar
mv wasi-sdk-12.0-linux.tar wasi-sdk-12.0
cargo install chompbuild
cd es-module-lexer
chomp test
```
For the `asm.js` build, git clone `emsdk` from is assumed to be a sibling folder as well.
### License
MIT
[actions-image]: https://github.com/guybedford/es-module-lexer/actions/workflows/build.yml/badge.svg
[actions-url]: https://github.com/guybedford/es-module-lexer/actions/workflows/build.yml
|