Web Workers
Parsing and writing large spreadsheets takes time. If the SheetJS library runs in the main browser context, the website may freeze while the work is in progress.
Workers provide a way to off-load the hard work so that the website does not freeze during processing. The work is still performed locally. No data is sent to a remote server.
The following diagrams show the normal and Web Worker flows when exporting a dataset. The regions with a red background mark when the browser is frozen.
[Diagrams not reproduced: Normal Export, Web Worker Export]
IE10+ and modern browsers support basic Web Workers. Some APIs like fetch were added later. Feature testing is strongly recommended.
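As a rough illustration of feature testing, the sketch below checks a global scope for the features mentioned in this section. The helper name detectWorkerFeatures and its scope parameter are hypothetical and not part of SheetJS or any browser API.

```javascript
// Report which Worker-related features a global scope exposes.
// `scope` is `window` in a page or `self` in a Worker; it is a parameter
// (an assumption of this sketch) so the check can run in any environment.
function detectWorkerFeatures(scope) {
  return {
    worker: typeof scope.Worker !== "undefined",
    fetch: typeof scope.fetch === "function",
    // FileReaderSync is only exposed inside Workers
    fileReaderSync: typeof scope.FileReaderSync !== "undefined"
  };
}

// Usage in a page:
// const features = detectWorkerFeatures(globalThis);
// if (!features.worker) { /* fall back to processing in the main context */ }
```

A page could run this once at startup and choose between the Worker flow and the normal flow.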
Due to limitations of the live code blocks, all of the workers in this section are in-line. The code is embedded in template literals. For production sites, workers are typically written in separate JS files.
For example, an in-line worker like
const worker = new Worker(URL.createObjectURL(new Blob([`\
/* load standalone script from CDN */
importScripts("https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js");
/* this callback will run once the main context sends a message */
self.addEventListener('message', (e) => {
/* Pass the version string back */
postMessage({ version: XLSX.version });
}, false);
`])));
would typically be stored in a separate JS file like "worker.js":
/* load standalone script from CDN */
importScripts("https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js");
/* this callback will run once the main context sends a message */
self.addEventListener('message', (e) => {
/* Pass the version string back */
postMessage({ version: XLSX.version });
}, false);
and the main script would pass a URL to the Worker constructor:
const worker = new Worker("./worker.js");
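The worker above only replies after it receives a message. One possible way for the main script to send a message and wait for the reply is sketched below. The helper name requestVersion is hypothetical; it relies only on the standard postMessage, onmessage and onerror members of a Worker.

```javascript
// Send one message to a worker and resolve with the version from the first reply.
// `worker` is any object exposing the Worker messaging interface.
function requestVersion(worker) {
  return new Promise((resolve, reject) => {
    worker.onmessage = (e) => resolve(e.data.version);
    worker.onerror = (err) => reject(err);
    worker.postMessage({}); // any message triggers the callback in the worker
  });
}

// Usage in a page (assumes the "worker.js" file shown above):
// requestVersion(new Worker("./worker.js")).then((v) => console.log(v));
```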
Installation
In all cases, importScripts in a Worker can load the SheetJS Standalone scripts:
importScripts("https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js");
For production use, downloading and self-hosting the script is strongly encouraged.
ECMAScript Module Support
ESM is supported in Web Workers in the Chromium family of browsers (including Chrome and Edge) as well as in browsers powered by WebKit (including Safari).
For legacy browsers including Firefox and IE, importScripts should be used.
Browser ESM imports require a complete URL including the .mjs extension:
import * as XLSX from "https://cdn.sheetjs.com/xlsx-0.20.0/package/xlsx.mjs";
When using Worker ESM, the Worker constructor must set the type option:
const worker = new Worker(
url_to_worker_script,
{ type: "module" } // second argument to Worker constructor
);
Inline workers additionally require the Blob MIME type text/javascript:
const worker_code = `\
/* load ESM script from CDN */
import * as XLSX from "https://cdn.sheetjs.com/xlsx-0.20.0/package/xlsx.mjs";
// ... do something with XLSX here ...
`;
const worker = new Worker(
URL.createObjectURL(
new Blob(
[ worker_code ],
{ type: "text/javascript" } // second argument to the Blob constructor
)
),
{ type: "module" } // second argument to Worker constructor
);
Live Demos
Each browser demo was tested in the following environments:
Browser | Date | Comments |
---|---|---|
Chrome 116 | 2023-09-02 | |
Edge 116 | 2023-09-02 | |
Safari 16.6 | 2023-09-02 | File System Access API is not supported |
Brave 1.57 | 2023-09-02 | File System Access API is not supported |
Firefox 113 | 2023-05-22 | File System Access API is not supported |
Downloading a Remote File
fetch was enabled in Web Workers in Chrome 42 and Safari 10.3.
Typically the Web Worker performs the fetch operation, processes the workbook, and sends a final result (HTML table or raw data) to the main browser context:
Live Demo
In the following example, the script:
- downloads https://sheetjs.com/pres.numbers in a Web Worker
- loads the SheetJS library and parses the file in the Worker
- generates an HTML string of the first table in the Worker
- sends the string to the main browser context
- adds the HTML to the page in the main browser context
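The steps above could be sketched in a classic Worker roughly as follows. This is not the code of the live demo: the helper name urlToHtml and the message shapes ({ url }, { html }, { error }) are assumptions made for illustration. XLSX.read and XLSX.utils.sheet_to_html are the SheetJS calls used for parsing and HTML generation.

```javascript
// Fetch a workbook and return the first worksheet as an HTML table string.
// `XLSX` and `fetchFn` are parameters so the function can also run outside a Worker.
async function urlToHtml(url, XLSX, fetchFn) {
  const res = await fetchFn(url);
  if (!res.ok) throw new Error("fetch failed: " + res.status);
  const data = await res.arrayBuffer();
  const wb = XLSX.read(data);
  const ws = wb.Sheets[wb.SheetNames[0]];
  return XLSX.utils.sheet_to_html(ws);
}

// Worker wiring: runs only where importScripts exists (inside a classic Worker).
if (typeof importScripts === "function") {
  importScripts("https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js");
  self.addEventListener("message", async (e) => {
    try {
      const html = await urlToHtml(e.data.url, XLSX, (u) => fetch(u));
      postMessage({ html }); // hypothetical message shape
    } catch (err) {
      postMessage({ error: String(err && err.message || err) });
    }
  }, false);
}
```

The main browser context would post `{ url: "https://sheetjs.com/pres.numbers" }` to the worker and assign the returned string to the innerHTML of a container element.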
Creating a Local File
XLSX.writeFile will not work in Web Workers! Raw file data can be passed from the Web Worker to the main browser context for downloading.
Typically the Web Worker receives an array of JS objects, generates a workbook, and sends a URL to the main browser context for download:
Live Demo
In the following example, the script:
- sends a dataset (array of JS objects) to the Web Worker
- generates a workbook object in the Web Worker
- generates an XLSB file using XLSX.write in the Web Worker
- generates an object URL in the Web Worker
- sends the object URL to the main browser context
- performs a download action in the main browser context
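A minimal sketch of this flow is shown below, assuming a message of the form { rows } from the main context and a reply of the form { url }. Both shapes, the helper name rowsToXlsb and the sheet name "Data" are hypothetical. The SheetJS calls (json_to_sheet, book_new, book_append_sheet, write) are the standard ones for building and serializing a workbook.

```javascript
// Build an XLSB file from an array of objects and return the raw file data.
// `XLSX` is a parameter so the function can also run outside a Worker.
function rowsToXlsb(rows, XLSX) {
  const ws = XLSX.utils.json_to_sheet(rows);
  const wb = XLSX.utils.book_new();
  XLSX.utils.book_append_sheet(wb, ws, "Data"); // sheet name is arbitrary
  return XLSX.write(wb, { bookType: "xlsb", type: "array" });
}

// Worker wiring: runs only where importScripts exists (inside a classic Worker).
if (typeof importScripts === "function") {
  importScripts("https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js");
  self.addEventListener("message", (e) => {
    const data = rowsToXlsb(e.data.rows, XLSX);
    const url = URL.createObjectURL(new Blob([data]));
    postMessage({ url }); // hypothetical message shape
  }, false);
}

// Main context: one way to trigger the download from the received object URL.
// function download(url, name) {
//   const a = document.createElement("a");
//   a.href = url; a.download = name;
//   document.body.appendChild(a); a.click(); document.body.removeChild(a);
// }
```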
User-Submitted File
Typically FileReader is used in the main browser context. In Web Workers, the synchronous version FileReaderSync is more efficient.
Typically the Web Worker receives a file pointer, reads and parses the file, and sends a final result (HTML table or raw data) to the main browser context:
Live Demo
In the following example, when a file is dropped over the DIV or when the INPUT element is used to select a file, the script:
- sends the File object to the Web Worker
- loads the SheetJS library and parses the file in the Worker
- generates an HTML string of the first table in the Worker
- sends the string to the main browser context
- adds the HTML to the page in the main browser context
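The Worker side of these steps might look like the sketch below. The helper name fileToHtml, its ReaderCtor parameter and the message shapes ({ file }, { html }, { error }) are assumptions for illustration. FileReaderSync and its readAsArrayBuffer method are standard Worker APIs.

```javascript
// Read a File synchronously and return the first worksheet as an HTML string.
// `ReaderCtor` is FileReaderSync inside a Worker; it is a parameter so the
// function can also run outside a Worker.
function fileToHtml(file, XLSX, ReaderCtor) {
  const data = new ReaderCtor().readAsArrayBuffer(file);
  const wb = XLSX.read(data);
  const ws = wb.Sheets[wb.SheetNames[0]];
  return XLSX.utils.sheet_to_html(ws);
}

// Worker wiring: runs only where importScripts exists (inside a classic Worker).
if (typeof importScripts === "function") {
  importScripts("https://cdn.sheetjs.com/xlsx-0.20.0/package/dist/xlsx.full.min.js");
  self.addEventListener("message", (e) => {
    try {
      postMessage({ html: fileToHtml(e.data.file, XLSX, FileReaderSync) });
    } catch (err) {
      postMessage({ error: String(err && err.message || err) });
    }
  }, false);
}

// Main context (`input` and `worker` are hypothetical variables):
// input.addEventListener("change", (e) => worker.postMessage({ file: e.target.files[0] }));
```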
Streaming Write
A more general discussion, including row-oriented processing demos, is included in the "Large Datasets" demo.
XLSX.stream.to_csv incrementally generates CSV rows.
File System Access API
At the time of writing, the File System Access API is only available in Chromium and Chromium-based browsers like Chrome and Edge.
In local testing, committing each CSV row as it is generated is significantly slower than accumulating and writing once at the end.
When the target CSV is known to be less than 500MB, it is preferable to batch. Larger files may hit browser length limits.
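The difference between the two strategies can be sketched independently of SheetJS. In the example below, the helper name writeCsvRows and its parameters are hypothetical. It assumes `rows` is an iterable of CSV row strings without trailing newlines and that `writable` exposes asynchronous write and close methods, as a FileSystemWritableFileStream does.

```javascript
// Write CSV rows to a writable stream, either committing each row or batching.
// Every 100th row, the optional `onProgress` callback receives the row count.
async function writeCsvRows(rows, writable, commitEachRow, onProgress) {
  const pending = [];
  let count = 0;
  for (const row of rows) {
    if (commitEachRow) await writable.write(row + "\n"); // one write per row
    else pending.push(row);                              // accumulate in memory
    if (++count % 100 === 0 && onProgress) onProgress(count);
  }
  // Batch mode: a single write at the end
  if (!commitEachRow && pending.length > 0) await writable.write(pending.join("\n") + "\n");
  await writable.close();
  return count;
}
```

Batching keeps the whole CSV in memory until the final write, which is why it is only suitable when the output is known to be small enough.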
Live Demo
The following live demo fetches and parses a file in a Web Worker. The script:
- prompts user to save file (window.showSaveFilePicker in the main thread)
- passes the URL and the file object to the Web Worker
- loads the SheetJS library in the Web Worker
- fetches the requested URL and parses the workbook from the Worker
- creates a Writable Stream from the file object
- uses XLSX.stream.to_csv to generate CSV rows of the first worksheet
  - every 100th row, a progress message is sent back to the main thread
  - at the end, a completion message is sent back to the main thread
The demo has a checkbox. If it is not checked (default), the Worker will collect each CSV row and write once at the end. If it is checked, the Worker will try to commit each row as it is generated.
The demo also has a URL input box. Feel free to change the URL. For example:
- https://raw.githubusercontent.com/SheetJS/test_files/master/large_strings.xls is an XLS file over 50 MB. The generated CSV file is about 55 MB.
- https://raw.githubusercontent.com/SheetJS/libreoffice_test-files/master/calc/xlsx-import/perf/8-by-300000-cells.xlsx is an XLSX file with 300000 rows (approximately 20 MB) yielding a CSV of 10 MB.