Skip to main content

Data Processing with QuickJS

QuickJS is an embeddable JS engine written in C. It has built-in support for reading and writing file data stored in memory.

SheetJS is a JavaScript library for reading and writing data from spreadsheets.

This demo uses QuickJS and SheetJS to pull data from a spreadsheet and print CSV rows. We'll explore how to load SheetJS in a QuickJS context and process spreadsheets from C programs.

The "Integration Example" section includes a complete command-line tool for reading data from files.

Integration Details

Many QuickJS functions are not documented. The explanation was verified against the latest release (commit d378a9f).

Initialize QuickJS

Most QuickJS API functions interact with a JSContext object1, which is normally created with JS_NewRuntime and JS_NewContext:

#include "quickjs.h"

/* initialize context */
JSRuntime *rt = JS_NewRuntime();
JSContext *ctx = JS_NewContext(rt);

QuickJS provides a global object through JS_GetGlobalObject:

/* obtain reference to global object */
JSValue global = JS_GetGlobalObject(ctx);
Cleanup (click to show)

Once finished, programs are expected to cleanup by using JS_FreeValue to free values, JS_FreeContext to free the context pointer, and JS_FreeRuntime to free the runtime:

/* global is a JSValue */
JS_FreeValue(ctx, global);

/* cleanup */
JS_FreeContext(ctx);
JS_FreeRuntime(rt);

The Integration Example frees JS values after use.

Load SheetJS Scripts

SheetJS Standalone scripts can be loaded and executed in QuickJS.

The main library can be loaded by reading the script from the file system and evaluating in the QuickJS context using JS_Eval:

static char *read_file(const char *filename, size_t *sz) {
FILE *f = fopen(filename, "rb");
if(!f) return NULL;
long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
char *buf = (char *)malloc(fsize * sizeof(char));
*sz = fread((void *) buf, 1, fsize, f);
fclose(f);
return buf;
}

// ...
{
/* Read `xlsx.full.min.js` from the filesystem */
size_t len; char *buf = read_file("xlsx.full.min.js", &len);
/* evaluate from the QuickJS context */
JS_Eval(ctx, buf, len, "<input>", 0);
/* Free the file buffer */
free(buf);
}

If the library is loaded, XLSX.version will be a string. This string can be pulled into the main C program.

  1. Get the XLSX property of the global object using JS_GetPropertyStr:
/* obtain reference to the XLSX object */
JSValue XLSX = JS_GetPropertyStr(ctx, global, "XLSX");
  1. Get the version property of the XLSX object using JS_GetPropertyStr:
/* obtain reference to `XLSX.version` */
JSValue version = JS_GetPropertyStr(ctx, XLSX, "version");
  1. Pull the string into C code with JS_ToCStringLen:
/* pull the version string into C */
size_t vlen; const char *vers = JS_ToCStringLen(ctx, &vlen, version);
printf("Version: %s\n", vers);

Reading Files

JS_NewArrayBuffer can generate an ArrayBuffer from a C byte array. The function signature expects uint8_t * instead of char *:

/* read file */
size_t dlen; uint8_t * dbuf = (uint8_t *)read_file("pres.numbers", &dlen);

/* load data into array buffer */
JSValue ab = JS_NewArrayBuffer(ctx, dbuf, dlen, NULL, NULL, 0);

The ArrayBuffer will be parsed with the SheetJS read method2. The CSV row data will be generated with sheet_to_csv3.

Parse the ArrayBuffer

The goal is to run the equivalent of the following JavaScript code:

/* `ab` is the `ArrayBuffer` from the previous step */
var wb = XLSX.read(ab);
  1. Get the XLSX property of the global object and the read property of XLSX:
/* obtain reference to XLSX.read */
JSValue XLSX = JS_GetPropertyStr(ctx, global, "XLSX");
JSValue XLSX_read = JS_GetPropertyStr(ctx, XLSX, "read");
  1. Create an array of arguments to pass to the function. In this case, the read function will be called with one argument (ArrayBuffer data):
/* prepare arguments */
JSValue args[] = { ab };
  1. Use JS_Call to call the function with the arguments:
/* call XLSX.read(ab) */
JSValue wb = JS_Call(ctx, XLSX_read, XLSX, 1, args);

Get First Worksheet

The goal is to get the first worksheet. In JavaScript, the SheetNames property of the workbook is an array of strings and the Sheets property holds worksheet objects4. The desired action looks like:

/* `wb` is the workbook from the previous step */
var wsname = wb.SheetNames[0];
var ws = wb.Sheets[wsname];
  1. Pull wb.SheetNames[0] into a C string using JS_GetPropertyStr:
/* get `wb.SheetNames[0]` */
JSValue SheetNames = JS_GetPropertyStr(ctx, wb, "SheetNames");
JSValue Sheet1 = JS_GetPropertyStr(ctx, SheetNames, "0");

/* pull first sheet name into C code */
size_t wslen; const char *wsname = JS_ToCStringLen(ctx, &wslen, Sheet1);
  1. Get the worksheet object:
/* get wb.Sheets[wsname] */
JSValue Sheets = JS_GetPropertyStr(ctx, wb, "Sheets");
JSValue ws = JS_GetPropertyStr(ctx, Sheets, wsname);

Convert to CSV

The goal is to call sheet_to_csv5 and pull the result into C code:

/* `ws` is the worksheet from the previous step */
var csv = XLSX.utils.sheet_to_csv(ws);
  1. Create a references to XLSX.utils and XLSX.utils.sheet_to_csv:
/* obtain reference to XLSX.utils.sheet_to_csv */
JSValue utils = JS_GetPropertyStr(ctx, XLSX, "utils");
JSValue sheet_to_csv = JS_GetPropertyStr(ctx, utils, "sheet_to_csv");
  1. Create arguments array:
/* prepare arguments */
JSValue args[] = { ws };
  1. Use JS_Call to call the function and use JS_ToCStringLen to pull the CSV:
JSValue csv = JS_Call(ctx, sheet_to_csv, utils, 1, args);
size_t csvlen; const char *csvstr = JS_ToCStringLen(ctx, &csvlen, csv);

At this point, csvstr is a C string that can be printed to standard output.

Complete Example

The "Integration Example" covers a traditional integration in a C application, while the "CLI Test" demonstrates other concepts using the quickjs CLI tool.

Integration Example

Tested Deployments

This demo was tested in the following deployments:

ArchitectureGit CommitDate
darwin-x646a89d7c2024-03-15
darwin-armd378a9f2024-05-23
win10-x649e561d52024-03-04
win11-armd378a9f2024-05-25
linux-x643b45d152024-04-25
linux-armd378a9f2024-05-25

When the demo was tested, d378a9f was the HEAD commit on the master branch.

QuickJS does not officially support Windows. The win10-x64 and win11-arm tests were run entirely within Windows Subsystem for Linux.

  1. Build libquickjs.a:
git clone https://github.com/bellard/quickjs
cd quickjs
git checkout d378a9f
make
cd ..
  1. Copy libquickjs.a and quickjs.h into the working directory:
cp quickjs/libquickjs.a .
cp quickjs/quickjs.h .
  1. Download sheetjs.quick.c:
curl -LO https://docs.sheetjs.com/quickjs/sheetjs.quick.c
  1. Build the sample application:
gcc -o sheetjs.quick -Wall sheetjs.quick.c libquickjs.a -lm

This program tries to parse the file specified by the first argument

  1. Download the SheetJS Standalone script and test file. Save both files in the project directory:
curl -LO https://cdn.sheetjs.com/xlsx-0.20.3/package/dist/xlsx.full.min.js
curl -LO https://docs.sheetjs.com/pres.numbers
  1. Run the test program:
./sheetjs.quick pres.numbers

If successful, the program will print the library version number, file size, first worksheet name, and the contents of the first sheet as CSV rows.

CLI Test

Tested Deployments

This demo was tested in the following environments:

Git CommitDate
d378a9f2024-05-23

When the demo was tested, d378a9f was the HEAD commit on the master branch.

  1. Build the qjs command line utility from source:
git clone https://github.com/bellard/quickjs
cd quickjs
git checkout d378a9f
make
cd ..
cp quickjs/qjs .
  1. Download the SheetJS Standalone script and the test file. Save both files in the project directory:
curl -LO https://cdn.sheetjs.com/xlsx-0.20.3/package/dist/xlsx.full.min.js
curl -LO https://docs.sheetjs.com/pres.numbers
  1. Download SheetJSQuick.js
curl -LO https://docs.sheetjs.com/quickjs/SheetJSQuick.js
  1. Test the program:
./qjs SheetJSQuick.js

If successful, the script will print CSV rows and generate SheetJSQuick.xlsx. The generated file can be opened in Excel or another spreadsheet editor.

Footnotes

  1. See "Runtime and Contexts" in the QuickJS documentation

  2. See read in "Reading Files"

  3. See sheet_to_csv in "CSV and Text"

  4. See "Workbook Object" in "SheetJS Data Model"

  5. See sheet_to_csv in "CSV and Text"