
Data Processing with Duktape

Duktape is an embeddable JS engine written in C. It has been ported to a number of exotic architectures and operating systems.

SheetJS is a JavaScript library for reading and writing data from spreadsheets.

The "Complete Example" section includes a complete command-line tool for reading data from spreadsheets and exporting to Excel XLSB workbooks.

The "Bindings" section covers bindings for other ecosystems.

Integration Details

Initialize Duktape

Duktape does not provide a global variable by default. After creating a context, the variable can be defined with one line of JavaScript:

/* initialize */
duk_context *ctx = duk_create_heap_default();

/* duktape does not expose a standard "global" by default */
duk_eval_string_noresult(ctx, "var global = (function(){ return this; }).call(null);");

Load SheetJS Scripts

The SheetJS Standalone scripts can be parsed and evaluated in a Duktape context.

The shim and main libraries can be loaded by reading the scripts from the file system and evaluating in the Duktape context:

/* simple wrapper to read the entire script file */
static duk_int_t eval_file(duk_context *ctx, const char *filename) {
  size_t len;
  /* read script from filesystem */
  FILE *f = fopen(filename, "rb");
  if(!f) { duk_push_undefined(ctx); perror("fopen"); return 1; }
  long fsize; { fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET); }
  char *buf = (char *)malloc(fsize * sizeof(char));
  /* check the allocation before reading into the buffer */
  if(!buf) { fclose(f); duk_push_undefined(ctx); perror("malloc"); return 1; }
  len = fread((void *) buf, 1, fsize, f);
  fclose(f);

  /* load script into the context (Duktape copies the string data) */
  duk_push_lstring(ctx, (const char *)buf, (duk_size_t)len);
  /* the C buffer is no longer needed once the string is on the stack */
  free(buf);
  /* eval script */
  duk_int_t retval = duk_peval(ctx);
  /* cleanup */
  duk_pop(ctx);
  return retval;
}

// ...
duk_int_t res = 0;

if((res = eval_file(ctx, "shim.min.js")) != 0) { /* error handler */ }
if((res = eval_file(ctx, "xlsx.full.min.js")) != 0) { /* error handler */ }

To confirm the library is loaded, XLSX.version can be inspected:

  /* get version string */
duk_eval_string(ctx, "XLSX.version");
printf("SheetJS library version %s\n", duk_get_string(ctx, -1));
duk_pop(ctx);

Reading Files

Duktape supports Buffer natively, but the buffer should be sliced before it is passed to SheetJS. Assuming buf is a C byte array with length len, this snippet parses the data:

/* load C char array and save to a Buffer */
duk_push_external_buffer(ctx);
duk_config_buffer(ctx, -1, buf, len);
duk_put_global_string(ctx, "buf");

/* parse with SheetJS */
duk_eval_string_noresult(ctx, "workbook = XLSX.read(buf.slice(0, buf.length), {type:'buffer'});");

workbook will be a variable in the JS environment that can be inspected using the various SheetJS API functions.

Writing Files

duk_get_buffer_data can pull Buffer object data into the C code:

/* write with SheetJS using type: "array" */
duk_eval_string(ctx, "XLSX.write(workbook, {type:'array', bookType:'xlsx'})");

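/* pull result back to C (the size is returned through a pointer) */
duk_size_t sz;
char *buf = (char *)duk_get_buffer_data(ctx, -1, &sz);

/* discard result in duktape once buf has been written or copied; the pointer is only guaranteed valid while the buffer value is reachable */
duk_pop(ctx);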

The resulting buf can be written to file with fwrite.
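For bindings that call the shared library through ctypes (see the Bindings section), the same step amounts to copying sz bytes out of the returned pointer before the value is popped. The following is a minimal sketch and not the code from SheetJSDuk.py: the save_buffer helper is hypothetical, and a ctypes string buffer stands in for the pointer that duk_get_buffer_data would return.

```python
import ctypes
import os
import tempfile

def save_buffer(ptr, size, path):
    """Copy `size` bytes starting at address `ptr` and write them to `path`."""
    # string_at copies the bytes into a Python bytes object, so the data
    # stays usable after the engine-side buffer is released
    data = ctypes.string_at(ptr, size)
    with open(path, "wb") as f:
        f.write(data)
    return len(data)

# stand-in for the (pointer, size) pair that a real binding would receive
raw = ctypes.create_string_buffer(b"PK\x03\x04demo")
ptr = ctypes.addressof(raw)

out_path = os.path.join(tempfile.mkdtemp(), "sheetjsw.bin")
written = save_buffer(ptr, 8, out_path)
```

Copying first and writing afterwards keeps the lifetime question simple: once the bytes are in a Python object, the Duktape stack can be cleaned up in any order.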

Complete Example

Tested Deployments

This demo was tested in the following deployments:

Architecture  Version  Date
darwin-x64    2.7.0    2024-04-04
darwin-arm    2.7.0    2023-10-18
win10-x64     2.7.0    2024-03-27
win11-arm     2.7.0    2023-12-01
linux-x64     2.7.0    2024-03-21
linux-arm     2.7.0    2023-12-01

This program parses a file and prints CSV data from the first worksheet. It also generates an XLSB file and writes to the filesystem.

The flow diagram is displayed after the example steps.

On Windows, the Visual Studio "Native Tools Command Prompt" must be used.

  1. Create a project folder:
mkdir sheetjs-duk
cd sheetjs-duk
  2. Download and extract Duktape:
curl -LO https://duktape.org/duktape-2.7.0.tar.xz
tar -xJf duktape-2.7.0.tar.xz
mv duktape-2.7.0/src/*.{c,h} .
  3. Download the SheetJS Standalone script, shim script and test file. Move all three files to the project directory:
curl -LO https://cdn.sheetjs.com/xlsx-0.20.2/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-0.20.2/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers
  4. Download sheetjs.duk.c:
curl -LO https://docs.sheetjs.com/duk/sheetjs.duk.c
  5. Compile the standalone sheetjs.duk binary:
gcc -std=c99 -Wall -osheetjs.duk sheetjs.duk.c duktape.c -lm

GCC may generate a warning:

duk_js_compiler.c:5628:13: warning: variable 'num_stmts' set but not used [-Wunused-but-set-variable]
duk_int_t num_stmts;
^

This warning can be ignored.

  6. Run the demo:
./sheetjs.duk pres.numbers

If the program succeeded, the CSV contents will be printed to console and the file sheetjsw.xlsb will be created. That file can be opened with Excel.

Flow Diagram

Bindings

Bindings exist for many languages. As these bindings require "native" code, they may not work on every platform.

The Duktape source distribution includes a separate Makefile for building a shared library. This library can be loaded in other programs.

Blingos

Duktape includes a number of "blingos" (function-like macros) which will not be included in the shared library. The macros must be manually expanded.

For example, duk_create_heap_default is defined as follows:

#define duk_create_heap_default() \
duk_create_heap(NULL, NULL, NULL, NULL, NULL)

The duk_create_heap_default blingo will not be defined in the shared library. Instead, duk_create_heap must be called directly. Using PHP FFI:

/* create new FFI object */
$ffi = FFI::cdef(/* ... arguments */);

/* call duk_create_heap directly */
$context = $ffi->duk_create_heap(null, null, null, null, null);
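The same manual expansion applies to macros that combine flag constants. As a further illustration, the duk_eval_string_noresult macro quoted in the Zig Translator Caveats section can be expanded by hand in a ctypes binding. This is a sketch under stated assumptions: the numeric flag values are taken from the Duktape 2.7.0 duktape.h and should be checked against the header that matches the loaded library, and eval_string_noresult is a hypothetical helper that expects an already loaded ctypes library object.

```python
import ctypes

# Flag values as defined in the Duktape 2.7.0 duktape.h (assumption: verify
# against the header that matches the shared library being loaded)
DUK_COMPILE_EVAL       = 1 << 3
DUK_COMPILE_NORESULT   = 1 << 8
DUK_COMPILE_NOSOURCE   = 1 << 9
DUK_COMPILE_STRLEN     = 1 << 10
DUK_COMPILE_NOFILENAME = 1 << 11

# manual expansion of the flag expression inside duk_eval_string_noresult
EVAL_NORESULT_FLAGS = (0 | DUK_COMPILE_EVAL | DUK_COMPILE_NOSOURCE |
                       DUK_COMPILE_STRLEN | DUK_COMPILE_NORESULT |
                       DUK_COMPILE_NOFILENAME)

def eval_string_noresult(duk, ctx, src):
    """Hypothetical wrapper: call duk_eval_raw the way the macro would.

    `duk` is a loaded ctypes library, `ctx` is the context address and
    `src` is a Python str holding the script.
    """
    # the macro passes a length of 0; with DUK_COMPILE_STRLEN set, Duktape
    # measures the NUL-terminated source string itself
    return duk.duk_eval_raw(
        ctypes.c_void_p(ctx),                    # duk_context *
        ctypes.c_char_p(src.encode("utf-8")),    # const char *src_buffer
        ctypes.c_size_t(0),                      # duk_size_t src_length
        ctypes.c_uint(EVAL_NORESULT_FLAGS),      # duk_uint_t flags
    )
```

Wrapping each argument in an explicit ctypes type avoids relying on the default conversion, which treats Python integers as C int values.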

Null Pointers

The C NULL pointer must be used in some functions. Some FFI implementations have special values distinct from the language-native null value. In Python ctypes, None is passed where a NULL pointer is expected, and pointer return types must be declared with the restype property:

from ctypes import CDLL, c_void_p

duk = CDLL("libduktape.so")

duk.duk_create_heap.restype = c_void_p
context = duk.duk_create_heap(None, None, None, None, None)
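The restype declaration matters because ctypes assumes that foreign functions return a C int unless told otherwise. The sketch below uses only ctypes itself, with a made-up address in place of a real Duktape context, to show what happens when a pointer-sized value is squeezed into an int and how None corresponds to the NULL pointer.

```python
from ctypes import c_int, c_void_p, sizeof

# made-up 64-bit address standing in for a duk_context pointer
address = 0x7F0012345678

# ctypes integer types do no overflow checking: a C int keeps only the low bits
truncated = c_int(address).value

# c_void_p keeps the full pointer value on 64-bit platforms
preserved = c_void_p(address).value

# a NULL pointer is represented as None, which is why None can be passed
# for the five duk_create_heap arguments
null_value = c_void_p(None).value

print(hex(truncated), hex(preserved), null_value)
```

A truncated context pointer typically crashes the process on the next API call, so declaring restype before the first call is the safer habit.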

PHP

There is no official PHP binding to the Duktape library. Instead, this demo uses the raw FFI interface [1] to the Duktape shared library.

The SheetJSDuk.php demo script parses a file, prints CSV rows from the first worksheet, and creates an XLSB workbook.

PHP Demo

Tested Deployments

This demo was tested in the following deployments:

Architecture  Version  PHP Version  Date
darwin-x64    2.7.0    8.3.4        2024-03-15
darwin-arm    2.7.0    8.3.2        2024-02-13
linux-x64     2.7.0    8.2.7        2024-03-21

  1. Ensure php is installed and available on the system path.

  2. Find the php.ini file:

php --ini

The following output is from the last macOS test:

Configuration File (php.ini) Path: /usr/local/etc/php/8.3
Loaded Configuration File: /usr/local/etc/php/8.3/php.ini
Scan for additional .ini files in: /usr/local/etc/php/8.3/conf.d
Additional .ini files parsed: /usr/local/etc/php/8.3/conf.d/ext-opcache.ini
  3. Edit the php.ini configuration file.

The following line should appear in the configuration:

php.ini (add to end)
extension=ffi

If this line is prefixed with a ;, remove the semicolon. If this line does not appear in the file, add it to the end.

  4. Build the Duktape shared library:
curl -LO https://duktape.org/duktape-2.7.0.tar.xz
tar -xJf duktape-2.7.0.tar.xz
cd duktape-2.7.0
make -f Makefile.sharedlibrary
cd ..
  5. Copy the shared library to the current folder. When the demo was last tested, the shared library file name differed by platform:

OS      Name
Darwin  libduktape.207.20700.so
Linux   libduktape.so.207.20700

cp duktape-*/libduktape.* .
  6. Download the SheetJS Standalone script, shim script and test file. Move all three files to the project directory:
curl -LO https://cdn.sheetjs.com/xlsx-0.20.2/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-0.20.2/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers
  7. Download SheetJSDuk.php:
curl -LO https://docs.sheetjs.com/duk/SheetJSDuk.php
  8. Edit the SheetJSDuk.php script.

The $sofile variable declares the path to the library:

SheetJSDuk.php (edit highlighted line)
<?php

$sofile = './libduktape.207.20700.so';

The value must match the file name of the library copied in step 5. On macOS, the name of the library is libduktape.207.20700.so:

SheetJSDuk.php (change highlighted line)
$sofile = './libduktape.207.20700.so';

On Linux, the name recorded in step 5 is libduktape.so.207.20700.
  9. Run the script:
php SheetJSDuk.php pres.numbers

If the program succeeded, the CSV contents will be printed to console and the file sheetjsw.xlsb will be created. That file can be opened with Excel.

Python

There is no official Python binding to the Duktape library. Instead, this demo uses the raw ctypes interface [2] to the Duktape shared library.

Python Demo

Tested Deployments

This demo was tested in the following deployments:

Architecture  Version  Python  Date
darwin-x64    2.7.0    3.12.2  2024-03-15
darwin-arm    2.7.0    3.11.7  2024-02-13
linux-x64     2.7.0    3.11.3  2024-03-21

  1. Ensure python is installed and available on the system path.

  2. Build the Duktape shared library:

curl -LO https://duktape.org/duktape-2.7.0.tar.xz
tar -xJf duktape-2.7.0.tar.xz
cd duktape-2.7.0
make -f Makefile.sharedlibrary
cd ..
  3. Copy the shared library to the current folder. When the demo was last tested, the shared library file name differed by platform:

OS      Name
Darwin  libduktape.207.20700.so
Linux   libduktape.so.207.20700

cp duktape-*/libduktape.* .
  4. Download the SheetJS Standalone script, shim script and test file. Move all three files to the project directory:
curl -LO https://cdn.sheetjs.com/xlsx-0.20.2/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-0.20.2/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers
  5. Download SheetJSDuk.py:
curl -LO https://docs.sheetjs.com/duk/SheetJSDuk.py
  6. Edit the SheetJSDuk.py script.

The lib variable declares the path to the library:

SheetJSDuk.py (edit highlighted line)
#!/usr/bin/env python3

lib = "libduktape.207.20700.so"

The value must match the file name of the library copied in step 3. On macOS, the name of the library is libduktape.207.20700.so:

SheetJSDuk.py (change highlighted line)
lib = "libduktape.207.20700.so"

On Linux, the name recorded in step 3 is libduktape.so.207.20700.
  7. Run the script:
python3 SheetJSDuk.py pres.numbers

If the program succeeded, the CSV contents will be printed to console and the file sheetjsw.xlsb will be created. That file can be opened with Excel.
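Rather than editing the script by hand for each operating system, the file name can be chosen at run time. This is a minimal sketch and not part of SheetJSDuk.py: the duktape_library_name helper is hypothetical, and the two file names are the ones recorded in the demo steps for the last test run, so other Duktape versions or platforms will differ.

```python
import platform

# file names observed when the demo was last tested (Duktape 2.7.0)
LIBRARY_NAMES = {
    "Darwin": "libduktape.207.20700.so",
    "Linux": "libduktape.so.207.20700",
}

def duktape_library_name(system=None):
    """Return the shared library file name for the given or current OS."""
    system = system or platform.system()
    try:
        return LIBRARY_NAMES[system]
    except KeyError:
        # unknown platforms fail loudly instead of guessing a name
        raise RuntimeError("no known Duktape library name for " + system)

# example: the path that would be handed to ctypes.CDLL on Linux
lib = "./" + duktape_library_name("Linux")
```

Passing the system name as a parameter keeps the lookup testable without having to run on each platform.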

Zig

Zig support is considered experimental.

Great open source software grows with user tests and reports. Any issues should be reported to the Zig project for further diagnosis.

Zig Compilation

The main Duktape code can be added to the Zig build pipeline.

The following explanation was verified against Zig 0.11.0.

Due to restrictions in the Zig C integration, the path to the Duktape src folder must be added to the include path list:

build.zig
    const exe = b.addExecutable(.{
        // ...
    });
    // this line is required to make @cInclude("duktape.h") work
    exe.addIncludePath(.{ .path = "duktape-2.7.0/src" });

The duktape.c source file must be added to the build sequence. For Zig version 0.11.0, Duktape must be compiled with flags -std=c99 -fno-sanitize=undefined and linked against libc and libm:

build.zig
    const exe = b.addExecutable(.{
        // ...
    });
    exe.addCSourceFile(.{
        .file = .{ .path = "duktape-2.7.0/src/duktape.c" },
        .flags = &.{ "-std=c99", "-fno-sanitize=undefined" }
    });
    exe.linkSystemLibrary("c");
    exe.linkSystemLibrary("m");

Zig Import

duktape.h can be imported using the @cImport directive:

main.zig
const duktape = @cImport({
@cInclude("duktape.h");
});

Once imported, many API functions can be referenced from the duktape scope. For example, duk_peval_string in the C interface will be available to Zig code using the name duktape.duk_peval_string.

It is strongly recommended to colocate allocations and cleanup methods using defer. For example, a Duktape context is created with duk_create_heap and destroyed with duk_destroy_heap. The latter call can be deferred:

    const ctx = duktape.duk_create_heap(null, null, null, null, null);
    defer _ = duktape.duk_destroy_heap(ctx);

Zig Translator Caveats

The Zig translator does not properly handle blingo void casts. For example, duk_eval_string_noresult is a function-like macro defined in duktape.h:

duk_eval_string_noresult blingo
#define duk_eval_string_noresult(ctx,src)  \
((void) duk_eval_raw((ctx), (src), 0, 0 /*args*/ | DUK_COMPILE_EVAL | DUK_COMPILE_NOSOURCE | DUK_COMPILE_STRLEN | DUK_COMPILE_NORESULT | DUK_COMPILE_NOFILENAME))

The compiler will throw an error involving anyopaque (C void):

error: opaque return type 'anyopaque' not allowed

The blingo performs a void cast to suppress certain C compiler warnings. The spiritual equivalent in Zig is to assign to _.

The duk_eval_raw method and each compile-time constant are available in the duktape scope. A manual translation is shown below:

_ = duktape.duk_eval_raw(ctx, src, 0, 0 | duktape.DUK_COMPILE_EVAL | duktape.DUK_COMPILE_NOSOURCE | duktape.DUK_COMPILE_STRLEN | duktape.DUK_COMPILE_NORESULT | duktape.DUK_COMPILE_NOFILENAME);

Zig Demo

Tested Deployments

This demo was tested in the following deployments:

Architecture  Version  Zig     Date
darwin-x64    2.7.0    0.11.0  2024-03-10
win10-x64     2.7.0    0.11.0  2024-03-10
linux-x64     2.7.0    0.11.0  2024-03-10

On Windows, due to incompatibilities between WSL and PowerShell, some commands must be run in WSL Bash.

  1. Create a new project folder:
mkdir sheetjs-zig
cd sheetjs-zig
  2. Download Zig 0.11.0 from https://ziglang.org/download/ and extract to the project folder. The following commands download and extract the macOS x64 build:
curl -LO https://ziglang.org/download/0.11.0/zig-macos-x86_64-0.11.0.tar.xz
tar -xJf zig-macos-x86_64-0.11.0.tar.xz
  3. Initialize a project:
./zig-macos-x86_64-0.11.0/zig init-exe
  4. Download the Duktape source and extract in the current directory. On Windows, the commands should be run within WSL:
curl -LO https://duktape.org/duktape-2.7.0.tar.xz
tar -xJf duktape-2.7.0.tar.xz
  5. Download the SheetJS Standalone script, shim script and test file. Move the two scripts to the src subdirectory:

The following commands can be run within a shell on macOS and Linux. On Windows, the commands should be run within WSL bash:

curl -LO https://cdn.sheetjs.com/xlsx-0.20.2/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-0.20.2/package/dist/xlsx.full.min.js
curl -LO https://sheetjs.com/pres.numbers
mv *.js src
  6. Add the last four lines shown below (the exe.addCSourceFile, exe.addIncludePath and exe.linkSystemLibrary calls) to build.zig just after the exe definition:
build.zig (add highlighted lines)
    const exe = b.addExecutable(.{
        .name = "sheetjs-zig",
        // In this case the main source file is merely a path, however, in more
        // complicated build scripts, this could be a generated file.
        .root_source_file = .{ .path = "src/main.zig" },
        .target = target,
        .optimize = optimize,
    });
    exe.addCSourceFile(.{ .file = .{ .path = "duktape-2.7.0/src/duktape.c" }, .flags = &.{ "-std=c99", "-fno-sanitize=undefined" } });
    exe.addIncludePath(.{ .path = "duktape-2.7.0/src" });
    exe.linkSystemLibrary("c");
    exe.linkSystemLibrary("m");
  7. Download main.zig and replace src/main.zig. The following command should be run in WSL bash or the macOS or Linux terminal:
curl -L -o src/main.zig https://docs.sheetjs.com/duk/main.zig
  8. Build and run the program:
./zig-macos-x86_64-0.11.0/zig build run -- pres.numbers

This step builds and runs the program. The generated program will be placed in the zig-out/bin/ subdirectory.

It should display some metadata along with CSV rows from the first worksheet. It will also generate sheetjs.zig.xlsx, which can be opened with a spreadsheet editor such as Excel.

Perl

The Perl binding for Duktape is available as JavaScript::Duktape::XS on CPAN.

The Perl binding does not have raw Buffer ops, so Base64 strings are used.
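The Base64 round trip can be pictured outside of Perl. The sketch below is written in Python to match the ctypes snippets elsewhere on this page and makes no Duktape calls. It encodes raw file bytes into the kind of string that would be handed to XLSX.read with the type option set to base64, then decodes a Base64 string of the kind XLSX.write returns with the same option. The helper names and the sample bytes are placeholders, not part of SheetJSDuk.pl.

```python
import base64

def to_base64(data: bytes) -> str:
    """Encode raw file bytes as an ASCII Base64 string for the JS engine."""
    return base64.b64encode(data).decode("ascii")

def from_base64(text: str) -> bytes:
    """Decode a Base64 string returned by the JS engine back to raw bytes."""
    return base64.b64decode(text)

# placeholder bytes; a real caller would read the spreadsheet file instead
file_bytes = b"PK\x03\x04 placeholder"
encoded = to_base64(file_bytes)

# script text that a binding could evaluate to parse the data; the Base64
# alphabet contains no quote characters, so the literal is safe to embed
script = "var workbook = XLSX.read('" + encoded + "', {type: 'base64'});"

# decoding recovers the original bytes exactly
round_trip = from_base64(encoded)
```

Base64 inflates the payload by roughly one third, which is the price of passing binary data through a string-only interface.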

Perl Demo

Tested Deployments

This demo was tested in the following deployments:

Architecture  Version  Date
darwin-x64    2.2.0    2024-03-15
darwin-arm    2.2.0    2024-02-13
linux-x64     2.2.0    2024-03-21

  1. Ensure perl and cpan are installed and available on the system path.

  2. Install the JavaScript::Duktape::XS library:

cpan install JavaScript::Duktape::XS

On some systems, the command must be run as the root user:

sudo cpan install JavaScript::Duktape::XS
  3. Download SheetJSDuk.pl:
curl -LO https://docs.sheetjs.com/duk/SheetJSDuk.pl
  4. Download the SheetJS ExtendScript build and test file:
curl -LO https://cdn.sheetjs.com/xlsx-0.20.2/package/dist/xlsx.extendscript.js
curl -LO https://sheetjs.com/pres.xlsx
  5. Run the script:
perl SheetJSDuk.pl pres.xlsx

If the script succeeded, the data in the test file will be printed in CSV rows. The script will also export SheetJSDuk.xlsb.

In some test runs, the command failed due to missing File::Slurp:

Can't locate File/Slurp.pm in @INC (you may need to install the File::Slurp module)

The fix is to install File::Slurp with cpan:

sudo cpan install File::Slurp

Footnotes

  1. See Foreign Function Interface in the PHP documentation.

  2. See ctypes in the Python documentation.