Google Sheets Data Interchange
This demo focuses on external data processing. For Google Apps Script custom functions, the "Google Sheets" extension demo covers Apps Script integration.
Google Sheets is a collaborative spreadsheet service with powerful external APIs for automation.
SheetJS is a JavaScript library for reading and writing data from spreadsheets.
This demo uses SheetJS to properly exchange data with spreadsheet files. We'll explore how to use NodeJS integration libraries and SheetJS in three data flows:
- "Importing data": Data in a NUMBERS spreadsheet will be parsed using SheetJS libraries and written to a Google Sheets document.
- "Exporting data": Data in Google Sheets will be pulled into arrays of objects. A workbook will be assembled and exported to Excel Binary Workbook (XLSB) files.
- "Exporting files": SheetJS libraries will read XLSX files exported by Google Sheets and generate CSV rows from every worksheet.
It is strongly recommended to create a new Google account for testing.
One small mistake could result in a block or ban from Google services.
Google Sheets deprecates APIs quickly and there is no guarantee that the referenced APIs will be available in the future.
Integration Details
This demo uses the Sheets v4 and Drive v3 APIs through the official googleapis connector module.
There are a number of steps to enable the Google Sheets API and Google Drive API for an account. The Complete Example covers the process.
Document Duality
Each Google Sheets document is identified with a unique ID. This ID can be found from the Google Sheets edit URL.
The edit URL starts with https://docs.google.com/spreadsheets/d/ and includes /edit. The ID is the string of characters between /d/ and /edit. For example:
https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0
|^^^^^^^^^^^^^^^^^^^^^^^^^^^|--- ID
The same ID is used in Google Drive operations.
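The ID can also be pulled out of an edit URL programmatically. The following is a minimal sketch, not part of the demo scripts: `extractSpreadsheetId` is a hypothetical helper name, and the pattern assumes the URL layout described above.

```javascript
// Hypothetical helper: extract the document ID from a Google Sheets URL.
// Assumes the ID is the path segment that follows "/spreadsheets/d/".
function extractSpreadsheetId(url) {
  const match = /\/spreadsheets\/d\/([^\/?#]+)/.exec(String(url));
  // return null when the URL does not look like a Google Sheets URL
  return match ? match[1] : null;
}

const url = "https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0";
console.log(extractSpreadsheetId(url)); // a_long_string_of_characters
```

The returned string can be passed as `spreadsheetId` in Sheets requests and as `fileId` in Drive requests.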
The following operations are covered in this demo:
Operation | API |
---|---|
Create Google Sheets Document | Sheets |
Add and Remove worksheets | Sheets |
Modify data in worksheets | Sheets |
Share Sheets with other users | Drive |
Generate raw file exports | Drive |
Authentication
It is strongly recommended to use a service account for Google API operations. The "Service Account Setup" section covers how to create a service account and generate a JSON key file.
The generated JSON key file includes client_email and private_key fields. These fields can be used in JWT authentication:
import { google } from "googleapis";
// adjust the path to the actual key file.
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
/* connect to google services */
const jwt = new google.auth.JWT({
  email: creds.client_email,
  key: creds.private_key,
  scopes: [
    'https://www.googleapis.com/auth/spreadsheets', // Google Sheets
    'https://www.googleapis.com/auth/drive.file', // Google Drive
  ]
});
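A wrong or truncated key file tends to surface later as an opaque authentication failure. As a sketch under that assumption, the parsed key object could be checked before it is handed to the JWT constructor. `checkServiceAccountKey` is a hypothetical helper, not part of googleapis.

```javascript
// Hypothetical helper: verify that a parsed key file has the two fields
// used for JWT authentication and return them in the shape used above.
function checkServiceAccountKey(creds) {
  const missing = ["client_email", "private_key"].filter(
    (k) => !creds || typeof creds[k] !== "string" || creds[k].length === 0
  );
  if (missing.length > 0) {
    throw new Error(`key file is missing: ${missing.join(", ")}`);
  }
  return { email: creds.client_email, key: creds.private_key };
}

// Example with a made-up key object (not a real credential)
const sample = { client_email: "svc@example.invalid", private_key: "-----BEGIN PRIVATE KEY-----..." };
console.log(Object.keys(checkServiceAccountKey(sample))); // [ 'email', 'key' ]
```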
Connecting to Services
The google named export includes special methods to connect to various APIs.
Google Sheets
const sheets = google.sheets({ version: "v4", auth: jwt });
google.sheets takes an options argument that includes API version number and authentication details.
Google Drive
const drive = google.drive({ version: "v3", auth: jwt });
google.drive takes an options argument that includes API version number and authentication details.
Array of Arrays
"Arrays of Arrays" are the main data format for interchange with Google Sheets. The outer array holds row arrays, and each row array holds the cell values for that row.
SheetJS provides methods for working with Arrays of Arrays:
- aoa_to_sheet [1] creates SheetJS worksheet objects from arrays of arrays
- sheet_to_json [2] can generate arrays of arrays from SheetJS worksheets
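The structure itself is plain JavaScript. The sketch below shows a small array of arrays and one way to turn it into records keyed by the first row. `aoaToObjects` is a hypothetical helper written for illustration; it is not a SheetJS or googleapis function, and it assumes the first row holds unique column names.

```javascript
// Outer array: rows. Inner arrays: cell values for each row.
const aoa = [
  ["Name", "Index"],
  ["Bill Clinton", 42],
  ["GeorgeW Bush", 43],
];

// Hypothetical helper: treat the first row as headers and build one
// object per remaining row. Missing trailing cells become null.
function aoaToObjects(rows) {
  if (rows.length === 0) return [];
  const [header, ...body] = rows;
  return body.map((row) => {
    const record = {};
    header.forEach((key, i) => {
      record[key] = row[i] == null ? null : row[i];
    });
    return record;
  });
}

console.log(aoaToObjects(aoa));
```

Rows may be shorter than the header row, so the helper fills missing cells with null rather than assuming a rectangular grid.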
Export Document Data
The goal is to create an XLSB export from a Google Sheet. Google Sheets does not natively support the XLSB format. SheetJS fills the gap.
Convert a Single Sheet
sheets.spreadsheets.values.get returns data from an existing Google Sheet. The method expects a range. Passing the quoted sheet name as the range will pull all rows.
If successful, the response object will have a data property. It will be an object with a values property. The values will be represented as an Array of Arrays of values. This array of arrays can be converted to a SheetJS sheet:
async function gsheet_ws_to_sheetjs_ws(id, sheet_name) {
  /* get values */
  const res = await sheets.spreadsheets.values.get({
    spreadsheetId: id,
    range: `'${sheet_name}'`
  });
  /* the `values` property may be absent when the sheet is empty */
  const values = res.data.values || [];
  /* create SheetJS worksheet */
  const ws = XLSX.utils.aoa_to_sheet(values);
  return ws;
}
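The range string above wraps the sheet name in single quotes. A sheet name that itself contains a single quote would break that string. The sketch below builds the range defensively; `sheetRange` is a hypothetical helper, and it assumes that A1 notation escapes an embedded single quote by doubling it.

```javascript
// Hypothetical helper: build an A1-notation range for a sheet name.
// Assumption: embedded single quotes are escaped by doubling them.
function sheetRange(sheetName, a1) {
  const quoted = `'${String(sheetName).replace(/'/g, "''")}'`;
  // with no cell reference, the quoted name alone selects the whole sheet
  return a1 ? `${quoted}!${a1}` : quoted;
}

console.log(sheetRange("Sheet1"));           // 'Sheet1'
console.log(sheetRange("Jon's Data", "A1")); // 'Jon''s Data'!A1
```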
Convert a Workbook
sheets.spreadsheets.get returns metadata about the Google Sheets document. In the result object, the data property is an object which has a sheets property. The value of the sheets property is an array of sheet objects.
The SheetJS book_new [3] method creates blank SheetJS workbook objects. The book_append_sheet [4] method adds SheetJS worksheet objects to the workbook.
By looping across the sheets, the entire workbook can be converted:
async function gsheet_doc_to_sheetjs_wb(id) {
  /* Create a new workbook object */
  const wb = XLSX.utils.book_new();
  /* Get metadata */
  const wsheet = await sheets.spreadsheets.get({spreadsheetId: id});
  /* Loop across the Document sheets */
  for(let sheet of wsheet.data.sheets) {
    /* Get the worksheet name */
    const name = sheet.properties.title;
    /* Convert Google Sheets sheet to SheetJS worksheet */
    const ws = await gsheet_ws_to_sheetjs_ws(id, name);
    /* Append worksheet to workbook */
    XLSX.utils.book_append_sheet(wb, ws, name);
  }
  return wb;
}
This method returns a SheetJS workbook object that can be exported with the writeFile and write methods. [5]
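Google Sheets titles can be longer than Excel allows, and SheetJS is understood to reject worksheet names that exceed 31 characters or contain any of the characters \ / ? * [ ] : when appending. A minimal sketch of a sanitizer that could be applied to each title before calling book_append_sheet follows. `safeSheetName` is a hypothetical helper and the replacement character is an arbitrary choice.

```javascript
// Hypothetical helper: derive a worksheet name that fits Excel limits.
// `used` tracks names already assigned so that truncation cannot
// produce duplicates within one workbook.
function safeSheetName(title, used = new Set()) {
  // replace forbidden characters, then truncate to 31 characters
  const base = String(title).replace(/[\\\/?*\[\]:]/g, "_").slice(0, 31) || "Sheet";
  let name = base;
  for (let n = 2; used.has(name); ++n) {
    const suffix = `_${n}`;
    name = base.slice(0, 31 - suffix.length) + suffix;
  }
  used.add(name);
  return name;
}

const used = new Set();
console.log(safeSheetName("Q1/Q2: totals", used)); // Q1_Q2_ totals
```

In the loop above, the sanitized name would be passed as the third argument to book_append_sheet while the original title is still used to fetch the data.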
Update Document Data
The goal is to import data from a NUMBERS file to Google Sheets. Google Sheets does not natively support the NUMBERS format. SheetJS fills the gap.
Create New Document
sheets.spreadsheets.create creates a new Google Sheets document. It can accept a document title. It will generate a new workbook with a blank "Sheet1" sheet.
The response includes the document ID for use in subsequent operations:
const res = await sheets.spreadsheets.create({
  requestBody: {
    properties: {
      /* Document Title */
      title: "SheetJS Test"
    }
  }
});
const id = res.data.spreadsheetId;
When using a service account, the main account does not have access to the new document by default. The document has to be shared with the main account using the Drive API:
await drive.permissions.create({
  fileId: id, // this ID was returned in the response to the create request
  fields: "id",
  requestBody: {
    type: "user",
    role: "writer",
    emailAddress: "[email protected]" // main address
  }
});
Delete Non-Initial Sheets
Google Sheets does not allow users to delete every worksheet.
The recommended approach involves deleting every worksheet after the first.
The delete operation requires a unique identifier for a sheet within the Google Sheets document. These IDs are found in the sheets.spreadsheets.get response.
The following snippet performs one bulk operation using batchUpdate:
/* get existing sheets */
const wsheet = await sheets.spreadsheets.get({spreadsheetId: id});
/* remove all sheets after the first */
if(wsheet.data.sheets.length > 1) await sheets.spreadsheets.batchUpdate({
  spreadsheetId: id,
  requestBody: { requests: wsheet.data.sheets.slice(1).map(s => ({
    deleteSheet: {
      sheetId: s.properties.sheetId
    }
  }))}
});
Rename First Sheet
The first sheet must be renamed so that the append operations do not collide with the existing name. Since most SheetJS-supported file formats and most spreadsheet applications limit worksheet names to 31 characters, any longer name is safe. This demo uses a name that exceeds 33 characters.
The updateSheetProperties request can rename individual sheets:
/* rename first worksheet to avoid collisions */
await sheets.spreadsheets.batchUpdate({
  spreadsheetId: id,
  requestBody: { requests: [{
    updateSheetProperties: {
      fields: "title",
      properties: {
        sheetId: wsheet.data.sheets[0].properties.sheetId,
        // the new title is 34 characters, to be exact
        title: "thistitleisatleast33characterslong"
      }
    }
  }]}
});
Append Worksheets
The read and readFile methods generate SheetJS workbook objects from existing worksheet files.
Starting from a SheetJS workbook, the SheetNames property [6] is an array of worksheet names and the Sheets property is an object that maps sheet names to worksheet objects.
Looping over the worksheet names, there are two steps to appending a sheet:
- "Append a blank worksheet": The addSheet request, submitted through the sheets.spreadsheets.batchUpdate method, accepts a new title and creates a new worksheet. The new worksheet will be added at the end.
- "Write data to the new sheet": The SheetJS sheet_to_json method with the option header: 1 [7] will generate an array of arrays of data. This structure is compatible with the sheets.spreadsheets.values.update operation.
The following snippet pushes all worksheets from a SheetJS workbook into a Google Sheets document:
/* add sheets from file */
for(let name of wb.SheetNames) {
  /* (1) Create a new Google Sheets sheet */
  await sheets.spreadsheets.batchUpdate({
    spreadsheetId: id,
    requestBody: { requests: [
      /* add new sheet */
      { addSheet: { properties: { title: name } } },
    ] }
  });
  /* (2) Push data */
  const aoa = XLSX.utils.sheet_to_json(wb.Sheets[name], {header:1});
  await sheets.spreadsheets.values.update({
    spreadsheetId: id,
    range: `'${name}'!A1`,
    valueInputOption: "USER_ENTERED",
    resource: { values: aoa }
  });
}
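The arrays generated with header: 1 can contain gaps for empty cells, and gaps serialize to null in the request body. The Sheets API is documented to skip null values rather than clear the cell, which is harmless for a freshly added sheet but can leave stale data when writing over existing cells. As a sketch under that assumption, the array of arrays could be padded with empty strings before the update. `fillBlanks` is a hypothetical helper.

```javascript
// Hypothetical helper: make an array of arrays rectangular and replace
// gaps, null and undefined with empty strings.
function fillBlanks(aoa) {
  // widest row determines the number of columns
  const width = aoa.reduce((w, row) => Math.max(w, row.length), 0);
  return aoa.map((row) =>
    Array.from({ length: width }, (_, i) => (row[i] == null ? "" : row[i]))
  );
}

console.log(fillBlanks([["a", , "c"], [1]]));
// [ [ 'a', '', 'c' ], [ 1, '', '' ] ]
```

The padded result would be passed as `values` in place of `aoa`.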
Delete Initial Sheet
After adding new worksheets, the final step involves removing the initial sheet.
The initial sheet ID can be pulled from the worksheet metadata fetched when the non-initial sheets were removed:
/* remove first sheet */
await sheets.spreadsheets.batchUpdate({
  spreadsheetId: id,
  requestBody: { requests: [
    /* remove old first sheet */
    { deleteSheet: { sheetId: wsheet.data.sheets[0].properties.sheetId } }
  ] }
});
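The structural steps (delete non-initial sheets, rename the first sheet, add new sheets, delete the first sheet) each use batchUpdate. The Sheets API is documented to apply the requests of a single batchUpdate call in order, so the request list could plausibly be assembled once and submitted together. The sketch below only builds the list; `planReplaceRequests` is a hypothetical helper, and the data still has to be written afterwards with values.update.

```javascript
// Hypothetical helper: build the ordered batchUpdate requests that
// replace every existing sheet with blank sheets named `newNames`.
// `existingSheets` is the `data.sheets` array from spreadsheets.get.
function planReplaceRequests(existingSheets, newNames, placeholder) {
  if (existingSheets.length === 0) throw new Error("document has no sheets");
  if (newNames.length === 0) throw new Error("at least one new sheet is required");
  const [firstId, ...restIds] = existingSheets.map((s) => s.properties.sheetId);
  return [
    // remove all sheets after the first
    ...restIds.map((sheetId) => ({ deleteSheet: { sheetId } })),
    // rename the first sheet to avoid name collisions
    { updateSheetProperties: { fields: "title", properties: { sheetId: firstId, title: placeholder } } },
    // add the new sheets in order
    ...newNames.map((title) => ({ addSheet: { properties: { title } } })),
    // remove the old first sheet
    { deleteSheet: { sheetId: firstId } },
  ];
}

const requests = planReplaceRequests(
  [{ properties: { sheetId: 0 } }, { properties: { sheetId: 7 } }],
  ["Sheet1"],
  "thistitleisatleast33characterslong"
);
console.log(requests.length); // 4
```

The result would be submitted as `requestBody: { requests }` in one sheets.spreadsheets.batchUpdate call.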
Raw File Exports
In the web interface, Google Sheets can export documents to XLSX or ODS.
Raw file exports are exposed through the files.export method in the Drive API:
const drive = google.drive({ version: "v3", auth: jwt });
/* Request XLSX export */
const file = await drive.files.export({
  /* XLSX MIME type */
  mimeType: "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  fileId: id
});
The mimeType property is expected to be one of the supported formats [8]. When the demo was last tested, the following workbook conversions were supported:
Format | MIME Type |
---|---|
XLSX | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet |
ODS | application/x-vnd.oasis.opendocument.spreadsheet |
The response object has a data field whose value will be a Blob object. Data can be pulled into an ArrayBuffer and passed to the SheetJS read [9] method:
/* Obtain ArrayBuffer */
const ab = await file.data.arrayBuffer();
/* Parse */
const wb = XLSX.read(ab);
The code snippet works for XLSX and ODS. Google Sheets supports other formats with different integration logic.
Plaintext
The following formats are considered "plaintext":
Format | MIME Type |
---|---|
CSV (first sheet) | text/csv |
TSV (first sheet) | text/tab-separated-values |
For these formats, file.data is a JS string that can be parsed directly:
/* Request CSV export */
const file = await drive.files.export({ mimeType: "text/csv", fileId: id });
/* Parse CSV string*/
const wb = XLSX.read(file.data, {type: "string"});
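Since the binary and plaintext formats need different handling, a small lookup can keep the MIME type and the parsing mode together. This is a sketch using only the four formats listed in the tables above; `EXPORT_FORMATS` and `exportRequest` are hypothetical names.

```javascript
// Formats from the tables above. "binary" exports arrive as a Blob,
// "string" exports arrive as a JS string.
const EXPORT_FORMATS = new Map([
  ["xlsx", { mimeType: "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", data: "binary" }],
  ["ods", { mimeType: "application/x-vnd.oasis.opendocument.spreadsheet", data: "binary" }],
  ["csv", { mimeType: "text/csv", data: "string" }],
  ["tsv", { mimeType: "text/tab-separated-values", data: "string" }],
]);

// Hypothetical helper: build the files.export parameters and the
// matching SheetJS read options for a format name.
function exportRequest(fileId, format) {
  const entry = EXPORT_FORMATS.get(String(format).toLowerCase());
  if (!entry) throw new Error(`unsupported export format: ${format}`);
  return {
    params: { fileId, mimeType: entry.mimeType },
    readOptions: entry.data === "string" ? { type: "string" } : {},
  };
}

console.log(exportRequest("SOME-SPREADSHEETJS-ID", "csv").params.mimeType); // text/csv
```

`params` would be passed to drive.files.export. For string formats, file.data can be passed to XLSX.read with `readOptions`; for binary formats, the ArrayBuffer obtained from file.data is passed instead.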
HTML
Google Sheets has one relevant HTML type:
Format | MIME Type |
---|---|
HTML (all sheets) | application/zip |
The HTML export of a Google Sheets worksheet includes a row for the column labels (A, B, ...) and a column for the row labels (1, 2, ...).
The complete package is a ZIP file that includes a series of .html files.
The files are written in tab order. The name of each file matches the name in Google Sheets.
This ZIP can be extracted using the embedded CFB library:
import { read, utils, CFB } from 'xlsx';
// -------------------^^^-- `CFB` named import
// ...
/* Parse Google Sheets ZIP file */
const cfb = CFB.read(new Uint8Array(ab), {type: "array"});
/* Create new SheetJS workbook */
const wb = utils.book_new();
/* Scan through each entry in the ZIP */
cfb.FullPaths.forEach((n, i) => {
  /* only process HTML files */
  if(n.slice(-5) != ".html") return;
  /* Extract worksheet name */
  const name = n.slice(n.lastIndexOf("/")+1).slice(0,-5);
  /* parse HTML */
  const htmlwb = read(cfb.FileIndex[i].content);
  /* add worksheet to workbook */
  utils.book_append_sheet(wb, htmlwb.Sheets.Sheet1, name);
});
At this point wb is a SheetJS workbook object [10].
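Because the HTML export carries the column labels and row labels, the parsed worksheets include them as data. A minimal sketch for dropping them follows, assuming the labels land in the first row and first column of the array of arrays generated by sheet_to_json with header: 1. `stripLabels` is a hypothetical helper, and that layout should be confirmed against an actual export.

```javascript
// Hypothetical helper: drop the first row (column labels A, B, ...)
// and the first column (row labels 1, 2, ...) from an array of arrays.
function stripLabels(aoa) {
  return aoa.slice(1).map((row) => row.slice(1));
}

const parsed = [
  ["", "A", "B"],
  [1, "Name", "Index"],
  [2, "Bill Clinton", 42],
];
console.log(stripLabels(parsed));
// [ [ 'Name', 'Index' ], [ 'Bill Clinton', 42 ] ]
```

The cleaned array can be converted back to a worksheet with aoa_to_sheet before it is appended to the workbook.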
Complete Example
This demo was last tested on 2024 June 08 using googleapis version 140.0.0.
The demo uses Sheets v4 and Drive v3 APIs.
The Google Cloud web interface changes frequently!
The screenshots and detailed descriptions may be out of date. Please report any issues to the docs repo or reach out to the SheetJS Discord server.
Account Setup
0) Create a new Google account or log into an existing account.
A valid phone number (for SMS verification) may be required.
1) Open https://console.cloud.google.com in a web browser.
If this is the first time accessing Google Cloud resources, a terms of service modal will be displayed. Review the Google Cloud Platform Terms of Service by clicking the "Google Cloud Platform Terms of Service" link.
You must agree to the Google Cloud Platform Terms of Service to use the APIs.
Check the box under "Terms of Service" and click "AGREE AND CONTINUE".
Project Setup
The goal of this section is to create a new project.
2) Open the Project Selector.
In the top bar, between the "Google Cloud" logo and the search bar, there will be a selection box. Click the ▼ icon to show the modal.
If the selection box is missing, expand the browser window.
3) Click "NEW PROJECT" in the top right corner of the modal.
4) In the New Project screen, enter "SheetJS Test" in the Project name textbox and select "No organization" in the Location box. Click "CREATE".
A notification will confirm that the project was created.
API Setup
The goal of this section is to enable Google Sheets API and Google Drive API.
5) Open the Project Selector (▼ icon) and select "SheetJS Test".
6) In the search bar, type "Enabled" and select "Enabled APIs & services". This item will be in the "PRODUCTS & PAGES" part of the search results.
Enable Google Sheets API
7) Near the top of the page, click "+ ENABLE APIS AND SERVICES".
8) In the search bar near the middle of the page (not the search bar at the top), type "Sheets" and press Enter.
In the results page, look for "Google Sheets API". Click the card.
9) In the Product Details screen, click the blue "ENABLE" button.
10) Click the left arrow (<-) next to "API/Service details".
Enable Google Drive API
11) Near the top of the page, click "+ ENABLE APIS AND SERVICES".
12) In the search bar near the middle of the page (not the search bar at the top), type "Drive" and press Enter.
In the results page, look for "Google Drive API". Click the card.
13) In the Product Details screen, click the blue "ENABLE" button.
Service Account Setup
The goal of this section is to create a service account and generate a JSON key.
14) Go to https://console.cloud.google.com or click the "Google Cloud" image in the top bar.
Create Service Account
15) Click the Project Selector (:· icon) and select "SheetJS Test".
16) In the search bar, type "Credentials" and select the "Credentials" item with subtitle "APIs & Services". This item will be in the "PRODUCTS & PAGES" group.
17) Click "+ CREATE CREDENTIALS". In the dropdown, select "Service Account".
18) Enter "SheetJService" for Service account name. Click "CREATE AND CONTINUE".
The Service account ID is generated automatically.
19) In Step 2 "Grant this service account access to project", click "CONTINUE".
20) In Step 3 click "DONE". You will be taken back to the credentials screen.
Create JSON Key
21) Look for "SheetJService" in the "Service Accounts" table and click the email address in the row.
22) Click "KEYS" in the horizontal bar near the top of the page.
23) Click "ADD KEY" and select "Create new key" in the dropdown.
24) In the popup, select the "JSON" radio button and click "CREATE".
The page will download a JSON file. If prompted, allow the download.
25) Click "CLOSE".
Create Document
The goal of this section is to create a document from the service account and share with the main account.
26) Create a SheetJSGS folder and initialize:
mkdir SheetJSGS
cd SheetJSGS
npm init -y
27) Copy the JSON file from step 24 into the project folder.
28) Install dependencies:
npm i --save https://cdn.sheetjs.com/xlsx-0.20.3/xlsx-0.20.3.tgz googleapis
29) Download init.mjs:
curl -LO https://docs.sheetjs.com/gsheet/init.mjs
Edit the marked lines near the top of the file:
/* Change this import statement to point to the credentials JSON file */
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
/* Change this to the primary account address, NOT THE SERVICE ACCOUNT */
const acct = "[email protected]";
- './sheetjs-test-726272627262.json' should be replaced with the name of the JSON file in step 27. The ./ prefix is required!
- '[email protected]' should be replaced with the Google Account email address from step 0.
30) Run the script:
node init.mjs
The script will print three lines:
Created Google Workbook a-long-string-of-characters
Created Google Worksheets "SheetJS1" and "SheetJS2"
Shared a-long-string-of-characters with [email protected]
The long string of characters after "Created Google Workbook" is the ID. Take note of this ID.
31) Sign into Google Sheets. A shared document "SheetJS Test" should be displayed in the table. It will be owned by the service account.
32) Open the shared document from step 31 and confirm that the document has two worksheets named "SheetJS1" and "SheetJS2".
Confirm the worksheet data matches the screenshots for each sheet:
[Screenshots: worksheet data for SheetJS1 and SheetJS2]
33) Copy the URL and extract the document ID.
The URL of the document will look like
https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0
---------------------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^--- ID
The ID is a long string of letters and numbers and underscore characters (_) just before the /edit part of the URL.
Confirm that this ID matches the ID printed in step 30.
Load Data from NUMBERS
The goal of this section is to update the new document with data from a sample NUMBERS file.
34) Download the test file pres.numbers:
curl -LO https://docs.sheetjs.com/pres.numbers
35) Download load.mjs:
curl -LO https://docs.sheetjs.com/gsheet/load.mjs
Edit the marked lines near the top of the file:
/* Change this import statement to point to the credentials JSON file */
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
/* Change this to the spreadsheet ID */
const id = "SOME-SPREADSHEETJS-ID";
- './sheetjs-test-726272627262.json' should be replaced with the name of the JSON file in step 27. The ./ prefix is required!
- "SOME-SPREADSHEETJS-ID" should be replaced with the Document ID from step 33.
36) Run the script:
node load.mjs
37) Sign into Google Sheets and open the "SheetJS Test" shared document. It should show a list of Presidents, matching the contents of the test file.
Export Data to XLSB
The goal of this section is to export the raw data from Google Sheets to XLSB.
38) Download dump.mjs:
curl -LO https://docs.sheetjs.com/gsheet/dump.mjs
Edit the marked lines near the top of the file:
/* Change this import statement to point to the credentials JSON file */
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
/* Change this to the spreadsheet ID */
const id = "SOME-SPREADSHEETJS-ID";
- './sheetjs-test-726272627262.json' should be replaced with the name of the JSON file in step 27. The ./ prefix is required!
- "SOME-SPREADSHEETJS-ID" should be replaced with the Document ID from step 33.
39) Run the script:
node dump.mjs
The script should create a file SheetJSExport.xlsb in the project folder. This file can be opened in Excel.
Export Raw Files
The goal of this section is to parse the Google Sheets XLSX export and generate CSV files for each worksheet.
40) Sign into Google Sheets and open the "SheetJS Test" shared document.
41) Click the Plus (+) icon in the lower left corner to create a new worksheet.
42) In the new worksheet, set cell A1 to the formula =SEQUENCE(3,5). This will assign a grid of values.
43) Download raw.mjs:
curl -LO https://docs.sheetjs.com/gsheet/raw.mjs
Edit the marked lines near the top of the file:
/* Change this import statement to point to the credentials JSON file */
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
/* Change this to the spreadsheet ID */
const id = "SOME-SPREADSHEETJS-ID";
- './sheetjs-test-726272627262.json' should be replaced with the name of the JSON file in step 27. The ./ prefix is required!
- "SOME-SPREADSHEETJS-ID" should be replaced with the Document ID from step 33.
44) Run the script:
node raw.mjs
The script will display the sheet names and CSV rows from both worksheets:
#### Sheet1
Name,Index
Bill Clinton,42
GeorgeW Bush,43
Barack Obama,44
Donald Trump,45
Joseph Biden,46
#### Sheet14
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15
Footnotes
[6] See "Workbook Object" for more details.
[8] See "Export MIME types for Google Workspace documents" in the Google Developer documentation for the complete list of supported file types.
[10] See "Workbook Object" for a description of the workbook object or "API Reference" for various methods to work with workbook and sheet objects.