Google Sheets Data Interchange
This demo focuses on external data processing.
The "Google Sheets" extension demo covers Apps Script integration and user-defined functions.
Google Sheets is a collaborative spreadsheet service with powerful external APIs for automation.
SheetJS is a JavaScript library for reading and writing data from spreadsheets.
This demo uses SheetJS to properly exchange data with spreadsheet files. We'll explore how to use NodeJS integration libraries and SheetJS in three data flows:
-
"Importing data": Data in a NUMBERS spreadsheet will be parsed using SheetJS libraries and written to a Google Sheets Document
-
"Exporting data": Data in Google Sheets will be pulled into arrays of objects. A workbook will be assembled and exported to Excel Binary workbooks (XLSB).
-
"Exporting files": SheetJS libraries will read XLSX files exported by Google Sheets and generate CSV rows from every worksheet.
It is strongly recommended to create a new Google account for testing.
One small mistake could result in a block or ban from Google services.
Google Sheets deprecates APIs quickly and there is no guarantee that the referenced APIs will be available in the future.
Integration Details
This demo uses the Sheets v4 and Drive v3 APIs through the official googleapis
connector module.
There are a number of steps to enable the Google Sheets API and Google Drive API for an account. The Complete Example covers the process.
Document Duality
Each Google Sheets document is identified with a unique ID. This ID can be found from the Google Sheets edit URL.
The edit URL starts with https://docs.google.com/spreadsheets/d/
and includes
/edit
. The ID is the string of characters between the slashes. For example:
https://docs.google.com/spreadsheets/d/a_long_string_of_characters/edit#gid=0
|^^^^^^^^^^^^^^^^^^^^^^^^^^^|--- ID
The same ID is used in Google Drive operations.
The following operations are covered in this demo:
Operation | API |
---|---|
Create Google Sheets Document | Sheets |
Add and Remove worksheets | Sheets |
Modify data in worksheets | Sheets |
Share Sheets with other users | Drive |
Generate raw file exports | Drive |
Authentication
It is strongly recommended to use a service account for Google API operations. The "Service Account Setup" section covers how to create a service account and generate a JSON key file.
The generated JSON key file includes client_email
and private_key
fields.
These fields can be used in JWT authentication:
import { google } from "googleapis";
// adjust the path to the actual key file.
import creds from './sheetjs-test-726272627262.json' assert { type: "json" };
/* connect to google services */
const jwt = new google.auth.JWT({
email: creds.client_email,
key: creds.private_key,
scopes: [
'https://www.googleapis.com/auth/spreadsheets', // Google Sheets
'https://www.googleapis.com/auth/drive', // Google Drive
'https://www.googleapis.com/auth/drive.file', // Google Drive
]
});
Connecting to Services
The google
named export includes special methods to connect to various APIs.
Google Sheets
const sheets = google.sheets({ version: "v4", auth: jwt });
google.sheets
takes an options argument that includes API version number and
authentication details.
Google Drive
const drive = google.drive({ version: "v3", auth: jwt });
google.drive
takes an options argument that includes API version number and
authentication details.
Array of Arrays
"Arrays of Arrays" are the main data format for interchange with Google Sheets. The outer array object includes row arrays, and each row array includes data.
SheetJS provides methods for working with Arrays of Arrays:
aoa_to_sheet
1 creates SheetJS worksheet objects from arrays of arrayssheet_to_json
2 can generate arrays of arrays from SheetJS worksheets
Export Document Data
The goal is to create an XLSB export from a Google Sheet. Google Sheets does not natively support the XLSB format. SheetJS fills the gap.
Convert a Single Sheet
sheets.spreadsheets.values.get
returns data from an existing Google Sheet. The
method expects a range. Passing the sheet name as the title will pull all rows.
If successful, the response object will have a data
property. It will be an
object with a values
property. The values will be represented as an Array of
Arrays of values. This array of arrays can be converted to a SheetJS sheet:
async function gsheet_ws_to_sheetjs_ws(id, sheet_name) {
/* get values */
const res = await sheets.spreadsheets.values.get({
spreadsheetId: id,
range: `'${sheet_name}'`
});
const values = res.data.values;
/* create SheetJS worksheet */
const ws = XLSX.utils.aoa_to_sheet(values);
return ws;
}
Convert a Workbook
sheets.spreadsheets.get
returns metadata about the Google Sheets document. In
the result object, the data
property is an object which has a sheets
property. The value of the sheets
property is an array of sheet objects.
The SheetJS book_new
3 method creates blank SheetJS workbook objects. The
book_append_sheet
4 method adds SheetJS worksheet objects to the workbook.
By looping across the sheets, the entire workbook can be converted:
async function gsheet_doc_to_sheetjs_wb(doc) {
/* Create a new workbook object */
const wb = XLSX.utils.book_new();
/* Get metadata */
const wsheet = await sheets.spreadsheets.get({spreadsheetId: id});
/* Loop across the Document sheets */
for(let sheet of wsheet.data.sheets) {
/* Get the worksheet name */
const name = sheet.properties.title;
/* Convert Google Docs sheet to SheetJS worksheet */
const ws = await gsheet_ws_to_sheetjs_ws(id, name);
/* Append worksheet to workbook */
XLSX.utils.book_append_sheet(wb, ws, name);
}
return wb;
}
This method returns a SheetJS workbook object that can be exported with the
writeFile
and write
methods.5
Update Document Data
The goal is to import data from a NUMBERS file to Google Sheets. Google Sheets does not natively support the NUMBERS format. SheetJS fills the gap.
Create New Document
sheets.spreadsheets.create
creates a new Google Sheets document. It can accept
a document title. It will generate a new workbook with a blank "Sheet1" sheet.
The response includes the document ID for use in subsequent operations:
const res = await sheets.spreadsheets.create({
requestBody: {
properties: {
/* Document Title */
title: "SheetJS Test"
}
}
});
const id = res.data.spreadsheetId;
When using a service worker, the main account does not have access to the new document by default. The document has to be shared with the main account using the Drive API:
await drive.permissions.create({
fileId: id, // this ID was returned in the response to the create request
fields: "id",
requestBody: {
type: "user",
role: "writer",
emailAddress: "[email protected]" // main address
}
});
Delete Non-Initial Sheets
Google Sheets does not allow users to delete every worksheet.
The recommended approach involves deleting every worksheet after the first.
The delete operation requires a unique identifier for a sheet within the Google
Sheets document. These IDs are found in the sheets.spreadsheets.get
response.
The following snippet performs one bulk operation using batchUpdate
:
/* get existing sheets */
const wsheet = await sheets.spreadsheets.get({spreadsheetId: id});
/* remove all sheets after the first */
if(wsheet.data.sheets.length > 1) await sheets.spreadsheets.batchUpdate({
spreadsheetId: id,
requestBody: { requests: wsheet.data.sheets.slice(1).map(s => ({
deleteSheet: {
sheetId: s.properties.sheetId
}
}))}
});