Connecting to Google Spanner

Screenshot

Google Cloud Spanner is a fully managed, scalable, globally distributed, and strongly consistent database service that is designed for mission-critical applications. It combines the benefits of relational databases with the scalability of NoSQL databases, making it an ideal choice for modern applications that require high availability and low latency.

In this tutorial, we will show you how to connect to a Google Spanner database using GQL and display the data with Ogma. We will use the Google Cloud Spanner service and the Graph Query Language to retrieve data from the database.

We will also demonstrate how to visualize the data using Ogma, a powerful graph visualization library.

We will cover the following topics:

Setting up a Google Cloud Spanner database
Loading data from a static CSV file into the database
Connecting to the database using Node.js express app and GQL
Running queries using GQL
Visualizing the data with Ogma
Adding styles and layout to the graph
Adding interactivity to the graph

Prerequisites

A Google Cloud account
A Google Cloud Spanner instance
A Google Cloud Spanner database

The dataset

The dataset used in this tutorial is a simple graph of IMDB movies, actors and genres. You can find it in .csv format on Kaggle. It contains an exempt of about 1000 movies from IMDB. We will expalin to you how to parse and load this dataset into Google Spanner.

Connecting to Google Spanner

You can connect to a Google Spanner database using the @google-cloud/spanner library. This library provides a simple and easy-to-use API for connecting to Google Spanner and running queries. You will need to set up authentication using a service account key file.

Go to the Google Cloud Console (https://console.cloud.google.com/)
Navigate to "IAM & Admin" > "Service Accounts"
Click "Create Service Account"
Enter a name and description for your service account
Assign the necessary roles for Spanner access:
- Cloud Spanner Admin
- Cloud Spanner Database Admin
- Cloud Spanner Viewer (if you only need read access)
Click "Create Key" and select JSON format
Save the downloaded JSON file in a secure location - this is your credentials file (you can find a sample credentials file in the files folder of this tutorial)

Now you need to point your application and scripts to the credentials file. You can do this by setting the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of your JSON key file. Here's an example of the .env file we will be using in this tutorial:

# Google Cloud Spanner Configuration
SPANNER_PROJECT_ID=<your-project-id>
SPANNER_INSTANCE_ID=<your-instance-id>
SPANNER_DATABASE_ID=<your-database-id>

# Path to your Google Cloud Service Account key file
GOOGLE_APPLICATION_CREDENTIALS=<path-to-your-credentials-file>

You can find the example in the files/.env.example file. You can rename it to .env and fill in the values.

Setting up your Node.js application.

We will be using TypeScript and Node.js to do the data processing and the application server. So you will need to initialize a Node.js/Typescript project using npm:

bash

npm init -y
npm install typescript ts-node @types/node express tsx vite @google-cloud/spanner dotenv uuid csv-parser --save-dev

This will install the tsx transformer that would allow you running TypeScript code directly in Node.js. ts-node types would allow you operating the Node.js types easier in TypeScript. express and vite will be used later to create the API and the simple web application to visualize the data. @google-cloud/spanner is the library that allows you to connect to Google Spanner and run queries. dotenv is used to load environment variables from the .env file. uuid is used to generate unique IDs for the nodes in the graph. csv-parser is used to parse the CSV files.

Loading the dataset

Using .env file to load the credentials and configuration

We will be using the dotenv library to load the environment variables from the .env file. You can create a .env file in the root of your project and add the following content:

import dotenv from 'dotenv';
// Load environment variables from .env file
dotenv.config();

From now on you can use the process.env object to access the environment variables.

Parsing the CSV file

We will be using the csv-parser library to parse the CSV file. You can create a files/populate_db.ts file and add the following code:

import csv from 'csv-parser';
import fs from 'fs';
import { v4 as uuidv4 } from 'uuid';

...

fs.createReadStream(csvFilePath)
  .pipe(csv())
  .on('data', (row: MovieRow) => {
    // here you can process each row of the CSV file
    // and create the nodes and edges records for the graph,
  })

The uuid library is used to generate unique ids for the records - they will be used to create the relations and connect the nodes and edges. Full implementation you can find in the files/populate_db.ts file.

Creating the schema

After parsing the CSV file we will create a database and assign a schema to it. The schema will be used to create the tables and the indexes. You can find a full version of the script in the files/populate_db.ts script. But first, let's set up the tables, indexes and the graph schema. It looks like a set of SQL commands that we have separated into files/schema.ts file.

export const dbSchema = [
  // Movies table (Node)
  `CREATE TABLE Movies (
        id STRING(36) NOT NULL,
        title STRING(MAX) NOT NULL,
        description STRING(MAX),
        year INT64,
        runtime INT64,
        votes INT64,
      ) PRIMARY KEY (id)`,

  // People table (Node)
  `CREATE TABLE People (
        id STRING(36) NOT NULL,
        name STRING(MAX) NOT NULL
      ) PRIMARY KEY (id)`,

  // Genres table (Node)
  `CREATE TABLE Genres (
        id STRING(36) NOT NULL,
        name STRING(MAX) NOT NULL
      ) PRIMARY KEY (id)`,

  // Directed relationship table (Edge)
  `CREATE TABLE Directed (
        id STRING(36) NOT NULL,
        personId STRING(36) NOT NULL,
        movieId STRING(36) NOT NULL,
        FOREIGN KEY (personId) REFERENCES People (id),
        FOREIGN KEY (movieId) REFERENCES Movies (id)
      ) PRIMARY KEY (id)`,

  // ActedIn relationship table (Edge)
  `CREATE TABLE ActedIn (
        id STRING(36) NOT NULL,
        personId STRING(36) NOT NULL,
        movieId STRING(36) NOT NULL,
        FOREIGN KEY (personId) REFERENCES People (id),
        FOREIGN KEY (movieId) REFERENCES Movies (id)
      ) PRIMARY KEY (id)`,

  // HasGenre relationship table (Edge)
  `CREATE TABLE HasGenre (
        id STRING(36) NOT NULL,
        movieId STRING(36) NOT NULL,
        genreId STRING(36) NOT NULL,
        FOREIGN KEY (movieId) REFERENCES Movies (id),
        FOREIGN KEY (genreId) REFERENCES Genres (id)
      ) PRIMARY KEY (id)`,
];

This will create tables to store the movies, people, genres and the relationships between them. An important part of this step is to ennable the graph schema. This is done by a special command that will tell Google Spanner which tables are nodes and which tables are edges and hoe to connect them. So for that you need the following directive (add it to the array of queries above):

sql

CREATE OR REPLACE PROPERTY GRAPH movieGraph
  NODE TABLES (
    Movies,
    People,
    Genres
  )
  EDGE TABLES (
    Directed
      SOURCE KEY (personId) REFERENCES People (id)
      DESTINATION KEY (movieId) REFERENCES Movies (id)
      LABEL Directed,
    ActedIn
      SOURCE KEY (personId) REFERENCES People (id)
      DESTINATION KEY (movieId) REFERENCES Movies (id)
      LABEL ActedIn,
    HasGenre
      SOURCE KEY (genreId) REFERENCES Genres (id)
      DESTINATION KEY (movieId) REFERENCES Movies (id)
  )

You can see that it will create the movieGraph graph schema and define the nodes and edges. The SOURCE KEY and DESTINATION KEY are used to define the source and destination nodes of the edges. ActedIn, Directed and HasGenre are the types of edges we will be using in the application.

To create the tables and the graph schema, you can use the following code:

// Create Spanner client
const spanner = new Spanner({
  projectId: config.projectId
});

// Get instance and database
const instance = spanner.instance(config.instanceId!);
const database = instance.database(config.databaseId!);

// Check if schema already exists
const [schema] = await database.getSchema();
if (schema.length > 0 && !drop) {
  console.log('Schema already exists, skipping creation.');
  return;
}

// Create schema
try {
  const statements = dbSchema;
  const [operation] = await database.updateSchema(statements);
  await operation.promise();
  console.log('Schema created successfully!');
} catch (error) {
  console.error('Error creating schema:', error);
  throw error;
}

Inserting data

After creating the schema, you can insert the data into the database. You can use the insert method of the database object to insert the data. You can find a full version of the script in the files/populate_db.ts file. Here is an example of how to insert data into the database:

// Insert data into Movies table
const movieRows = movies.map((movie) => ({
  id: uuidv4(),
  title: movie.title,
  description: movie.description,
  year: movie.year,
  runtime: movie.runtime,
  votes: movie.votes,
}));
await database.table('Movies').insert(movieRows);

Now that the data is loaded and schema is created, you can see it in the Google Spanner console like this:

schema

Node.js API

In order to query the database you will need to create a Node.js REST API. The reason why you have to do it and cannot query the database directly from the browser is that you should not expose your database credentials to the client. So you will need to create a simple API that will handle the requests and return the data.

import express from 'express';
import { Spanner } from '@google-cloud/spanner';
import dotenv from 'dotenv';

// Load environment variables from .env file
dotenv.config();

// Initialize Express app
const app = express();
const port = 3000;

// Initialize Spanner client and point it at the database (see files/db.ts)
const database = getDatabase();

Now you can create the endpoints to query the database. We envision requesting data in a following way:

/api/movies/:title - to get the movie by title, its director, actors and genres
/api/actors/:name - to get the person by name, the movies they acted in and the genres of those movies
/api/genres/:name - to get the genre by name, the movies of that genre and the actors in those movies

Form the UX perspective, it will happen when we want to explore the graph by double-clicking on a node. So let's create the endpoints for that (full version in files/server.ts):

// Route stubs
app.get('/movie/:title', async (req: Request, res: Response) => {
  const { title } = req.params;
  const [rows] = await database.run({
    sql: GET_MOVIE_BY_TITLE,
    params: { title }
  });
  // Placeholder for database query logic
  res.json({
    message: `Stub: Get movie with title "${title}"`,
    graph: toGraph(rows)
  });
});

For convenience, we have separated the query and the data convertors. The query will look like this:

Cypher

GRAPH movieGraph
MATCH (m:Movies { id: @movieId })
OPTIONAL MATCH (director:People)-[:Directed]->(m)
OPTIONAL MATCH (actor:People)-[:ActedIn]->(m)
OPTIONAL MATCH (genre:Genres)-[:HasGenre]->(m)
RETURN
  m.id AS movieId,
  m.title AS title,
  m.description AS description,
  m.year AS year,
  m.runtime AS runtime,
  m.votes AS votes,
  director.id AS directorId,
  director.name AS directorName,
  actor.id AS actorId,
  actor.name AS actorName,
  genre.id AS genreId,
  genre.name AS genreName

This will traverse the graph and output all the inforamtion that we will need to form the resulting nodes and edges in JSON format. For convenience, we will convert them directly to the format that Ogma is expecting. We have a special parser class for that, you can find the full code for it in the files/parser.ts file. The logic there is that it's trying to extract all the possible nodes and links from the mixed data in query result and return them, possibly avoiding the duplication. So if it encounters movie, actor or genre data, it will make sure it will for the objects according to the model shapes and create the edges between them.

Running the server

You can run the server using the following command:

bash

npx tsx files/server.ts

This will start the API server on the port 3000. Of course it's a very simple way of running it and in the production environment you well have to taker care of the security and the access to the API from the client. But for our simple application it will be enough.

Web application

We will be using vite to create and run our web application. It's very simple, by default it will be looking for the index.html file in the root of the project. This file will load the index.ts script that will run the application with the visualization. You can find the full code in the files/index.ts file. The main part of the code is to create the Ogma instance and load the data from the API. You can use the following code to create the Ogma instance:

import Ogma from '@linkurious/ogma';

const ogma = new Ogma({
  container: 'graph-container',
});

You can then use npx vite to run the application. It will start a local server on port 5173 and you can access it in your browser at http://localhost:5173. Normally at this point you will just see the blank screen and no errors in the console.

Proxying the API

To make the api accessible from our dev server, we will need to add a proxy to the vite.config.ts file. You can do it like this:

import { defineConfig } from "vite";

export default defineConfig({
  server: {
    proxy: {
      "/api": {
        target: "http://localhost:3000", // Replace with your backend server URL
        changeOrigin: true,
        rewrite: (path) => path.replace(/^\/api/, ""),
      },
    },
  },
});

This will proxy all the requests to /api to the backend server running on port 3000. So now you can access the API from the frontend using the /api prefix. So you can run the server in one terminal and the client in another terminal.

Adding styles

We know that we will have the movies, actors and genres in the graph. Conveniently, we can share the type checking code between the server and the client, so creating selective node styles becomes very easy. You can see the full version of the code in files/styles.ts but it generallty looks like this:


const isMovie = (node: OgmaNode) => node.getType() === 'Movies';

export function addStyles(ogma: Ogma<NodeDataType>) {
  ogma.styles.addNodeRule(isMovie, {
    color: '#FCBC05',
    text: {
      content: node => node.getData('title'),
      font,
      size: 16.5,
      style: 'bold',
      minVisibleSize: 0
    },
    radius: 8,
    icon: {
      content: getIconCode('icon-film'),
      font: 'Lucide',
      scale: 0.5,
      color: '#372b09'
    }
  });
  ... // other types and states
}

Loading the first movie subgraph

Now that we have the API and the styles, we can load the first movie subgraph. We have wrapped all the calls to the API in a client class that will handle parameter validation and pass them to the API. You can find the full code in the files/client.ts file. We will use this abstraction like this:

const client = new API();
const initialMovie = "Chappie";
const graph = await client.getMovieByTitle(initialMovie);

// add the graph to Ogma (it will be styled immediately)
await ogma.setGraph(graph);
// run the layout to position the nodes and center on them
await ogma.layouts.force({ locate: true });

This will load the graph from the API and add it to Ogma.

graph

Congratulations, you have the first subgraph loaded from Google Spanner into your visualization! You can see the movie, its director, actors and genres. You can also see the styles applied to the nodes and edges.

Adding interactivity

We will add the behaviors to expand the adjacent nodes and edges when double-clicking on the node. For that, you will just need to figure which endpoint to query, based on the node type:

for an actor, you will need to query the /api/actors/:name endpoint to get all the movies the actor was in
for a genre, you will need to query the /api/genres/:name endpoint to get all the movies of that genre
for a movie, you will need to query the /api/movies/:title endpoint to get all the actors and genres of that movie

// listen to the double-click event on the graph
ogma.events.on("doubleclick", async ({ target }) => {
  if (target && target.isNode) { // we clicked a node
    if (isMovie(target)) { // and it represents a movie
      const movieTitle = target.getData("title");
      const graph = await client.getMovieByTitle(movieTitle);
      // add the graph, ignore duplicate nodes and edges
      await ogma.addGraph(graph, { ignoreInvalid: true });
      // zoom in on the new nodes
      await addGraphAndZoomIn(graph);
    } else if (isPerson(target)) { // other types of nodes
      ...

You can see the full experience in our Google Spanner example. It uses a mock database to query, but you can easily combine that code and this tutorial to recreate the full application using this data.

Getting started

React

Angular

Connecting to Google Spanner

Prerequisites

The dataset

Connecting to Google Spanner

Setting up your Node.js application.

Loading the dataset

Using .env file to load the credentials and configuration

Parsing the CSV file

Creating the schema

Inserting data

Node.js API

Running the server

Web application

Proxying the API

Adding styles

Loading the first movie subgraph

Adding interactivity

Connecting to Google Spanner ​

Prerequisites ​

The dataset ​

Connecting to Google Spanner ​

Setting up your Node.js application. ​

Loading the dataset ​

Using .env file to load the credentials and configuration ​

Parsing the CSV file ​

Creating the schema ​

Inserting data ​

Node.js API ​

Running the server ​

Web application ​

Proxying the API ​

Adding styles ​

Loading the first movie subgraph ​

Adding interactivity ​

Connecting to Google Spanner

Prerequisites

The dataset

Connecting to Google Spanner

Setting up your Node.js application.

Loading the dataset

Using .env file to load the credentials and configuration

Parsing the CSV file

Creating the schema

Inserting data

Node.js API

Running the server

Web application

Proxying the API

Adding styles

Loading the first movie subgraph

Adding interactivity