Objectives
This tutorial walks you through the following steps using the Spanner PGAdapter local proxy for PostgreSQL drivers:
- Create a Spanner instance and database.
- Write, read, and execute SQL queries on data in the database.
- Update the database schema.
- Update data using a read-write transaction.
- Add a secondary index to the database.
- Use the index to read and execute SQL queries on data.
- Retrieve data using a read-only transaction.
Costs
This tutorial uses Spanner, which is a billable component of Google Cloud. For information on the cost of using Spanner, see Pricing.
Before you begin
Complete the steps described in Set up, which cover creating and setting a default Google Cloud project, enabling billing, enabling the Cloud Spanner API, and setting up OAuth 2.0 to get authentication credentials to use the Cloud Spanner API.
In particular, make sure that you run gcloud auth application-default login to set up your local development environment with authentication credentials.
Prepare your local PGAdapter environment
You can use PostgreSQL drivers in combination with PGAdapter to connect to Spanner. PGAdapter is a local proxy that translates the PostgreSQL network protocol to the Spanner gRPC protocol.
PGAdapter requires either Java or Docker to run.
Install Java or Docker on your development machine if neither is already installed.
Clone the sample app repository to your local machine:
git clone https://github.com/GoogleCloudPlatform/pgadapter.git
Change to the directory that contains the Spanner sample code for your language:
psql
cd pgadapter/samples/snippets/psql-snippets
Java
cd pgadapter/samples/snippets/java-snippets
mvn package -DskipTests
Go
cd pgadapter/samples/snippets/golang-snippets
Node.js
cd pgadapter/samples/snippets/nodejs-snippets
npm install
Python
cd pgadapter/samples/snippets/python-snippets
python -m venv ./venv
pip install -r requirements.txt
cd samples
C#
cd pgadapter/samples/snippets/dotnet-snippets
Create an instance
When you first use Spanner, you must create an instance, which is an allocation of resources that are used by Spanner databases. When you create an instance, you choose an instance configuration, which determines where your data is stored, and also the number of nodes to use, which determines the amount of serving and storage resources in your instance.
Execute the following command to create a Spanner instance in the region us-central1 with 1 node:
gcloud spanner instances create test-instance --config=regional-us-central1 \
--description="Test Instance" --nodes=1
Note that this creates an instance with the following characteristics:
- Instance ID: test-instance
- Display name: Test Instance
- Instance configuration: regional-us-central1 (Regional configurations store data in one region, while multi-region configurations distribute data across multiple regions. For more information, see About instances.)
- Node count: 1 (node_count corresponds to the amount of serving and storage resources available to databases in the instance. Learn more in Nodes and processing units.)
You should see:
Creating instance...done.
Look through sample files
The samples repository contains a sample that shows how to use Spanner with PGAdapter.
Take a look through the samples/snippets folder, which shows how to use Spanner. The code shows how to create and use a new database. The data uses the example schema shown in the Schema and data model page.
Start PGAdapter
Start PGAdapter on your local development machine and point it to the instance that you created.
The following commands assume that you have executed gcloud auth application-default login.
Java Application
wget https://storage.googleapis.com/pgadapter-jar-releases/pgadapter.tar.gz \
&& tar -xzvf pgadapter.tar.gz
java -jar pgadapter.jar -i test-instance
Docker
docker pull gcr.io/cloud-spanner-pg-adapter/pgadapter
docker run \
--name pgadapter \
--rm -d -p 5432:5432 \
-v "$HOME/.config/gcloud":/gcloud:ro \
--env CLOUDSDK_CONFIG=/gcloud \
gcr.io/cloud-spanner-pg-adapter/pgadapter \
-i test-instance -x
Emulator
docker pull gcr.io/cloud-spanner-pg-adapter/pgadapter-emulator
docker run \
--name pgadapter-emulator \
--rm -d \
-p 5432:5432 \
-p 9010:9010 \
-p 9020:9020 \
gcr.io/cloud-spanner-pg-adapter/pgadapter-emulator
This starts PGAdapter with an embedded Spanner emulator. This embedded emulator automatically creates any Spanner instance or database that you connect to without the need to manually create them beforehand.
We recommend that you run PGAdapter in production as either a side-car container or as an in-process dependency. For more information on deploying PGAdapter in production, see Choose a method for running PGAdapter.
Create a database
Create a database called example-db in the instance called test-instance by running the following on the command line:
gcloud spanner databases create example-db --instance=test-instance \
--database-dialect=POSTGRESQL
You should see:
Creating database...done.
Create tables
The following code creates two tables in the database.
Run the sample with the following command:
psql
PGDATABASE=example-db ./create_tables.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar createtables example-db
Go
go run sample_runner.go createtables example-db
Node.js
npm start createtables example-db
Python
python create_tables.py example-db
C#
dotnet run createtables example-db
The next step is to write data to your database.
Create a connection
Before you can do reads or writes, you must create a connection to PGAdapter. All of your interactions with Spanner must go through a Connection. The database name is specified in the connection string.
Run the sample with the following command:
psql
PGDATABASE=example-db ./create_connection.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar createconnection example-db
Go
go run sample_runner.go createconnection example-db
Node.js
npm start createconnection example-db
Python
python create_connection.py example-db
C#
dotnet run createconnection example-db
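Because PGAdapter speaks the PostgreSQL wire protocol, connecting is ordinary driver usage in any language. The following is a minimal sketch in Python with psycopg2, assuming PGAdapter is listening on its default port 5432 on localhost; the helper names are illustrative and not part of the sample repository.

```python
# Minimal sketch: connect to Spanner through PGAdapter with psycopg2.
# PGAdapter handles authentication, so no user or password is needed
# for a local connection.

def make_dsn(database, host="localhost", port=5432):
    """Build a libpq connection string; the database name selects the
    Spanner database on the instance that PGAdapter points to."""
    return f"host={host} port={port} dbname={database}"

def connect(database="example-db"):
    import psycopg2  # lazy import; install with: pip install psycopg2-binary
    return psycopg2.connect(make_dsn(database))

# Usage (requires a running PGAdapter):
#   with connect() as conn, conn.cursor() as cur:
#       cur.execute("SELECT 'Hello world!'")
#       print(cur.fetchone()[0])
```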
Write data with DML
You can insert data using Data Manipulation Language (DML) in a read-write transaction.
These samples show how to execute a DML statement on Spanner using a PostgreSQL driver.
Run the sample with the following command:
psql
PGDATABASE=example-db ./write_data_with_dml.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar writeusingdml example-db
Go
go run sample_runner.go writeusingdml example-db
Node.js
npm start writeusingdml example-db
Python
python write_data_with_dml.py example-db
C#
dotnet run writeusingdml example-db
You should see the following response:
4 records inserted.
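The DML write can be sketched in Python with psycopg2. The four singer rows mirror the "4 records inserted" step (singer 12, Melissa Garcia, appears again later in the tutorial); treat the exact rows as sample data.

```python
# Sketch: insert rows with DML. psycopg2 opens a read-write transaction
# implicitly; commit() sends it to Spanner through PGAdapter.
INSERT_SINGER = (
    "INSERT INTO singers (singer_id, first_name, last_name) "
    "VALUES (%s, %s, %s)"
)
SINGERS = [
    (12, "Melissa", "Garcia"),
    (13, "Russell", "Morales"),
    (14, "Jacqueline", "Long"),
    (15, "Dylan", "Shaw"),
]

def write_with_dml(dbname="example-db"):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        count = 0
        with conn.cursor() as cur:
            for row in SINGERS:
                cur.execute(INSERT_SINGER, row)
                count += cur.rowcount  # rows affected by this statement
        conn.commit()  # commits the read-write transaction
        print(f"{count} records inserted.")
    finally:
        conn.close()
```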
Write data with a DML batch
PGAdapter supports executing DML batches. Sending multiple DML statements in one batch reduces the number of round-trips to Spanner and improves the performance of your application.
Run the sample with the following command:
psql
PGDATABASE=example-db ./write_data_with_dml_batch.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar writeusingdmlbatch example-db
Go
go run sample_runner.go writeusingdmlbatch example-db
Node.js
npm start writeusingdmlbatch example-db
Python
python write_data_with_dml_batch.py example-db
C#
dotnet run writeusingdmlbatch example-db
You should see:
3 records inserted.
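One way to batch DML from Python is psycopg2's execute_batch helper, which packs many statements into few network round-trips for PGAdapter to forward to Spanner. The singer rows below are placeholders.

```python
# Sketch: several DML statements sent as a batch with
# psycopg2.extras.execute_batch to reduce round-trips.
INSERT_SINGER = (
    "INSERT INTO singers (singer_id, first_name, last_name) "
    "VALUES (%s, %s, %s)"
)
BATCH = [
    (16, "Sarah", "Wilson"),
    (17, "Ethan", "Miller"),
    (18, "Maya", "Jackson"),
]

def write_with_dml_batch(dbname="example-db"):
    import psycopg2
    from psycopg2.extras import execute_batch
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        with conn.cursor() as cur:
            # execute_batch groups the statements into multi-statement
            # commands instead of one network round-trip per row
            execute_batch(cur, INSERT_SINGER, BATCH)
        conn.commit()
        print(f"{len(BATCH)} records inserted.")
    finally:
        conn.close()
```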
Write data with mutations
You can also insert data using mutations.
PGAdapter translates the PostgreSQL COPY command to mutations. Using COPY is the most efficient way to quickly insert data in your Spanner database.
COPY operations are atomic by default. Atomic operations on Spanner are bound by the commit size limit. See CRUD limit for more information.
These examples show how to execute a non-atomic COPY operation. This allows the COPY operation to exceed the commit size limit.
Run the sample with the following command:
psql
PGDATABASE=example-db ./write_data_with_copy.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar write example-db
Go
go run sample_runner.go write example-db
Node.js
npm start write example-db
Python
python write_data_with_copy.py example-db
C#
dotnet run write example-db
You should see:
Copied 5 singers
Copied 5 albums
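In Python, a COPY can be driven through psycopg2's copy_expert. The sketch below first switches the session to partitioned_non_atomic mode so the COPY may exceed the commit size limit; the session variable name follows PGAdapter's COPY documentation, and the singer rows are sample data.

```python
# Sketch: bulk-load rows with COPY, which PGAdapter converts to
# Spanner mutations.
import io

SINGERS_TSV = (
    "1\tMarc\tRichards\n"
    "2\tCatalina\tSmith\n"
    "3\tAlice\tTrentor\n"
    "4\tLea\tMartin\n"
    "5\tDavid\tLomond\n"
)

def copy_singers(dbname="example-db"):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        conn.autocommit = True
        with conn.cursor() as cur:
            # Make the COPY non-atomic so it is not bound by the
            # commit size limit.
            cur.execute(
                "set spanner.autocommit_dml_mode='partitioned_non_atomic'")
            cur.copy_expert(
                "COPY singers (singer_id, first_name, last_name) FROM STDIN",
                io.StringIO(SINGERS_TSV))
        print(f"Copied {SINGERS_TSV.count(chr(10))} singers")
    finally:
        conn.close()
```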
Query data using SQL
Spanner supports a SQL interface for reading data, which you can access on the command line using the Google Cloud CLI or programmatically using a PostgreSQL driver.
On the command line
Execute the following SQL statement to read the values of all columns from the Albums table:
gcloud spanner databases execute-sql example-db --instance=test-instance \
--sql='SELECT singer_id, album_id, album_title FROM albums'
The result should be:
SingerId AlbumId AlbumTitle
1 1 Total Junk
1 2 Go, Go, Go
2 1 Green
2 2 Forever Hold Your Peace
2 3 Terrified
Use a PostgreSQL driver
In addition to executing a SQL statement on the command line, you can issue the same SQL statement programmatically using a PostgreSQL driver.
Run the sample with the following command:
psql
PGDATABASE=example-db ./query_data.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar query example-db
Go
go run sample_runner.go query example-db
Node.js
npm start query example-db
Python
python query_data.py example-db
C#
dotnet run query example-db
You should see the following result:
1 1 Total Junk
1 2 Go, Go, Go
2 1 Green
2 2 Forever Hold Your Peace
2 3 Terrified
Query using a SQL parameter
If your application has a frequently executed query, you can improve its performance by parameterizing it. The resulting parameterized query can be cached and reused, which reduces compilation costs. For more information, see Use query parameters to speed up frequently executed queries.
Here is an example of using a parameter in the WHERE clause to query records containing a specific value for LastName.
Run the sample with the following command:
psql
PGDATABASE=example-db ./query_data_with_parameter.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar querywithparameter example-db
Go
go run sample_runner.go querywithparameter example-db
Node.js
npm start querywithparameter example-db
Python
python query_data_with_parameter.py example-db
C#
dotnet run querywithparameter example-db
You should see the following result:
12 Melissa Garcia
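The parameterized query looks like this in Python with psycopg2: the driver sends last_name as a query parameter (%s), so Spanner can cache and reuse the query plan across parameter values.

```python
# Sketch: a parameterized query against the singers table.
QUERY = (
    "SELECT singer_id, first_name, last_name FROM singers "
    "WHERE last_name = %s"
)

def query_by_last_name(last_name, dbname="example-db"):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        with conn.cursor() as cur:
            # The tuple supplies the parameter value; the SQL text stays
            # constant, so the query plan can be cached.
            cur.execute(QUERY, (last_name,))
            return cur.fetchall()
    finally:
        conn.close()

# Usage (requires a running PGAdapter):
#   for singer_id, first_name, last_name in query_by_last_name("Garcia"):
#       print(singer_id, first_name, last_name)
```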
Update the database schema
Assume you need to add a new column called MarketingBudget to the Albums table. Adding a new column to an existing table requires an update to your database schema. Spanner supports schema updates to a database while the database continues to serve traffic. Schema updates don't require taking the database offline, and they don't lock entire tables or columns; you can continue writing data to the database during the schema update. Read more about supported schema updates and schema change performance in Make schema updates.
Add a column
You can add a column on the command line using the Google Cloud CLI or programmatically using a PostgreSQL driver.
On the command line
Use the following ALTER TABLE command to add the new column to the table:
gcloud spanner databases ddl update example-db --instance=test-instance \
--ddl='ALTER TABLE albums ADD COLUMN marketing_budget BIGINT'
You should see:
Schema updating...done.
Use a PostgreSQL driver
Execute the DDL statement using a PostgreSQL driver to modify the schema:
Run the sample with the following command:
psql
PGDATABASE=example-db ./add_column.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar addmarketingbudget example-db
Go
go run sample_runner.go addmarketingbudget example-db
Node.js
npm start addmarketingbudget example-db
Python
python add_column.py example-db
C#
dotnet run addmarketingbudget example-db
You should see:
Added marketing_budget column
Execute a DDL batch
We recommend executing multiple schema modifications in one batch. You can execute multiple DDL statements in one batch by using the built-in batching feature of your PostgreSQL driver, by submitting all the DDL statements as one SQL string separated by semicolons, or by using the START BATCH DDL and RUN BATCH statements.
Run the sample with the following command:
psql
PGDATABASE=example-db ./ddl_batch.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar ddlbatch example-db
Go
go run sample_runner.go ddlbatch example-db
Node.js
npm start ddlbatch example-db
Python
python ddl_batch.py example-db
C#
dotnet run ddlbatch example-db
You should see:
Added venues and concerts tables
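A DDL batch with the START BATCH DDL and RUN BATCH statements can be sketched as follows in Python. The statements are buffered locally and sent to Spanner as one schema update when RUN BATCH executes; the CREATE TABLE bodies here are illustrative, not the sample's exact definitions.

```python
# Sketch: a DDL batch using PGAdapter's batching statements.
DDL_BATCH = [
    "START BATCH DDL",
    "CREATE TABLE venues ("
    " venue_id bigint NOT NULL PRIMARY KEY,"
    " name varchar(1024))",
    "CREATE TABLE concerts ("
    " concert_id bigint NOT NULL PRIMARY KEY,"
    " venue_id bigint,"
    " singer_id bigint)",
    "RUN BATCH",
]

def run_ddl_batch(dbname="example-db"):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        conn.autocommit = True  # DDL runs outside explicit transactions
        with conn.cursor() as cur:
            for statement in DDL_BATCH:
                cur.execute(statement)  # buffered until RUN BATCH
        print("Added venues and concerts tables")
    finally:
        conn.close()
```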
Write data to the new column
The following code writes data to the new column. It sets MarketingBudget to 100000 for the row keyed by Albums(1, 1) and to 500000 for the row keyed by Albums(2, 2).
PGAdapter translates the PostgreSQL COPY command to mutations. COPY commands are translated to Insert mutations by default. Execute set spanner.copy_upsert=true to translate COPY commands to InsertOrUpdate mutations instead. This can be used to update existing data in Spanner.
Run the sample with the following command:
psql
PGDATABASE=example-db ./update_data_with_copy.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar update example-db
Go
go run sample_runner.go update example-db
Node.js
npm start update example-db
Python
python update_data_with_copy.py example-db
C#
dotnet run update example-db
You should see:
Updated 2 albums
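The copy_upsert update can be sketched as follows in Python. Enabling spanner.copy_upsert makes PGAdapter emit InsertOrUpdate mutations instead of Insert mutations, so the COPY overwrites the two existing rows; the budget values match the tutorial's target rows.

```python
# Sketch: update existing rows through COPY with copy_upsert enabled.
import io

BUDGETS_TSV = "1\t1\t100000\n2\t2\t500000\n"

def update_with_copy(dbname="example-db"):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        conn.autocommit = True
        with conn.cursor() as cur:
            # Translate COPY to InsertOrUpdate mutations so existing
            # rows are updated instead of rejected as duplicates.
            cur.execute("set spanner.copy_upsert=true")
            cur.copy_expert(
                "COPY albums (singer_id, album_id, marketing_budget) FROM STDIN",
                io.StringIO(BUDGETS_TSV))
        print(f"Updated {BUDGETS_TSV.count(chr(10))} albums")
    finally:
        conn.close()
```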
You can also execute a SQL query to fetch the values that you just wrote.
Here's the code to execute the query:
Run the query with this command:
psql
PGDATABASE=example-db ./query_data_with_new_column.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar querymarketingbudget example-db
Go
go run sample_runner.go querymarketingbudget example-db
Node.js
npm start querymarketingbudget example-db
Python
python query_data_with_new_column.py example-db
C#
dotnet run querymarketingbudget example-db
You should see:
1 1 100000
1 2 null
2 1 null
2 2 500000
2 3 null
Update data
You can update data using DML in a read-write transaction.
Run the sample with the following command:
psql
PGDATABASE=example-db ./update_data_with_transaction.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar writewithtransactionusingdml example-db
Go
go run sample_runner.go writewithtransactionusingdml example-db
Node.js
npm start writewithtransactionusingdml example-db
Python
python update_data_with_transaction.py example-db
C#
dotnet run writewithtransactionusingdml example-db
You should see:
Transferred marketing budget from Album 2 to Album 1
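The transfer above amounts to two UPDATE statements committed atomically. A sketch in Python with psycopg2, where the transfer amount is a placeholder:

```python
# Sketch: a read-write transaction that moves budget between two albums.
# psycopg2 opens a transaction implicitly; commit() makes both updates
# atomic.
DEBIT = (
    "UPDATE albums SET marketing_budget = marketing_budget - %s "
    "WHERE singer_id = 2 AND album_id = 2"
)
CREDIT = (
    "UPDATE albums SET marketing_budget = marketing_budget + %s "
    "WHERE singer_id = 1 AND album_id = 1"
)

def transfer_budget(dbname="example-db", amount=200000):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        with conn.cursor() as cur:
            cur.execute(DEBIT, (amount,))
            cur.execute(CREDIT, (amount,))
        conn.commit()  # both updates succeed or fail together
        print("Transferred marketing budget from Album 2 to Album 1")
    finally:
        conn.close()
```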
Transaction tags and request tags
Use transaction tags and request tags to troubleshoot transactions and queries in Spanner. You can set transaction tags and request tags with the SPANNER.TRANSACTION_TAG and SPANNER.STATEMENT_TAG session variables.
Run the sample with the following command:
psql
PGDATABASE=example-db ./tags.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar tags example-db
Go
go run sample_runner.go tags example-db
Node.js
npm start tags example-db
Python
python tags.py example-db
C#
dotnet run tags example-db
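Setting the tags is plain SQL through the session variables, executed before the statements they should label. A Python sketch; the tag values are placeholders.

```python
# Sketch: tag a transaction and a statement via PGAdapter's
# session variables so they can be found in Spanner's statistics tables.
SET_TRANSACTION_TAG = "set spanner.transaction_tag='example-tx-tag'"
SET_STATEMENT_TAG = "set spanner.statement_tag='example-stmt-tag'"

def tagged_update(dbname="example-db"):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        with conn.cursor() as cur:
            cur.execute(SET_TRANSACTION_TAG)  # tags the current transaction
            cur.execute(SET_STATEMENT_TAG)    # tags the next statement
            cur.execute(
                "UPDATE albums SET marketing_budget = marketing_budget "
                "WHERE singer_id = 1 AND album_id = 1")
        conn.commit()
    finally:
        conn.close()
```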
Retrieve data using read-only transactions
Suppose you want to execute more than one read at the same timestamp. Read-only transactions observe a consistent prefix of the transaction commit history, so your application always gets consistent data.
Set the connection to read-only or use the SET TRANSACTION READ ONLY SQL statement to execute a read-only transaction.
The following shows how to run a query and perform a read in the same read-only transaction:
Run the sample with the following command:
psql
PGDATABASE=example-db ./read_only_transaction.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar readonlytransaction example-db
Go
go run sample_runner.go readonlytransaction example-db
Node.js
npm start readonlytransaction example-db
Python
python read_only_transaction.py example-db
C#
dotnet run readonlytransaction example-db
You should see output similar to:
1 1 Total Junk
1 2 Go, Go, Go
2 1 Green
2 2 Forever Hold Your Peace
2 3 Terrified
2 2 Forever Hold Your Peace
1 2 Go, Go, Go
2 1 Green
2 3 Terrified
1 1 Total Junk
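The two reads above can be sketched in Python: SET TRANSACTION READ ONLY turns the transaction that follows BEGIN into a read-only transaction, so both queries observe the same snapshot of the database.

```python
# Sketch: two reads at the same timestamp in a read-only transaction.
QUERY_ALBUMS = "SELECT singer_id, album_id, album_title FROM albums"

def read_only_snapshot(dbname="example-db"):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        conn.autocommit = True  # manage the transaction with SQL statements
        with conn.cursor() as cur:
            cur.execute("BEGIN")
            cur.execute("SET TRANSACTION READ ONLY")
            cur.execute(QUERY_ALBUMS)                            # first read
            first = cur.fetchall()
            cur.execute(QUERY_ALBUMS + " ORDER BY album_title")  # second read
            second = cur.fetchall()
            cur.execute("COMMIT")  # ends the read-only transaction
        return first, second
    finally:
        conn.close()
```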
Partitioned queries and Data Boost
The partitionQuery API divides a query into smaller pieces, or partitions, and uses multiple machines to fetch the partitions in parallel. Each partition is identified by a partition token. The PartitionQuery API has higher latency than the standard query API because it is intended only for bulk operations, such as exporting or scanning the whole database.
Data Boost lets you execute analytics queries and data exports with near-zero impact to existing workloads on the provisioned Spanner instance. Data Boost only supports partitioned queries.
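A Data Boost partitioned query might look like the following in Python. Both the spanner.data_boost_enabled session variable and the "run partitioned query" statement are PGAdapter extensions; verify the exact names against your PGAdapter version's documentation before relying on this sketch.

```python
# Sketch: a partitioned query with Data Boost through PGAdapter.
PARTITIONED_QUERY = (
    "run partitioned query "
    "SELECT singer_id, album_id, album_title FROM albums"
)

def data_boost_query(dbname="example-db"):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        conn.autocommit = True  # partitioned queries run outside transactions
        with conn.cursor() as cur:
            # Route the partitioned query through Data Boost compute
            # instead of the provisioned instance.
            cur.execute("set spanner.data_boost_enabled=true")
            cur.execute(PARTITIONED_QUERY)
            return cur.fetchall()
    finally:
        conn.close()
```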
Run the sample with the following command:
psql
PGDATABASE=example-db ./data_boost.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar databoost example-db
Go
go run sample_runner.go databoost example-db
Node.js
npm start databoost example-db
Python
python data_boost.py example-db
C#
dotnet run databoost example-db
For more information on running partitioned queries and using Data Boost with PGAdapter, see Data Boost and partitioned query statements.
Partitioned DML
Partitioned Data Manipulation Language (DML) is designed for the following types of bulk updates and deletes:
- Periodic cleanup and garbage collection.
- Backfilling new columns with default values.
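A backfill of the kind listed above can be run as partitioned DML by switching the autocommit DML mode, as in this Python sketch; the UPDATE statement itself is an illustrative backfill, not the sample's exact statement.

```python
# Sketch: a bulk backfill executed as partitioned (non-atomic) DML.
PARTITIONED_UPDATE = (
    "UPDATE albums SET marketing_budget = 0 WHERE marketing_budget IS NULL"
)

def backfill_budget(dbname="example-db"):
    import psycopg2
    conn = psycopg2.connect(host="localhost", port=5432, dbname=dbname)
    try:
        conn.autocommit = True  # partitioned DML cannot run in a transaction
        with conn.cursor() as cur:
            # Run subsequent autocommit DML as partitioned DML.
            cur.execute(
                "set spanner.autocommit_dml_mode='partitioned_non_atomic'")
            cur.execute(PARTITIONED_UPDATE)
            print(f"Updated at least {cur.rowcount} albums")
    finally:
        conn.close()
```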
Run the sample with the following command:
psql
PGDATABASE=example-db ./partitioned_dml.sh
Java
java -jar target/pgadapter-snippets/pgadapter-samples.jar partitioneddml example-db
Go
go run sample_runner.go partitioneddml example-db
Node.js
npm start partitioneddml example-db
Python
python partitioned_dml.py example-db
C#
dotnet run partitioneddml example-db
Clean up
To avoid incurring additional charges to your Cloud Billing account for the resources used in this tutorial, drop the database and delete the instance that you created.
Delete the database
If you delete an instance, all databases within it are automatically deleted. This step shows how to delete a database without deleting an instance (you would still incur charges for the instance).
On the command line
gcloud spanner databases delete example-db --instance=test-instance
Using the Google Cloud console
Go to the Spanner Instances page in the Google Cloud console.
Click the instance.
Click the database that you want to delete.
In the Database details page, click Delete.
Confirm that you want to delete the database and click Delete.
Delete the instance
Deleting an instance automatically drops all databases created in that instance.
On the command line
gcloud spanner instances delete test-instance
Using the Google Cloud console
Go to the Spanner Instances page in the Google Cloud console.
Click your instance.
Click Delete.
Confirm that you want to delete the instance and click Delete.
What's next
Learn how to access Spanner with a virtual machine instance.
Learn about authorization and authentication credentials in Authenticate to Cloud services using client libraries.
Learn more about Spanner Schema design best practices.