SAP Datasphere federated queries
As a data analyst, you can query relational data in SAP Datasphere from BigQuery using federated queries.
BigQuery SAP Datasphere federation lets BigQuery query data residing in SAP Datasphere in real time, without copying or moving data.
To run a SQL query in SAP Datasphere, specify that SQL query within BigQuery in a EXTERNAL_QUERY
function. The results are then transferred from SAP Datasphere to BigQuery.
Limitations
- You can only query relational views that are exposed for consumption. Other objects in SAP Datasphere are not accessible to the query federated through
EXTERNAL_QUERY
. - The federated query latency might be noticeably higher than the same query if it was executed directly in SAP Datasphere.
- The first query that uses SAP Datasphere connection in a given project might take more than a minute to run.
- No additional SQL pushdowns are supported for SAP Datasphere.
- The SAP Datasphere SQL query must specify aliases for columns that contain function results.
- When the usage of Compute Engine API in the query project is restricted by VPC Service Controls, the federated query will fail.
Before you begin
Ensure that your BigQuery administrator has created a SAP Datasphere connection and shared it with you.
Required roles
To get the permissions that you need to query SAP Datasphere,
ask your administrator to grant you the
BigQuery Connection User (roles/bigquery.connectionUser
) IAM role on the project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Query data
To send a federated query to SAP Datasphere from a GoogleSQL query, use the EXTERNAL_QUERY function.
The following example is a federated query that joins a table in SAP Datasphere named ORDERS
and a table in BigQuery named mydataset.customers
.
SELECT c.customer_id, c.name, rq.first_order_date
FROM mydataset.customers AS c
LEFT OUTER JOIN EXTERNAL_QUERY(
'connection_id',
'''SELECT CUSTOMER_ID, MIN(ORDER_DATE) AS first_order_date
FROM ORDERS
GROUP BY CUSTOMER_ID''') AS rq
ON rq.customer_id = c.customer_id
GROUP BY c.customer_id, c.name, rq.first_order_date;
View a SAP Datasphere table schema
The following examples use the EXTERNAL_QUERY function to retrieve database metadata from the
SYS
schema in SAP Datasphere.
-- List all views in a schema.
SELECT * FROM EXTERNAL_QUERY(
'connection_id',
'''SELECT VIEW_NAME FROM SYS.VIEWS
WHERE SCHEMA_NAME = 'MY_SCHEMA'''');
-- List all columns in a view.
SELECT * FROM EXTERNAL_QUERY(
'connection_id',
'''SELECT COLUMN_NAME, DATA_TYPE_NAME
FROM SYS.VIEW_COLUMNS
WHERE SCHEMA_NAME = 'MY_SCHEMA' AND
VIEW_NAME = 'my_view'
ORDER BY POSITION''');
Pricing
The cost of running a federated query is based on three factors:
- The compute cost of executing the query in SAP Datasphere.
- The bandwidth cost of transferring the query results from SAP Datasphere to BigQuery.
- The compute cost of executing the query in BigQuery.
Any SAP Datasphere related costs depend on the type of SAP service you use. To
limit the bandwidth cost, we recommend that you write the query in the
EXTERNAL_QUERY
so that it excludes all columns and rows that are not needed to
compute the final result.
There is no additional cost for running federated queries in BigQuery. For more information about BigQuery pricing, see Pricing.
What's next
- Learn about federated queries.
- Learn about unsupported data types.