SAP Order to Cash accelerator

The SAP accelerator for the order to cash process is a sample implementation of the SAP Table Batch Source feature in Cloud Data Fusion. The accelerator helps you get started when you create your end-to-end order to cash process and analytics. It includes sample Cloud Data Fusion pipelines that you can configure to perform the following tasks:

Connect to your SAP data source.
Perform transformations on your data in Cloud Data Fusion.
Store your data in BigQuery.
Set up analytics in Looker. This includes dashboards and an ML model, where you can define the key performance indicators (KPIs) for your order to cash process.

This guide describes the sample implementation, and how you can get started with your configurations.

The accelerator is available in Cloud Data Fusion environments running in version 6.3.0 and above.

Before you begin

If you don't already have one, sign up for a new account.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Go to project selector

Make sure that billing is enabled for your Google Cloud project.

Enable the Cloud Data Fusion and BigQuery APIs.

Enable the APIs

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Go to project selector

Make sure that billing is enabled for your Google Cloud project.

Enable the Cloud Data Fusion and BigQuery APIs.

Enable the APIs

Download the SAP Table Batch Source.
You must have access to a Looker instance and have the marketplace labs feature turned on to install the Looker Block. You can request a free trial to get access to an instance.

Required skills

Setting up the SAP Order to Cash accelerator requires the following skills:

Expertise in SAP on-premises ERP systems and configuration
Familiarity with the Cloud Data Fusion
Familiarity with BigQuery
Familiarity with Looker
Familiarity with Identity and Access Management (IAM) service accounts and access control
Familiarity with data analytics, including writing SQL queries
Familiarity with Kimball's dimensional data model

Required users

The configurations described on this page require changes in your SAP system and in Google Cloud. You need to work with the following users of those systems to perform the configurations:

User type	Description
SAP admin	Administrator for your SAP system who can access the SAP service site for downloading software.
SAP user	An SAP user who is authorized to connect to an SAP system.
GCP admin	Administrator who controls IAM access for your organization, who creates and deploys service accounts and grants permissions for Cloud Data Fusion, BigQuery, and Looker.
Cloud Data Fusion user	Users who are authorized to design and run data pipelines in Cloud Data Fusion.
BigQuery Data Owner	Users who are authorized to create, view, and modify BigQuery datasets.
Looker Developer	These users can install the Looker Block through the Marketplace. They must have `develop`, `manage_model`, and `deploy` permissions.

Required IAM roles

In the accelerator's sample implementation, the following IAM roles are required. You might need additional roles if your project relies on other Google Cloud services.

BigQuery Admin (roles/bigquery.admin)
BigQuery Data Owner (roles/bigquery.dataOwner)
Storage Object Viewer (roles/storage.objectViewer)
Cloud Data Fusion Runner (roles/datafusion.runner) needs to be granted to the Dataproc service account

Process overview

You can implement the accelerator in your project with the following steps:

Configure the SAP ERP system and install the SAP transport provided.
Set up your Cloud Data Fusion environment to use the SAP Table Batch Source plugin.
Create datasets in BigQuery. The accelerator provides sample datasets for staging, dimensional, and fact tables.
Configure the sample Cloud Data Fusion pipelines from the accelerator to integrate your SAP data.
From the Cloud Data Fusion Hub, deploy the pipelines associated with the order to cash analytics process. These pipelines must be configured correctly to create the BigQuery dataset.
Connect Looker to the BigQuery project.
Install and deploy the Looker Block.

For more information, see Using the SAP Table Batch Source plugin.

Sample datasets in BigQuery

In the sample implementation in this accelerator, the following datasets are created in BigQuery.

Dataset name	Description
`sap_cdf_staging`	Contains all the tables from the SAP Source system as identified for that business process.
`sap_cdf_dimension`	Contains the key dimension entities like Customer Dimension and Material Dimension.
`sap_cdf_fact`	Contains the fact tables generated from the pipeline.

Sample pipelines in Cloud Data Fusion

Sample pipelines for this accelerator are available in the Cloud Data Fusion Hub.

To get the sample pipelines from the Hub:

Go to your instance:
1. In the Google Cloud console, go to the Cloud Data Fusion page.
2. To open the instance in the Cloud Data Fusion Studio, click Instances, and then click View instance.
  
  Go to Instances
Click Hub.
Select the SAP tab.
Select Pipelines. A page of sample pipelines opens.
Select the desired pipelines to download them.

Each of the pipelines contains macros that you can configure to run in your environment.

There are three types of sample pipelines:

Staging layer pipelines: The staging dataset in this type of pipeline is a direct mapping to the original source table in SAP. The sample staging layer pipelines have names that refer to the SAP source table and the BigQuery target table. For example, a pipeline named KNA1_Customer_Master refers to the SAP Source Table (KNA1) and BigQuery target table (CustomerMaster).
Dimension layer pipelines: The dimension layer dataset in this type of pipeline is a curated and refined version of the staging dataset that creates the dimension and facts needed for the analysis. The sample pipelines have names that refer to the target entity in the target BigQuery dataset. For example, a pipeline called customer_dimension refers to the Customer Dimension entity in the BigQuery dataset sap_cdf_fact.
Fact layer pipelines: The fact layer dataset is a curated and refined version of the staging dataset that creates the facts that are necessary for the analysis. These sample pipelines have names that refer to the target entity in the target BigQuery dataset. For example, a pipeline called sales_order_fact delivers curated data to the Sales Order Fact entity in the corresponding BigQuery dataset sap_cdf_fact.

The following sections summarize how to get the pipelines to work in your environment.

Configure staging layer pipelines

There are two configuration steps for the staging pipelines:

Configure the source SAP system.
Configure the target BigQuery dataset and table.

Parameters for the SAP Table Batch Source plugin

The SAP Table Batch Source plugin reads the content of an SAP table or view. The accelerator provides the following macros, which you can modify to control your SAP connections centrally.

Macro name	Description	Example
`${SAP Client}`	SAP client to use	`100`
`${SAP Language}`	SAP logon language	`EN`
`${SAP Application Server Host}`	SAP server name or IP address	`10.132.0.47`
`${SAP System Number}`	SAP system number	`00`
`${secure(saplogonusername)}`	SAP User name	For more information, see Using Secure Keys.
`${secure(saplogonpassword)}`	SAP user password	For more information, see Using Secure Keys.
`${Number of Rows to Fetch}`	Used to limit the number of extracted records.	`100000`

For more information, see Configuring the plugin.

Parameters for the BigQuery target

The accelerator provides the following macros for BigQuery targets.

BigQuery target connector configuration

Macro name	Description	Example
`${ProjectID}`	The project ID where the BigQuery dataset has been created.	`sap_adaptor`
`${Dataset}`	Target dataset	`sap_cdf_staging`

Sample pipelines used for order to cash KPIs

The following key business entities in the order to cash process correspond with sample pipelines in the accelerator. These pipelines deliver the data that powers the analytics about these entities.

Key business entities	Corresponding sample pipeline name
A `Customer` might be a person or an entity with whom the organization does business. These three SAP source tables capture details about the customer as they pertain to the business. Information from these tables contributes to the `customer_dimension` in the `sap_cdf_dimension` dataset.	KNA1_CustomerMaster KNVV_CustomerSales KNVP_CustomerPartnerFunction
`Material` is the commodity that is traded between the enterprise and its customers. Information from these SAP source tables contributes to the material_dimension in the `sap_cdf_dimension` dataset.	MARA_MaterialMaster MARD_MaterialStorageLocation
The order management sub-process of the order to cash process (when your system receives an order from the customer).	VBAK_SalesDocumentHeader VBAP_SalesDocumentItem VBEP_SalesDocumentSchedule
The order fulfillment and shipping sub-processes.	LIKP_DeliveryHeader LIPS_DeliveryItem
The invoicing and customer payments sub-processes (when the customer receives an invoice).	VBRK_BillingHeader VBRP_BillingLineItem
The accounts receivable and reporting sub-processes (when the payment is logged in your system).	ACDOCA_UniversalJournalItem

All Cloud Data Fusion staging pipelines

The following Cloud Data Fusion staging pipeline samples are available in the accelerator:

KNA1_CustomerMaster
KNVV_CustomerSales
KNVP_CustomerPartnerFunction
MARA_MaterialMaster
MARD_MaterialStorageLocation
VBAK_SalesDocumentHeader
VBAP_SalesDocumentItem
VBEP_SalesDocumentSchedule
LIKP_DeliveryHeader
LIPS_DeliveryItem
ACDOCA_UniversalJournalItem
VBRK_BillingHeader
VBRP_BillingLineItem
BSEG_AccountDocumentItem
BSID_AccDocCustOpenItem
BSAD_AccDocCustCloseItem
T001_CompanyCodes
T006A_UnitOfMeasure
T024D_MRPControllers
T042ZT_PaymentMethodText
T189T_PriceListCategory
TCURC_CurrencyCodes
TCURT_CurrencyCodesText
TCURW_ExchangeRateType
TINCT_CustomerIncotermsText
TVAKT_SalesDocumentType
TVAPT_SalesDocumentItemCategory
TVFST_BillingBlockReasonText
TVLST_DeliveryBlockText
TVTWT_DistributionChannelText
MAKT_MaterialDescription
T005T_CountryNames
T005U_RegionText
TVAST_SalesDocBlockReasonText
T077X_AccountGroupName
T134T_MaterialTypeDescription
T023T_MaterialGroupDescription
TSPAT_SalesDivisionText
TVKOV_DistributionChannel
TVTA_SalesArea
TVKOT_SalesOrgText
TVAUT_SalesDocumentOrderText
TVSBT_ShippingConditionText
TVAG_SalesDocRejectionReasons
TVAGT_SalesDocRejectionReasonDescription

Configure dimensional layer pipelines

You can extract KPIs from source SAP tables. To prepare the data for analysis, organize the data in the source table to match the BigQuery table's schema structure.

The accelerator creates the following four sample tables:

Table name	Table description
customer_dimension	Curated list* of Customers and their associated facts such as customer classification, customer hierarchy, and customer sales-related information.
material_dimension	Curated list of Materials and associated facts such as SKU number, product hierarchy, and classification.
sales_order_fact	Curated list of sales information such as order types, order status visibility, ordered quantity, and order value. These fields are typically aggregated to generate order management KPIs such as Open Orders, Confirmed Orders, Rejected Orders, and Billed Orders.
revenue_fact	Detailed accounting information generated by the Sale of the Material to Customers. Derived from the Accounting tables, this fact table contains information that can provide insights through Revenue KPIs including Gross Sales, Net Sales Before Discount, Net Sales After Discount, or Trends.

*In this context, the curated list comes from business logic that gets applied to the selected list of columns.

The accelerator builds the dimensional layer of the BigQuery dataset using SQL scripts, which you can modify for your project. For example, you can adapt these scripts to add more columns to the target BigQuery dataset entities.

Transformation to star schema: BigQuery executor pipeline names

The following BigQuery executor pipelines in Cloud Data Fusion load data into dimension and fact tables:

All dimensional transformation pipelines:

customer_dimension
material_dimension
sales_order_fact
revenue_fact

BigQuery executor configuration

Macro name	Example
`${ProjectID}`	`sap_adaptor`
`${StagingDatasetName}`	`sap_cdf_staging`
`${TargetDatasetName}`	`sap_cdf_dimension`

Connect Looker to the BigQuery project

To connect Looker to BigQuery, see the Looker documentation about BigQuery connections.

Install the block

You can access the SAP Looker Block on GitHub.

The Looker Block installs a pre-configured LookML model with two Explore environments and two dashboards.

What's next

Learn more about Cloud Data Fusion.
Learn more about SAP on Google Cloud.
Learn more about BigQuery.
Learn more about Looker Blocks.