Data Connection
1. Understanding GCP Project and Service Accounts
References: https://cloud.google.com/iam/docs/service-accounts
2. Data Import Case of 'Demo RPG Game' Service
3. GCP Environment Setup
3.1 Create Service Account for Your Project
First, you need to create a service account for ABC under IAM & Admin -> Service Accounts in your GCP project. When you create a user-managed service account, you can give it a name. This name appears in the email address that identifies the service account, which uses the following format:
service-account-name@project-id.iam.gserviceaccount.com
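To make the format concrete, here is a minimal sketch that builds and validates an email in that shape. The names used ("demo-rpg-import", "my-rpg-project") are illustrative assumptions, not values provisioned by ABC.

```python
import re

# Shape of a user-managed service account email:
#   service-account-name@project-id.iam.gserviceaccount.com
# GCP account IDs and project IDs are lowercase letters, digits, and hyphens.
SA_EMAIL_RE = re.compile(
    r"^[a-z]([a-z0-9-]{4,28}[a-z0-9])?"
    r"@[a-z][a-z0-9-]{4,28}[a-z0-9]\.iam\.gserviceaccount\.com$"
)

def service_account_email(name: str, project_id: str) -> str:
    """Build the email address that identifies a service account."""
    email = f"{name}@{project_id}.iam.gserviceaccount.com"
    if not SA_EMAIL_RE.match(email):
        raise ValueError(f"invalid service account email: {email}")
    return email

# "demo-rpg-import" and "my-rpg-project" are hypothetical example values.
print(service_account_email("demo-rpg-import", "my-rpg-project"))
```

This is only a client-side format check; the authoritative constraints are enforced by GCP when the account is created.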
3.2 Authorize the Service Account
Secondly, you need to grant the following permissions to the Service Account you just created in IAM & Admin -> IAM:
1 Storage Admin (able to list and read GCS Buckets)
The following document describes the authority of the different roles: Cloud Storage - IAM Roles
2 Storage Transfer Admin (able to create Transfer jobs based on GCS)
The following document describes in detail the predefined roles for Storage Transfer Service: Transfer Service - IAM Roles
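As a summary of the grants above, here is a small sketch that assembles the IAM policy bindings for the two roles. The role names are GCP's real predefined role IDs; the service account email is an illustrative placeholder.

```python
# The two predefined roles the guide asks you to grant, with their purpose.
REQUIRED_ROLES = {
    "roles/storage.admin": "list and read objects in the GCS Bucket",
    "roles/storagetransfer.admin": "create Transfer jobs based on GCS",
}

def iam_bindings(service_account: str) -> list:
    """Build IAM policy bindings granting the required roles to one account."""
    member = f"serviceAccount:{service_account}"
    return [{"role": role, "members": [member]} for role in REQUIRED_ROLES]

# Hypothetical example email; substitute the account you created in 3.1.
for binding in iam_bindings("demo-rpg-import@my-rpg-project.iam.gserviceaccount.com"):
    print(binding["role"])
```

In practice you would apply these bindings in the IAM console (or with `gcloud projects add-iam-policy-binding`); the dict above only mirrors the binding structure of an IAM policy.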
3.3 Authorize the Service Account in ABC
Enter the Service Account you created for ABC in the input box, click the authorization button, and wait a moment. The authorization operation adds your account to the GCS Bucket that ABC created independently for your business and grants it the Storage Admin permission.
- Authorization Success
- Authorization Failed
Note: Do you have data security concerns? Don't worry: after authorization you have all the permissions on the Bucket created by ABC, and you can perform any operation on the data, such as adding, modifying, deleting, and querying (although we do not recommend frequent deletion or modification). This operation complies with GDPR and CCPA. See the data compliance documentation if you want to know more details.
4. Import User Behavior Data to ABC
4.1 Start Using GCP Transfer Service
Go to Storage Transfer Service
4.2 Create a Transfer Job
4.3 Import Configuration
4.3.1 Fill in Source Information
Heads up:
- Please select {tableName} in the 'Bucket or folder' field. In the screenshot above, 'tab_saas_bucket' is your bucket name and 'tab_user_behavior' is your table name. (Example path: tab_saas_bucket/tab_user_behavior/ds=20221005)
- You can use filters to specify partitions. In the screenshot above, partitions 20221005 ~ 20221007 are selected for import. (For full data import or scheduled import, you can skip this step.)
You can check diagram below for more details.
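The `ds=` partition layout above can be sketched as a simple prefix calculation. This is a local helper for illustration only; the bucket and table names are the example values from the screenshot, and the actual selection is done with Transfer Service filters.

```python
from datetime import date, timedelta

def partition_prefixes(bucket: str, table: str, start: date, end: date) -> list:
    """List the GCS prefixes for daily ds= partitions in [start, end]."""
    days = (end - start).days + 1
    return [
        f"{bucket}/{table}/ds={(start + timedelta(d)):%Y%m%d}"
        for d in range(days)
    ]

# Partitions 20221005 ~ 20221007, as selected in the screenshot.
for prefix in partition_prefixes("tab_saas_bucket", "tab_user_behavior",
                                 date(2022, 10, 5), date(2022, 10, 7)):
    print(prefix)
```

Each printed prefix corresponds to one daily partition folder under the table path.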
4.3.2 Destination Information
4.3.2.1 Select Project and Bucket
- Select Project ID = abetterchoice
- Fill in the bucket name using the information returned in step 5.3
- Select the bucket and view the child resource
4.3.2.2 Create a Folder (required for the first import)
- If this is your first time importing a data source, please click "Create new folder".
- Fill in the folder name (note: ABetterChoice will use this name to create the table for your project; only letters, numbers and underscores are accepted)
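The naming rule above (letters, numbers and underscores only) can be expressed as a one-line check. This is a sketch of the stated rule, not ABC's actual validation code.

```python
import re

# Folder names become table names: letters, digits, and underscores only.
FOLDER_NAME_RE = re.compile(r"^[A-Za-z0-9_]+$")

def is_valid_folder_name(name: str) -> bool:
    """Return True if the folder name satisfies the documented rule."""
    return bool(FOLDER_NAME_RE.match(name))

print(is_valid_folder_name("tab_user_behavior"))  # underscores are fine
print(is_valid_folder_name("tab-user-behavior"))  # hyphens are rejected
```

Checking the name before creating the folder avoids a failed table creation later.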
4.3.2.3 Confirm the Import Destination
Click the 'SELECT' button in the newly created folder to choose the destination.
4.3.3 Select Running Strategy
4.3.3.1 One-Time Import Job (for historical data backfill)
Select 'Run once' and 'Starting now' to run the job immediately.
4.3.3.2 Import Data Periodically
1. Configure the Time of Source Import
Select 'Relative time range' in the 'Choose a source' step. Here is an example of setting up a transfer for the most recent 24 hours of data.
2. Configure Running Strategy
Set the scheduling policy in the 'Choose how and when to run job' step. Starting from December 8th, a scheduled job will launch at 3:00 AM every day to transfer the most recent 24 hours of files/folders from the Source to the Destination folder.
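To make the relative-time-range behavior concrete, here is a minimal sketch of the window a daily 3:00 AM run would cover. This is local arithmetic for illustration; the actual windowing is computed by Storage Transfer Service, and the December 8th run time is the example from the text.

```python
from datetime import datetime, timedelta, timezone

def recent_window(hours: int, run_time: datetime) -> tuple:
    """Compute the [start, end) window a 'Relative time range' of N hours covers."""
    return run_time - timedelta(hours=hours), run_time

# Daily 3:00 AM run on December 8th, transferring the most recent 24 hours.
run = datetime(2022, 12, 8, 3, 0, tzinfo=timezone.utc)
start, end = recent_window(24, run)
print(start.isoformat(), "->", end.isoformat())
```

Each daily run's window starts where the previous day's window ended, so consecutive runs cover the source without gaps or overlap.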
4.3.4 Run the Job
- Skip the 'Choose settings' step to use the default settings
- Click 'Create' button to start the job