Google BigQuery is a fully managed cloud data warehouse that lets you store and analyze petabytes of data.
This guide will help you configure BigQuery as a source from which you can route event data to your desired destinations through RudderStack.
Follow the steps below to grant the necessary permissions for warehouse actions. For BigQuery, use the BigQuery Console.
Go to https://console.cloud.google.com/iam-admin/roles and click on “CREATE ROLE”.
Fill it in as shown below.
Click on “ADD PERMISSIONS” and add the permissions mentioned below.
Click on “CREATE”.
Now, go to https://console.cloud.google.com/iam-admin/serviceaccounts and select the project that contains the dataset or table you want to share.
Click on “CREATE SERVICE ACCOUNT”.
Fill in the details in Step 1 and click CREATE:
Fill in the details in Step 2 and click CONTINUE:
After completing steps 1 and 2, click DONE.
This takes you to the list of service accounts. Click on the “3 dots” for the service account that you just created, and select “Manage keys”.
Click on ADD KEY and select “Create new key”. In the pop-up, select JSON and click CREATE.
Download this JSON file and use its contents when creating a BigQuery source in RudderStack.
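Before pasting the key into RudderStack, it can help to confirm that the downloaded file is intact. The following is a minimal sketch, not part of RudderStack: the function name `check_service_account_key` is hypothetical, and the check covers only a few fields that a Google Cloud service account key in JSON format normally contains.

```python
import json

# A few fields normally present in a Google Cloud service account JSON key.
# This list is an assumption for illustration, not an exhaustive schema.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json: str) -> list[str]:
    """Return a list of problems found in the key; an empty list means it looks usable."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # A service account key declares its type explicitly.
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

To use it, read the downloaded file into a string (for example with `pathlib.Path(...).read_text()`) and pass that string to the function. An empty list suggests the key is structurally complete. It does not prove that the service account has the required permissions.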
To set up Google BigQuery as a source in RudderStack, follow these steps:
Log into your RudderStack dashboard.
From the left panel, select Sources. Then, click on Add Source, as shown:
Scroll down to the Warehouse Sources and select BigQuery. Then, click on Next.
Assign a name to your source, and click on Create Credentials from Scratch. Then, click on Next.
Next, enter the GCP project ID and the Credentials JSON which RudderStack will use to import the data from your BigQuery instance.
Next, select the Schema and the Table from which you want RudderStack to import the data.
Once you specify the table containing the required columns, you will be able to preview a snippet of your data, as shown below:
Here, you can select all or only a few specific columns of your choice, search the columns by a keyword, and also edit the JSON Trait Key, as shown below. You can also preview the resultant JSON on the right. Once you have selected the table columns to import the data from, click on Next.
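To make the relationship between selected columns, JSON Trait Keys, and the resultant JSON concrete, here is a rough sketch. It is an illustration only and does not reflect RudderStack's actual implementation. The function `row_to_traits`, the sample row, and the column names are all hypothetical.

```python
import json


def row_to_traits(row: dict, trait_keys: dict) -> dict:
    """Keep only the selected columns and rename each one to its trait key."""
    return {trait_keys[col]: row[col] for col in trait_keys if col in row}


# Hypothetical table row with four columns.
row = {"user_id": "u-1", "email": "a@example.com", "plan_name": "pro", "internal_flag": True}

# Hypothetical selection: two columns chosen, one of them given a renamed trait key.
trait_keys = {"email": "email", "plan_name": "plan"}

print(json.dumps(row_to_traits(row, trait_keys), indent=2))
```

In this sketch, columns that are not selected (`user_id`, `internal_flag`) are left out of the result, and `plan_name` appears under the edited key `plan`.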
Next, set the Run Frequency to schedule the data import from your BigQuery instance to RudderStack. You can also specify when this synchronization should start by choosing a time under the Sync Starting At option. Then, click on Next.
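The way a run frequency and a start time combine can be pictured as a fixed-interval schedule. The sketch below rests on that assumption: runs repeat at a constant interval anchored at the Sync Starting At time. RudderStack's actual scheduler may behave differently, and the function name `next_sync_times` is hypothetical.

```python
from datetime import datetime, timedelta


def next_sync_times(start: datetime, every: timedelta, now: datetime, count: int = 3) -> list[datetime]:
    """Return the next `count` run times at or after `now`, on a fixed interval anchored at `start`."""
    if every <= timedelta(0):
        raise ValueError("interval must be positive")
    if now <= start:
        # The schedule has not begun yet, so the first run is the start time itself.
        first = start
    else:
        elapsed = now - start
        # Ceiling division: whole intervals needed to reach or pass `now`.
        steps = -(-elapsed // every)
        first = start + steps * every
    return [first + i * every for i in range(count)]
```

For example, with a start time of 09:00, an hourly interval, and a current time of 10:30, the function returns runs at 11:00, 12:00, and 13:00 on the same day.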
That's it! BigQuery is now successfully configured as a source on your RudderStack dashboard.
RudderStack will start importing data from your BigQuery instance as per the specified frequency. You can further connect this source to your preferred destinations by clicking on Connect Destinations or Add Destinations, as shown:
If you come across any issues while configuring Google BigQuery as a source on the RudderStack dashboard, please feel free to contact us. You can also start a conversation on our Slack channel; we will be happy to talk to you!