Snowflake
Step-by-step guide to add Snowflake as a destination in RudderStack
Snowflake is a cloud-based data warehouse provided as Software-as-a-Service (SaaS). It offers all the features of a modern data warehouse, including scalability, ease of use, secure access to your data, accelerated analytics capabilities, and much more.
RudderStack allows you to configure Snowflake as a destination to dump your event data seamlessly.
Please check our Warehouse Schemas guide to learn how events are mapped to the tables in Snowflake.
Find the open-source transformer code for this destination in our GitHub repo.

Snowflake User Setup

To enable RudderStack access, please make sure you use the ACCOUNTADMIN role, or a role with the MANAGE GRANTS privilege.
The following sections illustrate how to create a virtual warehouse, a database, a role, and a user in Snowflake:

Creating a Virtual Warehouse

Create an X-Small warehouse by following the wizard on Snowflake's website, as shown in the screenshot below:
creating x-small warehouse screenshot
You can set your data warehouse size as per your future data volume.
Alternatively, you can also use SQL to create a warehouse, as shown:
CREATE WAREHOUSE "RUDDER_WAREHOUSE"
WITH WAREHOUSE_SIZE = 'XSMALL'
WAREHOUSE_TYPE = 'STANDARD'
AUTO_SUSPEND = 600
AUTO_RESUME = TRUE;
It is recommended that you set AUTO_SUSPEND to ~10 minutes (600 seconds) and enable AUTO_RESUME to avoid any extra costs.

Creating a Database

Please create a new database to avoid conflicts with your existing data, as RudderStack creates its own tables.
The following screenshot shows the Create Database option in Snowflake:
Creating a database in Snowflake
Alternatively, you can also use SQL to create a database, as shown:
CREATE DATABASE "RUDDER_EVENTS";

Creating a Role for RudderStack

Please execute the following SQL commands to create a new role with the required permissions to load your data into the warehouse:
    Create a new role:
CREATE ROLE "RUDDER";
    Grant access to the virtual warehouse:
GRANT USAGE ON WAREHOUSE "RUDDER_WAREHOUSE" TO ROLE "RUDDER";
    Grant access to the database:
GRANT USAGE ON DATABASE "RUDDER_EVENTS" TO ROLE "RUDDER";
GRANT CREATE SCHEMA ON DATABASE "RUDDER_EVENTS" TO ROLE "RUDDER";
GRANT ALL ON ALL SCHEMAS IN DATABASE "RUDDER_EVENTS" TO ROLE "RUDDER";

Creating a User

Finally, please create a user to connect RudderStack to the previously created Snowflake warehouse, as shown:
Creating a user in Snowflake
Alternatively, you can use SQL to create a user in Snowflake, as shown:
CREATE USER "RUDDER_USER"
MUST_CHANGE_PASSWORD = FALSE
DEFAULT_ROLE = "RUDDER"
PASSWORD = 'strong_unique_password';
GRANT ROLE "RUDDER" TO USER "RUDDER_USER";
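Optionally, you can verify the setup from a Snowflake worksheet. This is a sketch assuming the user and role names used above:

```sql
-- Verify the role and privileges granted to the RudderStack user.
SHOW GRANTS TO USER "RUDDER_USER";  -- should list the "RUDDER" role
SHOW GRANTS TO ROLE "RUDDER";       -- should list usage on the warehouse and database
```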

Configuring Snowflake in RudderStack

To enable dumping data to Snowflake, you will first need to add it as a destination to the source from which you are sending event data. Once the destination is enabled, events from RudderStack will start flowing to Snowflake.
To do so, please follow these steps:
    Choose a source to which you would like to add Snowflake as a destination. You can also simply create a destination and connect it to a source later.
Please follow our Adding a Source and Destination guide to learn how to add a source in RudderStack.
    After choosing a source, select Snowflake from the list of destinations.
    Give your destination a name and then click on Next. You should then see the following screen:
Connection Settings for configuring Snowflake as a Destination
    Add the required credentials in the Connection Settings as described below:
      Account - This is the account ID of your warehouse. The account ID is part of the Snowflake URL. The following examples illustrate the slight differences in the account ID for various cloud providers:

    Account ID sample       | Snowflake URL                                          | Snowflake cloud provider
    qya56091.us-east-1      | https://qya56091.us-east-1.snowflakecomputing.com      | Amazon Web Services (AWS)
    rx18795.east-us-2.azure | https://rx18795.east-us-2.azure.snowflakecomputing.com | Microsoft Azure (Azure)
    ah76025.us-central1.gcp | https://ah76025.us-central1.gcp.snowflakecomputing.com | Google Cloud Platform (GCP)
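If you are unsure of your account ID, you can also look it up from any active Snowflake session; the exact formatting in your URL still varies by cloud provider:

```sql
-- Returns the account locator and the region of the current account.
SELECT CURRENT_ACCOUNT(), CURRENT_REGION();
```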
    The following settings are applicable if you are using an Amazon S3 bucket for object storage:
      S3 bucket name - This is your unique S3 bucket name.
      AWS Access Key ID - This can be obtained from the AWS Console.
      AWS Secret Access Key - This can be obtained from the AWS Console. Please refer to the Setting Up Amazon S3 section for more details.
    Please grant the following permissions to the IAM user:
    "Effect": "Allow",
    "Action": [
      "s3:GetObject",
      "s3:PutObject",
      "s3:PutObjectAcl",
      "s3:ListBucket"
    ]

Configuring Snowflake Integration

If you have Amazon Web Services (AWS) as your cloud provider and want to leverage S3 as your object storage, you will need to follow a few more steps to configure your Snowflake destination with a Snowflake storage integration.
Detailed instructions can be found in the Snowflake documentation.
Configuring the Snowflake integration with AWS:
    1. Create a policy in AWS:
    Replace <bucket> and <prefix> with your values in the JSON below, and create the policy with a name of your choice.
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject",
            "s3:GetObjectVersion",
            "s3:DeleteObject",
            "s3:DeleteObjectVersion"
          ],
          "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
        },
        {
          "Effect": "Allow",
          "Action": "s3:ListBucket",
          "Resource": "arn:aws:s3:::<bucket>",
          "Condition": {
            "StringLike": {
              "s3:prefix": [
                "<prefix>/*"
              ]
            }
          }
        }
      ]
    }
    2. Create a role and attach the above policy in AWS:
      Create a role of type Another AWS account.
      Enter your AWS account ID and enable Require External ID.
      For the external ID, enter a dummy ID such as 0000. We will modify this later.
      Attach the policy created in Step 1.
      Give your role a name and keep the role ARN handy for the next step.
    3. Create a cloud storage integration in Snowflake:
    CREATE STORAGE INTEGRATION <integration_name>
    TYPE = EXTERNAL_STAGE
    STORAGE_PROVIDER = S3
    ENABLED = TRUE
    STORAGE_AWS_ROLE_ARN = '<iam_role>'
    STORAGE_ALLOWED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/')
    [ STORAGE_BLOCKED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/') ];
    Replace <integration_name> with an integration name of your choice, and <iam_role> with the role ARN from the previous step. Keep the value of <integration_name> handy for the next steps.
    4. Retrieve the AWS IAM user for your Snowflake account, as shown:
    DESC INTEGRATION <integration_name>;
    <integration_name> - The integration name used in the previous step.
    Record the values of STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID.
    5. Grant the IAM user permission to access the bucket objects in AWS:
      Choose the role created in Step 2 and edit the trust relationship with the following JSON.
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "<snowflake_user_arn>"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "<snowflake_external_id>"
            }
          }
        }
      ]
    }
    <snowflake_user_arn> is the STORAGE_AWS_IAM_USER_ARN from the previous step. <snowflake_external_id> is the STORAGE_AWS_EXTERNAL_ID from the previous step.
    6. Grant integration access to the role in Snowflake:
    GRANT USAGE ON INTEGRATION <integration_name> TO ROLE <sf_role>;
    <integration_name> is the integration created in Step 3. <sf_role> is the Snowflake role you want to grant access to.
If you have Microsoft Azure as your cloud provider and want to leverage Azure Blob Storage as your object storage, you will need to follow a few more steps to configure your Snowflake destination with a Snowflake storage integration.
Detailed instructions can be found in the Snowflake documentation.
Configuring the Snowflake integration with Azure:
    1. Create a storage account in Azure.
    2. Create a container in the storage account created above by navigating to <storage_account> » Storage Explorer » Blob Containers » Create a Blob Container.
    3. Create a cloud storage integration in Snowflake:
    CREATE STORAGE INTEGRATION <integration_name>
    TYPE = EXTERNAL_STAGE
    STORAGE_PROVIDER = AZURE
    ENABLED = TRUE
    AZURE_TENANT_ID = '<tenant_id>'
    STORAGE_ALLOWED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/')
    [ STORAGE_BLOCKED_LOCATIONS = ('azure://<account>.blob.core.windows.net/<container>/<path>/', 'azure://<account>.blob.core.windows.net/<container>/<path>/') ];
    <tenant_id> - Get your tenant ID by navigating to Azure Active Directory » Properties. The tenant ID is displayed in the Directory ID field.
    <account> - Your storage account name, found under <storage_account> » Access keys as storageAccountName.
    4. Grant Snowflake access to the storage locations:
    Replace <integration_name> with the integration name created in the previous step.
    DESC INTEGRATION <integration_name>;
    Record the values of AZURE_CONSENT_URL and AZURE_MULTI_TENANT_APP_NAME.
    5. Navigate to the URL in the AZURE_CONSENT_URL obtained in the previous step and accept the permissions request.
    6. Grant Snowflake access to the container created in Step 2:
      Navigate to Azure Services » Storage Accounts and select the storage account created in Step 1.
      Add role: Navigate to Access Control (IAM) » Add Role Assignment and select either Storage Blob Data Reader (read access) or Storage Blob Data Contributor (read and write access).
      Assign access: Add Service Principal as the type of security principal to assign the role to, and search for the AZURE_MULTI_TENANT_APP_NAME obtained in Step 4.
    7. Grant integration access to the role in Snowflake:
    GRANT USAGE ON INTEGRATION <integration_name> TO ROLE <sf_role>;
    <integration_name> is the integration created in Step 3. <sf_role> is the Snowflake role you want to grant access to.
If you have Google Cloud Platform (GCP) as your cloud provider and want to leverage Google Cloud Storage as your object storage, you will need to follow a few more steps to configure your Snowflake destination with a Snowflake storage integration.
Detailed instructions can be found in the Snowflake documentation.
Configuring the Snowflake integration with Google Cloud Platform:
    1. Create a cloud storage integration in Snowflake:
    CREATE STORAGE INTEGRATION <integration_name>
    TYPE = EXTERNAL_STAGE
    STORAGE_PROVIDER = GCS
    ENABLED = TRUE
    STORAGE_ALLOWED_LOCATIONS = ('gcs://<bucket>/<path>/', 'gcs://<bucket>/<path>/');
      <integration_name> is the name of the new integration being created.
      <bucket> is the name of the Cloud Storage bucket that you created above.
      <path> is an optional path that can be used to provide granular control over objects in the bucket.
    2. Retrieve the Cloud Storage service account for your Snowflake account. The following DESC command retrieves the ID of the Cloud Storage service account that was created for your Snowflake account:
    DESC STORAGE INTEGRATION <integration_name>;
      <integration_name> is the name of the integration you specified in Step 1.
      The output is a table with a property called STORAGE_GCP_SERVICE_ACCOUNT. Retrieve that property's value.
      The value has the format <service_account_id>@<unique_string>.iam.gserviceaccount.com.
    3. Grant the service account permissions to access the bucket objects.
      Create a custom IAM role that has the permissions required to access the bucket and get objects:
        1. Log into the Google Cloud Platform Console as a project editor.
        2. From the home dashboard, choose IAM & admin » Roles.
        3. Click Create Role.
        4. Enter a name and description for the custom role.
        5. Click Add Permissions.
        6. Filter the list of permissions, and add the following:
        storage.buckets.get
        storage.objects.get
        storage.objects.list
        storage.objects.create
        7. Click Create.
    4. Assign the custom role to the Cloud Storage service account:
      Log into the Google Cloud Platform Console as a project editor.
      From the home dashboard, choose Cloud Storage » Browser.
      Select the checkbox of the bucket you would like to configure for access.
      Click SHOW INFO PANEL in the upper-right corner. The information panel for the bucket will slide out from the right.
      In the Add members field, search for the service account name from the DESC STORAGE INTEGRATION output in Step 2.
      From the Select a role dropdown, select Storage » Custom » <role>, where <role> is the custom Cloud Storage role you created in Step 3.
      Click the Add button. The service account name is added to the selected role in the information panel.
    5. Grant usage on the integration to the role you created for RudderStack:
GRANT USAGE ON INTEGRATION <integration_name> TO ROLE "RUDDER";
    "RUDDER" is the name of the role that you created above.
    <integration_name> is the name of the integration that you created in Step 1.

Setting Up the Network Access

You will need to whitelist the following RudderStack IPs to enable network access:
3.216.35.97, 34.198.90.241, 54.147.40.62, 23.20.96.9, and 18.214.35.254.
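If your Snowflake account restricts inbound connections with a network policy, these IPs can be allowed via SQL. This is a sketch: the policy name below is arbitrary, and attaching the policy to the RudderStack user restricts that user to these IPs only.

```sql
-- Sketch: allow the RudderStack IPs via a Snowflake network policy.
-- "rudderstack_policy" is an arbitrary name; "RUDDER_USER" is the user
-- created for RudderStack earlier in this guide.
CREATE NETWORK POLICY rudderstack_policy
  ALLOWED_IP_LIST = ('3.216.35.97', '34.198.90.241', '54.147.40.62',
                     '23.20.96.9', '18.214.35.254');
ALTER USER "RUDDER_USER" SET NETWORK_POLICY = 'rudderstack_policy';
```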

Contact Us

If you come across any issues while configuring Snowflake with RudderStack, please feel free to contact us or start a conversation on our Slack channel. We will be happy to help you.