This guide defines the RudderStack-related terms that you are likely to encounter throughout our documentation and while using RudderStack.
A Customer Data Platform (CDP) is software, or a collection of tools, that unifies and persists customer records from multiple data sources in a centralized location accessible to other tools and platforms. A CDP lets you build a comprehensive customer profile and use the resulting insights for a variety of use cases.
The ELT (Extract, Load, Transform) process can be defined as:
Extract: Obtaining data from the source platform or application.
Load: Replicating the data from the source into the target system, typically a data warehouse or a data lake.
Transform: Converting the data into the desired format according to the business requirement or use case.
The ETL (Extract, Transform, Load) process can be defined as:
Extract: Collecting/pulling data from a variety of sources.
Transform: Refining and transforming the data as per the required data quality and business requirements.
Load: Pushing the data into a consolidated data store (for example, a data warehouse).
On the other hand, Reverse ETL is the process of routing the enriched data residing in your data warehouse to various downstream tools within your customer data stack. This includes various SaaS marketing, analytics, sales, and customer support tools.
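The difference between ELT and ETL described above comes down to where the transform step runs. The following is a minimal sketch of that ordering only; the `extract`, `transform`, `run_etl`, and `run_elt` functions and the in-memory list standing in for a warehouse are hypothetical and not part of RudderStack.

```python
Record = dict


def extract() -> list[Record]:
    # Hypothetical source rows; a real pipeline would call a source API.
    return [{"email": " Alice@Example.com "}, {"email": "bob@example.com"}]


def transform(rows: list[Record]) -> list[Record]:
    # Example business rule: normalize email addresses.
    return [{**row, "email": row["email"].strip().lower()} for row in rows]


def run_etl(warehouse: list[Record]) -> None:
    # ETL: transform in the pipeline, then load the refined rows.
    warehouse.extend(transform(extract()))


def run_elt(warehouse: list[Record]) -> None:
    # ELT: load the raw rows first, then transform them inside the target.
    warehouse.extend(extract())
    warehouse[:] = transform(warehouse)
```

Both functions end with the same rows in the target. What differs is whether the raw data ever lands in the warehouse before being refined.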
The Control Plane manages the configuration of your sources and destinations. The interface for the control plane is the RudderStack web app.
The Data Plane is RudderStack's core engine responsible for:
Receiving and buffering the event data
Transforming the events into the required destination format
Relaying the events to the destination
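The three responsibilities listed above can be pictured as stages of a small pipeline. This is a toy sketch, not RudderStack's implementation: the `ToyDataPlane` class, its in-memory buffer, and the destination format are all assumptions made for illustration.

```python
from collections import deque


class ToyDataPlane:
    """Illustrative only: buffers events, reshapes them, and relays them."""

    def __init__(self, send):
        self.buffer = deque()  # in-memory stand-in for a durable buffer
        self.send = send       # callable that delivers a payload to a destination

    def receive(self, event: dict) -> None:
        # Stage 1: receive and buffer the event.
        self.buffer.append(event)

    def to_destination_format(self, event: dict) -> dict:
        # Stage 2: hypothetical destination schema with flat keys.
        return {
            "name": event["event"],
            "user": event["userId"],
            **event.get("properties", {}),
        }

    def flush(self) -> int:
        # Stage 3: relay every buffered event and report how many were sent.
        sent = 0
        while self.buffer:
            self.send(self.to_destination_format(self.buffer.popleft()))
            sent += 1
        return sent
```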
RudderStack's Control Plane offers an intuitive UI to configure your event data sources and destinations.
If you want to self-host these configurations, you can use the open-source Control Plane Lite utility to set up your Control Plane. You can then manage the source and destination configurations locally by exporting to or importing them from a JSON file.
RudderStack's Event Stream feature lets you collect your event data from all of your web and mobile apps and route it to a wide array of customer tools and data warehouses via RudderStack.
Cloud Extract is RudderStack's ETL feature that lets you collect your raw events and data from various third-party cloud platforms, such as Google Analytics, Marketo, Facebook Ads, and Stripe, and send it to your data warehouse at a user-specified frequency.
Warehouse Actions is RudderStack's Reverse ETL feature. It allows you to leverage the already processed customer data residing in your data warehouse and route this enriched information to your desired destinations, including SaaS marketing, analytics, sales, and customer support tools.
RudderStack [**Transformations**](adding-a-new-user-transformation-in-rudderstack/) transform your source events into a destination-specific format. These transformations can be used across your Event Stream, Warehouse Actions, and Cloud Extract pipelines.
There are two key aspects of this feature:
RudderStack Transformer: This is a RudderStack service that transforms your incoming events into a destination-specific format.
User Transformations: These are custom functions that you write to modify your events in flight. Some common use cases include:
Filtering or sampling events
Implementing static logic to enrich your events
Removing sensitive PII from your customer events
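The use cases listed above (filtering, sampling, static enrichment, and PII removal) could look roughly like the following. This is a sketch under stated assumptions, not RudderStack's transformation API: the `transform_event` function name and signature, the `PII_KEYS` set, the `Heartbeat` event name, and the `pipeline` property are all hypothetical.

```python
import zlib
from typing import Optional

PII_KEYS = {"email", "phone", "ssn"}  # assumed list of sensitive property names


def transform_event(event: dict, sample_percent: int = 100) -> Optional[dict]:
    """Return the modified event, or None to drop it."""
    # Filtering: drop a noisy event type by name.
    if event.get("event") == "Heartbeat":
        return None

    # Sampling: keep a stable share of users by hashing the user ID into 0..99.
    bucket = zlib.crc32(str(event.get("userId", "")).encode()) % 100
    if bucket >= sample_percent:
        return None

    # PII removal: copy the properties without the sensitive keys.
    properties = {
        key: value
        for key, value in event.get("properties", {}).items()
        if key not in PII_KEYS
    }

    # Static enrichment: attach a fixed label to every event.
    properties["pipeline"] = "web"

    return {**event, "properties": properties}
```

Hashing the user ID rather than drawing a random number keeps the sampling decision consistent for a given user across events.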
RudderStack's Live Events is a debugger that shows the live events collected from your sources and sent to the connected destinations in real time. With this feature, you can debug errors in failing events at the destination level and reduce your troubleshooting time and effort.
Broadly speaking, this feature can be further classified into two categories:
Source Live Events: This feature gives you real-time visibility into the source events collected by RudderStack. This way, you can confirm if your source is configured correctly and is collecting and sending data as expected.
Destination Live Events: When routing events to a destination, sometimes events don't show up in your destination. This feature gives you real-time visibility into the destination's responses and helps you troubleshoot the problem.
RudderStack's Data Governance feature gives you the ability to access all your events and their metadata programmatically and identify any inconsistencies in them. This includes vital information related to the event schema, event payload versions, data types, and more.
User Suppression is RudderStack's enterprise feature that lets you programmatically suppress user data identified by a user ID. With this feature, you can block all the user data for all the sources and destinations in RudderStack.
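Conceptually, suppression acts as a filter keyed on the user ID. The sketch below illustrates only that idea; the `suppress` function and the event shape are hypothetical and do not represent the User Suppression API.

```python
def suppress(events: list[dict], suppressed_user_ids: set[str]) -> list[dict]:
    # Keep only events whose user ID is not on the suppression list.
    return [e for e in events if e.get("userId") not in suppressed_user_ids]
```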
A connection is a one-to-one directional flow of events between a RudderStack source and a destination.
You can set up different types of connections in RudderStack to send your events, based on the type of source:
Event Stream: One source to many destinations
Cloud Extract: Multiple sources to one warehouse destination
Warehouse Actions: One warehouse source to one downstream destination (mainly due to the mappings required when setting up the connection).
You can send the event data from your website/mobile app to your desired destination via RudderStack in two ways:
Cloud Mode: In this mode, the RudderStack SDKs track and send the event data directly to the RudderStack server for processing. RudderStack processes this data and routes it to the desired destination. This mode is useful when you want to transform your source events into a destination-specific format.
Device Mode: In this mode, you can send the source events to your destination using the client-specific libraries on your website/mobile app. These libraries allow RudderStack to use the data you collect on your device to call the destination APIs without sending it to the RudderStack server first. This mode is useful when you want to send the events to a destination as-is, without any transformation.
These two modes are commonly referred to as RudderStack Connection Modes.
A source is a platform or an application (web, mobile, server-side, or a third-party cloud app) from where RudderStack tracks and collects your event data. We highly recommend creating a source for every unique data source that you want to track.
A destination is a tool or application where you want to send the data via RudderStack.
RudderStack supports sending events to your data warehouse. These are called the warehouse destinations.
Currently, RudderStack supports integration with all the leading data warehouses like Amazon Redshift, Azure Synapse, Google BigQuery, Snowflake, PostgreSQL, ClickHouse, and Microsoft SQL Server.
The RudderStack API Spec helps you plan your event data and provides various options for tracking your events across all the RudderStack SDKs and APIs. As RudderStack has a unified event semantic for different destination platforms, you can easily translate your event data to different downstream tools by following this spec.
When sending your events to a data warehouse via RudderStack, you need not define a schema for the event data before sending it from your source. Instead, RudderStack automatically does that for you by following a predefined warehouse schema. This schema defines the different tables and columns created based on different events.
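To illustrate how columns can be derived from event data without a predefined schema, the sketch below flattens nested properties into column names. The naming rule used here (snake_case, with nested keys joined by underscores) is an assumption made for illustration; refer to the warehouse schema documentation for the actual tables and columns RudderStack creates.

```python
import re


def to_column(name: str) -> str:
    # Assumed rule: camelCase and other separators become lowercase snake_case.
    with_breaks = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return re.sub(r"[^a-z0-9]+", "_", with_breaks.lower()).strip("_")


def flatten(properties: dict, prefix: str = "") -> dict:
    # Walk the nested properties and emit one column per leaf value.
    columns = {}
    for key, value in properties.items():
        column = prefix + to_column(key)
        if isinstance(value, dict):
            columns.update(flatten(value, column + "_"))
        else:
            columns[column] = value
    return columns
```

For example, `flatten({"orderId": 1, "cart": {"itemCount": 2}})` yields the columns `order_id` and `cart_item_count`.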
The workspace token is a unique identifier of your RudderStack workspace. You can find it by logging in to the RudderStack web app.
The write key is a unique identifier for your source. It is used while sending events from that source to your specified destination via RudderStack.
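As a sketch of where the write key fits, the snippet below builds (but does not send) an HTTP request for a track event. It assumes the HTTP API accepts the write key as the Basic Auth username with an empty password and exposes a `/v1/track` endpoint; verify both against the HTTP API documentation. The `build_track_request` helper and the data plane URL are hypothetical.

```python
import base64
import json
import urllib.request


def build_track_request(
    data_plane_url: str, write_key: str, user_id: str, event: str
) -> urllib.request.Request:
    # Assumption: the write key is the Basic Auth username, password left empty.
    token = base64.b64encode(f"{write_key}:".encode()).decode()
    body = json.dumps({"userId": user_id, "event": event}).encode()
    return urllib.request.Request(
        url=f"{data_plane_url.rstrip('/')}/v1/track",  # assumed endpoint path
        data=body,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the event. Because the write key identifies the source, treat it as configuration rather than hard-coding it.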