Frequently asked questions about RudderStack, including the platform and its features and functionality

This document aims to address the common as well as not-so-common queries and issues you might encounter while using the RudderStack platform.

If you come across any issue that is not listed in this document, please feel free to reach out to us. You can also start a conversation on our Discord channel, and we will be happy to help you.

RudderStack Server

This section contains some commonly-asked questions about the RudderStack Server.

RudderStack Architecture

I created an account with RudderStack but did not find any server-side sources, only client-side sources. Is this in the roadmap?

Yes, you are right. We don't explicitly list any server-side sources, but all sources are pretty much the same. When you create a source, you get a writeKey, which you can use to send events to the RudderStack Data Plane. The writeKey is what identifies the source, regardless of the source's name (iOS, Android, or JavaScript), and it can also be used by the server-side SDKs. We kept the sources separate mainly to be consistent with Segment.
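As a rough illustration of how a server-side caller might use the writeKey, the sketch below builds (but does not send) a track request. The /v1/track path and HTTP Basic authentication with the writeKey as the username are assumptions based on RudderStack's Segment-compatible HTTP API; the data plane URL, the writeKey value, and the buildTrackRequest helper are hypothetical.

```javascript
// Sketch only: builds the pieces of an HTTP request for a server-side event.
// Assumptions: the data plane accepts POST /v1/track and authenticates with
// HTTP Basic auth, using the writeKey as the username and an empty password.
function buildTrackRequest(dataPlaneUrl, writeKey, event) {
  // Basic auth credentials are "username:password", base64-encoded.
  const credentials = Buffer.from(`${writeKey}:`).toString("base64");
  return {
    url: `${dataPlaneUrl.replace(/\/+$/, "")}/v1/track`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Basic ${credentials}`,
    },
    body: JSON.stringify(event),
  };
}

// Hypothetical values for illustration.
const request = buildTrackRequest("https://dataplane.example.com/", "my-write-key", {
  userId: "user-1",
  event: "Order Completed",
  properties: { total: 42 },
});
console.log(request.method, request.url);
```

The returned object can be handed to any HTTP client. The same writeKey applies no matter which source type it was created under.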

Upgrading RudderStack

The Helm deployment repo is on v0.1.3. Should I be upgrading the server to v0.1.6 or even master?

We recommend using well-tested GitHub releases. Upgrading the RudderStack Server to v0.1.6 should not be a problem.

RudderStack Transformations

This section aims to address the commonly asked questions about RudderStack's Transformation feature.

What is a RudderStack Transformation?

The RudderStack transformation module converts the received event data into a suitable destination-specific format. You can also write your own user-specific transformation function in JavaScript to perform tasks such as aggregation and sampling of your event data before routing it to the destination platform.
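A minimal sketch of such a user transformation follows. It assumes the function is named transform and that it receives and returns an array of events, consistent with the batching described in this document. The event fields (type, properties.revenue), the batchRevenue property, and the sampling rule are illustrative assumptions, not part of any RudderStack contract.

```javascript
// Sketch only: a user transformation that samples and aggregates events.
// Assumption: the function receives an array of events and returns the
// array that should be forwarded to the destination.
const SAMPLE_EVERY = 2; // hypothetical: keep 1 in 2 page events

function transform(events) {
  let pageSeen = 0;
  let totalRevenue = 0;
  const out = [];
  for (const ev of events) {
    if (ev.type === "page") {
      // Sampling: forward only every SAMPLE_EVERY-th page event.
      if (pageSeen++ % SAMPLE_EVERY !== 0) continue;
    }
    // Aggregation: accumulate revenue across the forwarded events.
    const revenue = ev.properties && ev.properties.revenue;
    if (typeof revenue === "number") totalRevenue += revenue;
    out.push(ev);
  }
  // Attach the batch total to a copy of the last forwarded event.
  if (out.length > 0) {
    const last = out[out.length - 1];
    out[out.length - 1] = {
      ...last,
      properties: { ...last.properties, batchRevenue: totalRevenue },
    };
  }
  return out;
}

const result = transform([
  { type: "page", name: "Home" },
  { type: "page", name: "Pricing" },
  { type: "track", event: "Order Completed", properties: { revenue: 10 } },
  { type: "page", name: "Docs" },
  { type: "track", event: "Order Completed", properties: { revenue: 5 } },
]);
console.log(result.length, result[result.length - 1].properties);
```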

How do transformations handle batching?

The transformation functions are given a list of events, but the events are also pushed out in real time. What's the logic behind that?

The batching is done on a per end-user level. All the events from a given end-user are batched and then sent to the transformation function. The batching process is controlled via the following three parameters in config.toml:

  • processSessions = false (set it to true to enable batching)

  • sessionThresholdEvents = 100

  • sessionInactivityThresholdInS = 120

With these values, events from an end-user are batched until 100 events have accumulated or 120 seconds have passed since that user's last event, whichever comes first. The resulting list is then passed to the transformation function.
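The batching rule can be modelled with a short sketch. This is not the server's actual implementation; the SessionBatcher class and its methods are hypothetical, and timestamps are passed in explicitly (in seconds) so the behaviour is deterministic. Only the two threshold names come from config.toml.

```javascript
// Sketch only: models per-user batching by event count or inactivity.
class SessionBatcher {
  constructor({ sessionThresholdEvents = 100, sessionInactivityThresholdInS = 120 } = {}) {
    this.maxEvents = sessionThresholdEvents;
    this.maxIdleS = sessionInactivityThresholdInS;
    this.sessions = new Map(); // userId -> { events, lastSeenS }
  }

  // Adds an event; returns the full batch if the count threshold is reached, else null.
  add(userId, event, nowS) {
    let session = this.sessions.get(userId);
    if (!session) {
      session = { events: [], lastSeenS: nowS };
      this.sessions.set(userId, session);
    }
    session.events.push(event);
    session.lastSeenS = nowS;
    if (session.events.length >= this.maxEvents) {
      this.sessions.delete(userId);
      return session.events;
    }
    return null;
  }

  // Returns the batches of every user who has been inactive for the threshold.
  flushInactive(nowS) {
    const batches = [];
    for (const [userId, session] of this.sessions) {
      if (nowS - session.lastSeenS >= this.maxIdleS) {
        batches.push(session.events);
        this.sessions.delete(userId);
      }
    }
    return batches;
  }
}

// A small threshold of 3 events keeps the example short.
const batcher = new SessionBatcher({ sessionThresholdEvents: 3, sessionInactivityThresholdInS: 120 });
batcher.add("user-a", { event: "e1" }, 0);
batcher.add("user-b", { event: "x1" }, 10);
batcher.add("user-a", { event: "e2" }, 20);
const full = batcher.add("user-a", { event: "e3" }, 30); // count threshold reached for user-a
const idle = batcher.flushInactive(130); // user-b has been idle for 120 seconds
console.log(full.length, idle.length);
```

Each returned batch holds events from a single end-user, which is the list a transformation function would receive.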

How do transformations affect the pipeline? If RudderStack is connected to the database and transformation is used to enrich the data, will it slow everything down?

Transformation functions are called in parallel, so ideally a transformation should not slow the system down. However, if your transformation is particularly slow, you can increase the number of transformation workers by raising numTransformWorker.

Are there configurations that control how many events are grouped together so that we can minimize the number of database calls?

The number of user events that are batched together can be configured with sessionThresholdEvents and sessionInactivityThresholdInS. The higher these values, the longer events are held and grouped into a session. Note that raising them increases the memory footprint proportionally.
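To get a feel for the memory trade-off, a back-of-the-envelope estimate can help. The sketch below assumes the worst case in which every concurrently active user has a full session buffered, and that an average event size is known. Both are simplifying assumptions for illustration, not measurements of the server, and the estimateSessionBufferBytes helper and the input figures are hypothetical.

```javascript
// Sketch only: rough upper bound on memory held by buffered sessions.
// Assumes every active user has a full session of average-sized events.
function estimateSessionBufferBytes(activeUsers, sessionThresholdEvents, avgEventBytes) {
  return activeUsers * sessionThresholdEvents * avgEventBytes;
}

// Hypothetical workload: 10,000 active users, 1 KiB per event.
const baseline = estimateSessionBufferBytes(10000, 100, 1024);
const doubled = estimateSessionBufferBytes(10000, 200, 1024);
console.log((baseline / (1024 * 1024)).toFixed(1) + " MiB");
console.log((doubled / (1024 * 1024)).toFixed(1) + " MiB");
```

Under these assumptions, doubling sessionThresholdEvents doubles the estimate, which is the proportional growth described above.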

Can we share connections across transformations?

Each execution of a transformation happens in a sandboxed V8 isolate. We do not support sharing data or connections across executions.