Syncing Events

Step 1 - Set the Query

After connecting to your warehouse, you can start syncing Events by configuring the query that Vitally will run against your warehouse on each incremental sync.

To sync Events, the query must select or alias all three of the following column names:

  • event: the name or type of event

  • timestamp: the time the event was generated

  • message_id: a unique ID for the event

And at least one of:

  • user_id: the ID of the user that 'owns' the event

  • account_id: the ID of the account that 'owns' the event

  • organization_id: the ID of the organization that 'owns' the event (only available if you have organization hierarchy enabled)

Incremental Sync

Incremental sync is a process of syncing warehouse data that has only been added or modified since the last sync cycle instead of processing the entire dataset each time. This approach is more efficient as it minimizes the amount of data queried and processed during each sync, making the update process quicker and resource-efficient.

Incremental sync is required for syncing Events

In order to incrementally query your data each sync, you'll need to add a clause to your query with the variable $LAST_SYNC_TIMESTAMP. We'll fill in that variable with the timestamp of the last sync each time we query the warehouse.

Incremental syncs should compare the $LAST_SYNC_TIMESTAMP with a timestamp that reflects when the data was synced to the data warehouse. This is particularly important when data is synced to the warehouse in batches.

SELECT
  'pageView' AS event,
  user_id AS user_id,
  created_at AS timestamp,
  id AS message_id,
  page,
  application
FROM page_views
WHERE created_at > $LAST_SYNC_TIMESTAMP

Adding Another Event Source

Events can be configured to sync from multiple queries in different tables. In order to sync an additional event query, select the "Add another event source" button at the bottom of the query list:

Step 2 - Configure Properties

In order to pull in additional metadata associated with the event, simply include those columns in your SELECT statement, and they will be added as event properties.

For the example query above, the following event properties will be synced:

{ "page": "...", "application": "..." }

What if my properties are all in one column?

If your event properties are contained in a single nested or JSON column, and you've selected it:

SELECT
  event_name AS event,
  user_id AS user_id,
  created_at AS timestamp,
  id AS message_id,
  meta_data
FROM all_events
WHERE created_at > $LAST_SYNC_TIMESTAMP

And the responses looks like:

{
  "event": "pageView",
  "user_id": "2dc30ead-2e64-4d87-8395-aad4be299698",
  "timestamp": "2019-11-15T16:08:59.915Z",
  "message_id": "7ca60dbd-0ad0-4ef6-b049-aac3fb5e0116",
  "meta_data": {
    "page": "home-page",
    "duration_ms": 3253
  },
}

Then you will want to set the optional "property field" to meta_data in order to track those values.

Step 3 - Continue Configuration

If you have other features to sync continue configuring those.

pageConfiguring Sync

OR Step 3 - Enable Sync

pageEnabling Sync

FAQ

Q: Why are some events missing in Vitally but correctly appear in my data warehouse?

A: This usually happens due to timestamp discrepancies in the incremental sync. Events are only synced to Vitally if their timestamp is > $LAST_SYNC_TIMESTAMP. If an event's timestamp doesn't meet this criteria, it won't appear in Vitally, even though it's present in your data warehouse. To resolve this, you'll need to ensure the timestamp you've selected in your WHERE clause above reflects the actual time the event happened and not when the event was loaded (i.e. if your warehouse only loads events every 24 hours)

Last updated