Overview
Statsig requires certain data schema in order for proper processing. We support 3 different types of datasets to be ingested into our platform:- Custom Events
- Precomputed Metrics
- Exposure Events
Custom Events
Events that are emitted by your application to measure the ongoing impact of your features and experiments.Required
Column | Description | Format/Rules |
---|---|---|
timestamp | The unix time your event was logged at | BIGINT. please cast timestamps into epoch time in seconds |
event_name | The name of the event | STRING/VARCHAR. Not null. Length < 128 characters |
unit_id | Unique unit identifier | STRING/VARCHAR. User ID, Stable ID, etc. The same event row can have multiple IDs |
Optional
Column | Description | Rules |
---|---|---|
event_value | The value of the event | STRING/VARCHAR. Length < 128 characters. Statsig will detect numeric values |
event_metadata | Metadata columns about the event | (MANY) STRING:STRING. Statsig will generate a metadata json field from however many metadata columns you provide |
metadata_json | Metadata json about the event | JSON STRING. Statsig will unpack this json 1 level deep. Nested values will be stored as strings |
An example dataset for events might look like this:
unit_id | visit_id | event | timestamp | value | metadata_blob | user_type |
---|---|---|---|---|---|---|
331444 | click | 1676484875 | {"click_target": "exit_details_button"} | power_user | ||
331444 | click | 1676484860 | {"click_target": "open_details_button"} | power_user | ||
265113 | click | 1676484333 | {"click_target": "button", "button_color": "green"} | churn_risk_user | ||
445332 | aeeer43d | visit | 1676483821 | {"page": "landing_page"} | new_user | |
224448 | checkout | 1676482222 | 33.22 | {"product_id": "11eefj", "product_category": "clothing"} | power_user |
- One user can send multiple of the same event, with or without any changes in metadata. Statsig will aggregate these together.
- You can send metadata in both of a json-formatted (only one-level deep) string, and/or pull in fields from columns. You can use metadata and values to generate custom metrics in the console, like sum(value) where “product_category”=“clothing”.
- You can send multiple IDs on a single event. For example, the visit above would could for both user and visit level metrics/experiments. During the mapping flow you tell us which unit types your different IDs correspond to in statsig.
Precomputed Metrics
Precomputed metrics are a powerful way to leverage statsig for experiment results. Use these to send complex metrics and metrics that require delays due to attribution windows or long baking periods. Precomputed metrics in statsig are expected to be calculated at a user-day granularity.Required
Column | Description | Format/Rules |
---|---|---|
unit_id | The unique user identifier this metric is for. This might not necessarily be a user_id - it could be a custom_id of some kind | STRING |
id_type | The id_type the unit_id represents. | STRING. A valid ID type from your project |
date | Date of the daily metric | DATE or ISOFORMATTED STRING. The date of the metric value for the unit_id provided |
metric_name | The name of the metric | STRING (Not null). Length < 128 characters |
metric_value | A numeric value for the metric | DOUBLE/NUMERIC. Metric value OR Both of numerator/denominator need to be provided for Statsig to process the metric. |
numerator | Numerator for metric calculation | DOUBLE/NUMERIC. If present along with a denominator in any record, the metric will be treated as ratio and only calculated for users with non-null denominators. |
denominator | Denominator for metric calculation | DOUBLE/NUMERIC. If present along with a numerator in any record, the metric will be treated as ratio and only calculated for users with non-null denominators. |
An example dataset for metrics might look like this:
unit_id | unit_type | date | metric_name | metric_value | numerator | denominator |
---|---|---|---|---|---|---|
331444 | user_id | 2023-02-13 | clicks | 2 | ||
aeeer43d | visit_id | 2023-02-13 | visits | 1 | ||
224448 | user_id | 2023-02-13 | checkout_rate | 2 | 15 | |
224448 | user_id | 2023-02-13 | clothing_checkout_value | 33.22 |
- In this dataset, unit types are in different rows from each other
- Metrics can either have a value or a numerator/denominator pair. We will calculate any metric with numerator/denominator pair as a ratio metric. Ratio takes priority over value; if you provide all 3 fields, we will assume it is a ratio metric.
- For users with null values, we will infer 0 for metric_value, and exclude null value users for ratio metrics.
Exposure Events
Exposure event import is deprecated. If this is an important use case, see Statsig Warehouse Native, available to Enterprise Customers
Required
Column | Description | Format/Rules |
---|---|---|
timestamp | The unix time your event was logged at | BIGINT. please cast timestamps into timezoneless unix time |
experiment | Your experiment identifier | STRING/VARCHAR. Not null. Length < 128 characters |
group_id | Unique identifier for the experiment groups | STRING/VARCHAR. Not null. |
unit_id | Unique user identifier | STRING/VARCHAR. User ID, Stable ID, etc. The same event row can have multiple IDs |
Optional
Column | Description | Rules |
---|---|---|
metadata_json | Metadata json about the event | JSON STRING. Statsig will unpack this json 1 level deep. Nested values will be stored as strings |