Configure Stream Enrich
Basic configuration
Template
Template configuration for Kafka can be found here.
Monitoring
You can also now include Snowplow Monitoring in the application. This is set up through an optional section at the bottom of the config. You will need to amend:
- monitoring.snowplow.collectorUri: insert your Snowplow collector URI here.
- monitoring.snowplow.appId: the app-id used in decorating the events sent.
If you do not wish to include Snowplow Monitoring, please remove the entire monitoring section from the config.
Resolver configuration
You will also need a JSON configuration for the Iglu resolver used to look up JSON schemas. A sample configuration is available here.
Enrichments configuration
You may wish to use Snowplow's configurable enrichments. To do this, create a directory of enrichment JSONs. For each configurable enrichment you wish to use, the enrichments directory should contain a .json file with a configuration JSON for that enrichment. When you come to run Stream Enrich you can then pass in the path to this directory using the --enrichments parameter.
Example configurations can be found in the GitHub repository.
See the documentation on configuring enrichments for details on the available enrichments.
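Before passing a directory to --enrichments, it can help to confirm that every .json file in it at least parses. The following is a minimal sketch in Python, not part of Stream Enrich itself; the function name check_enrichments_dir is hypothetical, and it only checks JSON syntax, not whether the content is a valid enrichment configuration.

```python
import json
from pathlib import Path


def check_enrichments_dir(directory):
    """Map each .json file name in `directory` to None (parses) or an error message.

    Hypothetical helper: it checks JSON syntax only and does not validate
    the content against any enrichment schema.
    """
    results = {}
    for path in sorted(Path(directory).glob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
            results[path.name] = None
        except json.JSONDecodeError as exc:
            results[path.name] = str(exc)
    return results
```

Files without a .json extension are ignored, which mirrors the description above of one .json file per enrichment.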
Configuration in DynamoDB
When using Stream Enrich with Kinesis, it’s possible to store the configuration of the resolver and/or enrichments in DynamoDB. In this case, the dynamodb: prefix needs to be used in place of the file: prefix:
--resolver dynamodb:eu-west-1/configuration_table/resolver \
--enrichments dynamodb:eu-west-1/configuration_table/enrichment_
In this case it’s assumed that the enrichments and resolver are stored in a table named configuration_table in eu-west-1, that the key for that table is id, that the resolver JSON is stored in an item whose key has the value resolver, and that the enrichments are stored in items whose keys have values beginning with enrichment_.
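To make the structure of these arguments explicit, here is a small illustrative sketch in Python that splits a dynamodb: value into its region, table name, and key (or key prefix). It is an assumption about how the three segments are laid out, based only on the example above; it is not Stream Enrich's actual parsing code, and the names DynamoDbLocation and parse_dynamodb_arg are hypothetical.

```python
from typing import NamedTuple


class DynamoDbLocation(NamedTuple):
    region: str
    table: str
    key: str  # item key for the resolver, or key prefix for enrichments


def parse_dynamodb_arg(value):
    """Split 'dynamodb:<region>/<table>/<key>' into its three parts.

    Illustrative only: the layout is inferred from the example arguments.
    """
    prefix = "dynamodb:"
    if not value.startswith(prefix):
        raise ValueError("expected a value starting with 'dynamodb:'")
    parts = value[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected dynamodb:<region>/<table>/<key>")
    return DynamoDbLocation(*parts)
```

With the example values, the resolver argument would yield region eu-west-1, table configuration_table, and key resolver, while the enrichments argument would yield the key prefix enrichment_.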
In the example above, configuration_table is a table with two columns: id and json. There must be one row with resolver as its id and the resolver configuration in the json column.
enrichment_ is the prefix used in the id column to configure an enrichment; the configuration of that enrichment must be put in the json column. Here is the list of all the enrichments (with the enrichment_ prefix) as they appear in the id column:
- enrichment_api_request_enrichment_config
- enrichment_http_header_extractor_config
- enrichment_iab_spiders_and_robots_enrichment
- enrichment_pii_enrichment_config
- enrichment_sql_query_enrichment_config
- enrichment_weather_enrichment_config
- enrichment_yauaa_enrichment_config
- enrichment_anon_ip
- enrichment_campaign_attribution
- enrichment_cookie_extractor_config
- enrichment_currency_conversion_config
- enrichment_event_fingerprint_config
- enrichment_ip_lookups
- enrichment_javascript_script_config
- enrichment_referer_parser
- enrichment_ua_parser_config
- enrichment_user_agent_utils_config
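The table layout described above (an id column and a json column) can be populated from local configuration files. The sketch below is one possible way to do that in Python, under several assumptions: the helper names build_config_items and upload_items are hypothetical, each enrichment id is derived as enrichment_ plus the file name without its extension (so the local files would need to be named to match the ids listed above), and the upload step relies on the third-party boto3 library with AWS credentials already configured.

```python
import json
from pathlib import Path


def build_config_items(resolver_path, enrichments_dir, prefix="enrichment_"):
    """Build one {'id', 'json'} item for the resolver and one per enrichment file."""
    resolver = json.loads(Path(resolver_path).read_text(encoding="utf-8"))
    items = [{"id": "resolver", "json": json.dumps(resolver)}]
    for path in sorted(Path(enrichments_dir).glob("*.json")):
        config = json.loads(path.read_text(encoding="utf-8"))
        # Assumption: the id is the prefix plus the file name without ".json".
        items.append({"id": prefix + path.stem, "json": json.dumps(config)})
    return items


def upload_items(items, region, table_name):
    """Write the items to DynamoDB (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so building items works without it

    table = boto3.resource("dynamodb", region_name=region).Table(table_name)
    for item in items:
        table.put_item(Item=item)
```

For the example in this section, the call would be upload_items(items, "eu-west-1", "configuration_table"). Re-serializing each file with json.dumps means only syntactically valid JSON is stored.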