![architecture][architecture-image]
The Enrich process takes raw Snowplow events logged by a [Collector][collectors], cleans them up, enriches them and puts them into [Storage][storage].
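As a rough illustration of the clean-then-enrich flow described above, here is a minimal Scala sketch. The event shape (`RawEvent`, `EnrichedEvent`), the field names, and the single validation rule are all simplified assumptions for illustration, not the actual Snowplow enriched-event model; failed events going to a `Left` mirrors the "bad bucket" idea only loosely.

```scala
// Hypothetical, simplified event shapes -- not the real Snowplow event model.
case class RawEvent(fields: Map[String, String])
case class EnrichedEvent(fields: Map[String, String])

object Enrich {
  // Validate a raw event and add a derived field.
  // Left carries an error message (conceptually, a "bad bucket" row);
  // Right carries the enriched event ready for storage.
  def enrich(raw: RawEvent): Either[String, EnrichedEvent] =
    raw.fields.get("collector_tstamp") match {
      case None => Left("missing collector_tstamp")
      case Some(ts) =>
        // A real enrichment would derive many fields (geo, user agent, ...);
        // here we just copy the timestamp through as a stand-in.
        val derived = raw.fields + ("derived_tstamp" -> ts)
        Right(EnrichedEvent(derived))
    }
}
```

In the real pipeline this per-event logic lives in scala-common-enrich and is driven either by a Hadoop job or by a Kinesis application, as the table below shows.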
| ETL | Description | Status |
|-----|-------------|--------|
| [scala-hadoop-enrich][e1] (1) | The Snowplow Enrichment process built using Scalding for Apache Hadoop | Production-ready |
| [scala-hadoop-shred][e2] (2) | The Snowplow Shred process for shredding JSONs for loading into Redshift | Beta |
| [hadoop-event-recovery][e3] | A Hadoop job for recovering raw Snowplow events from the "bad bucket" | Production-ready |
| [stream-enrich][e4] (3) | The Snowplow Enrichment process built as an Amazon Kinesis application | Production-ready |
| [scala-common-enrich][e5] | A shared library for processing raw Snowplow events, used in (1) and (3) | Production-ready |
| [emr-etl-runner][e6] | A Ruby app for running (1) and (2) on Amazon Elastic MapReduce | Production-ready |
| Technical Docs | Setup Guide | Roadmap & Contributing |
|----------------|-------------|------------------------|
| ![i1][techdocs-image] | ![i2][setup-image] | ![i3][roadmap-image] |
| [Technical Docs][techdocs] | [Setup Guide][setup] | coming soon |