How it works
One spec. Three execution modes. Zero drift.
Define
Write transforms in Python or use the expression DSL for simple features. Group related features, set source dependencies, and register via API. Your features are versioned, searchable, and reusable.
Compute
Features run as actors on Apache Pekko with Python UDFs and an expression DSL. The same logic executes in real-time, on streams, or in batch — no code duplication, no training/serving skew.
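The "define once, run anywhere" idea can be sketched in plain Python: one transform function, applied unchanged to a single request, a lazy stream, or a historical batch. Everything here (function names, the toy scoring logic) is illustrative, not the platform's actual API.

```python
from typing import Dict, Iterable, Iterator, List

def risk_score(record: Dict) -> Dict:
    """One transform definition, reused by every execution mode."""
    return {"risk_score": record["claims"] * 10 + record["age_of_policy"]}

# Request/response: score a single record as it arrives.
def serve_one(record: Dict) -> Dict:
    return risk_score(record)

# Stream: lazily score records as they flow in.
def serve_stream(records: Iterable[Dict]) -> Iterator[Dict]:
    return (risk_score(r) for r in records)

# Batch: score a historical dataset for training.
def serve_batch(records: List[Dict]) -> List[Dict]:
    return [risk_score(r) for r in records]
```

Because all three modes call the same `risk_score`, the features a model trains on are byte-for-byte the features it sees in production.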
Serve
Retrieve feature vectors via gRPC with sub-50ms latency. Pre-computed features are cached in Elasticsearch. Stream consumers keep features fresh as data arrives.
Three modes, one platform.
Whether you need features for a real-time API call, a streaming pipeline, or a historical backfill — the Enrichment Platform runs the same transformations across all three modes from a single definition.
Request / Response — Sub-second feature inference via gRPC for live decisions
Stream — Near real-time from Kafka, Kinesis, CDC, and HTTP with Elasticsearch persistence
Batch — Historical feature calculation for training sets and backfills
Capabilities
Everything you need to go from raw data to production features.
Feature specifications
- Declarative JSON-based feature definitions
- Feature grouping for logical organization
- Schema-validated with protobuf type safety
- Versioned, searchable, and reusable across pipelines
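A declarative feature-group spec in this style might look like the JSON below, built here as a Python dict. The field names (`entity`, `expr`, `udf`, and so on) are illustrative assumptions, not the platform's exact schema.

```python
import json

# Illustrative feature-group spec; field names are examples only.
feature_group = {
    "name": "policy_risk",
    "version": 1,
    "entity": "policy",
    "source": "claims_stream",
    "features": [
        {"name": "claim_count_90d", "type": "int", "expr": "count(claims, 90d)"},
        {"name": "avg_claim_amount", "type": "double", "udf": "avg_claim_amount"},
    ],
}

# Specs are plain JSON, so they serialize cleanly for registration via API.
spec_json = json.dumps(feature_group, indent=2)
```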
Transformation engine
- Python UDFs — write transforms in Python, executed in a sandbox
- Expression DSL for config-driven features (no code deploy)
- Batch UDFs return multiple features from a single function
- Pre-installed numpy, pandas, scikit-learn for ML scoring
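A batch UDF in this style is just a Python function that returns several features from one input. The shape below is a sketch using the pre-installed numpy and pandas; the platform's registration mechanics may differ.

```python
import numpy as np
import pandas as pd

def claim_features(claims: pd.DataFrame) -> dict:
    """Batch UDF sketch: derive several features from one claims frame."""
    amounts = claims["amount"].to_numpy()
    return {
        "claim_count": int(len(amounts)),
        "total_amount": float(np.sum(amounts)),
        "max_amount": float(np.max(amounts)) if len(amounts) else 0.0,
    }

claims = pd.DataFrame({"amount": [120.0, 80.0, 300.0]})
features = claim_features(claims)
```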
Feature store
- Elasticsearch-backed feature storage and retrieval
- Point-in-time feature lookups for training
- Entity-based and pipeline-based retrieval
- Automatic index rollover and retention
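Point-in-time correctness means a training lookup only sees the feature value that existed at the label's timestamp, never a later one. A minimal in-memory illustration of that semantic (not the store's actual implementation):

```python
import bisect
from typing import List, Optional, Tuple

def point_in_time(history: List[Tuple[int, float]], as_of: int) -> Optional[float]:
    """Return the latest feature value written at or before `as_of`.

    `history` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    timestamps = [ts for ts, _ in history]
    i = bisect.bisect_right(timestamps, as_of)
    return history[i - 1][1] if i else None

history = [(100, 0.2), (200, 0.5), (300, 0.9)]
```

Looking up at `as_of=250` returns the value written at 200, not the later one at 300 — which is exactly what prevents label leakage in training sets.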
Serving & monitoring
- gRPC API with sub-50ms response times
- Health checks and readiness probes
- OpenTelemetry tracing, Prometheus metrics, Grafana dashboards
- Horizontal autoscaling on Kubernetes
Built for SaaS platforms in regulated industries.
Most feature stores are designed for a single company's ML team. The Enrichment Platform is designed for SaaS companies that compute features per customer — with built-in tenant isolation, audit trails, and direct database connectivity.
Multi-tenant by design
Row-level security, per-tenant rate limiting, isolated feature computation, and cross-tenant sharing with access grants. Your customers each get their own isolated feature environment.
Compliance-ready audit trail
Every feature computation is stored in an event-sourced journal. Replay any entity's state at any point in time. Export audit packages for regulatory review. Prove what features existed for any decision.
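Event sourcing means the journal is the source of truth: an entity's state at any moment is reproduced by replaying events up to that moment. A toy replay, with an invented event shape rather than the platform's journal format:

```python
from typing import Dict, List

def replay(journal: List[Dict], up_to: int) -> Dict:
    """Rebuild an entity's feature state by replaying journal events
    with timestamp <= up_to; later events overwrite earlier values."""
    state: Dict = {}
    for event in journal:
        if event["ts"] <= up_to:
            state[event["feature"]] = event["value"]
    return state

journal = [
    {"ts": 1, "feature": "risk_score", "value": 0.3},
    {"ts": 2, "feature": "claim_count", "value": 4},
    {"ts": 3, "feature": "risk_score", "value": 0.7},
]
```

Replaying to `up_to=2` shows the pre-update `risk_score` of 0.3 — the state a decision made at that time actually saw.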
CDC — skip the middleware
Point a connector directly at your database. No Kafka, no Spark, no ETL pipeline. Change Data Capture turns database writes into real-time features in minutes.
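Conceptually, a CDC connector receives row-change events and maps them to feature writes. The sketch below assumes a Debezium-like envelope (`op`, `after`) purely for illustration — not necessarily this connector's wire format.

```python
from typing import Dict, Optional

def change_to_feature(event: Dict) -> Optional[Dict]:
    """Map a row-change event to a feature write.

    Assumes an illustrative envelope: `op` ("c" create / "u" update)
    and `after` holding the new row image.
    """
    if event.get("op") not in ("c", "u"):  # ignore deletes, snapshots, etc.
        return None
    row = event["after"]
    return {
        "entity": f"policy:{row['policy_id']}",
        "features": {"premium": row["premium"], "status": row["status"]},
    }

event = {"op": "u", "after": {"policy_id": "pol-123", "premium": 940.0, "status": "active"}}
```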
Use cases
Feature infrastructure your compliance team will love.
Risk & underwriting
Compute risk features from claims history, telematics, and external data in real-time for instant decisions.
Fraud detection
Stream transaction data through feature pipelines. Flag anomalies with features computed from live and historical patterns.
Personalization
Build user profiles from behavioral streams. Serve real-time feature vectors to recommendation models.
Credit scoring
Combine bureau data, transaction history, and alternative data into consistent feature sets for model training and serving.
Pricing engines
Enrich pricing requests with computed features from multiple sources. Same features in batch training and real-time quoting.
ML training pipelines
Generate point-in-time correct training datasets with batch mode. Eliminate training/serving skew with unified feature definitions.
Architecture
Connects to your data stack.
Sources
- Kafka
- Kinesis
- S3
- CDC
- HTTP
Sinks
- gRPC API
- Kafka
- S3 export
- Webhooks
Bring your own data sources. We handle the compute, storage, and serving.
Up and running in minutes.
Install the SDK, connect with your API key, and start defining and querying features immediately.
Install the SDK
pip install datatier-enrichment
Connect
client = EnrichmentClient(endpoint, api_key="your-key", tenant_id="acme")
Query features
features = client.get_features("policy", "pol-123")
Pricing
Pay for what you compute.
Usage-based pricing tied to feature computations, stored features, and API calls. Start free, scale to enterprise.
Starter
$0
- Up to 10 feature groups
- 100k computations / month
- 1 GB feature storage
- Real-time + batch modes
Growth
Usage-based
- Unlimited feature groups
- 10M+ computations / month
- All three execution modes
- Stream processing + autoscaling
Scale
Custom
- Dedicated compute and storage
- VPC peering / private endpoints
- SSO + audit logging
- Custom SLA & support
Stop rebuilding features for every model.
Define once, compute anywhere. Install the SDK and start querying features in minutes.