Feature Store Summit 2025

Check out all videos and slides presented at the conference!

From Real-Time ML to Agents with Hopsworks

Hopsworks supports building batch, real-time, and agentic AI systems using a unified architecture built around feature pipelines, training pipelines, and inference pipelines. In this talk, we walk through the journey from developing batch and real-time ML systems to agentic AI systems with this unified FTI pipeline architecture. In particular, we look at how we connect application state to agents using an "application context protocol": we show how entity IDs in applications can be used to help agents reliably retrieve application state as context.
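The entity-ID-to-context idea described above can be sketched in a few lines. This is a hypothetical illustration, not the actual Hopsworks API: `FeatureStoreClient`, `build_agent_context`, and the feature names are all stand-ins.

```python
# Hypothetical sketch: use an entity ID from the application to fetch
# precomputed feature state and render it as context for an agent.
# The class and feature names below are illustrative assumptions.

class FeatureStoreClient:
    """Toy in-memory stand-in for an online feature store keyed by entity ID."""
    def __init__(self, rows):
        self._rows = rows

    def get_feature_vector(self, entity_id):
        return self._rows[entity_id]

def build_agent_context(store, entity_id):
    """Resolve application state (features) for an entity and format it as
    textual context an agent can reason over."""
    state = store.get_feature_vector(entity_id)
    lines = [f"{name}: {value}" for name, value in sorted(state.items())]
    return f"Application state for {entity_id}:\n" + "\n".join(lines)

store = FeatureStoreClient({
    "customer_42": {"lifetime_value": 1830.5, "open_tickets": 2},
})
print(build_agent_context(store, "customer_42"))
```

The point of the pattern is that the agent never guesses at application state; it receives a deterministic lookup keyed by the same entity ID the application already uses.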

Lyft’s Feature Store: Architecture, Optimization, and Evolution

Lyft's Feature Store, a core infrastructure component in its Data Platform, optimizes the management & deployment of ML features at scale.

Clicklease + Hopsworks: On-Demand Feature Life Cycle Management

Clicklease is a financial services company that offers equipment leasing to small businesses. It uses ML to make predictions about policies, pricing, and risk, but it needs to compute features on demand when users interact with the platform. In this talk, we present our journey from computing real-time features with stored procedures to defining and computing features in Python at request time. We look in detail at some of the requirements for on-demand transformation functions and how we leverage Hopsworks to support our ML platform.
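An on-demand transformation function of the kind discussed here is, at its core, a plain Python function evaluated at request time from request parameters and stored values. The `on_demand` decorator and the feature below are illustrative assumptions, not the Hopsworks API:

```python
# Hedged sketch of an on-demand transformation function: a Python function
# computed at request time, here deriving a recency feature for a lease
# application. The `on_demand` marker decorator is illustrative only.

from datetime import date

def on_demand(fn):
    fn.is_on_demand = True  # marker so a platform could discover/register it
    return fn

@on_demand
def days_since_last_payment(request_date: date, last_payment_date: date) -> int:
    """Computed per request: freshness of the applicant's last payment."""
    return (request_date - last_payment_date).days

print(days_since_last_payment(date(2025, 6, 1), date(2025, 5, 12)))  # → 20
```

Because the same Python definition can run in both the batch backfill and the request path, this replaces the duplicated logic that stored procedures required.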

Zalando's Journey to Large-Scale, Real-Time Feature Serving with Hopsworks

Zalando is Europe’s largest online fashion platform, operating in over 25 countries and serving over 50 million customers. It uses the Hopsworks feature store to support real-time AI use cases, such as personalized recommendations.

Real-Time and Batch: Feature Engineering with Hamilton + Narwhals

Generating accurate training data for real-time features is notoriously difficult, often requiring duplicate logic and introducing the risk of train-serve skew.
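A minimal illustration of how duplicate logic causes train-serve skew, and how a single shared definition avoids it. Hamilton and Narwhals generalize this idea to declarative dataframe pipelines; the function and field names here are assumptions for the sketch:

```python
# Define the feature logic once and reuse it for both batch training-data
# generation and single-record online serving, so the two paths cannot drift.
# All names below are illustrative.

def amount_over_avg(amount: float, avg_amount: float) -> float:
    """One definition of the feature, shared by both paths."""
    return amount / avg_amount if avg_amount else 0.0

# Offline path: applied row by row when generating training data.
training_rows = [{"amount": 50.0, "avg_amount": 25.0},
                 {"amount": 10.0, "avg_amount": 40.0}]
train_features = [amount_over_avg(r["amount"], r["avg_amount"])
                  for r in training_rows]

# Online path: the same function answers a live request.
online_feature = amount_over_avg(30.0, 20.0)

print(train_features, online_feature)  # → [2.0, 0.25] 1.5
```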

Real-Time Feature Aggregation at Scale: iFood’s Path to Sub-Sec Latency

In this talk, we’ll walk through how we built a low-latency feature platform that aggregates and serves features in under one second using Spark Structured Streaming and Redis.
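The serving pattern described above — a streaming job maintaining windowed aggregates in a key-value store so reads are a single lookup — can be sketched as follows. A dict stands in for Redis, and the tumbling-window bucketing mirrors what a streaming engine would maintain; key formats and names are assumptions:

```python
# Toy sketch: aggregate events into tumbling time windows keyed in a
# key-value store, so the serving path is one O(1) lookup with no
# recomputation. A dict stands in for Redis here.

WINDOW_SECONDS = 60
kv_store = {}  # stand-in for Redis: key -> aggregated count

def window_key(entity_id: str, event_ts: int) -> str:
    """Bucket a timestamp into a tumbling window and build the store key."""
    window_start = event_ts - (event_ts % WINDOW_SECONDS)
    return f"orders:{entity_id}:{window_start}"

def aggregate(entity_id: str, event_ts: int) -> None:
    """Ingestion path: what a streaming job would do per event."""
    key = window_key(entity_id, event_ts)
    kv_store[key] = kv_store.get(key, 0) + 1

def read_feature(entity_id: str, now_ts: int) -> int:
    """Serving path: a single key lookup, no recomputation."""
    return kv_store.get(window_key(entity_id, now_ts), 0)

for ts in (120, 130, 150):          # three orders land in the same 60s window
    aggregate("user_1", ts)
print(read_feature("user_1", 160))  # → 3
```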

How Coinbase Builds Sequence Features for Machine Learning

In this talk, we’ll share how Coinbase built a framework to productionize user action sequences at scale, enabling models to learn directly from rich behavioral histories.
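The core of a sequence feature is keeping an ordered, bounded history of user actions that a model can consume directly. The sketch below uses a last-N buffer with left padding to a fixed length; the action names and padding scheme are assumptions, not Coinbase's framework:

```python
# Illustrative sequence feature: keep the last N actions per user and emit a
# fixed-length, left-padded sequence suitable as a model input.

from collections import defaultdict, deque

MAX_LEN = 4
PAD = "<pad>"
sequences = defaultdict(lambda: deque(maxlen=MAX_LEN))  # user -> recent actions

def record_action(user_id: str, action: str) -> None:
    sequences[user_id].append(action)  # oldest action drops off automatically

def sequence_feature(user_id: str) -> list:
    """Fixed-length, left-padded action sequence ready for a model."""
    seq = list(sequences[user_id])
    return [PAD] * (MAX_LEN - len(seq)) + seq

for a in ["view", "view", "buy", "view", "sell"]:
    record_action("u1", a)
print(sequence_feature("u1"))  # → ['view', 'buy', 'view', 'sell']
```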

Real-Time ML at Roku

This talk details Roku's approach to achieving real-time feature serving at scale and dramatically improving development velocity for our ML models, covering the entire ML development lifecycle from initial training to production deployment.

Vector Store: Uber’s Embedding Platform

This talk introduces Vector Store, Uber’s scalable platform for managing the full lifecycle of embeddings—including offline/streaming generation, batch/real-time ingestion, standardized retrieval APIs, and automated model switching—all backed by centralized metadata and governance.

Powering Real-Time AI at Pinterest: Feature Management at Scale with Galaxy Signal Platform

At Pinterest, the challenge of managing and serving features in real time at scale is met by the joint power of Galaxy, our online feature store, and Scorpion, our feature fetching and model inference platform.

EY: Predictive Analytics in Financial Industry

Machine learning algorithms and exploratory data analysis are crucial for uncovering hidden patterns in complex datasets, helping institutions define propensity and likeness aspects to predict customer behaviors and make informed decisions about products and risk management.

Building a Generative Recommender with Chronon

Leveraging modern techniques for recommendations — such as sequence and generative modeling — requires complex infrastructure to generate training data, process high-throughput event streams, and serve low-latency inference at scale.

Real-Time ML: Accelerating Python for Inference (Below 10ms) at Scale

We’ll share how we built a symbolic Python interpreter that accelerates ML pipelines by transpiling Python into DAGs of static expressions. These expressions are optimized and run at scale with Velox, an open-source (~4k GitHub stars) unified C++ query engine from Meta.
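To make the transpilation idea concrete, here is a toy version of representing arithmetic as a DAG of static expression nodes that can be analyzed and evaluated independently of Python's interpreter loop. The node classes are illustrative; the system in the talk transpiles real pipelines and executes the resulting expressions on Velox:

```python
# Toy expression DAG: each node is a static, immutable operation that can be
# inspected, optimized, and evaluated against a row of data. Illustrative only.

from dataclasses import dataclass

@dataclass(frozen=True)
class Col:
    name: str
    def eval(self, row): return row[self.name]

@dataclass(frozen=True)
class Lit:
    value: float
    def eval(self, row): return self.value

@dataclass(frozen=True)
class Add:
    left: object
    right: object
    def eval(self, row): return self.left.eval(row) + self.right.eval(row)

@dataclass(frozen=True)
class Mul:
    left: object
    right: object
    def eval(self, row): return self.left.eval(row) * self.right.eval(row)

# `price * quantity + 2.5` captured as a static DAG instead of Python bytecode:
expr = Add(Mul(Col("price"), Col("quantity")), Lit(2.5))
print(expr.eval({"price": 3.0, "quantity": 4}))  # → 14.5
```

Because the DAG is static data rather than opaque bytecode, an engine can optimize it (constant folding, common subexpression elimination) and execute it in vectorized native code.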