Streamlining Large-Scale Dataset Migrations with Background Coding Agents

Introduction

At Spotify, managing thousands of datasets across diverse services is a monumental task. When migrations become necessary—whether due to schema changes, infrastructure upgrades, or data format shifts—the process can quickly turn into a logistical nightmare. To tackle this, we developed a system of background coding agents powered by Honk, integrated with Backstage and Fleet Management. This article explores how these tools transformed our dataset migration workflows, reducing manual effort and accelerating time-to-completion.

Source: engineering.atspotify.com

The Challenge of Downstream Consumer Migrations

Migrations at scale involve not only moving data but also ensuring that all downstream consumers—services, pipelines, and analytics tools—adapt seamlessly. Without automation, each migration requires manual coordination: identifying affected consumers, updating schemas, verifying compatibility, and deploying changes. With thousands of datasets, this becomes error-prone and time-consuming. Engineers often spend weeks or months on a single migration, diverting focus from feature development.

Key Pain Points

Enter Honk: Background Coding Agents

Honk is an internal Spotify platform that orchestrates background coding agents—automated bots that perform code modifications across repositories. These agents can detect dataset schemas, locate downstream consumers, and apply necessary transformations in a safe, auditable manner. Honk leverages Backstage for service cataloging and Fleet Management for coordinated deployment.

How Honk Works

  1. Schema Analysis: Honk reads the current dataset schema and the target schema, then computes a diff (additions, removals, type changes).
  2. Consumer Discovery: Using Backstage’s entity catalog, Honk identifies all services, jobs, and scripts that reference the dataset.
  3. Code Generation: For each consumer, Honk generates code patches that update field names, types, or serialization logic.
  4. Pull Request Creation: The agent opens pull requests in the respective repositories, complete with descriptions and test suggestions.
  5. Validation and Rollout: Once the patches pass automated checks (CI/CD), the changes are rolled out in stages via Fleet Management to prevent cascading failures.
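
The schema analysis step in the list above can be illustrated with a small sketch. This is not Honk's implementation, which the article does not show; it assumes a flat schema represented as a mapping from field name to type name, and `SchemaDiff` and `diff_schemas` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDiff:
    """Result of comparing two flat schemas (hypothetical structure)."""
    added: dict[str, str] = field(default_factory=dict)    # field -> new type
    removed: dict[str, str] = field(default_factory=dict)  # field -> old type
    retyped: dict[str, tuple[str, str]] = field(default_factory=dict)  # field -> (old, new)

    def is_empty(self) -> bool:
        return not (self.added or self.removed or self.retyped)


def diff_schemas(current: dict[str, str], target: dict[str, str]) -> SchemaDiff:
    """Compare two schemas given as {field_name: type_name}."""
    diff = SchemaDiff()
    for name, old_type in current.items():
        if name not in target:
            diff.removed[name] = old_type
        elif target[name] != old_type:
            diff.retyped[name] = (old_type, target[name])
    for name, new_type in target.items():
        if name not in current:
            diff.added[name] = new_type
    return diff


# Illustrative schemas only; the field names are made up.
current = {"user_id": "int", "country": "string", "ts": "long"}
target = {"user_id": "long", "country": "string", "event_time": "long"}

diff = diff_schemas(current, target)
print(diff.added)    # {'event_time': 'long'}
print(diff.removed)  # {'ts': 'long'}
print(diff.retyped)  # {'user_id': ('int', 'long')}
```

Note that a plain diff like this reports a renamed field as one removal plus one addition. Telling a rename apart from an unrelated drop and add needs extra information, such as an explicit rename mapping supplied with the migration.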

Backstage: The Service Catalog Backbone

Backstage serves as Spotify’s developer portal, providing a unified catalog of all software components. For migrations, Backstage is invaluable because it links datasets to their consumers through owner metadata and dependency annotations. Honk queries Backstage’s APIs to build a comprehensive dependency graph, ensuring that no consumer is overlooked. This integration removes most of the discovery burden: teams no longer need to manually track down who uses which table.
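
As a rough illustration of the dependency lookup described above, the sketch below inverts the `spec.dependsOn` relations of catalog entities into a dataset-to-consumers index. It works on entities already fetched into memory rather than calling the catalog API. The article does not say how Spotify models datasets in the catalog, so treating a dataset as a `Resource` entity is an assumption, as are the entity names, the owners, and the use of fully qualified entity refs in `dependsOn`.

```python
from collections import defaultdict


def entity_ref(entity: dict) -> str:
    """Build an entity ref of the form kind:namespace/name."""
    meta = entity["metadata"]
    namespace = meta.get("namespace", "default")
    return f"{entity['kind'].lower()}:{namespace}/{meta['name']}"


def build_consumer_index(entities: list[dict]) -> dict[str, list[dict]]:
    """Map each dependency ref to the entities that declare it in spec.dependsOn."""
    index: dict[str, list[dict]] = defaultdict(list)
    for entity in entities:
        for dep in entity.get("spec", {}).get("dependsOn", []):
            index[dep.lower()].append(entity)
    return index


def consumers_of(dataset_ref: str, index: dict[str, list[dict]]) -> list[tuple[str, str]]:
    """Return sorted (consumer ref, owner) pairs for one dataset."""
    return sorted(
        (entity_ref(e), e["spec"].get("owner", "unknown"))
        for e in index.get(dataset_ref.lower(), [])
    )


# Hypothetical catalog entries, shaped like Backstage Component entities.
entities = [
    {"kind": "Component", "metadata": {"name": "playlist-service"},
     "spec": {"owner": "team-playlists",
              "dependsOn": ["resource:default/listening-events"]}},
    {"kind": "Component", "metadata": {"name": "royalty-report"},
     "spec": {"owner": "team-finance",
              "dependsOn": ["resource:default/listening-events",
                            "resource:default/contracts"]}},
    {"kind": "Component", "metadata": {"name": "search-indexer"},
     "spec": {"owner": "team-search",
              "dependsOn": ["resource:default/catalog-tracks"]}},
]

index = build_consumer_index(entities)
for ref, owner in consumers_of("resource:default/listening-events", index):
    print(ref, owner)
# component:default/playlist-service team-playlists
# component:default/royalty-report team-finance
```

Keeping the owner next to each consumer matters for the later steps: it tells the agent which team should review the pull request it opens.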

The Role of Backstage in Honk Workflows

Fleet Management: Coordinated Rollout and Rollback

Fleet Management is Spotify’s system for deploying updates across a large number of services simultaneously. Dataset migrations often require that multiple consumer services update their schemas in coordination—otherwise, temporary incompatibility can cause data loss or pipeline failures. Fleet Management allows Honk to orchestrate a phased rollout: first update a small subset of consumers, verify correctness, then expand to the entire fleet. If an issue arises, a one-click rollback reverts all changes to a consistent state.
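
The phased rollout described above can be sketched as follows. Fleet Management's actual interface is not described in this article, so the `apply`, `verify`, and `revert` callbacks, the wave fractions, and the service names are all hypothetical; the sketch only shows the control flow of expanding in waves and reverting everything when a wave fails verification.

```python
from typing import Callable, Sequence


def plan_waves(consumers: Sequence[str],
               fractions: Sequence[float] = (0.05, 0.25, 1.0)) -> list[list[str]]:
    """Split consumers into waves; each fraction is a cumulative share of the fleet."""
    waves: list[list[str]] = []
    done = 0
    for fraction in fractions:
        # Always advance by at least one consumer, never past the end.
        upto = min(len(consumers), max(done + 1, round(len(consumers) * fraction)))
        if upto > done:
            waves.append(list(consumers[done:upto]))
            done = upto
    return waves


def rollout(waves: list[list[str]],
            apply: Callable[[str], None],
            verify: Callable[[str], bool],
            revert: Callable[[str], None]) -> bool:
    """Apply wave by wave; on a failed verification, revert everything applied so far."""
    applied: list[str] = []
    for wave in waves:
        for consumer in wave:
            apply(consumer)
            applied.append(consumer)
        if not all(verify(consumer) for consumer in wave):
            for consumer in reversed(applied):
                revert(consumer)
            return False
    return True


# Simulated fleet: every consumer starts on schema v1; svc-03 fails verification.
consumers = [f"svc-{i:02d}" for i in range(20)]
state = {name: "v1" for name in consumers}

waves = plan_waves(consumers)
ok = rollout(
    waves,
    apply=lambda name: state.update({name: "v2"}),
    verify=lambda name: name != "svc-03",
    revert=lambda name: state.update({name: "v1"}),
)
print([len(wave) for wave in waves])  # [1, 4, 15]
print(ok)                             # False
print(set(state.values()))            # {'v1'}
```

In this simulation the failure is caught in the second wave, so only five of the twenty consumers were ever touched before the revert, which is the point of starting with a small subset.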

Benefits of Fleet Management for Migrations

Real-World Impact: Migrating Thousands of Datasets

We deployed this pipeline for a major schema migration affecting over 3,000 datasets. Without Honk, the effort would have required approximately 50 engineering weeks of manual work. With background coding agents, the migration completed in under two weeks with zero reported incidents. The key factors were:

Since then, we’ve reused the same framework for subsequent migrations, each time reducing effort by over 80%.

Best Practices for Implementing Background Coding Agents

Based on our experience, here are key recommendations if you’re considering a similar approach:

  1. Invest in a Service Catalog: Tools like Backstage provide the dependency metadata essential for automation.
  2. Review Agent Output Like Any Other Code: Treat generated patches like human-written code, and require tests and peer review.
  3. Build Rollback into the System: Use fleet coordination tools that support atomic deployments and rollbacks.
  4. Monitor and Alert: Track migration progress and system health; have a kill switch if unexpected issues arise.
  5. Document Successes: Share migration stories to build confidence in automated processes.
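
The kill switch from the monitoring recommendation above could take many forms. The sketch below is one minimal possibility, not a description of Spotify's tooling: `MigrationMonitor`, its thresholds, and the sample outcomes are hypothetical. It counts per-consumer outcomes and halts the migration once the failure rate passes a limit, after a minimum number of samples so that one early failure does not stop everything.

```python
class MigrationMonitor:
    """Tracks per-consumer outcomes and trips a sticky kill switch on a high failure rate."""

    def __init__(self, max_failure_rate: float = 0.2, min_samples: int = 5):
        self.max_failure_rate = max_failure_rate
        self.min_samples = min_samples
        self.succeeded = 0
        self.failed = 0
        self.halted = False

    def record(self, ok: bool) -> None:
        """Record one outcome and trip the switch if the failure rate is too high."""
        if ok:
            self.succeeded += 1
        else:
            self.failed += 1
        seen = self.succeeded + self.failed
        if seen >= self.min_samples and self.failed / seen > self.max_failure_rate:
            self.halted = True  # stays set until someone investigates and resets it


# Simulated outcomes for eight consumers, in processing order.
outcomes = [True, True, False, True, False, True, True, True]

monitor = MigrationMonitor(max_failure_rate=0.2, min_samples=5)
processed = 0
for ok in outcomes:
    if monitor.halted:
        break  # kill switch tripped: stop touching further consumers
    monitor.record(ok)
    processed += 1

print(processed, monitor.succeeded, monitor.failed, monitor.halted)  # 5 3 2 True
```

The switch is deliberately sticky: later successes do not clear it, so resuming the migration is an explicit human decision.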

Conclusion

Dataset migrations no longer need to be a painful, manual ordeal. By combining Honk’s background coding agents with Backstage’s catalog and Fleet Management’s coordination, Spotify has transformed a previously dreaded task into a streamlined, automated process. This approach not only saves engineering time but also reduces risk and improves reliability. As our data ecosystem grows, we continue to refine these tools, enabling faster, safer migrations at scale.

Originally published on the Spotify Engineering blog.
