Title: Integrating Arize Phoenix and Apache Iceberg for Local Telemetry Data Management and Querying
Authors: Don Branson
In modern data observability workflows, capturing and managing telemetry data is crucial for debugging and improving machine learning systems. This paper demonstrates how to integrate Arize Phoenix, an open-source observability platform, with Apache Iceberg, a high-performance table format for data lakes, to build a scalable and efficient system for managing and querying telemetry data locally. We present a step-by-step implementation that captures telemetry data as Parquet files with Arize Phoenix on a local system and uses Apache Iceberg to enable schema evolution, time travel, and efficient queries. The solution bridges the gap between data observability and data lake management for machine learning monitoring.