Video Data Processing with Apache Beam

Presented at Beam College 2026

This architecture uses a Splittable DoFn to parallelize video ingestion, distributing frame extraction across workers, and then applies sliding windows to group the frames into temporal 3D tensors. Beam’s RunInference API with a KeyedModelHandler runs GPU-accelerated inference while keeping each prediction associated with the key of its input. Finally, CoGroupByKey joins the asynchronously produced feature vectors with their metadata, and the aligned dataset is serialized as TFRecord SequenceExamples for downstream training.
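The windowing and join steps can be illustrated with a small plain-Python sketch. This is not Beam code and not the presenters' implementation: it only mimics, on in-memory lists, the semantics that `beam.WindowInto(SlidingWindows(size, period))` and `beam.CoGroupByKey()` provide inside a pipeline. The function names, the integer frame timestamps, and the clip keys are assumptions made for illustration.

```python
from collections import defaultdict


def sliding_windows(timestamp, size, period):
    """Return every [start, end) window of length `size`, starting at a
    multiple of `period`, that contains `timestamp` (integer units)."""
    start = timestamp - (timestamp % period)
    return [(s, s + size) for s in range(start, timestamp - size, -period)]


def clips_from_frames(frames, size, period):
    """Group (timestamp, frame) pairs into overlapping clips.

    Returns {window: [frames in timestamp order]}. Each frame lands in
    size / period windows, which is what makes consecutive clips overlap.
    """
    clips = defaultdict(list)
    for ts, frame in sorted(frames, key=lambda pair: pair[0]):
        for window in sliding_windows(ts, size, period):
            clips[window].append(frame)
    return dict(clips)


def co_group_by_key(**tagged):
    """Join several keyed collections, mimicking CoGroupByKey on a dict input.

    `tagged` maps a tag name to an iterable of (key, value) pairs.
    Returns {key: {tag: [values]}}, with an empty list for a missing tag.
    """
    grouped = defaultdict(lambda: {tag: [] for tag in tagged})
    for tag, pairs in tagged.items():
        for key, value in pairs:
            grouped[key][tag].append(value)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical input: six frames with integer timestamps 0..5.
    frames = [(t, f"frame{t}") for t in range(6)]
    clips = clips_from_frames(frames, size=4, period=2)
    # Keep only complete clips; windows at the edges are partial.
    full = {w: c for w, c in clips.items() if len(c) == 4}
    print(sorted(full))

    # Hypothetical keyed outputs arriving from two separate branches.
    joined = co_group_by_key(
        features=[("video1:0-4", [0.1, 0.2])],
        metadata=[("video1:0-4", {"label": "walking"})],
    )
    print(joined)
```

With a window of 4 frames and a period of 2, each frame belongs to two windows, so adjacent clips share half their frames. Windows at the start and end of the video are only partially filled; whether to drop or pad them before stacking a clip into a tensor is a design choice the abstract does not specify.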

Instructor(s):