Beam College 2024 Sessions


Beam College is a free educational program that provides hands-on training in solving data processing use cases with Apache Beam.

Sessions for the 2024 edition are organized across three days:

1. Apache Beam Overview (July 23)

Sessions on this day will provide an overview of Apache Beam. First, we will focus on understanding what Apache Beam is, how it differs from other tools in the data processing ecosystem, and when it is a good fit for your project or organization. After that, you will learn how to get started with Apache Beam and build your first pipeline.

2. Apache Beam for AI (July 24)

On day 2, you will learn how to use Apache Beam to implement AI pipelines. In the first series of lessons, you will implement a machine learning pipeline all the way from conceptualization to coding and running it in a notebook. We will have an additional session on using Beam to interact with Google Gemini via Google AI Studio.

3. Making the jump from batch to streaming (July 25)

One of the main advantages of Apache Beam is that you use the same programming model to implement both batch and streaming pipelines. In this day's sessions, we will go over the key concepts you need to understand to implement streaming pipelines in Beam and walk through a demo. We will then provide an overview of Beam Quest, a learning resource for advanced streaming concepts with Apache Beam.

Program

Tuesday, Jul. 23

Title | Speaker(s)
Welcome & Apache Beam Overview | Pedro Galvan
Project Shield: How we use Beam to defend democracy and free expression | Marc Howard
Getting started: Intro to creating a Beam pipeline | Sascha Kerbler
Getting started: Beginner tools for Apache Beam | Danny McCormick
Beam YAML Bootcamp: Effortless pipeline design for data processing | Jeff Kinard
Getting started with Dataflow templates | Sascha Kerbler
How to learn Beam: resources, communities, books, tools | Israel Herraiz

Wednesday, Jul. 24

Title | Speaker(s)
How to Implement a ML pipeline using Beam. Part 1: Concepts and defining our pipeline | Danny McCormick & Kerry Donny-Clark
How to Implement a ML pipeline using Beam. Part 2: Coding | Danny McCormick & Kerry Donny-Clark
Implementing a complex ML pipeline: Demo w/ Google AI Studios | Israel Herraiz

Thursday, Jul. 25

Title | Speaker(s)
Making the Jump from Batch to Streaming: Concepts and code | Yi Hu
CI/CD with Dataflow templates | Surjit Singh
Resources for further learning | Svetak Sundhar