Danica began her career as a software engineer in data visualization and warehousing with a business intelligence team, where she served as a point person for standards and best practices in data visualization across her company. In 2018, Danica moved to San Francisco and pivoted to backend engineering with a derivatives data team, which was responsible for building and maintaining the infrastructure that processed millions of financial market data points per second in near real time. Her first project on this team involved Kafka Streams and Kafka Connect. From there, she immersed herself in the world of data streaming and found herself quite at home in the Apache Kafka and Apache Flink communities. She now leads the open source advocacy efforts at Snowflake, supporting Apache Iceberg and Apache Polaris (incubating).
Outside of work, Danica is passionate about sustainability, increasing diversity in the technical community, and keeping her many houseplants alive. She can be found on X (Bluesky and Mastodon), talking about tech, plants, and baking @TheDanicaFine.
While many of us have adapted to work-from-home life, one major problem remains: finding an easy way to keep folks in your home away from your workspace when you’re on an important call. Dust off your Raspberry Pi and let’s build a custom on-air sign with Apache Kafka®, Apache Flink®, and Apache Iceberg™!
We’ll begin by writing Python scripts to capture key events, such as when a Zoom meeting is running and when a camera is in use, and produce them to Kafka. The live data are then consumed by a Raspberry Pi script that drives a custom-designed on-air sign. From there, you’ll be introduced to the ins and outs of Flink SQL for stream processing as we wrangle the data into a better format for downstream use. And, finally, we’ll see Iceberg in action and learn how to use query engines to analyze meeting and recording trends.
By the end of the session, you’ll be well-acquainted with this powerful trio of open source technologies and know how you could use the same scaffolding to scale out a simple, at-home project to millions of users and simultaneous events.
Have piping-hot, real-time data in Apache Kafka® but want to chill it down into Apache Iceberg™ tables? Let’s see how we can craft the perfect cup of “Iced Kaf-fee” for you and your needs!
We’ll start by grinding through the motivation for moving data from Kafka topics into Iceberg tables, exploring the benefits that doing so has to offer your analytics workflows. From there, we’ll open up the menu of options available to cool down your streams, including Apache Flink®, Apache Spark™, and Kafka Connect. Each brewing method has its own recipe, so we’ll compare their pros and cons, walk through use cases for each, and highlight when you might prefer a strong Spark roast over a smooth Flink blend—or maybe a Connect cold brew. Plus, we’ll share a sneak peek at future innovations that are percolating in the community to make sinking your Kafka data into Iceberg even easier.
By the end of the session, you’ll have everything you need to whip up the perfect pipeline and serve up your “Iced Kaf-fee” with confidence.