Guy works for Redis as a Developer Advocate. Combining his decades of experience in writing software with a passion for learning—and for sharing what he has learned—Guy explores interesting topics and spreads the knowledge he has gained around developer communities worldwide.
Teaching and community have long been a focus for Guy. He runs his local JavaScript meetup in Ohio and has served on the selection committees of numerous conferences. He'll happily speak anywhere that will have him and has even helped teach programming at a prison in central Ohio.
In his personal life, Guy is a hard-boiled geek interested in role-playing games, science fiction, and technology. He also has a slightly less geeky interest in history and linguistics. In his spare time he likes to camp.
Do you look like a famous meme character? Does someone you know? Knowing this information is vital, both for your career and your personal life. After all, am I the only one around here who wants to avoid Angry Walter? And who *wouldn't* want to work with Success Kid?
But can we even find out if we have a meme twin? There are lots of memes. And lots of people. How could we possibly search them all? Well, it's easier than you think if we turn those memes into embeddings and search them with a vector database!
But what's an embedding? And what's a vector database? I'll begin by exploring embeddings, showing how unstructured data, such as text and images, can be translated into high-dimensional arrays of numbers called vectors. Then I'll talk about vector databases, covering what they are and how you can use them to store those embeddings and search them with embeddings of your own.
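To make the idea of searching embeddings with embeddings concrete, here is a minimal sketch in plain Python. It is not the code from the talk: the three-dimensional vectors are made-up stand-ins for real embeddings (which usually have hundreds of dimensions), the names `cosine_similarity` and `find_meme_twin` are hypothetical, and a vector database would use an index rather than this brute-force scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings for illustration only (assumption, not real model output).
MEME_EMBEDDINGS = {
    "Success Kid": [0.9, 0.1, 0.2],
    "Angry Walter": [0.1, 0.8, 0.3],
    "Ancient Aliens Guy": [0.2, 0.3, 0.9],
}

def find_meme_twin(query, embeddings):
    """Return the name whose embedding is most similar to the query vector."""
    return max(embeddings, key=lambda name: cosine_similarity(query, embeddings[name]))

# A hypothetical embedding of a photo taken during the session.
photo_embedding = [0.85, 0.2, 0.1]
twin = find_meme_twin(photo_embedding, MEME_EMBEDDINGS)
print(twin)
```

The scan compares the query against every stored vector, which is fine for three memes and hopeless for millions. That gap is what a vector database fills.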
Of course, we'll do this all by example. I've turned all the big memes—from Ancient Aliens Guy to Zombie Boy—into embeddings and have loaded them into a vector database. I've built an application around these embeddings and that database. I'll show you the code and the queries of this application so that you can build something similar for yourself. And, most importantly, we'll take some photos during the session and use it all to find your meme twin!
So, are you ready to find your meme twin? Or are you ready to learn how to use this technology? I say, Why Not Both?
Now, if you’re like most developers, you probably have no idea what probabilistic data structures are. In fact, I did a super-scientific poll on Twitter and found that out of 119 participants, 58% had never heard of them and 22% had heard the term but nothing more. I wonder what percentage of that 22% heard the term for the first time in the poll. We’re a literal-minded lot at times.
A probabilistic data structure is, well, they’re sort of like the TARDIS—bigger on the inside—and JPEG compression—a bit lossy. And, like both, they are fast, accurate enough, and can take you to interesting places of adventure. That last one might not be something a JPEG does.
More technically speaking, most probabilistic data structures use hashes to give you faster and smaller data structures in exchange for some loss of precision. If you’ve got a mountain of data to process, this is super useful. In this talk, we’ll briefly go over some common probabilistic data structures; dive deep into a few (Bloom Filter, MinHash, and Top-K); and show a running application that makes use of Top-K to analyze the most commonly used words in all 112,092 of my UFO sightings.
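As a rough illustration of trading precision for size, here is a minimal Bloom filter sketch in Python. It is not the implementation from the talk or from any particular library: the class name, the default sizes, and the choice of seeded SHA-256 hashes are all assumptions made for the example.

```python
import hashlib

class BloomFilter:
    """A tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes.
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # True means "probably seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

# Hypothetical words from sighting reports.
seen_words = BloomFilter()
for word in ["light", "triangle", "hovering"]:
    seen_words.add(word)

print(seen_words.might_contain("light"))
```

The filter stores only bits, never the words themselves, so it stays the same size no matter how many words go in. The price is that `might_contain` can occasionally answer yes for a word that was never added.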
When we’re done, you’ll be ready to start using some of these structures in your own applications. And, if you use the UFO data, maybe you’ll discover that the truth really is out there.