I spent much of my career in software companies as an applied scientist. Most of that work is proprietary: prototyping, benchmarking, data pipelines, training models, etc. The public artifacts are mainly patents and the occasional open-source project. They are listed below by company, most recent first, alongside the kind of infrastructure I was working on at the time.
Founder and Chief Scientist. Building the Pinecone vector database - managed, large-scale, low-latency vector search that serves as long-term memory for AI applications.
Open source and benchmarks
Big ANN Benchmarks - a benchmark and competition for billion-scale approximate nearest-neighbor search, pushing the state of the art in vector-search algorithms and systems.
vq-bench - a benchmark for vector quantization (coming soon).
Director of Research and Head of Amazon AI Labs. Built the algorithms and distributed systems behind Amazon SageMaker - AWS's platform for training and serving machine learning models at scale.
See the paper Amazon SageMaker Elastic Algorithms (SIGMOD 2020) on the Research page.
Patents
Edo Liberty, Stefano Stefani, Alexander Smola, Craig Wiley, Steve Loeppky, Tom Faulhaber, Swami Sivasubramanian, Zohar Karnin
Edo Liberty, Zohar Karnin
Edo Liberty, Stefano Stefani, Swami Sivasubramanian, Zohar Karnin, Tom Faulhaber, Alexander Smola, Craig Wiley, Amir Sadoughi, Dayanand Rangegowda
Edo Liberty, Stefano Stefani, Steve Loeppky, Craig Wiley, Tom Faulhaber
Edo Liberty, Madhav Jha
Stefano Stefani, Craig Wiley, Thomas Faulhaber, Alexander Smola, Steven Loeppky, Richard Bice, Edo Liberty, Swaminathan Sivasubramanian, Charles Swan, Taylor Goodhart
Mu Li, Edo Liberty, Alexander Smola, Leyuan Wang
Madhav Jha, Edo Liberty
S. Genc, E. Liberty
Edo Liberty, Leo Dirac
Senior Research Director and Head of Yahoo Labs, New York. Built horizontal machine-learning platforms and the streaming-data systems that powered Yahoo's products, from advertising to mail.
Open source
Apache DataSketches is a leading open-source implementation of streaming algorithms for sketching and summarizing data, such as counting distinct items (e.g., HLL), finding frequent items (a.k.a. top-k), estimating streaming quantiles, and more. It is used by Druid, Spark, Yahoo, AWS, Google, and many others.
Patents
Kevin Lang, Edo Liberty, Konstantin Shmakov
KJ Lang, E Liberty, K Shmakov
Zohar Karnin, Guy Halawi, David Wajc, Edo Liberty
Edo Liberty, Zohar Karnin, Yoelle Maarek, Natalie Aizenberg
Ronny Lempel, Yoelle Maarek, Edward Bortnikov, Edo Liberty
Vishwanath Ramarao, Andrei Broder, Idan Szpektor, Edo Liberty, Yehuda Koren, Mark Risher, Yoelle Maarek
Edo Liberty, Zohar Karnin, Yoelle Maarek
Zohar Karnin, Michal Aharon, Edo Liberty, Yoelle Maarek
Zohar Karnin, Edo Liberty, David Wajc, Guy Halawi
Edo Liberty, Yoelle Maarek
J Tetreault, A Pappu, E Liberty, L Cao, M Liu, E Pavlick, G Tsur, Y Maarek
Joel Tetreault, Aasish Pappu, Edo Liberty, Liangliang Cao, Meizhu Liu, Ellie Tobochnik, Gilad Tzur, Yoelle Maarek
Justin Thaler, Maxim Sviridenko, Edo Liberty, Prerit Uppal, Ron Belmarch, Jerry Shen
Justin Thaler, Maxim Sviridenko, Edo Liberty, Prerit Uppal, Ron Belmarch, Jerry Shen
Interned at Google twice, working on Google Analytics and Google Maps.
Patents
Nir Ailon, Edo Liberty, Hari Khalsa
Technical founder. Built automatic content-recognition (ACR) infrastructure - fingerprinting broadcast video in real time to identify what is on screen and target contextually relevant content across millions of connected televisions.
Patents
Zeev Neumeier, Edo Liberty
Zeev Neumeier, Edo Liberty
Edo Liberty, Steven Zucker, Yosi Keller, Mauro M. Maggioni, Ronald R. Coifman, Frank Geshwind, in collaboration with Plain Sight Systems.
Ezuzah Chrome Extension (a digital art piece) - your browser is your door to the internet, so why not hang a Mezuzah?