Hello, and welcome to the new Expedera blog! Since you are reading this, I’m betting you are at least curious, if not passionate, about understanding and exploring the potential of deep learning in chip design. In upcoming blogs, we will share our experiences with chip designers like you, along with any architectural gems we find along our journey as we develop artificial intelligence inference IP (Intellectual Property) for the semiconductor industry. As with any relationship, it’s important that we properly introduce ourselves, so let’s start with a bit of Expedera history.
Expedera is an AI (Artificial Intelligence) semiconductor IP start-up based in Silicon Valley, California. The company was founded in 2018 by three engineers: Da Chuang, Siyad Ma, and Sharad Chole, who had worked together at Cisco on networking applications.
I think there is a prevailing belief that every start-up begins with an aha moment. For the Expedera founders, it was more of a nagging sense that, hey, deep learning looks a lot like networking, and we should be able to do this better! The three started out analyzing training and inference workloads in existing solutions with the idea of applying their decades of networking experience. No matter what approach they examined, the fundamental limiting factor was the same: the bandwidth between memory and compute. No matter how much memory or processing power was thrown at an AI inference system, that bandwidth still capped performance. Further, existing AI architectures were (and frankly, still are) highly inefficient. Because they rely on traditional CPU designs, their effective utilization (the fraction of time spent actually processing, rather than waiting for the next processing cycle) is very low, wasting significant power and processing cycles.
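To put some rough numbers on that bottleneck, here is a minimal back-of-envelope roofline sketch in Python. Every figure in it (the peak throughput, bandwidth, and per-layer numbers) is hypothetical and chosen purely to illustrate how a bandwidth-bound engine ends up with low utilization:

```python
# Back-of-envelope roofline model: why memory bandwidth, rather than
# raw compute, often caps AI inference performance. All numbers are
# hypothetical and chosen only for illustration.

peak_compute = 10e12   # hypothetical accelerator peak: 10 TOPS (ops/s)
memory_bw = 50e9       # hypothetical DRAM bandwidth: 50 GB/s (bytes/s)

# A hypothetical layer: ops performed vs. bytes of weights/activations moved.
ops = 2e9              # 2 GOPs of compute
bytes_moved = 100e6    # 100 MB of data traffic

intensity = ops / bytes_moved                      # ops per byte
attainable = min(peak_compute, intensity * memory_bw)
utilization = attainable / peak_compute

print(f"arithmetic intensity:  {intensity:.0f} ops/byte")
print(f"attainable throughput: {attainable / 1e12:.1f} TOPS")
print(f"compute utilization:   {utilization:.0%}")
# With these numbers the layer is bandwidth-bound: the engine can
# sustain only 1 of its 10 TOPS peak, i.e. 10% utilization.
```

Notice that adding more compute to this toy model changes nothing; only raising the arithmetic intensity or the effective bandwidth moves the needle, which is exactly the observation that motivated the question below.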
The question occurred to them: how do we keep all the engines and resources busy? Could we create an AI inference processor that lets users optimize processing within the available bandwidth while dramatically increasing utilization?
With that mission in mind, Expedera assembled a team of engineers and began the exhausting process of design, iterate, design, iterate (rinse and repeat, as anyone familiar with the process knows all too well). We taped out a test chip on TSMC 7nm, and in 2021 we completed silicon characterization of that chip. The silicon results validated our design approach and showed that our Origin™ architecture delivers truly market-differentiated PPA (power, performance, area).
We exited stealth mode in April 2021 with a presentation at the Linley Spring Conference entitled “Expedera: Building a Neural Engine with Unrivaled IPS/W.” Since then, we’ve publicly detailed the Origin architecture, grown our team, and continued to enhance the PPA of our products. Recently, we announced an $18M Series A funding round. As we start the new year, our first customer (a global consumer device company…more on that soon) is heading to mass production.
In coming blogs, you can expect to hear a lot more about our unique DLA (deep learning accelerator) architecture, packet-based implementations, emerging AI use cases, and many other topics. We invite you to follow us on LinkedIn, Twitter, and Facebook, where you’ll receive news on all things Expedera, including notifications when new blogs are published here.
Have questions? We’d love to hear from you. Drop us a note here, and we will be in touch.