Origin E6
The Cutting Edge in On-Device AI
With support for the latest generative AI models and traditional RNN, CNN, and LSTM models, the Origin™ E6 NPUs scale from 16 to 32 TOPS to deliver the optimum balance of performance, efficiency, and features for demanding edge inference applications.
Perfect-Fit Solutions
The Origin E6 is a versatile NPU customized to match the needs of next-generation smartphones, automobiles, AR/VR, and consumer devices. With support for video, audio, and text-based AI networks, including standard, custom, and proprietary networks, the E6 is the ideal hardware/software co-designed platform for chip architects and AI developers. It offers broad native support for current and emerging AI models, and achieves ultra-efficient workload scheduling and memory management, with up to 90% processor utilization, avoiding dark silicon waste.
Innovative Architecture
The Origin E6 neural engine uses Expedera’s unique packet-based architecture, which is far more efficient than common layer-based architectures. The architecture enables parallel execution across multiple layers, achieving better resource utilization and deterministic performance. It also eliminates the need for hardware-specific optimizations, allowing customers to run their trained neural networks unchanged without reducing model accuracy. This approach greatly increases performance while lowering power, area, and latency.
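To make the idea concrete, here is a minimal Python sketch of packet-based scheduling. The three-layer network, packet counts, and dependency rule are invented for illustration; this is not Expedera’s hardware scheduler. Each layer’s work is split into small packets, and a packet can issue as soon as its producers are far enough ahead, so packets from several layers execute in the same cycle:

```python
# Illustrative scheduling sketch only -- the packet sizes, dependency rule,
# and three-layer network are invented to show the idea, not Expedera's
# actual hardware scheduler.
layers = {
    "conv1": {"packets": 4, "deps": []},
    "conv2": {"packets": 4, "deps": ["conv1"]},
    "pool":  {"packets": 2, "deps": ["conv2"]},
}

done = {name: 0 for name in layers}  # packets completed per layer

def runnable(name):
    """A layer may issue its next packet once each producer is ahead of it."""
    info = layers[name]
    return done[name] < info["packets"] and all(
        done[dep] > done[name] for dep in info["deps"]
    )

cycle = 0
while any(done[n] < layers[n]["packets"] for n in layers):
    issued = [n for n in layers if runnable(n)]  # cross-layer parallelism
    for n in issued:
        done[n] += 1
    cycle += 1
    print(f"cycle {cycle}: issued packets from {issued}")
# By cycle 3, packets from conv1, conv2, and pool issue in the same cycle,
# whereas a layer-based engine would finish conv1 entirely before conv2.
```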
Choose the Features You Need
Customization brings many advantages, including increased performance, lower latency, reduced power consumption, and the elimination of dark silicon waste. Expedera works with customers during the design stage to understand their use case(s), PPA goals, and deployment needs. Using this information, we configure Origin IP to create a customized solution that perfectly fits the application.
Market-Leading 18 TOPS/W
Sustained power efficiency is key to successful AI deployments. Consistently cited as one of the most power-efficient NPU architectures on the market, Origin NPU IP achieves a market-leading, sustained 18 TOPS/W.
Efficient Resource Utilization
Origin IP scales from GOPS to 128 TOPS in a single core. The architecture eliminates the memory sharing, security, and area penalty issues faced by lower-performing, tiled AI accelerator engines. Origin NPUs achieve sustained utilization averaging 80%, compared to the 20-40% industry norm, avoiding dark silicon waste.
Full TVM-Based Software Stack
Origin uses a full TVM-based software stack. TVM is widely trusted and used by OEMs worldwide. This easy-to-use software allows trained networks to be imported and provides quantization options along with automatic completion, compilation, estimation, and profiling tools. It also supports multi-job APIs.
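As a rough sketch of what a TVM-based import-and-compile flow looks like (generic open-source TVM shown here; Expedera’s actual toolchain, target backend, and quantization hooks are vendor-specific and not reproduced), a trained ONNX model can be brought in like this:

```python
import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Load a trained network exported to ONNX ("model.onnx" is a placeholder).
onnx_model = onnx.load("model.onnx")

# The importer needs the input name and shape; these vary per model.
shape_dict = {"input": (1, 3, 224, 224)}
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

# Compile the network. A vendor flow would plug in its own target/codegen;
# "llvm" (CPU) is used here only so the sketch runs anywhere.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

# Run a single inference through TVM's graph executor.
dev = tvm.cpu()
runtime = graph_executor.GraphModule(lib["default"](dev))
runtime.set_input("input", np.zeros((1, 3, 224, 224), dtype="float32"))
runtime.run()
output = runtime.get_output(0).numpy()
```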
Successfully Deployed in 10M Devices
Quality is key to any successful product. Origin IP has been successfully deployed in over 10 million consumer devices, with designs in multiple leading-edge nodes.
Use Case
Real-World User Experience
An OEM wanted to create a best-in-class AI chip for future high-end consumer AR/VR devices. To accommodate future requirements, the OEM needed a high-performance NPU that could concurrently run multiple networks with zero-penalty context switching. As the OEM evaluated options, they discovered that while many NPUs claim to support multiple networks, they often introduce high latency during network changes. Expedera’s ability to run multiple networks concurrently with no perceptible increase in latency, combined with the best tested power, performance, and area, made the Origin architecture the perfect choice.
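In software terms, concurrent multi-network execution looks roughly like the Python sketch below. The FakeNPU class and its submit_job() call are hypothetical stand-ins invented for this illustration; Expedera’s actual multi-job API is not shown here.

```python
import concurrent.futures as cf

class FakeNPU:
    """Hypothetical driver shim for illustration only -- this class and
    submit_job() are invented names, not Expedera's real API."""

    def submit_job(self, network, frame):
        # On hardware with zero-penalty context switching, jobs from
        # different networks are scheduled concurrently; here we simply
        # simulate the call and return a result string.
        return f"{network}: processed {frame}"

npu = FakeNPU()

# Two networks (e.g., hand tracking and scene segmentation) are submitted
# side by side; neither waits for the other to be swapped out.
with cf.ThreadPoolExecutor(max_workers=2) as pool:
    jobs = [
        pool.submit(npu.submit_job, "hand_tracking", "frame_0"),
        pool.submit(npu.submit_job, "segmentation", "frame_0"),
    ]
    for job in cf.as_completed(jobs):
        print(job.result())
```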
| Feature | Specification |
| --- | --- |
| Compute Capacity | 8K to 16K INT8 MACs |
| Multi-tasking | Run up to 8 simultaneous jobs |
| Power Efficiency | 18 TOPS/W effective; no pruning, sparsity, or compression required (though supported) |
| Example Networks Supported | HitNet, Denoise, ResNext, ResNet50 V1.5, ResNet50 V2, Inception V3, RNN-T, MobileNet SSD, MobileNet V1, UNET, BERT, EfficientNet, FSR CNN, CPN, CenterNet, YOLO V3, YOLO V5l, ShuffleNet2, Swin, SSD-ResNet34, DETR, others |
| Example Performance | MobileNet V1 (512 x 512): 3629 IPS, 2696 IPS/W (N7 process, 1 GHz, no sparsity/pruning/compression applied) |
| Layer Support | Standard NN functions, including Conv, Deconv, FC, Activations, Reshape, Concat, Elementwise, Pooling, Softmax, others; programmable general FP functions, including Sigmoid, Tanh, Sine, Cosine, Exp, others; custom operators supported |
| Data Types | INT4/INT8/INT10/INT12/INT16 activations/weights; FP16/BFloat16 activations/weights |
| Quantization | Channel-wise quantization (TFLite specification); software toolchain supports Expedera, customer-supplied, or third-party quantization |
| Latency | Deterministic performance guarantees; no back pressure |
| Frameworks | TensorFlow, TFLite, ONNX, others supported |
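As a sanity check on the figures above, the implied power draw for the MobileNet V1 example follows from simple division of the two published numbers (an inference from the table values, not a separately published spec):

```python
# Implied power for the MobileNet V1 example, derived from the table values.
ips = 3629           # inferences per second (512 x 512, N7, 1 GHz)
ips_per_watt = 2696  # inferences per second per watt

implied_power_w = ips / ips_per_watt  # ~1.35 W for this workload
print(f"Implied power: {implied_power_w:.2f} W")
```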