The Tensilica DNA 100 processor is an easily scalable processor comprising specialized hardware engines and a tightly coupled Tensilica DSP. Deep neural networks exhibit inherent sparsity (the presence of zeros) in both weights and activations. The DNA 100 processor’s specialized hardware engines avoid loading and storing zeros and skip the computations that involve them, so the processor can leverage this sparsity for a performance boost (through compute reduction), enhanced power efficiency, and bandwidth reduction. Retraining neural networks can further increase their sparsity and achieve maximum performance from the DNA 100 processor’s sparse compute engine. As a result, the DNA 100 processor delivers both high performance and power efficiency across a full range of compute, from 0.5 TeraMAC (TMAC) to hundreds of TMACs.
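
As a rough software illustration of why zeros in weights and activations reduce compute, the sketch below counts how many multiply-accumulate (MAC) operations a dot product needs when every pair containing a zero is skipped. It is only a conceptual model and does not describe how the DNA 100 hardware engines are implemented; the function name `sparse_dot` and the sample values are assumptions made for this example.

```python
def sparse_dot(weights, activations):
    """Dot product that skips any weight/activation pair containing a zero.

    Returns (result, macs_performed, macs_skipped).
    Hypothetical helper for illustration only.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have equal length")
    total = 0
    performed = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            continue  # the product is zero, so no MAC is needed
        total += w * a
        performed += 1
    skipped = len(weights) - performed
    return total, performed, skipped


# Made-up sample data: zeros appear in both weights and activations.
weights = [0, 3, 0, -2, 5, 0, 0, 1]
activations = [4, 0, 7, 6, 2, 0, 9, 3]

result, performed, skipped = sparse_dot(weights, activations)
print(result, performed, skipped)  # 1 3 5
```

In this toy case only 3 of the 8 MACs are executed, yet the result matches the dense dot product. Raising sparsity, for example through retraining, increases the skipped fraction further.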