

# LOIHI ARCHITECTURE OVERVIEW

Mike Davies, Director, Neuromorphic Computing Lab | Intel Labs

March 29, 2019 | Neuro-Inspired Computational Elements, SUNY Polytechnic Institute

# **LEGAL INFORMATION**

This presentation contains the general insights and opinions of Intel Corporation ("Intel"). The information in this presentation is provided for information only and is not to be relied upon for any other purpose than educational. Intel makes no representations or warranties regarding the accuracy or completeness of the information in this presentation. Intel accepts no duty to update this presentation based on more current information. Intel is not liable for any damages, direct or indirect, consequential or otherwise, that may arise, directly or indirectly, from the use or misuse of the information in this presentation.

Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer.

No computer system can be absolutely secure. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel, the Intel logo, Movidius, Core, and Xeon are trademarks of Intel Corporation in the United States and other countries.

\*Other names and brands may be claimed as the property of others

Copyright © 2019 Intel Corporation.

# **The Engineering Perspective**

- Nature has come up with something amazing. Let's copy it...
- Not so simple: very different design regimes
- Yet objectives and constraints are largely the same...
  - **Energy minimization**
  - Fast response time
  - Cheap to produce

Need to understand and apply the basic principles, *adapting for differences* 

### Status today:

|                               | Nature               | Silicon                          | Ratio |
|-------------------------------|----------------------|----------------------------------|-------|
| Neuron density <sup>[1]</sup> | 100k/mm<sup>2</sup>  | 5k/mm<sup>2</sup>                | 20x   |
| Synaptic area <sup>[1]</sup>  | 0.001 µm<sup>2</sup> | 0.4 µm<sup>2</sup> <sup>[2]</sup> | 400x  |
| Synaptic Op Energy            | ~2 fJ                | ~4 pJ                            | 2000x |

[1] Planar neocortex [2] ~5b SRA

But:

|                     | Nature | Silicon | Ratio       |
|---------------------|--------|---------|-------------|
| Max firing rate     | 100 Hz | 1 GHz   | 10,000,000x |
| Synaptic error rate | 75%    | 0%      | ∞           |

| Nature                           | Silicon                             |
|----------------------------------|-------------------------------------|
| Autonomous self-assembly         | Fabricated manufacturing            |
| Per-instance variability desired | Variability causes brittle failures |
| Limited plasticity over lifetime | Must support rapid reprogramming    |
| Nondeterministic operation       | Deterministic operation desired     |

### **Exploiting Sparsity with Spikes**













### **Chip Architecture**

| Parameter       | Value                |
|-----------------|----------------------|
| Technology:     | 14nm                 |
| Die Area:       | 60 mm<sup>2</sup>    |
| Core area:      | 0.41 mm<sup>2</sup>  |
| NmC cores:      | 128 cores            |
| x86 cores:      | 3 LMT cores          |
| Max # neurons:  | 128K neurons         |
| Max # synapses: | 128M synapses        |
| Transistors:    | 2.07 billion         |
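The figures in the chip table imply some per-core budgets. The short Python sketch below works them out. It assumes that "K" and "M" are binary prefixes (1024 and 1024²) and that neurons and synapses are spread evenly across the 128 neuromorphic cores; neither assumption is stated on the slide, and the variable names are illustrative only.

```python
# Values copied from the chip table; K/M are ASSUMED to be binary prefixes.
NUM_CORES = 128
CORE_AREA_MM2 = 0.41
DIE_AREA_MM2 = 60.0
MAX_NEURONS = 128 * 1024
MAX_SYNAPSES = 128 * 1024 * 1024

# ASSUMPTION: resources are divided evenly across the neuromorphic cores.
neurons_per_core = MAX_NEURONS // NUM_CORES
synapses_per_core = MAX_SYNAPSES // NUM_CORES

# Share of the die occupied by the neuromorphic cores themselves.
core_area_total = NUM_CORES * CORE_AREA_MM2
core_area_fraction = core_area_total / DIE_AREA_MM2

print(f"neurons per core:  {neurons_per_core}")
print(f"synapses per core: {synapses_per_core}")
print(f"total core area:   {core_area_total:.2f} mm^2 "
      f"({core_area_fraction:.0%} of the die)")
```

Under these assumptions each core holds 1024 neurons and about one million synapses, and the 128 cores together account for roughly 52 mm² of the 60 mm² die.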

#### Low-overhead NoC fabric

- 8x16-core 2D mesh
- Scalable to 1000s of cores
- Dimension order routed
- Two physical fabrics
- 8 GB/s per hop
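Dimension-order routing can be illustrated with a small Python sketch. It assumes a plain grid with one routing node per core and that the X dimension is fully resolved before Y; the slide states only that the mesh is dimension order routed, so the coordinate convention, the dimension ordering, and the function name `xy_route` are assumptions made for illustration.

```python
def xy_route(src, dst):
    """Return the mesh coordinates visited from src to dst.

    ASSUMPTION: the X dimension is resolved first, then Y.
    """
    x, y = src
    path = [(x, y)]
    step_x = 1 if dst[0] > x else -1
    while x != dst[0]:          # travel along X until the column matches
        x += step_x
        path.append((x, y))
    step_y = 1 if dst[1] > y else -1
    while y != dst[1]:          # then travel along Y until the row matches
        y += step_y
        path.append((x, y))
    return path


route = xy_route((0, 0), (3, 2))
hops = len(route) - 1
print(route, hops)
```

Because every message resolves its dimensions in the same fixed order, the path between two cores is deterministic and the hop count equals the Manhattan distance between them (5 hops in the example).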



### Neuromorphic Core Architecture



### Neuromorphic Core Microarchitecture



# **Basic Core Operation (Non-Learning)**

(Time multiplexing illustrated unrolled in space)



# Learning with Synaptic Plasticity

- Local learning rules: an essential property for efficient scalability
- Rules derived by optimizing an emergent statistical objective
- Plasticity on a wide range of time scales for:
  - ✓ Immediate supervised (labelled) learning
  - ✓ Unsupervised self-organization
  - ✓ Working memory
  - ✓ Reinforcement-based delayed feedback



Learning rules for weight $W_{x,y}$ may *only* access presynaptic state $x$ and postsynaptic state $y$

**Reward spikes** may be used to distribute graded reward/punishment values to a particular set of axon fanouts



### Loihi's Trace-Based Programmable Learning



### Learning Rule Examples

Pairwise STDP:

$$W(t+1) = W(t) - A_{-}x_{0}(t)y_{1}(t) + A_{+}x_{1}(t)y_{0}(t)$$
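The pairwise rule can be stepped through in a few lines of Python. This is a minimal sketch under stated assumptions: $x_0$ and $y_0$ are taken to be binary spike indicators for the current time step, and $x_1$ and $y_1$ are taken to be exponentially decaying traces that receive a fixed impulse on each spike. The floating-point arithmetic, the `decay` and `impulse` parameters, and the function name are hypothetical choices for illustration, not Loihi's actual fixed-point behavior.

```python
def pairwise_stdp(pre_spikes, post_spikes, w0=0.0,
                  a_plus=1.0, a_minus=1.0, impulse=1.0, decay=0.5):
    """Apply W(t+1) = W(t) - A_- x0 y1 + A_+ x1 y0 once per time step."""
    w = w0
    x1 = 0.0  # presynaptic trace (ASSUMED exponential filter of pre spikes)
    y1 = 0.0  # postsynaptic trace (ASSUMED exponential filter of post spikes)
    history = []
    for x0, y0 in zip(pre_spikes, post_spikes):
        # Decay each trace, then add an impulse if the neuron spiked now.
        x1 = x1 * decay + impulse * x0
        y1 = y1 * decay + impulse * y0
        # Depress on a pre spike (scaled by the post trace),
        # potentiate on a post spike (scaled by the pre trace).
        w = w - a_minus * x0 * y1 + a_plus * x1 * y0
        history.append(w)
    return history


# Pre spike at t=0 followed by post spike at t=1: potentiation.
print(pairwise_stdp([1, 0, 0], [0, 1, 0]))   # [0.0, 0.5, 0.5]
# Post spike at t=0 followed by pre spike at t=1: depression.
print(pairwise_stdp([0, 1, 0], [1, 0, 0]))   # [0.0, -0.5, -0.5]
```

The rule only reads the presynaptic state (`x0`, `x1`) and the postsynaptic state (`y0`, `y1`) of the synapse being updated, which is the locality property the learning slides emphasize.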

Triplet STDP with heterosynaptic decay:

$$W(t+1) = W(t) - A_{-}x_{0}(t)y_{1}(t) + A_{+}x_{1}(t)y_{0}(t)y_{2}(t) - B \cdot W(t) \cdot y_{3}(t)$$

Delay STDP:

$$D(t+1) = D(t) - A_{-}x_{0}(t)\,(127 - y_{1}(t)) + A_{+}(127 - x_{1}(t))\,y_{0}(t)$$

### Two-variable Learning Rule Examples

Distal Reward with Synaptic Tags:

$$T(t+1) = T(t) - A_{-}x_{0}(t)y_{1}(t) + A_{+}x_{1}(t)y_{0}(t) - B \cdot T(t)$$

$$W(t+1) = W(t) + C \cdot r_1(t) \cdot T(t)$$
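The two-variable rule can be sketched as a single update step in Python. The sketch assumes that $r_1(t)$ is a reward trace driven by reward spikes and that both equations read the tag value $T(t)$ from before the update; the constants and the function name `distal_reward_step` are hypothetical.

```python
def distal_reward_step(w, tag, x0, x1, y0, y1, r1,
                       a_plus=1.0, a_minus=1.0, b=0.25, c=0.5):
    """One step of the tag (T) and weight (W) updates.

    Both updates read the tag value from time t, matching the equations.
    """
    # The tag accumulates STDP-like coincidences and decays at rate b.
    new_tag = tag - a_minus * x0 * y1 + a_plus * x1 * y0 - b * tag
    # The weight changes only when reward (r1) coincides with a nonzero tag.
    new_w = w + c * r1 * tag
    return new_w, new_tag


# Step 1: a post spike follows recent pre activity; no reward yet.
w, tag = distal_reward_step(0.0, 0.0, x0=0, x1=0.5, y0=1, y1=1.0, r1=0.0)
print(w, tag)   # 0.0 0.5
# Step 2: no spikes, but a delayed reward arrives and converts the tag.
w, tag = distal_reward_step(w, tag, x0=0, x1=0.25, y0=0, y1=0.5, r1=1.0)
print(w, tag)   # 0.25 0.375
```

The tag acts as a short-lived eligibility record: the spike pairing in step 1 changes only the tag, and the weight moves in step 2 when the reward arrives while the tag is still nonzero.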

STDP with dynamic weight consolidation:

$$W(t+1) = W(t) - A_{-}x_{0}(t)y_{1}(t) + A_{+}x_{1}(t)y_{0}(t)y_{2}(t) - B_{1}(W(t)-T(t))y_{3}(t)y_{0}(t)$$

$$T(t+1) = T(t) + \frac{1}{\tau_{cons}}(W(t)-T(t)) - B_2 T(t)(w_{\theta} - T(t))(w_{max} - T(t))$$

# **Hierarchical Connectivity**



### **Multi-Compartment Neurons**





### Dendritic Compartment Unit Model





### Dendritic Compartments: Structural Model




### Min/Max Threshold Homeostasis

Loihi supports intrinsic excitability homeostasis (aka threshold adaptation)

### **Dynamics:**

$$\Delta V_{th}(t) = \begin{cases} \beta(a(t) - a_{min}), & \text{if } a(t) < a_{min} \\ \beta(a(t) - a_{max}), & \text{if } a(t) > a_{max} \end{cases}$$
$$V_{th}(t) = V_{th}(t - T_{epoch}) + \Delta V_{th}(t)$$

(in terms of neuron's *activity trace* a(t))

Evaluated once every *dendritic epoch* (usually set equal to the learning epoch).

### **Parameters:**

| Parameter       | Bits | Definition                                                                 |
|-----------------|------|----------------------------------------------------------------------------|
| a<sub>max</sub> | 7    | Maximum activity level, above which Vth will be raised.                    |
| a<sub>min</sub> | 7    | Minimum activity level, below which Vth will be lowered.                   |
| β               | 4    | Scaling constant relating activity trace differences to threshold changes. |
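The threshold dynamics can be sketched in Python as follows. The sketch assumes that the threshold is left unchanged while the activity trace lies between $a_{min}$ and $a_{max}$ (the slide's equation gives only the two out-of-range cases), and it treats β as a plain integer multiplier; how the 4-bit β field is actually scaled in hardware is not stated here. The function names are hypothetical.

```python
def threshold_delta(a, a_min, a_max, beta):
    """Return the change in V_th for one dendritic epoch."""
    if a < a_min:
        return beta * (a - a_min)   # negative: threshold is lowered
    if a > a_max:
        return beta * (a - a_max)   # positive: threshold is raised
    return 0                        # ASSUMPTION: no change inside the band


def run_homeostasis(activity_per_epoch, v_th0, a_min, a_max, beta):
    """Apply V_th(t) = V_th(t - T_epoch) + delta once per epoch."""
    v_th = v_th0
    trajectory = []
    for a in activity_per_epoch:
        v_th += threshold_delta(a, a_min, a_max, beta)
        trajectory.append(v_th)
    return trajectory


# Activity inside the band, then too high, then too low.
print(run_homeostasis([50, 100, 10], v_th0=1000, a_min=20, a_max=80, beta=2))
# [1000, 1040, 1020]
```

Raising the threshold when activity is too high and lowering it when activity is too low pushes the neuron's activity trace back toward the $[a_{min}, a_{max}]$ band.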

### **Example Homeostasis Dynamics**

*Figure: neuron with abrupt input rate change. Synaptic input drops abruptly at t=5000. Plotted against time step: input spike rate and activity (0-127), with a<sub>min</sub> and a<sub>max</sub> marked.*



### **Other Synaptic Features**



### Mesh Operation: Fine-Grained Synchronization



Time step T begins.

Cores update dynamic neuron state and evaluate firing thresholds



Above-threshold neurons send spike messages to fanout cores

(Two neuron firings shown.)



All neurons that fire in time T route their spike messages to all destination cores.
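The per-time-step sequence described above can be sketched as a small Python simulation. It is a deliberately simplified model: neurons are plain integrators that reset to zero after firing, spike delivery is modeled as adding a weight to an inbox, and the synchronization between cores is represented only by the loop structure (no step begins until every message from the previous step has been delivered). All names and the neuron model are assumptions made for illustration.

```python
def run_time_steps(fanout, threshold, external, num_steps):
    """Simulate update -> fire -> route for a fixed number of time steps.

    fanout:    {neuron: [(destination, weight), ...]}
    threshold: {neuron: firing threshold}
    external:  {(time_step, neuron): injected input}
    """
    neurons = sorted(threshold)
    v = {n: 0 for n in neurons}        # dynamic neuron state
    inbox = {n: 0 for n in neurons}    # input delivered during the last step
    fired_log = []
    for t in range(num_steps):
        # Phase 1: update state and evaluate firing thresholds.
        fired = []
        for n in neurons:
            v[n] += inbox[n] + external.get((t, n), 0)
            inbox[n] = 0
            if v[n] > threshold[n]:
                fired.append(n)
                v[n] = 0               # ASSUMED reset-to-zero on firing
        # Phase 2: above-threshold neurons send spikes to their fanout.
        for n in fired:
            for dst, weight in fanout.get(n, []):
                inbox[dst] += weight
        # Step t+1 starts only after all spikes of step t are delivered.
        fired_log.append(fired)
    return fired_log


fanout = {"a": [("b", 5)], "b": [("c", 5)]}
threshold = {"a": 4, "b": 4, "c": 4}
print(run_time_steps(fanout, threshold, {(0, "a"): 5}, num_steps=3))
# [['a'], ['b'], ['c']]
```

In the example a spike injected at neuron `a` advances one stage per time step, because spikes sent during step T take effect only in step T+1.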





### Exploring Mesh Scaling to 32 Chips: Graph Search on Nahuku (32-chip Loihi System)



