What should the next-generation SpiNNaker system look like?
Protocol of the first meeting on 26.04.2016 at 10 pm
(thanks to Simon for taking notes)
The discussion covered numerous limitations of the current system that could be addressed in the next generation. They are grouped here according to the main sub-parts of the SpiNNaker system:
Processing
(processor cores + extensions -> implementing models)
- Floating-point calculations are required for certain models/applications (e.g. matrix multiplications/inversions, models that require ODE solvers)
- General agreement that we will be running more complex models in the future, perhaps with multi-compartment models, dendritic branches, etc. That would shift the communication/compute balance significantly towards the compute side.
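As a rough illustration of the kind of per-neuron arithmetic meant by the two points above (not an agreed model), the following minimal Python sketch does a forward-Euler update of a hypothetical two-compartment neuron; all names and constants are assumptions. The per-timestep work grows with the number of compartments, and ODE updates of this kind are far more convenient with native floating-point support than with the current fixed-point arithmetic.

    # Minimal sketch (illustration only, not an agreed model): a forward-Euler
    # update for a hypothetical two-compartment neuron. All names and constants
    # are assumptions; the point is that the per-neuron arithmetic grows with
    # the number of compartments and benefits from a native FPU.

    DT = 0.1          # ms, integration timestep
    TAU_M = 20.0      # ms, membrane time constant
    G_COUPLE = 0.05   # soma-dendrite coupling conductance (arbitrary units)
    E_REST = -65.0    # mV, resting potential

    def step_two_compartment(v_soma, v_dend, i_soma, i_dend):
        """One Euler step for a soma plus a single dendritic compartment."""
        coupling = G_COUPLE * (v_dend - v_soma)
        dv_soma = (-(v_soma - E_REST) / TAU_M + coupling + i_soma) * DT
        dv_dend = (-(v_dend - E_REST) / TAU_M - coupling + i_dend) * DT
        return v_soma + dv_soma, v_dend + dv_dend

    v_s, v_d = E_REST, E_REST
    for _ in range(1000):     # 100 ms of simulated time at 0.1 ms steps
        v_s, v_d = step_two_compartment(v_s, v_d, i_soma=0.5, i_dend=1.0)
    print(v_s, v_d)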
Memory
(internal/external -> state variables, synaptic matrix and parameters)
- Local memory could become a more critical constraint if complexity of models increases
- External memory access did not seem to be problematic (it may be masked by, or only become visible through, the communication limitations)
Communication
(internal and IO -> spike communication, interfacing to the outside world)
- Easier connectivity for AER sensors (to avoid needing an external FPGA; could the FPGA on the SpiNNaker board be employed?), and more generally for AER-compatible devices
- Direct WiFi connectivity may also be useful for independent systems (robots, drones, etc.); it would allow a bigger, stationary SpiNNaker system to control a mobile agent
- Data load times and data read times were felt to be a rather severe obstacle to using the current system
- Especially important for models with explicit connectivity (e.g. pyNN.FromListConnector) and for parameter sweeps; see the PyNN sketch after this list
- Improved bandwidth from/to host needed
- Spike packet drops are encountered in some benchmarks
- Better diagnostics would help to give more informative feedback (where, when, and which packets were dropped) to the user
- Longer routing keys in the SpiNNaker router could be beneficial for more targeted routing. They would also be required for interfacing with new/future retinas/AER chips with larger numbers of pixels/neurons: a current ATIS interface already uses all available bits in the routing key and would benefit from longer keys. Longer payloads might also help for certain applications.
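To illustrate the load-time point above: with explicit connectivity the complete connection list is generated on the host and has to be transferred to the machine. The following is a minimal PyNN sketch, assuming the sPyNNaker backend is importable as pyNN.spiNNaker; population sizes, weights and delays are made-up values.

    # Minimal PyNN sketch (assumes the sPyNNaker backend is available as
    # pyNN.spiNNaker; sizes, weights and delays are made-up values). With
    # FromListConnector the complete connection list is built on the host
    # and must be uploaded, which is where data-load time becomes the
    # bottleneck, especially when repeated for every run of a parameter sweep.
    import pyNN.spiNNaker as sim

    sim.setup(timestep=1.0)

    pre = sim.Population(1000, sim.IF_curr_exp())
    post = sim.Population(1000, sim.IF_curr_exp())

    # One explicit (pre, post, weight, delay) tuple per connection:
    # even this modest example is already one million entries to transfer.
    conn_list = [(i, j, 0.1, 1.0) for i in range(1000) for j in range(1000)]
    proj = sim.Projection(pre, post, sim.FromListConnector(conn_list),
                          synapse_type=sim.StaticSynapse())

    sim.run(100.0)
    sim.end()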
General
- Mobile systems (smaller than the 48-node board: one chip or a few chips in a compact package) would be quite useful, e.g. to fit on small robots
- Scalability: SpiNNaker is scalable in principle, but how can the constraints that apply to large-scale models be assessed? Communication bandwidth vs. the number of router entries seems to be a likely limiting factor (see the estimate sketched after this list)
- Do not only scale the network size, but also make models more detailed (dendritic branches, etc.)
- This could be investigated with two possible approaches: either a set of example networks, i.e. networks that users work with anyway, or dedicated synthetic tests that analyze the constraints in a more targeted fashion
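As a possible starting point for the scalability question, a back-of-envelope sketch in Python. The 1024-entry multicast routing table is taken as the assumed per-router limit, and the worst case assumes that no entries can be merged; the population count is a made-up example.

    # Back-of-envelope sketch for the scalability question above. Assumption
    # (to be checked): each SpiNNaker router has 1024 multicast routing
    # entries, and in the worst case every source population whose traffic
    # passes through a router needs its own entry (no merging, no default
    # routes).

    ROUTER_ENTRIES = 1024          # multicast table entries per router (assumption)
    populations = 5000             # hypothetical number of source populations

    # If a router in the middle of the machine has to distinguish all of them
    # individually, its table overflows by this factor:
    overflow = populations / ROUTER_ENTRIES
    print(f"router table overflow factor: {overflow:.1f}x")

    # Equivalently, keys would have to be assigned so that on average this
    # many populations share one routing entry:
    print(f"required average key merging: {overflow:.1f} populations per entry")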
Protocol of session on 29.04.2016 (Felix/Christian)
Network-on-chip improvements
- Secure packet transmission mode / avoid packet loss
- required for non-spike data (tbd; data that is more critical than, e.g., losing a couple of spikes)
- self-check features of links and higher NoC layers?
- Deadlock (FIFOs filling up) seems to be a large factor in current SpiNNaker spike packet loss; how can it be avoided?
- per-direction (directed) communication channels (since current SpiNNaker uses one event queue for both incoming and outgoing traffic, bidirectional communication can lead to deadlock)
- increase bandwidth/FIFO sizes
- Extend the routing-table key length for the increased system size of SpiNNaker 2
- Memory Address Mapping
Processor
- Double Precision floating point unit
- DMA/Memory Improvements (e.g. Read-Modify-Request)
- Memory Partition (shared SRAM access for more than one core)
- Embedding an FPGA-like structure as a configurable hardware accelerator
Configuration/setup
- Reduce configuration time (priority issue; see the load-time estimate after this list)
- increase external bandwidth
- implement on-SpiNNaker configuration (self-mapping, etc.)
- Online interactions/reconfiguration
- Protocols to implement this
- what needs to be reconfigured
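To make the configuration-time point concrete, a rough load-time estimate, assuming a single 100 Mbit/s Ethernet link into a board and an illustrative data volume; the real figures depend on the model and the tool chain.

    # Rough load-time estimate motivating the priority on configuration time.
    # Assumptions: data enters a board through a single 100 Mbit/s Ethernet
    # link (as on current boards), the volume is dominated by the synaptic
    # matrices, and protocol overhead is ignored. Numbers are illustrative only.

    LINK_MBIT_PER_S = 100.0

    def load_time_seconds(data_megabytes, link_mbit_per_s=LINK_MBIT_PER_S):
        """Time to push `data_megabytes` of configuration data through the link."""
        return data_megabytes * 8.0 / link_mbit_per_s

    # Example: 1 GB of synaptic data (made-up figure) over one 100 Mbit/s link.
    print(f"{load_time_seconds(1024):.0f} s")   # ~82 s, before any protocol overhead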
General
- Benchmark/Profiling/Debug Features
- Better detection of routing/execution errors (real-time violations)
Protocol of session on 03.05.2016 (Andreas/Sebastian)
- State-recording might require additional memory bandwidth and capacity
- 0.1 ms timestep --> UMAN to list the implications of that
- TUD evaluates HMC as a memory solution, with a focus on power reduction at low utilization (e.g. sleep modes of the SerDes transceivers)
- Portable systems might power down the HMC or use fewer than 4 links for power-saving reasons.
- Memory discussion ongoing with UMAN
- Hardware accelerators:
- e.g. exp, log, sqrt, logistic function … (TUD provides an initial list, to be discussed with UMAN; evaluate HW overhead); see the sketch after this list
- DMA that supports memory access of arrays
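For reference, a small Python sketch of where the listed functions typically appear in per-timestep updates; constants are made up. This is only meant to illustrate where hardware acceleration would be used, not to fix the accelerator list.

    # Sketch of where the proposed accelerator functions show up in typical
    # per-timestep updates (illustration only; constants are made up).
    import math

    DT = 0.1        # ms, timestep
    TAU_SYN = 5.0   # ms, synaptic time constant

    # exp: exponential decay of a synaptic current each timestep.
    def decay_synapse(i_syn):
        return i_syn * math.exp(-DT / TAU_SYN)

    # logistic function: e.g. a sigmoidal rate/transfer function.
    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    # sqrt: e.g. Euclidean distance for distance-dependent connectivity.
    def distance(x1, y1, x2, y2):
        return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

    print(decay_synapse(1.0), logistic(0.5), distance(0, 0, 3, 4))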