Abstract
Summary form only given. Design productivity is still a challenge for reconfigurable computing. It is expensive to port a software workload to RTL, to maintain the RTL as the workload evolves, and to wait for hours to recompile a bitstream after each design change. Soft processors can help mitigate these costs, and provide new pathways to application acceleration. A mid-range FPGA can now host hundreds of soft CPUs and their interconnection network, and such heterogeneous massively parallel processor and accelerator arrays can sustain hundreds of operations, memory accesses, and branches per cycle. This talk will look back on the history and diversity of soft processor cores for FPGAs, and their continuing relevance for the decade ahead. What new tools, IP, and infrastructure will help us to exploit the coming million LUT, 10 TFLOPS FPGAs? Along the way we will revisit an austere design esthetic and an implementation methodology for crafting FPGA-optimized soft cores, and see how the lessons of mapping one processor into one 1995 FPGA can inform us how to design massively parallel programmable accelerators going forward.