Registered Member #11591
Joined: Wed Mar 20 2013, 08:20PM
Location: UK
Posts: 556
I was following this on Kickstarter a while ago. It is a custom ASIC with 16 or 64 individual processing cores, aimed at parallel workloads like hash cracking and machine vision. I wouldn't mind one. The 16-core chip is the co-processor; the host processor combines an ARM CPU with an FPGA, for lots of power!
Registered Member #30
Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706
Yes, it is a very powerful (for its size, cost and power budget) parallel computer. The host processor is a Xilinx Zynq, which is a dual-core ARM (running Linux) combined with an FPGA on the same chip. I think the FPGA is mainly used to handle communications between the host processor, the Epiphany chip and the external memory, but you can add your own IP for things like display controllers. The FPGA is supported by Xilinx's free toolset, so you don't need to buy anything to start messing with it.
Then you have the Epiphany ASIC. Each of the parallel processing cores is a quite capable device with 32 KB of local memory and hardware floating point. You can program it in C/C++ using GCC. It supports OpenMP and OpenCL, the industry-standard parallel processing APIs, or you can roll your own. Communication between cores uses a shared-memory model.
Registered Member #65
Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
This architecture physically separates the core silicon from memory, and so must eventually be limited by its cache. A shared-memory model makes that contention and pipeline-miss problem far more severe.
OpenCL is a buggy abstraction layer with a nonzero per-transaction cost. Anything that runs with overlapping subproblems essentially thrashes the coordination algorithms.
The $25 Raspberry Pi has a 16-core GPU (now open source) with DMA access on the SoC. Even with spatial locality, the theoretical gains for some types of problems have a nonlinear relationship to actual performance improvements. However, the FFT works great if you don't need some sort of real-time DSP.
For the $120 price (and the same memory-copy model), anyone can buy an older PCIe x16 nVidia card with 200+ CUDA cores at 1 GHz+. Note that the same restrictions still hold... and your problem might still be solved first by an i7 CPU from years ago.
There are several FPGA-plus-ASIC-CPU hybrid chips that have been around for a while, but in general they are only appropriate for a small set of problems like live video/audio/RF-signal stream processing.
GCC will usually only guarantee functional correctness; the machine code it outputs is not even close to that of a high-performance compiler.
Registered Member #30
Joined: Fri Feb 03 2006, 10:52AM
Location: Glasgow, Scotland
Posts: 6706
While this is true, it is really a critique of the concept of parallel computing itself, not of the Epiphany architecture as such. Not all problems in computing can be parallelised efficiently, and none of them can be parallelised automatically; you have to figure that part out yourself.
Registered Member #11591
Joined: Wed Mar 20 2013, 08:20PM
Location: UK
Posts: 556
Carbon_Rod wrote ...
... For the $120 price (and the memory copy model), anyone can buy an older PCIx16 based 200+ core 1GHz+ nVidia card with CUDA . Note the same restrictions will hold.... and your problem might still be solved first by a i7 CPU from years ago. ...
The Parallella is a lot more efficient than an old GPU.
Registered Member #11591
Joined: Wed Mar 20 2013, 08:20PM
Location: UK
Posts: 556
GFLOPS/W (equivalently, GFLOP/J) is how you usually define the efficiency of parallel processors. I was just thinking it would be difficult to put an Nvidia GPU in a battery-powered machine vision robot/device, for example.
Registered Member #65
Joined: Thu Feb 09 2006, 06:43AM
Location:
Posts: 1155
I have seen teams running several nVidia-GPU-based laptops on robot platforms. More importantly, they are actually able to push the design cycle forward without orphan standards or board-support-package problems.
The new Tegra cores are a nice little SoC, but the ASIC codecs are usually where most of the energy-efficiency gains are made. New mobile ARM-core SoC processors are more common than desktops now...