
(Humans Invent)   The world's first crowdsourced, affordable supercomputer   (humansinvent.com)
    More: Cool

3092 clicks; posted to Geek » on 29 Jul 2013 at 12:04 PM



24 Comments
 
2013-07-29 12:40:09 PM
Pretty interesting. I'd like to see whether some of the applications I routinely work with could easily be built for the architecture, although the 1GB of RAM on the board would be a major limitation.
 
2013-07-29 12:44:46 PM
Humans Invent proves again to be a shiatty website. It is complete bollocks to just add up the clock frequencies of each core to get to the total clock frequency. Don't get me wrong, I like the Adapteva thing; I actually have eight boards on order.
 
2013-07-29 01:26:45 PM

The wonderful travels of a turd: It is complete bollocks to just add up the clock frequencies of each core to get to the total clock frequency.


Holy shiat, thanks for saving me the time I would have spent reading that article. Christ, what a maroon.

For everyone else who's interested, just go to the Parallella website.
 
2013-07-29 02:09:07 PM
In addition to those things you guys pointed out: what makes a supercomputer a supercomputer in the first place is the parallel interconnect: multiple parallel connections to multiple boards all at once. The overhead of TCP over Ethernet is too high and will bottleneck memory and CPU communication across nodes. If they wanted to make it perfect, each board would have 4 InfiniBand connections, either 4x or 12x, which would allow the kind of bandwidth across the bus necessary to prevent bottlenecks between nodes. This is nothing more than a glorified Beowulf cluster with boards that contain a bunch of cores.
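
A rough back-of-the-envelope comparison of what shipping 100MB between two nodes costs on each fabric (the usable-bandwidth and latency numbers are assumptions, round figures, not measurements):

    # Sketch only: assumed usable bandwidth (bytes/s) and one-way latency (s).
    payload = 100e6  # bytes to move between two nodes

    links = {
        "GigE + TCP    (~0.9 Gbit/s usable, ~50 us)": (0.9e9 / 8, 50e-6),
        "InfiniBand 4x  (~32 Gbit/s usable, ~2 us)": (32e9 / 8, 2e-6),
        "InfiniBand 12x (~96 Gbit/s usable, ~2 us)": (96e9 / 8, 2e-6),
    }

    for name, (bandwidth, latency) in links.items():
        # transfer time = latency + bytes / bandwidth
        print(f"{name}: {latency + payload / bandwidth:.4f} s")

Roughly 0.9s on gigabit Ethernet versus 8ms on a 12x link; that two-orders-of-magnitude gap is the bottleneck being described.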
 
2013-07-29 02:58:24 PM
The two things a supercomputer needs are lots and lots of double precision FLOPS, and lots and lots of bandwidth to feed them.

It doesn't have double precision (only single) and, as HindiDiscoMonster pointed out, insufficient interlinks to supply the bandwidth.  If you really want a supercomputer, try buying an AMD 6950 (or Nvidia 680 or 780) and writing something seriously parallel (although with ~3GB of memory, other supercomputers will kick sand in your face).

Basically, single precision is only good for problems that have to be solved in real time.  Doing an FFT on 32k samples of audio data already sounds pretty bad just due to accumulated rounding errors.  Trying to do real supercomputing on it is just a joke (but it might be great for serious DSP jobs).
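
A quick way to see the single-precision accumulation problem, with a naive running sum standing in for the butterfly adds inside an FFT (the exact drift varies by machine, but the direction doesn't):

    # Sketch: accumulate 32k values naively in float32 vs. a float64 reference.
    import numpy as np

    x = np.full(32 * 1024, 0.1)

    acc = np.float32(0.0)
    for v in x.astype(np.float32):   # the way a simple DSP loop would accumulate
        acc += v                     # each add rounds to 24-bit precision
    print("float32 running sum:", acc)        # drifts visibly from 3276.8
    print("float64 reference:  ", np.sum(x))  # ~3276.8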
 
2013-07-29 03:08:34 PM
Can it play Minecraft on high detail at a min of 60 FPS? No? Then GTFO.
 
2013-07-29 04:08:54 PM

yet_another_wumpus: The two things a supercomputer needs are lots and lots of double precision FLOPS, and lots and lots of bandwidth to feed them.

It doesn't have double precision (only single) and, as HindiDiscoMonster pointed out, insufficient interlinks to supply the bandwidth.  If you really want a supercomputer, try buying an AMD 6950 (or Nvidia 680 or 780) and writing something seriously parallel (although with ~3GB of memory, other supercomputers will kick sand in your face).

Basically, single precision is only good for problems that have to be solved in real time.  Doing an FFT on 32k samples of audio data already sounds pretty bad just due to accumulated rounding errors.  Trying to do real supercomputing on it is just a joke (but it might be great for serious DSP jobs).


my dream is a collection of 12xTegra servers using the K20x chip and interlinked via InfiniBand... and if I ever win the lottery, I will build my very own supercomputer from it.

Dual Xeon G8 quad-core systems with 128GB RAM per CPU @ PC12800, running with 12 K20X Tegra GPU cards... they are, I believe, 6U systems, so figure 6 systems per rack, and let's say 10 racks. That would be 60 systems total using a fiber-mapped InfiniBand interlink system @ 12X...

/aaaaah fantasies...
//good way to blow the lottery winnings.
 
2013-07-29 04:13:57 PM
Dammit... TESLA... replace all instances of Tegra with TESLA... ie: sed -i 's/Tegra/TESLA/g' lastpost

:P
/Oh yes... I would also install the Intel Phi coprocessors as well.
 
2013-07-29 04:42:16 PM
Well, with Eben Upton (of Raspberry Pi) apparently saying that he wishes he'd donated to it, and more and more articles devoted to the Parallella, it looks like it's not vapourware...

Now to find software to run on it...

/Ordered 2x16-cores, was hoping that they'd reach the funding minimum for the 64x
 
2013-07-29 05:28:44 PM

entropic_existence: Pretty interesting. I'd like to see whether some of the applications I routinely work with could easily be built for the architecture, although the 1GB of RAM on the board would be a major limitation.


Yeah, at first I was thinking these might make a nice large-integer factorization machine - it needs only integer ops. But the final matrix reduction step of the quadratic sieve or number field sieve requires a lot of memory. So maybe a pile of these doing the sieving, and one traditional server board parceling out the sieve tasks and combining the results to do the matrix reduction.
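
A toy sketch of that split, with trial division standing in for the actual QS/NFS sieving and a process pool standing in for a pile of cheap boards (everything here is a hypothetical stand-in):

    # Sketch: farm sieve ranges out to workers, combine results on one box.
    from concurrent.futures import ProcessPoolExecutor

    N = 104729 * 1299709   # toy composite of two known primes

    def sieve_range(lo, hi):
        # stand-in for a sieving job over [lo, hi)
        return [d for d in range(max(2, lo), hi) if N % d == 0]

    if __name__ == "__main__":
        step = 200_000
        los = list(range(2, 1_400_000, step))
        his = [lo + step for lo in los]
        with ProcessPoolExecutor() as pool:   # each worker = one cheap board
            found = [d for part in pool.map(sieve_range, los, his) for d in part]
        print(found)                          # [104729, 1299709]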
 
2013-07-29 05:41:53 PM
If you don't understand parallel computing, don't farking write an article about it!

This is the worst technical journalism I've ever seen, and I had a subscription to Wired.
 
2013-07-29 05:45:53 PM

HindiDiscoMonster: yet_another_wumpus: The two things a supercomputer needs are lots and lots of double precision FLOPS, and lots and lots of bandwidth to feed them.

It doesn't have double precision (only single) and, as HindiDiscoMonster pointed out, insufficient interlinks to supply the bandwidth.  If you really want a supercomputer, try buying an AMD 6950 (or Nvidia 680 or 780) and writing something seriously parallel (although with ~3GB of memory, other supercomputers will kick sand in your face).

Basically, single precision is only good for problems that have to be solved in real time.  Doing an FFT on 32k samples of audio data already sounds pretty bad just due to accumulated rounding errors.  Trying to do real supercomputing on it is just a joke (but it might be great for serious DSP jobs).

my dream is a collection of 12xTegra servers using the K20x chip and interlinked via InfiniBand... and if I ever win the lottery, I will build my very own supercomputer from it.

Dual Xeon G8 quad-core systems with 128GB RAM per CPU @ PC12800, running with 12 K20X Tegra GPU cards... they are, I believe, 6U systems, so figure 6 systems per rack, and let's say 10 racks. That would be 60 systems total using a fiber-mapped InfiniBand interlink system @ 12X...

/aaaaah fantasies...
//good way to blow the lottery winnings.


What would you use it for, just out of curiosity? For me, I would like to be the person to completely factor F12. But that is pretty fringe ultranerdy, even among a crowd of ubernerds. Curious what other people want with these systems.
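
For scale, F12 is 2^(2^12) + 1, and both its size and its long-known small factor fit in a two-liner:

    # F12 = 2**(2**12) + 1: a 1,234-digit number with one famous small factor.
    F12 = 2 ** (2 ** 12) + 1
    print(len(str(F12)))          # 1234 decimal digits
    print(F12 % 114689 == 0)      # True: 114689 = 7 * 2**14 + 1 divides F12

The hard part is what's left over after the handful of known prime factors: a huge composite cofactor (well over a thousand digits) that nobody has cracked, which is why "completely factor F12" is such an ultranerd grail.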
 
2013-07-29 07:37:47 PM

HindiDiscoMonster: yet_another_wumpus: The two things a supercomputer needs are lots and lots of double precision FLOPS, and lots and lots of bandwidth to feed them.

It doesn't have double precision (only single) and, as HindiDiscoMonster pointed out, insufficient interlinks to supply the bandwidth.  If you really want a supercomputer, try buying an AMD 6950 (or Nvidia 680 or 780) and writing something seriously parallel (although with ~3GB of memory, other supercomputers will kick sand in your face).

Basically, single precision is only good for problems that have to be solved in real time.  Doing an FFT on 32k samples of audio data already sounds pretty bad just due to accumulated rounding errors.  Trying to do real supercomputing on it is just a joke (but it might be great for serious DSP jobs).

my dream is a collection of 12xTegra servers using the K20x chip and interlinked via InfiniBand... and if I ever win the lottery, I will build my very own supercomputer from it.

Dual Xeon G8 quad-core systems with 128GB RAM per CPU @ PC12800, running with 12 K20X Tegra GPU cards... they are, I believe, 6U systems, so figure 6 systems per rack, and let's say 10 racks. That would be 60 systems total using a fiber-mapped InfiniBand interlink system @ 12X...

/aaaaah fantasies...
//good way to blow the lottery winnings.


I'm pretty sure that there was a Dilbert sequence back in the 1990s (when it was largely a nerd chronicle) in which Dilbert won the lottery (how he won, given his obvious understanding of the underlying math, is long since forgotten).  Dogbert explained the "law of found money" and told him to blow it on something he wanted before he blew it on things he didn't.  The last strip in that sequence ends with him talking to a Cray salesman (really dates the strip now that I think about it).
 
2013-07-29 07:50:09 PM
Allow me to crowd-source this thing into a far superior device.
First, someone needs to invent something like InfiniBand, but with direct male/female connectivity that passes power as well.
Put one of these connectors on each of the two sides of the device, plus above and below, so devices can connect vertically or horizontally.

Then come up with multiple variations of the board (unless stated otherwise, all boards have one processor and 1GB of RAM):

1.) Default... Stand-alone board that works essentially like the Parallella already works, but with the new power/data system. Acts as the primary I/O device.
2.) Display... Adds dedicated HD graphics and graphics RAM plus a mini-DisplayPort (may not have the standard processor + RAM, depending on the size needed to do this).
3.) Power... Takes two power inputs that pass on to other boards.
4.) SATA... Connects to, and powers, two HDDs.
5.) Processor... Has two processors and 2GB RAM instead of one.
6.) Memory... Adds 4GB of RAM instead of just 1GB.
7.) Battery... Adds a battery that will run the system for a short period if power is temporarily lost. This is the only board with no processor on it; it only stores and passes power, and does not pass data like the other boards. So it has to go at the start/end of a line of devices. Works well as the bottom of each column, with power directly above it.

So basically, fully modular computing where each piece adds exactly what the user needs, while also directly increasing the overall power of the device. Frames could be sold that hold X number of modules, including portable ones with a built-in monitor, forming an oversized laptop... They would have a few extra power/data connections that could be used with cables to add more devices on the go if needed.
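
For anyone who wants to play with the arithmetic of that scheme, here's a hypothetical sketch (module names and numbers taken straight from the list above) that totals up what a given frame provides:

    # Hypothetical model of the proposed modules; sums processors and RAM per frame.
    from dataclasses import dataclass

    @dataclass
    class Module:
        name: str
        processors: int = 1   # unless stated otherwise: one processor...
        ram_gb: int = 1       # ...and 1GB of RAM

    CATALOG = {
        "default":   Module("default"),
        "display":   Module("display"),
        "power":     Module("power"),
        "sata":      Module("sata"),
        "processor": Module("processor", processors=2, ram_gb=2),
        "memory":    Module("memory", ram_gb=4),
        "battery":   Module("battery", processors=0, ram_gb=0),  # power only
    }

    def frame_totals(names):
        mods = [CATALOG[n] for n in names]
        return sum(m.processors for m in mods), sum(m.ram_gb for m in mods)

    cpus, ram = frame_totals(["default", "processor", "memory", "sata", "battery"])
    print(cpus, "processors,", ram, "GB RAM")   # 5 processors, 8 GB RAM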
 
2013-07-29 08:20:36 PM

ThrobblefootSpectre: HindiDiscoMonster: yet_another_wumpus: The two things a supercomputer needs are lots and lots of double precision FLOPS, and lots and lots of bandwidth to feed them.

It doesn't have double precision (only single) and, as HindiDiscoMonster pointed out, insufficient interlinks to supply the bandwidth.  If you really want a supercomputer, try buying an AMD 6950 (or Nvidia 680 or 780) and writing something seriously parallel (although with ~3GB of memory, other supercomputers will kick sand in your face).

Basically, single precision is only good for problems that have to be solved in real time.  Doing an FFT on 32k samples of audio data already sounds pretty bad just due to accumulated rounding errors.  Trying to do real supercomputing on it is just a joke (but it might be great for serious DSP jobs).

my dream is a collection of 12xTegra servers using the K20x chip and interlinked via InfiniBand... and if I ever win the lottery, I will build my very own supercomputer from it.

Dual Xeon G8 quad-core systems with 128GB RAM per CPU @ PC12800, running with 12 K20X Tegra GPU cards... they are, I believe, 6U systems, so figure 6 systems per rack, and let's say 10 racks. That would be 60 systems total using a fiber-mapped InfiniBand interlink system @ 12X...

/aaaaah fantasies...
//good way to blow the lottery winnings.

What would you use it for, just out of curiosity? For me, I would like to be the person to completely factor F12. But that is pretty fringe ultranerdy, even among a crowd of ubernerds. Curious what other people want with these systems.


You know... I have no idea, other than the most epic LAN game of Quake Arena... j/k :P
I was thinking it could be used for raytracing, for one thing... I love raytracing, but the problem is that it takes so damn long and I am impatient. I want a parallel raytracer that doesn't cheat... actual raytracing, not cheats like POV/Renderman/etc., the real thing as per the rendering equation... that's just off the top of my head... I am sure I could find many uses for it.
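
For anyone who hasn't seen it, the rendering equation being referred to (Kajiya, 1986) is:

    L_o(x, \omega_o) = L_e(x, \omega_o)
        + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (\omega_i \cdot n)\, d\omega_i

Outgoing light at a point is emitted light plus an integral of incoming light over all directions, weighted by the surface's reflectance. Estimating that integral per pixel is embarrassingly parallel, which is exactly why raytracing eats all the cores you can throw at it.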
 
2013-07-29 08:57:06 PM
If I can make a decent emulator/media box out of this... it might be interesting.

/Raspberry Pi is cool, but has some limitations towards that task
 
2013-07-29 09:54:37 PM
Someone with javascript paste the article here. It's text, so it obviously needs javascript to display.
 
2013-07-30 02:49:23 AM

DigitalCoffee: Can it play Minecraft on high detail at a min of 60 FPS? No? Then GTFO.


Oh, come on. It can't solve the Mideast situation, either. Impossible standards are unhelpful.
 
2013-07-30 07:35:45 AM

Bisu: Someone with javascript paste the article here. It's text, so it obviously needs javascript to display.


ARTICLE (no formatting, just pure text):

When we think of supercomputers, we think of giant server farms, hundreds or thousands of computers lashed together in air conditioned vaults to form a powerful hive mind with one purpose. These machines are capable of burning through seemingly infinite numbers of calculations to solve important problems, from weather forecasting to the ideal molecular structure for cancer drugs - or beating humans on TV quiz shows.

These feats are enabled by parallel computing: the means of making all the processors inside the supercomputer work together to compute in tandem. Think of it like two people doing the job of one so they can both clock off early, but on a vast scale.

It's a technique we've seen emerge on a smaller scale in home computers, tablets and smartphones in recent years, with individual cores on a single processor sharing the workload to compute faster, speed up apps and improve the graphics in your favourite games.
Affordable supercomputer

It's also how the team behind the Parallella are ripping up the rulebook, crafting their own affordable supercomputer that fits in the palm of your hand and can work with others across the world.

This lowly little PCB (printed circuit board) may look like a Raspberry Pi and cost next to nothing but with its astonishing 66 cores - a top end smartphone might have four - this $99 computer can do a lot more, and the creators behind it just open-sourced it. What you're looking at here is the world's first crowdsourced supercomputer.
[Image: The Parallella is a high-performance single-board computer.]

Where the low-power 700MHz Raspberry Pi, which costs under £25, showed that great restraint can inspire even greater creativity, the company behind the Parallella, Massachusetts-based Adapteva, is just hoping to blow the doors off with the amount of oomph it can pack into one £65 board instead.

When the Parallella board ships later this summer, you'll be able to buy a computer the size of a Pi, but one that can compute at a staggering 45GHz of aggregate frequency (sixty-six cores, each capable of cycling at a speed of 700MHz, or issuing 700 million instructions per second).

Andreas Olofsson, the founder and CEO of Adapteva, who previously worked at chip giant Texas Instruments, believes parallel computing is the only way to go. He says, "We're managing to fit more and more transistors onto our processors (and so get faster and faster) every year, but we're not far off atomic levels at this point, and pending some revolutionary scientific breakthrough, that's a brick wall.

"We had one processor running one task at a time and that's worked great, but then we hit a frequency wall, and then we hit a memory bottleneck and things just stopped."
Multicore coprocessor

So chip giants like Intel and minnows like Adapteva alike have turned to parallel computing: having different parts of the processor do more calculations at the same time, instead of having one central part of the processor doing all the calculations one after the other at a slightly faster rate.

It's something we've seen modern computers take advantage of in recent years - Intel's first dual-core PC chips for everyday Windows machines debuted in 2006 - but seldom on the scale Adapteva offers with its Epiphany multicore coprocessor. The $99 Parallella board packs in Ethernet, two USB ports, an HDMI connection, two ARM cores like your smartphone's, and 16 extra Epiphany cores. The $750 version packs in 64 extra Epiphany cores.

"It's the only way to really scale in terms of energy efficiency, performance and cost," Olofsson says.

In other words, depending on what you need to compute, (medical or image data crunching, say, or scientific calculations), what you've got here is 64 Raspberry Pi computers on the same die, all working in perfect harmony.

They could be used for anything from speeding up and improving the quality of video chat to analysing imagery from drones in the sky (hey, they're not just for the military you know), realtime speech translation or even making advances in self-driving cars. After all, vehicles like Google's driverless cars need to constantly calculate many factors when they're on the road, from speed to proximity to other drivers and road conditions.

Of course, not all computing tasks require parallel processing - in the same way not every job can be performed better by two people than one. But plenty more could be if they were programmed to, and Adapteva, which successfully Kickstarted the project late last year, raising a huge $898,921 from 4,965 backers, says that the Epiphany architecture can be programmed in much the same way as other chips are today.
The parallel computing problem

Technically, one Parallella board on its own does not meet today's definition of a supercomputer: its 90 gigaFLOPS output (think of this as a measure of calculations per second - billions of them) pales in comparison to the 33.86 petaFLOPS (er, quadrillions) which China's massive Tianhe-2 machine, the world's most powerful supercomputer, can muster. But then that monster requires 16,000 nodes, each with five expensive desktop processors inside (by contrast, each Parallella draws a mere five watts of power).
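
Putting those two numbers side by side (peak figures only; this ignores the interconnect entirely, which the thread above argues is the real obstacle):

    # Sketch using the article's own peak numbers.
    tianhe2 = 33.86e15       # FLOPS
    parallella = 90e9        # FLOPS per board
    boards = tianhe2 / parallella
    print(round(boards))             # ~376,000 boards to match Tianhe-2 on paper
    print(round(boards) * 5 / 1e6)   # ~1.9 MW of power at 5 W per board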

On the other hand, ten Parallella boards working together would have been considered a supercomputer 10 years ago. But that's almost not the point: Adapteva wants to spearhead a movement that makes this technology accessible to all, not create a machine that charts on a geeky list of gigantic mainframes for a few short months, and can only be accessed by PhD students in a lab.

Once it's done that, the team will have laid the groundwork for its moonshot leap into silicon history: boards containing multiple chips on a die with 1,024 cores on each, a goal it hopes to achieve by around 2018. "At that point, there should be no question that the Parallella would qualify as a true supercomputing platform," the company claims.

The trick is getting the Parallella into the hands of programmers beyond early adopters. Adapteva hopes to solve the problem of cost in a unique way in the shady world of server farms and cluster computing: by giving away its designs and software. In June this year, it open-sourced the Parallella project on code repository GitHub, posting everything from the board designs to the Linux kernel it runs. It also encouraged Kickstarter backers to donate a cluster of Parallella boards to school computer clubs.

After all, servers and supercomputers are expensive: even the US army thinks so. In 2009, the American Department of Defense famously bought up 2,200 game consoles to power a military supercomputer (consoles are usually sold at a loss so hacking them to perform supercomputer calculations can be cheaper than buying dedicated servers).

If even Uncle Sam wants to save money on a supercomputer, we think the Parallella project might just be the start for Adapteva.
For more information go to Parallella
 
2013-07-30 08:11:21 AM
I have no clue about any of what was just said, but I'd like to ask you guys here:

Will a bunch of these make do as a top-level GPU? For (1) games, and (2) batch processing times in Photoshop CS6.

Thanks in advance!
 
2013-07-30 12:04:26 PM

uttertosh: I have no clue about any of what was just said, but I'd like to ask you guys here:

Will a bunch of these make do as a top-level GPU? For (1) games, and (2) batch processing times in Photoshop CS6.

Thanks in advance!


No. This processor has a generic instruction set geared toward general problem solving. A GPU has an instruction set geared specifically for certain math and operations involved in graphics processing, such as single instructions for base-2 logs, vector dot products, texture table lookups, etc. That means the underlying algorithm is implemented in dedicated hardware (mucho faster). The memory bus (the path between processor and memory) on a video board is also wide and fast, which is very important. Not so sure about the memory bus on this thing; didn't research it to that detail.
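
A crude illustration of the "dedicated primitive vs. general-purpose loop" point (Python overhead exaggerates the gap enormously, but the shape of the argument is the same: one fused operation versus many scalar steps):

    # Sketch: scalar loop vs. a single tuned vector primitive for a dot product.
    import time
    import numpy as np

    a = np.random.rand(1_000_000).astype(np.float32)
    b = np.random.rand(1_000_000).astype(np.float32)

    t0 = time.perf_counter()
    acc = 0.0
    for i in range(len(a)):          # one multiply-add at a time
        acc += a[i] * b[i]
    t1 = time.perf_counter()
    np.dot(a, b)                     # one call into a tuned vector kernel
    t2 = time.perf_counter()
    print(f"scalar loop: {t1 - t0:.3f}s   np.dot: {t2 - t1:.5f}s")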
 
2013-07-30 12:21:23 PM

ThrobblefootSpectre: uttertosh: I have no clue about any of what was just said, but I'd like to ask you guys here:

Will a bunch of these make do as a top-level GPU? For (1) games, and (2) batch processing times in Photoshop CS6.

Thanks in advance!

No. This processor has a generic instruction set geared toward general problem solving. A GPU has an instruction set geared specifically for certain math and operations involved in graphics processing, such as single instructions for base-2 logs, vector dot products, texture table lookups, etc. That means the underlying algorithm is implemented in dedicated hardware (mucho faster). The memory bus (the path between processor and memory) on a video board is also wide and fast, which is very important. Not so sure about the memory bus on this thing; didn't research it to that detail.


Glad I refreshed the thread before I commented. You said it way better than I was gonna.
 
2013-07-30 12:26:59 PM

ThrobblefootSpectre: No

+ [stuff]

Thx! :-))
 
2013-07-30 12:31:39 PM
Oh, and FWIW, I love fark because I can be smart, funnay and ignorant, all with the one login/password!!

Socialism works!
 
Displayed 24 of 24 comments



This thread is closed to new comments.
