[Noisebridge-discuss] Build advice for a new system / heavy cluster GPU AI processing?
cymraegish at gmail.com
Mon Jul 11 18:20:16 PDT 2011
Your machine has PCIe 2.0 x16, which is what you need for a new GPGPU card. Word
is two cores per GPU (but some cards are dual-GPU; I'm not sure how that works).
You can get a double-precision consumer-grade card for about $400: 1
teraflop / 1.5 GB.
There are OpenCL bindings for Python and Java, and for linear algebra. On Mac OS X,
drivers and low-level support are built in off the shelf. On Linux, it is
supported in Debian testing and the new Ubuntu release.
The next version of OS X should make OpenCL much better / easier to program
through integration with Grand Central Dispatch.
Upgrading to a quad-core i7 will only give you about a 2x speedup. But you
are running on one core currently? Uchh, fix that first.
OpenCL can also help you manage your multiple cores; that is part of its
advantage.
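A rough sketch of the "fix that first" point, with no GPU or OpenCL involved: each C/gamma pair in a grid search is independent, so the pairs can be farmed out to one worker process per core. Everything named here is an assumption for illustration: `toy_score` is a hypothetical stand-in for one libsvm cross-validation run (it just returns a made-up accuracy), `grid_search` is a hypothetical helper, and the log2 grids are arbitrary.

```python
import itertools
from concurrent.futures import ProcessPoolExecutor


def toy_score(params):
    """Hypothetical stand-in for one libsvm cross-validation run.

    Returns a made-up accuracy that peaks at C=2**3, gamma=2**-5.
    """
    log2_c, log2_gamma = params
    return 1.0 / (1.0 + (log2_c - 3) ** 2 + (log2_gamma + 5) ** 2)


def grid_search(log2_cs, log2_gammas, score=toy_score, mapper=map):
    """Score every (C, gamma) pair and return (best_score, best_pair).

    `mapper` is any map-like callable; pass a process pool's .map to
    spread the independent evaluations across CPU cores.
    """
    pairs = list(itertools.product(log2_cs, log2_gammas))
    scores = list(mapper(score, pairs))
    best = max(range(len(pairs)), key=scores.__getitem__)
    return scores[best], pairs[best]


if __name__ == "__main__":
    cs = range(-5, 16, 2)      # log2(C) candidates (assumed grid)
    gammas = range(-15, 4, 2)  # log2(gamma) candidates (assumed grid)
    # ProcessPoolExecutor starts one worker per core by default.
    with ProcessPoolExecutor() as pool:
        best_score, (c, g) = grid_search(cs, gammas, mapper=pool.map)
    print(f"best C=2^{c}, gamma=2^{g}, score={best_score:.3f}")
```

With four cores this should cut wall-clock time for the whole grid to roughly a quarter, though a single 17-hour evaluation still takes 17 hours; the gain comes only from running several pairs at once.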
I am just starting out with this, trying to avoid CUDA, long term for me not
On Sat, Jul 9, 2011 at 12:45 AM, Sai <sai at saizai.com> wrote:
> Hi all.
> I've been running a very heavy classification AI project, and well…
> it's taking too fucking long to be realistic.
> I'm consequently thinking of
> a) upgrading my home desktop to something more beefy — it's 3 years
> old  — to something newer, and/or
> b) using something like EC2's cluster GPU spot instances  together
> with matlab parallel computing toolkit
> I've looked into GPU enabled libsvm a bit - libsvm is the main thing
> I'm currently using - and I've found a couple variant libraries 
> that use nVidia's CUDA API to get 3-50x improvement over CPU-only
> efficiency. I haven't found any non-beta ones that use opencl , and
> I am not familiar with that level of programming so can't currently
> create one. I'm willing to learn it if needed, but that'd be a
> significant investment.
> AFAICT from talking w/ Ryan, CUDA is nVidia only but more well
> advanced and supported by classifier libraries, whereas opencl is
> supported by both Radeon & nVidia, possibly by more stuff in the
> future from being compatible, but currently not as well by the
> libraries I'm looking at.
> Another option on GPUs of course is a mobo that supports multiple full
> GPU cards and have one of each type.
> Unfortunately I've not been following the hardware market at all for
> the last 3 years, so I have no idea what the current sweet spots are
> for the various combinations of mobo, CPU, GPU, etc.
> So, I'd appreciate your advice:
> 1. Are there any good libraries or methods I may have missed that
> would be more efficient for my purposes, or compatible w/ both major
> GPU brands? (I'm open to non-SVM / MATLAB stuff too.)
> My knowledge of classifier AIs is what I would call basic (though more
> advanced than most); I don't have the math chops to really grok the
> harder aspects of the linear algebra involved and have only studied it
> so far as much as has been needed to get projects bootstrapped. I
> would however be interested in learning a great deal more, so pointers
> to good textbooks or whitepapers that would get me bootstrapped better
> would be appreciated.
> 2. Is it worth upgrading my system vs renting EC2 instances?
> I'd rather have hardware I can keep, and local accessibility of it,
> but I'm not sure how much of a premium that'll cost me.
> 3. If I do decide to upgrade my system, what's the current optimum
> "sweet spot" build of reasonably priced hardware?
> Probably the biggest constraint is that I primarily run OSX86 (sorry,
> I like it a lot more than any other OS I've used); preferably it
> should cost less than ~$3k (mind, I already have perfectly fine case,
> displays, sound system, HDs - we're only talking about internal
> It has to work as my day-to-day desktop machine (so e.g. driver
> compatibility w/ OSX86, Win 7, & Kubuntu), support at least 2 and
> preferably more monitors, sound system, SATA drives, etc (the usual
> stuff), as well as being capable of delivering a fair amount of power
> for the AI processing. Y'all are fellow hackers, so you probably
> already have good ideas of what you'd want out of your own systems,
> and that's probably reasonably close to my wants.
>  It's an 8-class pattern classification problem on a few thousand
> samples of direct-lead acquired neural firing traces, to investigate
> both the "time code" vs "pattern code" theories of neural firing and
> various aspects of the classification, like accuracy / compute time
> Testing just one C/G pairing of one of the implementations - and
> tuning optimization of the two using a simple hillclimbing grid search
> algorithm requires testing a lot of 'em - is taking 17 hours (w/ one
> CPU core, CPU bound).
> And I want to test the Cartesian cross of several different
> vectorizations, bin sizes, and SVM kernels. It's just not feasible at
> this speed.
> I'm reasonably sure that I'm not doing anything *too* stupid;
> basically all the time is being spent in libsvm itself. No swapping or
> other bottlenecks in my own code.
>  Current build:
> GIGABYTE GA-EP35-DS3L LGA 775 Intel P35 ATX Intel Motherboard
> Intel Core 2 Quad Q6600 Kentsfield 2.4GHz LGA 775 Quad-Core Processor
> Model BX80562Q6600
> G.SKILL 4GB (2 x 2GB) 240-Pin DDR2 SDRAM DDR2 1066 (PC2 8500) Dual
> Channel Kit Desktop Memory Model F2-8500CL5D-4GBPK
> MSI NX8800GT 512M OC GeForce 8800GT 512MB 256-bit GDDR3 PCI Express
> 2.0 x16 HDCP Ready SLI Supported Video Card
> Antec Nine Hundred Black Steel ATX Mid Tower Computer Case
> PC Power & Cooling Silencer 750 Quad (Red) 750W EPS12V Power Supply
>  http://aws.amazon.com/ec2/hpc-applications/
>  http://mklab.iti.gr/project/GPU-LIBSVM
>  https://code.ac.upc.edu/projects/nnvect/blog/author/ijurado