[Noisebridge-discuss] Build advice for a new system / heavy cluster GPU AI processing?
mike at mindmech.com
Mon Jul 11 20:15:17 PDT 2011
(Thanks for posting, Brian. I forgot about this message while
I was on vacation.)
The grid search is your problem! It's unavoidable when you're
doing cross-validation though, because you definitely want the
parameters that give you the lowest generalization error. You're
doing cross-validation, right?
Although a GPU will help individual instances of training the
SVM classifier, in general you should parallelize the grid search
across cores. Specifically, train one SVM classifier per hyperparameter
combination (kernel, bin size, etc.).
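For concreteness, here's a minimal sketch of that core-parallel grid search - assuming Python with scikit-learn and joblib rather than the MATLAB/libsvm setup described below, with a made-up synthetic dataset and an illustrative (C, gamma) grid:

```python
# One SVM fit per (C, gamma) combination, spread across all CPU cores.
# Hypothetical sketch: swap in your own data, kernels, and grid values.
from itertools import product

from joblib import Parallel, delayed
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Stand-in for the real 8-class neural-firing data.
X, y = make_classification(n_samples=300, n_classes=8, n_informative=6,
                           random_state=0)

def cv_error(C, gamma):
    """Mean cross-validated error for one hyperparameter combination."""
    clf = SVC(C=C, gamma=gamma, kernel='rbf')
    return 1.0 - cross_val_score(clf, X, y, cv=5).mean()

# The Cartesian grid; each cell is an independent job.
grid = list(product([0.1, 1, 10, 100], [1e-3, 1e-2, 1e-1]))
errors = Parallel(n_jobs=-1)(delayed(cv_error)(C, g) for C, g in grid)

best_C, best_gamma = grid[errors.index(min(errors))]
print(best_C, best_gamma)
```

Each grid cell is independent, so this parallelizes perfectly: with N cores you get close to an N-fold speedup on the search itself, before any GPU help on the individual fits.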
Also, SVMs kind of suck for multi-class classification. Have you
considered random forests?
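As a point of comparison (a hedged sketch, again assuming scikit-learn; the dataset is synthetic), a random forest handles all 8 classes natively and leaves far fewer hyperparameters to grid over:

```python
# Random forests do multi-class natively (no one-vs-one / one-vs-rest
# machinery) and are trivially parallel: each tree is grown independently.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for the real 8-class neural-firing data.
X, y = make_classification(n_samples=300, n_classes=8, n_informative=6,
                           random_state=0)

# n_jobs=-1 grows trees on all cores; usually only n_estimators and
# max_features need tuning, versus a full C/gamma/kernel grid for an SVM.
forest = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
scores = cross_val_score(forest, X, y, cv=5)
print(scores.mean())
```

The practical upshot: the hyperparameter search that dominates the SVM workflow mostly disappears, which may matter more here than raw per-fit speed.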
On Sat, Jul 9, 2011 at 12:45 AM, Sai <sai at saizai.com> wrote:
> Hi all.
> I've been running a very heavy classification AI project, and well…
> it's taking too fucking long to be realistic.
> I'm consequently thinking of
> a) upgrading my home desktop - it's 3 years old - to something
> beefier, and/or
> b) using something like EC2's cluster GPU spot instances together
> with the MATLAB Parallel Computing Toolbox
> I've looked into GPU-enabled libsvm a bit - libsvm is the main thing
> I'm currently using - and I've found a couple of variant libraries
> that use nVidia's CUDA API to get 3-50x improvement over CPU-only
> efficiency. I haven't found any non-beta ones that use OpenCL, and
> I am not familiar with that level of programming, so I can't currently
> create one. I'm willing to learn it if needed, but that'd be a
> significant investment.
> AFAICT from talking w/ Ryan, CUDA is nVidia-only but more mature and
> better supported by classifier libraries, whereas OpenCL is
> supported by both Radeon & nVidia - and possibly by more hardware in
> the future, given its portability - but currently not as well by the
> libraries I'm looking at.
> Another option on GPUs of course is a mobo that supports multiple full
> GPU cards, with one of each type.
> Unfortunately I've not been following the hardware market at all for
> the last 3 years, so I have no idea what the current sweet spots are
> for the various combinations of mobo, CPU, GPU, etc.
> So, I'd appreciate your advice:
> 1. Are there any good libraries or methods I may have missed that
> would be more efficient for my purposes, or compatible w/ both major
> GPU brands? (I'm open to non-SVM / MATLAB stuff too.)
> My knowledge of classifier AIs is what I would call basic (though more
> advanced than most); I don't have the math chops to really grok the
> harder aspects of the linear algebra involved, and have only studied it
> as far as needed to get projects bootstrapped. I would, however, be
> interested in learning a great deal more, so pointers to good textbooks
> or whitepapers that would bootstrap me better would be appreciated.
> 2. Is it worth upgrading my system vs renting EC2 instances?
> I'd rather have hardware I can keep, and local accessibility of it,
> but I'm not sure how much of a premium that'll cost me.
> 3. If I do decide to upgrade my system, what's the current optimum
> "sweet spot" build of reasonably priced hardware?
> Probably the biggest constraint is that I primarily run OSX86 (sorry,
> I like it a lot more than any other OS I've used); preferably it
> should cost less than ~$3k (mind, I already have a perfectly fine case,
> displays, sound system, and HDs - we're only talking about internal
> components).
> It has to work as my day-to-day desktop machine (so e.g. driver
> compatibility w/ OSX86, Win 7, & Kubuntu), support at least 2 and
> preferably more monitors, sound system, SATA drives, etc (the usual
> stuff), as well as being capable of delivering a fair amount of power
> for the AI processing. Y'all are fellow hackers, so you probably
> already have good ideas of what you'd want out of your own systems,
> and that's probably reasonably close to my wants.
>  It's an 8-class pattern classification problem on a few thousand
> samples of direct-lead acquired neural firing traces, to investigate
> both the "time code" vs "pattern code" theories of neural firing and
> various aspects of the classification, like the accuracy / compute
> time tradeoff.
> Testing just one C/G pairing of one of the implementations is taking
> 17 hours (w/ one CPU core, CPU bound) - and tuning the two with a
> simple hill-climbing grid search algorithm requires testing a lot
> of 'em.
> And I want to test the Cartesian cross of several different
> vectorizations, bin sizes, and SVM kernels. It's just not feasible at
> this speed.
> I'm reasonably sure that I'm not doing anything *too* stupid;
> basically all the time is being spent in libsvm itself. No swapping or
> other bottlenecks in my own code.
>  Current build:
> GIGABYTE GA-EP35-DS3L LGA 775 Intel P35 ATX Intel Motherboard
> Intel Core 2 Quad Q6600 Kentsfield 2.4GHz LGA 775 Quad-Core Processor
> Model BX80562Q6600
> G.SKILL 4GB (2 x 2GB) 240-Pin DDR2 SDRAM DDR2 1066 (PC2 8500) Dual
> Channel Kit Desktop Memory Model F2-8500CL5D-4GBPK
> MSI NX8800GT 512M OC GeForce 8800GT 512MB 256-bit GDDR3 PCI Express
> 2.0 x16 HDCP Ready SLI Supported Video Card
> Antec Nine Hundred Black Steel ATX Mid Tower Computer Case
> PC Power & Cooling Silencer 750 Quad (Red) 750W EPS12V Power Supply
>  http://aws.amazon.com/ec2/hpc-applications/
>  http://mklab.iti.gr/project/GPU-LIBSVM
>  https://code.ac.upc.edu/projects/nnvect/blog/author/ijurado
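To make the quoted numbers concrete: at 17 hours per C/G pair on one core, the full Cartesian cross blows up fast, which is why parallelizing the search matters more than speeding up any single fit. A back-of-envelope sketch (only the 17-hour figure comes from the message above; every other number is illustrative):

```python
# Back-of-envelope cost of the full Cartesian hyperparameter cross.
# Only hours_per_pair is from the original message; the rest is made up.
hours_per_pair = 17      # reported: one C/G pairing, one CPU core
pairs_per_combo = 25     # hypothetical hill-climb visits per combination
combos = 3 * 5 * 4       # vectorizations x bin sizes x kernels (illustrative)

serial_hours = combos * pairs_per_combo * hours_per_pair
cores = 8                # e.g. the CPU cores of one upgraded desktop
print(serial_hours, serial_hours / cores)  # → 25500 3187.5
```

Even with these modest made-up grid sizes, the serial search runs to years; dividing by the core count (or by the fleet size on EC2 spot instances) is the only term in that product you control without shrinking the grid.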