[ml] This week: KDD, next week: Hadoop!
Andreas von Hessling
vonhessling at gmail.com
Tue May 18 08:52:27 PDT 2010
We haven't actually gotten far in running algorithms yet. So far
you're the only one working on dimensionality reduction. I say
go for it; knock yourself out. It will be good just to get a sense
of where we should focus our energy.
BTW I'll put up a description of how to set up Weka with this dataset
soon. There are some NN algorithms right in there...
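To make the idea quoted below concrete (cluster the input records into groups, then pass the group data to a learner), here is a rough sketch. Everything in it is an assumption made for illustration: the data is synthetic rather than the KDD set, plain k-means stands in for a clustering neural net, and a small count-based Bayesian classifier stands in for the SVM or Bayesian learner. All function names are made up, and only the Python standard library is used.

```python
import math
import random


def make_blob(rng, center, n, dims=10):
    """n synthetic records scattered around `center` in every dimension."""
    return [tuple(rng.gauss(center, 1.0) for _ in range(dims)) for _ in range(n)]


def nearest(p, centroids):
    """Index of the centroid closest to p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))


def kmeans(points, k, iters=10):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as its group's mean; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def fit_bayes(cluster_ids, labels, k):
    """Estimate P(class) and Laplace-smoothed P(cluster | class) from counts."""
    classes = sorted(set(labels))
    prior = {c: labels.count(c) / len(labels) for c in classes}
    like = {
        c: [
            (sum(1 for ci, y in zip(cluster_ids, labels) if y == c and ci == j) + 1)
            / (labels.count(c) + k)
            for j in range(k)
        ]
        for c in classes
    }
    return prior, like


def predict(cluster_id, prior, like):
    """Most probable class given only the record's cluster id."""
    return max(prior, key=lambda c: prior[c] * like[c][cluster_id])


rng = random.Random(0)
train = make_blob(rng, 0.0, 100) + make_blob(rng, 5.0, 100)
train_labels = [0] * 100 + [1] * 100
test = make_blob(rng, 0.0, 20) + make_blob(rng, 5.0, 20)
test_labels = [0] * 20 + [1] * 20

K = 2
centroids = kmeans(train, K)

# Each 10-dimensional record is reduced to a single cluster id before learning.
train_ids = [nearest(p, centroids) for p in train]
prior, like = fit_bayes(train_ids, train_labels, K)

predictions = [predict(nearest(p, centroids), prior, like) for p in test]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"accuracy on synthetic test set: {accuracy:.2f}")
```

The point is only the shape of the pipeline: the learner never sees the original 10 columns, just the group each record fell into. On the real data the number of clusters and the choice of learner would need actual experimentation.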
On Mon, May 17, 2010 at 9:31 PM, Mike Schachter <mike at mindmech.com> wrote:
> Hey everyone!
> Just got back the other day and looking forward to meeting up Wednesday
> and hearing about Hadoop. I just read a bit through the KDD challenge, and
> was wondering if I could help out by doing something involving neural nets?
> Neural nets can be made good at generalization and prediction, and also
> reducing problem dimensionality by clustering. For example, we could
> cluster the input records into groups, and pass that group data into an SVM
> or something. Or we could use some sort of dimensionality reducing network
> and pass the dimensionally-reduced dataset to a Bayesian learner (which
> wouldn't work well if the data were high-dimensional).
> If someone is already thinking of doing this I'd be happy to help out; I can
> glean much of what happened from the meeting notes.
> See you Wednesday!
> On Wed, May 12, 2010 at 10:05 PM, Thomas Lotze <thomas.lotze at gmail.com> wrote:
>> Hello, all! There was a good meeting today where we talked about the KDD
>> dataset and plans for the next steps. I think it'll be a really good
>> opportunity for learning new tools and methods in machine learning, trading
>> knowledge, and upping our collective ability! We've got plans to look at R,
>> libsvm, Weka, and Hadoop to tackle the problem. I'm excited about working
>> with it, and anyone else who wants to get involved should email me, download
>> the data, and take a look at the wiki page I've put our initial plans in:
>> Next week, Vikarem will be presenting Hadoop, with some scripts and tools
>> to actually use it -- I think we're all aware of how important Hadoop
>> already is and will continue to be in the future for analyzing large data
>> sets, so I'm really glad that we've now got someone who knows about it and
>> is willing to tell us more! I think this is a really great opportunity, and
>> many thanks to Vikarem for presenting!
>> Best wishes,
>> ml mailing list
>> ml at lists.noisebridge.net