[ml] Fwd: [Noisebridge-discuss] I'm working on a series of articles about algorithms - data mining experts sought asap

Mike Schachter mschachter at eigenminds.com
Sun Jan 20 18:35:44 UTC 2013


---------- Forwarded message ----------
From: Jim Youll <jim at agentzero.com>
Date: Sat, Jan 19, 2013 at 3:50 PM
Subject: [Noisebridge-discuss] I'm working on a series of articles about
algorithms - data mining experts sought asap
To: noisebridge-discuss at lists.noisebridge.net


Hi Noisebridge.

I'm an SF-based tech geek and occasional writer who is working on a series
of articles for Fast Company (http://fastcompany.com/) about the algorithms
selected in the 2007 paper "Top 10 algorithms in data mining" (which is
locked behind a Springer-Verlag paywall, apropos of other more important
matters going on right now). Citation, abstract, and a list of the
algorithms are pasted at the end of this note; contact me if you need a
copy of the paper, of course.

Our primary goal is to give Fast Company's readers a quick primer on each
of the algorithms in a way that is accessible to a business/non-expert
audience.

The articles will be brief, a few hundred words. I'm looking for data
mining / domain experts who are particularly adept at explaining
complicated math and science in layman's terms. Interviews can take place
in person in SF, via e-mail, IM, Skype, or phone, whatever is most
convenient.

To elaborate, here is what I'm in search of for each algorithm:
        - "What it does" in plain language, maybe with a simple example
        - How the algorithm changed "everyday" practice when it emerged, or
          what it enabled that wasn't possible before
        - Pointers to companies, services, or even /types/ of services
          where the algorithm is likely in operation today

We can work by e-mail, phone, or Skype - whatever is most convenient.

Please feel free to forward this note to anyone you believe may be a
good fit for this project.

An ideal person would have expert knowledge of one or more of these
algorithms, a talent for explaining really technical stuff to regular
people, 20 minutes to spare for an interview, and an interest in possibly
being quoted in a Fast Company article. ;) I know I've seen people like
that around Noisebridge - lots of them - I'm just not sure who's available
in the next week or two.

Thanks!
- jim
jim at agentzero.com


----

Knowl Inf Syst (2008) 14:1–37 DOI 10.1007/s10115-007-0114-2
SURVEY PAPER
Top 10 algorithms in data mining
Xindong Wu · Vipin Kumar · J. Ross Quinlan · Joydeep Ghosh · Qiang Yang ·
Hiroshi Motoda · Geoffrey J. McLachlan · Angus Ng · Bing Liu · Philip S. Yu
· Zhi-Hua Zhou · Michael Steinbach · David J. Hand · Dan Steinberg
Received: 9 July 2007 / Revised: 28 September 2007 / Accepted: 8 October
2007 / Published online: 4 December 2007
© Springer-Verlag London Limited 2007
Abstract: This paper presents the top 10 data mining algorithms identified
by the IEEE International Conference on Data Mining (ICDM) in December
2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive
Bayes, and CART. These top 10 algorithms are among the most influential
data mining algorithms in the research community. With each algorithm, we
provide a description of the algorithm, discuss the impact of the
algorithm, and review current and further research on the algorithm. These
10 algorithms cover classification, clustering, statistical learning,
association analysis, and link mining, which are all among the most
important topics in data mining research and development.

ALGORITHMS
1 C4.5 and beyond
2 The k-means algorithm
3 Support vector machines
4 The Apriori algorithm
5 The EM algorithm
6 PageRank
7 AdaBoost
8 kNN: k-nearest neighbor classification
9 Naive Bayes
10 CART


