# Machine Learning

### About Us

We're a loosely knit stochastic federation of people who like Noisebridge and like machine learning. What is machine learning? It's a broad field that typically involves training computer models to solve problems. How can you participate? Join the mailing list, send an email, and introduce yourself. Show up to the next meeting and share your thoughts. Participate in projects or start your own. Go to workshops, write code at workshops, learn stuff, and give workshops of your own! All are welcome.

### Next Meeting

- When: Wednesday, 8/3/2011 @ 7:30-9:00pm
- Where: 2169 Mission St. (back corner classroom)
- Topic: Deep Belief Nets Part I: PyBrain Installfest!
- Details: We're going to install PyBrain and check out their Restricted Boltzmann Machine implementation.

### Talks and Workshops

We've given lots of workshops and talks over the past year or so; here are a few. Many of them recur and will be given again, especially upon request!

- Intro to Machine Learning
- A Brief Tour of Statistics
- Generalized Linear Models
- Neural Nets Workshop
- Support Vector Machines
- Random Forests
- Independent Components Analysis
- Deep Nets

### Code and SourceForge Site

- We have a SourceForge Project
- We have a git repository on the project page, accessible as:

git clone git://ml-noisebridge.git.sourceforge.net/gitroot/ml-noisebridge/ml-noisebridge

- Send an email to the list if you want to become an administrator on the site to get write access to the git repo!

### Future Talks and Topics, Ideas

- Restricted Boltzmann Machines (Mike S, late August)
- Deep Nets w/ Stacked Autoencoders (Mike S, September)
- Generalized Linear Models (Mike S, September/October)
- Graphical Models (Tony)
- Working with the Kinect
- Computer Vision with OpenCV

### Mailing List

https://www.noisebridge.net/mailman/listinfo/ml

### Projects

- Small Group Subproblems
- Fundraising
- Noisebridge Machine Learning Course
- Kaggle Social Network Contest
- KDD Competition 2010
- HIV

### Datasets and Websites

- UCI Machine Learning Repository
- DataSF.org
- Infochimps
- Face Recognition Databases
- Time Series Data Library
- Data Q&A Forum
- Metaoptimize
- Quora ML Page
- A ton of Weather Data
- MLcomp
  - Upload your algorithm and objectively compare its performance to other algorithms

### Software Tools

- OpenCV
  - Computer vision library
  - Has an ML component (SVM, trees, etc.)
  - Online tutorials here

- Mahout
  - Hadoop cluster-based ML package

- Weka
  - A collection of data mining tools and machine learning algorithms

- MOA (Massive Online Analysis)
  - Offshoot of Weka; all of its algorithms are online algorithms

- scikits.learn
  - Machine learning Python package

- PyBrain
  - Supports feedforward and recurrent networks, SOMs, and deep belief nets

- LIBSVM
  - C-based SVM package

- PyML
- MDP
  - Does not stand for Markov Decision Process :(

- Orange
  - Strong data visualization component

- Journal of Machine Learning Software List
- VirtualBox image with pre-installed libraries listed here
- Theano: Symbolic Expressions and Transparent GPU Integration
- Waffles
  - Open source set of C++ machine learning command-line tools

- RapidMiner
- Mobile Robot Programming Toolkit
- Gephi
  - Graph visualization

- ggplot
  - Nice plotting package for R

- nitime
  - Neuroimaging in Python; has some good time series analysis tools and multivariate response fitting

### Presentations and other Materials

- Awesome Machine Learning Applications -- A list of cool applications of ML
- Hands-on Machine Learning, a presentation jbm gave on 2009-01-07.
- Stanford Machine Learning online course videos: http://www.youtube.com/user/StanfordUniversity#g/c/A89DCFA6ADACE599
- Media:Brief_statistics_slides.pdf, a presentation given on statistics for the machine learning group
- LinkedIn discussion on good resources for data mining and predictive analytics
- Face Recognition Algorithms

### Topics to Learn and Teach

NBML Course - Noisebridge Machine Learning Curriculum (work-in-progress)

CS229 - The Stanford Machine Learning Course @ Noisebridge

- Supervised Learning
  - Linear Regression
  - Linear Discriminants
  - Neural Nets/Radial Basis Functions
  - Support Vector Machines
  - Classifier Combination [1]
  - A basic decision tree builder, recursive and using entropy metrics
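For the decision tree item above, here is a minimal sketch of a recursive builder that picks splits by information gain (entropy reduction). It assumes categorical features and uses only the standard library. The function names (`entropy`, `best_split`, `build_tree`, `predict`) and the toy weather data are made up for illustration and are not taken from any group project.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, gain) for the feature with the highest
    information gain, or (None, 0.0) if no feature reduces entropy."""
    base = entropy(labels)
    best_feature, best_gain = None, 0.0
    for f in range(len(rows[0])):
        remainder = 0.0
        for value in set(row[f] for row in rows):
            subset = [lab for row, lab in zip(rows, labels) if row[f] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        gain = base - remainder
        if gain > best_gain + 1e-12:
            best_feature, best_gain = f, gain
    return best_feature, best_gain

def build_tree(rows, labels):
    """Recursively build a tree. Leaves are class labels; internal nodes
    are dicts mapping each feature value to a subtree."""
    feature, _ = best_split(rows, labels)
    if feature is None:
        # Pure node or no informative split left: majority-vote leaf.
        return Counter(labels).most_common(1)[0][0]
    children = {}
    for value in set(row[feature] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[feature] == value]
        children[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx])
    return {"feature": feature, "children": children}

def predict(tree, row, default=None):
    """Walk the tree; return `default` for a feature value never seen in training."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]])
        if tree is None:
            return default
    return tree

# Hypothetical toy data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]
tree = build_tree(rows, labels)
print(predict(tree, ("rainy", "hot")))  # prints "yes"
```

Splitting only when the gain is strictly positive guarantees each child receives fewer rows than its parent, so the recursion terminates. Pruning and numeric features are left out to keep the sketch short.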

- Unsupervised Learning
  - Hidden Markov Models
  - Clustering and Dimensionality Reduction: k-Means, Expectation-Maximization, PCA
  - Graphical Modeling
  - Generative Models: Gaussian distribution, multinomial distributions, HMMs, Naive Bayes
  - Deep Belief Networks & Restricted Boltzmann Machines

- Reinforcement Learning
  - Temporal Difference Learning
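As a concrete starting point for temporal difference learning, the sketch below runs tabular TD(0) on a small random walk, a common textbook setup rather than anything specific to this group. The function name `td0_random_walk`, the step size, and the episode count are illustrative assumptions. The walk starts in the middle state, moves left or right with equal probability, and earns a reward of 1 only when it exits on the right.

```python
import random

def td0_random_walk(n_states=5, episodes=20000, alpha=0.01, gamma=1.0, seed=0):
    """Estimate state values for a symmetric random walk with tabular TD(0)."""
    rng = random.Random(seed)
    values = [0.5] * n_states  # initial guess for every non-terminal state
    for _ in range(episodes):
        state = n_states // 2  # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:              # fell off the left end
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:    # fell off the right end
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # TD(0) update: move V(s) toward the one-step bootstrapped target.
            values[state] += alpha * (reward + gamma * next_value - values[state])
            if done:
                break
            state = next_state
    return values

estimates = td0_random_walk()
# With gamma = 1 the true value of state i is (i + 1) / (n_states + 1).
true_values = [(i + 1) / 6 for i in range(5)]
print([round(v, 2) for v in estimates])
```

The estimates should land close to the true values, with some residual noise because the step size is held constant.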

- Math, Probability & Statistics
  - Metric spaces and what they mean
  - Fundamentals of probability
  - Decision Theory (Bayesian)
  - Maximum Likelihood
  - Bias/Variance Tradeoff, VC Dimension
  - Bagging, Bootstrap, Jackknife [2]
  - Information Theory: Entropy, Mutual Information, Gaussian Channels
  - Estimation of Misclassification [3]
  - No Free Lunch Theorem [4]
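For the bootstrap and jackknife entries above, here is a short sketch that estimates the standard error of a statistic both ways, using only the standard library. The helper names `bootstrap_se` and `jackknife_se` and the sample data are hypothetical, and the resample count is an arbitrary choice.

```python
import math
import random
import statistics

def bootstrap_se(data, stat=statistics.mean, n_resamples=2000, seed=0):
    """Standard error of `stat`, estimated by resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = [stat([rng.choice(data) for _ in range(n)])
                 for _ in range(n_resamples)]
    return statistics.stdev(estimates)

def jackknife_se(data, stat=statistics.mean):
    """Standard error of `stat`, estimated from leave-one-out recomputations."""
    n = len(data)
    leave_one_out = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((x - center) ** 2 for x in leave_one_out))

# Hypothetical sample; any list of numbers works.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
analytic_se = statistics.stdev(data) / math.sqrt(len(data))
print(bootstrap_se(data), jackknife_se(data), analytic_se)
```

For the sample mean, the jackknife estimate matches the textbook formula s / sqrt(n) exactly, which makes a handy sanity check. The bootstrap estimate should come out close to it, and the same function works for statistics such as the median that have no simple closed form.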

- Machine Learning SDKs

- Applications
  - Collective Intelligence & Recommendation Engines