Machine Learning
AI and reinforcement learning meetup at Noisebridge Wednesdays at 8pm.
Join the Mailing List
https://www.noisebridge.net/mailman/listinfo/ml
History
Machine Learning groups have been perennial at Noisebridge, accumulating knowledge, projects and meeting notes since 2008.
- Some of our info links may be outdated, so let us know if anything is wrong and edit the wiki as needed.
Past Teachers
- Andy McMurry
Learn about Data Science and Machine Learning
Classes
- Coursera Machine Learning Course with Andrew Ng
- Coursera Computational Neuroscience Course with Rajesh P N Rao and Adrienne Fairhall
- MIT Machine Learning Class with Tommi Jaakkola
- Stanford CS229
- Carnegie Mellon Machine Learning Course with Tom Mitchell
- Linear Algebra with Gilbert Strang
- Neural Networks Class with Hugo Larochelle
Books
- Elements of Statistical Learning
- Pattern Recognition and Machine Learning
- Information Theory, Inference, and Learning Algorithms
- Interactive Data Visualization for the Web (D3)
- Introduction to R
- Introduction to Probability (Grinstead and Snell)
- Modern Introduction to Probability and Statistics (Kraaikamp and Meester)
- Bayesian Reasoning and Machine Learning
Tutorials
- Signal Processing IPython Notebooks
- Introduction to ML with scikits.learn
- Learn how to use SAGE
- Online Machine Learning Courses
Noisebridge ML Class Slides
- Intro to Machine Learning
- A Brief Tour of Statistics
- Generalized Linear Models
- Neural Nets Workshop
- Support Vector Machines
- Random Forests
- Independent Components Analysis
- Deep Nets
Code and SourceForge Site
- We have a SourceForge project
- We have a git repository on the project page, accessible as:
git clone git://ml-noisebridge.git.sourceforge.net/gitroot/ml-noisebridge/ml-noisebridge
- Send an email to the list if you want to become an administrator on the site to get write access to the git repo!
Future Talks and Topics, Ideas
- Random Forests in R
- Restricted Boltzmann Machines (Mike S, some day)
- Analyzing brain cells (Mike S)
- Deep Nets w/ Stacked Autoencoders (Mike S, some day)
- Generalized Linear Models (Mike S, Erin L? some day)
- Graphical Models
- Working with the Kinect
- Computer Vision with OpenCV
Projects
- Small Group Subproblems
- Fundraising
- Noisebridge Machine Learning Course
- Kaggle Social Network Contest
- KDD Competition 2010
- HIV
Datasets and Websites
- UCI Machine Learning Repository
- DataSF.org
- Infochimps
- Face Recognition Databases
- Time Series Data Library
- Data Q&A Forum
- Metaoptimize
- Quora ML Page
- A ton of Weather Data
- MLcomp
- Upload your algorithm and objectively compare its performance to other algorithms
- Social Security Death Master File!
- SIPRI Social Databases
- Wealth of information on international arms transfers and peace missions.
- Amazon AWS Public Datasets
- UCDP/PRIO Armed Conflict Datasets
- Socrata Government Datasets
- US City Census Data
- Yahoo Labs Datasets
Software Tools
Generic ML Libraries
- Weka
- A collection of data mining tools and machine learning algorithms.
- scikits.learn
- Machine learning Python package
- scikits.statsmodels
- Statistical models to go with scipy
- PyBrain
- Does feedforward, recurrent, SOM, deep belief nets.
- LIBSVM
- C-based SVM package
- PyML
- MDP
- Modular framework, has lots of stuff!
- VirtualBox image with the libraries listed here pre-installed
- sympy
- Does symbolic math
- Waffles
- Open source C++ set of machine learning command line tools.
- RapidMiner
- Mobile Robotic Programming Toolkit
- nitime
- NeuroImaging in Python, has some good time series analysis stuff and multi-variate response fitting.
- Pandas
- Data analysis workflow in python
- PyTables
- Adds querying capabilities to HDF5 files
- statsmodels
- Regression, time series analysis, statistics stuff for python
- Vowpal Wabbit
- "Intrinsically Fast" implementation of gradient descent for large datasets
- Shogun
- Fast implementations of SVMs
- MLPACK
- High performance scalable ML Library
- Torch
- MATLAB-like environment for state-of-the-art ML libraries, written in Lua
Deep Nets
- Theano
- Symbolic Expressions and Transparent GPU Integration
- Caffe
- Convolutional Neural Networks on GPU
- Neurolab
- Has support for recurrent neural nets
Online ML
- MOA (Massive Online Analysis)
- Offshoot of Weka focused on online algorithms
- Jubatus
- Distributed Online ML
- DOGMA
- MATLAB-based online learning stuff
- libol
- oll
- scw-learning
Graphical Models
- BUGS
- MCMC for Bayesian Models
- JAGS
- Hierarchical Bayesian Models
- Stan
- A graphical model compiler
- Jayes
- Bayesian networks in Java
- ToPS
- Probabilistic models of sequences
- PyMC
- Bayesian Models in Python
Text Stuff
- Beautiful Soup
- Screen-scraping tools
- SALLY
- Tool for embedding strings into vector spaces
- Gensim
- Topic modeling
Collaborative Filtering
- PREA
- Personalized Recommendation Algorithms Toolkit
- SVDFeature
- Collaborative Filtering and Ranking Toolkit
Computer Vision
- OpenCV
- Computer Vision Library
- Has ML component (SVM, trees, etc)
- Online tutorials here
- DARWIN
- Generic C++ ML and Computer Vision Library
- PetaVision
- Developing a real-time, full-scale model of the primate visual cortex.
Audio Processing
- Friture
- Real-time spectrogram generation
- pyo
- Real-time audio signal processing
- PYMir
- A library for reading MP3s into Python and analyzing them
- PRAAT
- Speech analysis toolkit
- Sound Analysis Pro
- Tool for analyzing animal sounds
- Luscinia
- Software for archiving, measuring, and analyzing bioacoustic data
- List of Sound Tools for Python
- Jasper
- Voice-control anything!
Data Visualization
- Orange
- Strong data visualization component
- Gephi
- Graph Visualization
- ggplot
- Nice plotting package for R
- MayaVi2
- 3D Scientific Data Visualization
- Cytoscape
- A JavaScript graph library for analysis and visualisation
- plot.ly
- Web-based plotting
- D3 Ebook
- Has a good list of HTML/CSS/Javascript data visualization tools.
- plotly
- Python plotting tool
Cluster Computing
- Mahout
- Hadoop cluster based ML package.
- STAR: Cluster
- Easily build your own Python computing cluster on Amazon EC2
Database Stuff
Neural Simulation
Other
Presentations and other Materials
- Awesome Machine Learning Applications -- A list of cool applications of ML
- Hands-on Machine Learning, a presentation jbm gave on 2009-01-07.
- Stanford Machine Learning online course videos: http://www.youtube.com/user/StanfordUniversity#g/c/A89DCFA6ADACE599
- Brief_statistics_slides.pdf, a presentation on statistics given for the machine learning group
- LinkedIn discussion on good resources for data mining and predictive analytics
- Face Recognition Algorithms
- Max Welling's ML classnotes
Topics to Learn and Teach
NBML Course - Noisebridge Machine Learning Curriculum (work-in-progress)
CS229 - The Stanford Machine Learning Course @ Noisebridge
- Supervised Learning
- Linear Regression
- Linear Discriminants
- Neural Nets/Radial Basis Functions
- Support Vector Machines
- Classifier Combination [2]
- A basic decision tree builder, recursive and using entropy metrics
- Unsupervised Learning
- Hidden Markov Models
- Clustering: PCA, k-Means, Expectation-Maximization
- Graphical Modeling
- Generative Models: Gaussian distributions, multinomial distributions, HMMs, Naive Bayes
- Deep Belief Networks & Restricted Boltzmann Machines
- Reinforcement Learning
- Temporal Difference Learning
- Math, Probability & Statistics
- Metric spaces and what they mean
- Fundamentals of probabilities
- Decision Theory (Bayesian)
- Maximum Likelihood
- Bias/Variance Tradeoff, VC Dimension
- Bagging, Bootstrap, Jackknife [3]
- Information Theory: Entropy, Mutual Information, Gaussian Channels
- Estimation of Misclassification [4]
- No Free Lunch Theorem [5]
- Machine Learning SDKs
- Applications
- Collective Intelligence & Recommendation Engines
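
Example sketch: entropy-based decision tree

The Supervised Learning topics above mention "a basic decision tree builder, recursive and using entropy metrics". As a rough illustration of what that could look like, here is a minimal Python sketch. It assumes categorical features, splits on "feature equals value" tests, and picks the split with the highest information gain. The function names (entropy, best_split, build_tree, predict) and the toy weather data are made up for this example and do not come from any Noisebridge code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def best_split(rows, labels):
    """Return (gain, feature_index, value) for the equality split with the
    highest information gain, or None if no split reduces entropy."""
    base = entropy(labels)
    best = None
    for f in range(len(rows[0])):
        # Sort candidate values so the result is deterministic.
        for value in sorted({row[f] for row in rows}, key=repr):
            left = [y for row, y in zip(rows, labels) if row[f] == value]
            right = [y for row, y in zip(rows, labels) if row[f] != value]
            if not left or not right:
                continue  # this test does not separate anything
            weighted = (len(left) * entropy(left)
                        + len(right) * entropy(right)) / len(labels)
            gain = base - weighted
            if gain > 1e-12 and (best is None or gain > best[0]):
                best = (gain, f, value)
    return best


def build_tree(rows, labels):
    """Recursively build a tree; a leaf is the majority class label."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, value = split
    yes = [(r, y) for r, y in zip(rows, labels) if r[f] == value]
    no = [(r, y) for r, y in zip(rows, labels) if r[f] != value]
    return {
        "feature": f,
        "value": value,
        "yes": build_tree([r for r, _ in yes], [y for _, y in yes]),
        "no": build_tree([r for r, _ in no], [y for _, y in no]),
    }


def predict(tree, row):
    """Walk the tree until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        branch = "yes" if row[tree["feature"]] == tree["value"] else "no"
        tree = tree[branch]
    return tree


# Toy data (hypothetical): (outlook, temperature) -> carry an umbrella?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]
tree = build_tree(rows, labels)
print(predict(tree, ("rainy", "hot")))
```

The recursion stops when no split lowers the entropy, which covers both pure nodes and nodes whose rows are indistinguishable. A fuller version would add numeric thresholds, a depth limit and pruning.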