Machine Learning: Difference between revisions

From Noisebridge
m (Correct link, to User:Culteejen*)
 
(95 intermediate revisions by 21 users not shown)
Line 1: Line 1:
=== Next Meeting===
{{ai}}


*When:
*Where: 2169 Mission St. (back NE corner, Church classroom)
*Topic:
*Details: Currently on hiatus until somebody decides to pick it back up!
*Who:


=== Take the Noisebridge ML Survey ===
{{headerbox}}<font size=5>AI and reinforcement learning meetup at Noisebridge Wednesdays at 8pm.</font>
[http://www.surveymonkey.com/s/W2T9ZB6 Take a survey] and vote for what you want to learn!
*[https://www.meetup.com/noisebridge/events/kpsdrsyccqblb/ AI and Reinforcement Learning Meetup page]
*'''WHEN:''' Wednesdays at 8:00pm
*'''WHERE:''' 272 Capp St. (Church classroom)
*'''WHO:''' Anyone interested in learning about artificial intelligence, machine learning and related topics.
*'''CHANNELS:''' Join the [https://www.noisebridge.net/mailman/listinfo/ml https://www.noisebridge.net/mailman/listinfo/ml] mailing list, or find us in #ai on [[Discord]] and [[Slack]].
* '''MAINTAINERS:''' TJ/[[User:Culteejen]], [[User:Ryan_L]]
* '''NOTES:''' [[Machine Learning/Meeting Notes|Meeting Notes]]
{{boxend}}


=== Crowdsourced Q&A ===
=== Join the Mailing List ===
Are you working on a data mining, machine learning, or statistics problem? Do you want some help? Consider sending an email to the [https://www.noisebridge.net/mailman/listinfo/ml mailing list] about it! Also consider setting up a day to come in, talk about the project you're working on, and get input from other ML people.


=== About Us ===
https://www.noisebridge.net/mailman/listinfo/ml
We're a loosely-knit stochastic federation of people who like Noisebridge and like machine learning. What is machine learning? It's a broad field that typically involves training computer models to solve problems. How can you participate? Join the [https://www.noisebridge.net/mailman/listinfo/ml mailing list], send an email and introduce yourself. Show up to the next meeting and share your thoughts. Participate in projects or start your own. Go to workshops, write code at workshops, learn stuff, give workshops of your own! All are welcome.
 
== History ==
Machine Learning groups have been perennial at Noisebridge, accumulating knowledge, projects and meeting notes since 2008.  
* Some of our info links may be outdated, so let us know if anything is wrong and edit the [[wiki]] as needed.
 
=== Past Teachers ===
*Andy McMurry


=== Talks and Workshops ===
=== Learn about Data Science and Machine Learning ===
We've given lots of workshops and talks over the past year or so; here are a few. Many of them are recurring and will be given again, especially upon request!
 
===== Classes =====
*[https://www.coursera.org/course/ml Coursera Machine Learning Course with Andrew Ng]
*[https://www.coursera.org/course/compneuro Coursera Computational Neuroscience Course with Rajesh P N Rao and Adrienne Fairhall]
*[http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-867-machine-learning-fall-2006/ MIT Machine Learning Class with Tommi Jaakkola]
*[http://cs229.stanford.edu/materials.html Stanford CS229]
*[http://www.cs.cmu.edu/~tom/10701_sp11/lectures.shtml Carnegie Mellon Machine Learning Course with Tom Mitchell]
*[http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/ Linear Algebra with Gilbert Strang]
*[https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH Neural Networks Class with Hugo Larochelle]
*[https://introtodeeplearning.com/ MIT Introduction to Deep Learning]
*[https://course.fast.ai/ Practical Deep Learning for Coders - Fast.ai]
 
==== Books ====
*[http://statweb.stanford.edu/~tibs/ElemStatLearn/ Elements of Statistical Learning]
*[https://www.google.com/search?client=ubuntu&channel=fs&q=pattern+recognition+and+machine+learning&ie=utf-8&oe=utf-8#channel=fs&q=pattern+recognition+and+machine+learning+pdf Pattern Recognition and Machine Learning]
*[https://www.google.com/search?&channel=fs&q=+Information+Theory%2C+Inference%2C+and+Learning+Algorithms.&ie=utf-8&oe=utf-8#channel=fs&q=Information+Theory%2C+Inference%2C+and+Learning+Algorithms+pdf Information Theory, Inference, and Learning Algorithms]
*[http://chimera.labs.oreilly.com/books/1230000000345 Interactive Data Visualization for the Web (D3)]
*[http://cran.r-project.org/doc/manuals/R-intro.pdf Introduction to R]
*[http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/amsbook.mac.pdf Introduction to Probability (Grinstead and Snell)]
*[http://www.cis.temple.edu/~latecki/Courses/CIS2033-Spring12/A_modern_intro_probability_statistics_Dekking05.pdf Modern Introduction to Probability and Statistics (Kraaikamp and Meester)]
*[http://web4.cs.ucl.ac.uk/staff/D.Barber/textbook/090310.pdf Bayesian Reasoning and Machine Learning]
*[https://github.com/chandanverma07/Ebooks/blob/master/Deep%20Learning%20with%20Python%2C%20Fran%C3%A7ois%20Chollet.pdf Deep Learning with Python François Chollet]
 
==== Tutorials ====
*[http://nbviewer.ipython.org/github/unpingco/Python-for-Signal-Processing/tree/master/ Signal Processing IPython Notebooks]
*[http://scikit-learn.org/stable/tutorial/basic/tutorial.html Introduction to ML with scikits.learn]
*[http://www.sagemath.org/doc/tutorial/ Learn how to use SAGE]
*[https://skillcombo.com/topic/machine-learning/ Online Machine Learning Courses]
 
==== Noisebridge ML Class Slides ====
*[[NBML/Workshops/Intro to Machine Learning|Intro to Machine Learning]]
*[[NBML/Workshops/Brief Tour of Statistics|A Brief Tour of Statistics]]
Line 42: Line 78:
*Working with the Kinect
*Computer Vision with OpenCV
=== Mailing List ===
https://www.noisebridge.net/mailman/listinfo/ml


=== Projects ===
Line 68: Line 100:
**Upload your algorithm and objectively compare its performance to other algorithms
*[http://www.ntis.gov/products/ssa-dmf.aspx Social Security Death Master File!]
*[http://www.sipri.org/databases SIPRI Social Databases]
**Wealth of information on international arms transfers and peace missions.
*[http://aws.amazon.com/publicdatasets/ Amazon AWS Public Datasets]
*[http://www.prio.no/Data/Armed-Conflict/ UCDP/PRIO Armed Conflict Datasets]
*[https://opendata.socrata.com/browse Socrata Government Datasets]
*[http://us-city.census.okfn.org/ US City Census Data]
*[http://webscope.sandbox.yahoo.com/catalog.php Yahoo Labs Datasets]


=== Software Tools ===
Line 74: Line 113:
*[http://www.cs.waikato.ac.nz/ml/weka/ Weka]
**a collection of data mining tools and machine learning algorithms.
*[http://moa.cs.waikato.ac.nz/ MOA (Massive Online Analysis)]
**Offshoot of Weka; all of its algorithms are online algorithms
*[http://scikit-learn.sourceforge.net/ scikits.learn]
**Machine learning Python package
Line 88: Line 125:
**Modular framework, has lots of stuff!
*[[Machine Learning/VirtualBox|VirtualBox]] Virtual Box Image with Pre-installed Libraries listed here
*[http://deeplearning.net/software/theano/ Theano: Symbolic Expressions and Transparent GPU Integration]
*[http://sympy.org sympy] Does symbolic math
*[http://waffles.sourceforge.net/ Waffles]
Line 106: Line 142:
*[http://www.shogun-toolbox.org/ Shogun]
**Fast implementations of SVMs
*[http://www.mlpack.org/ MLPACK]
**High performance scalable ML Library
*[http://www.torch.ch/ Torch]
**MATLAB-like environment for state-of-the-art ML libraries, written in Lua
==== Deep Nets ====
*[http://deeplearning.net/software/theano/ Theano]
**Symbolic Expressions and Transparent GPU Integration
*[http://caffe.berkeleyvision.org/ Caffe]
**Convolutional Neural Networks on GPU
*[https://code.google.com/p/neurolab/ Neurolab]
**Has support for recurrent neural nets
==== Online ML ====
*[http://moa.cs.waikato.ac.nz/ MOA (Massive Online Analysis)]
**Offshoot of Weka; all of its algorithms are online algorithms
*[http://jubat.us/en/ Jubatus]
**Distributed Online ML
*[http://dogma.sourceforge.net/ DOGMA]
**MATLAB-based online learning stuff
*[http://code.google.com/p/libol/ libol]
*[http://code.google.com/p/oll/ oll]
*[http://code.google.com/p/scw-learning/ scw-learning]
==== Graphical Models ====
*[http://www.mrc-bsu.cam.ac.uk/bugs/ BUGS]
**MCMC for Bayesian Models
*[http://mcmc-jags.sourceforge.net/ JAGS]
**Hierarchical Bayesian Models
*[http://mc-stan.org/ Stan]
**A graphical model compiler
*[http://www.mlpack.org/ MLPACK]
*[https://github.com/kutschkem/Jayes Jayes]
**High performance scalable ML Library
**Bayesian networks in Java
*[http://tops.sourceforge.net/ ToPS]
**Probabilistic models of sequences
*[http://pymc-devs.github.io/pymc/ PyMC]
**Bayesian Models in Python
 
==== Text Stuff ====
*[http://www.crummy.com/software/BeautifulSoup/ Beautiful Soup]
**Screen-scraping tools
*[http://www.mlsec.org/sally/ SALLY]
**Tool for embedding strings into vector spaces
*[http://radimrehurek.com/gensim/ Gensim]
**Topic modeling
 
==== Collaborative Filtering ====
*[http://prea.gatech.edu/ PREA]
**Personalized Recommendation Algorithms Toolkit
*[http://svdfeature.apexlab.org/wiki/Main_Page SVDFeature]
**Collaborative Filtering and Ranking Toolkit


==== Computer Vision ====
Line 116: Line 199:
**Has ML component (SVM, trees, etc)
**Online tutorials [http://www.pages.drexel.edu/~nk752/tutorials.html here]
*[http://drwn.anu.edu.au/ DARWIN]
**Generic C++ ML and Computer Vision Library
*[http://sourceforge.net/projects/petavision/ PetaVision]
**Developing a real-time, full-scale model of the primate visual cortex.


==== Audio Processing ====
Line 124: Line 211:
*[https://github.com/jsawruk/pymir PYMir]
**A library for reading MP3s into Python and analyzing them
*[http://www.fon.hum.uva.nl/praat/ PRAAT]
**Speech analysis toolkit
*[http://ofer.sci.ccny.cuny.edu/sound_analysis_pro Sound Analysis Pro]
**Tool for analyzing animal sounds
*[http://luscinia.sourceforge.net/ Luscinia]
**Software for archiving, measuring, and analyzing bioacoustic data
*[http://wiki.python.org/moin/PythonInMusic List of Sound Tools for Python]
*[http://jasperproject.github.io/ Jasper]
**Voice-control anything!


==== Data Visualization ====
Line 137: Line 232:
*[http://cytoscape.github.io/cytoscape.js/ Cytoscape]
**A JavaScript graph library for analysis and visualisation
 
*[https://plot.ly/ plot.ly]
**Web-based plotting
*[http://chimera.labs.oreilly.com/books/1230000000345/ch02.html D3 Ebook]
**Has a good list of HTML/CSS/JavaScript data visualization tools.
*[https://plot.ly/ plotly]
**Python plotting tool
==== Cluster Computing ====
*[http://lucene.apache.org/mahout/ Mahout]
Line 143: Line 243:
*[http://web.mit.edu/star/cluster/ STAR: Cluster]
**Easily build your own Python computing cluster on Amazon EC2
==== Database Stuff ====
*[http://madlib.net/ MADlib]
**Machine learning algorithms for in-database data
*[http://www.joyent.com/products/manta Manta]
**Distributed object storage
==== Neural Simulation ====
*[http://nengo.ca/ Nengo]


==== Other ====
Line 198: Line 307:
** Collective Intelligence & Recommendation Engines


=== [[Machine Learning/Meeting Notes|Meeting Notes]]===


[[Category:Events]]
[[Category:Projects]]

Latest revision as of 18:03, 29 November 2023



AI and reinforcement learning meetup at Noisebridge Wednesdays at 8pm.

Join the Mailing List

https://www.noisebridge.net/mailman/listinfo/ml

History

Machine Learning groups have been perennial at Noisebridge, accumulating knowledge, projects and meeting notes since 2008.

  • Some of our info links may be outdated, so let us know if anything is wrong and edit the wiki as needed.

Past Teachers

  • Andy McMurry

Learn about Data Science and Machine Learning

Classes

Books

Tutorials

Noisebridge ML Class Slides

Code and SourceForge Site

    git clone git://ml-noisebridge.git.sourceforge.net/gitroot/ml-noisebridge/ml-noisebridge
  • Send an email to the list if you want to become an administrator on the site to get write access to the git repo!

Future Talks and Topics, Ideas

  • Random Forests in R
  • Restricted Boltzmann Machines (Mike S, some day)
  • Analyzing brain cells (Mike S)
  • Deep Nets w/ Stacked Autoencoders (Mike S, some day)
  • Generalized Linear Models (Mike S, Erin L? some day)
  • Graphical Models
  • Working with the Kinect
  • Computer Vision with OpenCV

Projects

Datasets and Websites

Software Tools

Generic ML Libraries

  • Weka
    • a collection of data mining tools and machine learning algorithms.
  • scikits.learn
    • Machine learning Python package
  • scikits.statsmodels
    • Statistical models to go with scipy
  • PyBrain
    • Does feedforward, recurrent, SOM, deep belief nets.
  • LIBSVM
    • C-based SVM package
  • PyML
  • MDP
    • Modular framework, has lots of stuff!
  • VirtualBox Virtual Box Image with Pre-installed Libraries listed here
  • sympy Does symbolic math
  • Waffles
    • Open source C++ set of machine learning command line tools.
  • RapidMiner
  • Mobile Robotic Programming Toolkit
  • nitime
    • NeuroImaging in Python, has some good time series analysis stuff and multivariate response fitting.
  • Pandas
    • Data analysis workflow in Python
  • PyTables
    • Adds querying capabilities to HDF5 files
  • statsmodels
    • Regression, time series analysis, statistics stuff for Python
  • Vowpal Wabbit
    • "Intrinsically Fast" implementation of gradient descent for large datasets
  • Shogun
    • Fast implementations of SVMs
  • MLPACK
    • High performance scalable ML Library
  • Torch
    • MATLAB-like environment for state-of-the-art ML libraries, written in Lua

Deep Nets

  • Theano
    • Symbolic Expressions and Transparent GPU Integration
  • Caffe
    • Convolutional Neural Networks on GPU
  • Neurolab
    • Has support for recurrent neural nets

Online ML

Graphical Models

  • BUGS
    • MCMC for Bayesian Models
  • JAGS
    • Hierarchical Bayesian Models
  • Stan
    • A graphical model compiler
  • Jayes
    • Bayesian networks in Java
  • ToPS
    • Probabilistic models of sequences
  • PyMC
    • Bayesian Models in Python

Text Stuff

Collaborative Filtering

  • PREA
    • Personalized Recommendation Algorithms Toolkit
  • SVDFeature
    • Collaborative Filtering and Ranking Toolkit

Computer Vision

  • OpenCV
    • Computer Vision Library
    • Has ML component (SVM, trees, etc)
    • Online tutorials here
  • DARWIN
    • Generic C++ ML and Computer Vision Library
  • PetaVision
    • Developing a real-time, full-scale model of the primate visual cortex.

Audio Processing

Data Visualization

  • Orange
    • Strong data visualization component
  • Gephi
    • Graph Visualization
  • ggplot
    • Nice plotting package for R
  • MayaVi2
    • 3D Scientific Data Visualization
  • Cytoscape
    • A JavaScript graph library for analysis and visualisation
  • plot.ly
    • Web-based plotting
  • D3 Ebook
    • Has a good list of HTML/CSS/JavaScript data visualization tools.
  • plotly
    • Python plotting tool

Cluster Computing

  • Mahout
    • Hadoop cluster based ML package.
  • STAR: Cluster
    • Easily build your own Python computing cluster on Amazon EC2

Database Stuff

  • MADlib
    • Machine learning algorithms for in-database data
  • Manta
    • Distributed object storage

Neural Simulation

Other

Presentations and other Materials

Topics to Learn and Teach

NBML Course - Noisebridge Machine Learning Curriculum (work-in-progress)

CS229 - The Stanford Machine Learning Course @ Noisebridge

  • Supervised Learning
    • Linear Regression
    • Linear Discriminants
    • Neural Nets/Radial Basis Functions
    • Support Vector Machines
    • Classifier Combination [2]
    • A basic decision tree builder, recursive and using entropy metrics
  • Reinforcement Learning
    • Temporal Difference Learning
  • Math, Probability & Statistics
    • Metric spaces and what they mean
    • Fundamentals of probabilities
    • Decision Theory (Bayesian)
    • Maximum Likelihood
    • Bias/Variance Tradeoff, VC Dimension
    • Bagging, Bootstrap, Jackknife [3]
    • Information Theory: Entropy, Mutual Information, Gaussian Channels
    • Estimation of Misclassification [4]
    • No-Free Lunch Theorem [5]
  • Machine Learning SDKs
    • OpenCV ML component (SVM, trees, etc)
    • Mahout a Hadoop cluster based ML package.
    • Weka a collection of data mining tools and machine learning algorithms.
  • Applications
    • Collective Intelligence & Recommendation Engines
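
For anyone who wants to try the "basic decision tree builder, recursive and using entropy metrics" item from the Supervised Learning list, here is a minimal sketch in Python. It is only one way to do it, not the group's own code: the function names (entropy, information_gain, build_tree), the dict-per-row data layout, and the nested-dict tree format are all assumptions made for illustration. It splits on categorical features by information gain, in the style of ID3, and does no pruning.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on rows[i][feature]."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, features):
    """Recursively build a tree as nested dicts; leaves are class labels."""
    # Pure node: every example has the same label.
    if len(set(labels)) == 1:
        return labels[0]
    # No features left to split on: fall back to the majority label.
    if not features:
        return Counter(labels).most_common(1)[0][0]
    # Greedily pick the feature with the highest information gain.
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    branches = {}
    for value in set(row[best] for row in rows):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        branches[value] = build_tree(sub_rows, sub_labels, remaining)
    return {"feature": best, "branches": branches}
```

To use it, pass a list of dicts (one per example), a parallel list of labels, and the list of feature names to consider, for example build_tree(rows, labels, ["outlook", "windy"]). Continuous features would need a thresholding step that this sketch leaves out.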