Machine Learning Meetup Notes: 2010-08-18
Mike - HMMs[edit]
HMM used for time series data
Markov Chains: a matrix of transition probabilities, where a_ij is the probability of going from state i to state j. The transition probability is a function of the previous state only (the Markov property).
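A minimal sketch of a Markov chain as a transition matrix; the weather states and probabilities below are illustrative, not from the talk.

```python
import random

# a[i][j] = probability of moving from state i to state j
# (hypothetical two-state weather chain)
states = ["hot", "cold"]
a = {
    "hot":  {"hot": 0.7, "cold": 0.3},
    "cold": {"hot": 0.4, "cold": 0.6},
}

def step(state):
    """Sample the next state using only the current state (Markov property)."""
    r = random.random()
    total = 0.0
    for nxt, p in a[state].items():
        total += p
        if r < total:
            return nxt
    return nxt  # guard against floating-point rounding

random.seed(0)
chain = ["hot"]
for _ in range(5):
    chain.append(step(chain[-1]))
print(chain)
```

Each row of the matrix sums to 1, since from any state you must go somewhere.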
HMM: example is trying to predict the weather based on a diary of someone's ice cream eating habits. Hidden state = what you are trying to predict (the weather); observed state = the ice cream eaten.
3 problems HMMs can solve: likelihood, decoding, and training.
- Likelihood: compute the probability of an observation sequence by summing over all hidden-state paths
- Decoding: what's the best sequence of hidden states that produces your observation sequence (Viterbi Algorithm), which is similar to maximum likelihood
- Training: given an observation sequence, learn the state transition probabilities and the emission probabilities of an HMM (Expectation Maximization / Baum-Welch, which uses the Forward-Backward Algorithm)
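The decoding problem can be sketched with a small Viterbi implementation on the weather/ice-cream example; the probabilities are illustrative numbers, not from the talk.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for an observation sequence."""
    # V[t][s] = probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prev, p = max(((ps, V[t-1][ps] * trans_p[ps][s]) for ps in states),
                          key=lambda x: x[1])
            V[t][s] = p * emit_p[s][obs[t]]
            back[t][s] = prev
    # Trace back from the best final state
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Hidden: weather; observed: ice creams eaten per day (illustrative numbers)
states = ["hot", "cold"]
start = {"hot": 0.5, "cold": 0.5}
trans = {"hot": {"hot": 0.7, "cold": 0.3}, "cold": {"hot": 0.4, "cold": 0.6}}
emit = {"hot": {1: 0.2, 2: 0.4, 3: 0.4}, "cold": {1: 0.5, 2: 0.4, 3: 0.1}}

print(viterbi([3, 3, 1, 1], states, start, trans, emit))
# → ['hot', 'hot', 'cold', 'cold']
```

The forward algorithm for the likelihood problem has the same shape but replaces the `max` over previous states with a sum.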
Thomas - HMM in R[edit]
Three packages:
- HMM - doesn't allow for multiple chains
- hmm.discnp
- msm - allows for time-based (continuous-time) HMMs vs. discrete time steps; you can fit in time in between states
Glen - protein prediction[edit]
uniprot.org, fasta
>sp|P69906|HBA_PANPA Hemoglobin subunit alpha OS=Pan paniscus GN=HBA1 PE=1 SV=2
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHG
KKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTP
AVHASLDKFLASVSTVLTSKYR
>sp|P69907|HBA_PANTR Hemoglobin subunit alpha OS=Pan troglodytes GN=HBA1 PE=1 SV=2
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHG
KKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTP
AVHASLDKFLASVSTVLTSKYR
>sp|P01942|HBA_MOUSE Hemoglobin subunit alpha OS=Mus musculus GN=Hba PE=1 SV=2
MVLSGEDKSNIKAAWGKIGGHGAEYGAEALERMFASFPTTKTYFPHFDVSHGSAQVKGHG
KKVADALASAAGHLDDLPGALSALSDLHAHKLRVDPVNFKLLSHCLLVTLASHHPADFTP
AVHASLDKFLASVSTVLTSKYR
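The FASTA format above (a `>` header line followed by wrapped sequence lines) is simple to parse by hand; a minimal sketch, using one of the records above:

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records

example = """>sp|P69906|HBA_PANPA Hemoglobin subunit alpha
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHG
KKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTP
AVHASLDKFLASVSTVLTSKYR"""
seqs = parse_fasta(example)
# UniProt headers are pipe-delimited: db|accession|entry name
print({h.split("|")[1]: len(s) for h, s in seqs.items()})
# → {'P69906': 142}
```

For real work a library parser (e.g. Biopython's SeqIO) handles the format's edge cases.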
- BLOSUM62 is used for determining the probabilities of protein mutations (i.e., a V -> I substitution is more likely than V -> W)
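A sketch of how a substitution matrix is used as a scoring lookup; the scores below are hand-copied from BLOSUM62 for just these pairs, so verify against the full matrix before relying on them.

```python
# Tiny hand-copied subset of BLOSUM62 (log-odds scores; higher = more
# likely substitution). Verify values against the full published matrix.
BLOSUM62 = {
    ("V", "V"): 4, ("V", "I"): 3, ("V", "W"): -3,
    ("I", "I"): 4, ("W", "W"): 11,
}

def score(a, b, matrix=BLOSUM62):
    """Symmetric lookup: the matrix stores each unordered pair once."""
    return matrix.get((a, b), matrix.get((b, a)))

# A conservative V -> I substitution scores higher than V -> W
print(score("V", "I"), score("V", "W"))
# → 3 -3
```

Alignment tools like BLAST sum these per-position scores over an alignment.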
Mike - Speech recognition[edit]
- Fourier transform takes a wave and turns it into its component frequencies
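A minimal sketch of that idea with NumPy's FFT: a pure 5 Hz sine wave (an illustrative signal, not from the talk) comes out as a single spike at 5 Hz.

```python
import numpy as np

# Sample a 5 Hz sine wave for one second
fs = 64          # sampling rate in Hz
t = np.arange(fs) / fs
wave = np.sin(2 * np.pi * 5 * t)

# The FFT turns the time-domain wave into frequency components
spectrum = np.abs(np.fft.rfft(wave))
freqs = np.fft.rfftfreq(len(wave), d=1/fs)

# The strongest component sits at the sine's frequency
print(freqs[np.argmax(spectrum)])
# → 5.0
```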
- spectrogram - a time-frequency representation of a signal
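A spectrogram is built by taking FFTs of short overlapping windows; a minimal short-time Fourier transform sketch (window/hop sizes and the test signal are illustrative):

```python
import numpy as np

def spectrogram(wave, win=64, hop=32):
    """Magnitude STFT: FFT of successive windowed frames (time x frequency)."""
    window = np.hanning(win)
    frames = []
    for start in range(0, len(wave) - win + 1, hop):
        frame = wave[start:start + win] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)

fs = 640
t = np.arange(fs) / fs
# First half of the signal is 50 Hz, second half is 150 Hz
wave = np.concatenate([np.sin(2 * np.pi * 50 * t[:fs // 2]),
                       np.sin(2 * np.pi * 150 * t[fs // 2:])])
S = spectrogram(wave)
print(S.shape)  # (time frames, frequency bins)
```

Unlike a single FFT over the whole signal, the frame axis shows the dominant frequency jumping from 50 Hz to 150 Hz halfway through.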
- Speech and Language Processing by Daniel Jurafsky and James H. Martin