[HMM Libraries for Python][scikit-learn user guide] 1.4.7 Hidden Markov Models

Machine Learning/HMM 2015. 2. 26. 21:00

The implementation in this post follows the user guide below.

http://www.math.unipd.it/~aiolli/corsi/1213/aa/user_guide-0.12-git.pdf

sklearn.hmm implements the Hidden Markov Model (HMM) algorithms.

There are three fundamental problems for HMMs.

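(1) Decoding: given the model parameters (Θ) and the observed data (O), estimate the optimal sequence of hidden states.

(2) Evaluation: given the model parameters (Θ) and the observed data (O), compute the likelihood of the data.

(3) Learning: given only the observed data, estimate the model parameters (Θ).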

(1) and (2) can be solved with dynamic programming: the Viterbi algorithm and the Forward-Backward algorithm, respectively.

(3) can be solved with the Baum-Welch algorithm, an iterative Expectation-Maximization (EM) procedure.
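To make the first two problems concrete, here is a minimal NumPy sketch on a toy discrete HMM. The probabilities, the observation sequence, and the function names are all made up for illustration; this is not the sklearn.hmm or hmmlearn implementation. It computes the likelihood with the forward pass (the half of Forward-Backward that the likelihood needs), finds the best state sequence with Viterbi, and cross-checks both against brute-force enumeration of every state path.

```python
import itertools
import numpy as np

# Toy discrete HMM; all numbers are hypothetical, chosen only for illustration.
start_prob = np.array([0.6, 0.4])
trans_mat = np.array([[0.7, 0.3],
                      [0.4, 0.6]])
emit_prob = np.array([[0.5, 0.4, 0.1],   # emit_prob[state, symbol]
                      [0.1, 0.3, 0.6]])
obs = [0, 1, 2, 2]                       # observed symbol indices


def forward_likelihood(obs, start_prob, trans_mat, emit_prob):
    """Likelihood P(O | model) via the forward recursion."""
    alpha = start_prob * emit_prob[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ trans_mat) * emit_prob[:, o]
    return float(alpha.sum())


def viterbi(obs, start_prob, trans_mat, emit_prob):
    """Most probable hidden state sequence and its joint probability."""
    delta = start_prob * emit_prob[:, obs[0]]
    backptr = []
    for o in obs[1:]:
        scores = delta[:, None] * trans_mat       # scores[i, j]: best path ending in i, then i -> j
        backptr.append(scores.argmax(axis=0))     # best predecessor for each state j
        delta = scores.max(axis=0) * emit_prob[:, o]
    path = [int(delta.argmax())]
    for bp in reversed(backptr):                  # follow back-pointers from the last step
        path.append(int(bp[path[-1]]))
    return path[::-1], float(delta.max())


def joint_prob(path):
    """P(O, path | model) for one explicit state path (used for the brute-force check)."""
    p = start_prob[path[0]] * emit_prob[path[0], obs[0]]
    for t in range(1, len(obs)):
        p *= trans_mat[path[t - 1], path[t]] * emit_prob[path[t], obs[t]]
    return float(p)


likelihood = forward_likelihood(obs, start_prob, trans_mat, emit_prob)
best_path, best_prob = viterbi(obs, start_prob, trans_mat, emit_prob)

# Brute force over all 2**4 state paths, feasible only for toy sizes.
all_paths = list(itertools.product(range(len(start_prob)), repeat=len(obs)))
brute_likelihood = sum(joint_prob(p) for p in all_paths)
brute_best = max(all_paths, key=joint_prob)

print("likelihood:", likelihood)
print("best path:", best_path, "with probability", best_prob)
```

The sketch multiplies raw probabilities, so it would underflow on long sequences; practical implementations usually work in log space or rescale at each step.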

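Up to this point, the content matches the book 'Pattern Recognition' (패턴인식) by Oh Il-seok (오일석).

That is probably because both the book and this user guide are based on the Rabiner89 paper.

Anyway, on with the coding.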

After coding up the first part, damn.. the following warning shows up.

//anaconda/lib/python2.7/site-packages/sklearn/utils/__init__.py:75: DeprecationWarning: Class _BaseHMM is deprecated; WARNING: The HMM module and its function will be removed in 0.17as it no longer falls within the project's scope and API. It has been moved to a separate repository: https://github.com/hmmlearn/hmmlearn
  warnings.warn(msg, category=DeprecationWarning)

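Looking at the site, the HMM code seems to have changed somewhat.

It appears to have moved out of scikit-learn into hmmlearn.

So, as the warning says, I went to https://github.com/hmmlearn/hmmlearn and followed the instructions there step by step.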

1. Clone the git repository

git clone git://github.com/hmmlearn/hmmlearn.git

2. Check the dependencies (?)

pip install scikit-learn

3. Install hmmlearn from the folder where the repository was cloned.

python setup.py install

After installing, I changed sklearn to hmmlearn in the import and ran the code, which produced the error below.

ImportError cannot import name _hmmc

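Following the advice in a hmmlearn GitHub issue (https://github.com/hmmlearn/hmmlearn/issues/3), I simply copied the whole hmmlearn folder over.

The issue explained it with this command:

sudo cp -rf ./* /usr/local/lib/python2.7/dist-packages/hmmlearn

However, I installed Anaconda and develop in that environment, so I copied the folder into its site-packages instead:

sudo cp -rf ./* /anaconda/lib/python2.7/site-packages/hmmlearn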
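Rather than hard-coding the destination, the site-packages directory of whichever interpreter is active can be looked up from Python itself. The following is a small sketch using only the standard library; the helper names `site_packages_dir` and `installed_location` are hypothetical, and it assumes Python 3.4 or later (the post itself was written against Python 2.7, where `importlib.util.find_spec` is not available).

```python
import importlib.util
import sysconfig


def site_packages_dir():
    """Return the pure-Python package directory of the running interpreter."""
    return sysconfig.get_paths()["purelib"]


def installed_location(package_name):
    """Return the file a top-level package would be imported from, or None if it is absent."""
    spec = importlib.util.find_spec(package_name)
    return None if spec is None else spec.origin


print("site-packages:", site_packages_dir())
print("hmmlearn found at:", installed_location("hmmlearn"))
```

Running this with the Anaconda interpreter and with the system interpreter shows whether the two are looking in different directories, which is the situation described above.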

Anyway, the code below ran to completion.

import numpy as np

from hmmlearn import hmm

startprob = np.array([0.6, 0.3, 0.1])

transmat = np.array([[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.3, 0.3, 0.4]])

means = np.array([[0.0, 0.0], [3.0, -3.0], [5.0, 10.0]])

covars = np.tile(np.identity(2), (3, 1, 1))

model = hmm.GaussianHMM(3, "full", startprob, transmat)

model.means_ = means

model.covars_ = covars

X, Z = model.sample(100)

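However, I expected a graph to appear as in the user guide, and was puzzled when none did.. (the snippet above only draws samples; it contains no plotting calls)

First, let's analyze the code.

Storing the values in arrays is fine so far.

What does np.tile do??

I was about to Google it, but instead opened the book 'Python for Data Analysis' (파이썬 라이브러리를 활용한 데이터 분석) and looked up tile: p. 491 says it is a function that copies an array and stacks the copies.

So it seems to copy the array returned by np.identity according to (3, 1, 1), that is, three copies stacked along a new leading axis.

Let me print it to see what it is.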

np.identity(2) produces the identity matrix (I had forgotten the name):

[[ 1.  0.]
 [ 0.  1.]]

Printing covars shows three such arrays stacked on top of each other.

covars
[[[ 1.  0.]
  [ 0.  1.]]

 [[ 1.  0.]
  [ 0.  1.]]

 [[ 1.  0.]
  [ 0.  1.]]]
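A quick way to confirm what the `(3, 1, 1)` argument does is to look at the resulting shapes. The sketch below repeats the call from the snippet and contrasts it with a different repetition tuple; the variable names `eye` and `side_by_side` are only for illustration.

```python
import numpy as np

eye = np.identity(2)                  # 2x2 identity matrix
covars = np.tile(eye, (3, 1, 1))      # 3 copies along a new leading axis
print(covars.shape)                   # (3, 2, 2)

side_by_side = np.tile(eye, (1, 3))   # contrast: repeat 3 times along the columns
print(side_by_side.shape)             # (2, 6)
```

The `(3, 2, 2)` shape is one 2x2 covariance matrix per hidden state, which is the layout the "full" covariance type uses in the snippets in this post.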

The code that draws the graph is below.

import numpy as np

import matplotlib.pyplot as plt

from hmmlearn import hmm

##############################################################
# Prepare parameters for a 4-components HMM

# Initial population probability

start_prob = np.array([0.6, 0.3, 0.1, 0.0])

# The transition matrix, note that there are no transitions possible
# between component 1 and 3

trans_mat = np.array([[0.7, 0.2, 0.0, 0.1],
                      [0.3, 0.5, 0.2, 0.0],
                      [0.0, 0.3, 0.5, 0.2],
                      [0.2, 0.0, 0.2, 0.6]])

# The means of each component

means = np.array([[0.0, 0.0],
                  [0.0, 11.0],
                  [9.0, 10.0],
                  [11.0, -1.0],
                  ])

# The covariance of each component

covars = .5 * np.tile(np.identity(2), (4, 1, 1))

# Build an HMM instance and set parameters

model = hmm.GaussianHMM(4, "full", start_prob, trans_mat,
                        random_state=42)

# Instead of fitting it from the data, we directly set the estimated
# parameters, the means and covariance of the components

model.means_ = means

model.covars_ = covars
###############################################################

# Generate samples

X, Z = model.sample(500)

# Plot the sampled data

plt.plot(X[:, 0], X[:, 1], "-o", label="observations", ms=6, mfc="orange", alpha=0.7)

# Indicate the component numbers

for i, m in enumerate(means):
    plt.text(m[0], m[1], "Component %i" % (i + 1),
             size=17, horizontalalignment="center",
             bbox=dict(alpha=.7, facecolor="w"))

plt.legend(loc="best")

plt.show()
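For intuition about what `model.sample(500)` produces in the plotting code above, the generative process can be sketched with plain NumPy: pick a start state, emit a Gaussian observation for it, move to the next state according to the transition row, and repeat. The function name `sample_gaussian_hmm` and the `seed` parameter are hypothetical, and this is only a sketch of the idea, not hmmlearn's implementation, so its random draws will not match the library's.

```python
import numpy as np


def sample_gaussian_hmm(start_prob, trans_mat, means, covars, n_samples, seed=0):
    """Draw (observations, states) from a Gaussian HMM by simulating the chain."""
    rng = np.random.default_rng(seed)
    n_states = len(start_prob)
    states = np.empty(n_samples, dtype=int)
    obs = np.empty((n_samples, means.shape[1]))
    state = rng.choice(n_states, p=start_prob)            # initial hidden state
    for t in range(n_samples):
        states[t] = state
        obs[t] = rng.multivariate_normal(means[state], covars[state])  # emission
        state = rng.choice(n_states, p=trans_mat[state])  # next hidden state
    return obs, states


# Same parameters as the 4-component example above.
start_prob = np.array([0.6, 0.3, 0.1, 0.0])
trans_mat = np.array([[0.7, 0.2, 0.0, 0.1],
                      [0.3, 0.5, 0.2, 0.0],
                      [0.0, 0.3, 0.5, 0.2],
                      [0.2, 0.0, 0.2, 0.6]])
means = np.array([[0.0, 0.0], [0.0, 11.0], [9.0, 10.0], [11.0, -1.0]])
covars = .5 * np.tile(np.identity(2), (4, 1, 1))

X, Z = sample_gaussian_hmm(start_prob, trans_mat, means, covars, 500)
print(X.shape, Z.shape)
```

Because consecutive points come from a Markov chain, the line plot connects clusters only along transitions whose probability in `trans_mat` is nonzero.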

What I want to try next is running the HMM code from 'Pattern Recognition' (패턴인식) by Oh Il-seok (오일석) in Python.

I will cover that in a later post.

Posted by 공놀이나하여보세