Python Assignment Help: CSC411 Digit Classification

Published: 2016-01-07

Requirement

In this assignment, you will compare the characteristics and performance of
different classifiers, namely logistic regression, k-nearest neighbours and
naive Bayes. You will experiment with these classifiers and extend the provided
code. Note that you should understand the code first rather than treating it as
a black box.
Python versions of the code have been provided. You are free to work with
whichever you wish.

Analysis

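The assignment covers three fundamental Machine Learning algorithms:

Logistic regression, commonly used in data mining, automated disease diagnosis, economic forecasting, and similar fields.
K-nearest neighbours, commonly used in data mining, classification, and recognition of previously unseen items.
Naive Bayes, commonly used to build classifiers, for example for text classification.
The assignment provides basic implementations of these three algorithms, but the unfinished test functions still have to be implemented so that they fit the dispatch logic of the test framework.
The work is engineering-oriented: repeated debugging deepens your understanding of the algorithms.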
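
To make the k-nearest neighbours idea concrete, here is a minimal NumPy sketch. It is not the code provided with the assignment: the function name `knn_predict`, its parameters, and the toy data are assumptions made up for illustration. It also assumes the labels are non-negative integers, so that a majority vote can be taken with `np.bincount`.

```python
import numpy as np

def knn_predict(train_x, train_y, query_x, k=3):
    """Label each query row by majority vote among its k nearest training rows.

    Hypothetical helper for illustration; assumes non-negative integer labels.
    """
    # Squared Euclidean distances, shape (n_query, n_train)
    diffs = query_x[:, None, :] - train_x[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Indices of the k closest training points for each query row
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Majority vote over the labels of those neighbours
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Toy data (made up): two well-separated clusters in 2-D
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
train_y = np.array([0, 0, 0, 1, 1, 1])
query_x = np.array([[0.05, 0.05], [1.0, 0.95]])

predictions = knn_predict(train_x, train_y, query_x, k=3)
```

This sketch computes all pairwise distances at once, which is simple but memory-hungry. For a larger digit dataset you would likely process the queries in batches.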

Tips

Below is an implementation of the check_grad function:
import numpy as np
from numpy import linalg as LA

def check_grad(func, X, epsilon, *args):
    # X must be a column vector of shape (n, 1)
    if len(X.shape) != 2 or X.shape[1] != 1:
        raise ValueError("X must be a vector")
    y, dy = func(X, *args)[:2]  # get the partial derivatives dy
    dh = np.zeros((len(X), 1))
    for j in range(len(X)):
        # perturb only the j-th coordinate by +/- epsilon
        dx = np.zeros((len(X), 1))
        dx[j] += epsilon
        y2 = func(X + dx, *args)[0]
        dx = -dx
        y1 = func(X + dx, *args)[0]
        dh[j] = (y2 - y1) / (2 * epsilon)  # central difference
    print(np.hstack((dy, dh)))  # print the two vectors
    d = LA.norm(dh - dy) / LA.norm(dh + dy)  # return norm of diff divided by norm of sum
    return d
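
As an illustration of how such a gradient check is typically used, the sketch below compares the analytic gradient of a logistic regression loss against a central finite-difference estimate. The names `logistic_loss` and `finite_difference_grad`, the random data, and the choice of mean cross-entropy are assumptions for this example. The assignment's own loss function may differ, for instance by including a bias term or regularization.

```python
import numpy as np

def logistic_loss(w, data, targets):
    """Return (loss, gradient) of the mean cross-entropy for weights w of shape (d, 1).

    Hypothetical loss for illustration; no bias term or regularization.
    """
    z = data @ w                      # logits, shape (n, 1)
    p = 1.0 / (1.0 + np.exp(-z))      # sigmoid probabilities
    loss = -np.mean(targets * np.log(p) + (1 - targets) * np.log(1 - p))
    grad = data.T @ (p - targets) / len(targets)
    return loss, grad

def finite_difference_grad(func, w, epsilon, *args):
    """Central-difference estimate of the gradient of func at w."""
    approx = np.zeros_like(w)
    for j in range(len(w)):
        step = np.zeros_like(w)
        step[j] = epsilon             # perturb only coordinate j
        upper = func(w + step, *args)[0]
        lower = func(w - step, *args)[0]
        approx[j] = (upper - lower) / (2 * epsilon)
    return approx

# Random toy problem (made up): 20 examples, 3 features
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 3))
targets = (rng.random((20, 1)) > 0.5).astype(float)
w = rng.normal(size=(3, 1))

_, analytic = logistic_loss(w, data, targets)
numeric = finite_difference_grad(logistic_loss, w, 1e-5, data, targets)

# Same relative measure as above: norm of the difference over norm of the sum
rel_diff = np.linalg.norm(numeric - analytic) / np.linalg.norm(numeric + analytic)
```

A very small `rel_diff` suggests the analytic gradient is consistent with the loss. A value close to 1 usually points to a sign error or a missing term in the derivative.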