Machine Learning Assignment: CS367 SVM vs. Adaboost


Implement the SVM and Adaboost algorithms from the Machine Learning course, and carry out a comparison and performance analysis of the two.

Support Vector Machine classifiers

Use the LibSVM package to train an SVM classifier: a simple example of its use
is provided in demo_libsvm.m. The bullet items below outline the steps to
follow in SVM.m.

  1. Use a Gaussian kernel, and use 10-fold cross-validation to choose the cost and precision parameters (referred to as the -c: cost / -g: gamma parameters in the LibSVM documentation).
    Visualize the cross-validation error (a 2D function) using imagesc.
    Once you have picked the best combination of cost and precision, evaluate the
    classifier’s generalization error on the test set and report the number of
    misclassified points (a minimal grid-search sketch is given after this list).
  2. Plot the decision boundaries, by appropriately modifying the code from the previous exercises. Also superimpose the support vectors on your plot, using the code outlined in SVM.m (a plotting sketch is also given after this list).
    If your code is working right, you should be getting results of the following
    form for the cross-validation error and the decision boundaries:
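
A minimal sketch of the cross-validation grid search, assuming the LibSVM MATLAB interface (svmtrain / svmpredict) is on the path, training data in Xtr / ytr and test data in Xte / yte (hypothetical variable names), and an illustrative parameter grid; the assignment does not prescribe these choices.

```matlab
% Grid search over cost (-c) and gamma (-g) with 10-fold cross-validation.
% Xtr: N x 2 training features, ytr: N x 1 labels in {-1,+1} (assumed names).
log2c = -5:2:15;                 % candidate log2(cost) values (illustrative)
log2g = -15:2:5;                 % candidate log2(gamma) values (illustrative)
cv_err = zeros(numel(log2c), numel(log2g));
for i = 1:numel(log2c)
    for j = 1:numel(log2g)
        opts = sprintf('-t 2 -c %g -g %g -v 10 -q', 2^log2c(i), 2^log2g(j));
        cv_err(i, j) = 100 - svmtrain(ytr, Xtr, opts);   % -v makes svmtrain return CV accuracy (%)
    end
end

% Visualize the cross-validation error as a 2D function of (cost, gamma).
figure; imagesc(log2g, log2c, cv_err); colorbar;
xlabel('log_2 \gamma'); ylabel('log_2 C'); title('10-fold cross-validation error (%)');

% Retrain with the best (cost, gamma) pair and evaluate on the test set.
[~, k] = min(cv_err(:));
[ibest, jbest] = ind2sub(size(cv_err), k);
best_opts = sprintf('-t 2 -c %g -g %g -q', 2^log2c(ibest), 2^log2g(jbest));
model = svmtrain(ytr, Xtr, best_opts);
ypred = svmpredict(yte, Xte, model);
fprintf('misclassified test points: %d\n', sum(ypred ~= yte));
```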
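And a sketch of the decision-boundary plot with the support vectors superimposed, reusing model, Xtr and ytr from the snippet above; the grid resolution and marker styles are arbitrary, and the provided SVM.m may organise this differently.

```matlab
% Evaluate the trained SVM on a dense grid to draw the decision boundary.
[x1, x2] = meshgrid(linspace(min(Xtr(:,1)), max(Xtr(:,1)), 200), ...
                    linspace(min(Xtr(:,2)), max(Xtr(:,2)), 200));
grid_pts = [x1(:), x2(:)];
dummy = zeros(size(grid_pts, 1), 1);                    % labels are unused here
[~, ~, scores] = svmpredict(dummy, grid_pts, model);    % decision values f(x)
scores = reshape(scores, size(x1));

figure; hold on;
contour(x1, x2, scores, [0 0], 'k', 'LineWidth', 2);    % decision boundary f(x) = 0
plot(Xtr(ytr ==  1, 1), Xtr(ytr ==  1, 2), 'r.');
plot(Xtr(ytr == -1, 1), Xtr(ytr == -1, 2), 'b.');
SVs = full(model.SVs);                                  % support vectors stored in the model
plot(SVs(:, 1), SVs(:, 2), 'ko', 'MarkerSize', 8);
title('SVM decision boundary and support vectors');
```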

Adaboost

In Adaboost.m your task is to train an Adaboost classifier on synthetic
data. For reference, you are provided with the posterior P(y = 1|x), x ∈ [0, 1]
× [0, 1], regularly sampled on the [0, 1] × [0, 1] domain, so that you can see
how the output of the Adaboost classifier better approximates the posterior at
each round.

  • Implement the Adaboost algorithm described in class (a minimal sketch of the loop is given after this list). This involves iterating over the following steps:
    • Find the best decision stump at each round.
    • Evaluate the weak learner’s weighted error, and estimate the weak-learner weight α_t.
    • Update the distribution over the training samples.
  • Plot the exponential loss E as a function of the boosting round. Make sure that E is monotonically decreasing with the round number (otherwise your code is wrong). Verify that E = Σ_i exp(−y_i f(x_i)) provides an upper bound for the number of training errors.
  • Show the response of your strong learner side-by-side with the posterior. Make sure that they look increasingly similar at larger iterations.
    If your code is working right, you should be getting results of the following
    form (strong learner shown at rounds 10, 50, 100 and 400).
    Top row: strong learner scores, displayed within [-1, 1], for different numbers
    of boosting rounds.
    Bottom row: left: loss function of Adaboost for increasing training rounds;
    middle: class posterior (ground truth); right: Adaboost-based approximation to
    the posterior, at round 400.
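
A minimal sketch of the boosting loop, under these assumptions: training data in X (N x 2) with labels y (N x 1, values in {-1, +1}), a helper best_stump(X, y, w) returning the lowest weighted-error stump (a naive version is sketched after the clarifications below), and a weighted error strictly between 0 and 0.5 at every round. Variable names are illustrative; the provided Adaboost.m may be organised differently.

```matlab
% AdaBoost with decision stumps (hypothetical variable names).
% X: N x 2 features, y: N x 1 labels in {-1,+1}.
N = size(X, 1);
T = 400;                          % number of boosting rounds
w = ones(N, 1) / N;               % distribution over training samples
F = zeros(N, 1);                  % strong-learner scores on the training set
E = zeros(T, 1);                  % exponential loss per round

for t = 1:T
    % 1) Best decision stump under the current weights w.
    [d, theta, s] = best_stump(X, y, w);
    h = s * (2 * (X(:, d) > theta) - 1);          % stump outputs in {-1,+1}

    % 2) Weighted error of the weak learner and its weight alpha_t.
    eps_t = sum(w .* (h ~= y));                   % assumed to lie in (0, 0.5)
    alpha = 0.5 * log((1 - eps_t) / eps_t);

    % 3) Update the distribution over the training samples.
    w = w .* exp(-alpha * y .* h);
    w = w / sum(w);

    % Track the strong learner and the exponential loss
    % E(t) = sum_i exp(-y_i * F(x_i)), an upper bound on the number of training errors.
    F = F + alpha * h;
    E(t) = sum(exp(-y .* F));
    assert(E(t) >= sum(sign(F) ~= y));            % sanity check on the bound
end

figure; plot(1:T, E); xlabel('boosting round'); ylabel('exponential loss E');
```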

Some clarifications

  • The form of a decision stump is h(x; d, θ, s) = s (2 [x_d > θ] − 1), where d is the dimension along which the decision is taken, [x_d > θ] is 1 if the condition is true and 0 otherwise, θ is the threshold applied along dimension d, and s ∈ {−1, 1} is the polarity of the decision stump.

  • When searching for the best decision stump you need, in principle, to consider all possible combinations of d and θ. As discussed in class, for a given dimension d the only values of θ that you actually need to work with are the values taken on by the training set features along that dimension (a naive search along these lines is sketched below).
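
A naive version of the best_stump helper used in the boosting sketch above, following these clarifications: it tries every dimension d, both polarities s, and, as thresholds θ, only the feature values observed in the training set. The name and signature are illustrative.

```matlab
function [d_best, theta_best, s_best] = best_stump(X, y, w)
% Exhaustive decision-stump search: h(x; d, theta, s) = s*(2*[x_d > theta] - 1).
% X: N x D features, y: N x 1 labels in {-1,+1}, w: N x 1 weights summing to 1.
best_err = inf;
for d = 1:size(X, 2)
    for theta = unique(X(:, d))'              % only observed feature values matter
        for s = [-1, 1]
            h = s * (2 * (X(:, d) > theta) - 1);
            err = sum(w .* (h ~= y));
            if err < best_err
                best_err = err; d_best = d; theta_best = theta; s_best = s;
            end
        end
    end
end
end
```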

SVM vs. Adaboost BONUS

Compare the performance of the classifier you obtained from the previous part
of the exercise with that of Adaboost. Use the same training and test sets for
both classifiers. As above, make sure you have reproducible results by
appropriately setting the seed used by rand and randn.
Report their average misclassification errors on the training set and on the
test set. What do you observe?
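
A short sketch of the comparison, assuming the SVM model from the LibSVM snippet above and strong-learner scores F_tr / F_te produced by the boosting loop on the training and test sets (hypothetical names); rng(0) seeds the generator used by both rand and randn in recent MATLAB versions.

```matlab
rng(0);                                    % fix the seed before generating / splitting the data
% ... generate the data, split into (Xtr, ytr) / (Xte, yte), train both classifiers ...

% Average misclassification error of the SVM.
svm_tr = mean(svmpredict(ytr, Xtr, model) ~= ytr);
svm_te = mean(svmpredict(yte, Xte, model) ~= yte);

% Average misclassification error of the AdaBoost strong learner.
ada_tr = mean(sign(F_tr) ~= ytr);
ada_te = mean(sign(F_te) ~= yte);

fprintf('SVM      train/test error: %.3f / %.3f\n', svm_tr, svm_te);
fprintf('AdaBoost train/test error: %.3f / %.3f\n', ada_tr, ada_te);
```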

Fast weak learner selection BONUS

When picking the best weak learner at each round, a substantial speedup can be
gained by a clever implementation of the threshold selection. Some hints are
provided below; you can try to implement it based on these hints.
Consider that we are searching for the best weak learner that operates on the
k-th feature dimension, f^(k). Our task is to choose the best threshold θ for a
decision stump of the form h(x; θ, s) = s (2 [f^(k)(x) > θ] − 1).
The naive implementation is to consider all N possible thresholds, one per
training sample (where N is the training sample size), and for each of those
evaluate the associated weighted error.
The complexity of this computation is O(N^2), since for each of the N
threshold values we need to perform N summations and multiplications.
Instead of this, a faster algorithm will first sort the N training samples
according to their value of f^(k); namely, consider that we have a permutation
of the training set such that the sorted feature values f[i] are in increasing
order, with the weights and labels reordered accordingly. Using these, we then
have the following recursion for C(i).
The first and second equations follow from the definition in Eq. 2 and the
fact that the feature values f have been sorted. The third equation can be verified by
substitution. Sorting has O(N log N) complexity, while the recursion can be
implemented in O(N) time, so the overall complexity goes down from O(N^2) to
O(N log N). In the code provided to you, this idea has been used to accelerate
the weak learner selection.
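
A sketch of the sort-plus-running-sum idea for a single feature dimension, assuming labels in {-1, +1} and weights that sum to one; the exact C(i) recursion in the provided code may be organised differently, but the complexity argument is the same, and ties between feature values are ignored here for simplicity.

```matlab
function [theta, s, err] = best_threshold_fast(f, y, w)
% Fast threshold selection along one feature dimension f (N x 1) for a stump
% h(x) = s*(2*[f > theta] - 1); y: labels in {-1,+1}, w: weights summing to 1.
[fs, idx] = sort(f(:));                   % O(N log N): sort samples by feature value
ys = y(idx);  ws = w(idx);

wpos = ws .* (ys ==  1);                  % weight of positive samples, in sorted order
wneg = ws .* (ys == -1);                  % weight of negative samples, in sorted order

% With theta = fs(i) and polarity s = +1, samples 1..i are predicted -1 and
% samples i+1..N are predicted +1, so the weighted error is
%   errp(i) = sum_{j <= i} wpos(j) + sum_{j > i} wneg(j),
% which two cumulative sums evaluate for all i in O(N) total.
errp = cumsum(wpos) + (sum(wneg) - cumsum(wneg));
errp = [sum(wneg); errp];                 % also allow a threshold below the smallest value
errm = 1 - errp;                          % flipping the polarity flips every decision
thetas = [fs(1) - 1; fs];                 % threshold associated with each cut point

[ep, ip] = min(errp);
[em, im] = min(errm);
if ep <= em
    err = ep; theta = thetas(ip); s = 1;
else
    err = em; theta = thetas(im); s = -1;
end
end
```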
Inspect the code provided to you and reply to the following questions in no
more than 100 words (the shorter, the better):

  • What is the code in lines 22-26 doing?
  • What is the code in lines 28-38 doing?
    Note: if you do not understand the code just by looking at it, this is perfectly
    normal. Before the solution was “leaked”, the assignment asked you to write
    all of this code yourself. Now, since the solution is leaked, we are asking you
    to spend time understanding what is going on in the provided solution. Running
    the code and inspecting the variables is one way to help you do this.
    But, since this is a bonus exercise, you are expected to “go the extra mile”
    to get the bonus.

Author: SafePoker
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit SafePoker when reposting!