MachineLearning代写:CSCC11Classification


完成Machine Learning中关于 Classification
的练习题。
![Classification](https://upload.wikimedia.org/wikipedia/commons/thumb/3/3b/Computer_vision_sample_in_Sim%C3%B3n_Bolivar_Avenue%2C_Quito.jpg/250px-
Computer_vision_sample_in_Sim%C3%B3n_Bolivar_Avenue%2C_Quito.jpg)

Instructions

Download the starter file on the course webpage. The file requires the same
login and password as the video lectures. Once downloaded and unpacked, a
directory called Exam will exist. Please do not change the directory structure
or the headers of functions in python files. All modified python files and
others you generate (e.g., plots) should be left in the Exam directory (which
you’ll submit). Remember to read all instructions and comments in the python
files very carefully.

Written work

For written answers you can either write on paper or use an editor that
supports equations (eg Word or Latex). If writing on paper, please scan or
photograph and save as a png file or perhaps pdf (and make sure it is easily
legible). If formatting your answers in Word or latex etc, please save as pdf.
All such png or pdf files should be left in the Exam directory for submission.

Environment

You’ll use the same environment you’ve been using on assignments. You will
need Python3,
Numpy, Matplotlib, and Pickle; you may not use any other python packages.
Please pre-install if working on your own computer. If you’d rather work on a
UTSC machine, remote login to one of these lab machines:

  1. IC lab (ranging from 01 to 50): i406-[01-50]
  2. BV lab (ranging from 01 to 40): b473-[01-40]

Submission

Submit your exam on the Mathlab server (like assignments). Do not modify the
Exam directory tree or method/function headers therein.

Q1. Classification with Gaussian Class Conditionals

In Chapter 8.4 of the online course lecture notes, we introduced the Gaussian
class-conditional (GCC) model for classification. The generative model for a
data point x, in the two class case. Where c is the class random variable; it
takes on the value 0 for one class, and 1 for the other. The decision function
for the CC model is the log posterior ratio.
To learn the model, given N data points, we estimate the class priors. For GCC
models the likelihoods are Gaussian, for which we estimate the means and
covariances.
For the problem below you are asked implement the GCC model. You will then
apply your algorithm to three datasets, each having a training set and a test
set. You will train this model, visualize the Gaussians, and identify the
properties of the GCC model. Starter code is in directory gcc.
Implementing GCC: You should implement the GCC model by completing the methods
train and predict in gcc.py :

  1. train fits the GCC model. The means and covariances for the likelihoods should be stored in self.means and self.covariances. The learned priors are stored in self.priors.
  2. predict takes inputs and computes the probability of each input being assigned to each class.
    You can test your code by running debug_gcc.py. It provides simple test cases
    and feedback to help debug your code. Once you are confident with your
    implementation, run train_gcc.py on the three datasets mentioned above by
    modifying the dataset variable in the main block. This code trains a GCC model
    on the specified dataset, computes the test accuracy, and displays the
    Gaussian likelihoods with the training and test data.
    Provide a brief answer (no more than 3 or 4 sentences) for each of the
    following questions in Questions.txt:
  3. Does GCC perform well on generic_1? Why or why not?
  4. Does GCC perform well on generic_2? Why or why not?
  5. Does GCC perform well on generic_3? Why or why not?
  6. If input feature vectors have length D, how many parameters do we need to learn for each Gaussian?
  7. Suppose you decide to use Naive Bayes with GCC. In words, what would you have to change in your code? How many parameters would you need to learn for each Gaussian in this case?
  8. Suppose certain elements of the input feature vectors are highly correlated. Why is this problematic for Naive Bayes?
  9. How could you transform the data to mitigate this problem with Naive Bayes?

Q2. Multi-Line Regression with Expectation-Maximization

In Assignment 4 we fit a Gaussian mixture model (GMM) to data using the EM
Algorithm. The mixture model provides a way to assign points to clusters in a
probabilistic sense. That is, we compute the posterior probability that point
i should be owned by (i.e., assigned to) the jth component of the mixture.
Where K is the number of components in the mixture. Given the ownership
probabilities, we then computed weighted means and covariances, the parameters
of the Gaussian components. This works well for density estimation and
clustering, but is actually much more general. To that end, here we’re going
to consider the use of EM for parametric model fitting; i.e., a mixture of
regressors.
Suppose you have a bunch of points in the plane, like those shown in the plot
below. Normally you might not know how many lines are in any given dataset,
but here it seems clear enough that you should be able to fit two lines. But
to do this you need to figure out a way to determine which points belong to
which line. And this is where the EM algorithm comes in, since we can use the
posterior probability that a given point comes from a given line. This will
allow us to simultaneously regress to multiple lines (not just two).
To find the model parameters we minimize the negative log likelihood with the
added constraint that mixing probabilities sum to 1. For this we add a
Lagrange multiplier.
We’ll treat the noise variance 2 as a hyper-parameter, and set it manually in
the code.
The EM algorithm alternates between inferring the posterior over the
assignment of points to lines in the E-step, and the estimation of the model
parameters in the M-step. For the E-step, as with the GMM in A4, we’ll write
the posterior ownership probability.

Written Part

The questions below ask you to derive the equations for the E-step and the
M-step. Save your answers as written.pdf in the directory em_line_fitting.
E.g., you could take pictures of written work on paper, or format
electronically, e.g., with LateX.

Programming Part

Implement multi-line fitting using the EM algorithm described above. The
starter code is in directory em_line_fitting. The associated methods are:

  1. _fit: fits K linear models to training data. Each iteration contains the E-step and the M-step. The iteration terminates when there is negligible change in the ownership probabilities.
  2. _e_step: computes the ownership probabilities for each data point.
  3. _m_step: computes the mixing probabilities and the parameters of each of the linear models.
  4. _visualize: visualizes training data with fitted lines and ownership probabilities for each point.
    You will need to complete the methods _e_step and _m_step. The equations for
    these steps are given above. The starter code includes a way to specify
    initial conditions for the optimization; this includes the initial line
    parameters and the mixing probabilities.
    You can run train_em_line_fitting.py to debug your code with pre-defined test
    cases (test_case = [i], where [i] = 0, 1, 2, 3). This script determines
    whether your learned model parameters, mixing probabilities, and ownership
    probabilities are correct. If you set visualize = True, this method will
    display at each iteration the current estimates of the fitted lines, along
    with a plot that shows, for each model, the ownership probability for each
    data point. Once you’re confident in the code, we’ll analyze the results.
    The EM algorithm will find one of several local optima. Some are good, and
    some are not great. The chance of getting “stuck” in poor minima often depends
    on the initial guess and the value of the model standard deviation. To explore
    this, we’ve pre-specified several test cases, with different numbers of lines
    and values of . Run train_em_line_fitting.py for each test case, setting
    test_case to be 0 through 4. If you set test_case = False, you can play with
    the algorithm by manually changing the seed, stddev, num_lines.
    When your code works correctly, the first four test cases should produce the
    following plots. In these plots, each component (line) and its associated
    ownership probabilities (the bar graph) have a unique colour.
    Now, reflect on these estimates and how they change with each iteration as the
    model fit evolves from the initial guess to the final estimate. And provide
    brief answers (no more than 5 sentences) to the following questions in
    Questions.txt :
  5. In test case 0, the two lines fit well. The ownership probabilities are either 0 or 1 near the right and left sides of the plot, but near 0.5 closer to the middle. Explain why this occurs and why it’s useful.
  6. In test case 1, the lines do not fit the data well. Explain what might be the source of the problem in this case.
  7. Test cases 2 and 3 fit two and three lines, but in both cases not all lines appear to fit the data well. Select one of these two test cases and explain what might be the source of the problem, making reference to the model variance and the ownership probabilities.

文章作者: SafePoker
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 SafePoker !
  目录