AI代写:CS3049LikelihoodWeightedSampling


Introduction

这次要代写的是一个AI作业,完成一个lecture里面的probabilistic graphical model (PGM).
编程语言可以是C/C++或者Java,虽然C/C++运算效率高,但是明显还是Java开发速度快。

Probabilistic graphical models

Your task is to write a program to implement likelihood weighted sampling, as
described in lecture 18, to perform inference on an arbitrary probabilistic
graphical model (PGM) of boolean random variables. You will need to parse an
input text file that encodes the graph and the conditional probability
distributions for each variable (conditioned on its parents). The format of
the file is given in the section below. Two networks have been defined in this
format and can be downloaded, along with two queries each from the forum, in
file ass3.zip.
Your program must be able to parse any file of this format to create and
populate an internal data structure of the PGM.
Your program must then prompt the user via the console (or read from
redirected standard input) for a single variable query, parse the query
correctly and evaluate the conditional probability distribution of the query
variable, given the evidence. The result must be written as two decimal values
(corresponding to the values of P(QueryVar=true|…) and P(QueryVar=false|…))
onto the standard output stream, separated by white space. For example:
0.872 0.128
Note that if you write anything else before these two numbers, automark will
interpret that output as the answer, almost certainly resulting in a test
failure.

File format

The graphical model will be specified by a text file with format:
N
rv0 rv1 … rvN-1
0 0 1 … 0
1 0 0 … 1

0 1 1 … 0
mat0
mat1

matN-1
Here:

  • N is the number of random variables in the network;
  • rv are the random variable names (arbitrary alphanumeric strings);
  • mat are two dimensional arrays of real numbers (in ASCII) that specify the conditional probability table of each random variable conditioned on its parents;
  • The matrix of zeros and ones specifes the directed arcs in the graph; a one (1) in the i,j entry indicates there is an edge from i to j (so the ith variable is a parent of the jth variable.
    The format of the Conditional Probability Table (CPT) matrices is a bit
    subtle. If a node has m parents, then the matrix needs to specify the
    probability of each outcome (true, false) conditioned on 2^m different
    combinations of parent values, so the matrix will be 2^m × 2 (rows × columns).
    Treating true as 1, and false as 0, concatenate the values of the parents in
    their numerical order from most significant bit to least significant bit (left
    to right) to create a row index r. The entry in the first column, rth row is
    then the probability that the variable is true given the values of the other
    variables (the entry in the corresponding 2nd column is the probability that
    the variable is false). Thus, the first row of the matrix corresponds to all
    conditioning variables taken the value false (r = 000 … 0), and the last row
    has all conditioning variables true (r = 111 … 1).

Deliverables

Write your program in Java or C/C++. In the case of Java, name your program
inference.java. Your program must be able to be compiled and run as follows:
$ javac inference.java
$ java inference graphfile.txt
In the case of C/C++, you must supply a makefile Makefile with a rule called
inference to compile your program into a Linux executable binary named
inference.bin. Your program must be able to be compiled and run as follows:
$ make inference
$ ./inference.bin graphfile.txt

NOTE1: If a makefile exists, the marking script will assume that the program
is written in C/C++, otherwise Java will be assumed.
NOTE2: graphfile.txt is of course a placeholder for the name of the file
containing the PGM structure.

Submission and assessment

You must submit your program on the Computer Science Web Submission System.
This means you must create the assignment under your own SVN repository to
store the submission files.
Your code will be compiled and run, testing its outputs for the two networks
with two queries each that have been provided to you, and an additional three
networks with two queries each. None of the networks will contain more than 16
variables. Since the results are stochastic, the query answers will not be
exact and will vary from run to run. The asnwers will therefore be tested
against a tolerance of ±0.01 (i.e. your answers must be within 1% of the true
values), so you must ensure convergence to this level of precision. If it
passes all tests you will get 15% of the overall course mark. The objective of
the tests is to check for the correct operation of your implementation. Hence,
the basis of the assessment is to compare your results against the expected
results. You must also ensure that you have an efficient implementation.

Using other source code

You may not use other source code for this assignment. You should personally
and carefully implement the likelihood weighted sampling to fully understand
the concept.


文章作者: SafePoker
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 SafePoker !
  目录