Deep learning: 14 (Softmax Regression Exercise)

 

  Preface:

  This post is an exercise in applying softmax regression as a multi-class classifier; the theory behind it was covered in an earlier post in this series. The exercise follows the CS294A/CS294W Softmax Exercise tutorial page. The task is handwritten digit recognition on the MNIST database, which contains 60,000 training samples and 10,000 test samples covering the ten digits 0~9. Each sample is a small image of size 28*28.

  Experiment environment: MATLAB 2012a

 

  Experiment basics:

  This experiment uses the softmax model alone, i.e. there is no hidden layer, only an input layer and an output layer, because no features are extracted from the MNIST samples; the raw pixels are used directly as features. The main work is computing the system's cost function and its gradient, given by:

  J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y^{(i)}=j\}\,\log\frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}\right] + \frac{\lambda}{2}\sum_{i=1}^{k}\sum_{j=1}^{n}\theta_{ij}^2

  \nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[x^{(i)}\left(1\{y^{(i)}=j\} - p(y^{(i)}=j \mid x^{(i)};\theta)\right)\right] + \lambda\,\theta_j

  where m is the number of training samples, k the number of classes, 1\{\cdot\} the indicator function, and p(y^{(i)}=j \mid x^{(i)};\theta) the softmax probability of class j for sample i.

  Some MATLAB functions:

  sparse:

  Creates a sparse matrix. For example, sparse(A, B, k), where A and B are vectors and k is a scalar, builds a sparse matrix whose nonzero entries all equal k; the (row, column) subscripts of those entries are taken from the corresponding elements of A and B.

  full:

  Converts a sparse matrix back into an ordinary (dense) matrix.
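
  Together, these two functions are exactly how softmaxCost.m below builds its one-hot ground-truth matrix. A minimal sketch (the values here are illustrative):

labels   = [2 1 3 2];                    % 4 samples, classes in 1..3
numCases = numel(labels);
% Entry (labels(i), i) is set to 1; full() converts to a dense matrix
G = full(sparse(labels, 1:numCases, 1));
% G = [0 1 0 0
%      1 0 0 1
%      0 0 1 0]   % column i is the one-hot encoding of sample i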

 

 

  Experiment errors:

  With the starter code provided by the author, even loading the data failed, with the error: Error using permute Out of memory. Type HELP MEMORY for your options. Tracing it down led to the line images = permute(images, [2 1 3]) in loadMNISTImages.m; the cause is that the images matrix is too large for permute to transpose within the available memory. Yet the data set is quite small (only a few dozen megabytes), and none of the usual out-of-memory remedies helped. In the end I simply changed the preceding line images = reshape(images, numCols, numRows, numImages); to images = reshape(images, numRows, numCols, numImages); which makes the permute unnecessary; for this exercise the effect is the same (each image just comes out transposed, consistently for both training and test data). Since the root cause is memory, the alternatives are to use 64-bit MATLAB or to optimize that function yourself to reduce its memory use.
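
  For reference, a minimal sketch of the relevant part of loadMNISTImages.m with this change applied; this assumes the standard loader shipped with the exercise, with the header-parsing code elided:

% ... inside loadMNISTImages.m, after reading the IDX header ...
images = fread(fp, inf, 'unsigned char');
% Original code, which ran out of memory in the permute:
%   images = reshape(images, numCols, numRows, numImages);
%   images = permute(images, [2 1 3]);
% Modified code: reshape directly; each image ends up transposed,
% which is harmless here since the raw pixels are used as features.
images = reshape(images, numRows, numCols, numImages);
% Flatten each image into a column vector and rescale to [0, 1]
images = reshape(images, size(images, 1) * size(images, 2), size(images, 3));
images = double(images) / 255;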

 

  Experiment results:

  Accuracy: 92.640%

  This is very close to the result given in the tutorial page.

 

  Main parts of the experiment code:

  softmaxExercise.m:

%% CS294A/CS294W Softmax Exercise 

%  Instructions
%  ------------
% 
%  This file contains code that helps you get started on the
%  softmax exercise. You will need to write the softmax cost function 
%  in softmaxCost.m and the softmax prediction function in softmaxPred.m. 
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%  (However, you may be required to do so in later exercises)

%%======================================================================
%% STEP 0: Initialise constants and parameters
%
%  Here we define and initialise some constants which allow your code
%  to be used more generally on any arbitrary input. 
%  We also initialise some parameters used for tuning the model.

inputSize = 28 * 28; % Size of input vector (MNIST images are 28x28)
numClasses = 10;     % Number of classes (MNIST images fall into 10 classes)

lambda = 1e-4; % Weight decay parameter

%%======================================================================
%% STEP 1: Load data
%
%  In this section, we load the input and output data.
%  For softmax regression on MNIST pixels, 
%  the input data is the images, and 
%  the output data is the labels.
%

% Change the filenames if you've saved the files under different names
% On some platforms, the files might be saved as 
% train-images.idx3-ubyte / train-labels.idx1-ubyte

images = loadMNISTImages('train-images.idx3-ubyte');
labels = loadMNISTLabels('train-labels.idx1-ubyte');
labels(labels==0) = 10; % Remap 0 to 10

inputData = images;

% For debugging purposes, you may wish to reduce the size of the input data
% in order to speed up gradient checking. 
% Here, we create synthetic dataset using random data for testing

% DEBUG = true; % Set DEBUG to true when debugging.
DEBUG = false;
if DEBUG
    inputSize = 8;
    inputData = randn(8, 100);
    labels = randi(10, 100, 1);
end

% Randomly initialise theta
theta = 0.005 * randn(numClasses * inputSize, 1); % theta is a column vector

%%======================================================================
%% STEP 2: Implement softmaxCost
%
%  Implement softmaxCost in softmaxCost.m. 

[cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, inputData, labels);

%%======================================================================
%% STEP 3: Gradient checking
%
%  As with any learning algorithm, you should always check that your
%  gradients are correct before learning the parameters.
% 

if DEBUG
    numGrad = computeNumericalGradient( @(x) softmaxCost(x, numClasses, ...
                                    inputSize, lambda, inputData, labels), theta);

    % Use this to visually compare the gradients side by side
    disp([numGrad grad]); 

    % Compare numerically computed gradients with those computed analytically
    diff = norm(numGrad-grad)/norm(numGrad+grad);
    disp(diff); 
    % The difference should be small. 
    % In our implementation, these values are usually less than 1e-7.

    % When your gradients are correct, congratulations!
end

%%======================================================================
%% STEP 4: Learning parameters
%
%  Once you have verified that your gradients are correct, 
%  you can start training your softmax regression code using softmaxTrain
%  (which uses minFunc).

options.maxIter = 100;
% softmaxModel is just a struct holding the learned optimal parameters
% together with the input size and the number of classes
softmaxModel = softmaxTrain(inputSize, numClasses, lambda, ...
                            inputData, labels, options);

% Although we only use 100 iterations here to train a classifier for the 
% MNIST data set, in practice, training for more iterations is usually
% beneficial.

%%======================================================================
%% STEP 5: Testing
%
%  You should now test your model against the test images.
%  To do this, you will first need to write softmaxPredict
%  (in softmaxPredict.m), which should return predictions
%  given a softmax model and the input data.

images = loadMNISTImages('t10k-images.idx3-ubyte');
labels = loadMNISTLabels('t10k-labels.idx1-ubyte');
labels(labels==0) = 10; % Remap 0 to 10

inputData = images;
size(softmaxModel.optTheta)
size(inputData)

% You will have to implement softmaxPredict in softmaxPredict.m
[pred] = softmaxPredict(softmaxModel, inputData);

acc = mean(labels(:) == pred(:));
fprintf('Accuracy: %0.3f%%\n', acc * 100);

% Accuracy is the proportion of correctly classified images
% After 100 iterations, the results for our implementation were:
%
% Accuracy: 92.200%
%
% If your values are too low (accuracy less than 0.91), you should check 
% your code for errors, and make sure you are training on the 
% entire data set of 60000 28x28 training images 
% (unless you modified the loading code, this should be the case)

 

  softmaxCost.m:

function [cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels)

% numClasses - the number of classes 
% inputSize - the size N of the input vector
% lambda - weight decay parameter
% data - the N x M input matrix, where each column data(:, i) corresponds to
%        a single test set
% labels - an M x 1 matrix containing the labels corresponding for the input data
%

% Unroll the parameters from theta
theta = reshape(theta, numClasses, inputSize); % reshape the parameter column vector into a matrix

numCases = size(data, 2); % number of input samples

% sparse builds a sparse matrix whose entries all equal the third argument, 1;
% the (row, column) subscripts of those entries come from the corresponding
% values of labels and 1:numCases. full converts the result into an ordinary
% dense matrix: the one-hot ground-truth matrix.
groundTruth = full(sparse(labels, 1:numCases, 1));
cost = 0;

thetagrad = zeros(numClasses, inputSize);

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute the cost and gradient for softmax regression.
%                You need to compute thetagrad and cost.
%                The groundTruth matrix might come in handy.

% Subtract the column-wise max before exponentiating, for numerical stability
M = bsxfun(@minus, theta*data, max(theta*data, [], 1));
M = exp(M);
p = bsxfun(@rdivide, M, sum(M));
cost = -1/numCases * groundTruth(:)' * log(p(:)) + lambda/2 * sum(theta(:) .^ 2);
thetagrad = -1/numCases * (groundTruth - p) * data' + lambda * theta;

% ------------------------------------------------------------------
% Unroll the gradient matrices into a vector for minFunc
grad = [thetagrad(:)];
end
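
  A quick note on the bsxfun(@minus, ..., max(...)) line above: subtracting the per-column maximum before exponentiating does not change the softmax probabilities (the common factor cancels in the ratio), but it keeps exp from overflowing. A tiny standalone sketch with illustrative values:

% Softmax of a score vector, computed naively and with max-subtraction
z = [1000; 1001; 1002];           % large scores: exp(z) overflows to Inf
naive   = exp(z) ./ sum(exp(z));  % NaN (Inf/Inf)
shifted = exp(z - max(z));        % largest exponent is exp(0) = 1
stable  = shifted ./ sum(shifted) % [0.0900; 0.2447; 0.6652]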

 

  softmaxTrain.m:

function [softmaxModel] = softmaxTrain(inputSize, numClasses, lambda, inputData, labels, options)
%softmaxTrain Train a softmax model with the given parameters on the given
% data. Returns softmaxOptTheta, a vector containing the trained parameters
% for the model.
%
% inputSize: the size of an input vector x^(i)
% numClasses: the number of classes 
% lambda: weight decay parameter
% inputData: an N by M matrix containing the input data, such that
%            inputData(:, c) is the cth input
% labels: M by 1 matrix containing the class labels for the
%            corresponding inputs. labels(c) is the class label for
%            the cth input
% options (optional): options
%   options.maxIter: number of iterations to train for

if ~exist('options', 'var')
    options = struct;
end

if ~isfield(options, 'maxIter')
    options.maxIter = 400;
end

% initialize parameters
theta = 0.005 * randn(numClasses * inputSize, 1);

% Use minFunc to minimize the function
addpath minFunc/
options.Method = 'lbfgs'; % Here, we use L-BFGS to optimize our cost
                          % function. Generally, for minFunc to work, you
                          % need a function pointer with two outputs: the
                          % function value and the gradient. In our problem,
                          % softmaxCost.m satisfies this.
minFuncOptions.display = 'on';

[softmaxOptTheta, cost] = minFunc( @(p) softmaxCost(p, ...
                                   numClasses, inputSize, lambda, ...
                                   inputData, labels), ...
                                   theta, options);

% Fold softmaxOptTheta into a nicer format
softmaxModel.optTheta = reshape(softmaxOptTheta, numClasses, inputSize);
softmaxModel.inputSize = inputSize;
softmaxModel.numClasses = numClasses;

end

 

  softmaxPredict.m:

function [pred] = softmaxPredict(softmaxModel, data)

% softmaxModel - model trained using softmaxTrain
% data - the N x M input matrix, where each column data(:, i) corresponds to
%        a single test set
%
% Your code should produce the prediction matrix 
% pred, where pred(i) is argmax_c P(y(c) | x(i)).
 
% Unroll the parameters from theta
theta = softmaxModel.optTheta;  % this provides a numClasses x inputSize matrix
pred = zeros(1, size(data, 2));

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute pred using theta assuming that the labels start 
%                from 1.

% The softmax is monotonic in the scores, so the class with the largest
% score theta_j' * x is also the class with the largest probability;
% there is no need to compute the normalized probabilities here.
[nop, pred] = max(theta * data);

% ---------------------------------------------------------------------
end
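
  Since the comment in the code above claims the softmax normalization can be skipped at prediction time, here is a tiny standalone check (illustrative values) that the argmax of the raw scores matches the argmax of the full softmax probabilities:

theta = [ 0.2 -0.5;                      % 3 classes, 2-dimensional inputs
         -0.1  0.3;
          0.4  0.1];
x = [1.5; -2.0];
scores = theta * x;                      % class scores
p = exp(scores) ./ sum(exp(scores));     % softmax probabilities
[~, byScore] = max(scores);
[~, byProb ] = max(p);
isequal(byScore, byProb)                 % true (up to ties)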

 

 



Reposted from: https://www.cnblogs.com/tornadomeet/archive/2013/03/23/2977621.html
