Re: Inverse of Empirical Cumulative Distribution F



Alberto Andreotti wrote:

My simulation first generates a number that is uniformly distributed;
I think of it as a probability (i.e., a value on the y-axis of my
empirical CDF), and I want to know the corresponding return (i.e.,
the value on the x-axis).

Hi Alberto -

If I understand you correctly, what you are doing is generating multivariate uniform vectors from a copula, and then transforming them to returns, using the inverse CDF of some estimated univariate marginal distributions. Is that right?
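
If that is the setup, the whole pipeline might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the two stand-in return series x1 and x2, the correlation matrix rho, the choice of a Gaussian copula via COPULARND, and the kernel-smoothed inverse CDF used for the marginals.

```matlab
% Sketch only: x1, x2, rho and nSim are made-up placeholders.
rng(1);                                  % for reproducibility
x1 = 0.01*randn(250,1);                  % stand-in for return series 1
x2 = 0.02*trnd(5,250,1);                 % stand-in for return series 2
rho = [1 0.6; 0.6 1];                    % assumed copula correlation matrix
nSim = 1000;

% dependent uniforms: each column is U(0,1), columns are correlated
U = copularnd('Gaussian', rho, nSim);

% map each column through an estimated inverse marginal CDF
R = zeros(nSim,2);
R(:,1) = ksdensity(x1, U(:,1), 'function','icdf');
R(:,2) = ksdensity(x2, U(:,2), 'function','icdf');
```

The marginal transformation in the last step is the part your question is about.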

There are several ways to do that. The following assumes that you have the Statistics Toolbox. Let's say x is one of your return series, sorted for convenience.

x = randn(10,1); x = sort(x);


1) Fit a parametric distribution to x and apply its inverse CDF to your uniform values, using a function such as NORMINV.
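
A minimal sketch of this option, assuming a normal distribution is an adequate fit (for real return data it may not be, especially in the tails). NORMFIT and NORMINV are Statistics Toolbox functions; the variable names muHat and sigmaHat are only illustrative.

```matlab
x = sort(randn(10,1));            % same kind of toy data as above
u = rand(1000,1);                 % uniform values to transform

% estimate the normal parameters from the data
[muHat, sigmaHat] = normfit(x);

% apply the fitted inverse CDF to the uniform values
r = norminv(u, muHat, sigmaHat);
```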



2) Compute the empirical CDF estimate and invert that -- the "midpoint" values are probably what you want here, with piecewise linear interpolation:


% the usual ECDF
[Fi,xi] = ecdf(x);

% compute the midpoint values
xj = xi(2:end);
n = length(xj);   % number of distinct values; equals length(x) when there are no ties
Fj = (Fi(1:end-1)+Fi(2:end))/2;
xj = [xj(1)-Fj(1)*(xj(2)-xj(1))/((Fj(2)-Fj(1)));
      xj;
      xj(n)+(1-Fj(n))*((xj(n)-xj(n-1))/(Fj(n)-Fj(n-1)))];
Fj = [0; Fj; 1];

% plot those two CDF estimates for comparison
stairs(xi,Fi,'r');
hold on
plot(xj,Fj,'b-');
hold off

% apply the inverse "midpoint" CDF to uniform random values to
% get a sample from your marginal
u = rand(1000,1);
r = interp1(Fj,xj,u,'linear','extrap');


3) Use a kernel smoother to estimate the CDF and invert that:

% estimate the CDF using a kernel smoother
xgrid = linspace(-3,3,100);
F = ksdensity(x, xgrid, 'function','cdf');
plot(xgrid,F,'k-',x,zeros(size(x)),'rx');

% apply the inverse of kernel smooth estimate to uniform values to
% get a sample from your marginal
r = ksdensity(x, u, 'function','icdf');

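% more simply, draw from a Gaussian kernel estimate directly: resample
% the original data and add normal noise whose std is the bandwidth
bw = .25;   % hand-picked bandwidth; ksdensity chooses its own by default
r = randsample(x,1000,true) + normrnd(0,bw,1000,1);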


I was thinking of applying a Gaussian kernel in the middle of the
empirical CDF and a GPD (Generalised Pareto Distribution) to the
tails in order to remove the staircase pattern typical of the
empirical CDF.

That's a combination of (1) and (3). The Statistics Toolbox demo "Modelling the Tails of a Distribution" describes how to fit a generalized Pareto distribution, and can be found at


   <<http://www.mathworks.com/products/demos/statistics/gparetodemo.html>>
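
As a rough sketch of that combination, and not necessarily the approach the demo takes: if your version of the Statistics Toolbox includes PARETOTAILS, it fits generalized Pareto tails and a smoothed center in one step. The 10%/90% tail cutoffs and the simulated heavy-tailed data below are assumptions chosen only for illustration.

```matlab
rng(2);                                   % for reproducibility
x = trnd(4, 2000, 1);                     % stand-in heavy-tailed "returns"

% GPD below the 10% quantile and above the 90% quantile,
% kernel-smoothed CDF in between
obj = paretotails(x, 0.1, 0.9, 'kernel');

% apply the inverse of the fitted piecewise CDF to uniform values
u = rand(1000,1);
r = icdf(obj, u);
```

With only a handful of observations, as in the 10-point toy series above, there are too few points in each tail for the GPD fit to be meaningful, so this is worth trying only on a reasonably long return series.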

You might also find the demo "Simulating Dependent Random Variables Using Copulas", found at

   <<http://www.mathworks.com/products/demos/statistics/copulademo.html>>

helpful.

This webinar

  <<http://www.mathworks.com/cmspro/req10929.html?eventid=30203>>

actually demonstrates a VaR simulation, using a lot of these methods.

Hope this helps.

- Peter Perkins
  The MathWorks, Inc.