Re: Regression with an EM algorithm
- From: souzi HEC <souzi.hec@xxxxxxxxx>
- Date: Tue, 7 Jul 2009 13:28:40 -0700 (PDT)
On 7 juil, 16:35, souzi HEC <souzi....@xxxxxxxxx> wrote:
Hi,
I’m new in Matlab software programming and I need a little help to
implement an EM algorithm…
My name is Souad Romdhan, and I am a student in master "Finance and
Actuarial studies" at Institute of High Commercial Studies.
I am currently conducting a research project on the Modeling Loss
Index Triggers to price Cat Bonds: Application of the risk of
hurricanes in USA.
I need to solve with MATLAB (especially with EM algorithm) this
specific problem below:
I have two data samples. The first sample contains 66 depths for
historical hurricanes happened in Florida between 1935 - 2008. The
second sample contains only 48 losses associated to these hurricanes.
This second data is missing 19 values of losses.
Since 48 out 66 observations have information about hurricanes losses,
it is necessary to treat the missing data of losses for a further
analysis. So, I chose to make a regression with EM-algorithm:
1- The log (hurricanes losses) are directly proportional to the
depths of the hurricanes. Besides, statistically we observe a
relationship between the two vectors (depths of hurricanes and log
(losses)). So the approach is to estimate the missing losses by means
of the linear regression model x= α+ β yi + εi, (Linear Model)
where y (n x 1) is a vector of observations on the response variable
(depth) and x (n x p) is the p explanatory variables (=log(loss))..
2- This approach that will deal with the missing data is the
regression with expectation –Maximum algorithm (EM)
The expectation step:
This algorithm consists of omitting the cases with missing data and
running a regression on what remains. The regression coefficient will
be used to estimate the missing data.
The maximization step:
After this estimation step, a new regression will be done over the
complete data (including estimated values). With the new regression
coefficients, the missing data is re-estimated. This process will
continue until the estimates are adjusted to give model sampling
error, i.e. it will not be a longer noticeable change.
My problem is that estimates values are not being adjusted after one
step… If I take difference of these two estimated Beta it will be
zero... (So the loop is stopped, because the new beta minus old beta
is zero...). I attached the code with Data - maybe you can try to do
something with this...
Please help me to correct this algorithm with MATLAB Software.
Excuse me for mistakes because I’m not fluent at English writing
Thank you in advance to consider my request.
I look forward to hearing from you in the near future.
Best and kind regards
Souad
Below we find the algorithm and the data
The attached algorithm is:
% load Hurr DATA from Excel file
DATA = xlsread('C:Project\Hurr DATA.xls');
x=DATA(:,3);
y=log(DATA(:,5));
% take observed data only
yobs = y(not(isnan(y)));
ymis = y(isnan(y));
xobs = x(not(isnan(y)));
xmis = x(isnan(y));
% E-STEP
% estimate alpha and beta coefficients
Sxy = (xobs-mean(xobs))'*(yobs-mean(yobs));
Sxx = (xobs-mean(xobs))'*(xobs-mean(xobs));
Syy = (yobs-mean(yobs))'*(yobs-mean(yobs));
b = Sxy / Sxx
a = mean(yobs) - b * mean(xobs)
% calculate r square statistic
rs = Sxy^2 / (Sxx * Syy)
% plot regression line
xmin = min(xobs); xmax= max(xobs);
ymin = a + b*xmin; ymax = a + b * xmax;
plot([xmin xmax], [ymin ymax]); hold on;
% plot observed values
plot(xobs,yobs,'.');xlabel('depth'); ylabel('log(loss)');
% plot estimated values
yhatmis = a + b * xmis;
plot(xmis, yhatmis,'r.'); hold on
% plot 'new' data
figure;
xnew = [xobs; xmis(1:end)];
ynew = [yobs; yhatmis(1:end)];
plot(xnew,ynew,'g.')
% M-STEP
% fit regression to new data
Sxy = (xnew-mean(xnew))'*(ynew-mean(ynew));
Sxx = (xnew-mean(xnew))'*(xnew-mean(xnew));
Syy = (ynew-mean(ynew))'*(ynew-mean(ynew));
b = Sxy / Sxx
a = mean(ynew) - b * mean(xnew)
rs = Sxy^2 / (Sxx * Syy)
n = 200; i = 1;
tol = 1; err = tol + 1;
while and(i<n, err > tol)
Sxy = (xnew-mean(xnew))'*(ynew-mean(ynew));
Sxx = (xnew-mean(xnew))'*(xnew-mean(xnew));
Syy = (ynew-mean(ynew))'*(ynew-mean(ynew));
bnew = Sxy / Sxx;
anew = mean(ynew) - b * mean(xnew);
err = (bnew - b)^2+(anew - a)^2;
b = bnew; a = anew;
rs = Sxy^2 / (Sxx * Syy);
%update dataset
yhatmis = a + b * xmis;
xnew = [xobs; xmis];
ynew = [yobs; yhatmis];
i = i + 1;
end
My data is
Depht of Hurricanes Losses of Hurricanes
Complete Data Incomplete Data
897 NAN
973 NAN
964 NAN
985 NAN
975 NAN
962 NAN
985 NAN
951 NAN
980 NAN
940 NAN
974 NAN
963 NAN
975 NAN
954 NAN
958 NAN
955 10450000
985 NAN
975 NAN
930 55000000
981 NAN
968 67 200 000
966 6000000
974 2000000
948 28000000
982 5400000
983 NAN
977 2580000
909 1000000
980 2562500
955 56410000
970 65300000
964 79969000
1002 5000000
987 100000000
959 NAN
942 NAN
971 7 000 000
967 53864000
993 NAN
950 30 000 000
983 5 000 000
922 15000000000
930 15000000
941 8000000
925 60000000
973 350000000
942 1360000000
973 20000000
987 NAN
964 340000000
956 50000000
987 100 000 000
940 115 000 000
963 35 000 000
932 7430000000
960 4275000000
946 4300000000
950 3000000000
946 855000000
917 707000000
924 33500000
922 10300000000
970 195000000
It's interessting!
.
- References:
- Regression with an EM algorithm
- From: souzi HEC
- Regression with an EM algorithm
- Prev by Date: impixelinfo question
- Next by Date: Re: Help with polynomial fitting
- Previous by thread: Regression with an EM algorithm
- Next by thread: saving images and variables
- Index(es):
Relevant Pages
|
Loading