Re: Estimates shares of population with access to different levels of safe water



The main problem I see with this estimation is that the three
percentages must sum to 100%!
Because of that constraint the outputs are correlated: once the first
two percentages are given, the last one is determined.
Maybe you could estimate the first two with a regression and
calculate the last one as the remainder, or estimate all three and
then normalize the results so they sum to 100%. I think the two
methods will give only slightly different results.
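
Here is a minimal sketch of the two options just described, written in
Python with NumPy rather than R and run on made-up data. The variable
names, the synthetic shares, and the choice of plain least squares with
an intercept are all assumptions for illustration, not the poster's
dataset or method. In this particular setup the two options agree to
rounding error, because least-squares fits with an intercept already
sum to 1 whenever the observed shares do. Individual fitted shares can
still fall below 0 or above 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the poster's dataset): 120 rows, two predictors.
n = 120
income = rng.normal(size=n)
education = 0.6 * income + 0.8 * rng.normal(size=n)
X = np.column_stack([np.ones(n), income, education])  # intercept + predictors

# Synthetic shares for the three access levels; each row sums to 1.
scores = np.column_stack([-income, 0.2 * education, income])
scores = scores + rng.normal(scale=0.3, size=(n, 3))
shares = np.exp(scores)
shares = shares / shares.sum(axis=1, keepdims=True)

# Option A: regress the first two shares, derive the third as the remainder.
coef_two, *_ = np.linalg.lstsq(X, shares[:, :2], rcond=None)
fit_two = X @ coef_two
fit_a = np.column_stack([fit_two, 1.0 - fit_two.sum(axis=1)])

# Option B: regress all three shares, then rescale each row to sum to 1.
coef_all, *_ = np.linalg.lstsq(X, shares, rcond=None)
fit_all = X @ coef_all
fit_b = fit_all / fit_all.sum(axis=1, keepdims=True)

# Compare the two options and count fitted shares outside the valid range.
max_gap = np.abs(fit_a - fit_b).max()
n_outside = int(((fit_b < 0) | (fit_b > 1)).sum())
print("largest difference between the two options:", max_gap)
print("fitted shares outside [0, 1]:", n_outside)
```

With a different estimator (one without an intercept, or one fitted
separately on different subsets of rows) the two options would no
longer have to coincide.
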
I think a good method to use is PLS (partial least squares)
regression, since it is robust as an estimator when the data are
correlated (for example, average years of education has some
correlation with average income; plot one against the other and you
will see it).
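
As a numeric companion to plotting one predictor against the other,
the correlation can be checked directly. This is only a sketch in
Python with NumPy: the two arrays below are invented, with the
relationship between them built in by construction, so they show the
mechanics rather than anything about real countries.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up predictors for 120 hypothetical countries (illustration only).
income = rng.normal(loc=8.0, scale=1.0, size=120)       # e.g. log income
education = 1.5 * income - 4.0 + rng.normal(size=120)   # years of schooling

# Pearson correlation between the two predictors.
r = np.corrcoef(income, education)[0, 1]
print("correlation between income and education:", round(r, 2))
```

A value far from zero on the real data would back up the point about
the explanatory variables being correlated.
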
Good luck with your estimation.

On Jan 28, 6:44 am, dale.roth...@xxxxxxxxx wrote:
I have a data set that includes the percentage of the population with
access to 3 different levels of safe water, e.g.: % with no access, %
with improved access, and % with household connections, which sum to
100%. The dataset includes values for more than 100 countries for 4
time periods. I am trying to come up with an estimation of what
percentage of the population falls into each of these categories as a
function of various explanatory variables, e.g. average income and
average years of education. It seems like some sort of a multinomial
or ordered logistic regression would be appropriate, but most
statistical packages seem to want the data to be entered as individual
trials rather than percentages. Any help would be appreciated. P.S. I
am trying to do this analysis using R.
