Stata mi impute with scales

by RossHigashi Mon Feb 20, 2017 6:09 pm

My data contains an Engagement measure (eng) that is calculated as the mean of 5 survey items (eng1 .. eng5). There is missingness at the item level. For non-imputed analyses, I can simply run

Code:: egen eng = rowmean(eng1-eng5)

If I want to use multiple imputation on this dataset, I am unsure whether I should (1) impute only the items first, then calculate the scale means; or (2) calculate the scale means first, then impute them. The drawback of the second method is that it might produce scale values that are no longer the means of the scale items. The first method therefore seems more appropriate.
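For concreteness, option (1) might be sketched as follows. This is only an illustration under stated assumptions: it treats female and age as complete predictors (they are not in my actual data), and the number of imputations, the knn() value, and the seed are arbitrary placeholders.

```stata
* Sketch only: impute the items, then derive the scale within each imputation.
* Assumes female and age are complete; add(20), knn(5), and rseed() are placeholders.
mi set wide
mi register imputed eng1 eng2 eng3 eng4 eng5

* Impute each item by predictive mean matching, conditional on the other items
mi impute chained (pmm, knn(5)) eng1 eng2 eng3 eng4 eng5 = female age, ///
    add(20) rseed(12345)

* Create the scale as a passive variable so it is computed in every imputation
mi passive: egen eng = rowmean(eng1 eng2 eng3 eng4 eng5)
```

Because eng is created with mi passive, it is recomputed from the imputed items in each completed dataset rather than once from the original data.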

Unfortunately, when attempting to do this, Stata generates a lot of extra data columns like _1_eng1 that I assume correspond to the imputed value for eng1 in imputed "world" 1. Does this mean that I need to manually calculate scale values for every single dataset separately? Will the later

Code:: mi estimate

command know how to use these values when I attempt to do my final regression?
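What I have in mind for the final step is roughly the sketch below. It assumes the data are already mi set, the items have been imputed, and eng exists as a passive variable in every imputation; outcome is a hypothetical placeholder for my dependent variable.

```stata
* Sketch only: "outcome" is a hypothetical dependent variable.
* Inspect the scale in the original data (m=0) and in the first two imputations
mi xeq 0 1 2: summarize eng

* Fit the model in each imputation and pool the results with Rubin's rules
mi estimate: regress outcome eng female age
```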

Separately, I am running into "incomplete" imputation rounds, which I don't know what to do with. Does that mean a particular iteration failed to converge, or that a certain cell imputed to a "missing" value?
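One way to check whether values are still missing after imputation might be the following sketch. It assumes the imputation has already been run on mi set data; the variable names are taken from the output below.

```stata
* Sketch only: run after mi impute on data that are already mi set.
* Overview of the mi style, number of imputations, and registered variables
mi describe

* Missing-value counts in the original data (m=0, the default)
mi misstable summarize female age e1_last e2_last e3_last

* The same counts in the first imputed dataset
mi misstable summarize female age e1_last e2_last e3_last, m(1)
```

If the counts for m(1) are still nonzero, some cells were left missing in that imputation rather than filled in.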

Output follows:

Code::

    female: logistic regression
    age: predictive mean matching
    e1_last: predictive mean matching
    e2_last: predictive mean matching
    e3_last: predictive mean matching
    e7_last: predictive mean matching
    e8_last: predictive mean matching
    b_und_last: predictive mean matching
    b_val_last: predictive mean matching
    b_want_last: predictive mean matching
    ps1_last: predictive mean matching
    ps2_last: predictive mean matching
    ps3_last: predictive mean matching
    ps6_last: predictive mean matching

    ------------------------------------------------------------------
                       |               Observations per m
                       |----------------------------------------------
              Variable |   Complete   Incomplete   Imputed |     Total
    -------------------+-----------------------------------+----------
                female |        942           86        12 |      1028
                   age |        914          114        25 |      1028
               e1_last |        679          349       189 |      1028
               e2_last |        683          345       187 |      1028
               e3_last |        682          346       187 |      1028
               e7_last |        682          346       189 |      1028
               e8_last |        679          349       191 |      1028
            b_und_last |        628          400       208 |      1028
            b_val_last |        626          402       211 |      1028
           b_want_last |        621          407       213 |      1028
              ps1_last |        687          341       185 |      1028
              ps2_last |        684          344       187 |      1028
              ps3_last |        682          346       187 |      1028
              ps6_last |        684          344       186 |      1028
    ------------------------------------------------------------------
    (complete + incomplete = total; imputed is the minimum across m
     of the number of filled-in observations.)

    Note: Right-hand-side variables (or weights) have missing values;
          model parameters estimated using listwise deletion.