Dissertation/Thesis Abstract

An Imputation-Estimation Algorithm Using Time-Varying Auxiliary Covariates for a Longitudinal Model When Outcome is Missing by Design
by Temprosa, Marinella Gracia Montealegre, Ph.D., The George Washington University, 2012, 128; 3524075
Abstract (Summary)

In long-term clinical trials, missing data are a concern, especially if the rate at which data are missing depends on the treatment group. Typically, some effort is spent identifying the reasons the data are missing so that appropriate assumptions and analytic approaches can be applied. When data are missing by design, certain measurements are discontinued after a subject meets an endpoint, possibly because of ethical or financial constraints. Subjects who reach the absorbing barrier may stop data collection on some variables but may have subsequent time-varying covariates available from continued follow-up. In this dissertation, we developed an Imputation-Estimation (IE) algorithm under an auxiliary missing at random assumption to assess whether the additional information from the time-varying covariates can be used to improve estimation. Quality of estimates is evaluated in terms of bias, variance, and coverage for the estimates of the parameters of interest. We contrast this method with other missing data approaches such as multiple imputation and available case analysis.
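To make the contrast between available case analysis and imputation from an auxiliary covariate concrete, the following is a minimal sketch on synthetic data. It is not the dissertation's Imputation-Estimation algorithm: it uses a single regression imputation with no variance adjustment, and every name, distribution, and threshold in it (`simulate`, `available_case_mean`, `regression_imputed_mean`, the cutoff of 110) is a hypothetical choice made only for illustration.

```python
import random
import statistics


def simulate(n=5000, seed=0):
    """Synthetic data: outcome y is discontinued once covariate x crosses a threshold."""
    rng = random.Random(seed)
    # Auxiliary covariate, observed for every subject (assumed normal).
    x = [rng.gauss(100.0, 10.0) for _ in range(n)]
    # Outcome, linearly related to x with noise (assumed model).
    y = [50.0 + 1.0 * xi + rng.gauss(0.0, 5.0) for xi in x]
    # Missing by design: y is only collected while x stays below the cutoff.
    observed = [xi < 110.0 for xi in x]
    return x, y, observed


def available_case_mean(y, observed):
    """Mean of y over subjects whose outcome was actually collected."""
    return statistics.fmean(yi for yi, o in zip(y, observed) if o)


def regression_imputed_mean(x, y, observed):
    """Fit y ~ x on observed cases, fill in missing y from the fit, then average."""
    xo = [xi for xi, o in zip(x, observed) if o]
    yo = [yi for yi, o in zip(y, observed) if o]
    mx, my = statistics.fmean(xo), statistics.fmean(yo)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xo, yo))
    sxx = sum((a - mx) ** 2 for a in xo)
    slope = sxy / sxx
    intercept = my - slope * mx
    filled = [yi if o else intercept + slope * xi
              for xi, yi, o in zip(x, y, observed)]
    return statistics.fmean(filled)


x, y, observed = simulate()
full_mean = statistics.fmean(y)  # known only because the data are simulated
ac_mean = available_case_mean(y, observed)
imputed_mean = regression_imputed_mean(x, y, observed)
print(f"full: {full_mean:.2f}  available case: {ac_mean:.2f}  imputed: {imputed_mean:.2f}")
```

Because missingness here depends only on the always-observed covariate, the available case mean is pulled toward subjects with low covariate values, while the regression-imputed mean recovers most of that gap. A real analysis would also need to propagate imputation uncertainty into the variance, which this sketch ignores.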

We illustrate this method using data from the Diabetes Prevention Program (DPP). The DPP was a diabetes prevention study that showed reductions of 58% and 31% in diabetes risk using intensive lifestyle or metformin interventions, respectively, compared to placebo. According to the DPP protocol, the oral glucose tolerance test (OGTT) is discontinued after diabetes diagnosis. Because of the significant reduction in diabetes incidence by the metformin and lifestyle interventions, the rates of missing IGR and CIR differ among the treatment groups. This differential discontinuation among treatment groups results in informative monotone missing assessments of 30-minute glucose and insulin values. These 30-minute values are used to calculate surrogate measures of insulin secretion such as the Insulin Glucose Ratio (IGR = (30-min insulin - fasting insulin)/(30-min glucose - fasting glucose)). Fasting blood glucose is collected at all time points and is associated with 30-minute glucose. The Imputation-Estimation algorithm is applied to estimate the mean 30-minute blood glucose utilizing auxiliary information from the fasting blood glucose. In this example, fasting glucose is also the source of the discontinuation, since diabetes diagnosis is based on the fasting glucose and 2-hour values during the OGTT. Because of the strong dependence between the fasting and 30-minute glucose measured at the same visit, the resulting estimates from the IE algorithm using the complete vector were similar to those from multiple imputation. Because the placebo group experienced higher rates of diabetes incidence, the difference between available case analysis and the regression-based imputations was greater in the placebo group than in the lifestyle group.
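The IGR definition quoted above can be written as a small function, which also shows how a discontinued 30-minute measurement makes the ratio missing even though fasting values are still collected. This is a sketch only: the function names and the convention of representing a missing measurement as `None` are assumptions, and the numbers in the usage lines are illustrative values, not DPP data.

```python
def insulin_glucose_ratio(insulin_30, insulin_fasting, glucose_30, glucose_fasting):
    """IGR = (30-min insulin - fasting insulin) / (30-min glucose - fasting glucose)."""
    glucose_rise = glucose_30 - glucose_fasting
    if glucose_rise == 0:
        raise ValueError("IGR is undefined when 30-minute and fasting glucose are equal")
    return (insulin_30 - insulin_fasting) / glucose_rise


def igr_or_missing(insulin_30, insulin_fasting, glucose_30, glucose_fasting):
    """Return None when any input is missing (None), e.g. after the OGTT is discontinued."""
    values = (insulin_30, insulin_fasting, glucose_30, glucose_fasting)
    if any(v is None for v in values):
        return None
    return insulin_glucose_ratio(*values)


# Illustrative values only; units are whatever the caller uses consistently.
print(igr_or_missing(60.0, 10.0, 150.0, 100.0))  # all four values present
print(igr_or_missing(None, 10.0, None, 100.0))   # 30-minute values not collected
```

Keeping the missing-value handling in a separate wrapper leaves the ratio itself as a direct transcription of the formula in the abstract.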

Indexing (document details)
Advisor: Lachin, John M.
Committee: Bura, Efstathia; Cook, Nancy; Larsen, Micheal; Pan, Qing
School: The George Washington University
Department: Biostatistics
School Location: United States -- District of Columbia
Source: DAI-B 74/01(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Biostatistics
Keywords: Auxiliary information, Conditional means, Linear mixed models, Missing by design, Missing data, Multiple imputation
Publication Number: 3524075
ISBN: 9781267581488