Despite some theoretical promise, it is unclear whether rule induction data mining approaches (e.g., classification trees and association rules) add methodological value to "orthodox" education research, i.e., research unrelated to computer-based education. To better understand whether and how rule induction methods could be useful to education researchers, I explored whether they, relative to regression approaches, (1) improve classification accuracy, and/or (2) offer new avenues of explanation. Additionally, I aimed to illustrate a practical and principled way to use the various rule induction approaches so that researchers can more easily choose to use them. To these ends, I conducted an extended literature review on rule induction methods and re-analyzed two regression studies (Byrnes & Miller, 2007; Thomas, 2006) of the National Education Longitudinal Study of 1988 using ten rule induction approaches. Data mining proceeded in two rounds for each study: first using only the predictors from the original study, and second using all reasonable and available predictors. I compared results across methods and rounds to better understand whether, how, and why rule induction may provide additional insights.
I found that while rule induction approaches can be labor intensive and are not necessarily more predictive than regression, they can provide unique descriptions of the sample that show at a glance how key predictors relate to each other and to the outcome. They can also help identify relationships between variables that hold for some subgroups but not others. For example: (i) rulesets induced from Byrnes and Miller's dataset suggested that Algebra 2 and math self-concept were positively related to 12th grade math scores, but only for those who were higher achieving in 8th grade math; (ii) association rules mined from Thomas' dataset suggested that factors such as school safety and honors program participation were more strongly associated with 12th grade achievement for lower-income students and students with lower parental education. Thus, when relationships between the predictors and the outcome may not be uniform across the population, rule induction can provide more information than regression in exploring those relationships. Lessons learned and recommendations on how to apply rule induction approaches are also discussed.
|School:||University of Pittsburgh|
|School Location:||United States -- Pennsylvania|
|Source:||DAI-A 79/09(E), Dissertation Abstracts International|
|Subjects:||Statistics, Education, Information science|
|Keywords:||Association rules, Data mining, Decision tree, Quantitative research methodology, Sequential covering, Student achievement|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved