# imputetoolkit

Evaluate and compare multiple imputation methods with consistent metrics and an intuitive S3 interface.

`imputetoolkit` is an R package for evaluating the quality of data imputation methods. It implements an object-oriented S3 interface around an evaluator class that computes multiple metrics comparing imputed values against ground-truth data. The package is designed to help researchers and practitioners benchmark different imputation strategies side by side, providing both per-column and global metrics such as RMSE, MAE, R², correlation recovery, the KS statistic, and accuracy. By wrapping results in an `evaluator` object, the package offers a consistent, user-friendly interface with familiar methods like `print()` and `summary()`.
## Why an S3 class?

The evaluator function is a strong candidate for object-oriented programming because:

- **Structured results.** All related outputs are bundled into a single object (`class = "evaluator"`) instead of scattering them across lists or separate return values.
- **Extensibility.** Additional S3 methods (e.g., `plot.evaluator`, `predict.evaluator`) could later extend functionality without rewriting core code (see the sketch after this list).
- **Usability.** Without an object, users would have to fish out individual fields such as `$rmse` or `$r2`, making the workflow clunkier.

Overall, the evaluator is a good candidate for OO programming because it bundles rich, structured outputs into an intuitive object, provides a user-friendly interface, and remains extensible for future enhancements.
## evaluator()

Constructor that creates an object of class `"evaluator"`. Takes two named lists of numeric vectors (`true_data`, `imputed_data`) and a method name.
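Internally, S3 constructors of this kind typically compute the results and then tag a list with the class attribute. The following is a simplified sketch of that pattern (not the package's actual source; per-column RMSE stands in for the full metric set):

```r
# Simplified sketch of the S3 constructor pattern (not the actual source).
evaluator_sketch <- function(true_data, imputed_data, method) {
  stopifnot(identical(names(true_data), names(imputed_data)))
  # Per-column RMSE as a stand-in for the full metric set
  metrics <- Map(function(truth, imp) {
    c(rmse = sqrt(mean((truth - imp)^2)))
  }, true_data, imputed_data)
  structure(
    list(method = method, metrics = metrics),
    class = "evaluator"
  )
}
```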
## S3 Methods

- `print.evaluator(x)` – displays global evaluation metrics for an imputation method.
- `summary.evaluator(x)` – returns a `data.frame` with per-column metrics and global averages.

## Metrics Computed

For each column:

- RMSE (root mean squared error)
- MAE (mean absolute error)
- R²
- Correlation recovery
- KS statistic
- Accuracy

These are also aggregated into global values stored in the `evaluator` object. A sketch of how such metrics are commonly computed follows below.
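For intuition, here is one common way such per-column metrics are defined, shown for a single column; the package's exact formulas may differ:

```r
# One common way to compute per-column metrics (exact definitions
# used by the package may differ).
truth <- c(25, 30, 40)
imp   <- c(25, 31, 39)

rmse <- sqrt(mean((truth - imp)^2))       # root mean squared error
mae  <- mean(abs(truth - imp))            # mean absolute error
r2   <- 1 - sum((truth - imp)^2) / sum((truth - mean(truth))^2)
corr <- cor(truth, imp)                   # correlation recovery
ks   <- ks.test(truth, imp)$statistic     # distributional similarity
acc  <- mean(truth == imp)                # one possible accuracy definition
```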
## Installation

You can install directly from GitHub:

```r
# Install from GitHub
devtools::install_github("tanveer09/imputetoolkit")
```
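If `devtools` is not already available, install it from CRAN first:

```r
install.packages("devtools")
```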
## Usage

```r
library(imputetoolkit)

# Ground truth and imputed data
true_data <- list(
  age = c(25, 30, 40),
  income = c(50000, 60000, 70000)
)
imputed_data <- list(
  age = c(25, 31, 39),
  income = c(50000, 61000, 69000)
)

# Create evaluator object
result <- evaluator(true_data, imputed_data, method = "mean")

# Inspect results
print(result)
summary(result)
```
Example output:

```
Evaluation for method: mean

Global Metrics:
  RMSE       : 1.2909
  MAE        : 1.0000
  R^2        : 0.9456
  Correlation: 0.9827
  KS         : 0.2000
  Accuracy   : 0.5000

Per-column metrics available in result$metrics
```
## Tips

- To compare several imputation strategies, call `evaluator(true_data, imputed_data, method)` for each method (see the sketch after this list).
- Use `print()` for quick checks and `summary()` for detailed per-column analysis.
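For example, a small comparison loop might look like this (hypothetical snippet; `mean` and `median` stand in for whatever imputed results you are benchmarking):

```r
# Evaluate several hypothetical imputation results against the same truth
methods <- list(
  mean   = list(age = c(25, 31, 39)),
  median = list(age = c(25, 30, 38))
)
true_data <- list(age = c(25, 30, 40))

results <- lapply(names(methods), function(m) {
  evaluator(true_data, methods[[m]], method = m)
})

# Quick side-by-side check
lapply(results, print)
```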
## Testing

Unit tests are provided under the `tests/testthat/` directory; they check, among other things, that objects of class `"evaluator"` are created correctly. To run all tests:

```r
devtools::test()
```
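As an illustration, a minimal `testthat` check of the constructor might look like the following (a hypothetical sketch, not necessarily a test shipped with the package):

```r
library(testthat)
library(imputetoolkit)

test_that("evaluator() returns an object of class 'evaluator'", {
  true_data    <- list(age = c(25, 30, 40))
  imputed_data <- list(age = c(25, 31, 39))
  result <- evaluator(true_data, imputed_data, method = "mean")
  expect_s3_class(result, "evaluator")
})
```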
## Documentation

All functions are documented with Roxygen2. To rebuild documentation, run:
```r
devtools::document()
```
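Roxygen2 comments live directly above each function. A hypothetical header for the constructor might look like this (illustrative only, not the package's actual documentation):

```r
#' Evaluate an imputation method against ground truth
#'
#' @param true_data Named list of numeric vectors holding the true values.
#' @param imputed_data Named list of numeric vectors holding the imputed values.
#' @param method Character scalar naming the imputation method.
#' @return An object of class "evaluator".
#' @export
evaluator <- function(true_data, imputed_data, method) {
  # ... implementation ...
}
```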
Help pages are available for all major functions:

```r
?evaluator
?print.evaluator
?summary.evaluator
```
## Acknowledgements

Some parts of this package, including documentation drafting, README preparation, and sections of the R/C++ code (e.g., error handling and function scaffolding), were assisted by ChatGPT (OpenAI). All generated content was reviewed, debugged, and adapted before inclusion in the final submission.