| PimaIndiansDiabetes {mlbench} | R Documentation |
Pima Indians Diabetes Database
Description
A data frame with 768 observations on 9 variables.
Usage
data("PimaIndiansDiabetes", package = "mlbench")
data("PimaIndiansDiabetes2", package = "mlbench")
Format
| pregnant | Number of times pregnant |
| glucose | Plasma glucose concentration (glucose tolerance test) |
| pressure | Diastolic blood pressure (mm Hg) |
| triceps | Triceps skin fold thickness (mm) |
| insulin | 2-Hour serum insulin (μU/ml) |
| mass | Body mass index |
| pedigree | Diabetes pedigree function |
| age | Age (years) |
| diabetes | Class variable (test for diabetes), a factor with levels neg and pos |
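The variables listed above can be checked against the loaded data frame. The following is a minimal sketch that assumes the mlbench package is installed; it only inspects the object and does not modify it.

```r
# Assumption: the mlbench package is installed.
data("PimaIndiansDiabetes", package = "mlbench")

dim(PimaIndiansDiabetes)              # number of observations and variables
str(PimaIndiansDiabetes)              # variable names and storage types
table(PimaIndiansDiabetes$diabetes)   # distribution of the class variable
```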
Details
The data set PimaIndiansDiabetes2 contains a corrected
version of the original data set. While the UCI repository index
claims that there are no missing values, closer inspection of the data
shows several physical impossibilities, e.g., blood pressure or body
mass index of 0. In PimaIndiansDiabetes2, all zero values of
glucose, pressure, triceps, insulin and
mass have been set to NA; see also
Wahba, Gu, Wang, and Chappell (1995) and Ripley (1996).
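The correction described above can be sketched in a few lines. This is an illustration under the assumption that mlbench is installed, not the code used to build the package data; the names zero_is_missing and pima_na are hypothetical and chosen only for this example.

```r
# Assumption: the mlbench package is installed.
data("PimaIndiansDiabetes", package = "mlbench")
data("PimaIndiansDiabetes2", package = "mlbench")

# Variables for which a value of 0 is physically impossible
# (hypothetical helper name, not part of mlbench).
zero_is_missing <- c("glucose", "pressure", "triceps", "insulin", "mass")

# Work on a copy and recode zeros in those variables as NA.
pima_na <- PimaIndiansDiabetes
for (v in zero_is_missing) {
  pima_na[[v]][pima_na[[v]] == 0] <- NA
}

# Compare the recoded copy with the corrected data set shipped in mlbench.
all.equal(pima_na, PimaIndiansDiabetes2)
```

If the description holds, the comparison should report no differences; any reported difference would point to a discrepancy worth inspecting by hand.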
Source
Original owners: National Institute of Diabetes and Digestive and Kidney Diseases
Donor of database: Vincent Sigillito (vgs@aplcen.apl.jhu.edu)
These data have been taken from the UCI Repository Of Machine Learning Databases (Blake and Merz 1998) and were converted to R format by Friedrich Leisch in the late 1990s.
The data set no longer seems to be available from the UC Irvine Machine Learning Repository (now at https://archive.ics.uci.edu/).
References
Blake CL, Merz CJ (1998). “UCI Repository of Machine Learning Databases.” University of California, Irvine, Department of Information and Computer Science. Formerly available from ‘http://www.ics.uci.edu/~mlearn/MLRepository.html’.
Ripley BD (1996). Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge. doi:10.1017/CBO9780511812651.
Wahba G, Gu C, Wang Y, Chappell R (1995). “Soft Classification, a.k.a. Risk Estimation, via Penalized Log Likelihood and Smoothing Spline Analysis of Variance.” In Wolpert DH (ed.), The Mathematics of Generalization, 331–359. Addison-Wesley, Reading, MA.
Examples
data("PimaIndiansDiabetes", package = "mlbench")
summary(PimaIndiansDiabetes)
data("PimaIndiansDiabetes2", package = "mlbench")
summary(PimaIndiansDiabetes2)
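Because PimaIndiansDiabetes2 contains NA values, it can help to count them before modelling. The sketch below assumes mlbench is installed; the object names na_per_variable, n_complete and pima_complete are hypothetical, and dropping incomplete rows is only one of several possible ways to handle the missing values.

```r
# Assumption: the mlbench package is installed.
data("PimaIndiansDiabetes2", package = "mlbench")

# Number of missing values in each variable.
na_per_variable <- colSums(is.na(PimaIndiansDiabetes2))
na_per_variable

# Number of observations with no missing value in any variable.
n_complete <- sum(complete.cases(PimaIndiansDiabetes2))
n_complete

# Keep only the complete observations (listwise deletion).
pima_complete <- na.omit(PimaIndiansDiabetes2)
dim(pima_complete)
```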