| asNumericDF {Ecdat} | R Documentation |
Delete commas (thousand separators) and drop information after a
blank, then coerce to numeric and order the rows by the
orderBy. Some Excel imports include commas as thousand
separators; this replaces any commas with char(0), ”. Also, some
character data includes footnote references following the year. Table
F-1 from the US Census Bureau needs all three of these features: It
needs orderBy, because the most recent year appears first, just
the opposite of most other data sets where the most recent year
appears last. It has footnote references following a character string
indicating the year. And it includes commas as thousand separators.
asNumericChar(x)
asNumericDF(x, keep=function(x)any(!is.na(x)), orderBy)
x |
For |
keep |
something to indicate which columns to keep |
orderBy |
Which columns to order the rows of |
1. Replace commas by nothing
2. strsplit on ' ' and take only the first part, thereby eliminating the footnote references.
3. Replace any blanks with NAs
4. as.numeric
5. lapply(x, 1-4)
6. order the rows; by default, ascending on the first column
all numeric data.frame
Spencer Graves
fakeF1 <- data.frame(yr=c('1948', '1947 (1)'),
q1=c('1,234', ''), duh=rep(NA, 2) )
nF1 <- asNumericDF(fakeF1)
nF1. <- data.frame(yr=asNumericChar(fakeF1$yr),
q1=asNumericChar(fakeF1$q1))[2:1,]
# correct answer
row.names(nF1.) <- 2:1
nF1c <- data.frame(yr=1947:1948, q1=c(NA, 1234))
row.names(nF1c) <- 2:1
all.equal(nF1, nF1.)
all.equal(nF1, nF1c)