8. Sorting data
> leadership$age
[1] 32 45 25 39 NA
> newdata <- leadership[order(leadership$age),]
> newdata
  manager   testDate country gender age item1 item2 item3 item4 item5
3       3 2008-10-01      UK      F  25     3     5     5     5     2
1       1 2008-10-24      US      M  32     5     4     5     5     5
4       4 2008-10-12      UK      M  39     3     3     4    NA    NA
2       2 2008-10-28      US      F  45     3     5     2     5     5
5       5 2009-05-01      UK      F  NA     2     2     1     2     1
  stringAsFactors agecat
3           FALSE  Young
1           FALSE  Young
4           FALSE  Young
2           FALSE  Young
5           FALSE   <NA>
> 
> 
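As the output above shows, order() sorts in ascending order and places the missing age last by default. The following is a minimal sketch of the other options, run on a reduced copy of leadership (only manager, gender and age, with values taken from the listing above). na.last and decreasing are standard arguments of order(); the object names asc, desc, na_first and no_na are illustrative.

```r
# Reduced copy of the leadership data: only the columns needed for sorting.
leadership <- data.frame(
  manager = 1:5,
  gender  = c("M", "F", "F", "M", "F"),
  age     = c(32, 45, 25, 39, NA)
)

# Default behaviour: ascending, NA placed last.
asc <- leadership[order(leadership$age), ]

# Descending order; the NA row still goes last.
desc <- leadership[order(leadership$age, decreasing = TRUE), ]

# Put the NA row first instead.
na_first <- leadership[order(leadership$age, na.last = FALSE), ]

# Drop rows whose sort key is NA.
no_na <- leadership[order(leadership$age, na.last = NA), ]

print(desc)
```

Using na.last = NA is a compact way to sort and filter out incomplete rows in one step, but it silently changes the number of rows, so it is worth checking nrow() afterwards.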
> attach(leadership)
The following objects are masked _by_ .GlobalEnv:
    age, country, gender, manager
> newdata <- leadership[order(gender, age),]
> detach(leadership)
> newdata
  manager   testDate country gender age item1 item2 item3 item4 item5
3       3 2008-10-01      UK      F  25     3     5     5     5     2
2       2 2008-10-28      US      F  45     3     5     2     5     5
5       5 2009-05-01      UK      F  NA     2     2     1     2     1
1       1 2008-10-24      US      M  32     5     4     5     5     5
4       4 2008-10-12      UK      M  39     3     3     4    NA    NA
  stringAsFactors agecat
3           FALSE  Young
2           FALSE  Young
5           FALSE   <NA>
1           FALSE  Young
4           FALSE  Young
> 
> attach(leadership)
The following objects are masked _by_ .GlobalEnv:
    age, country, gender, manager
> newdata <- leadership[order(gender, -age),]
> detach(leadership)
> newdata
  manager   testDate country gender age item1 item2 item3 item4 item5
5       5 2009-05-01      UK      F  NA     2     2     1     2     1
2       2 2008-10-28      US      F  45     3     5     2     5     5
3       3 2008-10-01      UK      F  25     3     5     5     5     2
4       4 2008-10-12      UK      M  39     3     3     4    NA    NA
1       1 2008-10-24      US      M  32     5     4     5     5     5
  stringAsFactors agecat
5           FALSE   <NA>
2           FALSE  Young
3           FALSE  Young
4           FALSE  Young
1           FALSE  Young
> 
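The "masked _by_ .GlobalEnv" message in the sessions above means that, after attach(), names such as age and gender resolve to same-named vectors in the workspace rather than to the columns of leadership, so the sort can silently use values that differ from the data frame. One way to avoid this is with(), sketched below on a reduced copy of leadership. The global age vector ending in 99 is a hypothetical stand-in for a leftover workspace variable, not something taken from the source.

```r
# Reduced copy of the leadership data (values from the listing above).
leadership <- data.frame(
  manager = 1:5,
  gender  = c("M", "F", "F", "M", "F"),
  age     = c(32, 45, 25, 39, NA)
)

# Hypothetical leftover workspace vector that differs from leadership$age.
age <- c(32, 45, 25, 39, 99)

# with() evaluates the expression inside the data frame, so gender and age
# refer to its columns even though a global 'age' exists.
newdata <- with(leadership, leadership[order(gender, -age), ])

print(newdata)
```

With the data frame's own columns, the row with the missing age sorts last within its gender group. No attach()/detach() pair is needed, and nothing is left on the search path if the code stops halfway.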
9. Merging datasets
9.1 Adding columns
> patientID <- c(1, 2, 3, 4)
> age <- c(25, 34, 28, 52)
> status <- c("poor", "improved", "excellent", "poor")
> gender <- c("F", "M", "M", "F")
> dataframeA <- data.frame(patientID, gender)
> dataframeA
  patientID gender
1         1      F
2         2      M
3         3      M
4         4      F
> dataframeB <- data.frame(patientID, age, status)
> dataframeB
  patientID age    status
1         1  25      poor
2         2  34  improved
3         3  28 excellent
4         4  52      poor
> total <- merge(dataframeA, dataframeB, by="ID")
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
> total <- merge(dataframeA, dataframeB, by="patientID")
> total
  patientID gender age    status
1         1      F  25      poor
2         2      M  34  improved
3         3      M  28 excellent
4         4      F  52      poor
> total <- merge(dataframeA, dataframeB, by=c("gender", "age"))
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
> total <- merge(dataframeA, dataframeB, by=c("patientID", "age"))
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
> 
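The errors above arise because every column named in by must exist in both data frames: there is no ID column at all, gender is absent from dataframeB, and age is absent from dataframeA. The following is a minimal sketch of checking the shared columns first, followed by a two-key merge. The frames visitsA and visitsB and their visit and weight columns are hypothetical, added only to illustrate merging on more than one key.

```r
dataframeA <- data.frame(patientID = c(1, 2, 3, 4),
                         gender    = c("F", "M", "M", "F"))
dataframeB <- data.frame(patientID = c(1, 2, 3, 4),
                         age       = c(25, 34, 28, 52),
                         status    = c("poor", "improved", "excellent", "poor"))

# Columns present in both frames are the only valid choices for 'by'.
common <- intersect(names(dataframeA), names(dataframeB))
total  <- merge(dataframeA, dataframeB, by = common)

# Hypothetical two-key example: both frames carry patientID and visit.
visitsA <- data.frame(patientID = c(1, 1, 2), visit = c(1, 2, 1),
                      weight = c(60, 61, 75))
visitsB <- data.frame(patientID = c(1, 1, 2), visit = c(1, 2, 1),
                      status = c("poor", "improved", "poor"))
visits  <- merge(visitsA, visitsB, by = c("patientID", "visit"))

print(common)
print(visits)
```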
> total <- cbind(dataframeA, dataframeB)
> total
  patientID gender patientID age    status
1         1      F         1  25      poor
2         2      M         2  34  improved
3         3      M         3  28 excellent
4         4      F         4  52      poor
> 
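cbind() simply places columns side by side: it repeats patientID, as the output above shows, and it pairs rows by position rather than by key. The following sketch drops the duplicate column before binding, then shows that merge() still matches correctly when the rows arrive in a different order. The reordered frame shuffledB is a hypothetical illustration.

```r
dataframeA <- data.frame(patientID = c(1, 2, 3, 4),
                         gender    = c("F", "M", "M", "F"))
dataframeB <- data.frame(patientID = c(1, 2, 3, 4),
                         age       = c(25, 34, 28, 52),
                         status    = c("poor", "improved", "excellent", "poor"))

# Drop the repeated key before binding; valid only when both frames
# have the same number of rows in the same order.
total <- cbind(dataframeA, dataframeB[, c("age", "status")])

# Hypothetical case: the second frame arrives in a different row order.
shuffledB <- dataframeB[c(4, 2, 1, 3), ]

# merge() matches on the key, so row order does not matter.
merged <- merge(dataframeA, shuffledB, by = "patientID")

print(total)
print(merged)
```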
9.2 Adding rows
> total <- rbind(dataframeA, dataframeB)
Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match
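The call fails because dataframeA has two columns and dataframeB has three; rbind() on data frames needs both to have the same variables. The following is a minimal sketch of a call that works, where dataframeC is a hypothetical frame holding two extra patients with the same variables, deliberately listed in a different column order.

```r
dataframeA <- data.frame(patientID = c(1, 2, 3, 4),
                         gender    = c("F", "M", "M", "F"))

# Hypothetical extra observations with the same variables,
# listed in a different column order.
dataframeC <- data.frame(gender    = c("M", "F"),
                         patientID = c(5, 6))

# rbind() on data frames matches columns by name, not by position.
total <- rbind(dataframeA, dataframeC)

print(total)
```

If one frame lacks a variable the other has, a common approach is to add it as a column of NA before binding, or to drop the extra variable from the other frame.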
10. Subsetting datasets
10.1 Selecting (keeping) variables
10.2 Excluding (dropping) variables
10.3 Selecting observations
10.4 The subset() function
10.5 Random sampling
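The subsections above are listed without examples. As a rough sketch only, the five operations could look like this on a reduced copy of leadership (manager, gender and age from the listings in section 8). The thresholds, the sample size of three and all object names are illustrative assumptions.

```r
leadership <- data.frame(
  manager = 1:5,
  gender  = c("M", "F", "F", "M", "F"),
  age     = c(32, 45, 25, 39, NA)
)

# 10.1 Keep variables by name.
kept <- leadership[, c("manager", "age")]

# 10.2 Drop a variable by excluding its name.
dropped <- leadership[, !(names(leadership) %in% "gender")]

# 10.3 Select observations; which() discards comparisons that yield NA.
older_men <- leadership[which(leadership$gender == "M" & leadership$age > 30), ]

# 10.4 subset() combines row and column selection in one call.
sub <- subset(leadership, age >= 35, select = c(manager, age))

# 10.5 Random sample of three rows without replacement.
mysample <- leadership[sample(nrow(leadership), 3), ]

print(sub)
```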
Original: http://www.cnblogs.com/wnzhong/p/7496188.html