
[Data Science] R Language Study Notes: Basic Syntax

Published: 2021-03-01 06:16:39 | Category: Big Data | Source: compiled from the web
Summary: Data Types, Data Object, Vector
x <- c(0.5,0.6) ## numeric
x <- c(TRUE,FALSE) ## logical
x <- c(T,F) ## logical
x <- c("a","b","c") ## character
x <- 9:29 ## integer
x <- c(1+0i,2+4i) ## complex
x <- vector("numeric",length = 10) ## create a numeric vect

apply is used to evaluate a function (often an anonymous one) over the margins of an array.

  • It is most often used to apply a function to the rows or columns of a matrix.
  • It can be used with general arrays, e.g. taking the average of an array of matrices.
  • It is not really faster than writing a loop, but it works in one line!
> str(apply)
function (X,MARGIN,FUN,...)
  • X is an array
  • MARGIN is an integer vector indicating which margins should be "retained"
  • FUN is a function to be applied.
  • ... is for other arguments to be passed to FUN
> x <- matrix(1:4,2)
> x
     [,1] [,2]
[1,]    1    3
[2,]    2    4
> apply(x,1,mean)
[1] 2 3
> apply(x,2,mean)
[1] 1.5 3.5
  • MARGIN = 1 computes the mean of every row and returns the results as a vector.
  • MARGIN = 2 computes the mean of every column and returns the results as a vector.
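
As a minimal sketch of what apply does with MARGIN = 1, the snippet below compares it with an explicit loop over the rows and then forwards an extra argument to FUN through `...`. The variable names (row_means_loop, row_means_apply, row_quantiles) are illustrative and do not come from the original notes.

```r
x <- matrix(1:4, 2, 2)

# Explicit loop over rows: one mean per row
row_means_loop <- numeric(nrow(x))
for (i in seq_len(nrow(x))) {
  row_means_loop[i] <- mean(x[i, ])
}

# The same computation in one line with apply
row_means_apply <- apply(x, 1, mean)

# Arguments given after FUN are passed on to FUN via `...`;
# here probs is forwarded to quantile for every row
row_quantiles <- apply(x, 1, quantile, probs = c(0.25, 0.75))
```

When FUN returns more than one value per row, as quantile does here, apply collects the results into a matrix with one column per row of x.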

Shortcuts for common row and column operations:

  • rowSums = apply(x,1,sum)
  • rowMeans = apply(x,1,mean)
  • colSums = apply(x,2,sum)
  • colMeans = apply(x,2,mean)
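
The equivalences in this list can be checked directly. The sketch below is only an illustration on a small matrix; the name checks is hypothetical, and all.equal is used because the shortcuts return doubles while apply with sum on an integer matrix returns integers.

```r
x <- matrix(1:4, 2, 2)

# Compare each shortcut with the corresponding apply call
checks <- c(
  rowSums  = isTRUE(all.equal(rowSums(x),  apply(x, 1, sum))),
  rowMeans = isTRUE(all.equal(rowMeans(x), apply(x, 1, mean))),
  colSums  = isTRUE(all.equal(colSums(x),  apply(x, 2, sum))),
  colMeans = isTRUE(all.equal(colMeans(x), apply(x, 2, mean)))
)
print(checks)
```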

apply also works on multi-dimensional arrays. In the example below, a vector is passed as MARGIN so that the first two dimensions are retained and the mean is taken over the third.

> a <- array(rnorm(2 * 2 * 10),c(2,2,10))
> apply(a,c(1,2),mean)
           [,1]        [,2]
[1,]  0.6869065 -0.66529430
[2,] -0.1136978 -0.04124547
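
For this particular case, rowMeans with its dims argument should give the same result as apply. The sketch below assumes a fixed seed (not part of the original notes) so that the run is repeatable; m_apply and m_short are illustrative names.

```r
set.seed(1)  # assumption: fixed seed, only to make the example repeatable
a <- array(rnorm(2 * 2 * 10), c(2, 2, 10))

# Keep dimensions 1 and 2, average over dimension 3
m_apply <- apply(a, c(1, 2), mean)

# Same idea via the shortcut: treat the first two dimensions as "rows"
m_short <- rowMeans(a, dims = 2)

print(all.equal(m_apply, m_short))
```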

mapply

mapply is a multivariate apply of sorts which applies a function in parallel over a set of arguments.

> str(mapply)
function (FUN,...,MoreArgs = NULL,SIMPLIFY = TRUE,USE.NAMES = TRUE)
  • FUN is a function to apply.
  • ... contains arguments to apply over.
  • MoreArgs is a list of other arguments to FUN.
  • SIMPLIFY indicates whether the result should be simplified.
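
A small sketch of how these arguments fit together. The helper scaled_sum and the variable names are hypothetical and used only for illustration.

```r
# Vectorise rep over two argument vectors:
# equivalent to list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))
reps <- mapply(rep, 1:4, 4:1)

# MoreArgs holds arguments that stay the same for every call
scaled_sum <- function(x, y, scale) (x + y) * scale
totals <- mapply(scaled_sum, 1:3, 4:6, MoreArgs = list(scale = 10))

# SIMPLIFY = FALSE always returns a list, even when results have equal length
totals_list <- mapply(scaled_sum, 1:3, 4:6,
                      MoreArgs = list(scale = 10), SIMPLIFY = FALSE)
```

Because the calls to rep return vectors of different lengths, reps stays a list, while totals is simplified to a plain numeric vector.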

tapply

tapply is used to apply a function over subsets of a vector.
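
As a minimal illustration, the sketch below computes a mean per group. The data and the names x, g, and group_means are made up for this example.

```r
x <- c(1, 2, 3, 10, 20, 30)
g <- factor(c("a", "a", "a", "b", "b", "b"))  # group label for each element of x

# Apply mean to the subset of x belonging to each level of g
group_means <- tapply(x, g, mean)
print(group_means)
```

The result has one element per factor level, named after that level.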

split

split divides the data in the vector x into the groups defined by f. The replacement forms replace values corresponding to such a division. unsplit reverses the effect of split.

> s <- split(airquality,airquality$Month)
> sapply(s,function(x) colMeans(x[,c("Ozone","Wind")]))
             5        6        7        8     9
Ozone       NA       NA       NA       NA    NA
Wind  11.62258 10.26667 8.941935 8.793548 10.18
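
The Ozone row is NA for every month because colMeans returns NA for a column that contains missing values. One way to handle this, sketched below, is to pass na.rm = TRUE to colMeans; the variable name monthly is illustrative.

```r
s <- split(airquality, airquality$Month)

# Drop missing values within each month before averaging
monthly <- sapply(s, function(x) {
  colMeans(x[, c("Ozone", "Wind")], na.rm = TRUE)
})
print(monthly)
```

The Wind row is unchanged from the output above, since only the NA entries are affected.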
