R - Detect outliers within a group and outliers within a single row

    car     100  200  300
    group1   34   35   34
    group1   57   67   34
    group1   68   76    6
    group2   45   23   23

I have problems detecting outliers in a data frame. I want to detect whether a complete vector (one row) is an outlier relative to the corresponding group vectors (rows one to three), for each group. Further, I want to detect whether there is an outlier within one specific row. The problem is that with the solution I found, I have to repeat the whole code for every single row and check the table for "TRUE". Is there any automation possible? E.g. creating a matrix of the outputs, so that I only have to check sum(matrix == TRUE).

The code:

    library(outliers)

    x <- as.numeric(data_without[1, 1:400])

    # Repeatedly apply the Grubbs test and flag every value it reports
    grubbs.flag <- function(x) {
      outliers <- NULL
      test <- x
      grubbs.result <- grubbs.test(test)
      pv <- grubbs.result$p.value
      while (pv < 0.05) {
        # The third word of `alternative` is the suspected outlier value
        outliers <- c(outliers, as.numeric(strsplit(grubbs.result$alternative, " ")[[1]][3]))
        test <- x[!x %in% outliers]
        grubbs.result <- grubbs.test(test)
        pv <- grubbs.result$p.value
      }
      return(data.frame(x = x, outlier = (x %in% outliers)))
    }

    grubbs.flag(x)

           x outlier
    1 0.1157   FALSE
    2 0.1152   FALSE
    3 0.1163   FALSE
    4 0.1165   FALSE

I've read the documentation, and the default option checks whether there is a single outlier in the given data. Therefore I think it suffices to run the test once for each group.
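As a small illustration of that default, the sketch below calls grubbs.test on the nine group1 values from the question. According to the package documentation, type = 10 (the default) tests for one outlier, while type = 11 and type = 20 test for two outliers on opposite tails or on the same tail. The variable names (x, res_default and so on) are made up for this example.

```r
library(outliers)

# The nine group1 values from the question, pooled into one vector
x <- c(34, 35, 34, 57, 67, 34, 68, 76, 6)

# Default call: equivalent to type = 10, a test for one outlier.
# The side (lowest or highest value) is picked automatically.
res_default <- grubbs.test(x)
res_type10  <- grubbs.test(x, type = 10)

res_default$alternative   # names the value under test
res_default$p.value       # p-value for that single value

# opposite = TRUE tests the extreme value on the other tail instead
res_opposite <- grubbs.test(x, opposite = TRUE)
res_opposite$alternative
```

Because a single call only examines one candidate value, running it once per group tells you whether the most extreme value of that group stands out, not how many outliers there are.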

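First the data is split by group, and then the test is applied to each group in turn. The p-value and the description are returned at the end so you can see which value is the outlier, if there is one. It'd be easy to identify the outlier, as it'll be either the maximum or the minimum value.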

    library(outliers)

    df <- t(data.frame(car = c(100, 200, 300),
                       g1 = c(34, 35, 34),
                       g1 = c(57, 67, 34),
                       g1 = c(68, 76, 6),
                       g2 = c(45, 23, 23)))
    row.names(df) <- c("car", "group1", "group1", "group1", "group2")

    # Split the rows by group name
    lst <- lapply(1:length(unique(row.names(df))), function(x) {
      df[row.names(df) == unique(row.names(df))[x], ]
    })

    lst
    [[1]]
    [1] 100 200 300

    [[2]]
           [,1] [,2] [,3]
    group1   34   35   34
    group1   57   67   34
    group1   68   76    6

    [[3]]
    [1] 45 23 23

    # Run the Grubbs test once per group
    lapply(lst, function(x) {
      tst <- grubbs.test(x)
      c(tst$p.value, tst$alternative)
    })
    [[1]]
    [1] "0.5"                             "highest value 300 is an outlier"

    [[2]]
    [1] "0.244875529263511"            "lowest value 6 is an outlier"

    [[3]]
    [1] "0"                              "highest value 45 is an outlier"
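The question also asks for a matrix of TRUE/FALSE flags so that the rows do not have to be checked one by one. A minimal sketch of that idea follows, assuming one Grubbs test per row is enough. The matrix m is made-up example data with eight values per row (three values per row are too few for the test to be meaningful), and flag_row is a hypothetical helper, not a function from the outliers package.

```r
library(outliers)

# Hypothetical example data: three rows (vectors) with eight measurements each
m <- matrix(c(10, 11,  9, 10, 12, 10, 11, 10,
              10, 11,  9, 10, 12, 10, 11, 50,
              20, 21, 19, 20, 22, 20, 21, 20),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("group1", "group1", "group2"), NULL))

# Illustrative helper: run one Grubbs test on a row and mark the value
# farthest from the row mean when the p-value is below alpha.
flag_row <- function(x, alpha = 0.05) {
  flags <- rep(FALSE, length(x))
  p <- grubbs.test(x)$p.value
  if (!is.na(p) && p < alpha) {
    flags[which.max(abs(x - mean(x)))] <- TRUE
  }
  flags
}

# apply() returns one column per row of m, so transpose the result back
flag_matrix <- t(apply(m, 1, flag_row))

sum(flag_matrix)                     # total number of flagged values
rowSums(flag_matrix)                 # flagged values per row
which(flag_matrix, arr.ind = TRUE)   # row and column of each flagged value
```

Locating the flagged value with which.max avoids parsing the number out of the alternative string, which can lose precision for values with many digits. If several outliers per row are expected, the iterative grubbs.flag function from the question could be used inside apply() in the same way.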
