Small movements of tagged animals result in discernible variations in the strength of the received signal (Cochran et al. 1965; Kjos and Cochran 1970) that reflect changes in the angle and distance between the transmitter and receiver. Kays et al. (2011) proposed a method for automatically classifying active and passive behaviour based on a threshold difference in the signal strength of successive VHF signals recorded by a commercial automatic radio-tracking system. However, machine learning (ML) algorithms are optimised for the recognition of complex patterns in a dataset and are typically robust against factors that influence signal propagation, such as changes in temperature and humidity, physical contact with conspecifics and/or multipath signal propagation (Alade 2013). Accordingly, a ML model trained with a dataset encompassing the possible diversity of signal patterns related to active and passive behaviour can be expected to perform at least as well as a threshold-based approach. In this work, we built on the methodology of Kays et al. (2011) by calibrating two random forest models (one for data coming from only one receiver and one for data coming from at least two receivers), based on millions of data points representing the behaviours of multiple tagged individuals of two temperate bat species (Myotis bechsteinii, Nyctalus leisleri).
The method was tested by applying it to independent data from bats, humans and a bird species and then comparing the results with those obtained using the threshold-based approach of Kays et al. (2011), applying the threshold value of 2.5 dB signal strength difference suggested by Schofield et al. (2018).
In order to make our work comprehensible, code and data are made available to all interested parties. Data for model training can be found here. Data for evaluation is stored here.
This resource contains the following steps:
But before we get started:
Although deep learning methods have been successfully applied to several ecological problems where large amounts of data are available (Christin, Hervet, and Lecomte 2019), we use a random forest model for the following reasons:
For model training and tuning we use the caret R package (Kuhn 2008). For the forward feature selection we use the CAST R package developed by Meyer et al. (2018). Additional packages needed are: randomForest, ranger, doParallel, MLeval, data.table, dplyr and plyr.
Load packages
library(caret); library(randomForest); library(ranger); library(doParallel); library(MLeval); library(CAST); library(data.table); library(dplyr); library(plyr)
Only one antenna is necessary to classify VHF signals into active vs. passive states (Kays et al. 2011). However, agreement between receivers of the same station provides additional information and can improve the reliability of the classification. Our ground truth dataset was balanced by randomly down-sampling the activity class with more data to the size of the class with less data. These balanced datasets were then split into 50% training data and 50% test data for data originating from one receiver. The same procedure was used for data derived from the signals of two receivers, resulting in two training and two test datasets. From a total of 3,243,753 VHF signals, 124,898 signals were assigned to train the two-receiver model and 294,440 signals to train the one-receiver model (Table 1).
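The balancing and splitting step itself is not part of the code shown in this resource, but a minimal sketch with caret could look as follows (assuming a hypothetical data frame groundtruth with a factor column behaviour holding the active/passive labels, analogous to the threshold section below):

# down-sample the larger class to the size of the smaller one;
# downSample() appends the class labels as a new column named "Class"
set.seed(10)
balanced <- downSample(x = groundtruth, y = groundtruth$behaviour)

# stratified split into 50% training and 50% test data
idx <- createDataPartition(balanced$Class, p = 0.5, list = FALSE, times = 1)
groundtruth_train <- balanced[idx, ]
groundtruth_test  <- balanced[-idx, ]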
Since not all variables are equally important to the model and some may even be misleading, we performed a forward feature selection on 50% of the training data. The forward feature selection algorithm implemented in the R package CAST (Meyer et al. 2018) selects the best pair out of all possible two-variable combinations by evaluating the performance of a k-fold cross-validation (CV). The algorithm then iteratively increases the number of predictors until adding further variables no longer improves the performance.
# get data and check class distribution
data_1 <- readRDS("model_tuning/data/batsTrain_1_receiver.rds")
table(data_1$Class)
##
## active passive
## 294173 294173
# forward feature selection
predictors <- names(data_1[, -ncol(data_1)])

cl <- makeCluster(10)
registerDoParallel(cl)

ctrl <- trainControl(## 10-fold CV
                     method = "cv",
                     number = 10)

# run ffs model with 10-fold CV
set.seed(10)
ffsmodel <- ffs(predictors = data_1[, predictors], response = data_1$Class, method = "rf",
                metric = "Kappa",
                tuneLength = 1,
                trControl = ctrl,
                verbose = TRUE)

ffsmodel$selectedvars

saveRDS(ffsmodel, "model_tuning/models/m_r1.rds")
stopCluster(cl)
Red dots display two-variable combinations; dots coloured from yellow to pink represent models to each of which another variable has been added. Dots with a black border mark the optimal variable combination in the respective iteration.
<-readRDS("model_tuning/models/m_r1.rds")
m1
print(m1)
## Random Forest
##
## 588346 samples
## 7 predictor
## 2 classes: 'active', 'passive'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 529512, 529512, 529511, 529512, 529511, 529511, ...
## Resampling results:
##
## Accuracy Kappa
## 0.9631628 0.9263257
##
## Tuning parameter 'mtry' was held constant at a value of 2
print(plot_ffs(m1))
plot(varImp(m1))
# get data and check class distribution
data_2 <- readRDS("model_tuning/data/batsTrain_2_receivers.rds")
table(data_2$Class)
##
## active passive
## 110274 110274
predictors <- names(data_2[, -ncol(data_2)])

cl <- makeCluster(10)
registerDoParallel(cl)

ctrl <- trainControl(## 10-fold CV
                     method = "cv",
                     number = 10)

# run ffs model
set.seed(10)
ffsmodel <- ffs(predictors = data_2[, predictors], response = data_2$Class, method = "rf",
                metric = "Kappa",
                tuneLength = 1,
                trControl = ctrl,
                verbose = TRUE)

ffsmodel$selectedvars

saveRDS(ffsmodel, "model_tuning/models/m_r2.rds")
stopCluster(cl)
Red dots display two-variable combinations; dots coloured from yellow to pink represent models to each of which another variable has been added. Dots with a black border mark the optimal variable combination in the respective iteration.
<-readRDS("model_tuning/models/m_r2.rds")
m2print(m2)
## Random Forest
##
## 220548 samples
## 8 predictor
## 2 classes: 'active', 'passive'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 198494, 198493, 198494, 198494, 198492, 198492, ...
## Resampling results:
##
## Accuracy Kappa
## 0.9740011 0.9480022
##
## Tuning parameter 'mtry' was held constant at a value of 2
print(plot_ffs(m2))
plot(varImp(m2))
Random forest is an algorithm that is far less tunable than other algorithms such as support vector machines (Probst, Wright, and Boulesteix 2019) and is known to provide good results with the default settings of existing software packages (Fernández-Delgado et al. 2014). Even though the achievable performance gain is small, tuning the parameter mtry provides the biggest average improvement of the AUC (0.006) (Probst et al. 2018). Mtry is defined as the number of randomly drawn candidate variables out of which each split is selected when growing a tree. Here we reduce the existing predictor variables to those selected by the forward feature selection and iteratively increase the number of randomly drawn candidate variables from 1 to the total number of selected variables. Other parameters, such as the number of trees, are held constant at the default settings of the packages used.
# reduce to ffs variables
predictors <- names(data_1[, c(m1$selectedvars, "Class")])
batsTune <- data_1[, predictors]

# tune number of candidate variables per split - number of trees is 1000
ctrl <- trainControl(## 10-fold CV
                     method = "cv",
                     number = 10,
                     verboseIter = TRUE)

tunegrid <- expand.grid(
  mtry = 1:(length(predictors) - 1), # mtry specified here
  splitrule = "gini",
  min.node.size = 10
)

# ranger implementation of random forest (supports splitrule and min.node.size in the tuning grid)
tuned_model <- train(Class ~ .,
                     data = batsTune,
                     method = 'ranger',
                     metric = 'Kappa',
                     tuneGrid = tunegrid,
                     num.trees = 1000,
                     trControl = ctrl)

saveRDS(tuned_model, "model_tuning/models/m_r1_tuned.rds")
<-readRDS("model_tuning/models/m_r1_tuned.rds")
m1_tuned
print(m1_tuned)
## Random Forest
##
## 588346 samples
## 7 predictor
## 2 classes: 'active', 'passive'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 529510, 529512, 529511, 529511, 529511, 529512, ...
## Resampling results across tuning parameters:
##
## mtry Accuracy Kappa
## 1 0.9591601 0.9183202
## 2 0.9616518 0.9233036
## 3 0.9619646 0.9239291
## 4 0.9618371 0.9236742
## 5 0.9615039 0.9230079
## 6 0.9610569 0.9221139
## 7 0.9602819 0.9205637
##
## Tuning parameter 'splitrule' was held constant at a value of gini
##
## Tuning parameter 'min.node.size' was held constant at a value of 10
## Kappa was used to select the optimal model using the largest value.
## The final values used for the model were mtry = 3, splitrule = gini
## and min.node.size = 10.
# reduce to ffs variables
predictors <- names(data_2[, c(m2$selectedvars, "Class")])
batsTune <- data_2[, predictors]

# tune number of candidate variables per split - number of trees is 1000
ctrl <- trainControl(## 10-fold CV
                     method = "cv",
                     number = 10,
                     verboseIter = TRUE)

tunegrid <- expand.grid(
  mtry = 1:(length(predictors) - 1), # mtry specified here
  splitrule = "gini",
  min.node.size = 10
)

tuned_model_2 <- train(Class ~ .,
                       data = batsTune,
                       method = 'ranger',
                       metric = 'Kappa',
                       tuneGrid = tunegrid,
                       num.trees = 1000,
                       trControl = ctrl)

print(tuned_model_2)
saveRDS(tuned_model_2, "model_tuning/models/m_r2_tuned.rds")
<-readRDS("model_tuning/models/m_r2_tuned.rds")
m2_tunedprint(m2_tuned)
## Random Forest
##
## 220548 samples
## 8 predictor
## 2 classes: 'active', 'passive'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 198494, 198494, 198493, 198492, 198493, 198494, ...
## Resampling results across tuning parameters:
##
## mtry Accuracy Kappa
## 1 0.9719608 0.9439215
## 2 0.9724187 0.9448374
## 3 0.9717975 0.9435951
## 4 0.9712988 0.9425976
## 5 0.9710041 0.9420081
## 6 0.9707139 0.9414277
## 7 0.9703285 0.9406569
## 8 0.9702605 0.9405209
##
## Tuning parameter 'splitrule' was held constant at a value of gini
##
## Tuning parameter 'min.node.size' was held constant at a value of 10
## Kappa was used to select the optimal model using the largest value.
## The final values used for the model were mtry = 2, splitrule = gini
## and min.node.size = 10.
Both models (based on data from one receiver and from two receivers) had very high performance metrics (Kappa, accuracy), with slightly better results for the two-receiver model. Tuning the mtry parameter did not increase the performance, which indicates that for our use case the default settings are a good choice.
To validate the model performance and the applicability to species with a movement behaviour (speed etc.) different from that of bats, we generated three different data sets:
1. We put 50% of our bat data aside.
2. We collected ground truth data of a tagged middle spotted woodpecker.
3. We simulated different movement intensities with humans carrying transmitters through the forest.
In this section we test how well the models perform in terms of different performance metrics such as F-score, accuracy and ROC-AUC.
We first take a look at the 50% test data that has been put aside for evaluation. Here we actually perform the prediction using the two trained models. For the woodpecker and human walk datasets we use already predicted data that has been processed by the scripts validation_woodpecker and validation_human_activity.
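For orientation, accuracy, precision (Pos Pred Value), recall (Sensitivity) and the F-score reported below all follow directly from the counts of a binary confusion matrix. A small sketch with made-up counts (not taken from our data; caret's confusionMatrix() reports the same quantities under byClass):

# hypothetical counts with "active" as the positive class
TP <- 950  # active predicted as active
FP <- 40   # passive predicted as active
FN <- 50   # active predicted as passive
TN <- 960  # passive predicted as passive

accuracy  <- (TP + TN) / (TP + FP + FN + TN)
precision <- TP / (TP + FP)   # Pos Pred Value
recall    <- TP / (TP + FN)   # Sensitivity
F1        <- 2 * precision * recall / (precision + recall)
round(c(accuracy = accuracy, precision = precision, recall = recall, F1 = F1), 4)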
# Testdata 1 receiver
Test_1 <- readRDS("validation/bats/data/batsTest_1_receiver.rds")
print(table(Test_1$Class))
##
## active passive
## 294172 294172
# Default names as expected in Caret
Test_1$obs <- factor(Test_1$Class)

# get binary prediction
pred1 <- predict(m1, Test_1)
Test_1$pred <- factor(pred1)

# probabilities
prob <- predict(m1, Test_1, type = "prob")
Test_1 <- cbind(Test_1, prob)

# calculate roc-auc
roc1 <- MLeval::evalm(data.frame(prob, Test_1$obs))
saveRDS(roc1, "validation/bats/results/roc_1receiver.rds")

# create confusion matrix
cm_r1 <- confusionMatrix(factor(Test_1$pred), factor(Test_1$Class))
print(cm_r1)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 282861 10220
## passive 11311 283952
##
## Accuracy : 0.9634
## 95% CI : (0.9629, 0.9639)
## No Information Rate : 0.5
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9268
##
## Mcnemar's Test P-Value : 1.099e-13
##
## Sensitivity : 0.9615
## Specificity : 0.9653
## Pos Pred Value : 0.9651
## Neg Pred Value : 0.9617
## Prevalence : 0.5000
## Detection Rate : 0.4808
## Detection Prevalence : 0.4981
## Balanced Accuracy : 0.9634
##
## 'Positive' Class : active
##
print(cm_r1$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.9615497 0.9652584 0.9651291
## Neg Pred Value Precision Recall
## 0.9616918 0.9651291 0.9615497
## F1 Prevalence Detection Rate
## 0.9633361 0.5000000 0.4807749
## Detection Prevalence Balanced Accuracy
## 0.4981456 0.9634041
#
twoClassSummary(Test_1, lev = levels(Test_1$obs))
## ROC Sens Spec
## 0.9942587 0.9615497 0.9652584
<- readRDS("validation/bats/results/roc_1receiver.rds")
roc1 print(roc1$roc)
# two receivers
Test_2 <- readRDS("validation/bats/data/batsTest_2_receivers.rds")
table(Test_2$Class)
##
## active passive
## 110273 110273
Test_2$obs <- Test_2$Class

# get binary prediction
pred2 <- predict(m2, Test_2)
Test_2$pred <- pred2

# probabilities
prob2 <- predict(m2, Test_2, type = "prob")
Test_2 <- cbind(Test_2, prob2)

# calculate roc-auc
roc2 <- MLeval::evalm(data.frame(prob2, Test_2$obs))
saveRDS(roc2, "validation/bats/results/roc_2receivers.rds")

cm_r2 <- confusionMatrix(factor(Test_2$pred), factor(Test_2$obs))
print(cm_r2)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 107568 2746
## passive 2705 107527
##
## Accuracy : 0.9753
## 95% CI : (0.9746, 0.9759)
## No Information Rate : 0.5
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.9506
##
## Mcnemar's Test P-Value : 0.588
##
## Sensitivity : 0.9755
## Specificity : 0.9751
## Pos Pred Value : 0.9751
## Neg Pred Value : 0.9755
## Prevalence : 0.5000
## Detection Rate : 0.4877
## Detection Prevalence : 0.5002
## Balanced Accuracy : 0.9753
##
## 'Positive' Class : active
##
print(cm_r2$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.9754700 0.9750982 0.9751074
## Neg Pred Value Precision Recall
## 0.9754608 0.9751074 0.9754700
## F1 Prevalence Detection Rate
## 0.9752887 0.5000000 0.4877350
## Detection Prevalence Balanced Accuracy
## 0.5001859 0.9752841
twoClassSummary(Test_2, lev = levels(Test_2$obs))

roc2 <- readRDS("validation/bats/results/roc_2receivers.rds")
print(roc2$roc)
# woodpecker
wp <- readRDS("validation/woodpecker/data/woodpecker_groundtruth.rds")

wp$obs <- as.factor(wp$observed)
wp$pred <- as.factor(wp$prediction)

# create confusion matrix
cm_wp <- confusionMatrix(wp$pred, wp$obs)
print(cm_wp)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 8309 31
## passive 432 7969
##
## Accuracy : 0.9723
## 95% CI : (0.9697, 0.9748)
## No Information Rate : 0.5221
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9447
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.9506
## Specificity : 0.9961
## Pos Pred Value : 0.9963
## Neg Pred Value : 0.9486
## Prevalence : 0.5221
## Detection Rate : 0.4963
## Detection Prevalence : 0.4982
## Balanced Accuracy : 0.9734
##
## 'Positive' Class : active
##
print(cm_wp$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.9505777 0.9961250 0.9962830
## Neg Pred Value Precision Recall
## 0.9485776 0.9962830 0.9505777
## F1 Prevalence Detection Rate
## 0.9728939 0.5221313 0.4963264
## Detection Prevalence Balanced Accuracy
## 0.4981781 0.9733514
print(twoClassSummary(wp, lev = levels(wp$obs)))
## ROC Sens Spec
## 0.9982197 0.9505777 0.9961250
roc_wp <- MLeval::evalm(data.frame(wp[, c("active", "passive")], wp$obs), plots = c("r"))
#print(roc_wp$roc)
# human walk
hm <- readRDS("validation/human/data/human_walk_groundtruth.rds")
hm$obs <- factor(hm$observation)
hm$pred <- factor(hm$prediction)

# create confusion matrix
cm_hm <- confusionMatrix(factor(hm$pred), factor(hm$obs))
print(cm_hm)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 25787 280
## passive 717 5870
##
## Accuracy : 0.9695
## 95% CI : (0.9675, 0.9713)
## No Information Rate : 0.8117
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9028
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.9729
## Specificity : 0.9545
## Pos Pred Value : 0.9893
## Neg Pred Value : 0.8911
## Prevalence : 0.8117
## Detection Rate : 0.7897
## Detection Prevalence : 0.7983
## Balanced Accuracy : 0.9637
##
## 'Positive' Class : active
##
print(cm_hm$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.9729475 0.9544715 0.9892584
## Neg Pred Value Precision Recall
## 0.8911492 0.9892584 0.9729475
## F1 Prevalence Detection Rate
## 0.9810352 0.8116617 0.7897042
## Detection Prevalence Balanced Accuracy
## 0.7982789 0.9637095
twoClassSummary(hm, lev = levels(hm$obs))
## ROC Sens Spec
## 0.9902507 0.9729475 0.9544715
roc_hm <- MLeval::evalm(data.frame(hm[, c("active", "passive")], hm$obs), plots = c("r"))
#print(roc_hm$roc)
Regardless of whether the models were tested on independent test data from bats or on data from other species (human, woodpecker), the performance metrics were always close to their maxima.
The results of the ML-based approach were compared with those of a threshold-based approach (Kays et al. 2011) by calculating the difference in signal strength between successive signals for all three test datasets (bats, bird, humans). We applied a threshold of 2.5 dB, which was deemed appropriate to optimally separate active and passive behaviours in previous studies (Schofield et al. 2018). In addition, the optimize function of the R package stats (R Core Team 2021) was used to identify the value of the signal strength difference that separated the training dataset into active and passive with the highest accuracy. This value was also applied to all three test datasets.
To find the threshold value that maximises the accuracy (the data are balanced) when separating the data into active and passive, we first calculated the signal strength difference of consecutive signals in the complete bat dataset, then separated it into 50% balanced training and test data, and finally used the optimize function from the stats package to determine the best threshold.
# get all bat data
trn <- fread("validation/bats/data/train_2020_2021.csv")

# calculate signal strength difference per station
dtrn <- plyr::ldply(unique(trn$station), function(x){
  tmp <- trn[trn$station == x, ]
  tmp <- tmp[order(tmp$timestamp), ]
  tmp <- tmp %>% group_by(ID) %>%
    mutate(Diff = abs(max_signal - lag(max_signal)))
  return(tmp)
})

## data clean up
dtrn <- dtrn[!is.na(dtrn$Diff), ]
dtrn <- dtrn[!(dtrn$behaviour == "active" & dtrn$Diff == 0), ]

## factorize
dtrn$behaviour <- as.factor(dtrn$behaviour)
table(dtrn$behaviour)
##
## active passive
## 513831 2654868
# balance data
set.seed(10)
tdown <- downSample(x = dtrn,
                    y = dtrn$behaviour)

# create 50% train and test
trainIndex <- createDataPartition(tdown$Class, p = .5,
                                  list = FALSE,
                                  times = 1)
dtrn <- tdown[trainIndex, ]
dtst <- tdown[-trainIndex, ]

# optimize separation value based on accuracy (remember data is balanced)
value <- dtrn$Diff
group <- dtrn$behaviour

accuracy = Vectorize(function(th) mean(c("passive", "active")[(value > th) + 1] == group))
ac <- optimize(accuracy, c(min(value, na.rm = TRUE), max(value, na.rm = TRUE)), maximum = TRUE)

ac$maximum
## [1] 1.088167
# classify data by optimized value
dtst$pred <- NA
dtst$pred[dtst$Diff > ac$maximum] <- "active"
dtst$pred[dtst$Diff <= ac$maximum] <- "passive"

# calc confusion matrix
dtst$pred <- factor(dtst$pred)
cm <- confusionMatrix(factor(dtst$Class), factor(dtst$pred))
print(cm)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 198976 57939
## passive 81121 175794
##
## Accuracy : 0.7294
## 95% CI : (0.7281, 0.7306)
## No Information Rate : 0.5451
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.4587
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.7104
## Specificity : 0.7521
## Pos Pred Value : 0.7745
## Neg Pred Value : 0.6842
## Prevalence : 0.5451
## Detection Rate : 0.3872
## Detection Prevalence : 0.5000
## Balanced Accuracy : 0.7312
##
## 'Positive' Class : active
##
print(cm$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.7103825 0.7521146 0.7744818
## Neg Pred Value Precision Recall
## 0.6842497 0.7744818 0.7103825
## F1 Prevalence Detection Rate
## 0.7410486 0.5451161 0.3872409
## Detection Prevalence Balanced Accuracy
## 0.5000000 0.7312485
# 2.5 dB value from the literature
dtst$pred <- NA
dtst$pred[dtst$Diff > 2.5] <- "active"
dtst$pred[dtst$Diff <= 2.5] <- "passive"

dtst$pred <- factor(dtst$pred)
cm <- confusionMatrix(dtst$Class, dtst$pred)
print(cm)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 143475 113440
## passive 48093 208822
##
## Accuracy : 0.6856
## 95% CI : (0.6844, 0.6869)
## No Information Rate : 0.6272
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.3713
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.7490
## Specificity : 0.6480
## Pos Pred Value : 0.5585
## Neg Pred Value : 0.8128
## Prevalence : 0.3728
## Detection Rate : 0.2792
## Detection Prevalence : 0.5000
## Balanced Accuracy : 0.6985
##
## 'Positive' Class : active
##
print(cm$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.7489508 0.6479883 0.5584532
## Neg Pred Value Precision Recall
## 0.8128058 0.5584532 0.7489508
## F1 Prevalence Detection Rate
## 0.6398236 0.3728237 0.2792266
## Detection Prevalence Balanced Accuracy
## 0.5000000 0.6984695
Since activity observations are not continuous but signal recording on the tRackIT stations is, we first have to calculate the signal strength difference on the raw data and then match it to the ground truth observations.
# list raw signals
wp <- list.files("validation/woodpecker/data/raw/", full.names = TRUE)

# calculate signal strength difference
wp_tst <- plyr::ldply(wp, function(x){
  tmp <- fread(x)
  tmp <- tmp[order(tmp$timestamp), ]
  tmp <- tmp %>% mutate(Diff = abs(max_signal - lag(max_signal)))
  return(tmp)
})

wp_tst$timestamp <- lubridate::with_tz(wp_tst$timestamp, "CET")

# get observations and merge by timestamp
wp_gtruth <- readRDS("validation/woodpecker/data/woodpecker_groundtruth.rds")
wp_tst <- merge(wp_gtruth, wp_tst, all.x = TRUE)

wp_tst$pred <- NA
wp_tst$pred[wp_tst$Diff > ac$maximum] <- "active"
wp_tst$pred[wp_tst$Diff <= ac$maximum] <- "passive"

wp_tst$pred <- factor(wp_tst$pred)
wp_tst$observed <- factor(wp_tst$observed)

cm <- confusionMatrix(factor(wp_tst$observed), factor(wp_tst$pred))
print(cm)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 8191 3822
## passive 590 7692
##
## Accuracy : 0.7826
## 95% CI : (0.7769, 0.7883)
## No Information Rate : 0.5673
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.5757
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.9328
## Specificity : 0.6681
## Pos Pred Value : 0.6818
## Neg Pred Value : 0.9288
## Prevalence : 0.4327
## Detection Rate : 0.4036
## Detection Prevalence : 0.5919
## Balanced Accuracy : 0.8004
##
## 'Positive' Class : active
##
print(cm$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.9328095 0.6680563 0.6818447
## Neg Pred Value Precision Recall
## 0.9287612 0.6818447 0.9328095
## F1 Prevalence Detection Rate
## 0.7878234 0.4326681 0.4035969
## Detection Prevalence Balanced Accuracy
## 0.5919192 0.8004329
# evaluate with 2.5 dB value from the literature
wp_tst$pred <- NA
wp_tst$pred[wp_tst$Diff > 2.5] <- "active"
wp_tst$pred[wp_tst$Diff <= 2.5] <- "passive"

wp_tst$pred <- factor(wp_tst$pred)
cm <- confusionMatrix(wp_tst$observed, wp_tst$pred)
print(cm)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 5499 6514
## passive 284 7998
##
## Accuracy : 0.665
## 95% CI : (0.6585, 0.6715)
## No Information Rate : 0.7151
## P-Value [Acc > NIR] : 1
##
## Kappa : 0.3792
##
## Mcnemar's Test P-Value : <2e-16
##
## Sensitivity : 0.9509
## Specificity : 0.5511
## Pos Pred Value : 0.4578
## Neg Pred Value : 0.9657
## Prevalence : 0.2849
## Detection Rate : 0.2710
## Detection Prevalence : 0.5919
## Balanced Accuracy : 0.7510
##
## 'Positive' Class : active
##
print(cm$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.9508905 0.5511301 0.4577541
## Neg Pred Value Precision Recall
## 0.9657088 0.4577541 0.9508905
## F1 Prevalence Detection Rate
## 0.6180040 0.2849470 0.2709534
## Detection Prevalence Balanced Accuracy
## 0.5919192 0.7510103
Human activity observations are also not continuous, so we have to calculate the signal strength difference for each individual on the raw data and then match it to the ground truth observations.
hm_dirs <- list.dirs("validation/human/data/", full.names = TRUE)
hm_dirs <- hm_dirs[grep("raw", hm_dirs)]

hm_tst <- plyr::ldply(hm_dirs, function(d){
  fls <- list.files(d, full.names = TRUE)
  tmp_dat <- plyr::ldply(fls, function(x){
    tmp <- fread(x)
    tmp <- tmp[order(tmp$timestamp), ]
    tmp <- tmp %>% mutate(Diff = abs(max_signal - lag(max_signal)))
    return(tmp)
  })
  return(tmp_dat)
})

# get observations and merge
hm_gtruth <- readRDS("validation/human/data/human_walk_groundtruth.rds")
hm_tst <- merge(hm_gtruth, hm_tst, all.x = TRUE)
hm_tst <- hm_tst[!duplicated(hm_tst$timestamp), ]

# evaluate based on optimized threshold
hm_tst$pred <- NA
hm_tst$pred[hm_tst$Diff > ac$maximum] <- "active"
hm_tst$pred[hm_tst$Diff <= ac$maximum] <- "passive"

hm_tst$pred <- factor(hm_tst$pred)
hm_tst$observed <- factor(hm_tst$observation)

cm <- confusionMatrix(hm_tst$observed, hm_tst$pred)
print(cm)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 9613 2292
## passive 143 2030
##
## Accuracy : 0.827
## 95% CI : (0.8207, 0.8333)
## No Information Rate : 0.693
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.5282
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.9853
## Specificity : 0.4697
## Pos Pred Value : 0.8075
## Neg Pred Value : 0.9342
## Prevalence : 0.6930
## Detection Rate : 0.6828
## Detection Prevalence : 0.8456
## Balanced Accuracy : 0.7275
##
## 'Positive' Class : active
##
print(cm$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.9853424 0.4696900 0.8074759
## Neg Pred Value Precision Recall
## 0.9341924 0.8074759 0.9853424
## F1 Prevalence Detection Rate
## 0.8875860 0.6929962 0.6828385
## Detection Prevalence Balanced Accuracy
## 0.8456457 0.7275162
#print(cm$table)

# evaluate based on 2.5 dB value from the literature
hm_tst$pred <- NA
hm_tst$pred[hm_tst$Diff > 2.5] <- "active"
hm_tst$pred[hm_tst$Diff <= 2.5] <- "passive"

hm_tst$pred <- factor(hm_tst$pred)
cm <- confusionMatrix(hm_tst$observed, hm_tst$pred)
print(cm)
## Confusion Matrix and Statistics
##
## Reference
## Prediction active passive
## active 7036 4869
## passive 29 2144
##
## Accuracy : 0.6521
## 95% CI : (0.6441, 0.66)
## No Information Rate : 0.5018
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.3024
##
## Mcnemar's Test P-Value : < 2.2e-16
##
## Sensitivity : 0.9959
## Specificity : 0.3057
## Pos Pred Value : 0.5910
## Neg Pred Value : 0.9867
## Prevalence : 0.5018
## Detection Rate : 0.4998
## Detection Prevalence : 0.8456
## Balanced Accuracy : 0.6508
##
## 'Positive' Class : active
##
print(cm$byClass)
## Sensitivity Specificity Pos Pred Value
## 0.9958953 0.3057180 0.5910122
## Neg Pred Value Precision Recall
## 0.9866544 0.5910122 0.9958953
## F1 Prevalence Detection Rate
## 0.7418028 0.5018469 0.4997869
## Detection Prevalence Balanced Accuracy
## 0.8456457 0.6508066
print(cm$table)
## Reference
## Prediction active passive
## active 7036 4869
## passive 29 2144
When the threshold-based approach is calibrated on an adequate training dataset, it is generally able to separate active and passive behaviour, but its performance metrics (F1 = 0.74, 0.78, 0.89 for bats, woodpecker and human, respectively) are between 10 and 20 points worse and more variable than those of our random forest models (F1 = 0.97, 0.97, 0.98 for bats, woodpecker and human). With F-scores between 0.6 and 0.74, the threshold value proposed in the literature performed considerably worse.
Since only the bat test dataset is balanced, while the woodpecker data is slightly imbalanced and the human activity dataset is highly imbalanced, let us also take a look at a metric that takes the class distribution into account:
Cohen’s kappa is defined as:
Kappa = (p_0 - p_e) / (1 - p_e)
where p_0 is the overall accuracy of the model and p_e is the agreement between the model predictions and the actual class values that would be expected by chance.
Cohen’s kappa is always less than or equal to 1. Values of 0 or less indicate that the classifier is no better than chance. Landis and Koch (1977) provide a way to characterise these values: according to their scheme, a value < 0 indicates no agreement, 0–0.20 slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement, and 0.81–1 almost perfect agreement.
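To make this concrete, the following minimal sketch (with made-up counts, not taken from the results above) computes p_0, p_e and Kappa directly from a 2x2 confusion table and maps the result onto the Landis and Koch scale; caret's confusionMatrix() reports the same Kappa statistic:

# hypothetical 2x2 confusion table (rows = prediction, columns = reference)
tab <- matrix(c(90, 10,
                20, 80),
              nrow = 2, byrow = TRUE,
              dimnames = list(pred = c("active", "passive"),
                              ref  = c("active", "passive")))

n   <- sum(tab)
p_0 <- sum(diag(tab)) / n                      # observed accuracy
p_e <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
kappa <- (p_0 - p_e) / (1 - p_e)

# Landis and Koch (1977) agreement category
cut(kappa,
    breaks = c(-Inf, 0, 0.21, 0.41, 0.61, 0.81, Inf), right = FALSE,
    labels = c("no agreement", "slight", "fair", "moderate",
               "substantial", "almost perfect"))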
Kappa values based on the 2.5 dB separation value from the literature ranged between 0.3 (humans) and 0.38 (woodpecker), i.e. a fair agreement. For the optimized threshold Kappa values were significantly better in all cases (0.46, 0.58, 0.53; bats, woodpecker, humans); i.e. moderate agreement. However, even the best Kappa value for the threshold based approach only showed a moderate agreement while all Kappa values based on the random-forest model showed an almost perfect agreement ( 0.94, 0.94, 0.90 ; bats, woodpecker, humans ).