2019 Data overview

distribution

This short ‘analysis’ script demonstrates how we can use GitHub to create automated reports on the collected data collaboratively. The following table summarizes all sensor locations used in 2019:

All of 2019’s HOBO locations
hobo_id   radiation_influence  longitude  latitude  min_t     max_t     mean_t    p90_t
10088310  high                 7.806718   48.01462  2.935209  3.349429  3.150005  3.310494
10347312  moderate             7.846758   47.99385  3.597357  4.183214  3.869199  4.112050
10347315  moderate             7.856484   47.99165  3.000959  3.534959  3.263331  3.478476
10347318  low                  7.845553   48.00467  2.468881  2.799980  2.620185  2.757361
10347320  moderate             7.815115   47.99623  3.250396  3.660208  3.443525  3.606123
10347325  moderate             7.852197   47.99449  3.565740  4.078900  3.815450  4.010816
10347326  very high            7.815046   47.99625  3.730811  4.406255  4.055476  4.325709
10347334  high                 7.832835   47.99594  2.929849  3.201699  3.060031  3.168557
10347365  moderate             7.857000   48.00600  3.624805  3.828831  3.727947  3.808827
10347370  moderate             7.837758   48.00113  2.724418  3.183245  2.949453  3.137543
10347386  moderate             7.820571   47.99443  3.176835  3.504983  3.337344  3.473536
10347392  moderate             7.821586   48.00696  4.095670  4.401841  4.247455  4.364576
10349993  low                  7.852334   48.01661  3.452544  3.889417  3.662897  3.836199
10350007  low                  7.835518   47.98811  3.277500  3.585531  3.425900  3.552456
10350009  moderate             7.842000   47.99100  3.021436  3.251051  3.150306  3.234738
10350010  moderate             7.904306   47.98636  3.074440  3.565604  3.293331  3.498308
10350042  high                 7.809647   47.98841  2.878318  3.201388  3.034553  3.164152
10350043  very high            7.815046   47.99625  2.719985  3.021985  2.864837  2.986971
10350049  high                 7.828954   47.97359  3.309319  3.844928  3.570272  3.786965
10350057  high                 7.833801   47.99558  3.256022  3.625774  3.428990  3.580334
10350062  high                 7.831752   48.01040  2.500043  2.841713  2.665629  2.805714
10350081  moderate             7.836256   47.98446  3.697644  3.963889  3.825795  3.933556
10350090  moderate             7.836111   47.98450  3.511293  3.773598  3.633039  3.737476
10350097  high                 7.815128   47.99635  3.459292  3.756738  3.611459  3.725979
10350099  moderate             7.855889   48.02044  3.917778  4.143911  4.025511  4.114489
10610853  high                 7.849450   47.98659  2.296417  3.468750  2.952569  3.376808
10760706  moderate             7.868000   48.04020  2.992475  3.482252  3.224680  3.423634
10760710  moderate             7.811149   48.01132  4.796038  5.185191  4.979109  5.138973
10760763  low                  7.866200   48.00960  2.844423  3.120141  2.979783  3.090195
10760810  moderate             7.853889   48.01222  3.397208  3.620208  3.509257  3.597540
10760814  moderate             7.851530   48.01164  2.661575  2.894356  2.770222  2.864765
10760823  low                  7.856944   47.94222  1.874965  2.446047  2.145790  2.371102
10801132  moderate             7.853889   48.01222  3.252114  3.607371  3.418845  3.562727
10801134  moderate             7.846786   48.00408  2.455714  2.814943  2.632527  2.772409

The database is queried when this RMarkdown file is compiled; check the source for the corresponding SQL code. As you can see, no username or password is hardcoded in the file: this information is passed in through environment variables. The calculated indices are the means of the hourly minimum, maximum, mean and 90th-percentile temperatures, aggregated live by the query. Remember that the database stores 5-minute data. We also exclude the first few days and the last day of the campaign, and we only use temperatures recorded at a light intensity of 250 < light < 2500.
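The aggregation described above can be sketched in plain R. This is only an illustration on synthetic data, not the SQL used in the report: the column names `tstamp`, `temperature` and `light`, the simulated values, and the choice to apply the light filter before the hourly aggregation are all assumptions. The exclusion of the first and last campaign days is left out for brevity.

```r
# Synthetic 5-minute readings for one hypothetical logger (not real campaign data).
set.seed(42)
tstamp <- seq(as.POSIXct("2019-12-10 00:00:00", tz = "UTC"),
              by = "5 min", length.out = 12 * 48)
raw <- data.frame(
  tstamp      = tstamp,
  temperature = 3 + sin(seq_along(tstamp) / 40) + rnorm(length(tstamp), sd = 0.1),
  light       = runif(length(tstamp), min = 0, max = 3000)
)

# Keep only readings inside the light window 250 < light < 2500.
used <- raw[raw$light > 250 & raw$light < 2500, ]

# Step 1: aggregate the 5-minute readings to hourly statistics.
used$hour <- format(used$tstamp, "%Y-%m-%d %H")
hourly <- do.call(rbind, lapply(split(used$temperature, used$hour), function(x) {
  data.frame(min_t  = min(x),
             max_t  = max(x),
             mean_t = mean(x),
             p90_t  = unname(quantile(x, 0.9)))
}))

# Step 2: average the hourly statistics to get one set of indices per logger.
indices <- colMeans(hourly)
print(round(indices, 3))
```

Because each index is a mean of hourly statistics, `min_t` can never exceed `mean_t`, and `mean_t` and `p90_t` can never exceed `max_t`, which is a quick sanity check for the table above.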

An overview of all locations used and their spatial distribution is given below:

Let’s group by radiation_influence and have a look at the mean temperatures:

library(magrittr)  # provides the %>% pipe
library(ggplot2)

hobos.2019 %>%
  ggplot(aes(x=radiation_influence, y=mean_t)) +
  geom_violin()
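As a numeric companion to the violin plot, the group means can be computed directly. The following is a minimal sketch using only six rows copied from the table above, so the numbers it prints are not the group means of the full data set; the data frame name `hobos` and the ordering of the categories are assumptions.

```r
# A few rows copied from the table above (not the full data set).
hobos <- data.frame(
  hobo_id = c(10088310, 10347312, 10347315, 10347318, 10347326, 10347334),
  radiation_influence = c("high", "moderate", "moderate", "low", "very high", "high"),
  mean_t = c(3.150005, 3.869199, 3.263331, 2.620185, 4.055476, 3.060031)
)

# Order the categories by increasing influence instead of alphabetically.
hobos$radiation_influence <- factor(
  hobos$radiation_influence,
  levels = c("low", "moderate", "high", "very high")
)

# Mean of mean_t within each radiation_influence group.
group_means <- aggregate(mean_t ~ radiation_influence, data = hobos, FUN = mean)
print(group_means)
```

Turning `radiation_influence` into an ordered set of factor levels would also put the violins in the same low-to-very-high order on the x axis.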

analysis

Before you continue with an in-depth analysis, or draw a violin plot or boxplot for each index, let’s look at the correlations between the indices:

correlation(hobos.2019)
##        min_t max_t mean_t p90_t
## min_t   1.00  0.95   0.98  0.96
## max_t   0.95  1.00   0.99  1.00
## mean_t  0.98  0.99   1.00  0.99
## p90_t   0.96  1.00   0.99  1.00
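`correlation()` is not a base R function, so it is presumably a helper defined in the source of this report. Assuming it computes Pearson correlations between the four index columns, a comparable matrix can be obtained with base R's `cor()`. The sketch below uses six rows copied from the table above, so its values will differ from the matrix shown for all 34 loggers.

```r
# Index columns of the first six rows of the table above.
hobos <- data.frame(
  min_t  = c(2.935209, 3.597357, 3.000959, 2.468881, 3.250396, 3.565740),
  max_t  = c(3.349429, 4.183214, 3.534959, 2.799980, 3.660208, 4.078900),
  mean_t = c(3.150005, 3.869199, 3.263331, 2.620185, 3.443525, 3.815450),
  p90_t  = c(3.310494, 4.112050, 3.478476, 2.757361, 3.606123, 4.010816)
)

# Pairwise Pearson correlations (the default method), rounded for display.
corr <- round(cor(hobos), 2)
print(corr)
```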

It looks like there is a high correlation between the average hourly minimum and the average hourly mean temperature. Let’s follow this up with a paired t-test, which checks whether the two indices also differ in their means.

t.test(hobos.2019$min_t, hobos.2019$mean_t, paired = TRUE)
## 
##  Paired t-test
## 
## data:  hobos.2019$min_t and hobos.2019$mean_t
## t = -11.411, df = 33, p-value = 5.38e-13
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.2286325 -0.1594397
## sample estimates:
## mean of the differences 
##              -0.1940361
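Note that a paired t-test and a correlation answer different questions: the t-test asks whether the mean of the pairwise differences is zero, while a correlation test asks whether the two indices vary together. The sketch below illustrates both on six rows copied from the table above (so the statistics differ from the full-data output), and reproduces the paired t statistic by hand. The variable names are illustrative only.

```r
# min_t and mean_t of the first six rows of the table above.
min_t  <- c(2.935209, 3.597357, 3.000959, 2.468881, 3.250396, 3.565740)
mean_t <- c(3.150005, 3.869199, 3.263331, 2.620185, 3.443525, 3.815450)

# Paired t-test: is the mean of the pairwise differences different from 0?
tt <- t.test(min_t, mean_t, paired = TRUE)

# The same statistic by hand: mean difference divided by its standard error.
d <- min_t - mean_t
manual_t <- mean(d) / (sd(d) / sqrt(length(d)))

# Correlation test: is the Pearson correlation different from 0?
ct <- cor.test(min_t, mean_t)

print(c(t_paired = unname(tt$statistic), t_manual = manual_t))
print(ct$estimate)
```

Two indices can be almost perfectly correlated and still differ significantly in their means, which is what the small p-value of the paired t-test indicates here.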