This short ‘analysis’ script demonstrates how GitHub can be used to create automated reports on the collected data in a collaborative way. The following table summarizes all sensor locations used in 2019:
hobo_id | radiation_influence | longitude | latitude | min_t | max_t | mean_t | p90_t |
---|---|---|---|---|---|---|---|
10088310 | high | 7.806718 | 48.01462 | 2.935209 | 3.349429 | 3.150005 | 3.310494 |
10347312 | moderate | 7.846758 | 47.99385 | 3.597357 | 4.183214 | 3.869199 | 4.112050 |
10347315 | moderate | 7.856484 | 47.99165 | 3.000959 | 3.534959 | 3.263331 | 3.478476 |
10347318 | low | 7.845553 | 48.00467 | 2.468881 | 2.799980 | 2.620185 | 2.757361 |
10347320 | moderate | 7.815115 | 47.99623 | 3.250396 | 3.660208 | 3.443525 | 3.606123 |
10347325 | moderate | 7.852197 | 47.99449 | 3.565740 | 4.078900 | 3.815450 | 4.010816 |
10347326 | very high | 7.815046 | 47.99625 | 3.730811 | 4.406255 | 4.055476 | 4.325709 |
10347334 | high | 7.832835 | 47.99594 | 2.929849 | 3.201699 | 3.060031 | 3.168557 |
10347365 | moderate | 7.857000 | 48.00600 | 3.624805 | 3.828831 | 3.727947 | 3.808827 |
10347370 | moderate | 7.837758 | 48.00113 | 2.724418 | 3.183245 | 2.949453 | 3.137543 |
10347386 | moderate | 7.820571 | 47.99443 | 3.176835 | 3.504983 | 3.337344 | 3.473536 |
10347392 | moderate | 7.821586 | 48.00696 | 4.095670 | 4.401841 | 4.247455 | 4.364576 |
10349993 | low | 7.852334 | 48.01661 | 3.452544 | 3.889417 | 3.662897 | 3.836199 |
10350007 | low | 7.835518 | 47.98811 | 3.277500 | 3.585531 | 3.425900 | 3.552456 |
10350009 | moderate | 7.842000 | 47.99100 | 3.021436 | 3.251051 | 3.150306 | 3.234738 |
10350010 | moderate | 7.904306 | 47.98636 | 3.074440 | 3.565604 | 3.293331 | 3.498308 |
10350042 | high | 7.809647 | 47.98841 | 2.878318 | 3.201388 | 3.034553 | 3.164152 |
10350043 | very high | 7.815046 | 47.99625 | 2.719985 | 3.021985 | 2.864837 | 2.986971 |
10350049 | high | 7.828954 | 47.97359 | 3.309319 | 3.844928 | 3.570272 | 3.786965 |
10350057 | high | 7.833801 | 47.99558 | 3.256022 | 3.625774 | 3.428990 | 3.580334 |
10350062 | high | 7.831752 | 48.01040 | 2.500043 | 2.841713 | 2.665629 | 2.805714 |
10350081 | moderate | 7.836256 | 47.98446 | 3.697644 | 3.963889 | 3.825795 | 3.933556 |
10350090 | moderate | 7.836111 | 47.98450 | 3.511293 | 3.773598 | 3.633039 | 3.737476 |
10350097 | high | 7.815128 | 47.99635 | 3.459292 | 3.756738 | 3.611459 | 3.725979 |
10350099 | moderate | 7.855889 | 48.02044 | 3.917778 | 4.143911 | 4.025511 | 4.114489 |
10610853 | high | 7.849450 | 47.98659 | 2.296417 | 3.468750 | 2.952569 | 3.376808 |
10760706 | moderate | 7.868000 | 48.04020 | 2.992475 | 3.482252 | 3.224680 | 3.423634 |
10760710 | moderate | 7.811149 | 48.01132 | 4.796038 | 5.185191 | 4.979109 | 5.138973 |
10760763 | low | 7.866200 | 48.00960 | 2.844423 | 3.120141 | 2.979783 | 3.090195 |
10760810 | moderate | 7.853889 | 48.01222 | 3.397208 | 3.620208 | 3.509257 | 3.597540 |
10760814 | moderate | 7.851530 | 48.01164 | 2.661575 | 2.894356 | 2.770222 | 2.864765 |
10760823 | low | 7.856944 | 47.94222 | 1.874965 | 2.446047 | 2.145790 | 2.371102 |
10801132 | moderate | 7.853889 | 48.01222 | 3.252114 | 3.607371 | 3.418845 | 3.562727 |
10801134 | moderate | 7.846786 | 48.00408 | 2.455714 | 2.814943 | 2.632527 | 2.772409 |
The database is queried when this RMarkdown file is compiled (knitted). Check the source for the corresponding SQL code. As you can see, no username or password is hardcoded in the file; this information is passed in through environment variables. The calculated indices are the means of the hourly minimum, maximum, mean and 90th percentile temperatures, aggregated live by the query. Remember that the database stores 5-minute data. We also exclude the first few days and the last day of the campaign, and only use temperatures recorded while the light intensity was in the range 250 < light < 2500.
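The report computes the indices in SQL, which is not shown here. As a rough stand-in, the same aggregation logic can be sketched in base R. Everything in this block is an assumption made for illustration: the data are simulated (not campaign data), and the names `raw`, `tstamp`, `temperature` and `light` are hypothetical, not the real database columns.

```r
# Simulated 5-minute readings for one hypothetical sensor (NOT campaign data).
set.seed(42)
n <- 12 * 48  # two days of 5-minute steps
raw <- data.frame(
  tstamp      = as.POSIXct("2019-12-10 00:00:00", tz = "UTC") +
                seq(0, by = 300, length.out = n),
  temperature = rnorm(n, mean = 3, sd = 0.5),
  light       = runif(n, min = 0, max = 3000)
)

# Keep only readings inside the light window 250 < light < 2500.
used <- raw[raw$light > 250 & raw$light < 2500, ]

# Truncate timestamps to the hour and aggregate each hour separately.
used$hour <- format(used$tstamp, "%Y-%m-%d %H")
hourly <- do.call(rbind, lapply(split(used$temperature, used$hour), function(x) {
  data.frame(
    min_t  = min(x),
    max_t  = max(x),
    mean_t = mean(x),
    p90_t  = unname(quantile(x, 0.9))
  )
}))

# The index values are the means of the hourly aggregates.
indices <- colMeans(hourly)
print(round(indices, 3))
```

Averaging hourly aggregates, rather than aggregating all 5-minute values at once, means that `min_t` and `max_t` describe the typical within-hour range and not the campaign-wide extremes.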
An overview of all locations used and their spatial distribution is given below:
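Let’s group by radiation_influence and have a look at the mean temperatures:

    # %>% is provided by magrittr (and re-exported by dplyr); the plot needs ggplot2
    library(magrittr)
    library(ggplot2)

    hobos.2019 %>%
      ggplot(aes(x = radiation_influence, y = mean_t)) +
      geom_violin()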
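A numeric summary per group can complement the violin plot. The sketch below uses only a subset of the table above (all "low" and "very high" sensors plus the first three "high" ones), so its group means are not those of the full data set. The name `hobos_subset` is hypothetical. Storing `radiation_influence` as a factor with explicit levels is an optional choice that also keeps the groups in a meaningful order, because a character column is sorted alphabetically.

```r
# Subset of the table above: mean_t for selected sensors.
hobos_subset <- data.frame(
  radiation_influence = c("low", "low", "low", "low", "low",
                          "very high", "very high",
                          "high", "high", "high"),
  mean_t = c(2.620185, 3.662897, 3.425900, 2.979783, 2.145790,
             4.055476, 2.864837,
             3.150005, 3.060031, 3.034553)
)

# Order the categories by increasing influence instead of alphabetically.
hobos_subset$radiation_influence <- factor(
  hobos_subset$radiation_influence,
  levels = c("low", "moderate", "high", "very high")
)

# Mean of mean_t per group; levels without data are dropped.
summary_by_group <- aggregate(mean_t ~ radiation_influence,
                              data = hobos_subset, FUN = mean)
print(summary_by_group)
```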
Before continuing with an in-depth analysis, or drawing a violin plot or boxplot for each index, let’s look at the correlations between the indices:

    correlation(hobos.2019)
    ##        min_t max_t mean_t p90_t
    ## min_t   1.00  0.95   0.98  0.96
    ## max_t   0.95  1.00   0.99  1.00
    ## mean_t  0.98  0.99   1.00  0.99
    ## p90_t   0.96  1.00   0.99  1.00
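`correlation()` is not a base R function; it is presumably a helper defined in the source of this report. A minimal sketch of a similar matrix, assuming plain Pearson correlations are what is wanted, uses `cor()` from the stats package. Only the first six rows of the table above are used here, so the numbers will differ from the full matrix.

```r
# First six rows of the table above, index columns only.
indices <- data.frame(
  min_t  = c(2.935209, 3.597357, 3.000959, 2.468881, 3.250396, 3.565740),
  max_t  = c(3.349429, 4.183214, 3.534959, 2.799980, 3.660208, 4.078900),
  mean_t = c(3.150005, 3.869199, 3.263331, 2.620185, 3.443525, 3.815450),
  p90_t  = c(3.310494, 4.112050, 3.478476, 2.757361, 3.606123, 4.010816)
)

# Pearson correlation between every pair of columns, rounded to two digits.
cor_matrix <- round(cor(indices), 2)
print(cor_matrix)
```

Note that `cor()` only accepts numeric input, so columns such as `radiation_influence` have to be dropped first.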
It looks like there is a high correlation between the average hourly minimum and the average hourly mean temperature. A high correlation does not imply that the two indices take the same values, so let’s check with a paired t-test whether they differ systematically.

    t.test(hobos.2019$min_t, hobos.2019$mean_t, paired = TRUE)
    ## 
    ##  Paired t-test
    ## 
    ## data:  hobos.2019$min_t and hobos.2019$mean_t
    ## t = -11.411, df = 33, p-value = 5.38e-13
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  -0.2286325 -0.1594397
    ## sample estimates:
    ## mean of the differences 
    ##              -0.1940361
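The paired t-test and a correlation test answer different questions: the former asks whether the two indices differ in level, the latter whether they move together. The sketch below runs both on the first eight rows of the table above, so its statistics are not those of the full data set. The variable names `association` and `level_shift` are chosen for illustration only.

```r
# First eight rows of the table above.
min_t  <- c(2.935209, 3.597357, 3.000959, 2.468881,
            3.250396, 3.565740, 3.730811, 2.929849)
mean_t <- c(3.150005, 3.869199, 3.263331, 2.620185,
            3.443525, 3.815450, 4.055476, 3.060031)

# Do the indices move together? Tests H0: correlation = 0.
association <- cor.test(min_t, mean_t)

# Do the indices differ in level? Tests H0: mean difference = 0.
level_shift <- t.test(min_t, mean_t, paired = TRUE)

print(association$estimate)   # Pearson correlation coefficient
print(level_shift$estimate)   # mean of the paired differences
```

Two series can be almost perfectly correlated and still be offset from each other, which is why both tests can reject their null hypotheses at the same time.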