Summarising en-net online forum statistics

The ennet package also includes analytic functions that summarise the text data available from the en-net online forum. Currently, there are four sets of analytic functions available from ennet:

Counting number of topics/questions

Counting the number of topics or questions raised within the en-net online forum is a basic but useful analysis that can serve as a proxy for the relative importance of a thematic area within the forum. Three sets of functions support this: 1) count_topics_time; 2) count_topics_theme; and 3) count_topics_author.

Counting number of topics/questions by time

The count_topics_time set consists of four functions that count topics by day, by week, by month, or by year. To count the number of topics/questions per day, use the count_topics_day function as follows:

count_topics_day()

which results in:

#> # A tibble: 1,806 x 2
#>    day            n
#>  * <date>     <int>
#>  1 2009-02-14     1
#>  2 2009-02-18     1
#>  3 2009-02-19     2
#>  4 2009-02-20     1
#>  5 2009-02-23     1
#>  6 2009-02-24     2
#>  7 2009-02-25     3
#>  8 2009-03-02     1
#>  9 2009-03-03     1
#> 10 2009-03-07     1
#> # … with 1,796 more rows

By default, the output of count_topics_day is not sorted by the number of topics/questions per day. To sort by that count, set .sort to TRUE:

count_topics_day(.sort = TRUE)

which gives the following output:

#> # A tibble: 1,806 x 2
#>    day            n
#>    <date>     <int>
#>  1 2015-06-02    12
#>  2 2019-05-13    11
#>  3 2015-08-13     8
#>  4 2016-09-30     8
#>  5 2014-08-19     7
#>  6 2018-04-09     7
#>  7 2020-04-02     7
#>  8 2012-05-16     6
#>  9 2015-05-07     6
#> 10 2016-11-11     6
#> # … with 1,796 more rows
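To make the counting and sorting concrete, here is a minimal base R sketch of the same idea on made-up data. It is not the ennet implementation; the `topics` data frame and its values are hypothetical, and only the `day` and `n` column names are taken from the output above.

```r
# Hypothetical topics, one row per topic, with made-up posting dates.
topics <- data.frame(
  Posted = as.Date(c("2009-02-14", "2009-02-18", "2009-02-19",
                     "2009-02-19", "2009-02-25", "2009-02-25",
                     "2009-02-25"))
)

# Count topics per day: one row per distinct posting date, in date order.
counts <- as.data.frame(table(day = topics$Posted),
                        responseName = "n",
                        stringsAsFactors = FALSE)
counts$day <- as.Date(counts$day)

# Sorted variant: largest counts first, ties broken by date.
sorted <- counts[order(-counts$n, counts$day), ]

print(counts)
print(sorted)
```

The first table is ordered by date and the second by count, which mirrors the difference between the default output and the output with .sort set to TRUE.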

Counting topics/questions by week, month, or year only requires the function named after the desired time interval. For weekly counts, use count_topics_week; for monthly counts, use count_topics_month; and for yearly counts, use count_topics_year.

## Count topics/questions by week
count_topics_week()

## Count topics/questions by month
count_topics_month()

## Count topics/questions by year
count_topics_year()
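Counting by week, month, or year amounts to assigning each posting date to the start of its interval and then counting per interval. The base R sketch below illustrates this on made-up dates. The helpers `floor_to` and `count_by` are hypothetical names introduced for illustration, and weeks are assumed to start on Monday (the default of `cut()` for dates); how ennet bins dates internally may differ.

```r
# Hypothetical posting dates.
posted <- as.Date(c("2015-06-01", "2015-06-02", "2015-06-02",
                    "2015-06-09", "2015-07-15", "2016-01-04"))

# Floor each date to the first day of its week (Monday), month, or year.
floor_to <- function(x, unit) as.Date(as.character(cut(x, unit)))

# Count dates per interval; intervals with no topics are not listed.
count_by <- function(x, unit) {
  bins <- floor_to(x, unit)
  out <- as.data.frame(table(bins), responseName = "n",
                       stringsAsFactors = FALSE)
  names(out)[1] <- unit
  out[[unit]] <- as.Date(out[[unit]])
  out
}

by_week  <- count_by(posted, "week")
by_month <- count_by(posted, "month")
by_year  <- count_by(posted, "year")

print(by_week)
print(by_month)
print(by_year)
```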

Counting number of topics/questions by theme

The count_topics_theme set of functions consists of two functions that count topics/questions by theme, and count topics/questions by theme and by time.

Counting of topics by theme is specified as follows:

count_topics_theme()

which results in:

#> # A tibble: 18 x 2
#>    Theme                                                              n
#>    <chr>                                                          <int>
#>  1 Announcements & Nutritionists needed                            1245
#>  2 Management of wasting/acute malnutrition                         538
#>  3 Assessment and Surveillance                                      458
#>  4 Infant and young child feeding interventions                     203
#>  5 Coverage assessment                                               94
#>  6 Upcoming trainings                                                93
#>  7 COVID-19 and nutrition programming                                84
#>  8 Micronutrients                                                    55
#>  9 Other thematic area                                               52
#> 10 Scaling Up Nutrition (SUN)                                        51
#> 11 Cross-cutting issues                                              47
#> 12 Food assistance                                                   32
#> 13 Management of At Risk Mothers and Infants                         27
#> 14 Prevention and management of stunting                             21
#> 15 Partnerships for research                                         18
#> 16 Adolescent nutrition                                              13
#> 17 Simplified Approaches for the Management of Acute Malnutrition    11
#> 18 Multi-sector nutrition programming                                 3

By default, the results provided by count_topics_theme are not sorted by number of topics/questions. To sort, set .sort to TRUE:

count_topics_theme(.sort = TRUE)

which results in:

#> # A tibble: 18 x 2
#>    Theme                                                              n
#>    <chr>                                                          <int>
#>  1 Announcements & Nutritionists needed                            1246
#>  2 Management of wasting/acute malnutrition                         539
#>  3 Assessment and Surveillance                                      458
#>  4 Infant and young child feeding interventions                     206
#>  5 Coverage assessment                                               94
#>  6 Upcoming trainings                                                93
#>  7 COVID-19 and nutrition programming                                85
#>  8 Micronutrients                                                    55
#>  9 Other thematic area                                               52
#> 10 Scaling Up Nutrition (SUN)                                        51
#> 11 Cross-cutting issues                                              47
#> 12 Food assistance                                                   32
#> 13 Management of At Risk Mothers and Infants                         28
#> 14 Prevention and management of stunting                             21
#> 15 Partnerships for research                                         18
#> 16 Adolescent nutrition                                              14
#> 17 Simplified Approaches for the Management of Acute Malnutrition    11
#> 18 Multi-sector nutrition programming                                 3

Topics/questions can also be counted by theme and by time using the count_topics_theme_time function. For example, counting topics/questions by theme by day is done as follows:

count_topics_theme_time(by_time = "day")

which gives the following output:

#> # A tibble: 2,496 x 3
#>    Theme                                day            n
#>    <chr>                                <date>     <int>
#>  1 Announcements & Nutritionists needed 2015-06-02    12
#>  2 Announcements & Nutritionists needed 2019-05-13    11
#>  3 Announcements & Nutritionists needed 2015-08-13     7
#>  4 Announcements & Nutritionists needed 2015-05-07     6
#>  5 Scaling Up Nutrition (SUN)           2016-11-11     6
#>  6 Announcements & Nutritionists needed 2013-10-21     5
#>  7 Announcements & Nutritionists needed 2014-09-11     5
#>  8 Announcements & Nutritionists needed 2015-08-11     5
#>  9 Announcements & Nutritionists needed 2015-09-02     5
#> 10 Announcements & Nutritionists needed 2016-03-11     5
#> # … with 2,486 more rows

Unlike the previous count functions, the output of count_topics_theme_time is already sorted by default.

To count topics/questions by theme by week, month, or year, change the by_time argument to the desired time interval as shown below:

## Count topics/questions by theme by week
count_topics_theme_time(by_time = "week")

## Count topics/questions by theme by month
count_topics_theme_time(by_time = "month")

## Count topics/questions by theme by year
count_topics_theme_time(by_time = "year")
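Counting by theme and by time is a count over two keys at once. A minimal base R sketch on made-up data follows; it is an illustration under assumed column names (`Theme`, `day`, `n`, as in the output above), not the package's own code.

```r
# Hypothetical topics with a theme and a posting date.
topics <- data.frame(
  Theme  = c("Micronutrients", "Micronutrients", "Food assistance",
             "Micronutrients", "Food assistance"),
  Posted = as.Date(c("2020-04-02", "2020-04-02", "2020-04-02",
                     "2020-04-03", "2020-05-01")),
  stringsAsFactors = FALSE
)

# Give every topic a weight of 1 and sum the weights within each
# Theme/day combination that actually occurs in the data.
weighted <- data.frame(Theme = topics$Theme, day = topics$Posted, n = 1L)
counts <- aggregate(n ~ Theme + day, data = weighted, FUN = sum)

# Largest counts first, as in the output shown above.
counts <- counts[order(-counts$n), ]
print(counts)
```

Swapping the day for a week, month, or year start date gives the coarser counts.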

Counting number of topics/questions by author

The count_topics_author set of functions consists of two functions that count topics/questions by author, and count topics/questions by author and by time.

Counting of topics by author is specified as follows:

count_topics_author()

which results in:

#> # A tibble: 1,110 x 2
#>    Author                              n
#>    <chr>                           <int>
#>  1 Tamsin Walters                    163
#>  2 Marie Lecuyer                      61
#>  3 Michael ALVES                      58
#>  4 Anonymous 1494                     49
#>  5 Marie McGrath                      49
#>  6 Mark Myatt                         43
#>  7 <NA>                               40
#>  8 Anonymous 81                       39
#>  9 Mija Ververs                       37
#> 10 Nutrition International - NTEAM    32
#> # … with 1,100 more rows

By default, the results provided by count_topics_author are not sorted by number of topics/questions. To sort, set .sort to TRUE:

count_topics_author(.sort = TRUE)

which results in:

#> # A tibble: 1,110 x 2
#>    Author                              n
#>    <chr>                           <int>
#>  1 Tamsin Walters                    163
#>  2 Marie Lecuyer                      61
#>  3 Michael ALVES                      58
#>  4 Anonymous 1494                     49
#>  5 Marie McGrath                      49
#>  6 Mark Myatt                         43
#>  7 <NA>                               40
#>  8 Anonymous 81                       39
#>  9 Mija Ververs                       37
#> 10 Nutrition International - NTEAM    32
#> # … with 1,100 more rows

Topics/questions can also be counted by author and by time using the count_topics_author_time function. For example, counting topics/questions by author by day is done as follows:

count_topics_author_time(by_time = "day")

which gives the following output:

#> # A tibble: 2,750 x 3
#>    Author            day            n
#>    <chr>             <date>     <int>
#>  1 Anonymous 1494    2015-06-02    11
#>  2 Mark Hawkes       2019-05-13    11
#>  3 Andi Kendle       2017-09-25     6
#>  4 GTAM Wasting TWG  2020-08-07     5
#>  5 Isabelle Modigell 2019-06-06     5
#>  6 Marie McGrath     2016-09-30     5
#>  7 Michael ALVES     2014-09-11     5
#>  8 Michael ALVES     2015-09-02     5
#>  9 Thuy Nguyen       2016-11-11     5
#> 10 Alan Mason        2016-09-15     4
#> # … with 2,740 more rows

Unlike the previous count functions, the output of count_topics_author_time is already sorted by default.

To count topics/questions by author by week, month, or year, change the by_time argument to the desired time interval as shown below:

## Count topics/questions by author by week
count_topics_author_time(by_time = "week")

## Count topics/questions by author by month
count_topics_author_time(by_time = "month")

## Count topics/questions by author by year
count_topics_author_time(by_time = "year")

Deprecated count functions

In the current release of the ennet package, the following functions have been deprecated:

These functions have been superseded by the more performant count_topics functions above. If you have been using the deprecated functions, we recommend updating your work to use the new functions described above. The deprecated functions will be made defunct in the next release of the ennet package.

Arranging topics by number of views

Arranging the topics or questions raised within the en-net online forum by number of views can serve as a proxy for the level of interest in a specific topic among those participating in the forum. This is done with the arrange_views function, which ranks topics by number of views per thematic area and per time period. By default, topics are ranked by number of views within each thematic area and within each month and year:

get_themes() %>%
  get_themes_topics() %>%
  arrange_views()

which results in:

#> # A tibble: 3,045 x 9
#>    Theme   Topic          Views Author  Posted     Link      Replies Month  Year
#>    <chr>   <chr>          <int> <chr>   <date>     <chr>       <int> <fct> <dbl>
#>  1 Adoles… Age group of …   738 Anonym… 2020-02-15 https://…       1 Feb    2020
#>  2 Adoles… Launch of res…  2101 Jo Lof… 2018-05-23 https://…       0 May    2018
#>  3 Adoles… Reaching out …  2319 Emily … 2018-06-18 https://…       3 Jun    2018
#>  4 Adoles… adolescent ma…  2245 Anonym… 2018-06-21 https://…       1 Jun    2018
#>  5 Adoles… adolescent ma…  2057 Anonym… 2018-06-21 https://…       1 Jun    2018
#>  6 Adoles… anthropometry…  2025 Emily … 2018-06-29 https://…       2 Jun    2018
#>  7 Adoles… Teen/adolesce…  1985 Anne H… 2018-06-19 https://…       1 Jun    2018
#>  8 Adoles… health educat…  1728 Anonym… 2018-06-19 https://…       1 Jun    2018
#>  9 Adoles… Is there any …  1038 Aisha … 2020-06-12 https://…       1 Jun    2020
#> 10 Adoles… Assessing nut…  2050 Ursula… 2018-07-25 https://…       3 Jul    2018
#> # … with 3,035 more rows

Arranging topics by number of views by thematic area and by year is performed as follows:

get_themes() %>%
  get_themes_topics() %>%
  arrange_views(by_date = "year")

which results in:

#> # A tibble: 3,045 x 9
#>    Theme    Topic        Views Author  Posted     Link       Replies Month  Year
#>    <chr>    <chr>        <int> <chr>   <date>     <chr>        <int> <fct> <dbl>
#>  1 Adolesc… Reaching ou…  2319 Emily … 2018-06-18 https://w…       3 Jun    2018
#>  2 Adolesc… adolescent …  2245 Anonym… 2018-06-21 https://w…       1 Jun    2018
#>  3 Adolesc… Launch of r…  2101 Jo Lof… 2018-05-23 https://w…       0 May    2018
#>  4 Adolesc… adolescent …  2057 Anonym… 2018-06-21 https://w…       1 Jun    2018
#>  5 Adolesc… Assessing n…  2050 Ursula… 2018-07-25 https://w…       3 Jul    2018
#>  6 Adolesc… anthropomet…  2025 Emily … 2018-06-29 https://w…       2 Jun    2018
#>  7 Adolesc… Teen/adoles…  1985 Anne H… 2018-06-19 https://w…       1 Jun    2018
#>  8 Adolesc… Adolescent …  1928 Tamsin… 2018-08-01 https://w…       3 Aug    2018
#>  9 Adolesc… adolescent …  1872 Anonym… 2018-07-04 https://w…       1 Jul    2018
#> 10 Adolesc… MUAC tape f…  1850 Anonym… 2018-09-27 https://w…       1 Sep    2018
#> # … with 3,035 more rows

Arranging topics by number of views by thematic area overall across the years is performed as follows:

get_themes() %>%
  get_themes_topics() %>%
  arrange_views(by_date = "all")

which results in:

#> # A tibble: 3,045 x 9
#>    Theme    Topic        Views Author  Posted     Link       Replies Month  Year
#>    <chr>    <chr>        <int> <chr>   <date>     <chr>        <int> <fct> <dbl>
#>  1 Adolesc… Reaching ou…  2319 Emily … 2018-06-18 https://w…       3 Jun    2018
#>  2 Adolesc… adolescent …  2245 Anonym… 2018-06-21 https://w…       1 Jun    2018
#>  3 Adolesc… Launch of r…  2101 Jo Lof… 2018-05-23 https://w…       0 May    2018
#>  4 Adolesc… adolescent …  2057 Anonym… 2018-06-21 https://w…       1 Jun    2018
#>  5 Adolesc… Assessing n…  2050 Ursula… 2018-07-25 https://w…       3 Jul    2018
#>  6 Adolesc… anthropomet…  2025 Emily … 2018-06-29 https://w…       2 Jun    2018
#>  7 Adolesc… Teen/adoles…  1985 Anne H… 2018-06-19 https://w…       1 Jun    2018
#>  8 Adolesc… Adolescent …  1928 Tamsin… 2018-08-01 https://w…       3 Aug    2018
#>  9 Adolesc… adolescent …  1872 Anonym… 2018-07-04 https://w…       1 Jul    2018
#> 10 Adolesc… MUAC tape f…  1850 Anonym… 2018-09-27 https://w…       1 Sep    2018
#> # … with 3,035 more rows

By default, the output of arrange_views is grouped by thematic area. This default behaviour can be changed by setting the by_theme argument to FALSE. For example, to arrange the topics by number of views by month and year across all themes:

get_themes() %>%
  get_themes_topics() %>%
  arrange_views(by_theme = FALSE)

which results in:

#> # A tibble: 3,045 x 9
#>    Theme      Topic         Views Author Posted     Link     Replies Month  Year
#>    <chr>      <chr>         <int> <chr>  <date>     <chr>      <int> <fct> <dbl>
#>  1 Infant an… Breastfeedin…  9338 Anony… 2010-01-07 https:/…       3 Jan    2010
#>  2 Announcem… Nutrition in…  8517 Marie… 2010-01-05 https:/…      NA Jan    2010
#>  3 Assessmen… How to Maint…  5692 Anony… 2010-01-15 https:/…       2 Jan    2010
#>  4 Assessmen… Interpretati…  8706 Tamsi… 2011-01-05 https:/…      10 Jan    2011
#>  5 Assessmen… Calculating …  7705 Ali M… 2011-01-19 https:/…       3 Jan    2011
#>  6 Assessmen… Prospective …  6075 Jeff … 2011-01-27 https:/…       7 Jan    2011
#>  7 Announcem… IYCF Consult…  5468 Aliso… 2011-01-04 https:/…      NA Jan    2011
#>  8 Announcem… Technical Su…  5450 Tamsi… 2011-01-18 https:/…      NA Jan    2011
#>  9 Cross-cut… Transitionin…  5290 Anony… 2011-01-25 https:/…       1 Jan    2011
#> 10 Assessmen… Result with …  5226 Anony… 2011-01-25 https:/…       2 Jan    2011
#> # … with 3,035 more rows
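The effect of the by_theme argument can be sketched in base R on made-up data. The ordering keys used here (theme, then month, then year, then views in descending order) are inferred from the outputs shown above rather than taken from the package source, so treat this as an illustration only; the `topics` data frame is hypothetical.

```r
# Hypothetical topics with view counts.
topics <- data.frame(
  Theme  = c("A", "A", "A", "B", "B"),
  Topic  = c("t1", "t2", "t3", "t4", "t5"),
  Views  = c(120L, 300L, 50L, 500L, 80L),
  Posted = as.Date(c("2018-06-18", "2018-06-21", "2020-02-15",
                     "2018-06-01", "2020-02-20")),
  stringsAsFactors = FALSE
)
topics$Month <- as.integer(format(topics$Posted, "%m"))
topics$Year  <- as.integer(format(topics$Posted, "%Y"))

# Grouped by theme: within each theme, order by month and year,
# with the most-viewed topics first inside each period.
by_theme <- topics[order(topics$Theme, topics$Month, topics$Year,
                         -topics$Views), ]

# Across all themes: the same ordering without the Theme key.
across <- topics[order(topics$Month, topics$Year, -topics$Views), ]

print(by_theme$Topic)
print(across$Topic)
```

Dropping the Theme key lets topics from different themes interleave within the same period, which is what the last output above shows.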

Arranging topics by number of replies

Arranging the topics or questions raised within the en-net online forum by number of replies can serve as a proxy for the level of interest in a specific topic among forum participants, specifically those who respond and give feedback on responses within a discussion. This is done with the arrange_replies function, which ranks topics by number of replies per thematic area and per time period. By default, topics are ranked by number of replies within each thematic area and within each month and year:

get_themes() %>%
  get_themes_topics() %>%
  arrange_replies()

which results in:

#> # A tibble: 1,800 x 9
#>    Theme   Topic          Views Author  Posted     Link      Replies Month  Year
#>    <chr>   <chr>          <int> <chr>   <date>     <chr>       <int> <fct> <dbl>
#>  1 Adoles… Age group of …   738 Anonym… 2020-02-15 https://…       1 Feb    2020
#>  2 Adoles… Launch of res…  2101 Jo Lof… 2018-05-23 https://…       0 May    2018
#>  3 Adoles… Reaching out …  2319 Emily … 2018-06-18 https://…       3 Jun    2018
#>  4 Adoles… anthropometry…  2025 Emily … 2018-06-29 https://…       2 Jun    2018
#>  5 Adoles… adolescent ma…  2245 Anonym… 2018-06-21 https://…       1 Jun    2018
#>  6 Adoles… adolescent ma…  2057 Anonym… 2018-06-21 https://…       1 Jun    2018
#>  7 Adoles… Teen/adolesce…  1985 Anne H… 2018-06-19 https://…       1 Jun    2018
#>  8 Adoles… health educat…  1728 Anonym… 2018-06-19 https://…       1 Jun    2018
#>  9 Adoles… Is there any …  1038 Aisha … 2020-06-12 https://…       1 Jun    2020
#> 10 Adoles… Assessing nut…  2050 Ursula… 2018-07-25 https://…       3 Jul    2018
#> # … with 1,790 more rows

Arranging topics by number of replies by thematic area and by year is performed as follows:

get_themes() %>%
  get_themes_topics() %>%
  arrange_replies(by_date = "year")

which results in:

#> # A tibble: 1,800 x 9
#>    Theme    Topic        Views Author  Posted     Link       Replies Month  Year
#>    <chr>    <chr>        <int> <chr>   <date>     <chr>        <int> <fct> <dbl>
#>  1 Adolesc… Adolescent …  1928 Tamsin… 2018-08-01 https://w…       3 Aug    2018
#>  2 Adolesc… Assessing n…  2050 Ursula… 2018-07-25 https://w…       3 Jul    2018
#>  3 Adolesc… Reaching ou…  2319 Emily … 2018-06-18 https://w…       3 Jun    2018
#>  4 Adolesc… anthropomet…  2025 Emily … 2018-06-29 https://w…       2 Jun    2018
#>  5 Adolesc… MUAC tape f…  1850 Anonym… 2018-09-27 https://w…       1 Sep    2018
#>  6 Adolesc… adolescent …  1872 Anonym… 2018-07-04 https://w…       1 Jul    2018
#>  7 Adolesc… adolescent …  2245 Anonym… 2018-06-21 https://w…       1 Jun    2018
#>  8 Adolesc… adolescent …  2057 Anonym… 2018-06-21 https://w…       1 Jun    2018
#>  9 Adolesc… Teen/adoles…  1985 Anne H… 2018-06-19 https://w…       1 Jun    2018
#> 10 Adolesc… health educ…  1728 Anonym… 2018-06-19 https://w…       1 Jun    2018
#> # … with 1,790 more rows

Arranging topics by number of replies by thematic area overall across the years is performed as follows:

get_themes() %>%
  get_themes_topics() %>%
  arrange_replies(by_date = "all")

which results in:

#> # A tibble: 1,800 x 9
#>    Theme   Topic          Views Author  Posted     Link      Replies Month  Year
#>    <chr>   <chr>          <int> <chr>   <date>     <chr>       <int> <fct> <dbl>
#>  1 Adoles… Adolescent sc…  1928 Tamsin… 2018-08-01 https://…       3 Aug    2018
#>  2 Adoles… Assessing nut…  2050 Ursula… 2018-07-25 https://…       3 Jul    2018
#>  3 Adoles… Reaching out …  2319 Emily … 2018-06-18 https://…       3 Jun    2018
#>  4 Adoles… anthropometry…  2025 Emily … 2018-06-29 https://…       2 Jun    2018
#>  5 Adoles… Is there any …  1038 Aisha … 2020-06-12 https://…       1 Jun    2020
#>  6 Adoles… Age group of …   738 Anonym… 2020-02-15 https://…       1 Feb    2020
#>  7 Adoles… MUAC tape for…  1850 Anonym… 2018-09-27 https://…       1 Sep    2018
#>  8 Adoles… adolescent sc…  1872 Anonym… 2018-07-04 https://…       1 Jul    2018
#>  9 Adoles… adolescent ma…  2245 Anonym… 2018-06-21 https://…       1 Jun    2018
#> 10 Adoles… adolescent ma…  2057 Anonym… 2018-06-21 https://…       1 Jun    2018
#> # … with 1,790 more rows

By default, the output of arrange_replies is grouped by thematic area. This default behaviour can be changed by setting the by_theme argument to FALSE. For example, to arrange the topics by number of replies by month and year across all themes:

get_themes() %>%
  get_themes_topics() %>%
  arrange_replies(by_theme = FALSE)

which results in:

#> # A tibble: 1,800 x 9
#>    Theme     Topic         Views Author  Posted     Link     Replies Month  Year
#>    <chr>     <chr>         <int> <chr>   <date>     <chr>      <int> <fct> <dbl>
#>  1 Infant a… Breastfeedin…  9338 Anonym… 2010-01-07 https:/…       3 Jan    2010
#>  2 Assessme… How to Maint…  5692 Anonym… 2010-01-15 https:/…       2 Jan    2010
#>  3 Assessme… Interpretati…  8706 Tamsin… 2011-01-05 https:/…      10 Jan    2011
#>  4 Assessme… Prospective …  6075 Jeff M… 2011-01-27 https:/…       7 Jan    2011
#>  5 Assessme… Calculating …  7705 Ali Ma… 2011-01-19 https:/…       3 Jan    2011
#>  6 Assessme… Result with …  5226 Anonym… 2011-01-25 https:/…       2 Jan    2011
#>  7 Cross-cu… Transitionin…  5290 Anonym… 2011-01-25 https:/…       1 Jan    2011
#>  8 Assessme… Nutritional …  7743 Mary M… 2012-01-01 https:/…      10 Jan    2012
#>  9 Upcoming… Regional SMA…  5856 Yara S… 2012-01-10 https:/…       8 Jan    2012
#> 10 Micronut… multi-micron…  6169 REBECC… 2012-01-18 https:/…       6 Jan    2012
#> # … with 1,790 more rows
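The reply-based outputs above have 1,800 rows, against 3,045 rows in the view-based outputs, and the view-based outputs show NA in the Replies column for some topics. This suggests, as an inference from the outputs rather than documented behaviour, that topics without a reply count are left out before ranking. A minimal base R sketch of that step on made-up data:

```r
# Hypothetical topics; NA marks a topic with no recorded reply count.
topics <- data.frame(
  Topic   = c("t1", "t2", "t3", "t4"),
  Replies = c(3L, NA, 10L, 0L),
  stringsAsFactors = FALSE
)

# Keep only topics with a recorded reply count (zero replies still counts).
answered <- topics[!is.na(topics$Replies), ]

# Rank the remaining topics, most replies first.
ranked <- answered[order(-answered$Replies), ]
print(ranked)
```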