class: center, middle, inverse, title-slide .title[ #
R Conference 2024, Kelantan
] .subtitle[ ##
Publication-ready Statistical Results Using R
] .author[ ###
Dr. Wan Nor Arifin
] .institute[ ###
Biostatistics & Research Methodology Unit, Universiti Sains Malaysia, wnarifin.github.io
] .date[ ###
27 October 2024
] ---
## Outlines
- #### Preparing & Presenting Results
- #### Prepare Publication-ready Results Using R
- #### Other Potentials Using R
--- class: center, middle # Preparing & Presenting Results --- ## Raw output _vs_ Expected output .pull-left[ #### Raw analysis results 😱😱😱 ``` Call: glm(formula = cad ~ dbp10 + gender, family = "binomial", data = data) Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -6.1205 1.3167 -4.648 3.34e-06 *** dbp10 0.4950 0.1463 3.383 0.000717 *** genderman 0.8057 0.3908 2.062 0.039253 * --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 191.56 on 199 degrees of freedom Residual deviance: 175.20 on 197 degrees of freedom AIC: 181.2 Number of Fisher Scoring iterations: 4 ``` ] -- .pull-right[ #### Expected presentation format 👍👍👍 <span> <table class="table lightable-classic" style="font-size: 12px; margin-left: auto; margin-right: auto; font-family: Georgia; margin-left: auto; margin-right: auto;border-bottom: 0;"> <caption style="font-size: initial !important;"><span style="font-size:small"> **Table 1**: Associated factors of coronary artery disease (_n_ = 200) </span></caption> <thead> <tr> <th style="text-align:left;font-weight: bold;"> Factors </th> <th style="text-align:right;font-weight: bold;"> _b_ </th> <th style="text-align:right;font-weight: bold;"> SE </th> <th style="text-align:right;font-weight: bold;"> Adj. 
OR </th> <th style="text-align:left;font-weight: bold;"> 95% CI </th> <th style="text-align:right;font-weight: bold;"> _z_-stats </th> <th style="text-align:right;font-weight: bold;"> _P_-value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> DBP (by 10mmHg) </td> <td style="text-align:right;"> 0.495 </td> <td style="text-align:right;"> 0.146 </td> <td style="text-align:right;"> 1.641 </td> <td style="text-align:left;"> 1.24, 2.21 </td> <td style="text-align:right;"> 3.383 </td> <td style="text-align:right;"> 0.001 </td> </tr> <tr> <td style="text-align:left;"> Gender (Male vs Female) </td> <td style="text-align:right;"> 0.806 </td> <td style="text-align:right;"> 0.391 </td> <td style="text-align:right;"> 2.238 </td> <td style="text-align:left;"> 1.055, 4.935 </td> <td style="text-align:right;"> 2.062 </td> <td style="text-align:right;"> 0.039 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> OR = odds ratio, SE = standard error.</td></tr></tfoot> </table> </span> ] --- ## Issues in Preparing & Presenting Statistical Results - **Raw outputs — not suitable for publication / thesis / report / presentation** - Different standards & guidelines for each - Preparation — tedious, nightmare, error-prone! <img src="img/img4.png" height="355px"> --- ## Issues in Preparing & Presenting Statistical Results - **Raw outputs — not suitable for publication / thesis / report / presentation** - **Different standards & guidelines for each** - Preparation — tedious, nightmare, error-prone! 
<img src="img/img4.png" height="355px"> <img src="img/img2.png" height="355px"> --- ## Issues in Preparing & Presenting Statistical Results - **Raw outputs — not suitable for publication / thesis / report / presentation** - **Different standards & guidelines for each** - **Preparation — tedious, nightmare, error-prone!** <img src="img/img4.png" height="355px"> <img src="img/img2.png" height="355px"> <img src="img/img3.png" height="355px"> <br/><small><small>*Images generated with Copilot</small></small> --- ## Analyze & Prepare Statistical Results Using
R
- Raw outputs from R can be unsightly
- But, R outputs are easy to **reprocess**
- Packages and functions are abundant in R to refine raw outputs
- A breeze in R as compared to other statistical software
- Usually in combination with `R Markdown` / `Quarto Document`

--

.center[Table and plot straight from R
!]

.pull-left[
<span> <table class="table lightable-classic" style="font-size: 12px; margin-left: auto; margin-right: auto; font-family: Georgia; margin-left: auto; margin-right: auto;border-bottom: 0;"> <caption style="font-size: initial !important;"><span style="font-size:small"> **Table 1**: Associated factors of coronary artery disease (_n_ = 200) </span></caption> <thead> <tr> <th style="text-align:left;font-weight: bold;"> Factors </th> <th style="text-align:right;font-weight: bold;"> _b_ </th> <th style="text-align:right;font-weight: bold;"> SE </th> <th style="text-align:right;font-weight: bold;"> Adj. OR </th> <th style="text-align:left;font-weight: bold;"> 95% CI </th> <th style="text-align:right;font-weight: bold;"> _z_-stats </th> <th style="text-align:right;font-weight: bold;"> _P_-value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> DBP (by 10mmHg) </td> <td style="text-align:right;"> 0.495 </td> <td style="text-align:right;"> 0.146 </td> <td style="text-align:right;"> 1.641 </td> <td style="text-align:left;"> 1.24, 2.21 </td> <td style="text-align:right;"> 3.383 </td> <td style="text-align:right;"> 0.001 </td> </tr> <tr> <td style="text-align:left;"> Gender (Male vs Female) </td> <td style="text-align:right;"> 0.806 </td> <td style="text-align:right;"> 0.391 </td> <td style="text-align:right;"> 2.238 </td> <td style="text-align:left;"> 1.055, 4.935 </td> <td style="text-align:right;"> 2.062 </td> <td style="text-align:right;"> 0.039 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> OR = odds ratio, SE = standard error.</td></tr></tfoot> </table> </span>
]

.pull-right[
<img src="index_files/figure-html/unnamed-chunk-4-1.png" width="100%" />
.center[<span style='font-size:small;font-family:Georgia'>**Figure 1**: Associated factors of coronary artery disease (_n_ = 200)</span>]
]

---
class: center, middle

# Prepare Publication-ready Results Using R

---

## Formats of Statistical Results

.pull-left[
#### Table
<span> <table class="table" style="font-size: 12px; font-family: Georgia; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;font-weight: bold;"> Factors </th> <th style="text-align:right;font-weight: bold;"> _b_ </th> <th style="text-align:right;font-weight: bold;"> SE </th> <th style="text-align:right;font-weight: bold;"> Adj. OR </th> <th style="text-align:left;font-weight: bold;"> 95% CI </th> <th style="text-align:right;font-weight: bold;"> _z_-stats </th> <th style="text-align:right;font-weight: bold;"> _P_-value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> DBP (by 10mmHg) </td> <td style="text-align:right;"> 0.495 </td> <td style="text-align:right;"> 0.146 </td> <td style="text-align:right;"> 1.641 </td> <td style="text-align:left;"> 1.24, 2.21 </td> <td style="text-align:right;"> 3.383 </td> <td style="text-align:right;"> 0.001 </td> </tr> <tr> <td style="text-align:left;"> Gender (Male vs Female) </td> <td style="text-align:right;"> 0.806 </td> <td style="text-align:right;"> 0.391 </td> <td style="text-align:right;"> 2.238 </td> <td style="text-align:left;"> 1.055, 4.935 </td> <td style="text-align:right;"> 2.062 </td> <td style="text-align:right;"> 0.039 </td> </tr> </tbody> </table> </span>

#### In-text statistics
<span style="font-family:Georgia;font-size:0.7em;"> Individuals with higher diastolic blood pressure (DBP) have higher odds of developing coronary artery disease (CAD), with an adjusted odds ratio (OR) of 1.641 (95% CI: 1.24, 2.21) for every 10 mmHg increase in DBP. In addition, males have higher odds of developing CAD compared to females, with an adjusted OR of 2.238 (95% CI: 1.055, 4.935).
</span>
]

.pull-right[
#### Plot
<span> <img src="index_files/figure-html/unnamed-chunk-6-1.png" width="100%" /> </span>
]

---

## Approaches in R

- **Ready-made outputs**: basic outputs from packages
- **Custom outputs**: fine-tune outputs from packages
- **Custom functions**: custom outputs, avoid repetition from existing functions

--

<br/><br/><br/><br/>

.center[
### There are so many options available in R, amazing & exciting!
]

---

## Table

.pull-left[
#### Ready-made outputs
- So many packages for this purpose
- `gtsummary`
- Summary tables for analysis: `epiDisplay`, `finalfit`, `jtools`, `stargazer`, `jmv`, ...
- Descriptive tables: `arsenal`, `table1`, ...
- and many more...?

#### Custom outputs
- `knitr`: `kable()`, `kableExtra`
- `broom`: `tidy()`
- `gt`, `flextable`
- more...?
]

.pull-right[
#### Custom functions
- Mix & match preceding functions for repeated use, i.e. same format for same statistical analysis
- Save your time, more readable code, reuse for your specific need
]

---

## Table

.pull-left[
#### Ready-made outputs
- So many packages for this purpose
- `gtsummary`
- Summary tables for analysis: `epiDisplay`, `finalfit`, `jtools`, `stargazer`, `jmv`, ...
- Descriptive tables: `arsenal`, `table1`, ...
- and many more...?

#### Custom outputs
- `knitr`: `kable()`, `kableExtra`
- `broom`: `tidy()`
- `gt`, `flextable`
- more...?
]

.pull-right[
#### Custom functions
- Mix & match preceding functions for repeated use, i.e. same format for same statistical analysis
- Save your time, more readable code, reuse for your specific need

These packages & options usually allow you to get output as plain text, HTML, LaTeX or Word formats, even as image or PDF. So, these are adaptable to any document.
]

---

## Table: Ready-made outputs

.pull-left[
#### R code: Setup data & analyze

``` r
# load & prepare data
data = foreign::read.spss("slog.sav", use.value.labels = TRUE, to.data.frame = TRUE)
data$dbp10 = data$dbp/10
data$gender = factor(data$gender, labels = c("Female","Male"))
# analyze
mlog = glm(cad ~ dbp10 + gender, "binomial", data)
```

#### R code: Prepare & present

``` r
# present
gtsummary::theme_gtsummary_journal("jama") # specify theme
*gtsummary::tbl_regression(mlog, exponentiate = TRUE,
  label = list(
    dbp10 = "DBP (by 10mmHg)",
    gender = "Gender"))
gtsummary::reset_gtsummary_theme()
```
]

--

.pull-right[
#### Output
.center[
<img src="img/tbl_or.png" width="350px">
]
]

---

## Table: Ready-made outputs

.pull-left[
#### R code: Prepare data

``` r
# load & prepare data
data = foreign::read.spss("slog.sav", use.value.labels = TRUE, to.data.frame = TRUE)
data$cad = factor(data$cad, labels = c("No CAD", "CAD"))
data$gender = factor(data$gender, labels=c("Female","Male"))
```

#### R code: Present

``` r
# present
library(gtsummary) # for all_x() settings
*gtsummary::tbl_summary(
  subset(data, select = c(cad, sbp:age, gender)), by = cad,
  label = list(cad = "CAD", sbp = "SBP", dbp = "DBP",
               chol = "Cholesterol", age = "Age", gender = "Gender"),
  statistic = list(
    all_continuous() ~ "{mean} ({sd})",
    all_categorical() ~ "{n} ({p}%)"),
  digits = list(all_continuous() ~ c(1,1),
                all_categorical() ~ c(0,1)))
```
]

--

.pull-right[
#### Output
.center[
<img src="img/tbl_des.png" width="350px">
]
]

---

## Table: Ready-made outputs

.pull-left[
#### R code: Prepare data

``` r
# load & prepare data
data = foreign::read.spss("slog.sav", use.value.labels = TRUE, to.data.frame = TRUE)
data$cad = factor(data$cad, labels = c("No CAD", "CAD"))
data$gender = factor(data$gender, labels=c("Female","Male"))
```

#### R code: Present

``` r
# present
gtsummary::theme_gtsummary_journal("jama") # combine
gtsummary::theme_gtsummary_mean_sd()       # two themes
*gtsummary::tbl_summary(
  subset(data, select = c(cad, sbp:age, gender)), by = cad,
  label = list(cad = "CAD", sbp = "SBP", dbp = "DBP",
               chol = "Cholesterol", age = "Age", gender = "Gender"),
  digits = list(all_continuous() ~ c(1,1),
                all_categorical() ~ c(0,1)))
gtsummary::reset_gtsummary_theme()
```
]

--

.pull-right[
#### Output

| Characteristic | No CAD, N = 163 | CAD, N = 37 |
|:--|:-:|:-:|
| SBP, Mean (SD) | 129.3 (22.3) | 143.8 (25.6) |
| DBP, Mean (SD) | 80.8 (12.6) | 89.0 (12.2) |
| Cholesterol, Mean (SD) | 6.1 (1.2) | 6.6 (1.2) |
| Age, Mean (SD) | 45.2 (8.4) | 47.4 (8.8) |
| Gender, n (%) | | |
| &nbsp;&nbsp;Female | 87 (53.4) | 13 (35.1) |
| &nbsp;&nbsp;Male | 76 (46.6) | 24 (64.9) |

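
The `n (%)` cells of such a table can be reproduced by hand as a quick sanity check. This is a minimal base-R sketch, assuming the Gender counts are typed in from the output table; the object names `counts`, `pct` and `cells` are illustrative and not part of `gtsummary`.

``` r
# Counts copied from the Gender rows of the output table
counts = matrix(c(87, 76, 13, 24), nrow = 2,
                dimnames = list(Gender = c("Female", "Male"),
                                CAD = c("No CAD", "CAD")))

# Column percentages, i.e. the percentage within each CAD group
pct = prop.table(counts, margin = 2) * 100

# Build "n (%)" cells with 0 and 1 decimal places, as in digits = c(0, 1)
cells = matrix(sprintf("%.0f (%.1f)", counts, pct), nrow = 2,
               dimnames = dimnames(counts))
cells
```

The resulting cells agree with the Gender rows of the table, e.g. `87 (53.4)` for females without CAD.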
]

---

## Table: Custom outputs

.pull-left[
#### R code: Setup data & analyze

``` r
# load & prepare data
data = foreign::read.spss("slog.sav", use.value.labels = TRUE, to.data.frame = TRUE)
data$dbp10 = data$dbp/10
# analyze
mlog = glm(cad ~ dbp10 + gender, "binomial", data)
```

#### R code: Prepare with `broom::tidy()` + `tibble`

``` r
*out_tidy = broom::tidy(mlog)
*out_or_tidy = broom::tidy(mlog, exponentiate = TRUE, conf.int = TRUE)
*out = tibble::tibble(
  Factors = c("DBP (by 10mmHg)", "Gender (Male vs Female)"),
  "_b_" = out_tidy$estimate[-1],
  SE = out_tidy$std.error[-1],
  "Adj. OR" = out_or_tidy$estimate[-1],
  "95% CI" = paste0(out_or_tidy$conf.low[-1]|>round(3), ", ",
                    out_or_tidy$conf.high[-1]|>round(3)),
  "_z_-stats" = out_tidy$statistic[-1],
  "_P_-value" = out_tidy$p.value[-1]
); out
```
]

--

.pull-right[
#### Output: Unformatted

```
# A tibble: 2 × 5
  Factors                 `_b_`    SE `Adj. OR` `95% CI`
  <chr>                   <dbl> <dbl>     <dbl> <chr>
1 DBP (by 10mmHg)         0.495 0.146      1.64 1.24, 2.21
2 Gender (Male vs Female) 0.806 0.391      2.24 1.055, 4.935
```

```
# A tibble: 2 × 2
  `_z_-stats` `_P_-value`
        <dbl>       <dbl>
1        3.38    0.000717
2        2.06    0.0393
```
]

---

## Table: Custom outputs

.pull-left[
#### R code: Present with `knitr::kable()` + `kableExtra`

``` r
*knitr::kable(out, format = "html", digits = 3,
  caption = "<span style='font-size:small'> **Table 1**: Associated factors of coronary artery disease (_n_ = 200) </span>") |>
*  kableExtra::kable_styling(font_size = 12) |>
  kableExtra::kable_classic(html_font = "Georgia") |>
  kableExtra::row_spec(row = 0, bold = T) |>
  kableExtra::footnote(general = "OR = odds ratio, SE = standard error.",
                       general_title = "")
```
]

--

.pull-right[
#### Output
<span> <table class="table lightable-classic" style="font-size: 12px; margin-left: auto; margin-right: auto; font-family: Georgia; margin-left: auto; margin-right: auto;border-bottom: 0;"> <caption style="font-size: initial !important;"><span style="font-size:small"> **Table 1**: Associated factors of coronary artery disease (_n_ = 200) </span></caption> <thead> <tr> <th
style="text-align:left;font-weight: bold;"> Factors </th> <th style="text-align:right;font-weight: bold;"> _b_ </th> <th style="text-align:right;font-weight: bold;"> SE </th> <th style="text-align:right;font-weight: bold;"> Adj. OR </th> <th style="text-align:left;font-weight: bold;"> 95% CI </th> <th style="text-align:right;font-weight: bold;"> _z_-stats </th> <th style="text-align:right;font-weight: bold;"> _P_-value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> DBP (by 10mmHg) </td> <td style="text-align:right;"> 0.495 </td> <td style="text-align:right;"> 0.146 </td> <td style="text-align:right;"> 1.641 </td> <td style="text-align:left;"> 1.24, 2.21 </td> <td style="text-align:right;"> 3.383 </td> <td style="text-align:right;"> 0.001 </td> </tr> <tr> <td style="text-align:left;"> Gender (Male vs Female) </td> <td style="text-align:right;"> 0.806 </td> <td style="text-align:right;"> 0.391 </td> <td style="text-align:right;"> 2.238 </td> <td style="text-align:left;"> 1.055, 4.935 </td> <td style="text-align:right;"> 2.062 </td> <td style="text-align:right;"> 0.039 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> OR = odds ratio, SE = standard error.</td></tr></tfoot> </table> </span>
]

---

## Table: Custom functions

.pull-left[
#### R code: Function `mlog_tbl()`

``` r
*mlog_tbl = function(mlog_glm) {
  out_tidy = broom::tidy(mlog_glm)
  out_or_tidy = broom::tidy(mlog_glm, exponentiate = TRUE, conf.int = TRUE)
  tibble::tibble(
    Factors = out_tidy$term[-1],
    "_b_" = out_tidy$estimate[-1],
    SE = out_tidy$std.error[-1],
    "Adj. OR" = out_or_tidy$estimate[-1],
    "95% CI" = paste0(out_or_tidy$conf.low[-1]|>round(3), ", ",
                      out_or_tidy$conf.high[-1]|>round(3)),
    "_z_-stats" = out_tidy$statistic[-1],
    "_P_-value" = out_tidy$p.value[-1])
}
```
]

.pull-right[
#### R code: Function `mlog_kbl()`

``` r
*mlog_kbl = function(mlog_tbl, caption = "Table", font_size = 12,
                    font_family = "Georgia") {
  knitr::kable(mlog_tbl, format = "html", digits = 3, caption = caption) |>
    kableExtra::kable_styling(font_size = font_size) |>
    kableExtra::kable_classic(html_font = font_family) |>
    kableExtra::row_spec(row = 0, bold = T) |>
    kableExtra::footnote(general = "OR = odds ratio, SE = standard error.",
                         general_title = "")
}
```
]

---

## Table: Custom functions

.pull-left[
#### R code: `mlog_tbl()` + `mlog_kbl()`

``` r
# present table
*tbl1 = mlog_tbl(mlog)
tbl1$Factors = c("DBP (by 10mmHg)", "Gender (Male vs Female)")
tbl1_caption = paste0(
  "**Table 1**: Associated factors of CAD (_n_ = ", nrow(mlog$data), ")")
*mlog_kbl(tbl1, caption = tbl1_caption,
*         font_size = 14, font_family = "Comic Sans MS")
```

Shorter, reusable code for a lengthy document
]

--

.pull-right[
#### Output
<span> <table class="table lightable-classic" style="font-size: 14px; margin-left: auto; margin-right: auto; font-family: Comic Sans MS; margin-left: auto; margin-right: auto;border-bottom: 0;"> <caption style="font-size: initial !important;">**Table 1**: Associated factors of CAD (_n_ = 200)</caption> <thead> <tr> <th style="text-align:left;font-weight: bold;"> Factors </th> <th style="text-align:right;font-weight: bold;"> _b_ </th> <th style="text-align:right;font-weight: bold;"> SE </th> <th style="text-align:right;font-weight: bold;"> Adj.
OR </th> <th style="text-align:left;font-weight: bold;"> 95% CI </th> <th style="text-align:right;font-weight: bold;"> _z_-stats </th> <th style="text-align:right;font-weight: bold;"> _P_-value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> DBP (by 10mmHg) </td> <td style="text-align:right;"> 0.495 </td> <td style="text-align:right;"> 0.146 </td> <td style="text-align:right;"> 1.641 </td> <td style="text-align:left;"> 1.24, 2.21 </td> <td style="text-align:right;"> 3.383 </td> <td style="text-align:right;"> 0.001 </td> </tr> <tr> <td style="text-align:left;"> Gender (Male vs Female) </td> <td style="text-align:right;"> 0.806 </td> <td style="text-align:right;"> 0.391 </td> <td style="text-align:right;"> 2.238 </td> <td style="text-align:left;"> 1.055, 4.935 </td> <td style="text-align:right;"> 2.062 </td> <td style="text-align:right;"> 0.039 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> OR = odds ratio, SE = standard error.</td></tr></tfoot> </table> </span> _Comic Sans MS_ 😊 ] --- ## Plot .pull-left[ #### Ready-made outputs - Usually available in specific packages for specific analyses, must explore - Mostly rely on `ggplot2`
and base `graphics`
- Examples: `semPlot` (SEM path diagram), `OptimalCutpoints` (ROC curve), `finalfit` (OR plot), `survival` (survival analysis) → note they are usually tied to specific analyses
- Again, explore!
]

.pull-right[
#### Custom outputs
- Customize options in above packages
- `ggplot2`, base `graphics`
, `lattice`

#### Custom functions
- Graphics options are often too many and tedious to repeatedly use without affecting readability of your code
- Advisable to write functions for customizing the plots for your specific data and use-case
]

--

These packages & options usually allow you to get output in several image formats, e.g. png, svg, pdf. So, the plots can be easily integrated into any document.

---

## Plot: Ready-made outputs

.pull-left[
#### R code

``` r
attr(data$cad, "label") = "Coronary artery disease"
attr(data$dbp10, "label") = "DBP (by 10mmHg)"
data$gender = factor(data$gender, labels=c("Female","Male"))
attr(data$gender, "label") = "Gender" # set after factor(), which drops labels
explanatory = c("dbp10", "gender")
dependent = "cad"
*finalfit::or_plot(data, dependent, explanatory)
```
]

.pull-right[
#### Output
<span> <img src="index_files/figure-html/unnamed-chunk-27-1.png" width="100%" /> </span>
]

---

## Plot: Ready-made outputs

.pull-left[
#### R code

``` r
attr(data$cad, "label") = "Coronary artery disease"
attr(data$dbp10, "label") = "DBP (by 10mmHg)"
data$gender = factor(data$gender, labels=c("Female","Male"))
attr(data$gender, "label") = "Gender (vs Female)"
explanatory = c("dbp10", "gender")
dependent = "cad"
finalfit::or_plot(data, dependent, explanatory,
  dependent_label = "Odds ratio (OR) of coronary artery disease by factor",
  suffix = "", remove_ref = T,
  title_text_size = 16, table_text_size = 5,
  plot_opts = list(ggplot2::xlab("OR (95% CI)")))
```
]

.pull-right[
#### Output
<span> <img src="index_files/figure-html/unnamed-chunk-29-1.png" width="100%" /> </span>
]

---

## Plot: Ready-made outputs

.pull-left[
#### R code

``` r
# SEM model
model = "
F1 =~ Q1 + Q2 + Q3 + Q4 + Q5
F2 =~ Q6 + Q7 + Q8 + Q9 + Q10
F2 ~ F1
"
survey = lavaan::simulateData(model) # sim. data for practice
fit_model = lavaan::sem(model, data = survey)
# draw path diagram
*semPlot::semPaths(fit_model, what="path", whatLabels="name",
                  edge.color = "black", layout = "tree2", rotation = 2)
```
]

.pull-right[
#### Output
<img src="index_files/figure-html/unnamed-chunk-31-1.png" width="100%" />
]

---

## Plot: Ready-made outputs

.pull-left[
#### R code

``` r
data1 = data
data1$p_mlog = predict(mlog, type = "response") # probability
OptimalCutpoints::optimal.cutpoints(X = "p_mlog", status = "cad",
                                    tag.healthy = "no cad",
                                    methods = "Youden", data = data1,
                                    ci.fit = TRUE) |> plot(which = 1)
```
]

.pull-right[
#### Output
<img src="index_files/figure-html/unnamed-chunk-33-1.png" width="100%" />
]

---

## Plot: Custom outputs

You can use `ggplot2`, base `graphics` or `lattice`.

Typically, you have to customize the plot yourself, which deserves a separate talk.

But, most often the settings are lengthy! → custom functions are needed for repeatedly used plots.

---

## Plot: Custom functions

.pull-left[
#### R code: Simulated data

``` r
set.seed(007)
data_hist = tibble::tibble(
  "Weight (kg)" = rnorm(1000, mean = 50, sd = 15),
  "Height (cm)" = rnorm(1000, mean = 150, sd = 10))
```

#### R code: Histogram by `ggplot2`

``` r
library(ggplot2)
ggplot(data = data_hist, aes(`Weight (kg)`)) +
  geom_histogram(bins = 10)
```
]

--

.pull-right[
#### Output: Without customization
<img src="index_files/figure-html/unnamed-chunk-36-1.png" width="100%" />
]

---

## Plot: Custom functions

.pull-left[
#### R code: Simulated data

``` r
set.seed(007)
data_hist = tibble::tibble(
  "Weight (kg)" = rnorm(1000, mean = 50, sd = 15),
  "Height (cm)" = rnorm(1000, mean = 150, sd = 10))
```

#### R code: Histogram by `ggplot2`

``` r
library(ggplot2)
ggplot(data = data_hist, aes(`Weight (kg)`)) +
  geom_histogram(bins = 10, color = "black", fill = "cornsilk", alpha = 0.5) +
  labs(title = "Histogram of Weight (kg)", y = "Frequency") +
  theme_classic() +
  theme(plot.title = element_text(color = "steelblue", size = 16,
                                  face = "bold", hjust = 0.5),
        axis.title = element_text(hjust = 1),
        axis.line = element_line(
          arrow = grid::arrow(length = unit(.2, "cm"))))
```
]

.pull-right[
#### Output: With customization + **lengthy options**
<img src="index_files/figure-html/unnamed-chunk-39-1.png" width="100%" />
]

---

## Plot: Custom functions

.pull-left[
#### R code: Function `my_hist`

``` r
*my_hist = function(data, variable, bins = 10) {
  title_x = paste0("Histogram of ", names(data[variable]))
  ggplot2::ggplot(data = data, ggplot2::aes(.data[[variable]])) +
    # use the `bins` argument instead of a hard-coded value
    ggplot2::geom_histogram(bins = bins, color = "black",
                            fill = "cornsilk", alpha = 0.5) +
    ggplot2::labs(title = title_x, y = "Frequency") +
    ggplot2::theme_classic() +
    ggplot2::theme(
      plot.title = ggplot2::element_text(
        color = "steelblue", size = 16, face = "bold", hjust = 0.5),
      axis.title = ggplot2::element_text(hjust = 1),
      axis.line = ggplot2::element_line(
        arrow = grid::arrow(length = grid::unit(.2, "cm"))))
}
```
]

.pull-right[
]

---

## Plot: Custom functions

.pull-left[
#### R code: `my_hist` with **Weight (kg)**

``` r
set.seed(007)
data_hist = tibble::tibble(
  "Weight (kg)" = rnorm(1000, mean = 50, sd = 15),
  "Height (cm)" = rnorm(1000, mean = 150, sd = 10)
)
*my_hist(data_hist, "Weight (kg)")
```
]

--

.pull-right[
#### Output
<img src="index_files/figure-html/unnamed-chunk-42-1.png" width="100%" />
]

---

## Plot: Custom functions

.pull-left[
#### R code: `my_hist` with **Weight (kg)** and **Height (cm)**

``` r
set.seed(007)
data_hist = tibble::tibble(
  "Weight (kg)" = rnorm(1000, mean = 50, sd = 15),
  "Height (cm)" = rnorm(1000, mean = 150, sd = 10)
)
*my_hist(data_hist, "Weight (kg)")
*my_hist(data_hist, "Height (cm)")
```
]

.pull-right[
#### Output
<img src="index_files/figure-html/unnamed-chunk-44-1.png" width="100%" /><img src="index_files/figure-html/unnamed-chunk-44-2.png" width="100%" />
]

---

## In-text

For in-text output, inline code <code>`r code`</code> is used in `R Markdown` to embed results directly in the text.

.pull-left[
#### Ready-made outputs
- Specific text results directly from functions, or subset of output objects
- <code>`r mean(x)`</code>, <code>`r object$x`</code> etc
- `gtsummary`'s `inline_text()`

#### Custom functions
- Functions for repetitive `paste0()` use
- mean (SD), _r_ coefficient (_p_-value = 0.xxx), OR (95% CI: LL, UL) etc
- <code>`r mean_sd(x)`</code>, <code>`r or_ci(object)`</code> → simplify the code, easy to reuse
]

.pull-right[
#### Custom outputs
- Combine all these; `paste()` and `paste0()` are commonly used
- <code>`r paste0(mean(x), " (", sd(x), ")")`</code>
- <code>`r paste0(object$estimate, " (95% CI: ", object$LL, ", ", object$UL, ")")`</code>
- Can also save `paste0()` resulting objects, so can use <code>`r mean_sd`</code>, <code>`r or_ci`</code>
]

---

## Text: Ready-made outputs

.pull-left[
#### R code

``` r
# analysis part
fit = lm(sbp ~ saltadd, BP)
# using broom
*out = broom::tidy(fit, conf.int = T)
```

#### Markdown
<span style="font-family:Lucida Console;font-size:0.7em;">There were \_n\_ = <code>`r nrow(BP)`</code> participants, with <code>`r table(BP$sex)[2]`</code> of them being female. The mean SBP was <code>`r mean(BP$sbp)|>round(1)`</code> (SD = <code>`r sd(BP$sbp)|>round(1)`</code>). From the regression analysis, adding salt to the food increases SBP by <code>`r out$estimate[2]|>round(1)`</code> mmHg.</span>
]

.pull-right[
#### Output
<span style="font-family:Georgia;font-size:0.8em;">There were _n_ = 80 participants, with 39 of them being female. The mean SBP was 151.2 (SD = 37.3). From the regression analysis, adding salt to the food increases SBP by 25.6 mmHg.</span>
]

---

## Text: Ready-made outputs

.pull-left[
#### R code

``` r
# analysis part
fit = lm(sbp ~ saltadd, BP)
# using gtsummary
*gtout = gtsummary::tbl_regression(fit, conf.int = T)
```

#### Markdown
<span style="font-family:Lucida Console;font-size:0.7em;">There were \_n\_ = <code>`r nrow(BP)`</code> participants, with <code>`r table(BP$sex)[2]`</code> of them being female. The mean SBP was <code>`r mean(BP$sbp)|>round(1)`</code> (SD = <code>`r sd(BP$sbp)|>round(1)`</code>). From the regression analysis, adding salt to the food increases SBP by <code>`r gtsummary::inline_text(gtout, variable = saltadd, level = "yes")`</code> mmHg.</span>
]

.pull-right[
#### Output
<span style="font-family:Georgia;font-size:0.8em;">There were _n_ = 80 participants, with 39 of them being female. The mean SBP was 151.2 (SD = 37.3).
From the regression analysis, adding salt to the food increases SBP by 26 (95% CI 9.8, 41; p=0.002) mmHg.</span>
]

---

## Text: Custom outputs

.pull-left[
#### R code

``` r
*tbl_sex = table(BP$sex)
# analysis part
fit = lm(sbp ~ saltadd, BP)
out = broom::tidy(fit, conf.int = T)
*coef_ci = paste0(out$estimate[2]|>round(1),
*                 " (95% CI: ", out$conf.low[2]|>round(1),
*                 ", ", out$conf.high[2]|>round(1), ")")
```

#### Markdown
<span style="font-family:Lucida Console;font-size:0.7em;"> There were \_n\_ = <code>`r nrow(BP)`</code> participants, with <code>`r tbl_sex[2]`</code> (<code>`r prop.table(tbl_sex)[2]|>round(3)*100`</code>%) of them being female. The mean SBP was <code>`r paste0(mean(BP$sbp)|>round(1), " (SD = ", sd(BP$sbp)|>round(1), ")")`</code>. From the regression analysis, adding salt to the food increases SBP by <code>`r coef_ci`</code> mmHg.</span>
]

.pull-right[
#### Output
<span style="font-family:Georgia;font-size:0.8em;">There were _n_ = 80 participants, with 39 (48.8%) of them being female. The mean SBP was 151.2 (SD = 37.3).
From the regression analysis, adding salt to the food increases SBP by 25.6 (95% CI: 9.8, 41.3) mmHg.</span>

Using `paste0()` makes it easier to add more complicated results
]

---

## Text: Custom functions

.pull-left[
#### R code

``` r
# custom functions
*mean_sd = function(x, digits = 1) {
  # use sd(x) so the SD matches the variable passed in
  paste0(mean(x)|>round(digits), " (SD = ", sd(x)|>round(digits), ")")}
*n_percent = function(x, cat_number = 1, digits = 1) {
  tbl = table(x)
  n = tbl[cat_number]
  p = prop.table(tbl)
  percent = p[cat_number] * 100
  paste0(n, " (", percent|>round(digits), "%)")
}
*coef_ci = function(tidy_out, estimate_number = 2,
*                   digits = 1) {
  paste0(tidy_out$estimate[estimate_number]|>round(digits),
         " (95% CI: ", tidy_out$conf.low[estimate_number]|>round(digits),
         ", ", tidy_out$conf.high[estimate_number]|>round(digits), ")")
}
# analysis part
*out = lm(sbp ~ saltadd, BP) |>
*  broom::tidy(conf.int = T)
```
]

.pull-right[
#### Markdown
<span style="font-family:Lucida Console;font-size:0.7em;"> There were \_n\_ = <code>`r nrow(BP)`</code> participants, with <code>`r n_percent(BP$sex, 2)`</code> of them being female, and <code>`r n_percent(BP$saltadd, 2)`</code> participants added salt to their food. The mean SBP and DBP were <code>`r mean_sd(BP$sbp)`</code> and <code>`r mean_sd(BP$dbp)`</code> respectively. From the regression analysis, adding salt to the food increases SBP by <code>`r coef_ci(out, 2)`</code> mmHg.</span>

Now we use these customized functions in R Markdown. Note that we use `paste0()` in these functions
]

---

## Text: Custom functions

.pull-left[
#### Markdown
<span style="font-family:Lucida Console;font-size:0.7em;"> There were \_n\_ = <code>`r nrow(BP)`</code> participants, with <code>`r n_percent(BP$sex, 2)`</code> of them being female, and <code>`r n_percent(BP$saltadd, 2)`</code> participants added salt to their food. The mean SBP and DBP were <code>`r mean_sd(BP$sbp)`</code> and <code>`r mean_sd(BP$dbp)`</code> respectively.
From the regression analysis, adding salt to the food increases SBP by <code>`r coef_ci(out, 2)`</code> mmHg.</span>
]

.pull-right[
#### Output
<span style="font-family:Georgia;font-size:0.8em;">There were _n_ = 80 participants, with 39 (48.8%) of them being female, and 43 (53.8%) participants added salt to their food. The mean SBP and DBP were 151.2 (SD = 37.3) and 97.5 (SD = 37.3) respectively. From the regression analysis, adding salt to the food increases SBP by 25.6 (95% CI: 9.8, 41.3) mmHg.</span>

Customized functions make it easier to obtain the same statistical results for different variables
]

---
class: center, middle

# Other Potentials Using R

---

## Custom Package for MJMS

- A new package for the Malaysian Journal of Medical Sciences (MJMS)
- Functions to produce tables for common statistical analyses
- Utilize `broom` + `knitr::kable()` or `gtsummary`'s `theme_gtsummary_xxx()`
- Encourage authors to use R, so they can get presentation-ready results with minimal effort
- In future: easy-to-specify, journal-specific templates that the community can contribute → utilizing the same package

---

## Local LLM Using `rollama`

.pull-left[
- `rollama` links R with local LLM tool `ollama`
- Eliminates the need to write templates
- Give it example tables/plots from the target journal (multi-shot prompts)
- Let it format from raw R outputs
]

--

.pull-right[
<img src="https://ollama.com/public/ollama.png" height="150px"> `ollama` <br/><br/>
<img src="https://jbgruber.github.io/rollama/logo.svg" height="150px"> `rollama`
]

---

## Local LLM Using `rollama`

.pull-left[
- `rollama` links R with local LLM tool `ollama`
- Eliminates the need to write templates
- Give it example tables/plots from the target journal (multi-shot prompts)
- Let it format from raw R outputs

.center[
But, running it locally needs a GPU...

<img src="img/gpu.png" height="250px" alt="Image generated with Copilot"><br/>
<small><small>*Image generated with Copilot</small></small>
]
]

.pull-right[
<img src="https://ollama.com/public/ollama.png" height="150px"> `ollama` <br/><br/>
<img src="https://jbgruber.github.io/rollama/logo.svg" height="150px"> `rollama`
]

---
class: center, middle

# PhD Position Vacancy

---
class: center, middle

![](img/eposter_GRA_DR_WAN_NOR_ARIFIN.jpg)

---
class: center, middle

# Thanks!