class: center, middle, inverse, title-slide

.title[
# R Conference 2025, TAR UMT
]
.subtitle[
## From user to CRAN: A journey
]
.author[
### Dr. Wan Nor Arifin
]
.institute[
### Biostatistics & Research Methodology Unit, Universiti Sains Malaysia, wnarifin.github.io
]
.date[
### 16 November 2025
]

---
# Outline

- ### phases:
  - #### user
  - #### functions
  - #### package
  - #### CRAN
- ### R you Ready?

---

class: center, middle

# As
user

---

## As user

- It was back in early 2015
- I was forced to use R for psychometric analyses (factor analysis, structural equation modeling)
- I used whatever packages and functions were available
- Honestly, I was frustrated because the R outputs never looked nice!

--

#### Output 1: _t_-test

``` r
t.test(mpg ~ am, mtcars)
```

```
Welch Two Sample t-test

data:  mpg by am
t = -3.7671, df = 18.332, p-value = 0.001374
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -11.280194  -3.209684
sample estimates:
mean in group 0 mean in group 1
       17.14737        24.39231
```

---

## As user

- It was back in early 2015
- I was forced to use R for psychometric analyses (factor analysis, structural equation modeling)
- I used whatever packages and functions were available
- Honestly, I was frustrated because the R outputs never looked nice!

#### Output 2: count and %

``` r
tbl = table(mtcars$am); tbl
```

```
 0  1
19 13
```

``` r
prop.table(tbl) * 100
```

```
     0      1
59.375 40.625
```

---

class: center, middle

# From
user to functions

---

## From user to functions

### Data analysis for research projects

- I had to use some functions repeatedly, e.g. for descriptive statistics
- Back then, it was quite irritating to get simple outputs such as n (%) for categorical variables directly from an R package (there was no `gtsummary` or other helper functions)
- `table()`, `prop.table()`, `paste0()`, `cbind()`, `rbind()` etc. were the norm
- So I wrote some [functions](https://github.com/wnarifin/medicalstats-in-R/tree/master/functions)

--

#### Painful way

``` r
tbl = table(chickwts["feed"]); tbl_perc = prop.table(tbl) * 100
data.frame(Variable = names(chickwts)[2], Label = levels(chickwts[,"feed"]),
           n = as.numeric(tbl), Percent = as.numeric(tbl_perc))
```

```
  Variable     Label  n  Percent
1     feed    casein 12 16.90141
2     feed horsebean 10 14.08451
3     feed   linseed 12 16.90141
4     feed  meatmeal 11 15.49296
5     feed   soybean 14 19.71831
6     feed sunflower 12 16.90141
```

---

## From user to functions

### Data analysis for research projects

- I had to use some functions repeatedly, e.g. for descriptive statistics
- Back then, it was quite irritating to get simple outputs such as n (%) for categorical variables directly from an R package (there was no `gtsummary` or other helper functions)
- `table()`, `prop.table()`, `paste0()`, `cbind()`, `rbind()` etc. were the norm
- So I wrote some [functions](https://github.com/wnarifin/medicalstats-in-R/tree/master/functions)

#### With function

``` r
source("desc_cat_fun.R")
desc_cat(chickwts["feed"])
```

```
$feed
  Variable     Label  n  Percent
1     feed    casein 12 16.90141
2        - horsebean 10 14.08451
3        -   linseed 12 16.90141
4        -  meatmeal 11 15.49296
5        -   soybean 14 19.71831
6        - sunflower 12 16.90141
```

---

## From
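user to functions

### Data analysis for research projects

#### What such a helper could look like

The "With function" output can be produced by wrapping the painful steps in one function. This is only a minimal sketch under my own assumptions: `desc_cat_sketch()` is a made-up name, it is not the actual `desc_cat()` from the linked repository, and it assumes every variable has at least one level.

``` r
# Hypothetical helper for illustration; not the real desc_cat()
desc_cat_sketch <- function(data) {
  # Loop over column names so each result is labelled with its variable
  sapply(names(data), function(v) {
    tbl <- table(data[[v]])
    data.frame(
      Variable = c(v, rep("-", length(tbl) - 1)),  # name once, then dashes
      Label    = names(tbl),
      n        = as.numeric(tbl),
      Percent  = as.numeric(prop.table(tbl) * 100)
    )
  }, simplify = FALSE)
}

res <- desc_cat_sketch(chickwts["feed"])
res
```

Returning a named list keeps one table per variable, so the same call should also work when several categorical columns are passed at once.

---

## From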
user to functions

### Custom functions for students

- One of my students needed custom functions for his analysis, involving new methods
- So, I wrote some [functions](https://github.com/wnarifin/medicalstats-in-R/blob/master/functions/alpha_fun.R) for him to use in his project. Fortunately, he was willing to use R 😎
- I also wrote some [functions](https://github.com/wnarifin/medicalstats-in-R/blob/master/functions/tbl2raw_fun.R) for teaching stats lectures, otherwise the lecture would turn into how to create raw data by messing around with `table()` and `rep()`

--

.pull-left[
#### Cross-tabulation table

```
    doc2
doc1  1  2
   1 30  5
   2 15 30
```
]

--

.pull-left[
#### → Raw dataset (painful way)

``` r
y = rep(1:dim(tbl_12)[2], times = margin.table(tbl_12, 2))
x = rep(rep(1:dim(tbl_12)[1], dim(tbl_12)[2]), as.numeric(tbl_12))
xy = data.frame(x, y)
colnames(xy) = names(dimnames(tbl_12))
xy |> head(3)
```

```
  doc1 doc2
1    1    1
2    1    1
3    1    1
```
]

---

## From user to functions

### Custom functions for students

- One of my students needed custom functions for his analysis, involving new methods
- So, I wrote some [functions](https://github.com/wnarifin/medicalstats-in-R/blob/master/functions/alpha_fun.R) for him to use in his project. Fortunately, he was willing to use R 😎
- I also wrote some [functions](https://github.com/wnarifin/medicalstats-in-R/blob/master/functions/tbl2raw_fun.R) for teaching stats lectures, otherwise the lecture would turn into how to create raw data by messing around with `table()` and `rep()`

.pull-left[
#### Cross-tabulation table

```
    doc2
doc1  1  2
   1 30  5
   2 15 30
```
]

.pull-left[
#### → Raw dataset (honorable way)

``` r
source("tbl2raw_fun.R")
tbl2raw(tbl_12) |> head(3)
```

```
  doc1 doc2
1    1    1
2    1    1
3    1    1
```
]

---

## From
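user to functions

### Custom functions for students

#### One possible way to write the conversion

For the curious, the table-to-raw-data step can be sketched in a few lines of base R. This is my own illustrative version: `tbl2raw_sketch()` is a made-up name, the real `tbl2raw()` in the repository may work differently, and `tbl_12` is rebuilt here from the counts on the slide.

``` r
# Hypothetical re-implementation for illustration; not the real tbl2raw()
tbl2raw_sketch <- function(tbl) {
  long <- as.data.frame(tbl)                   # one row per cell, counts in Freq
  rows <- rep(seq_len(nrow(long)), long$Freq)  # repeat each cell Freq times
  raw  <- long[rows, names(long) != "Freq", drop = FALSE]
  rownames(raw) <- NULL
  raw
}

# Cross-tabulation from the slide (assumed layout of tbl_12)
tbl_12 <- as.table(matrix(c(30, 15, 5, 30), nrow = 2,
                          dimnames = list(doc1 = c("1", "2"),
                                          doc2 = c("1", "2"))))
raw <- tbl2raw_sketch(tbl_12)
head(raw, 3)
table(raw)  # tabulating the raw data again should recover tbl_12
```

Note that this sketch returns the variables as factors, and it assumes no dimension of the table is itself named `Freq`.

---

## From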
user to functions

### R Shiny vs JavaScript

- I needed to develop some R Shiny apps for my online [sample size calculator](https://wnarifin.github.io/ssc_web.html)
- JavaScript (`jstat.js`) did not have the stats functions that I needed, although I would have preferred to keep all code in HTML + JS
- R came to the rescue
- To ensure that users won't even notice the difference, I made an effort to keep the web design the same for the [R Shiny](https://wnarifin.shinyapps.io/ss_sem_rmsea/) and [HTML + JS](https://wnarifin.github.io/ssc/ss1mean.html) pages
- The R Shiny app relied on [R functions](https://github.com/wnarifin/medicalstats-in-R/blob/master/functions/ss_sem_fun.R) written just for this purpose

---

## From user to functions

### R Shiny vs JavaScript

.pull-left[
#### R Shiny
]

.pull-left[
#### HTML + JS
]

---

## From user to functions

### PhD project

- I needed many functions to implement existing and new methods, none of which had been implemented in any package at that time
- I ended up writing many more functions to shorten long and repetitive code
- I relied a lot on `source("r_functions_script")` calls
- To some extent, this experience left me "itching" to write functions here and there

</br></br><small>*Some of the [R scripts and functions](https://github.com/wnarifin/sipw_in_pvb/tree/main/simulation/functions) were made available together with a recently published paper. I didn't share all of them, because I sometimes wrote functions for my own use in a way that only I could understand.</small>

---

class: center, middle

# From
functions to package

---

## From functions to package

- In 2021, I submitted a [review article](https://arxiv.org/abs/2509.12217) that also demonstrated how to implement some statistical methods in R
- The reviewers commented that the code was too lengthy (of course it was lengthy, because the methods did not exist in any package at that time)
- One of the reviewers asked me to develop a package for that article, so that the code would be easier to implement
- Instead of rebutting the reviewer, I took up the challenge to come up with a package

---

## From functions to package

- So, I clicked **RStudio > New Project > New Directory > New Package** and started editing from there
- I gathered all the functions that I developed for the PhD project and rewrote them properly
- Three months later, the package was ready and I resubmitted the paper; only then was it accepted

--

- The package "PVBcorrect" has been on [GitHub](https://github.com/wnarifin/PVBcorrect) since then

---

## From functions to package

- So, I clicked **RStudio > New Project > New Directory > New Package** and started editing from there
- I gathered all the functions that I developed for the PhD project and rewrote them properly
- Three months later, the package was ready and I resubmitted the paper; only then was it accepted
- The package "PVBcorrect" has been on [GitHub](https://github.com/wnarifin/PVBcorrect) since then
- But, I did not have the courage to submit to **CRAN**, mostly because I heard the review was difficult, and you have to keep updating the package with each new R release, etc.

---

class: center, middle

# From
package to CRAN

---

## From package to CRAN

- It had been on my mind for quite some time: it was troublesome to use my own package, as I had to install it from GitHub or a local copy
- Wouldn't it be easier if it was available from CRAN? i.e. `install.packages("PVBcorrect")`

--

- I surveyed some simple packages available on CRAN and looked at their source code, for inspiration and to know what to expect
- Then, I started working on submitting the PVBcorrect package to CRAN in early Sept 2025
- I relied on the steps outlined in <https://r-pkgs.org/release.html>

--

- I tried my luck and submitted the package as it was (i.e. with some error messages)

--

- ... quickly **rejected**

---

## From package to CRAN

- Then, after ironing out some issues, I managed to get down to only two warning messages, one of which was because I used the `<<-` assignment (ever used this one?)
- I tried my luck with CRAN again; this time it was **rejected** by a human reviewer for this reason

--

- I spent some time rewriting some important functions in my package, as I relied heavily on `<<-` to keep temporary statistical weights between iterations

---

## From package to CRAN

- Once the code worked again without the `<<-`, I resubmitted to CRAN
- ... and well, it was **rejected** again, now by another human reviewer

--

- The comments:
  1. reduce the length of the package title (the `Title` field in DESCRIPTION) to < 65 characters
  2. add more details in the Description text
  3. add DOI/URL for references to all methods used in the package

--

- I resubmitted in early Oct 2025; it was **accepted** (finally) and listed on CRAN that month

---

## From
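package to CRAN

### Living without `<<-`

One common way to drop `<<-` is to hand the state (here, the weights) back through the return value and reassign it locally on each iteration. The sketch below is a toy example under my own assumptions: `update_weights()` and `fit_iteratively()` are made-up names and the reweighting rule is arbitrary, so this is not the code in PVBcorrect.

``` r
# Toy example only: made-up function names and an arbitrary reweighting rule
update_weights <- function(w, x) {
  centre <- weighted.mean(x, w)
  w_new <- w * exp(-abs(x - centre))  # down-weight points far from the centre
  w_new / sum(w_new)                  # return the new state instead of using <<-
}

fit_iteratively <- function(x, n_iter = 5) {
  w <- rep(1 / length(x), length(x))  # equal starting weights
  for (i in seq_len(n_iter)) {
    w <- update_weights(w, x)         # ordinary local assignment
  }
  list(weights = w, estimate = weighted.mean(x, w))
}

res <- fit_iteratively(c(1, 2, 3, 100))
res$estimate
```

Because the weights only travel through arguments and return values, nothing outside the function is modified between iterations.

---

## From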
package to CRAN

- Now, `PVBcorrect` is available for installation from CRAN at <https://CRAN.R-project.org/package=PVBcorrect>
- Easily installed with `install.packages("PVBcorrect")`

--

---

## From package to CRAN

The steps that worked for me:

``` r
# Preliminaries of new submission
usethis::use_mit_license()  # set MIT license, will auto update DESCRIPTION
usethis::use_news_md()  # Package NEWS link in Help
usethis::use_cran_comments()  # Comments documenting communications

# Check the package readiness for CRAN
devtools::check(remote = TRUE, manual = TRUE)  # basic CRAN check
devtools::check_win_devel()  # more extensive check on win-builder service
# patiently wait for it to finish!
# search win-builder: in email

# Submit to CRAN
devtools::submit_cran()
# check email and respond accordingly
```

--

and make sure that the CRAN readiness check comes back with no errors, no warnings and only 1 note:

```
## R CMD check results

0 errors | 0 warnings | 1 notes

This is a new submission.
```

---

class: center, middle

#
R you Ready

---

## R you Ready

### Tips

---

## R you Ready

### Tips

#### 1. Base package

- learn and get used to the functions in the base package
- reduce the dependencies of your package on external packages
- don't rely too much on the tidyverse for very simple tasks
- e.g. `data[,"var"]` vs `data |> select(var)`

---

## R you Ready

### Tips

#### 1. Base package

#### 2. Function

- learn to write functions to make your (coding) life easier
- if you're working on reasonably lengthy code, identify repetitive and lengthy code chunks and turn those into functions
- I talked about this at [R Conference 2024](https://wnarifin.github.io/2024-r-conf-my-pres-R)

---

## R you Ready

### Tips

#### 1. Base package

#### 2. Function

#### 3. GitHub

- learn how to use GitHub to share your functions and packages

---

##
R you Ready

### Quick start

---

## R you Ready

### Quick start

#### 1. RStudio > New Project > New Directory > New Package

<!-- - if you want to start from scratch -->
<!-- Showcase: gif/image of the steps -->
<center>
<video width="540" height="360" autoplay muted loop> <!-- muted is a must for autoplay -->
<!-- <video width="540" height="360" controls autoplay muted loop> -->
<source src="img/new_package.mp4" type="video/mp4">
</video>
</center>

---

## R you Ready

### Quick start

#### 2. Fork a simple R package on GitHub

- if you want to edit existing code for learning purposes
- here, I forked [`genzplyr`](https://github.com/hadley/genzplyr), and named it [`kelateplyr`](https://github.com/wnarifin/kelateplyr) 😄

<!-- Showcase: wnarifin/kelateplyr -->

---

## R you Ready

### Quick start

#### 2. Fork a simple R package on GitHub

- if you want to edit existing code for learning purposes
- here, I forked [`genzplyr`](https://github.com/hadley/genzplyr), and named it [`kelateplyr`](https://github.com/wnarifin/kelateplyr) 😄

<!-- Showcase: wnarifin/kelateplyr -->

.pull-left[
New code:

``` r
library(kelateplyr)
mtcars |>
  napis(mpg > 20) |>
  bare_nok(mpg, cyl, hp) |>
  kucar(kpg = mpg * 1.6) |>
  wak_strek(desc(mpg))
```
]

.pull-right[
Same output:

```
                mpg cyl  hp   kpg
Toyota Corolla 33.9   4  65 54.24
Fiat 128       32.4   4  66 51.84
Honda Civic    30.4   4  52 48.64
Lotus Europa   30.4   4 113 48.64
Fiat X1-9      27.3   4  66 43.68
Porsche 914-2  26.0   4  91 41.60
```

```
...
```
]

---

## R you Ready

### Quick start

#### 3. Download source code for a package from CRAN

- you can have a look at and edit a simple package (e.g. `SampleSizeDiagnostics`, which has only one function)
- if you have a favorite package, you can make some adjustments for your own use

<!-- <center> -->
<!-- <img src="img/ssdiagnostics.png"> -->
<!-- </center> -->

---

class: center, middle

# Thanks!