@jrosell
jrosell / pipeline.R
Last active August 30, 2025 19:26
Example of a machine learning pipeline in Python using dagster that skips recomputing assets when their last modification dates have not changed since the previous computation. Also includes an example in R using the {targets} package.
library(targets)
library(tibble)
library(readr)
library(glmnet)
# Step 1: load raw data
load_raw_data <- function() {
Sys.sleep(10) # simulate slow load
tibble(
library(tidyverse)
library(tidymodels)
set.seed(123)
n <- 10000
Evento1 <- rbinom(n, size = 1, prob = 0.7) 
Evento2 <- rbinom(n, size = 1, prob = 0.6) 
p_target <- ifelse(Evento1 == 1, 0.7, 0.2)
p_target <- p_target + ifelse(Evento2 == 1, 0.3, -0.05)
p_target <- pmin(pmax(p_target, 0.01), 0.99) # clamp so probabilities stay strictly between 0 and 1
@jrosell
jrosell / experiments_with_binary_operators.R
Last active August 1, 2025 17:13
Inspired by Rust and Haskell.
# getUser cfg =
# lookup "username" cfg >>= \uname ->
# lookup "age" cfg >>= \ageStr ->
# readMaybe (trim ageStr) >>= \age ->
# lookup "email" cfg >>= \emailRaw ->
# validateEmail (trim emailRaw) >>= \email ->
# Just (User uname age email)
# 1. Using atomic vectors -----
@jrosell
jrosell / flatmap_vectorize.R
Last active July 31, 2025 10:22
lapply and unlist are base R operations that are commonly chained together; this is the typical flatmap operation.
# Reference: https://www.biobits.be/biofunctor/2025/07/30/what-s-r-vector-victor/
# Laboratory experiments are often performed in 96-well plastic plates, with 8 rows (labeled A-H) and 12 columns (labeled 1-12). Each microwell is a separate micro-experiment (labeled A1-H12).
rows <- LETTERS[1:8]
columns <- 1:12 |> sprintf(fmt = "%02i")
# Vector recycling is not what we want here.
paste0(rows, columns)
#> [1] "A01" "B02" "C03" "D04" "E05" "F06" "G07" "H08" "A09" "B10" "C11" "D12"
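# A minimal sketch of the flatmap idea (an illustration, not necessarily how the original gist continues).
# `flatmap` is a hypothetical helper name: map each element to a vector, then flatten the results.
flatmap <- function(x, f) unlist(lapply(x, f))
# Pair every row letter with every zero-padded column number instead of relying on recycling.
wells <- flatmap(LETTERS[1:8], \(row) paste0(row, sprintf("%02i", 1:12)))
length(wells)
#> [1] 96
head(wells, 3)
#> [1] "A01" "A02" "A03"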
# 1. Preparations & helper functions -----
rlang::check_installed(c("tidyverse", "repurrrsive", "DBI", "duckdb", "ollamar", "ellmer", "glue", "testthat"))
library(tidyverse)
library(glue)
library(repurrrsive)
library(DBI)
library(duckdb)
@jrosell
jrosell / visualize-predictor-correlation-matrix.R
Last active July 10, 2025 17:12
Visualization of the predictor correlation matrix, significance and clustering.
load(url("https://github.com/topepo/FES/raw/refs/heads/master/Data_Sets/Ischemic_Stroke/stroke_data.RData"))
rlang::check_installed(c("tidyverse", "tidymodels", "corrplot"))
library(tidyverse)
library(tidymodels)
VC_preds <-
c("CALCVol", "CALCVolProp", "MATXVol", "MATXVolProp", "LRNCVol",
"LRNCVolProp", "MaxCALCArea", "MaxCALCAreaProp", "MaxDilationByArea",
@jrosell
jrosell / ai-evals.R
Created July 2, 2025 16:36
Did you know that you can evaluate AI models and compare their performance? Here is an example using the {vitals} package from @posit_pbc by @simonpcouch. Evals, evals, evals
rlang::check_installed(c("vitals", "ellmer", "dplyr", "ggplot2"))
library(vitals)
library(ellmer)
library(dplyr)
library(ggplot2)
eval_df <- tibble(
input = c("What's 2+2?", "What's 2+3?", "What's 2+4?"),
target = c("4", "5", "6")
@jrosell
jrosell / formula_typed_functions.R
Last active July 1, 2025 22:22
An experiment creating new functions with type validation using formulas
# An experiment creating new functions with type validation using formula notation in R.
# For example, for two double arguments as input and an integer output, the formula would be: integer ~ double + double
library(rlang)
library(testthat)
type_check_fn <- function(type_name) {
switch(as.character(type_name),
"integer" = is.integer,
"double" = is.double,
@jrosell
jrosell / run_setup.qmd
Last active June 20, 2025 17:13
Reset the environment variables and execute the setup chunk from other R chunks in Quarto. USE WITH CAUTION AT YOUR OWN RISK
rm(list = ls()); rstudioapi::getActiveDocumentContext()$path |>
parsermd::parse_rmd(parse_yaml = FALSE) |>
parsermd::rmd_select("setup") |>
parsermd::as_document() |>
purrr::keep(\(x) !stringr::str_detect(x, '^```|^#')) |>
parse(text = _) |>
eval()
@jrosell
jrosell / google-ads-2-google-sheets.js
Created June 16, 2025 12:49
Script for a Google Ads account that exports data to Google Sheets
function main() {
var spreadsheetUrl = 'https://docs.google.com/spreadsheets/d/id/edit?usp=sharing';
var spreadsheet = SpreadsheetApp.openByUrl(spreadsheetUrl);
var pivotedSheet = getOrCreateSheet(spreadsheet, 'PivotedConversionActions'); // New pivoted data sheet
var firstDate = '2022-01-01';
var startDate = getDateNDaysAgo(200);
var endDate = getDateNDaysAgo(1);