@onlyphantom
Created June 16, 2018 15:33
Sample Code: Using SQL Server with Microsoft R
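# RevoScaleR (bundled with Microsoft R Server / R Client) provides RxSqlServerData and rxLinMod
library(RevoScaleR)
# ODBC connection string for the SQL Server instance (sample credentials as in the original)
sqlConnString <- "Driver=SQL Server;Server=SETHMOTTDSVM;Database=RDB;Uid=ruser;Pwd=ruser"
# read the table in chunks of 100000 rows
sqlRowsPerRead <- 100000
sqlTable <- "NYCTaxiBig"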
# Column types to apply on read; coded columns are converted to labelled factors
ccColInfo <- list(
  tpep_pickup_datetime = list(type = "character"),
  tpep_dropoff_datetime = list(type = "character"),
  passenger_count = list(type = "integer"),
  trip_distance = list(type = "numeric"),
  pickup_longitude = list(type = "numeric"),
  pickup_latitude = list(type = "numeric"),
  dropoff_longitude = list(type = "numeric"),
  dropoff_latitude = list(type = "numeric"),
  RateCodeID = list(type = "factor", levels = as.character(1:6),
                    newLevels = c("standard", "JFK", "Newark", "Nassau or Westchester",
                                  "negotiated", "group ride")),
  store_and_fwd_flag = list(type = "factor", levels = c("Y", "N")),
  payment_type = list(type = "factor", levels = as.character(1:2), newLevels = c("card", "cash")),
  fare_amount = list(type = "numeric"),
  tip_amount = list(type = "numeric"),
  total_amount = list(type = "numeric")
)
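# Data source pointing at the SQL Server table; rows are read in chunks of sqlRowsPerRead
nyc_sql <- RxSqlServerData(connectionString = sqlConnString, table = sqlTable,
                           rowsPerRead = sqlRowsPerRead, colInfo = ccColInfo)
# Fit and time a linear model on the rows where u < 0.75. The columns tip_percent, pickup_nhood,
# dropoff_nhood, pickup_dow, pickup_hour and u are not listed in ccColInfo, so they are assumed
# to already exist in the NYCTaxiBig table.
system.time(linmod <- rxLinMod(tip_percent ~ pickup_nhood:dropoff_nhood + pickup_dow:pickup_hour,
                               data = nyc_sql, reportProgress = 0,
                               rowSelection = (u < .75)))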
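
# A minimal base-R sketch of what rowSelection = (u < .75) appears to do, ASSUMING u is a
# per-row uniform random number stored in the table (the code above does not show how u was
# created). The data frame `toy` and the model `toy_mod` are hypothetical and the data is
# synthetic; nothing here touches SQL Server or RevoScaleR.
set.seed(1)
toy <- data.frame(
  tip_percent = rnorm(1000, mean = 15, sd = 5),
  pickup_hour = factor(sample(c("morning", "evening"), 1000, replace = TRUE)),
  u = runif(1000)
)
train <- toy[toy$u < 0.75, ]    # roughly 75% of rows, mirroring rowSelection = (u < .75)
test  <- toy[toy$u >= 0.75, ]   # remaining rows, usable as a hold-out set
toy_mod <- lm(tip_percent ~ pickup_hour, data = train)
# Keeping u as a stored column makes the split reproducible across separate model fits.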