@conormm
conormm / r-to-python-data-wrangling-basics.md
Last active June 26, 2024 07:56
R to Python: Data wrangling with dplyr and pandas

R to Python data wrangling snippets

The dplyr package in R makes data wrangling significantly easier. The beauty of dplyr is that, by design, the options available are limited: a small set of key verbs forms the core of the package. Using these verbs, you can solve a wide range of data problems effectively and quickly. Whilst transitioning to Python, I have greatly missed the ease with which I can think through and solve problems using dplyr in R. The purpose of this document is to demonstrate how to execute the key dplyr verbs when manipulating data in Python (with the pandas package).
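As a rough preview of the kind of translation this document is about, the sketch below pairs a few common dplyr calls with one possible pandas equivalent. The toy data frame and its column names (`species`, `length`, `width`) are invented for illustration, and the pandas idioms shown are only one of several reasonable ways to express each step.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "length": [1.0, 3.0, 2.0, 4.0],
    "width": [0.5, 1.5, 1.0, 2.0],
})

# dplyr: filter(df, length > 1.5)
filtered = df[df["length"] > 1.5]

# dplyr: select(df, species, length)
selected = df[["species", "length"]]

# dplyr: arrange(df, desc(length))
arranged = df.sort_values("length", ascending=False)

# dplyr: mutate(df, area = length * width)
mutated = df.assign(area=df["length"] * df["width"])

# dplyr: df %>% group_by(species) %>% summarise(mean_length = mean(length))
summarised = (
    df.groupby("species", as_index=False)
      .agg(mean_length=("length", "mean"))
)
```

Each pandas expression returns a new object and leaves `df` unchanged, which is close to how dplyr verbs behave when their result is assigned to a new name.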

dplyr is organised around six key verbs:

anonymous
anonymous / Textplot.R
Created December 16, 2011 18:48
Upgrade of textplot in PerformanceAnalytics and gplot
#' Display text information in a graphics plot.
#'
#' This function displays text output in a graphics window. There is an option to display the text using the largest font that will fit in the plotting region.
#' For matrices, data.frames and vectors, a specialized textplot function is available that plots each of the cells individually in a way that is visually
#' appealing (maintains the table-like/grid-like structure of the data). If present, row and column labels will be displayed in a bold font.
#'
#' This function was modified from the PerformanceAnalytics version by Peter Carl and Brian G. Peterson (brian@@braverock.com).
#' Assistance was provided by John Colby in this post:
#' \url{http://stackoverflow.com/questions/8523944/left-justify-a-column-using-textplot-gplots-or-performanceanalytics}