John Stanton-Geddes johnstantongeddes

## r-to-python-data-wrangling-basics.md

      
conormm / r-to-python-data-wrangling-basics.md
Last active April 24, 2024 18:22

R to Python: Data wrangling with dplyr and pandas

R to Python data wrangling snippets

The dplyr package in R makes data wrangling significantly easier.
The beauty of dplyr is that, by design, the options available are limited.
Specifically, a set of key verbs form the core of the package.
Using these verbs you can solve a wide range of data problems effectively in a shorter timeframe.
Whilst transitioning to Python, I have greatly missed the ease with which I can think through and solve problems using dplyr in R.
The purpose of this document is to demonstrate how to execute the key dplyr verbs when manipulating data using Python (with the pandas package).
dplyr is organised around six key verbs:
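
As a rough sketch of what the pandas side can look like, assuming the verbs in question are dplyr's `filter`, `select`, `arrange`, `mutate`, `summarise` and `group_by`: the DataFrame and its column names below are invented for illustration and do not come from the original snippets. Each line carries the approximate dplyr call it mirrors as a comment.

```python
import pandas as pd

# Hypothetical data, made up purely for this illustration.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "mass": [1.0, 3.0, 2.0, 6.0],
    "length": [10, 30, 20, 40],
})

# filter(df, mass > 1.5): keep rows matching a condition
filtered = df.query("mass > 1.5")

# select(df, species, mass): keep a subset of columns
selected = df[["species", "mass"]]

# arrange(df, desc(mass)): sort rows
arranged = df.sort_values("mass", ascending=False)

# mutate(df, ratio = mass / length): add a derived column
mutated = df.assign(ratio=df["mass"] / df["length"])

# group_by(df, species) %>% summarise(mean_mass = mean(mass))
summary = df.groupby("species", as_index=False).agg(mean_mass=("mass", "mean"))

print(summary)
```

These are only one possible mapping; pandas usually offers several ways to express each step (for example boolean indexing instead of `query`), so treat the pairing as a starting point rather than a strict equivalence.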

  
## orig.png

      
hrbrmstr / orig.png
Last active July 16, 2023 06:43

Supreme Annotations - moar splainin here: http://rud.is/b/2016/03/16/supreme-annotations/ - NOTE: this requires the GitHub version of ggplot2

## sentiment_score_simple.R
# Code to fetch news streams from 5 live sources, process the streams and text
# and apply a simple sentiment scoring algorithm.
#
# A writeup of the analysis can be found here:
# https://www.linkedin.com/pulse/article/20141109035942-34768479-r-sentiment-scoring-hsbc-w-harvard-general-inquirer

# Define the packages we want to load:
packs = c(
  "tm",                         # Text mining
  "tm.plugin.webmining",        # Web-source plugin for text mining