Skip to content

Instantly share code, notes, and snippets.

@klmr
Forked from mschubert/unix.R
Last active August 29, 2015 13:56
Show Gist options
  • Save klmr/9070369 to your computer and use it in GitHub Desktop.
Save klmr/9070369 to your computer and use it in GitHub Desktop.

Bugfixes

  • cmdescape fixes the use of parameters which contain special characters of the terminal (or spaces)
  • Regression: the updated version does not work with custom specified arguments. If this is desired, there should be an optional list parameter for this purpose.

Style fixes & Improvements

  • Single expressions do not need (and should not have) parentheses. This goes universally (and in particular, since R is a functional programming language, for functions).
  • There’s no need to forward %sed% to sed (and same for %grep%), it can be declared as an alias. In fact, the original function is somewhat redundant since operators can be called as functions.
  • Use `…` instead of '…' to specify unusual names: the former is syntactically a name while the latter is a string! It only “happens to work” because R has special lookup rules for functions when encountering a string. The same won’t work for other (non-function) objects though.
  • setNames obviates the need for temporary variables to set the names of a vector/list.
  • Analogously for structure to set an object’s class.
  • Using <<< instead of echo calls one command less and is more efficient. NB: This might require specifying an interpreter. Arguably using bash -c … is safer (but won’t work on Windows – but then, none of the methods here are platform independent anyway).
  • I’m not clear about the meaning of setting the class in egrep … the result probably isn’t what I’d expect, at any rate.
  • Replace use of paste("a ", " b ", " c", sep='') with paste("a", "b", "c"). In cases where sep='' is actually desired, there’s paste0, which does exactly that.
cmdescape = function (str)
sprintf("'%s'", gsub("'", "'\"'\"'", str))
sed = function(x, pattern)
setNames(sapply(x, function(i)
structure(system(paste("sed", cmdescape(pattern), "<<<", cmdescape(i)), intern=TRUE),
class=class(i))),
names(x))
`%sed%` = sed
egrep = function(x, pattern) {
stopifnot(class(pattern) == 'character' || class(pattern) == 'numeric')
structure(lapply(x, function(i)
tryCatch(
system(paste("grep", cmdescape(pattern), "<<<", cmdescape(i)), intern=TRUE),
error = function(e) NULL,
warning = function(w) NULL)),
class = class(x))
}
`%grep%` = egrep
@mschubert
Copy link

Ok, so let's review the review. I'm not going to be nitpicking by pointing out things like "there is a space after your first function () but not on the second one", etc.

First off, I'm happy that you pointed out some things I genuinely did not know. Can I send you all my other code as well? :D

Bugfixes

I don't agree with your bugfixes, or at least not with the conclusion. The commands were supposed to mirror UNIX grep and sed, and if I can't use command line arguments anymore (either because they're not supported or they need a function call that can not be handled with an operator) they lose most of their value (esp. grep).

My proposed solution would be to use cmdescape, but to un-escape command line arguments (that should be fairly simple).

Others

Do not define functions and operators, operators can be called as functions

Does that mean %grep% in the above case masks R's grep? This would be a bug, because replacing core functionality with different one under the same name is bound to lead to problems.

Use setNames and structure instead of creating temporary variables

The drawback is that this also leads to a higher nesting of commands, which in turn gets harder to read and debug. With R I tend to agree, however, because it copies everything otherwise.

Use <<< instead of echo

Is this portable within unix, i.e. bash<4?

Setting the class in egrep

If I grep integers, I expect the result to be an numeric and not a character.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment