/1.R

Created May 18, 2014 02:25
# Social security numbers in the United States are represented by
# numbers conforming to the following format:
#
# a leading 0 followed by two digits
# followed by a dash
# followed by two digits
# followed by a dash
# finally followed by four digits
#
# For example 023-45-7890 would be a valid value,
# but 05-09-1995 and 059-2-27 would not be.
#
# Implement the body of the function 'extractSecuNum' below so that it
# returns a numeric vector whose elements are Social Security numbers
# extracted from a text, i.e., a vector of strings representing the text lines,
# passed to the function as its 'text' argument.
# (You can assume that each string in 'text' contains
# either zero or one Social Security numbers.)
extractSecuNum = function(text){
  # A leading 0 and two digits, a dash, two digits, a dash, four digits.
  # The lookarounds reject candidates embedded in a longer run of digits
  # or dashes, such as '1023-45-7890' or '023-45-78901'.
  pattern = '(?<![0-9-])0[0-9]{2}-[0-9]{2}-[0-9]{4}(?![0-9-])'
  # regexpr() locates the first match in each line (-1 when there is none).
  matches = regexpr(pattern, text, perl = TRUE)
  # regmatches() keeps only the matched substrings, so lines without a
  # number are dropped. The values stay strings, because the dashes and
  # the leading zero would not survive conversion with as.numeric().
  return(regmatches(text, matches))
}
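# The format rules above can also be checked one string at a time. This is a
# minimal sketch, assuming a hypothetical helper 'isSecuNum' (not part of the
# exercise) that tests whether a whole string is a valid number.
isSecuNum = function(x){
  # ^ and $ anchor the pattern, so the entire string must match the format.
  grepl('^0[0-9]{2}-[0-9]{2}-[0-9]{4}$', x)
}
# The three examples from the description: only the first one is valid.
isSecuNum(c('023-45-7890', '05-09-1995', '059-2-27'))
# returns TRUE FALSE FALSE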
# Implement the body of the function 'rmMultipleBlanks'
# below so that it removes blank characters (i.e.,
# spaces, tabs, and newlines) from the start and end of
# each string in a character vector. For example:
#
# " hello, world " should be converted to "hello, world"
# "\n Hey you " should be converted to "Hey you"
# "\t\t\tHey you\n\n " should be converted to "Hey you"
#
# The function takes an argument called 'stringsWithBlanks', which
# is a character vector of strings (possibly with unwanted leading
# and/or trailing blank characters).
#
# The function should return a character vector of the same
# length containing the strings with the
# leading and/or trailing blanks removed.
rmMultipleBlanks = function(stringsWithBlanks){
  # sub() is vectorized, so no explicit loop over the strings is needed.
  # The character class lists only space, tab, and newline; a comma inside
  # the brackets would be matched literally and strip leading commas too.
  # '^' anchors the first pattern to the start of each string.
  noLeading = sub('^[ \t\n]+', '', stringsWithBlanks)
  # '$' anchors the second pattern to the end of each string.
  return(sub('[ \t\n]+$', '', noLeading))
}
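# For comparison, base R (3.2.0 and later) provides 'trimws', which strips
# leading and trailing whitespace in one call. This is only a sketch of an
# alternative, not the approach the exercise asks for: by default 'trimws'
# also strips carriage returns, which the description does not list.
# The variable names 'examples' and 'trimmed' are illustrative.
examples = c(' hello, world ', '\n Hey you ', '\t\t\tHey you\n\n ')
trimmed = trimws(examples)
trimmed
# returns "hello, world" "Hey you" "Hey you"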