miraculixx/README.md

## README.md

      
Written in response to this question on Stack Overflow.
Installation:
$ chmod +x csvparse.py

Usage:
$ cat sample.csv | ./csvparse.py
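
To see what the default filter keeps without running the script, the sketch below mirrors the logic of csvparse.py on an in-memory copy of sample.csv. It assumes the sample holds the three rows foo,abc / bla,cde / foo,fgh; the `io.StringIO` stand-in for stdin and the `kept` variable are illustrative only.

```python
import io

def filter(fields, line):
    # default condition from csvparse.py: keep rows whose first field is 'foo'
    return fields[0] == 'foo'

def parsed(infile, sep=','):
    # yield every input line for which filter() returns True
    for line in infile:
        fields = line.rstrip('\n').split(sep)
        if filter(fields, line):
            yield line

# in-memory stand-in for `cat sample.csv` (assumed contents)
sample = io.StringIO("foo,abc\nbla,cde\nfoo,fgh\n")
kept = list(parsed(sample))
print("".join(kept), end="")
# prints:
# foo,abc
# foo,fgh
```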

Add more filters:
Modify filter() accordingly. E.g. to filter on the first column,
change the existing condition:
if fields[0] == 'some other value':
    return True

To add more conditions, extend the filters in any way
you like. Here is one example derived from a decision
table ($<n> refers to field n, zero-indexed):
Conditions:   R1     R2     R3      R_else  
        $0    foo    abc    <any>   <else>
        $1    !xyz   !xyz   xyz      
Actions:
    include   X      X
    exclude                 X        X

def filter(fields, line):
    # R_else
    should_include = False
    # R1
    if fields[0] == "foo" and fields[1] != "xyz":
        should_include = True
    # R2
    if fields[0] == "abc" and fields[1] != "xyz":
        should_include = True
    # R3
    if fields[1] == "xyz":
        should_include = False
    return should_include

Note that you could also write the same logic as a single conditional expression;
however, this quickly becomes unmaintainable as more rules are added:
return (fields[0] in ['foo', 'abc']) and fields[1] != "xyz"
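
As a quick sanity check, the sketch below compares the rule-by-rule version with the single-expression version over a few representative inputs. The names `filter_table` and `filter_oneliner`, and the test values "other" and "uvw", are made up for this illustration.

```python
from itertools import product

def filter_table(fields):
    # rule-by-rule version, following the decision table
    should_include = False                          # R_else
    if fields[0] == "foo" and fields[1] != "xyz":   # R1
        should_include = True
    if fields[0] == "abc" and fields[1] != "xyz":   # R2
        should_include = True
    if fields[1] == "xyz":                          # R3
        should_include = False
    return should_include

def filter_oneliner(fields):
    # single-expression version
    return (fields[0] in ['foo', 'abc']) and fields[1] != "xyz"

# every combination of a first field and a second field
cases = [list(p) for p in product(["foo", "abc", "other"], ["xyz", "uvw"])]
mismatches = [c for c in cases if filter_table(c) != filter_oneliner(c)]
print(mismatches)  # []
```

Both versions agree on these inputs; the rule-by-rule form stays easier to extend when a new column of the decision table is added.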


## csvparse.py
#!/usr/bin/env python

import sys

def filter(fields, line):
    """
    put your conditions here

    return True to include the line in the output

    fields are all fields in line
    """
    if fields[0] == 'foo':
        return True
    return False

def parsed(infile, sep=','):
    """ helper function to call filter() """
    for line in infile:
        # strip the line ending so the last field compares cleanly
        fields = line.rstrip('\r\n').split(sep)
        if filter(fields, line):
            yield line

# only output lines for which filter() returns True
for output in parsed(sys.stdin):
    sys.stdout.write(output)
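
Splitting on the separator does not handle quoted fields that themselves contain the separator. If the input may contain such fields, one possible variation is to let the standard library's `csv` module do the splitting. This is only a sketch: `keep` and `parsed_csv` are hypothetical names, and the quoted sample row is invented for illustration.

```python
import csv
import io

def keep(fields):
    # same default condition as filter() in csvparse.py;
    # the bool() guard skips empty rows produced by blank lines
    return bool(fields) and fields[0] == 'foo'

def parsed_csv(infile, sep=','):
    # csv.reader splits each line and resolves quoted fields
    for fields in csv.reader(infile, delimiter=sep):
        if keep(fields):
            yield fields

# invented input: the first row has a quoted field containing a comma
sample = io.StringIO('foo,"a,b"\nbla,cde\n', newline='')
rows = list(parsed_csv(sample))
print(rows)  # [['foo', 'a,b']]
```

Unlike the script above, this variant yields lists of fields rather than raw lines, so writing them back out would need `csv.writer`.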

## sample.csv

          
foo,abc
bla,cde
foo,fgh