Introduction

csvgrep is a grep-like program that handles CSV files (or any other delimited file format) by filtering the content of their columns through an extremely simple syntax.

Usage

Usage: csvgrep [options] [expression] file

Options:
    --show-header            print enumerated column titles
                             (this option must be used alone and without expression)
    --with-header            print the column titles in the first output line
    --no-quote               print fields unquoted, even if they need to be quoted
                             (e.g. fields that contain commas in CSV files)
    --close                  print a separator at the last position of each output line
    --print-all              send to output: (i) all input fields or 
                                             (ii) all not hidden fields (when some column is hidden)
    --separator=S            set S as input and output separator (',' is the default separator)
    --output-separator=S     set S as output separator (is overridden when --separator is used)
    --input-separator=S      set S as input separator (is overridden when --separator is used)
    --help                   print this message
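
To make the effect of the separator and quoting options more concrete, here is a minimal Python sketch built on the standard csv module. It is not csvgrep's implementation: the function name convert and its parameters are hypothetical, and the quoting rule (quote only fields that contain the output separator) is an assumption about what the default behaviour looks like.

```python
import csv
import io


def convert(text, input_separator=",", output_separator=",", quote=True):
    """Re-emit delimited text with another separator (illustrative only)."""
    reader = csv.reader(io.StringIO(text), delimiter=input_separator)
    out = io.StringIO()
    if quote:
        # csv.writer quotes a field only when it contains the separator,
        # a quote character or a line break (QUOTE_MINIMAL is the default).
        writer = csv.writer(out, delimiter=output_separator, lineterminator="\n")
        writer.writerows(reader)
    else:
        # Rough analogue of --no-quote: join the fields as they are.
        for row in reader:
            out.write(output_separator.join(row) + "\n")
    return out.getvalue()


quoted = convert("a;b,c\n", input_separator=";")
unquoted = convert("a;b,c\n", input_separator=";", quote=False)
```

In this sketch the second field contains a comma, so it is wrapped in quotes when written with ',' as the output separator, while the unquoted variant produces a line that can no longer be split back into the original two fields.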

Expression syntax

An expression is a comma-separated list of items, each of which is either a key alone or a key-value pair.

The key-value structure has the general syntax:

[[] [!] KEY[=VALUE] []]

where:

  • KEY is a regular expression that matches a column name (if it matches more than one column name, the first one declared is used), or a column's numeric id;
  • = separates the key from the value in a key-value pair;
  • VALUE is a regular expression that is matched against the column's values in each row; when it is absent, the column matched by KEY is simply sent to the output;
  • [ and ] placed around an item hide that field from the output;
  • ! negates the expression.
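
The syntax above can be sketched as a small parser. This is only an illustration of the grammar in Python, not the way csvgrep itself parses expressions: the names Item, parse_item and parse_expression are hypothetical, and the sketch assumes that no comma appears inside a KEY or VALUE and that a leading [ with a trailing ] always means "hidden".

```python
from typing import List, NamedTuple, Optional


class Item(NamedTuple):
    key: str              # regex for the column name, or a numeric id
    value: Optional[str]  # regex for the row values, None for key-only items
    hidden: bool          # True when the item is wrapped in [ ]
    negated: bool         # True when the item starts with !


def parse_item(text: str) -> Item:
    """Parse one item of the form [[] [!] KEY[=VALUE] []]."""
    text = text.strip()
    hidden = text.startswith("[") and text.endswith("]")
    if hidden:
        text = text[1:-1].strip()
    negated = text.startswith("!")
    if negated:
        text = text[1:].lstrip()
    # Split on the first '=' only; spaces around '=' are kept on purpose,
    # because they become part of the regular expressions.
    key, sep, value = text.partition("=")
    return Item(key, value if sep else None, hidden, negated)


def parse_expression(expression: str) -> List[Item]:
    # Naive split: assumes no comma appears inside a KEY or VALUE.
    return [parse_item(part) for part in expression.split(",")]


items = parse_expression("[! Middle Name=Doe$], First Name, Salary")
```

Here items holds one hidden, negated key-value item for "Middle Name" followed by two key-only items, which are the columns that would be printed.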

The columns' ids are obtained with the command:

$ csvgrep --show-header names.csv
0,First Name
1,Middle Name
2,Last Name
3,Job Title
4,Salary

The numbers at the beginning of each line are the columns' ids.
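
The numbering is zero-based, which a few lines of Python can reproduce. This is a sketch under assumptions, not csvgrep's code: show_header is a hypothetical name, and the sketch simply takes the first row as the header.

```python
import csv
import io


def show_header(stream, separator=","):
    """Return 'id,title' lines for the first row, numbering from 0."""
    header = next(csv.reader(stream, delimiter=separator))
    return [f"{index}{separator}{title}" for index, title in enumerate(header)]


sample = io.StringIO("First Name,Middle Name,Last Name,Job Title,Salary\n")
lines = show_header(sample)
print("\n".join(lines))
```

Because ids start at 0, the id of a column is always one less than its position as counted by tools such as awk.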

Examples

Print all fields of each line where the column titled "Middle Name" matches the regular expression Doe$. If the "Middle Name" column doesn't exist, an error is raised.

$ csvgrep --print-all "Middle Name=Doe$" names.csv

Print the "First Name" and "Salary" columns of each line where "Middle Name" matches Doe$, hiding the "Middle Name" column itself.

$ csvgrep "[Middle Name=Doe$], First Name, Salary" names.csv

Print the "First Name" and "Salary" columns of all lines where "Middle Name" doesn't match the regular expression.

$ csvgrep "[! Middle Name=Doe$], First Name, Salary" names.csv

Print the first and fourth columns (ids 0 and 3, since ids start at 0).

$ csvgrep "0, 3" any.csv

Print the first and fourth columns (ids 0 and 3), keeping only the lines where the fifth column (id 4) matches True; the fifth column itself is hidden.

$ csvgrep "0, 3, [4=True]" any.csv
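
The last example combines column selection with a hidden filter. A rough Python equivalent is sketched below. It is not csvgrep's implementation: select and its parameters are hypothetical, the sample data is made up, the regular expression is assumed to match anywhere in the field (re.search), and multiple filters are assumed to be combined with a logical AND.

```python
import csv
import io
import re


def select(rows, shown, filters):
    """Yield the `shown` columns of each row that passes every filter.

    shown   -- zero-based ids of the columns to print
    filters -- (column id, regex, negated) triples; filter columns are
               not printed unless they are also listed in `shown`
    """
    for row in rows:
        passed = all(
            (re.search(pattern, row[col]) is not None) != negated
            for col, pattern, negated in filters
        )
        if passed:
            yield [row[col] for col in shown]


# Made-up sample data, for illustration only.
data = io.StringIO(
    "id,name,city,team,active\n"
    "1,Ann,Oslo,red,True\n"
    "2,Bob,Rome,blue,False\n"
)
reader = csv.reader(data)
next(reader)  # skip the header line
result = list(select(reader, shown=[0, 3], filters=[(4, "True", False)]))
print(result)
```

Only the first data row passes the filter on column 4, and only columns 0 and 3 of that row are kept. Setting the third element of a filter triple to True plays the role of the ! prefix.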

Emulating unix commands

Unix command                                    csvgrep                 Description
awk 'BEGIN{OFS=","} {print $1,$2,$4,$6}'        csvgrep "0,1,3,5"       Print columns 1, 2, 4 and 6 delimited by commas.
grep "^Doe"                                     csvgrep "0=^Doe"        A line starting with Doe is equivalent to a first column starting with Doe.
awk 'BEGIN{OFS=","} $1 ~ /^Doe/ {print $2,$5}'  csvgrep "[0=^Doe],1,4"  Print the second and fifth columns when the first column begins with Doe.

Tips

  • If no file is given, csvgrep reads its input from the standard input.
  • If more than one file is passed as a parameter, csvgrep raises an error.
  • Any spaces surrounding the = symbol in a key-value expression become part of the regular expressions.
  • The header line is the first line where the number of columns is greater than 1, so if your file has only one column I suggest using awk instead.
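
The header rule in the last tip can be sketched in Python as follows. This is an illustration of the stated rule, not csvgrep's code: find_header is a hypothetical name, and the sketch returns None when no line has more than one column, which is only one possible way to handle that case.

```python
import csv
import io


def find_header(stream, separator=","):
    """Return the first row with more than one column, or None."""
    for row in csv.reader(stream, delimiter=separator):
        if len(row) > 1:
            return row
    return None


with_title = io.StringIO("report\n\nname,age\nAnn,30\n")
single_column = io.StringIO("a\nb\n")
header = find_header(with_title)
missing = find_header(single_column)
```

In this sketch the title line and the blank line are skipped, so header is the "name,age" row, while a single-column file never yields a header at all, which is why such files are better handled with awk.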

Project

The project is hosted at Google:

csvtools

Download

csvgrep
Last modified December 6, 2008 9:29 pm