A command line tool to perform set operations with files
You can install sop using Homebrew, with go install, or by building it from source.
> brew install marcosy/tap/sop

> go install github.com/marcosy/sop

Remember to add GOPATH/bin to your PATH:
> export PATH="$GOPATH/bin:$PATH"

You can also build sop from source:
> git clone git@github.com:marcosy/sop.git
> cd sop
> make build

The binary will be saved at ./bin/sop. For other targets, run make help.
sop considers files as sets of elements and performs set operations with those files.
sop [options] <operation> <filepath A> <filepath B>

- operation can be one of:
  - union: Print elements that exist in file A or file B
  - intersection: Print elements that exist in file A and file B
  - difference: Print elements that exist in file A and do not exist in file B
- filepath A and filepath B are the paths to the files containing the elements to operate on. Elements are delimited by a separator string, which by default is "\n".
- options can be:
  - -s: String used as element separator (default "\n")
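
To make the role of the separator concrete, here is a minimal Python sketch of how a file's contents could be split into a set of elements. It only illustrates the behavior described above and is not sop's actual implementation. The function name parse_elements is hypothetical, and dropping empty elements (such as the one left by a trailing newline) is an assumption of this sketch.

```python
from typing import Set


def parse_elements(text: str, separator: str = "\n") -> Set[str]:
    """Split text on separator and return the distinct, non-empty elements."""
    # Assumption: empty strings (e.g. after a trailing separator) are ignored.
    return {element for element in text.split(separator) if element}


# The same four elements, once newline-delimited and once comma-delimited.
lines = parse_elements("Fox\nDuck\nDog\nCat\n")
fields = parse_elements("Fox,Duck,Dog,Cat", separator=",")
print(lines == fields)  # True
```

Because the result is a set, repeated elements in a file collapse into a single entry.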
Given two files A (fileA.txt) and B (fileB.txt):
fileA.txt:
Fox
Duck
Dog
Cat

fileB.txt:
Dog
Cat
Cow
Goat

sop performs set operations with the files.
The available operations are: union, intersection and difference.
The union of two sets A and B is the set of elements which are in A, in B, or in both A and B.
> sop union fileA.txt fileB.txt
Fox
Duck
Dog
Cat
Cow
Goat

The intersection of two sets A and B is the set containing all elements of A that also belong to B or, equivalently, all elements of B that also belong to A.
> sop intersection fileA.txt fileB.txt
Dog
Cat

The difference (a.k.a. relative complement) of A and B is the set of all elements in A that are not in B.
> sop difference fileA.txt fileB.txt
Fox
Duck

The separator string used to delimit elements defaults to the newline character (\n), but it can be configured with the -s flag:
> sop -s , union fileA.csv fileB.csv

The result sets are not ordered by default, so consecutive executions may return elements in a different order. To obtain a consistent order, pipe the output of sop to sort:
> sop intersection fileA.txt fileB.txt | sort
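
The three operations and the ordering note can be reproduced with Python's built-in set type. This is a sketch of the semantics only, assuming elements are compared as exact strings. It is not how sop is implemented, and the variable names are illustrative.

```python
# Contents of fileA.txt and fileB.txt from the example above.
file_a = {"Fox", "Duck", "Dog", "Cat"}
file_b = {"Dog", "Cat", "Cow", "Goat"}

union = file_a | file_b          # elements in A, in B, or in both
intersection = file_a & file_b   # elements in both A and B
difference = file_a - file_b     # elements in A that are not in B

# Sets are unordered, so sort before printing to get a stable order,
# which mirrors piping the output of sop to sort.
print(sorted(union))         # ['Cat', 'Cow', 'Dog', 'Duck', 'Fox', 'Goat']
print(sorted(intersection))  # ['Cat', 'Dog']
print(sorted(difference))    # ['Duck', 'Fox']
```

Note that difference is not symmetric: swapping the two files would yield Cow and Goat instead.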