<
directs the contents of a file into a program as stdin.
|
pipes the stdout of one program into the stdin of the next, forming a pipeline.
>
(or 1>) will redirect a pipeline's stdout to overwrite a given file location.
>>
will redirect stdout to append to a given file location.
2>
will redirect a pipeline's stderr to a given file location.
&>
will redirect both stdout and stderr to the same file location.
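As a rough illustration of the operators above, the sketch below runs each one inside a scratch directory. The file names (fruit.txt, sorted.txt, errors.txt, both.txt, missing-file) are made up for this example, and the last command uses `> file 2>&1`, the portable spelling of what bash abbreviates as `&>`.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'pear\napple\n' > fruit.txt      # > creates or overwrites fruit.txt
printf 'cherry\n' >> fruit.txt          # >> appends a third line

sort < fruit.txt > sorted.txt           # < feeds the file to sort as stdin
line_count=$(wc -l < fruit.txt | tr -d ' ')   # | chains wc into tr

# 2> captures only the error message; "|| true" ignores the expected failure.
ls missing-file 2> errors.txt || true

# Send stdout and stderr to the same file (bash shorthand: &> both.txt).
{ echo ok; ls missing-file; } > both.txt 2>&1 || true

cat sorted.txt
echo "lines: $line_count"
```

Afterwards errors.txt holds only the complaint from ls, while both.txt holds the line "ok" followed by that same complaint.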
grep
Searches for a given pattern line-by-line in an input file or stdin stream and prints matching lines to stdout. Useful as a filter operation in a pipeline.
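A minimal sketch of grep as a pipeline filter, assuming some invented log lines held in a shell variable. The -c and -v flags shown are standard grep options for counting matches and inverting them.

```shell
# Hypothetical log lines, used only for illustration.
log='INFO start
ERROR disk full
INFO retry
ERROR timeout'

# Filter stdin: keep only the lines that match the pattern.
errors=$(printf '%s\n' "$log" | grep 'ERROR')

# -c prints the number of matching lines instead of the lines themselves.
error_count=$(printf '%s\n' "$log" | grep -c 'ERROR')

# -v inverts the match, keeping the lines that do NOT match.
others=$(printf '%s\n' "$log" | grep -v 'ERROR')

printf '%s\n' "$errors"
echo "count: $error_count"
```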
sed
Performs a line-by-line regex-based find/replace on an input file or stdin stream and prints output to stdout. Useful as a mapping operation in a pipeline. -i enables in-place find/replace on a file.
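The following sketch shows sed mapping over a stream and then editing a file in place. The sample text and the temporary file are invented for illustration. The in-place call is written as -i.bak (keep a backup with that suffix) because that form is accepted by both GNU and BSD sed, whereas a bare -i behaves differently between the two.

```shell
# Map over stdin: replace every "colour" with "color" on each line.
fixed=$(printf 'colour me\nno colour, no colours\n' | sed 's/colour/color/g')

# Capture groups can be reused in the replacement via \1, \2, ...
swapped=$(echo 'first last' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')

# In-place edit of a throwaway file; -i.bak keeps the original as a .bak copy.
tmp=$(mktemp)
echo 'colour' > "$tmp"
sed -i.bak 's/colour/color/' "$tmp"
in_place=$(cat "$tmp")
rm -f "$tmp" "$tmp.bak"

printf '%s\n' "$fixed" "$swapped" "$in_place"
```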
awk
Automatically splits each line of a delimited file or stdin stream (e.g. a CSV file) into field variables, which can be manipulated and printed to stdout. Useful in pipelines as a mapping operation over tabular/record-based data.
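To make the field variables concrete, here is a small sketch over an invented three-column CSV (name, quantity, unit price). It assumes a simple CSV with no quoted commas, which awk's -F, splitting would not handle.

```shell
# Hypothetical CSV: name,quantity,unit_price
csv='apple,3,0.50
pear,2,0.75
plum,10,0.20'

# -F, splits each line on commas into $1, $2, $3; print name and line total.
totals=$(printf '%s\n' "$csv" | awk -F, '{ printf "%s %.2f\n", $1, $2 * $3 }')

# Variables persist across lines; the END block runs after the last record.
grand_total=$(printf '%s\n' "$csv" |
  awk -F, '{ sum += $2 * $3 } END { printf "%.2f\n", sum }')

printf '%s\n' "$totals"
echo "grand total: $grand_total"
```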
With the help of specific command line flags, Perl one-liners can reproduce and extend the functionality of grep, sed, or awk.
Perl "micro-scripts" can be particularly useful for easily combining find/replace operations using regular expressions with math operations. For example, to convert 30 fps video timestamps from a frame-based format to one based on milliseconds:
perl -pe 's/(\d\d);(\d\d);(\d\d);(\d\d)/"$1:$2:$3,".(sprintf "%03d", $4 * 1000 \/ 30)/ge'
00;00;03;25 is converted into 00:00:03,833. -p runs the operation repeatedly on each line piped to stdin and sends the result to stdout (like sed or awk would).
read
Reads a line from stdin. Mainly useful in shell scripts for accepting interactive input or piping stream output into a while-loop for more sophisticated processing.
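A minimal sketch of the while-loop pattern, using invented "name count" records. It shows two ways of feeding the loop: a pipe, and a here-document for the case where a variable set inside the loop is needed afterwards.

```shell
# Hypothetical "name count" records, one per line.
records='apples 3
pears 4
plums 5'

# Piped form: read splits each line on whitespace into the named variables;
# -r keeps backslashes literal. The loop's output is captured as a whole.
summary=$(printf '%s\n' "$records" |
  while read -r name count; do
    echo "$name has $count"
  done)

# Here-document form: the loop runs in the current shell, so total survives.
total=0
while read -r name count; do
  total=$((total + count))
done <<EOF
$records
EOF

printf '%s\n' "$summary"
echo "total: $total"
```

In many shells a loop on the receiving end of a pipe runs in a subshell, so variables assigned inside it are lost once the loop ends. That is the reason for the second form.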
sort
Concatenates and sorts the input stream (or files) and returns the results to stdout. Can take a delimiter and sort on specified columns of tabular data.
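The column options can be sketched as follows, assuming some made-up colon-delimited records (name, team, score). -t sets the delimiter and -k selects the key column, with n and r modifiers for numeric and reversed ordering.

```shell
# Hypothetical colon-delimited records: name:team:score
scores='dana:red:7
alex:blue:12
casey:red:3'

# Default: sort whole lines lexicographically.
by_name=$(printf '%s\n' "$scores" | sort)

# -t: sets the delimiter, -k3,3 restricts the key to column 3, n compares numerically.
by_score=$(printf '%s\n' "$scores" | sort -t: -k3,3n)

# r reverses the order, so the highest score comes first.
top=$(printf '%s\n' "$scores" | sort -t: -k3,3nr | head -n 1)

printf '%s\n' "$by_score"
echo "top: $top"
```

Without the n modifier the scores would be compared as text, which would place 12 before 3.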
uniq
Eliminates adjacent duplicate lines from input stream (or file) and returns results to output stream (or file). Can also prepend the count of adjacent lines (useful for "histogramming" categorical data in combination with sort).
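The histogram idiom mentioned above might look like the following sketch, run over an invented list of color names. The final awk step is only there to normalize the padding that uniq -c puts in front of each count, which differs between implementations.

```shell
colors='red
blue
red
green
blue
red'

# No two equal lines are adjacent here, so uniq alone removes nothing.
unsorted_lines=$(printf '%s\n' "$colors" | uniq | wc -l | tr -d ' ')

# Sort first to group equal lines, count each group with -c,
# then sort numerically in reverse to rank by frequency.
histogram=$(printf '%s\n' "$colors" | sort | uniq -c | sort -rn)

# Collapse the leading padding to "count value".
normalized=$(printf '%s\n' "$histogram" | awk '{ print $1, $2 }')

printf '%s\n' "$normalized"
```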
cut
Extracts selected portions (characters or delimited fields) of each line passed in through a file or stream, discarding the rest. Useful for filtering out unnecessary columns from tabular datasets.
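As a small sketch of column filtering, assume an invented four-column CSV (id, name, email, city) with no quoted commas. -d sets the delimiter, -f lists the fields to keep, and -c selects character positions instead.

```shell
# Hypothetical CSV: id,name,email,city
rows='1,Ada,ada@example.com,London
2,Lin,lin@example.com,Taipei'

# Keep only columns 2 and 4.
name_city=$(printf '%s\n' "$rows" | cut -d, -f2,4)

# A range such as 2- keeps column 2 through the last column, dropping the id.
no_id=$(printf '%s\n' "$rows" | cut -d, -f2-)

# -c selects character positions: here, the first character of each line.
first_char=$(printf '%s\n' "$rows" | cut -c1)

printf '%s\n' "$name_city"
```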