Cookbook: Unix tools
A virtual machine with all these tools
Contributed by Antonio Ruiz. Download this VirtualBox image to have access to a wide range of command line tools for data science.
Regular expressions are not only found in Unix, but a lot of Unix tools use them. For example,
Menagerie of little tools
cat, head, tail
Concatenate (combine) two or more files:
You probably want to save the output in another file:
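The original commands are not preserved here, so the following is a minimal sketch of both steps. The file names (`part1.txt`, `part2.txt`, `combined.txt`) are placeholders invented for the example.

```shell
# Scratch directory with two small example files (names are hypothetical)
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/part1.txt"
printf 'gamma\n' > "$workdir/part2.txt"

# Concatenate: print the files one after the other to the terminal
cat "$workdir/part1.txt" "$workdir/part2.txt"

# Same, but redirect the combined output into a new file
cat "$workdir/part1.txt" "$workdir/part2.txt" > "$workdir/combined.txt"
```

Note that `>` overwrites the target file if it already exists; `>>` appends instead.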
View just the top 10 lines of a file (first command), or top 5 lines (second command):
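A small sketch of the two commands, assuming a throwaway 20-line file created with `seq` so the result is easy to check:

```shell
# Example file containing the numbers 1 to 20, one per line
numbers=$(mktemp)
seq 1 20 > "$numbers"

# First command: head prints the first 10 lines by default
head "$numbers"

# Second command: -n sets how many lines to print
head -n 5 "$numbers"
```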
View just the last 10 lines of a file, or variations:
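The variations below are a sketch of commonly used forms, not necessarily the ones the original page listed. The example file is again a generated 20-line file.

```shell
# Example file containing the numbers 1 to 20, one per line
numbers=$(mktemp)
seq 1 20 > "$numbers"

# Last 10 lines (the default)
tail "$numbers"

# Last 3 lines
tail -n 3 "$numbers"

# Everything from line 19 onward (note the + sign)
tail -n +19 "$numbers"
```

The `-n +K` form is handy for skipping a header: `tail -n +2` prints a file without its first line.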
tail is also useful to watch a file as it changes, as you would a log file:
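Interactively you would run something like `tail -f /var/log/syslog` (the path is only an example) and stop it with Ctrl-C. The sketch below simulates that non-interactively: it follows a scratch file in the background, appends a line, and then stops the follower. The `sleep` durations are arbitrary assumptions chosen to give `tail` time to notice the change.

```shell
# Scratch "log file" with one existing entry
log=$(mktemp)
echo "first entry" > "$log"

# -f keeps tail running and prints new lines as they are appended;
# here its output is captured in a second file so it can be inspected later
tail -f "$log" > "$log.seen" &
tail_pid=$!

sleep 1
echo "second entry" >> "$log"   # simulate another program writing to the log
sleep 2

# Stop following (the equivalent of pressing Ctrl-C)
kill "$tail_pid"
cat "$log.seen"
```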
-o: only print the matching part, not the whole line containing the match
-P: use Perl-style regular expressions (lots of extra features)
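These two flags match the behavior of `grep`; the sketch below assumes GNU grep, since `-P` is not available in every grep implementation. The input string is made up for the example.

```shell
line='order 123 shipped, order 456 pending'

# -o prints each match on its own line instead of the whole input line
echo "$line" | grep -o '[0-9][0-9]*'

# -P enables Perl syntax: \d for digits, and a lookbehind (?<=...) that
# requires "order " before the number without including it in the match,
# so the trailing "item 789" is skipped
echo "$line, item 789" | grep -oP '(?<=order )\d+'
```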
The backtick (`) is often useful to execute a command and use its (space- or newline-separated) output to feed into the
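As a hedged illustration of backtick command substitution: the inner command runs first, and its output is split on whitespace and pasted into the outer command line. The log files and the search word are invented for this sketch.

```shell
# Scratch directory with three small example log files (hypothetical names)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'error: disk full\n' > a.log
printf 'all good\n' > b.log
printf 'error: timeout\nerror: retry\n' > c.log

# grep -l lists the files that contain "error" (a.log and c.log);
# the backticks paste that list into wc's arguments,
# so wc counts lines only in those files
wc -l `grep -l error *.log`
```

The form `$(command)` does the same thing and is easier to nest. Either way, file names containing spaces get split into several arguments, so this works best on simple names.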
Convert Excel to CSV:
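The tool used in the original is not preserved here. One common option, assumed for this sketch, is `in2csv` from the csvkit package (installable with `pip install csvkit`). The helper name `excel_to_csv` and the file name in the usage comment are hypothetical.

```shell
# Assumption: csvkit is installed, which provides the in2csv command.
# excel_to_csv is a hypothetical helper defined only for this sketch.
excel_to_csv() {
    # "${1%.*}" strips the extension, so survey.xlsx becomes survey.csv
    in2csv "$1" > "${1%.*}.csv"
}

# Usage (placeholder file name):
#   excel_to_csv survey.xlsx
```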
Display column names:
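A minimal sketch using only standard tools, assuming a comma-separated file whose header contains no quoted commas. The sample data is invented. If csvkit is installed, `csvcut -n file.csv` is an alternative that also handles quoting.

```shell
# Small example CSV (made-up data)
csv_file=$(mktemp)
printf 'id,name,city\n1,Ana,Lima\n2,Bo,Oslo\n' > "$csv_file"

# Take the header row and put each column name on its own line
head -n 1 "$csv_file" | tr ',' '\n'
```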
wget is for downloading web pages or FTP directories.
If wget complains about an SSL certificate:
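The wget uses described above might look like the sketch below. The commands are wrapped in small functions so that nothing is downloaded until you call one; the function names and the URLs in the usage comments are placeholders, not part of the original page.

```shell
# Download a single page or file into the current directory
download_page() {
    wget "$1"
}

# Download a whole directory tree: -r recurses into links,
# -np (no parent) stops wget from climbing above the starting directory
download_tree() {
    wget -r -np "$1"
}

# Skip certificate validation when wget rejects the server's SSL certificate
download_insecure() {
    wget --no-check-certificate "$1"
}

# Usage (placeholder URLs):
#   download_page https://example.com/report.html
#   download_tree ftp://ftp.example.com/pub/data/
#   download_insecure https://example.com/report.html
```

Since `--no-check-certificate` disables a security check, it is best reserved for hosts you already trust.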
The cron tool (which is always running) lets you schedule commands to run at certain times of the day, week, or month. Contributed by Nathan Hilliard.
Start by running crontab -e, then specify the time frame and command to run, one per line. See this quick reference for details about that file.
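To make the file format concrete, here is a sketch of a single entry. It is written to a scratch file rather than the live crontab, and the script path `/home/user/backup.sh` is a made-up placeholder.

```shell
# Write an example entry to a scratch file instead of editing the real crontab
cron_file=$(mktemp)
cat > "$cron_file" <<'EOF'
# minute hour day-of-month month day-of-week command
30 2 * * 1 /home/user/backup.sh
EOF

# Show the result: this entry would run the script at 02:30 every Monday
cat "$cron_file"

# To install it for real (this replaces your existing crontab):
#   crontab "$cron_file"
```

A `*` in a field means "every" value, so `30 2 * * *` would run at 02:30 every day.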