Do complex things with just a few keystrokes!
Bash is a Unix shell and command language. It has survived and thrived for almost 50 years because it lets people do complex things with just a few keystrokes. Sometimes called "the universal glue of programming," it helps users combine existing programs in new ways, automate repetitive tasks, and run programs on clusters and clouds that may be halfway around the world.
Basic
How to move around in the shell, and how to create, modify, and delete files and folders.
pwd: print working directory
ls: list files or directories
cd: change directory
cp: copy
mv: move or rename
rm: remove
$1 or $2 in a bash script file receive ARGV in bash scripts. $@ and $* mean get the ARGV list, $# means get the ARGV length.
Find files
by name
- find ., list all files and folders below the current directory
- find folder, list everything below folder
- find . -type d, find all folders, no files
- find . -type f, find all files, no folders
- find . -type f -name "test.txt", the file named test.txt
- find . -type f -name "text*", all files whose names start with "text"
- find . -type f -iname "text*", same, but case-insensitive
- find . -type f -name "*.py"
by time
- find . -type f -mmin -10, files modified within the last 10 minutes
- find . -type f -mmin +10, files modified more than 10 minutes ago
- find . -type f -mmin +1 -mmin -5, files modified between 1 and 5 minutes ago
- find . -type f -mtime -20, files modified within the last 20 days
amin, atime: access (minutes, days); cmin, ctime: change; mmin, mtime: modify.
by size
- find . -size +5M, files larger than 5 MB; k and G work too
- ls -lah ./folders, info about subfolders and files, including size
- find . -empty
by permission
- find . -perm 777, read, write, and execute for everyone
- find folder -exec chown coreschafer:www-data {} +
- find folder returns everything below folder; -exec runs the command on those results; {} is the placeholder, + ends the command.
- find folder -type f -exec chmod 664 {} +
- find folder -perm 664
- find . -type f -name "*.jpg"
- find . -maxdepth 1 -type f -name "*.jpg", search only 1 level down (-maxdepth goes before the other tests)
- find . -maxdepth 1 -type f -name "*.jpg" -exec rm {} +, delete the searched files
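A self-contained sketch of -exec on a throwaway directory (all names invented for the demo):

```shell
# Build a small tree to experiment on
mkdir -p find_demo/sub
touch find_demo/a.jpg find_demo/b.txt find_demo/sub/c.jpg
# -maxdepth goes before the other tests; only find_demo/a.jpg matches here
find find_demo -maxdepth 1 -type f -name "*.jpg"
# -exec runs rm on every result; {} is the placeholder, + ends the command
find find_demo -type f -name "*.jpg" -exec rm {} +
find find_demo -type f -name "*.jpg" | wc -l   # 0, all .jpg files are gone
```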
Grep
Grep single file
Search for text
- grep "text_you_want_search" filename.txt
- grep -w "text_you_want_search" filename.txt, match whole words only
- grep -wi "text_you_want_search" filename.txt, ignore upper/lower case
- grep -win "text_you_want_search" filename.txt, also show line numbers
- grep -win -B 4 "text_you_want_search" filename.txt, show 4 lines of context before each match
- grep -win -A 4 "text_you_want_search" filename.txt, show 4 lines of context after each match
- grep -win -C 4 "text_you_want_search" filename.txt, show 4 lines of context before and after each match
Grep multi file
- grep -win "text_" ./*, all files
- grep -win "text_" ./*.txt, txt files only
- grep -winr "text" ./, search all subdirectories recursively
- grep -wirl "text" ./, no match info, just the list of matching files
- grep -wirc "text" ./, show the number of matches in each file
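The flags above can be tried on a throwaway file (contents invented for the demo):

```shell
printf 'This is a test\ncontest results\nTest again\n' > grep_demo.txt
grep -w "test" grep_demo.txt    # whole words only: skips "contest"
grep -wi "test" grep_demo.txt   # case-insensitive: also finds "Test again"
grep -win "test" grep_demo.txt  # adds line numbers to the matches
grep -c "test" grep_demo.txt    # count of matching lines: 2 (includes "contest")
```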
Grep command history
- history | grep "git commit"
- history | grep "git commit" | grep "dotfile"
Grep rgx
- grep -P "pattern" file.txt, -P enables Perl-compatible regular expressions; works out of the box on Linux, but macOS grep needs extra configuration (I have configured mine)
cURL
Requests
- curl url
- curl http://localhost:5000
- curl http://www.wittyfans.com/json_file
- curl -i http://www.wittyfans.com/json_file, detailed info about the GET request
- curl http://www.wittyfans.com/method
- curl -d "first=name&last=lastname" http://www.wittyfans.com/method, -d for data, a POST request
- curl -X PUT -d "first=name&last=lastname" http://www.wittyfans.com/method, -d for data, a PUT request
- curl -X DELETE http://www.wittyfans.com/method, a DELETE request
Verify
Could not verify your access?
curl -u username:password http://wittyfans.com, basic auth
Download
- curl http://wittyfans.com/folder, returns a binary file, error
- curl -o filename.jpg http://wittyfans.com/folder, success
- curl -o file_name.json http://api.wittyfans.com, saving a large json file
rsync
Install
available on Mac; on Linux it may need to be installed:
- apt-get install rsync
- yum install rsync
Use
- rsync folder1/* backup/, sync files to the backup folder; without -r, files inside subfolders are skipped
- rsync -r folder1/* backup/, include subfolders' files
- rsync -r folder1 backup/, sync the folder itself (it appears inside backup/), not just its contents
Check changes before running
- rsync -a --dry-run folder1/* backup/, check before the command runs; no output is shown
- rsync -av --dry-run folder1/* backup/, -v shows what would be transferred
Source folder has new files
- rsync -av --delete --dry-run original/ backup/, also deletes files from backup/ that no longer exist in original/; check first, be careful!
Between local and remote host
- rsync -zaP local_folder username@ip:~/public/, z for compress, a for archive (preserve almost everything), P for progress and resumable partial transfers over the internet
- rsync -zaP username@ip:~/public/file ~/Downloads/, the reverse direction
Manipulating data
How to work with the data in those files
cat: view a file's contents; the name means concatenate
less & more: view contents piece by piece; more is superseded by less now. In less: :n moves to the next file, :p goes back to the previous file, :q quits
head: look at the start of a text file; head -3 displays only the first three lines
tail: look at the end of a text file; tail -n +7 displays content from line 7 to the end
ls: list everything below a directory; ls -R -F: -R is recursive, -F prints a / after the name of every directory and a * after the name of every runnable program
man: manual; automatically invokes less
cut -f 2-5,8 -d , values.csv: select columns 2 through 5 and column 8, using comma as the separator. -d means delimiter, -f means fields, to specify columns
!command: re-run the most recent use of that command
grep bicuspid seasonal/winter.csv: prints lines from winter.csv that contain "bicuspid"
cat two_cities.txt | egrep 'Sydney Carton|Charles Darnay' | wc -l: count the number of lines in the book that contain either the character 'Sydney Carton' or 'Charles Darnay'
grep -c: print a count of matching lines rather than the lines themselves
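A quick cut demo on an invented two-row CSV:

```shell
# Hypothetical file with 8 comma-separated columns
printf 'c1,c2,c3,c4,c5,c6,c7,c8\na,b,c,d,e,f,g,h\n' > values_demo.csv
cut -f 2-5,8 -d , values_demo.csv
# c2,c3,c4,c5,c8
# b,c,d,e,h
```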
Combining tools
How to use this power to select the data you want, and introduce commands for sorting values and removing duplicates.
head -n 5 seasonal/summer.csv > top.csv: get the first 5 rows of summer.csv and write them to top.csv
cut -d , -f 2 seasonal/summer.csv | grep -v Tooth: select all of the tooth names from column 2 of the comma-delimited file, excluding the header "Tooth"
wc: word count; count dated records in a file: grep 2017-07 seasonal/spring.csv | wc -l
head -n 3 seasonal/s*: show the first 3 rows of all s* files
*: matches any characters
?: matches a single character
[...]: matches any one of the characters inside the square brackets; 201[78].txt matches 2017.txt or 2018.txt, but not 2016.txt
{...}: matches any of the comma-separated patterns inside the curly brackets, so {*.txt, *.csv} matches any file whose name ends with .txt or .csv, but not files whose names end with .pdf
sort: -n sort numerically, -r reverse, -b ignore leading blanks, -f be case-insensitive
uniq: remove adjacent duplicated lines
wc -l seasonal/*.csv: print the line count for each file in the seasonal folder
wc -l seasonal/*.csv | grep -v 'total' | sort -n | head -n 1: remove the row with the word 'total', sort numerically, and select the first row (the file with the fewest lines)
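The whole pipeline can be reproduced with invented seasonal files:

```shell
mkdir -p seasonal
printf 'date,tooth\n2017-01-05,canine\n2017-01-17,wisdom\n' > seasonal/winter.csv
printf 'date,tooth\n2017-07-10,incisor\n' > seasonal/summer.csv
# line counts -> drop the total row -> numeric sort -> shortest file first
wc -l seasonal/*.csv | grep -v total | sort -n | head -n 1
# prints the summer.csv line (2 lines, the fewest)
```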
Batch processing
How to make your own pipelines do that. Along the way, you will see how the shell uses variables to store information.
set: check environment variables
echo: print
$USER: user name; $OSTYPE: name of the kind of operating system you are using
training=seasonal/summer.csv then echo $training: define a variable and print it
For loop:
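A minimal for-loop sketch (file names invented and created here for the demo):

```shell
# Loop over a list of words
for suffix in gif jpg png; do
    echo "image.$suffix"
done
# Loop over files matching a glob
mkdir -p loop_demo
touch loop_demo/a.csv loop_demo/b.csv
for f in loop_demo/*.csv; do
    echo "found: $f"
done
```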
Do not use spaces in file names; they cause issues in bash.
Creating new tools
How to go one step further and create new commands of your own.
Edit files using nano:
nano filename.txt: edit a file
Ctrl+K: delete a line. Ctrl+U: un-delete a line. Ctrl+O: save the file ('O' stands for 'output'); you will also need to press Enter to confirm the filename! Ctrl+X: exit the editor.
grep -h -v Tooth spring.csv summer.csv > temp.csv: -h stops grep from printing filenames, -v inverts the match, printing all rows that do not contain Tooth
history | tail -n 3: show the 3 most recent commands
$@: pass all filenames given to a script into it, e.g. tail -q -n +2 $@ | wc -l
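That tail | wc pipeline wrapped in a script, with invented data files:

```shell
cat > count_records.sh <<'EOF'
#!/bin/bash
# -q suppresses per-file headers; -n +2 starts at line 2, skipping each header row
tail -q -n +2 "$@" | wc -l
EOF
chmod +x count_records.sh
printf 'header\nrow1\nrow2\n' > f1.csv
printf 'header\nrow1\n' > f2.csv
./count_records.sh f1.csv f2.csv   # 3 data rows across both files
```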
Downloading data
how to download data files from web servers via the command line
curl: Client for URLs. man curl: check the curl installation
curl -O url: save the file with its original name
curl -o newname.txt url: save under a new file name
Download all 100 data files: curl -O https://s3.amazonaws.com/datafile[001-100].txt
Wget: World Wide Web and get; better than curl when downloading multiple files recursively. which wget: check the wget installation.
wget -c -b https://wittyfans.com/201812SpotifyData.zip
-c: resume broken downloads
-b: go to background
wget --wait=1 -i url_list.txt: create a mandatory 1-second pause between downloading each file in url_list.txt
Use curl to download and rename a single file from a URL: curl -o SpotifyData.zip https://wittyfans.com/201812SpotifyData.zip
CSV Kit
Using
csvkit
to convert, preview, filter and manipulate files to prepare our data for further analyses.
pip install csvkit to install; see the docs
in2csv
in2csv -h: converting files to csv
in2csv SpotifyData.xlsx > SpotifyData.csv
in2csv SpotifyData.xlsx --sheet "Worksheet1_Popularity" > Spotify_Popularity.csv: only convert one sheet
csvlook
csvlook -h: data preview on the command line
csvlook SpotifyData.csv
csvsort
csvsort -c 2 Spotify_Popularity.csv | csvlook: sort by the second column and preview
csvstat
csvstat Spotify_Popularity.csv: summary statistics
csvcut
csvcut -n Spotify_MusicAttributes.csv: print a list of column headers in the data file
csvcut -c 1,3,5 Spotify_MusicAttributes.csv: print the first, third, and fifth columns, by position
csvcut -c "track_id","duration_ms","loudness" Spotify_MusicAttributes.csv: print the track id, song duration, and loudness, by name
csvgrep
csvgrep -c "danceability" -m 0.812 Spotify_MusicAttributes.csv: filter rows where the danceability column matches the value 0.812; the column name must be quoted
csvstack
csvstack: merge files. csvstack Spotify_Rank6.csv Spotify_Rank7.csv > Spotify.csv: merge two files into one. csvstack -g "Rank6","Rank7" Spotify_Rank6.csv Spotify_Rank7.csv > Spotify_Al: merge two files into one and add a source column.
chain commands
;: links commands together and runs them sequentially
&&: links commands together, but only runs the 2nd command if the 1st succeeds
>: redirects the output of the 1st command
|: uses the output of the 1st command as input to the 2nd
sql2csv
sql2csv -v or sql2csv --verbose: print more tracebacks and logs
Pull an entire table and save it, e.g. (database file name is an example): sql2csv --db "sqlite:///SpotifyDatabase.db" --query "SELECT * FROM Spotify_Popularity" > Spotify_Popularity.csv
csvsql
Manipulating data using SQL syntax (small to medium files only):
Reformat the output using csvlook: csvsql --query "SELECT * FROM Spotify_MusicAttributes LIMIT 1" Spotify_MusicAttributes.csv | csvlook
Using a bash variable:
Store the SQL query as a shell variable: sqlquery="SELECT * FROM Spotify_MusicAttributes LIMIT 1", then csvsql --query "$sqlquery" Spotify_MusicAttributes.csv
Join two files (pass both files; the table names are the file names without extension, and the key column here is an example): sqlquery="SELECT * FROM file_a INNER JOIN file_b ON file_a.key = file_b.key", then csvsql --query "$sqlquery" file_a.csv file_b.csv
Pushing data back to a database (database file name is an example): csvsql --db "sqlite:///SpotifyDatabase.db" --insert Spotify_MusicAttributes.csv
Bash Script
Stream editor
sed: stream editor
cat soccer_scores.csv | sed 's/Cherno/Cherno City/g' > soccer_scores_edited.csv: replace the word Cherno with Cherno City, then save to a new file. For more, check the sed documentation.
Argument
$1 or $2 in a bash script file receive ARGV in bash scripts. $@ and $* mean get the ARGV list, $# means get the ARGV length. cat hire_data/*.csv | grep "$1" > "$1".csv: take in a city (an argument) as a variable, filter all the files by this city, and output to a new CSV named after the city.
Quotes
Single, double, backticks.
- Single quotes ('sometext') = the shell interprets what is between them literally
- Double quotes ("sometext") = the shell interprets literally except for $ and backticks
- Backticks (`sometext`) = the shell runs the command and captures STDOUT back into a variable
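The three quoting styles side by side:

```shell
now=$(date)
echo 'The time is $now'         # single quotes: prints the literal text $now
echo "The time is $now"         # double quotes: $now is expanded
captured=`echo hello`           # backticks: run the command, capture its STDOUT
echo "Captured: $captured"      # Captured: hello
```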
Numeric variables
In bash, typing 1 + 5 will get an error; instead, you need to type expr 1 + 5. expr is a utility program just like cat and grep, but expr cannot natively handle decimal places: expr 1 + 2.5 will get a "not a decimal number" error.
Introduce bc (basic calculator), a useful command-line program. Using bc without opening the calculator:
echo "5+7.5" | bc; bc has a scale variable for how many decimal places: echo "scale=3; 10/3" | bc. ; is used to separate statements.
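Integer arithmetic that works anywhere (the bc line is left commented in case bc is not installed):

```shell
expr 1 + 5          # 6
echo $((1 + 5))     # bash's built-in integer arithmetic, also 6
# Decimals need bc, if available:
# echo "scale=3; 10/3" | bc    -> 3.333
```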
Array
Normal array
Create array: capital_cities=("Sydney" "New York" "Paris")
Add element: capital_cities+=("Delhi"). The array can also be created with the declare method: declare -a capital_cities=("Sydney" "New York" "Paris")
Get all elements and the length of the array: echo ${capital_cities[@]} prints all elements; echo ${#capital_cities[@]} prints the length.
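All the normal-array operations in one sketch:

```shell
declare -a capital_cities=("Sydney" "New York" "Paris")
capital_cities+=("Delhi")            # append an element
echo "${capital_cities[@]}"          # all elements
echo "${#capital_cities[@]}"         # length: 4
echo "${capital_cities[0]}"          # Sydney: bash arrays are zero-indexed
```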
Associative arrays
Like a dictionary in Python.
Create an empty associative array: declare -A model_metrics
Create in one line (keys and values are examples): declare -A model_metrics=([model_name]="knn" [test_accuracy]=0.92)
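Associative-array basics in one sketch (keys and values invented):

```shell
declare -A model_metrics                  # create empty
model_metrics[model_name]="knn"
model_metrics[test_accuracy]=0.92
# or create in one line:
declare -A model_metrics2=([model_name]="knn" [test_accuracy]=0.92)
echo "${model_metrics[model_name]}"       # knn
echo "${!model_metrics[@]}"               # all keys
echo "${model_metrics[@]}"                # all values
```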
Control Statements
IF statements
if [ condition1 ] && [ condition2 ]; then ... fi: conditions go inside square brackets; the body sits between then and fi.
Move files based on content, e.g. extract Accuracy from the first ARGV element: accuracy=$(grep "Accuracy" "$1" | sed 's/.* //'), then compare it with an if statement and mv the file accordingly.
Normal flags for conditions: -eq, -ne, -lt, -le, -gt, -ge for numeric comparison; -e (file exists), -r (readable), -w (writable) for files.
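A complete if statement using the numeric flags (the threshold is invented):

```shell
accuracy=92
if [ $accuracy -ge 90 ] && [ $accuracy -le 100 ]; then
    echo "high accuracy"
else
    echo "low accuracy"
fi
# -> high accuracy
```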
For loops
Use a FOR loop on files in a directory: for file in output_dir/*.csv; do echo "$file"; done
Case
Create a CASE statement matching the first ARGV element: case $1 in pattern1) command1;; pattern2) command2;; *) default;; esac
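A full case statement in script form (day names as the patterns):

```shell
cat > day_check.sh <<'EOF'
#!/bin/bash
case $1 in
    Monday|Tuesday|Wednesday|Thursday|Friday)
        echo "It is a weekday" ;;
    Saturday|Sunday)
        echo "It is the weekend" ;;
    *)
        echo "Not a day!" ;;
esac
EOF
chmod +x day_check.sh
./day_check.sh Saturday   # It is the weekend
```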
Function
Define with: function function_name { ... } or function_name () { ... }
Scope: all variables in Bash are global by default! Use local val to restrict a variable's scope.
To get data out of a function:
- Assign to a global variable
- echo what we want back in the last line and capture it using a shell-within-a-shell
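A convert function showing both ideas, local scope and echoing the result so the caller can capture it (Fahrenheit to Celsius, integer math):

```shell
function convert {
    local f=$1                    # local: f does not leak into the global scope
    echo $(( (f - 32) * 5 / 9 ))  # echo the result instead of assigning a global
}
celsius=$(convert 212)            # shell-within-a-shell captures the echo
echo "$celsius"                   # 100
```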
Python script on bash
Scheduling a job with crontab
Preview both the Python script and the requirements text file: cat create_model.py requirements.txt (file names are examples)
Cron
- crontab -l, list the cron jobs
Set your editor to nano (the default is vim):
- export EDITOR=/usr/bin/nano
- crontab -e, open the editor
- (if you stayed with vim, press i to enter insert mode)
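A crontab line has five time fields followed by the command; this sketch (the script name is invented) would run a Python script on a schedule:

```
# minute hour day-of-month month day-of-week  command
30 22 * * * python create_model.py    # every day at 22:30
* * * * * python create_model.py      # every minute
```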