Monday, September 14, 2020

Bash scripting and shell programming on Linux, part 4

Data manipulations and text transformations with SED:

  • SED=Stream Editor
  • A stream is data that travels from one process to another through a pipe, from one file to another through a redirect, or from one device to another.
  • Sed performs text transformation on streams.
    • Examples:
      • Substitute some text with other text
      • Remove lines
      • Append text after given lines
      • Insert text before certain lines
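
The four kinds of edits listed above can be tried on a small stream. This is only a minimal sketch: the sample text and variable names are made up for illustration, and the a\ and i\ commands are written in their multi-line form, which should be accepted by both GNU and BSD sed.

```shell
# Sample two-line stream (made up for illustration).
input=$(printf 'first line\nsecond line\n')

# Substitute some text with other text.
substituted=$(printf '%s\n' "$input" | sed 's/first/1st/')

# Remove line 2.
removed=$(printf '%s\n' "$input" | sed '2d')

# Append text after line 1.
appended=$(printf '%s\n' "$input" | sed '1a\
appended after line 1')

# Insert text before line 2.
inserted=$(printf '%s\n' "$input" | sed '2i\
inserted before line 2')

# Show each result, separated by a line of dashes.
printf '%s\n---\n' "$substituted" "$removed" "$appended" "$inserted"
```

In every case sed reads the stream from the pipe and writes the transformed stream to standard output; the input itself is not modified.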

Check where sed is installed:
type -a sed

sed is /usr/bin/sed

sed is /bin/sed

Syntax:
sed 's/search-pattern/replacement-pattern/flags' file_name

Example:
echo 'I love my wife.' > love.txt
cat love.txt
I love my wife.
sed 's/my wife/sed/' love.txt
I love sed.

Matching is case sensitive by default, so we can add the "I" (or "i") flag to make the search case insensitive. Note that this flag is a GNU sed extension and is not part of POSIX sed.
sed 's/MY WIFE/sed/I' love.txt
or
sed 's/MY WIFE/sed/i' love.txt

To append a line to an existing file:
echo 'this is line 2.' >> love.txt
cat love.txt
I love my wife.
this is line 2.

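Append a third line, then replace "my wife" with "sed" on every line that contains it:
echo 'I love my wife with all of my heart.' >> love.txt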
cat love.txt
I love my wife.
this is line 2.
I love my wife with all of my heart.
sed 's/my wife/sed/' love.txt
I love sed.
this is line 2.
I love sed with all of my heart.


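Append a fourth line in which "my wife" occurs twice. Without a flag, only the first occurrence on each line is replaced:
echo 'I love my wife and my wife loves me too' >> love.txt
sed 's/my wife/sed/' love.txt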
I love sed.
this is line 2.
I love sed with all of my heart.
I love sed and my wife loves me too

Replace every occurrence of "my wife" on each line with the g (global) flag:
sed 's/my wife/sed/g' love.txt
I love sed.
this is line 2.
I love sed with all of my heart.
I love sed and sed loves me too

Replace only the 2nd occurrence of "my wife" on each line (lines with fewer than two occurrences are left unchanged):
sed 's/my wife/sed/2' love.txt
I love my wife.
this is line 2.
I love my wife with all of my heart.
I love my wife and sed loves me too

Save the output to a file:
sed 's/my wife/sed/2' love.txt > ex.txt
cat ex.txt
I love my wife.
this is line 2.
I love my wife with all of my heart.
I love my wife and sed loves me too
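
Redirecting the output back to the same file (sed ... love.txt > love.txt) would empty the file, because the shell truncates it before sed reads it. One portable way to update the original file is to write to a temporary file and then move it into place. The sketch below assumes a scratch directory created with mktemp so that no real file is touched; the variable names are illustrative only. GNU sed also has an -i option for in-place editing, but its syntax differs between GNU and BSD sed.

```shell
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)
printf 'I love my wife.\nthis is line 2.\n' > "$workdir/love.txt"

# Write the edited stream to a temporary file, then replace the original.
sed 's/my wife/sed/' "$workdir/love.txt" > "$workdir/love.txt.tmp" &&
  mv "$workdir/love.txt.tmp" "$workdir/love.txt"

# Read the updated file back and clean up.
saved=$(cat "$workdir/love.txt")
echo "$saved"
rm -r "$workdir"
```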

Delete every line that contains "this" (use /^this/ instead to match only lines that start with "this"):
sed '/this/d' love.txt
I love my wife.
I love my wife with all of my heart.
I love my wife and my wife loves me too

Using escape characters: when the pattern itself contains the delimiter (/), each / must be escaped with a backslash:
echo '/home/test' | sed 's/\/home\/test/\/home\/var\/test/'
/home/var/test

Choosing a different delimiter, such as a colon, removes the need to escape the slashes:
echo '/home/test' | sed 's:/home/test:/home/var/test:'
/home/var/test
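
An alternative delimiter is especially handy when the paths come from shell variables. The following is a small sketch, assuming hypothetical variables old_dir and new_dir whose values contain neither the | delimiter nor regular-expression special characters.

```shell
# Hypothetical paths used only for illustration.
old_dir=/home/test
new_dir=/home/var/test

# With | as the delimiter, the slashes in the paths need no escaping.
# Double quotes let the shell expand the variables before sed runs.
rewritten=$(echo '/home/test/file.txt' | sed "s|$old_dir|$new_dir|")
echo "$rewritten"
```

This prints /home/var/test/file.txt.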

To delete lines that start with a given word or character (here, comment lines starting with #):
echo '#User to run service as.' > config
echo 'User apache' >> config
echo '#Group to run service as ' >> config
echo 'Group apache' >> config
cat config
#User to run service as.
User apache
#Group to run service as
Group apache
sed '/^#/d' config
User apache
Group apache
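
Several delete rules can be combined in one sed call by repeating the -e option. As a rough sketch, the following removes both comment lines and empty lines from a made-up config that contains one blank line; the variable names are illustrative.

```shell
# Made-up config text with a blank line in the middle.
config_text=$(printf '#User to run service as.\nUser apache\n\n#Group to run service as\nGroup apache\n')

# Delete lines starting with # and lines that are completely empty.
cleaned=$(printf '%s\n' "$config_text" | sed -e '/^#/d' -e '/^$/d')
echo "$cleaned"
```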

awk:

  • Awk is a scripting language used for manipulating data and generating reports. Awk programs require no compiling, and the language lets the user work with variables, numeric functions, string functions, and logical operators.
  • An awk program is a set of statements, each pairing a text pattern with an action. Awk reads one or more files line by line, checks each line against the patterns, and performs the associated action whenever a line matches. It is mostly used for pattern scanning and processing.
  • The name awk comes from the initials of its developers: Aho, Weinberger, and Kernighan.
Syntax:

awk options 'selection_criteria {action}' input-file > output-file

Options:
-f program-file : Reads the AWK program source from the file 
                  program-file, instead of from the 
                  first command line argument.
-F fs : Use fs for the input field separator
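
Both options can be tried on a small colon-separated input. This is a minimal sketch with made-up records; the program file is a temporary file created only for the demonstration.

```shell
# Made-up colon-separated records: name and shell.
records=$(printf 'alice:/bin/bash\nbob:/bin/sh\n')

# -F sets the input field separator to a colon.
names=$(printf '%s\n' "$records" | awk -F: '{print $1}')

# -f reads the awk program from a file instead of the command line.
prog=$(mktemp)
printf '{ print $2 }\n' > "$prog"
shells=$(printf '%s\n' "$records" | awk -F: -f "$prog")
rm "$prog"

printf '%s\n' "$names" "$shells"
```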


To print every line of a file:
awk '{print}' config
#User to run service as.
User apache
#Group to run service as
Group apache

To print the lines that match a given pattern:
awk '/User/ {print}' config
#User to run service as.
User apache

Splitting a Line Into Fields:
For each record (i.e. each line), awk splits the record on whitespace by default and stores the pieces in the $n variables. If the line has 4 words, they are stored in $1, $2, $3 and $4 respectively. $0 represents the whole line. A field that does not exist, such as $3 on a two-word line, is empty.

awk '{print $1, $3}' config
#User run
User
#Group run
Group

Built In Variables In Awk:
Awk’s built-in variables include the field variables—$1, $2, $3, and so on ($0 is the entire line) — that break a line of text into individual words or pieces called fields.

NR: The NR variable holds the number of input records read so far. Remember that records are usually lines. Awk runs the pattern/action statements once for each record in a file.

NF: The NF variable holds the number of fields in the current input record.

FS: The FS variable holds the field separator used to divide the input line into fields. The default is "white space", meaning space and tab characters. FS can be reassigned to another character (typically in BEGIN) to change the field separator.

RS: The RS variable holds the current record separator. Since an input line is the input record by default, the default record separator is a newline.

OFS: The OFS variable holds the output field separator, which separates the fields when awk prints them. The default is a single space. Whenever print is given several arguments separated by commas, it prints the value of OFS between them.

ORS: The ORS variable holds the output record separator, which separates the output lines when awk prints them. The default is a newline character. print automatically outputs the contents of ORS at the end of whatever it is given to print.
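
As a small illustration of how these variables work together, the sketch below sets FS, OFS and ORS in a BEGIN block. The records and variable names are made up for the example.

```shell
# Made-up colon-separated records with a different number of fields.
records=$(printf 'alice:1000:/bin/bash\nbob:1001\n')

# FS splits the input on colons; OFS is printed between the
# comma-separated arguments of print.
report=$(printf '%s\n' "$records" |
  awk 'BEGIN { FS = ":"; OFS = " | " } { print NR, NF, $1 }')
echo "$report"

# ORS replaces the newline that print normally adds after each record.
joined=$(printf '%s\n' "$records" |
  awk 'BEGIN { FS = ":"; ORS = ";" } { print $1 }')
echo "$joined"
```

The first command prints "1 | 3 | alice" and "2 | 2 | bob" on separate lines; the second prints "alice;bob;" on a single line.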


awk '{print NR,$1}' config
1 #User
2 User
3 #Group
4 Group

 awk '{print NR,$0}' config
1 #User to run service as.
2 User apache
3 #Group to run service as
4 Group apache

awk '{print NR,NF,$0}' config
1 5 #User to run service as.
2 2 User apache
3 5 #Group to run service as
4 2 Group apache

To print the first field along with the row number (NR), separated by "-":
awk '{print NR "-" $1}' config
1-#User
2-User
3-#Group
4-Group

To print only the non-empty lines (those with at least one field):
awk 'NF > 0' config
#User to run service as.
User apache
#Group to run service as
Group apache

To find the length of the longest line in the file:
awk '{ if (length($0) > max) max = length($0) } END { print max }' config
25

To count the lines in a file:
awk 'END {print NR} ' config
4

Printing lines with more than 10 characters:
 awk 'length($0) > 10' config
#User to run service as.
User apache
#Group to run service as
Group apache

To print the squares of the numbers from 1 to n (here n = 6):
awk 'BEGIN { for(i=1;i<=6;i++) print "square of", i, "is",i*i; }'
square of 1 is 1
square of 2 is 4
square of 3 is 9
square of 4 is 16
square of 5 is 25
square of 6 is 36



