
Shell script that accepts a list of file names as arguments, then counts and reports the occurrences of each word

by Anup Maurya
31 minutes read

This article presents a Bash script that accepts a list of file names as arguments, then counts and reports how often each word occurs in each file.

What does $@ mean in a shell script?

$@ refers to all of a shell script’s command-line arguments. $1, $2, etc., refer to the first command-line argument, the second command-line argument, and so on. Put variables in double quotes (for example "$@" and "$1") whenever their values might contain spaces; otherwise the shell splits each value into separate words.
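
To see what the quotes change, here is a minimal sketch. The helper names count_args and show_quoting are hypothetical and are not part of the script in this article. Inside a function, $@ expands to that function's own arguments, which makes the behaviour easy to try without a separate script file.

```shell
#!/bin/bash

# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  echo "$#"
}

# show_quoting forwards its arguments twice: once quoted, once unquoted.
show_quoting() {
  echo "quoted:   $(count_args "$@")"
  echo "unquoted: $(count_args $@)"
}

show_quoting "my file.txt" notes.txt
```

With the call shown, the quoted form reports 2 arguments and the unquoted form reports 3, because my file.txt is split at the space.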

Shell Script that accepts a list of file names as arguments and counts the occurrence of each word

#!/bin/bash

# Loop through all arguments (filenames)
for file in "$@"; do

  # Check if file exists
  if [ ! -f "$file" ]; then
    echo "File $file does not exist."
    continue
  fi

  # Count the occurrence of each word
  echo "Word count for $file:"
  cat "$file" | tr '[:upper:]' '[:lower:]' | tr -cs '[:alpha:]' '\n' | sort | uniq -c | sort -nr

done

Let’s break down how this script works:

  1. We start by looping through all arguments (filenames) passed to the script using the $@ variable.
  2. For each file, we check that it exists and is a regular file using the -f option of the test command (written here as [ ]). If the check fails, we print an error message and move on to the next file.
  3. If the file exists, we count the occurrence of each word using a combination of shell commands:
  • cat reads the contents of the file and outputs them to stdout.
  • tr is used twice: first to convert all uppercase letters to lowercase, then (with the -c and -s options) to replace every run of non-alphabetic characters with a single newline. This puts each word on its own line.
  • sort sorts the words in alphabetical order.
  • uniq -c counts the occurrence of each unique word and outputs the result.
  • Finally, sort -nr sorts the result in descending order by word count.
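
The pipeline stages described above can also be tried on their own, without a file. The following is a minimal sketch, assuming a hypothetical function name word_freq that reads text from standard input. It uses the same tr, sort, and uniq commands as the script.

```shell
#!/bin/bash

# word_freq is a hypothetical helper: it reads text on stdin and prints
# "count word" lines, most frequent first.
word_freq() {
  tr '[:upper:]' '[:lower:]' |   # 1. fold everything to lowercase
    tr -cs '[:alpha:]' '\n' |    # 2. put each word on its own line
    sort |                       # 3. bring identical words together
    uniq -c |                    # 4. count each group of identical words
    sort -nr                     # 5. highest count first
}

printf 'The cat and the hat.\n' | word_freq
```

For this input the first line of output reports the word the with a count of 2, since The and the are folded to the same lowercase word before counting. The exact column padding in front of each count depends on the uniq implementation.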

The output will display the occurrence of each word in the specified file(s).

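Note: This script treats any run of non-alphabetic characters (whitespace, punctuation, digits) as a word separator. As a result, contractions and hyphenated words are split (for example, don't becomes don and t), and numbers are dropped entirely. If that matters for your files, modify the second tr command accordingly.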
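
As one possible adjustment, the sketch below keeps digits and apostrophes inside words by widening the character set passed to the second tr command. The function name word_freq_keep is hypothetical, and this is only one way to define a word. For instance, it also keeps a leading or trailing apostrophe from single-quoted text.

```shell
#!/bin/bash

# word_freq_keep is a hypothetical variant: letters, digits, and apostrophes
# all count as part of a word; every other character is a separator.
word_freq_keep() {
  tr '[:upper:]' '[:lower:]' |
    tr -cs "[:alnum:]'" '\n' |
    sort | uniq -c | sort -nr
}

printf "Don't stop, don't go.\n" | word_freq_keep
```

Here don't is counted as a single word that occurs twice, instead of being split into don and t.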

Suppose we have two text files, test.txt and testone.txt.

test.txt file

DevOps is a set of practices, tools, and a cultural philosophy that automate and integrate the processes between software development and IT teams.

testone.txt file

DevOps is a collaboration between Development and IT Operations to make software production and Deployment in an automated & repeatable way.

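Run the script on the two files above. The first command below passes a file name that does not exist (text.txt) to show the error message.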

~/Assignment$ bash main.sh text.txt
File text.txt does not exist.
~/Assignment$ bash main.sh test.txt testone.txt
Word count for test.txt:
      3 and
      2 a
      1 tools
      1 the
      1 that
      1 teams
      1 software
      1 set
      1 processes
      1 practices
      1 philosophy
      1 of
      1 it
      1 is
      1 integrate
      1 devops
      1 development
      1 cultural
      1 between
      1 automate
Word count for testone.txt:
      2 and
      1 way
      1 to
      1 software
      1 repeatable
      1 production
      1 operations
      1 make
      1 it
      1 is
      1 in
      1 devops
      1 development
      1 deployment
      1 collaboration
      1 between
      1 automated
      1 an
      1 a
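
The run above reports each file separately. If a single combined tally across all files is wanted instead, one option is to concatenate the files before counting. This is a minimal sketch, assuming a hypothetical function name combined_word_freq. It is a variation on the script, not part of it.

```shell
#!/bin/bash

# combined_word_freq is a hypothetical helper: it counts words across all
# files given as arguments and prints one combined list, most frequent first.
combined_word_freq() {
  cat -- "$@" |
    tr '[:upper:]' '[:lower:]' |
    tr -cs '[:alpha:]' '\n' |
    sort | uniq -c | sort -nr
}

# Example call (assumes test.txt and testone.txt exist in the current directory):
# combined_word_freq test.txt testone.txt | head -n 5
```

Unlike the per-file script, this version does no existence check of its own. cat prints its own error message for any file it cannot read, and the remaining files are still counted.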

Thank you for reading. If you have read this far, please like the article; it will encourage me to write more like it. Do share your suggestions. I appreciate your honest feedback!
