Bash Subshells

Using subshells

In Bash and other Unix shells, wrapping commands in parentheses ( ) runs them in a subshell: a child process that starts as a copy of the current shell. Changes the subshell makes to its own state, such as variables, the working directory, or shell options, do not carry back to the parent shell. Some examples of using subshells are:

(cd /some/dir; ls)
# The cd happens only inside the subshell; the parent shell never leaves its directory.

(export VAR=value; echo "$VAR") # Prints "value"; VAR in the parent shell is unchanged

( uptime; date ) # Runs two commands in one subshell, printing both results

Subshells are useful for:

  • Testing commands - Running commands in a subshell keeps changes to variables and the working directory away from the current shell, which makes it convenient for trying things out. Commands that modify files still modify them for real.

  • Running multiple commands - Grouping commands within () runs them all in the same subshell.

  • Creating a new temporary environment - Variables, the working directory, and shell options changed within the subshell are discarded when it exits. Files created inside it remain on disk.

The subshell essentially provides a "sandbox" for shell state: commands can change variables and directories in isolation from the parent shell. It is not a security boundary, since the commands run with the same permissions and can modify the same files.
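The isolation described above can be checked directly. The following is a minimal sketch; the scratch directory, the variable names, and the file name are all made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: what a subshell isolates and what it does not.
# demo_dir, greeting and the file name are hypothetical.

demo_dir=$(mktemp -d)
start_dir=$PWD
greeting="parent"

(
  cd "$demo_dir" || exit 1   # changes directory only inside the subshell
  greeting="child"           # modifies the subshell's copy of the variable
  touch created-in-subshell  # creates a real file on disk
)

echo "greeting in parent: $greeting"

dir_unchanged=no
if [ "$PWD" = "$start_dir" ]; then dir_unchanged=yes; fi
echo "working directory unchanged: $dir_unchanged"

file_survived=no
if [ -e "$demo_dir/created-in-subshell" ]; then file_survived=yes; fi
echo "file still on disk: $file_survived"

rm -rf "$demo_dir"           # clean up the scratch directory
```

The variable and the working directory come back untouched, while the file outlives the subshell and has to be cleaned up explicitly.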


By default, a subshell runs synchronously. This means:

  • The parent shell waits for the subshell to finish executing before continuing.

  • Commands within the subshell also run synchronously - one after the other.

This is in contrast to running commands in the background using "&", which runs them asynchronously - the parent shell does not wait, and commands can run concurrently.

Two consequences follow:

  • The parent resumes with its own state - by the time the next command runs, the subshell has exited, and its variable and directory changes have gone with it. Files it wrote remain.

  • It's easy to reason about the order of execution - commands run one after the other in a predictable way.

In summary, a subshell runs its commands synchronously, and the parent shell waits for it to finish before continuing. Combined with the isolation of shell state, this makes subshells useful for testing and for temporarily modifying the environment.
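To see the difference between a plain subshell and one sent to the background, a small experiment helps. This is only a sketch: the log file is a hypothetical scratch file used to record the order in which things happen.

```bash
#!/usr/bin/env bash
# Sketch: a plain subshell blocks the parent; one followed by & does not.

log=$(mktemp)   # hypothetical scratch file recording the order of events

# Synchronous: the parent does not reach the next line until this finishes
( sleep 1; echo "sync subshell done" >> "$log" )
echo "parent resumed after sync subshell" >> "$log"

# Asynchronous: the parent moves on while the subshell is still sleeping
( sleep 1; echo "background subshell done" >> "$log" ) &
echo "parent continued immediately" >> "$log"
wait            # block until the background subshell exits

order=$(cat "$log")
echo "$order"
rm -f "$log"
```

In the first pair the subshell's line is logged before the parent's; in the second pair the parent's line comes first, because the background subshell is still sleeping when the parent writes.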


Parallel Programming

Here is a short introduction to parallel programming in Bash shell scripts:

Bash allows you to run commands and scripts in parallel using:

  • Subshells and background jobs - Appending & to a command, or to a ( ) group, runs it in the background. Each background job is a newly forked process, and the parent shell carries on without waiting for it.

  • GNU Parallel - A separate tool (not part of Bash) to run jobs in parallel. You pass it a list of jobs and it forks the required number of processes.

This allows you to:

  • Speed up repetitive tasks by running them concurrently.

  • Utilize all CPU cores to run jobs in parallel.

You have to be careful of:

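  • Race conditions - Output from parallel processes can interleave when they write to the same terminal or file, and the order in which jobs finish is not guaranteed.

  • Shared state - Background jobs and subshells each get their own copy of the shell's variables, so they cannot update variables in the parent. What does need coordination is anything external that they share, such as files and directories.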
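One detail is worth checking by hand: each background subshell works on its own copy of the parent's variables, so results have to come back some other way. The sketch below assumes a hypothetical scratch directory and gives every job its own output file, which also sidesteps interleaved writes.

```bash
#!/usr/bin/env bash
# Sketch: background subshells cannot update the parent's variables,
# so each job hands its result back through its own file.

out_dir=$(mktemp -d)   # hypothetical scratch directory
total=0

for n in 1 2 3; do
  (
    total=$((total + n))                  # changes only this subshell's copy
    echo $((n * n)) > "$out_dir/job-$n"   # one file per job, no shared writes
  ) &
done
wait                                      # let all three jobs finish

echo "total in parent: $total"

# Combine the per-job results once every job has exited
sum=0
for f in "$out_dir"/job-*; do
  sum=$((sum + $(cat "$f")))
done
echo "sum of squares: $sum"

rm -rf "$out_dir"
```

The parent's total stays at 0 no matter what the jobs do to their copies, while the per-job files carry the squares 1, 4 and 9 back for a sum of 14.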

Some uses of parallel programming in Bash are:

  • Running multiple scripts at once

  • Processing a list of files concurrently

  • Running a for loop in parallel over a range

Examples:

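# Run 4 copies of script.sh in parallel, then wait for all of them
./script.sh & ./script.sh & ./script.sh & ./script.sh &
wait

# Use GNU Parallel to run script.sh 10 times, up to 10 at once
# (-N0 reads one argument per job but passes none to the command)
parallel -j10 -N0 ./script.sh ::: {1..10}

# Process files in parallel, 4 at a time
parallel -j4 ./process.sh {} ::: *.txt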

So in summary, using subshells and tools like GNU Parallel, you can achieve simple parallel programming in Bash to speed up repetitive tasks by running them concurrently on multiple CPU cores.
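When GNU Parallel is not available, a for loop can be run in parallel with plain Bash while still capping the number of simultaneous jobs. This is a rough sketch that assumes Bash 4.3 or newer (for wait -n); process_item, max_jobs and work_dir are hypothetical names standing in for real work.

```bash
#!/usr/bin/env bash
# Sketch: run a loop body in parallel with at most max_jobs at a time.
# Assumes Bash 4.3+ for `wait -n`; all names here are hypothetical.

max_jobs=4
work_dir=$(mktemp -d)

process_item() {             # stand-in for the real per-item work
  sleep 1
  echo "item $1" > "$work_dir/$1.done"
}

for i in {1..10}; do
  # At the limit: block until any one running job finishes
  while (( $(jobs -rp | wc -l) >= max_jobs )); do
    wait -n
  done
  process_item "$i" &        # the function runs in a background subshell
done
wait                         # wait for the last batch

files=("$work_dir"/*.done)
finished=${#files[@]}
echo "finished jobs: $finished"
rm -rf "$work_dir"
```

The cap matters because starting one process per item without a limit can overwhelm the machine when the list is long.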


Script Example

Here is how you can run a bash script four times in parallel, as four separate processes, using subshells:

#!/bin/bash

# script.sh is your bash script, assumed to be in the current directory

( ./script.sh ) & # Run in subshell 1
( ./script.sh ) & # Run in subshell 2
( ./script.sh ) & # Run in subshell 3
( ./script.sh ) & # Run in subshell 4

wait # Wait for all four background subshells to finish

What we are doing here is:

  • We have our bash script script.sh

  • We spawn 4 subshells using ( )

  • We put each subshell in the background by placing & after the closing parenthesis

  • This forks 4 processes, each running the script concurrently

  • We use wait at the end to wait for all of them to finish

The key points are:

  • The & goes outside the parentheses, so each subshell is a background job of the main script

  • wait only waits for the main script's own background jobs. Writing ( script.sh & ) instead would start the script inside a subshell that exits immediately, and wait in the main script would return without waiting for it

  • The subshell is optional here: ./script.sh & starts the same kind of background process. The ( ) form is worth using when each job needs several commands, or its own variables and working directory

  • We use wait to ensure all 4 processes finish before the main script continues

In summary, by running each subshell in the background with &, we spawn multiple processes in parallel, and a single wait brings the main script back in step once they have all finished.
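When the exit status of each job matters, the PIDs can be recorded at launch and passed to wait one at a time. A minimal sketch, in which worker is a hypothetical stand-in for script.sh and the failure rule is invented purely for illustration:

```bash
#!/usr/bin/env bash
# Sketch: start four background jobs, then collect each exit status.
# worker is a hypothetical stand-in for ./script.sh.

worker() {
  sleep 1
  return "$(( $1 % 2 ))"     # odd-numbered workers fail, for illustration
}

pids=()
for i in 1 2 3 4; do
  worker "$i" &
  pids+=("$!")               # $! is the PID of the most recent background job
done

failed=0
for pid in "${pids[@]}"; do
  if ! wait "$pid"; then     # wait PID returns that job's exit status
    failed=$((failed + 1))
  fi
done

echo "failed jobs: $failed"
```

A bare wait with no arguments returns 0 regardless of how the jobs ended, so waiting on each PID is the way to notice failures.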


Disclaimer: I used Rix to generate this article. In my course, I will elaborate with diagrams and examples. Join the course to learn how to actually solve a problem in parallel and how to plan for parallel execution. Learn and prosper. 🖖
