Why while read Sometimes Eats Your Variables
by Anton Van Assche - 12 min read
If you have ever written a shell script that loops over lines of input, you may have stumbled on a surprising behavior. Consider the following code:
count=0
echo -e "a\nb\nc" | while read line; do
((count++))
done
echo "Count: $count"
You might expect it to print Count: 3. Instead, it
prints:
Count: 0
The loop clearly executed three times, so why did the variable never change? The answer lies in how Bash implements pipelines and subshells.
Pipelines and Subshells
A pipeline in Bash, like producer | consumer, is not a
single process. Bash must set up a pipe (a unidirectional data
channel in the kernel) and then run each command in the pipeline in
a separate process. That way, data flows between processes without
blocking the parent shell.
You can picture it like this:
Parent shell (bash)
|
|-- fork() -> producer (echo)
| writes "a\nb\nc" to pipe
|
`-- fork() -> consumer (while read loop)
reads from pipe
increments $count (in child only)
Variables in Bash are scoped to the process. The count
you incremented lives in the subshell created for the consumer. When
the subshell exits, its memory is destroyed. The parent shell's
count was never touched, which is why the final
echo shows 0.
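You can demonstrate this per-process scoping without any pipeline at all, by using an explicit subshell (the parentheses below fork a child shell):

```shell
count=0

# The parentheses fork a subshell: a separate process with its own
# copy of every variable.
( count=42; echo "Inside subshell: count=$count" )

# The parent's copy was never touched.
echo "In parent: count=$count"
# -> In parent: count=0
```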
We can visualize what happens in memory:
Parent Shell Memory
-------------------
count = 0
ENV_VAR = "original"
Pipeline forks -> Subshell (while loop)
-------------------
count = 0 <- incremented independently
ENV_VAR = "original"
After subshell exits:
-------------------
Parent shell memory unchanged:
count = 0
ENV_VAR = "original"
This combination shows both the process-level separation (PID view) and the variable-level effect (memory view). It makes it much easier to understand why your loop doesn't change the parent shell's variables.
Why Bash Does This
You might wonder why Bash doesn't just run the last command of a pipeline in the current shell. The reason is consistency. The POSIX specification allows the shell to decide whether each part of a pipeline runs in a subshell or not, but historically most shells fork each stage to avoid tricky edge cases where builtins would otherwise block the pipeline.
"Each command in a multi-command pipeline, where pipes are created, is executed in a subshell, which is a separate process." (The Bash Manual)
The side effect is that anything you do in that last stage - setting variables, changing directories, or modifying shell options - disappears when the subshell exits.
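For example, enabling a shell option in the last stage of a pipeline (here nullglob, picked purely as an illustration) is silently lost:

```shell
shopt -u nullglob                    # start with the option disabled

echo "x" | { shopt -s nullglob; }    # the group runs in a subshell

# The subshell's change never reached the parent shell.
shopt -q nullglob && echo "nullglob: on" || echo "nullglob: off"
# -> nullglob: off
```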
Here is a simplified ASCII view of what happens:
Without pipeline:
bash (pid 1000)
|
`-- runs while-loop directly
updates $count in pid 1000
With pipeline:
bash (pid 1000)
|
|-- fork() -> pid 1001 (echo)
|
`-- fork() -> pid 1002 (while loop)
updates $count in pid 1002
exits, state lost
Proving the Subshell
We can actually visualize this behavior by printing the process ID of each stage:
echo "Parent PID: $BASHPID"
echo -e "a\nb\nc" | while read line; do
echo "Loop PID: $BASHPID"
done
Here we use the BASHPID variable, which holds the PID
of the current Bash process. The output will look something like
this:
Parent PID: 1219125
Loop PID: 1225380
Loop PID: 1225380
Loop PID: 1225380
Notice that the Loop PID is different from the
Parent PID, confirming that the loop runs in a separate
subshell process. Each iteration of the loop shows the same PID
because it is the same subshell instance handling all reads from the
pipe.
In the example above we used $BASHPID instead of the
older $$, because $$ always shows the PID of the original
shell (the parent shell), not the current subshell. Had we
used $$, the result would simply have been four times
1219125, which could be misleading.
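A side-by-side comparison makes the difference obvious:

```shell
echo "Parent: \$\$=$$ BASHPID=$BASHPID"

echo "x" | while read line; do
    # $$ still reports the original shell's PID;
    # $BASHPID reports the PID of this subshell.
    echo "Loop:   \$\$=$$ BASHPID=$BASHPID"
done
```

In the parent line both values match; in the loop line, $$ still shows the parent's PID while $BASHPID shows the subshell's.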
Another way to see the subshell effect is to modify the environment, e.g. by changing a variable or changing directories, and observe that the change does not persist after the loop.
count=0
echo -e "a\nb\nc" | while read line; do
((count++))
done
echo "Count: $count" # prints 0, variable change lost
echo "Initial directory: $(pwd)"
echo -e "/tmp\n/home" | while read dir; do
cd "$dir"
echo "Inside loop: $(pwd)"
done
echo "After loop: $(pwd)" # parent shell directory unchanged
When we execute this, we see that the count remains
0, and the directory after the loop is the same as
before, confirming that changes inside the loop do not affect the
parent shell.
Resulting in output like:
Count: 0
Initial directory: /home/anton
Inside loop: /tmp
Inside loop: /home
After loop: /home/anton
Both examples clearly demonstrate that the loop runs in a subshell, and each subshell has its own separate memory and environment.
Using {...} vs. (...) Groups
Bash supports two types of command grouping: curly braces and parentheses. Each has different implications for variable scope and subshell behavior:
- Curly braces { …; } run the commands in the current shell. Any variable changes or environment modifications persist after the group completes.
- Parentheses ( … ) run the commands in a subshell. Variable changes and environment modifications do not persist after the subshell exits.
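Outside of a pipeline, the difference between the two groupings is easy to observe:

```shell
count=0

{ count=1; }                  # braces: runs in the current shell
echo "After braces: count=$count"
# -> After braces: count=1

( count=99 )                  # parentheses: runs in a subshell
echo "After parens: count=$count"
# -> After parens: count=1
```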
An important caveat when using curly braces inside a pipeline is that they do not prevent that stage of the pipeline from running in a subshell. The grouped commands still run in a subshell; the curly braces only affect how the commands are grouped within it.
For example:
count=0
echo -e "a\nb\nc" | { while read line; do ((count++)); done; }
echo "Count: $count" # still prints 0, because the whole pipeline is in a subshell
Using parentheses inside a pipeline just nests a subshell within a subshell:
count=0
echo -e "a\nb\nc" | ( while read line; do ((count++)); done )
echo "Count: $count" # still prints 0, because the whole pipeline is in a subshell
This makes it clear that grouping alone does not overcome the subshell behavior of pipelines.
Workarounds
There are a few ways to avoid this behavior, depending on your Bash version, coding style, and requirements.
Redirect input into the loop
Instead of piping into the loop, feed it input via process substitution. This way the loop runs in the parent shell:
count=0
while read line; do
((count++))
done < <(echo -e "a\nb\nc")
echo "Count: $count"
# -> Count: 3
Here, < <(...) creates a temporary file
descriptor that Bash connects directly to the loop, so no subshell
is needed for the while.
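When the input is a short literal or already sits in a variable, a here-string achieves the same thing with less ceremony. Like process substitution, it feeds the loop through a redirection rather than a pipe, so the loop stays in the current shell:

```shell
count=0
data=$'a\nb\nc'

# No pipeline here: the here-string is just a redirection,
# so the loop runs in the parent shell and count persists.
while read line; do
    count=$((count+1))
done <<< "$data"

echo "Count: $count"
# -> Count: 3
```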
Rethink the design
Sometimes, you don't need to maintain state in a loop at all. For
example, if you only need to count lines, a utility like
wc can do it directly:
count=$(echo -e "a\nb\nc" | wc -l)
echo "Count: $count"
# -> Count: 3
Another modern approach (since Bash 4.0) is to use
mapfile (or readarray) to read all lines
into an array, which runs in the current shell and avoids subshell
issues:
mapfile -t lines < <(echo -e "a\nb\nc")
count=${#lines[@]}
echo "Count: $count"
# -> Count: 3
Both wc -l and mapfile avoid the need for
a subshell entirely. They are often faster and express the intent
more clearly.
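Because mapfile runs in the current shell, the resulting array remains available afterwards, and a plain for loop over it can update state freely:

```shell
mapfile -t lines < <(echo -e "a\nb\nc")

joined=""
for line in "${lines[@]}"; do
    joined+="$line,"          # no pipe, no subshell: changes persist
done

echo "Count: ${#lines[@]}, joined: ${joined%,}"
# -> Count: 3, joined: a,b,c
```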
Use Bash's lastpipe option
Since Bash 4.2, you can enable lastpipe, which tells
Bash to run the last command in a pipeline in the current shell
rather than a subshell. It only takes effect while job
control is inactive, which is why it works in scripts but
not, by default, in your interactive terminal.
shopt -s lastpipe
count=0
echo -e "a\nb\nc" | while read line; do
((count++))
done
echo "Count: $count"
# -> Count: 3
With lastpipe enabled, Bash executes the loop directly
in the parent process. This is closer to what many people expect,
but it is not enabled by default because it can subtly break
portability.
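Because lastpipe only takes effect while job control is inactive, you can still experiment with it in an interactive terminal by disabling job control first with set +m (in scripts, job control is already off):

```shell
set +m              # turn off job control (on by default interactively)
shopt -s lastpipe

count=0
echo -e "a\nb\nc" | while read line; do
    count=$((count+1))
done
echo "Count: $count"
# -> Count: 3
```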
While it looks convenient, be cautious about using it: it may introduce hard-to-debug bugs, with a variable changing where you least expect it, and other parts of your script might rely on the traditional behavior of pipelines creating subshells. Most of the time, a redesign or a different writing style will lead to more robust and maintainable code.
The Bottom Line
Pipelines in Bash fork processes, and a while read loop
fed by a pipeline runs in a subshell. Any variables modified inside
that loop are lost once the subshell exits. To preserve state, you
can redirect input instead of piping, enable
lastpipe in modern Bash, or restructure your code to
avoid the issue entirely.
Once you understand that every pipeline stage is its own process, the mystery of the disappearing variable becomes much less magical - and much easier to avoid in your scripts.
References & Further Reading
- man bash - Bash Manual
- man 2 fork - fork() System Call Manual
- man 2 pipe - pipe() System Call Manual
- man 2 wait - wait() System Call Manual
- help shopt - Bash shell options (see lastpipe)