Parallel processes: appending outputs to an array in a zsh script

I have a for loop in which a function task is called. Each call to the function returns a string that is appended to an array. I would like to parallelise this for loop. I tried using & but it does not seem to work.
Here is the non-parallelised code:
task() { sleep 1; echo "hello $1"; }
arr=()
for i in {1..3}; do
  arr+=("$(task $i)")
done
for i in "${arr[@]}"; do
  echo "$i x"
done
The output is:
hello 1 x
hello 2 x
hello 3 x
Great! But now, when I try to parallelise it with
[...]
for i in {1..3}; do
  arr+=("$(task $i)")&
done
wait
[...]
the output is empty.
This question is specifically about zsh; for its bash counterpart, please see here.
2 Answers
I could be wrong, but I'm pretty sure you don't want to do quite that.
I'll leave the original solution below, but try it with a coproc.
#! /usr/bin/zsh

coproc cat &    # cat acts as a simple relay: workers write to it, the parent reads from it

task() {
  sleep $1
  print -p "Sloppy simulation process # $1: $(date)"    # -p writes to the coprocess
}

arr=()
for i in {1..3}; do
  task $i &
done
for i in {1..3}; do
  read -p val            # -p reads one line back from the coprocess
  arr+=("$val")
done
for i in "${arr[@]}"; do
  [[ -n "$i" ]] && echo "$i"
done
Ideally, the writes to the coproc will take long enough that the reads will start first and block.
I think.
My output:
Sloppy simulation process # 1: Thu Jul 26 15:19:02 CDT 2018
Sloppy simulation process # 2: Thu Jul 26 15:19:03 CDT 2018
Sloppy simulation process # 3: Thu Jul 26 15:19:04 CDT 2018
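In case the print -p / read -p plumbing is unclear, here is the round trip on its own (a minimal sketch, nothing task-specific):

coproc cat          # start cat as the coprocess
print -p "ping"     # write one line to the coprocess's stdin
read -p reply       # read one line back from its stdout
echo "$reply"       # prints: ping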
Original file-storing version
If task is a long-running step, it might be worth parallelising the work and accepting the overhead of storing each result somewhere persistent and then loading the array from the files. Is this quick hack helpful?
task() { # task() handles persistence itself
  sleep $1
  echo "Sloppy simulation process # $1: $(date)" >| /tmp/task/$1
}
mkdir -p /tmp/task/
cd /tmp/task
for i in {1..3}; do
  task $i &                  # run them in the background
done
wait                         # wait for the last one
arr=()
for f in *; do
  arr[$f]="$(<$f)"           # read each result file into arr
done
for i in $(seq ${#arr[@]}); do
  [[ -n "${arr[$i]}" ]] && echo "${arr[$i]}"   # show them
done
rm -fr /tmp/task/
This solution works like a charm but only in bash (because export -f does not work in zsh). – BiBi 5 hours ago
Your solution will work, but I'm rather looking for a solution that does not require writing outputs to files. You can see task as a function taking 10 seconds to run, and you'd rather not add this overhead. Thanks for the solution anyway! – BiBi 5 hours ago
I figured, but assignment in background means in a subshell, yes? Won't work. You have to have some form of persistence to move the result back up to the parent. Maybe a coprocess or named pipe as a queue? – Paul Hodges 5 hours ago
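For what it's worth, the named-pipe idea from the comment above could look roughly like this (only a sketch; the fifo path and variable names are made up, and the parent keeps a read/write descriptor on the pipe so the workers' output isn't lost):

#!/usr/bin/zsh
task() { sleep $1; echo "hello $1"; }

fifo=${TMPDIR:-/tmp}/task_queue.$$   # arbitrary path for the queue
mkfifo "$fifo"
exec 3<> "$fifo"                     # open the fifo read/write and keep it open

arr=()
for i in {1..3}; do
  task $i >&3 &                      # each worker writes its result line into the queue
done
for i in {1..3}; do
  read -r -u 3 line                  # pull one result back out (completion order)
  arr+=("$line")
done
wait
exec 3>&-
rm -f "$fifo"

print -l -- "${arr[@]}"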
@BiBi, ...one could make it work without export -f in bash by using declare -f to print a declaration of your function and substitute that into the code run in the subprocess. Maybe that'll work in zsh? – Charles Duffy 4 hours ago
@PaulHodges, ...btw, again, I don't know zsh, but at least in bash the for i in $(seq ...) is a significant amount of performance overhead (forking off a subprocess and running an external command within it), compared to for i in "${!arr[@]}"; do, which is purely shell-builtin. – Charles Duffy 4 hours ago
zsh comes with zargs, which comes in handy for doing xargs-style parallel operations on shell functions.
task() {
  sleep $1
  echo "hello $1"
}

arr=()
autoload -Uz zargs
arr=("${(@f)"$(zargs -P 3 -n 1 -- {3..1} -- task)"}")
print -l ${(qqq)arr}
#>> "hello 1"
#>> "hello 2"
#>> "hello 3"
Here is the zargs --help output:
Usage: zargs [options --] [input-args] [-- command [initial-args]]

If command and initial-args are omitted, "print -r --" is used.

Options:
  --eof[=eof-str], -e[eof-str]
      Change the end-of-input-args string from "--" to eof-str. If
      given as --eof=, an empty argument is the end; as --eof or -e,
      with no (or an empty) eof-str, all arguments are input-args.
  --exit, -x
      Exit if the size (see --max-chars) is exceeded.
  --help
      Print this summary and exit.
  --interactive, -p
      Prompt before executing each command line.
  --max-args=max-args, -n max-args
      Use at most max-args arguments per command line.
  --max-chars=max-chars, -s max-chars
      Use at most max-chars characters per command line.
  --max-lines[=max-lines], -l[max-lines]
      Use at most max-lines of the input-args per command line.
      This option is misnamed for xargs compatibility.
  --max-procs=max-procs, -P max-procs
      Run up to max-procs command lines in the background at once.
  --no-run-if-empty, -r
      Do nothing if there are no input arguments before the eof-str.
  --null, -0
      Split each input-arg at null bytes, for xargs compatibility.
  --replace[=replace-str], -i[replace-str]
      Substitute replace-str in the initial-args by each initial-arg.
      Implies --exit --max-lines=1.
  --verbose, -t
      Print each command line to stderr before executing it.
  --version
      Print the version number of zargs and exit.
The example above uses --max-procs (-P max-procs) and --max-args (-n max-args).