Parallel Processing with PowerShell
Introduction

Before looking at parallel processing in PowerShell, let’s clarify what PowerShell is and what parallel (asynchronous) processing means. PowerShell is a Windows command-line shell designed especially for system administrators. While it may look similar to the Command Prompt app in Windows, it gives corporate IT managers a scalable way to automate business-critical tasks on every Windows PC across a wide area network.

What is Synchronous?

In synchronous execution, the next task starts only after the task before it has completed.

What is Asynchronous?

Running something asynchronously means it is non-blocking: you start it and carry on with other tasks without waiting for it to complete.
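A minimal sketch of non-blocking execution, using the Start-Job cmdlet that is covered in the Jobs section of this article. The variable names and the two-second sleep are illustrative assumptions, not part of the original scripts.

```powershell
# Start a background job; Start-Job returns a job object immediately.
$job = Start-Job -ScriptBlock {
    Start-Sleep -Seconds 2      # stands in for slow work
    'slow work finished'
}

# The caller is not blocked, so this runs while the job is still working.
$stateRightAfterStart = $job.State
Write-Output "Job state right after start: $stateRightAfterStart"

# Block only at the point where the result is actually needed.
$result = $job | Wait-Job | Receive-Job
Write-Output "Job result: $result"

# Clean up the finished job.
$job | Remove-Job
```

The caller only waits at Wait-Job; everything between Start-Job and Wait-Job runs while the job is busy.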

What is Parallelism?

Parallelism means to run multiple tasks at the same time, in parallel. Parallelism works well when you can separate tasks into independent pieces of work.

Description

When you think about a PowerShell script, you’re probably thinking of synchronous execution: PowerShell executes code one line at a time. It starts executing one command, waits for it to complete, and then starts the next one.

Synchronous execution is fine for scripts in which each line depends on the result of the line before it. When a script can take many minutes or even an hour, you can instead execute code asynchronously with any of the PowerShell features below.

Jobs (PowerShell 2.0)

A job is a piece of code that runs in the background. Running n jobs creates n background processes, because each job starts a separate PowerShell instance to complete its execution. (You can see the instances that are created while jobs are running in Task Manager.)


PowerShell provides several cmdlets for working with PS Jobs.

Start-Job: Creates and starts a job.

1..5 | % {Start-Job { "Hello" } }


Get-Job: Gets all jobs that were started with the Start-Job cmdlet.
Wait-Job: Waits for the jobs passed to it to complete.

Get-Job | Wait-Job

Receive-Job: Returns the output of a job so that it can be printed to the console.

Get-Job | Receive-Job

Remove-Job: Deletes jobs that were created with the Start-Job command.

Note: jobs that were created must be removed with this command; otherwise they remain in the session.

Get-Job | Remove-Job
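
The cmdlets above can be combined into one small lifecycle. The following is a sketch rather than a definitive pattern: the variable names are illustrative, and passing the number through -ArgumentList is an addition here, needed because the script block runs in a separate process where the loop variable is not visible.

```powershell
# Start five background jobs; each receives its number through -ArgumentList.
$jobs = 1..5 | ForEach-Object {
    Start-Job -ScriptBlock { param($n) "Hello $n" } -ArgumentList $_
}

# Wait for all of them to finish, then collect their output.
$output = $jobs | Wait-Job | Receive-Job

# Remove the finished jobs so they do not pile up in the session.
$jobs | Remove-Job

# Sort the output for display, since the arrival order is not relied on here.
$output | Sort-Object
```

Keeping the job objects in a variable makes it possible to wait for and remove only these jobs, instead of every job returned by Get-Job.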
ThreadJob (PowerShell 6.0)

This is a thread-based job and a lighter-weight solution than Jobs. Unlike traditional PS Jobs, which spawn a whole new host process for each running job, PS ThreadJobs run in multiple threads within the same process, which greatly improves performance by lowering overhead.

There are drawbacks to using a ThreadJob over a background job. If a background job hangs, only that process hangs, and all other jobs keep running. Because thread jobs share a single process, a job that hangs with ThreadJob can affect the entire queue.

Measure-Command {1..5 | % {Start-Job {Start-Sleep 1}} | Wait-Job} | Select-Object TotalSeconds
Measure-Command {1..5 | % {Start-ThreadJob {Start-Sleep 1}} | Wait-Job} | Select-Object TotalSeconds

TotalSeconds
------------
   5.7665849
   1.5735008

The syntax is very similar to PS Jobs: Start-Job is replaced with Start-ThreadJob, and the resulting jobs work with the same Wait-Job, Receive-Job, and Remove-Job cmdlets. The -ThrottleLimit parameter sets the number of jobs that can run concurrently (the default value is 5).
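
As a rough illustration of the throttle limit, the sketch below queues six thread jobs but allows only two to run at a time, so the batch is expected to take on the order of three seconds rather than one. The item count, the limit of 2, and the variable names are assumptions chosen for the example.

```powershell
# Queue six thread jobs, with at most two running concurrently.
$jobs = 1..6 | ForEach-Object {
    Start-ThreadJob -ThrottleLimit 2 -ArgumentList $_ -ScriptBlock {
        param($n)
        Start-Sleep -Seconds 1      # stands in for real work
        "Item $n done"
    }
}

# Jobs beyond the limit stay queued until a running job finishes.
$results = $jobs | Wait-Job | Receive-Job

# Thread jobs are removed with the same cmdlet as background jobs.
$jobs | Remove-Job

$results
```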

Parallel foreach (PowerShell 7.0)

Each input object that is piped to ForEach-Object with the -Parallel script block is processed in its own thread. This is typically faster than Jobs and at least comparable to ThreadJob. The script block runs in parallel for each piped input object.

If your script crunches a lot of data over a significant period of time, and the machine you are running on has multiple cores that can host the script block threads, the -ThrottleLimit parameter should be set approximately to the number of available cores. If you are running on a VM with a single core, it makes little sense to run compute-heavy script blocks in parallel, since the system must serialize them anyway to run on the single core.

Scripts that do a lot of file operations, or perform operations on external machines, can benefit from running in parallel. Since such a script spends much of its time waiting and cannot use all of the machine cores, it makes sense to set the -ThrottleLimit parameter to something greater than the number of cores. If one script execution waits many minutes to complete, you may want to allow tens or hundreds of scripts to run in parallel.
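
The two cases can be sketched side by side. This is an illustration under stated assumptions: the arithmetic loop merely stands in for compute-heavy work, Start-Sleep stands in for a slow file or network call, and the limit of 20 is an arbitrary example value.

```powershell
# Number of logical processors visible to this process.
$cores = [Environment]::ProcessorCount

# Compute-heavy work: keep the throttle limit close to the core count.
$sums = 1..8 | ForEach-Object -Parallel {
    $total = 0
    foreach ($i in 1..($_ * 10000)) { $total += $i % 7 }
    $total
} -ThrottleLimit $cores

# Wait-heavy work: a limit above the core count lets the waits overlap.
$waited = 1..20 | ForEach-Object -Parallel {
    Start-Sleep -Milliseconds 500   # stands in for a slow file or network call
    $_
} -ThrottleLimit 20

"Compute results: $($sums.Count), wait results: $($waited.Count)"
```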

1..5 | ForEach-Object -Parallel { "Hello $_"; sleep 1; } -ThrottleLimit 5 
Hello 1 
Hello 3 
Hello 2 
Hello 4 
Hello 5

Parallel processing is an ideal solution when you want to run jobs that are independent of each other.
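
One practical consequence of this independence is that each parallel iteration runs in its own runspace and does not see variables from the calling script. A minimal sketch, assuming a hypothetical $prefix variable, shows how the $using: scope modifier copies a value into each iteration:

```powershell
$prefix = 'Hello'   # defined in the calling scope (illustrative name)

$greetings = 1..3 | ForEach-Object -Parallel {
    $p = $using:prefix      # copy the caller's value into this iteration
    "$p $_"
}

# Sort for display, because parallel output order is not guaranteed.
$greetings | Sort-Object
```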

Performance test

#% -> ForEach-Object
Measure-Command {1..5 | % {Start-Sleep 1} } | Select-Object TotalSeconds
#Job
Measure-Command {1..5 | % {Start-Job {Start-Sleep 1}} | Wait-Job} | Select-Object TotalSeconds
#Thread Job
Measure-Command {1..5 | % {Start-ThreadJob -ThrottleLimit 5 {Start-Sleep 1}} | Wait-Job} | Select-Object TotalSeconds
#ForEach-Object Parallel
Measure-Command {1..5 | ForEach-Object -Parallel {Start-Sleep 1} -ThrottleLimit 5} | Select-Object TotalSeconds

% is an alias for ForEach-Object.

  • The regular foreach command ran sequentially and took about 5 seconds, because each iteration took one second to complete.
  • With PS Jobs it took 7 seconds (about 2 seconds of overhead for starting the jobs, assigning runspaces, etc.).
  • With PS ThreadJob it took about 1 second, because all jobs ran asynchronously at the same time. (The background jobs that were created still need to be removed manually.)
  • Parallel foreach also completed within about a second, because the iterations run concurrently up to the throttle limit, which should be set according to the number of CPU cores.

Scripts attached (executed with VS Code)

  1. Folder Copy (with Thread Job vs Regular way)
  2. API call with parallel foreach
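
The attached scripts are not reproduced here. As an independent illustration of the first idea only, and not the attached script itself, the sketch below copies each subfolder of a throwaway source tree in its own thread job. The temp-folder layout and every name in it are hypothetical.

```powershell
# Build a small throwaway source tree under the temp folder (hypothetical layout).
$root   = Join-Path ([IO.Path]::GetTempPath()) "copy-demo-$([guid]::NewGuid())"
$source = Join-Path $root 'source'
$target = Join-Path $root 'target'
1..4 | ForEach-Object {
    $dir = New-Item -ItemType Directory -Path (Join-Path $source "folder$_") -Force
    Set-Content -Path (Join-Path $dir.FullName 'data.txt') -Value "content $_"
}
New-Item -ItemType Directory -Path $target -Force | Out-Null

# Copy each top-level subfolder in its own thread job.
$jobs = Get-ChildItem -Path $source -Directory | ForEach-Object {
    Start-ThreadJob -ArgumentList $_.FullName, $target -ScriptBlock {
        param($from, $to)
        Copy-Item -Path $from -Destination $to -Recurse -Force
    }
}

# Wait for the copies, surface any errors, and clean up the jobs.
$jobs | Wait-Job | Receive-Job
$jobs | Remove-Job

$copiedFiles = Get-ChildItem -Path $target -Recurse -File
"Copied $($copiedFiles.Count) files"
```

Splitting by top-level subfolder keeps the jobs independent of each other, which matches the condition under which parallel execution pays off.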

Ms. Roshni Bokade