Parallelizing Monte Carlo Simulation
🏆 This feature requires compute resources
MecSimCalc supports multiprocessing when higher compute resources are chosen. For more information, visit the code optimization page.
Overview
A Monte Carlo simulation can be parallelized with multiprocessing by dividing its repeated simulations among several processes. This can significantly reduce the time the simulation takes to run.
Note: This only applies when more than 2 vCPUs are selected, which requires a subscription to higher compute resources on MecSimCalc.
The following compares the run time with multiprocessing against the run time when the program runs serially, for a simple Monte Carlo simulation program with O(n) time complexity:
With multiprocessing, the run time falls as the number of CPU cores increases, in the ideal case roughly in inverse proportion to the number of cores. Without multiprocessing, the execution time stays constant no matter how many cores are available.
Here is another comparison between sequential and parallel execution time in a computationally heavy Monte Carlo Simulation program used to determine strain demand in pipes subject to ground movement. More details can be found here.
Similar to the previous example, parallelization consistently reduces program execution time as CPU cores increase regardless of the number of simulations.
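A comparison like the ones described above could be measured with a sketch along the following lines. This is not the program behind the results above: the pi estimate, the function names, and the sample count are illustrative assumptions, and `multiprocessing.Pool` is used here only to keep the timing harness short.

```python
import multiprocessing as mp
import random
import time


def count_hits(num_samples):
    """Count random points in the unit square that land inside the quarter circle."""
    hits = 0
    for _ in range(num_samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi_serial(num_samples):
    return 4 * count_hits(num_samples) / num_samples


def estimate_pi_parallel(num_samples, num_processes):
    # Assumes num_samples >= num_processes. Each task gets an equal share,
    # so the total is rounded down to a multiple of num_processes.
    share = num_samples // num_processes
    with mp.Pool(processes=num_processes) as pool:
        hits = sum(pool.map(count_hits, [share] * num_processes))
    return 4 * hits / (share * num_processes)


if __name__ == "__main__":
    n = 1_000_000

    start = time.perf_counter()
    estimate_pi_serial(n)
    serial_time = time.perf_counter() - start

    start = time.perf_counter()
    estimate_pi_parallel(n, mp.cpu_count())
    parallel_time = time.perf_counter() - start

    print(f"serial: {serial_time:.2f} s, parallel: {parallel_time:.2f} s")
```

For small sample counts the parallel version can be slower than the serial one, because starting the processes has a fixed cost.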
Implementing Multiprocessing
This tutorial requires a working Monte Carlo simulation program implemented on MecSimCalc. For more information on how to port your Python program to MecSimCalc, visit Getting Started.
There are multiple ways to implement multiprocessing and the methods shown may not work for all programs. Use these instructions as a rough guide.
- Modularize the Program
Arrange the program into functions so that a single function runs the simulation. This makes it easier to implement multiprocessing.
- Import the Multiprocessing Module
import multiprocessing as mp
- Create a Shared Object
The processes need an object that is shared between them to store any data the simulation function returns. The Manager object from the multiprocessing module can create shared lists, dictionaries, and other supported types, and the shared object must be created in main. The example below creates a list with shared data that can be accessed by all processes.
def main(inputs):
    ### Other Code
    manager = mp.Manager() # This is case-sensitive
    data = manager.list()
    ### Other Code
- Count the Number of CPU Cores
This step can be skipped if you want a specific number of processes. Otherwise, use the number of CPU cores as the number of processes.
Note: The number of CPU cores depends on the number of vCPUs selected in MecSimCalc.
def main(inputs):
    ### Other Code
    num_cores = mp.cpu_count()
    ### Other Code
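- Divide Tasks Among the Processes
In the following example, the simulation is repeated num_simulations times. split_list divides the simulation indices into num_cores parts of nearly equal size, one part per process. If there are fewer simulations than cores, only num_simulations processes are needed.
def split_list(lst, n):
    # Split lst into n consecutive parts whose lengths differ by at most one
    k, m = divmod(len(lst), n)
    return (lst[i*k+min(i, m):(i+1)*k+min(i+1, m)] for i in range(n))
def main(inputs):
    ### Other Code
    arr = range(num_simulations)
    simulations_per_process = list(split_list(arr, num_cores))
    num_processes = min(num_cores, num_simulations)
    ### Other Code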
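To see how split_list shares out the work, the short sketch below applies it to 10 simulations on 4 cores and to 2 simulations on 4 cores. The numbers are chosen only for illustration.

```python
def split_list(lst, n):
    """Split lst into n consecutive parts whose lengths differ by at most one."""
    k, m = divmod(len(lst), n)
    return (lst[i*k+min(i, m):(i+1)*k+min(i+1, m)] for i in range(n))


# 10 simulations on 4 cores: the first two processes get one extra simulation.
sizes = [len(part) for part in split_list(range(10), 4)]
print(sizes)  # [3, 3, 2, 2]

# 2 simulations on 4 cores: the last two parts are empty, which is why the
# number of processes is capped at min(num_cores, num_simulations).
few = [len(part) for part in split_list(range(2), 4)]
print(few)  # [1, 1, 0, 0]
```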
- Create the Processes
Create the processes with a for loop over the number of processes determined in steps 4 and 5. In this example, simulation() is the function that runs the simulation, and each process is created by setting target=simulation. simulation_args is a placeholder for any other arguments the simulation() function requires.
def main(inputs):
    ### Other Code
    processes = []
    for i in range(num_processes):
        # Create the process and pass it its own share of the simulations
        p = mp.Process(target=simulation, args=(simulations_per_process[i], simulation_args))
        processes.append(p) # Add the process to the list of processes
        p.start() # Starts the process
    for p in processes:
        # Wait for each process to finish executing before continuing
        p.join()
    ### Other Code
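As noted at the start of this section, there is more than one way to implement multiprocessing. One alternative to the Process and Manager steps above is `multiprocessing.Pool`, where each worker returns its result instead of appending to a shared object. The sketch below is a minimal illustration under stated assumptions: `simulate_batch`, `batch_sizes`, and `run_parallel` are hypothetical names, the toy simulation only counts doubles, and the source does not confirm how Pool behaves on MecSimCalc.

```python
import multiprocessing as mp
import random


def simulate_batch(num_simulations):
    """Hypothetical stand-in for a simulation function: count doubles in num_simulations rolls."""
    wins = 0
    for _ in range(num_simulations):
        if random.randint(1, 6) == random.randint(1, 6):
            wins += 1
    return wins


def batch_sizes(num_simulations, num_processes):
    """Split num_simulations into num_processes batch sizes that differ by at most one."""
    base, extra = divmod(num_simulations, num_processes)
    return [base + 1 if i < extra else base for i in range(num_processes)]


def run_parallel(num_simulations, num_processes):
    sizes = batch_sizes(num_simulations, num_processes)
    # pool.map runs simulate_batch once per batch size across the worker
    # processes and returns the results in the same order as the inputs.
    with mp.Pool(processes=num_processes) as pool:
        wins_per_batch = pool.map(simulate_batch, sizes)
    return sum(wins_per_batch) / num_simulations


if __name__ == "__main__":
    # Fraction of rolls that were doubles, computed across all worker processes.
    print(run_parallel(100_000, mp.cpu_count()))
```

Because the workers return their results, no Manager is needed in this variant. The trade-off is that the function passed to pool.map takes a single argument, so extra parameters have to be bundled into that argument or bound beforehand.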
Sample Implementation of Multiprocessing
The following is a Monte Carlo Simulation of a simple dice game adapted from here.
- Each player starts with $1000
- When the two dice roll the same number, the player gets 4 times their bet amount
- If the dice do not roll the same number, the player loses their bet
The following Monte Carlo Simulation divides each simulation into max_num_rolls dice rolls and repeats the simulation num_simulations times.
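The rules above also give exact expected values, which are a useful check on the simulated win probability and end balance. The sketch below works them out with exact fractions, assuming two fair six-sided dice. The function names and the bet of 6 are chosen for illustration only.

```python
from fractions import Fraction

# Two fair six-sided dice: 6 of the 36 equally likely outcomes are doubles.
p_win = Fraction(6, 36)


def expected_change_per_roll(bet):
    """Expected balance change for one roll: gain 4 * bet on doubles, lose bet otherwise."""
    return p_win * 4 * bet - (1 - p_win) * bet


def expected_end_balance(start_balance, bet, max_num_rolls):
    """Expected balance after max_num_rolls rolls at a fixed bet."""
    return start_balance + max_num_rolls * expected_change_per_roll(bet)


print(p_win)                               # 1/6
print(expected_change_per_roll(6))         # -1
print(expected_end_balance(1000, 6, 100))  # 900
```

With enough simulations, the reported win probability should approach 1/6 (about 0.167), and the average end balance should approach the starting balance minus bet / 6 for every roll.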
Before Implementing Multiprocessing
import matplotlib.pyplot as plt
import random
import time
import mecsimcalc as msc
# Rolls the dice and checks if they show the same number
def roll_dice():
    die1 = random.randint(1, 6)
    die2 = random.randint(1, 6)
    return die1 == die2
def simulate(num_simulations, max_num_rolls, bet):
    win_probability = []
    end_balance = []
    results = []
    for _ in range(num_simulations):
        balance = [1000]
        num_rolls = [0]
        num_wins = 0
        while num_rolls[-1] < max_num_rolls:
            if roll_dice(): # If both dice roll the same number
                balance.append(balance[-1] + 4 * bet)
                num_wins += 1
            else:
                balance.append(balance[-1] - bet)
            num_rolls.append(num_rolls[-1] + 1)
        win_probability.append(num_wins/num_rolls[-1])
        end_balance.append(balance[-1])
        results.append([num_rolls, balance])
    # Tracking variables
    overall_win_probability = sum(win_probability)/len(win_probability)
    overall_end_balance = sum(end_balance)/len(end_balance)
    return results, overall_win_probability, overall_end_balance
def main(inputs):
    start = time.time()
    # Inputs
    num_simulations = inputs['num_simulations']
    max_num_rolls = inputs['max_num_rolls']
    bet = inputs['bet']
    # Simulation
    results, overall_win_probability, overall_end_balance = simulate(num_simulations, max_num_rolls, bet)
    # Plotting
    plt.figure()
    plt.title("Monte Carlo Dice Game [" + str(num_simulations) + " simulations]")
    plt.xlabel("Roll Number")
    plt.ylabel("Balance [$]")
    plt.xlim(0, max_num_rolls)
    for result in results:
        num_rolls, balance = result
        plt.plot(num_rolls, balance)
    end = time.time()
    elapsed_time = end - start
    img, download = msc.print_plot(plt, download=True)
    return {
        "plot": img,
        "download": download,
        "time_taken": elapsed_time,
        "win_prob": overall_win_probability,
        "end_bal": overall_end_balance
    }
After Implementing Multiprocessing
import matplotlib.pyplot as plt
import random
import time
import multiprocessing as mp
import mecsimcalc as msc
# Rolls the dice and checks if they show the same number
def roll_dice():
    die1 = random.randint(1, 6)
    die2 = random.randint(1, 6)
    return die1 == die2
def simulate(num_simulations, max_num_rolls, bet, shared_list):
    for _ in range(num_simulations):
        balance = [1000]
        num_rolls = [0]
        num_wins = 0
        while num_rolls[-1] < max_num_rolls:
            if roll_dice(): # If both dice roll the same number
                balance.append(balance[-1] + 4*bet)
                num_wins += 1
            else:
                # Result if the dice are different numbers
                balance.append(balance[-1] - bet)
            num_rolls.append(num_rolls[-1] + 1)
        # Tracking variables
        win_probability = num_wins/num_rolls[-1]
        end_balance = balance[-1]
        # Return the results by appending them to the shared list
        shared_list.append([num_rolls, balance, win_probability, end_balance])
    return # The return value of a Process target is discarded, so return nothing
def split_list(lst, n):
    # Split lst into n consecutive parts whose lengths differ by at most one
    k, m = divmod(len(lst), n)
    return (lst[i*k+min(i, m):(i+1)*k+min(i+1, m)] for i in range(n))
def main(inputs):
    start = time.time()
    # Inputs
    num_simulations = inputs['num_simulations']
    max_num_rolls = inputs['max_num_rolls']
    bet = inputs['bet']
    # Plotting
    plt.figure()
    plt.title("Monte Carlo Dice Game [" + str(num_simulations) + " simulations]")
    plt.xlabel("Roll Number")
    plt.ylabel("Balance [$]")
    plt.xlim(0, max_num_rolls)
    ### Multiprocessing
    # Create shared list between processes
    manager = mp.Manager()
    results = manager.list()
    # Count the number of CPUs
    num_cores = mp.cpu_count()
    # Divide the tasks (number of simulations) for each process
    arr = range(num_simulations)
    simulations_per_process = list(split_list(arr, num_cores))
    num_processes = min(num_cores, num_simulations)
    # Create the processes
    processes = []
    for i in range(num_processes):
        # Each process runs as many simulations as there are entries in its partition
        p = mp.Process(target=simulate, args=(len(simulations_per_process[i]), max_num_rolls, bet, results))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
    ###
    overall_win_probability = 0
    overall_end_balance = 0
    for result in results:
        num_rolls, balance, win_probability, end_balance = result
        overall_win_probability += win_probability
        overall_end_balance += end_balance
        plt.plot(num_rolls, balance)
    # Averaging win probability and end balance
    overall_win_probability /= len(results)
    overall_end_balance /= len(results)
    end = time.time()
    elapsed_time = end - start
    img, download = msc.print_plot(plt, download=True)
    return {
        "plot": img,
        "download": download,
        "time_taken": elapsed_time,
        "win_prob": overall_win_probability,
        "end_bal": overall_end_balance,
        "cpu_count": num_processes
    }