(15)THREADING VS MULTIPROCESSING:
Process: A process is an instance of a running program. Processes are independent of each other.
key facts:
• A new process starts independently of the process that created it.
• Processes can take advantage of multiple cores and CPUs.
• Each process has its own separate memory space.
• Memory is not shared between processes.
• Starting a process is slower than starting a thread.
• Processes have a larger memory footprint.
Threads: A thread is an entity within a process that can be scheduled for execution. A process can spawn multiple threads. The main difference from processes is that all threads within a process share the same memory.
key facts:
• Multiple threads can be spawned within one process
• Memory is shared between all threads
• Starting a thread is faster than starting a process
• Lightweight - low memory footprint
• Threads can be neither interrupted nor killed from outside
for example:
from threading import Thread

def square_numbers():
    for i in range(1000):
        result = i * i

if __name__ == "__main__":
    threads = []
    num_threads = 10
    # create threads and assign a function for each thread
    for i in range(num_threads):
        thread = Thread(target=square_numbers)
        threads.append(thread)
    # start all threads
    for thread in threads:
        thread.start()
    # wait for all threads to finish
    # block the main thread until these threads are finished
    for thread in threads:
        thread.join()
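To see the shared memory mentioned in the key facts, here is a minimal sketch in which several threads write into the same list. The names `shared_results` and `record_square` are invented for this illustration.

```python
from threading import Thread

shared_results = []  # one list, visible to every thread in this process


def record_square(n):
    # Each thread appends to the same list object.
    shared_results.append(n * n)


threads = [Thread(target=record_square, args=(n,)) for n in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The order of appends may vary between runs, so sort before printing.
print(sorted(shared_results))  # [0, 1, 4, 9, 16]
```

All five results end up in the one list because the threads live in the same process.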
What is the GIL?
The GIL, or Global Interpreter Lock, is a lock in CPython that allows only one thread at a time to hold control of the Python interpreter. It prevents certain errors that would occur if multiple threads tried to control the interpreter at the same time.
What is the use of threading?
It is useful for I/O-bound tasks, where your program talks to slow devices such as network connections. A threaded program can use the time spent waiting on these devices to run other tasks.
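A small sketch of this effect, assuming that time.sleep() is an acceptable stand-in for waiting on a network response. The helper names `fake_download`, `run_sequential` and `run_threaded` are hypothetical.

```python
import time
from threading import Thread


def fake_download(delay):
    # time.sleep stands in for waiting on a slow device or network call
    time.sleep(delay)


def run_sequential(n, delay):
    start = time.perf_counter()
    for _ in range(n):
        fake_download(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    start = time.perf_counter()
    threads = [Thread(target=fake_download, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


sequential = run_sequential(5, 0.1)
threaded = run_threaded(5, 0.1)
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The exact timings depend on the machine, but the threaded run should finish in roughly the time of one wait instead of five, because the waits overlap.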
Multiprocessing
The multiprocessing module offers an API similar to the threading module, but runs each task in its own process.
for example:
from multiprocessing import Process
import os

def square_numbers():
    for i in range(1000):
        result = i * i

if __name__ == "__main__":
    processes = []
    num_processes = os.cpu_count()
    # create processes and assign a function for each process
    for i in range(num_processes):
        process = Process(target=square_numbers)
        processes.append(process)
    # start all processes
    for process in processes:
        process.start()
    # wait for all processes to finish
    # block the main thread until these processes are finished
    for process in processes:
        process.join()
What is the use of multiprocessing?
It is useful for CPU-bound tasks that take a long time and perform many CPU operations. Because each process runs its own interpreter, the work can run on several cores in parallel.
Why is the GIL needed?
The GIL is needed because CPython's memory management is not thread-safe. For memory management, Python uses reference counting: every object has a reference count that tracks the number of references pointing to it. When this count reaches zero, the memory occupied by the object is released. Without a lock, two threads could change the same count at the same time and corrupt it.
Avoiding the GIL
The GIL is very controversial in the Python community. The main way to avoid it is to use multiprocessing instead of threading. Another way is to use a different Python implementation such as Jython or IronPython instead of CPython. You can also move parts of the application out into extension modules written in C/C++.
How to use the threading module?
1. How to create and run threads?
Create a thread with 'threading.Thread()'. It takes two important arguments:
◼️ target: A callable object (function) to be invoked when the thread starts.
◼️ args: The arguments for the target function. This must be a tuple.
Start the thread with thread.start() and wait for it to finish with thread.join().
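A minimal sketch of passing arguments through args. The function `greet` and the list `greetings` are invented for this example.

```python
from threading import Thread

greetings = []


def greet(name, punctuation):
    greetings.append(f"Hello, {name}{punctuation}")


# args must be a tuple; its items are passed to greet() in order
t = Thread(target=greet, args=("world", "!"))
t.start()
t.join()  # wait until the thread has finished

print(greetings)  # ['Hello, world!']
```

For a single argument, remember the trailing comma, as in args=("world",).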
2. Lock
A lock is a synchronization mechanism for limiting access to a resource when there are many threads of execution.
A lock has two states: locked and unlocked.
lock.acquire(): Locks the state. If the lock is already held by another thread, the call blocks until it is released.
lock.release(): Unlocks the state again.
important: You should always release the lock again after it was acquired!
for example:
# import Lock
from threading import Thread, Lock
import time

database_value = 0

def increase(lock):
    global database_value
    # lock the state
    lock.acquire()
    local_copy = database_value
    local_copy += 1
    time.sleep(0.1)
    database_value = local_copy
    # unlock the state again
    lock.release()

if __name__ == "__main__":
    lock = Lock()
    print('Start value:', database_value)
    t1 = Thread(target=increase, args=(lock,)) # notice the comma after lock since args must be a tuple
    t2 = Thread(target=increase, args=(lock,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print('End value:', database_value)
    print('end main')
output =
Start value: 0
End value: 2
end main
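If the code between acquire() and release() raises an exception, the lock is never released and other threads wait forever. One common way to guard against this is try/finally. This is a sketch with made-up names (`counter`, `safe_increment`):

```python
from threading import Thread, Lock

counter = 0
lock = Lock()


def safe_increment(times):
    global counter
    for _ in range(times):
        lock.acquire()
        try:
            counter += 1
        finally:
            # runs even if the protected code raises an exception
            lock.release()


threads = [Thread(target=safe_increment, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4000
```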
To release the lock automatically, use the lock as a context manager. It acquires the lock on entry and releases it on exit, even if an exception occurs.
for example:
def increase(lock):
    global database_value
    with lock:
        local_copy = database_value
        local_copy += 1
        time.sleep(0.1)
        database_value = local_copy
3. Queue
A queue is a linear data structure that follows the first-in, first-out (FIFO) principle: the first item put in is the first item taken out. A good example is a line of waiting customers, in which the first customer is served first.
Main methods are:
◼️ q.get(): Removes and returns the first item. By default, it blocks until an item is available.
◼️ q.put(item): Puts an item at the end of the queue. By default, it blocks until a free slot is available.
◼️ q.task_done(): Tells the queue that a formerly enqueued task is complete. You should call it once for each get() after you are done with the item.
◼️ q.join(): Blocks until all items in the queue have been gotten and processed.
◼️ q.empty(): Returns True if the queue is empty.
for example:
from queue import Queue
q = Queue()
q.put(1)
q.put(2)
q.put(3)
first = q.get()
print(first)
output = 1
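The blocking behaviour of put() and get() only shows up when the queue is full or empty. The sketch below uses a bounded queue (maxsize=2) and the non-blocking variants put_nowait() and get_nowait(), which raise queue.Full and queue.Empty instead of waiting. The `events` list is just a helper for this illustration.

```python
from queue import Queue, Empty, Full

events = []
q = Queue(maxsize=2)  # at most two items fit into this queue
q.put("a")
q.put("b")

try:
    q.put_nowait("c")  # the queue is full; a plain put() would block here
except Full:
    events.append("full")

events.append(q.get())  # "a" went in first, so it comes out first
events.append(q.get())

try:
    q.get_nowait()  # the queue is empty; a plain get() would block here
except Empty:
    events.append("empty")

print(events)  # ['full', 'a', 'b', 'empty']
print(q.empty())  # True
```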
Using the queue in multithreading
The following example uses a queue to exchange the numbers 0 to 19. Every thread runs the worker function. Inside its infinite loop, a thread waits until an item is available, because q.get() blocks. After an item has been processed, q.task_done() tells the queue that processing is complete. In the main block, 10 daemon threads are created. This means that they die when the main thread dies, so the worker function and its infinite loop stop running.
from threading import Thread, Lock, current_thread
from queue import Queue

def worker(q, lock):
    while True:
        value = q.get() # unblocks when the item is available
        # doing something...
        with lock:
            # prevent printing at the same time with this lock
            print(f"in {current_thread().name} got {value}")
        # ...
        # For each get(), a subsequent call to task_done() tells the queue
        # that the processing on this item is complete.
        # If all tasks are done, q.join() can unblock
        q.task_done()

if __name__ == '__main__':
    q = Queue()
    num_threads = 10
    lock = Lock()
    for i in range(num_threads):
        t = Thread(name=f"Thread{i+1}", target=worker, args=(q, lock))
        t.daemon = True # dies when the main thread dies
        t.start()
    for x in range(20):
        q.put(x)
    q.join() # Blocks until all items in the queue have been gotten and processed.
    print('main done')
output =
in Thread1 got 0
in Thread2 got 1
in Thread2 got 11
in Thread6 got 3
in Thread8 got 4
in Thread7 got 5
in Thread9 got 6
in Thread9 got 16
in Thread4 got 8
in Thread3 got 9
in Thread1 got 10
in Thread5 got 2
in Thread2 got 12
in Thread6 got 13
in Thread8 got 14
in Thread7 got 15
in Thread10 got 7
in Thread9 got 17
in Thread4 got 18
in Thread3 got 19
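The daemon threads in the example are simply abandoned when the main thread exits. As an alternative (not the approach used in the example), the workers can be stopped explicitly by putting one sentinel value per worker into the queue. This is only a sketch: the sentinel `STOP` and the `results` list are assumptions made for this illustration.

```python
from queue import Queue
from threading import Thread, Lock

STOP = None  # hypothetical sentinel that tells a worker to exit


def worker(q, results, lock):
    while True:
        value = q.get()
        if value is STOP:
            q.task_done()
            break  # leave the loop so the thread can finish
        with lock:
            results.append(value * value)
        q.task_done()


q = Queue()
results = []
lock = Lock()
threads = [Thread(target=worker, args=(q, results, lock)) for _ in range(4)]
for t in threads:
    t.start()

for x in range(20):
    q.put(x)
for _ in threads:
    q.put(STOP)  # one sentinel per worker

q.join()  # every item, including the sentinels, has been processed
for t in threads:
    t.join()  # the workers have really exited

print(sorted(results) == [x * x for x in range(20)])  # True
```

Since the threads are not daemons here, the program only ends once every worker has left its loop.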