(15)THREADING VS MULTIPROCESSING:

Process: A process is an instance of a running program. Processes are independent of each other.

key facts:

• A new process starts independently of the first process.

• Takes advantage of multiple cores and CPUs.

• Each process has its own separate memory space.

• Memory is not shared between processes

• Starting a process is slower than starting a thread

• Large memory footprint

Threads: A thread is an entity within a process that can be scheduled for execution. A process can spawn multiple threads. The main difference from processes is that all threads within a process share the same memory.

key facts:

• Multiple threads can be spawned within one process

• Memory is shared between all threads

• Starting a thread is faster than starting a process

• Lightweight - low memory footprint

• Neither interruptible nor killable

for example:

from threading import Thread

def square_numbers():
    for i in range(1000):
        result = i * i

        
if __name__ == "__main__":        
    threads = []
    num_threads = 10

    # create threads and assign a function for each thread
    for i in range(num_threads):
        thread = Thread(target=square_numbers)
        threads.append(thread)

    # start all threads
    for thread in threads:
        thread.start()

    # wait for all threads to finish
    # block the main thread until these threads are finished
    for thread in threads:
        thread.join()

What is the GIL?

The GIL, or Global Interpreter Lock, is a lock in CPython that allows only one thread at a time to hold control of the Python interpreter. It prevents certain errors that would occur if multiple threads tried to control the interpreter at the same time.

What is the use of threading?

Threading is useful for I/O-bound tasks, where the program spends most of its time talking to slow devices such as network connections. With threading, the program can use the time spent waiting on these devices to do other tasks.
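As a rough illustration of this point, the sketch below compares running a few waiting tasks one after another with running them in threads. The function names are hypothetical, and time.sleep() only stands in for a real slow device, so treat it as a toy model rather than a benchmark.

```python
import time
from threading import Thread


def fake_download(task_id, results):
    # time.sleep stands in for waiting on a slow device (assumed I/O)
    time.sleep(0.1)
    results[task_id] = f"payload-{task_id}"


def run_sequential(n):
    # each task waits for the previous one to finish
    results = {}
    for task_id in range(n):
        fake_download(task_id, results)
    return results


def run_threaded(n):
    # all tasks wait at the same time, one thread per task
    results = {}
    threads = [Thread(target=fake_download, args=(task_id, results))
               for task_id in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for runner in (run_sequential, run_threaded):
        start = time.perf_counter()
        results = runner(5)
        elapsed = time.perf_counter() - start
        print(f"{runner.__name__}: {len(results)} results in {elapsed:.2f}s")
```

The threaded run should take roughly as long as a single wait, while the sequential run takes about five times as long, because a sleeping thread lets the other threads run.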

Multiprocessing

Processes are created with the multiprocessing module. Its API is very similar to that of the threading module.

for example:

from multiprocessing import Process
import os


def square_numbers():
    for i in range(1000):
        result = i * i


if __name__ == "__main__":
    processes = []
    num_processes = os.cpu_count()

    # create processes and assign a function for each process
    for i in range(num_processes):
        process = Process(target=square_numbers)
        processes.append(process)

    # start all processes
    for process in processes:
        process.start()

    # wait for all processes to finish
    # block the main thread until these processes are finished
    for process in processes:
        process.join()

What is the use of multiprocessing?

Multiprocessing is useful for CPU-bound tasks, which perform a lot of CPU operations and take a long time to run.
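A minimal sketch of a CPU-bound workload spread over several processes could look like this. It uses multiprocessing.Pool instead of the Process class shown earlier, and the function name and input sizes are made up for illustration.

```python
from multiprocessing import Pool
import os


def sum_of_squares(n):
    # CPU-bound work: pure computation, no waiting on I/O
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    inputs = [200_000, 300_000, 400_000, 500_000]

    # one worker process per CPU core; each worker runs its own interpreter
    with Pool(processes=os.cpu_count()) as pool:
        results = pool.map(sum_of_squares, inputs)

    print(results)
```

pool.map() splits the inputs across the worker processes and returns the results in input order. The function has to be defined at module level so that the workers can find it.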

Why is the GIL needed?

The GIL is needed because CPython's memory management is not thread-safe. CPython uses reference counting for memory management: every object has a reference count that tracks the number of references pointing to it. When this count reaches zero, the memory occupied by the object is released. Without a lock, two threads could change the same reference count at the same time, which could leak memory or release an object that is still in use.
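The reference counting described above can be observed with sys.getrefcount(). The sketch below only compares counts against each other, because the absolute numbers include temporary references and can differ between Python versions.

```python
import sys

data = []                         # the name `data` refers to a new list
base = sys.getrefcount(data)      # the call itself may add a temporary reference

alias = data                      # a second name for the same object
with_alias = sys.getrefcount(data)

del alias                         # drop the extra reference again
after_del = sys.getrefcount(data)

print(with_alias - base)  # 1
print(after_del - base)   # 0
```

Every new name bound to the object raises the count by one, and deleting the name lowers it again. These are the updates that the GIL keeps safe when several threads are running.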

Avoiding the GIL

The GIL is very controversial in the Python community. The main way to avoid it is to use multiprocessing instead of threading. Another option is to use a Python implementation without a GIL, such as Jython or IronPython, instead of CPython. You can also move parts of the application into extension modules written in C or C++.

How to use the threading module?

1.How to create and run threads?

Create a thread with threading.Thread(). Its two most important arguments are:

◼️ target: A callable object (function) that the thread invokes when it starts.

◼️ args: The arguments for the target function. This must be a tuple.

Start the thread with thread.start(), and call thread.join() to wait for it to finish.
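Put together, these steps look roughly like the following sketch. The greet function and the greetings list are made-up names for illustration.

```python
from threading import Thread

greetings = []


def greet(name):
    # runs inside the new thread
    greetings.append(f"hello, {name}")


# args must be a tuple, so a single argument needs a trailing comma;
# args=("world") would pass a plain string instead
t = Thread(target=greet, args=("world",))

t.start()  # begin running greet("world") in the new thread
t.join()   # block until the thread has finished

print(greetings)  # ['hello, world']
```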

2.Lock

A lock is a synchronization mechanism for limiting access to a resource when there are many threads of execution.

A lock has two states: locked and unlocked.

lock.acquire(): This will lock the state. If the lock is already locked, the call blocks until the lock is released.

lock.release(): This will unlock the state again.

Important: You should always release the lock again after it was acquired!

for example:

# import Lock
from threading import Thread, Lock
import time

database_value = 0

def increase(lock):
    global database_value 
    
    # lock the state
    lock.acquire()
    local_copy = database_value
    local_copy += 1
    time.sleep(0.1)
    database_value = local_copy
    lock.release()

if __name__ == "__main__":
    lock = Lock()
    print('Start value:', database_value)

    t1 = Thread(target=increase, args=(lock,)) # notice the comma after lock since args must be a tuple
    t2 = Thread(target=increase, args=(lock,))

    t1.start()
    t2.start()

    t1.join()
    t2.join()

    print('End value:', database_value)
    print('end main')

output =

Start value: 0

End value: 2

end main
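To see what the lock protects against, the following sketch runs the same update with and without a lock. It is a variation on the example above, not the original code: the shared value lives in a dictionary instead of a global, and the run helper is a hypothetical name.

```python
import time
from threading import Thread, Lock


def increase(state, lock=None):
    if lock is not None:
        lock.acquire()
    local_copy = state["value"]
    local_copy += 1
    time.sleep(0.1)  # the other thread gets a chance to run here
    state["value"] = local_copy
    if lock is not None:
        lock.release()


def run(use_lock):
    state = {"value": 0}
    lock = Lock() if use_lock else None
    threads = [Thread(target=increase, args=(state, lock)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["value"]


if __name__ == "__main__":
    print("with lock:", run(use_lock=True))
    print("without lock:", run(use_lock=False))
```

With the lock the end value is 2. Without it the end value is usually 1, because both threads read 0 before either one writes its result back. This kind of bug is called a race condition.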

To make sure the lock is always released, use it as a context manager. The with statement acquires the lock and then releases it automatically, even if an exception is raised inside the block.

for example:

def increase(lock):
    global database_value 
    
    with lock: 
        local_copy = database_value
        local_copy += 1
        time.sleep(0.1)
        database_value = local_copy

3.Queue

A queue is a linear data structure that follows the first-in, first-out (FIFO) principle: the item that is added first is removed first. A good example is a line of waiting customers, in which the first customer is served first.

Main methods are:

◼️ q.get(): Removes and returns the first item. By default, it blocks until an item is available.

◼️ q.put(item): Puts an item at the end of the queue. By default, it blocks until a free slot is available.

◼️ q.task_done(): Tells the queue that a formerly enqueued task is complete. You should call it once for each get(), after you are done with the item.

◼️ q.join(): Blocks until all items in the queue have been gotten and processed.

◼️ q.empty(): Checks whether the queue is empty. It returns True if the queue is empty.

for example:

from queue import Queue

q = Queue()

q.put(1)
q.put(2)
q.put(3)

first = q.get()  # removes and returns the first item
print(first)

output = 1
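The blocking behaviour of get() and put() is easier to see with a bounded queue. The sketch below uses the non-blocking variants get_nowait() and put_nowait() from the standard queue module, which raise an exception where get() and put() would wait. The maxsize of 2 is an arbitrary choice for the demonstration.

```python
from queue import Queue, Empty, Full

q = Queue(maxsize=2)  # at most two items fit in the queue
q.put("a")
q.put("b")

try:
    q.put_nowait("c")  # put("c") would block here, put_nowait raises Full
    rejected = False
except Full:
    rejected = True

first = q.get()   # "a": the item that was added first
second = q.get()  # "b"

try:
    q.get_nowait()  # get() would block here, get_nowait raises Empty
    drained = False
except Empty:
    drained = True

print(first, second, rejected, drained)  # a b True True
```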

Use the queue in multithreading.

The following example uses a queue to exchange the numbers 0 to 19. Every thread invokes the worker function. Inside the infinite loop, the thread waits until an item is available because of the blocking q.get() call. After an item has been processed, q.task_done() tells the queue that processing is complete. Ten daemon threads are created in the main thread. Daemon threads die when the main thread dies, so the worker function and its infinite loop are no longer executed.

from threading import Thread, Lock, current_thread
from queue import Queue

def worker(q, lock):
    while True:
        value = q.get()  # unblocks when the item is available

        # doing something...
        with lock:
            # prevent printing at the same time with this lock
            print(f"in {current_thread().name} got {value}")
        # ...

        # For each get(), a subsequent call to task_done() tells the queue
        # that the processing on this item is complete.
        # If all tasks are done, q.join() can unblock
        q.task_done()


if __name__ == '__main__':
    q = Queue()
    num_threads = 10
    lock = Lock()

    for i in range(num_threads):
        t = Thread(name=f"Thread{i+1}", target=worker, args=(q, lock))
        t.daemon = True  # dies when the main thread dies
        t.start()
    
    for x in range(20):
        q.put(x)

    q.join()  # Blocks until all items in the queue have been gotten and processed.

    print('main done')

output (the order varies between runs) =

in Thread1 got 0

in Thread2 got 1

in Thread2 got 11

in Thread6 got 3

in Thread8 got 4

in Thread7 got 5

in Thread9 got 6

in Thread9 got 16

in Thread4 got 8

in Thread3 got 9

in Thread1 got 10

in Thread5 got 2

in Thread2 got 12

in Thread6 got 13

in Thread8 got 14

in Thread7 got 15

in Thread10 got 7

in Thread9 got 17

in Thread4 got 18

in Thread3 got 19
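Daemon threads are one way to stop the workers. A common alternative, sketched below, is to send each worker a sentinel value that tells it to leave its loop, so that the threads can be joined normally. This is a different technique from the example above, and the names STOP and square_all are made up for illustration.

```python
from queue import Queue
from threading import Thread

STOP = None  # hypothetical sentinel: tells a worker to exit its loop


def worker(q, results):
    while True:
        value = q.get()  # blocks until an item is available
        if value is STOP:
            q.task_done()
            break
        results.append(value * value)
        q.task_done()


def square_all(values, num_threads=4):
    q = Queue()
    results = []
    threads = [Thread(target=worker, args=(q, results))
               for _ in range(num_threads)]
    for t in threads:
        t.start()

    for v in values:
        q.put(v)
    for _ in threads:
        q.put(STOP)  # one sentinel per worker, queued after all real items

    q.join()  # blocks until every item has been marked with task_done()
    for t in threads:
        t.join()  # the workers have left their loops, so this returns
    return sorted(results)


if __name__ == "__main__":
    print(square_all(range(5)))  # [0, 1, 4, 9, 16]
```

Because the queue is first-in, first-out, the sentinels are only reached after all real items have been taken, and each worker consumes exactly one sentinel before it exits.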
