Asynchronous Blocking, the Manager Module, and Threads
Mr. Xiao Yang 2021-06-04 10:36:20

1. Asynchronous Blocking

1. Results are not collected in the order the tasks were started.

2. All tasks execute asynchronously.

3. We do not know in advance whose result will arrive first; we simply take the result of whichever task finishes first.

The tasks clearly run asynchronously (each one is an asynchronous process), and we take results in completion order. But the step where we wait for the next result is a blocking call, so the pattern as a whole is "asynchronous blocking".
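As a minimal sketch of this pattern, concurrent.futures.as_completed blocks until the next task finishes and yields results in completion order, not submission order (the task names and sleep durations below are made up for illustration):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def task(name, delay):
    # Simulate work that takes `delay` seconds.
    time.sleep(delay)
    return name


with ThreadPoolExecutor() as pool:
    futures = [
        pool.submit(task, 'slow', 0.3),
        pool.submit(task, 'fast', 0.1),
        pool.submit(task, 'medium', 0.2),
    ]
    # Each iteration blocks (waits) for whichever future finishes next.
    results = [f.result() for f in as_completed(futures)]

print(results)  # ['fast', 'medium', 'slow'] -- completion order
```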

Take the producer-consumer model as an example:

import requests
from multiprocessing import Process, Queue

url_list = [
    'https://www.baidu.com',
    'https://www.taobao.com',
    'https://www.jd.com',
    'https://i.cnblogs.com'
]


def producer(name, url, q):
    # The producer crawls a page and puts (name, status code) into the queue.
    ret = requests.get(url)
    q.put((name, ret.status_code))


def consumer(q):
    # The consumer prints results in whatever order they arrive.
    while True:
        conn = q.get()
        if conn:
            print(conn)
        else:
            break


if __name__ == '__main__':
    q = Queue()
    p_list = []
    for u in url_list:
        p = Process(target=producer, args=(u, u, q))
        p.start()
        p_list.append(p)

    Process(target=consumer, args=(q,)).start()

    for p in p_list:
        p.join()
    q.put(None)  # Sentinel: tell the consumer to stop once all producers finish.
# Output (order depends on which request finishes first)
('https://www.jd.com', 200)
('https://www.baidu.com', 200)
('https://www.taobao.com', 200)
('https://i.cnblogs.com', 200)

2. The Manager Module and Data Sharing Between Processes (for understanding)

Data can be shared between processes through the Manager class. However, sharing comes at a high cost: the shared data must be protected with a lock, which adds a lot of otherwise unnecessary work.

In general we do not use this approach. The reason for using processes in the first place is data isolation between them; if a scenario requires sharing large amounts of data across processes, that is a sign processes are the wrong tool for it.

from multiprocessing import Process, Manager, Lock


def change_dic(dic, lock):
    with lock:  # Shared data must be locked to stay consistent, as in the ticket-grabbing example.
        dic['count'] -= 1


if __name__ == '__main__':
    m = Manager()
    lock = Lock()
    dic = m.dict({'count': 100})  # Manager also supports lists, etc.
    p_l = []
    for i in range(100):
        p = Process(target=change_dic, args=(dic, lock))
        p.start()
        p_l.append(p)
    for p in p_l: p.join()
    print(dic)
# Output
{'count': 0}

3. Threads

Process: data isolation; the smallest unit of resource allocation; can use multiple cores; scheduled by the operating system; shared state is not inherently safe; creating and switching processes is expensive.

A process mainly bundles resources together (a process is just a resource unit, or resource collection), while threads are the CPU's unit of execution.

Thread: data is shared within the process; the smallest unit of operating system scheduling; can use multiple cores; scheduled by the operating system; shared state is not inherently safe; creating and switching threads is cheap.

A thread is the smallest unit the operating system can schedule (hand to the CPU for execution); multiple threads in the same process can each be executed by the CPU.

 

Differences between threads and processes:

1. A thread shares the address space of the process that created it; processes have their own address spaces.

2. A thread can directly access its process's data; a child process gets a copy of the parent process's data.

3. A thread can communicate directly with other threads in its process; processes must use inter-process communication to talk to sibling processes.

4. New threads are cheap to create; a new process requires copying the parent process.

5. A thread can exercise considerable control over other threads in the same process; a process can only control its child processes.

6. Changes to the main thread may affect the behavior of the other threads in its process; changes to a parent process do not affect its child processes.

 

Thread features:

1. Lightweight entity: a thread itself owns almost no system resources.

2. Basic unit of independent scheduling and dispatch: because threads are lightweight, switching between threads of the same process is fast and cheap.

3. Shared process resources: every thread in a process can share all of the resources the process owns.

4. Concurrent execution: multiple threads within a process execute concurrently; indeed, all of a process's threads may run concurrently.

 

The global interpreter lock (GIL)

In CPython's multithreading, the garbage collection mechanism (gc) effectively runs alongside your threads. It uses reference counting plus generational collection, reclaiming a variable once its reference count drops to 0. If the CPU could run multiple threads' bytecode on the same objects simultaneously, the reference counts gc relies on could be corrupted by races, just like the ticket-grabbing example. To prevent this, CPython uses a global interpreter lock: within one process, only one thread at a time may execute on the CPU.

The GIL exists mainly to keep the gc mechanism correct, so that reference-count updates coming from different threads stay accurate.

The cost, however, is that the GIL allows only one thread per process to execute on the CPU at any moment.

What multithreading still saves is I/O wait time, not CPU computation time: the CPU itself is very fast, and in most programs there is no way to avoid all the I/O within a process.
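The reference counting that the GIL protects can be observed directly with sys.getrefcount (a minimal illustration; the absolute counts include the temporary reference getrefcount itself holds, so only the difference is meaningful):

```python
import sys

a = []  # one reference so far: the name `a`
before = sys.getrefcount(a)

b = a   # a second name now refers to the same list object
after = sys.getrefcount(a)

print(after - before)  # 1: binding `b` added exactly one reference
```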

4. The threading Module

The low-level thread module provides basic thread and lock support; threading provides higher-level, more powerful thread management.

The thread module does not support daemon threads: when the main thread exits, all child threads are forced out whether they are still working or not. The threading module does support daemon threads.

Threads are created with the threading.Thread class.

The multiprocessing module deliberately mirrors the threading module's interface, so the two are very similar to use.

current_thread() returns the current thread object; current_thread().ident gives the thread's id.

A thread cannot be terminated from the outside.

A child thread only ends once its own code has finished executing.

enumerate() returns a list of all living Thread objects, including the main thread.

active_count() returns the number of living threads.

from threading import Thread, current_thread, active_count, enumerate
import time


def func(i):
    time.sleep(1)
    print(f'This is thread func{i}, thread id={current_thread().ident}')


t_list = []
for i in range(5):
    t = Thread(target=func, args=(i,))
    t.start()
    t_list.append(t)

print(enumerate())  # threading.enumerate, which shadows the builtin here
print(f'Number of live threads: {active_count()}')

for th in t_list:
    th.join()

print('All threads are finished!')

# Output
[<_MainThread(MainThread, started 8600)>, <Thread(Thread-1, started 18832)>, ...]
Number of live threads: 6
This is thread func3, thread id=3148
This is thread func1, thread id=2296
This is thread func0, thread id=18832
This is thread func2, thread id=18644
This is thread func4, thread id=12072
All threads are finished!
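Since a thread cannot be terminated from the outside, the usual pattern is a cooperative stop flag that the thread checks itself, for example a threading.Event (a sketch; the sleep stands in for real work):

```python
import time
from threading import Event, Thread

stop = Event()


def worker():
    # The thread polls the flag itself; nothing external can kill it.
    while not stop.is_set():
        time.sleep(0.05)  # stand-in for a unit of real work
    print('worker: saw stop flag, exiting')


t = Thread(target=worker)
t.start()
time.sleep(0.2)
stop.set()   # ask the thread to finish
t.join()     # the thread exits on its own after seeing the flag
print(t.is_alive())  # False
```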

Creating threads in an object-oriented style

from threading import Thread, current_thread


class MyThread(Thread):
    def __init__(self, i):
        self.i = i
        super().__init__()

    def run(self):
        print(f'This is thread {self.i}, thread id={current_thread().ident}')


t = MyThread(1)
t.start()  # Start the thread; run() executes in the new thread.
# Output
This is thread 1, thread id=10160

Data sharing between threads

If the modification succeeds, it shows that the process's data is shared among its threads.

from threading import Thread

count = 100


def func():
    global count
    count -= 1


t_list = []
for i in range(100):
    t = Thread(target=func)
    t.start()
    t_list.append(t)

for th in t_list:
    th.join()

print(f'All threads are finished, count={count}')

# Output
All threads are finished, count=0
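The GIL does not make a read-modify-write sequence like this atomic. The sketch below forces the race by sleeping between the read and the write (the sleep is artificial, so both threads are guaranteed to read the same stale value), then shows the fix with a Lock:

```python
import time
from threading import Lock, Thread

count = 100
lock = Lock()


def unsafe_dec():
    global count
    tmp = count          # read
    time.sleep(0.1)      # give up the GIL, so both threads read 100
    count = tmp - 1      # write back a stale value


def safe_dec():
    global count
    with lock:           # the lock makes the read-modify-write atomic
        tmp = count
        time.sleep(0.1)
        count = tmp - 1


threads = [Thread(target=unsafe_dec) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
unsafe_result = count
print(unsafe_result)  # 99, not 98: one decrement was lost

count = 100
threads = [Thread(target=safe_dec) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(count)  # 98: the lock prevents the lost update
```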

 

Please keep the original link when reprinting. Thanks.