Processes and threads (Part 1)

lc013 2021-09-15 08:32:13

 

Concept

Concurrent programming means making a program execute multiple tasks at the same time. Implementing it involves two concepts: processes and threads.

To the operating system, a task (or program) is a process. For example, opening a browser starts a browser process, opening WeChat starts a WeChat process, and opening two Notepad windows starts two Notepad processes.

A process has the following characteristics:

  • The operating system allocates storage space to each process. Every process has its own address space, data stack, and other auxiliary data used to track its execution;

  • A process can create new processes to perform other tasks by means of fork or spawn;

  • Because each process has its own independent memory space, processes must share data through inter-process communication (IPC) mechanisms such as pipes, signals, sockets, and shared memory.
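
The independent memory space described in the last point can be observed directly. The following is a minimal sketch, not part of the original article: the names counter, increment_in_child and compare_parent_and_child are made up for illustration. The child changes its own copy of a module-level variable and reports the value back through a queue, while the parent's copy stays unchanged.

```python
import multiprocessing as mp

counter = 0  # module-level variable; every process works on its own copy


def increment_in_child(result_queue):
    """Runs in the child: change the child's copy of `counter`."""
    global counter
    counter += 100
    result_queue.put(counter)  # report the child's value back over IPC


def compare_parent_and_child():
    """Return (value seen by the child, value still seen by the parent)."""
    result_queue = mp.Queue()
    child = mp.Process(target=increment_in_child, args=(result_queue,))
    child.start()
    child_value = result_queue.get(timeout=30)  # read before join()
    child.join()
    return child_value, counter


if __name__ == "__main__":
    child_value, parent_value = compare_parent_and_child()
    print("child saw:", child_value)    # 100
    print("parent saw:", parent_value)  # 0
```

The only way the parent learns the child's value here is through the queue, which is itself an IPC mechanism.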

A process can also do several things at once. Word, for example, handles typing, spell checking, and printing at the same time. In other words, one task is divided into several subtasks that run simultaneously, and these subtasks within a process are called threads.
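
As a rough illustration of subtasks living inside one process, the sketch below starts one thread per subtask with the standard threading module and records the process ID each thread sees. The function name run_subtasks and the subtask names are assumptions made for this example only.

```python
import os
import threading


def run_subtasks(names):
    """Run one thread per subtask and collect (name, pid) pairs."""
    results = []
    lock = threading.Lock()

    def subtask(name):
        # Every thread runs inside the same process, so the PID is identical.
        with lock:
            results.append((name, os.getpid()))

    threads = [threading.Thread(target=subtask, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for name, pid in run_subtasks(["typing", "spell check", "printing"]):
        print(f"{name} ran in process {pid}")
```

Every subtask reports the same process ID, because threads share the process they belong to.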

Every process has to do at least one thing, so every process has at least one thread. There are three ways to implement concurrent programming, that is, to perform multiple tasks at the same time:

  • Multiple processes: each process has only one thread, and the processes carry out the tasks together;

  • Multiple threads: start a single process and run several threads inside it;

  • Multiple processes plus multiple threads: start several processes, each running several threads. This model is complex and seldom used in practice.

Note: tasks can truly run in parallel only on a multi-core CPU. On a single-core CPU, true parallelism is impossible because only one thread can hold the CPU at any given moment; the threads take turns sharing the CPU's execution time.
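
To check which of the two situations applies on a given machine, the number of logical CPUs can be queried with os.cpu_count(). This is only a small sketch; the function name describe_parallelism and its wording are made up for illustration.

```python
import os


def describe_parallelism():
    """Report whether this machine can run tasks truly in parallel."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    if cores > 1:
        return f"{cores} logical CPUs: tasks can run in parallel"
    return "1 logical CPU: tasks are interleaved by time slicing"


if __name__ == "__main__":
    print(describe_parallelism())
```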

Python supports both multiple processes and multiple threads. Each is introduced in turn, starting with multiple processes.

Multiple processes

Unix/Linux systems provide a fork() system call, which is a special function. An ordinary function is called once and returns once, but fork() is called once and returns twice: the operating system copies the current process (the parent) to create a child process, and fork() then returns in both the parent and the child.

In the child process fork() always returns 0, while in the parent it returns the child's process ID. A parent can fork many children, so it needs to record the ID of each one; a child only needs to call getppid() to get its parent's ID.

Python's os module wraps common system calls, including fork. For example:

import os

print('Process (%s) start...' % os.getpid())
# Only works on Unix/Linux/Mac:
pid = os.fork()
if pid == 0:
    print('I am child process (%s) and my parent is %s.' % (os.getpid(), os.getppid()))
else:
    print('I (%s) just created a child process (%s).' % (os.getpid(), pid))

Output:

Process (876) start...
I (876) just created a child process (877).
I am child process (877) and my parent is 876.
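
Because a parent can fork several children, it usually keeps each returned PID so that it can wait for every child later. The sketch below shows one way to do that on Unix-like systems. The function name fork_children is made up for this example, and each child simply exits with a distinct status instead of doing real work.

```python
import os


def fork_children(count):
    """Fork `count` children and return {child_pid: exit_status}. Unix only."""
    child_pids = []
    for index in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: fork() returned 0. Exit at once with a distinct status
            # so the child never continues into the parent's code.
            os._exit(index)
        # Parent: fork() returned the new child's PID, so record it.
        child_pids.append(pid)

    statuses = {}
    for pid in child_pids:
        _, status = os.waitpid(pid, 0)  # block until this child exits
        statuses[pid] = os.WEXITSTATUS(status)
    return statuses


if __name__ == "__main__":
    for pid, code in fork_children(3).items():
        print(f"child {pid} exited with status {code}")
```

Calling os._exit() in the child matters: without it the child would fall through and run the rest of the parent's loop as well.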

Windows has no fork system call, so the code above cannot run there. Python is cross-platform, however, and other modules, such as multiprocessing, provide multi-process support on every platform.

multiprocessing
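
The multiprocessing module hides this platform difference behind "start methods". As a small sketch (the function name describe_start_methods is made up here), the following asks the module which start method is the default and which ones are available. The result depends on the platform and the Python version; on Windows only spawn is offered.

```python
import multiprocessing as mp


def describe_start_methods():
    """Return (default start method, all methods available on this platform)."""
    return mp.get_start_method(), mp.get_all_start_methods()


if __name__ == "__main__":
    default, available = describe_start_methods()
    print("default:", default)
    print("available:", available)
```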

The multiprocessing module provides a Process class that represents a process object. The following file-download example shows the difference between using and not using multiple processes.

First, the version that does not use multiple processes:

from random import randint
from time import time, sleep


def download_task(filename):
    '''Simulate downloading a file'''
    print('Start the download %s...' % filename)
    time_to_download = randint(5, 10)
    sleep(time_to_download)
    print('%s Download complete ! It took %d second' % (filename, time_to_download))


def download_without_multiprocess():
    '''Do not use multiple processes'''
    start = time()
    download_task('Python.pdf')
    download_task('nazha.mkv')
    end = time()
    print('The total cost is %.2f second .' % (end - start))


if __name__ == '__main__':
    download_without_multiprocess()

The output is shown below. randint picks a random duration for each simulated download, and the program's total running time equals the sum of the two download times.

Start the download Python.pdf...
Python.pdf Download complete ! It took 9 second
Start the download nazha.mkv...
nazha.mkv Download complete ! It took 9 second
The total cost is 18.00 second .

With multiple processes, the example becomes:

from multiprocessing import Process
from random import randint
from time import time, sleep


def download_task(filename):
    '''Simulate downloading a file'''
    print('Start the download %s...' % filename)
    time_to_download = randint(5, 10)
    sleep(time_to_download)
    print('%s Download complete ! It took %d second' % (filename, time_to_download))


def download_multiprocess():
    '''Use multiple processes'''
    start = time()
    p1 = Process(target=download_task, args=('Python.pdf',))
    p1.start()
    p2 = Process(target=download_task, args=('nazha.mkv',))
    p2.start()
    # Wait for both child processes to finish
    p1.join()
    p2.join()
    end = time()
    print('The total cost is %.2f second .' % (end - start))


if __name__ == '__main__':
    download_multiprocess()

In this example, a process object is created with the Process class. The target parameter is the function the process should run, and args is a tuple of the arguments passed to that function. start() starts the process, and join() waits for it to finish.

The output is shown below. The total time is no longer the sum of the two tasks' execution times but roughly the time of the longer one, so the program finishes much sooner.

Start the download Python.pdf...
Start the download nazha.mkv...
Python.pdf Download complete ! It took 5 second
nazha.mkv Download complete ! It took 9 second
The total cost is 9.36 second .
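
The effect of start() and join() described above can also be observed through the is_alive() method and the exitcode attribute of a Process object. The following is a minimal sketch; short_task and observe_lifecycle are hypothetical names, and the half-second sleep merely stands in for real work.

```python
import multiprocessing as mp
import time


def short_task(seconds):
    """Stand-in for real work (hypothetical task for this sketch)."""
    time.sleep(seconds)


def observe_lifecycle():
    """Return the process state before start(), after start(), and after join()."""
    p = mp.Process(target=short_task, args=(0.5,))
    before = p.is_alive()   # False: the process has not been started
    p.start()
    during = p.is_alive()   # True: the child is still sleeping
    p.join()                # block until the child exits
    after = p.is_alive()    # False: the child has finished
    return before, during, after, p.exitcode


if __name__ == "__main__":
    print(observe_lifecycle())
```

An exitcode of 0 after join() indicates that the target function returned normally.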

Pool

The example above starts two processes. Writing the code that way does not scale when a large number of child processes is needed; in that case the child processes should be created in batches with a process pool. The download example, rewritten to use a pool:

from multiprocessing import Pool
from random import randint
from time import time, sleep


def download_task(filename):
    '''Simulate downloading a file (same task as in the earlier examples)'''
    print('Start the download %s...' % filename)
    time_to_download = randint(5, 10)
    sleep(time_to_download)
    print('%s Download complete ! It took %d second' % (filename, time_to_download))


def download_multiprocess_pool():
    '''Use multiple processes, managed by a process pool'''
    start = time()
    filenames = ['Python.pdf', 'nazha.mkv', 'something.mp4', 'lena.png', 'lol.avi']
    # Create the process pool
    p = Pool(5)
    for filename in filenames:
        p.apply_async(download_task, args=(filename,))
    print('Waiting for all subprocesses done...')
    # Close the process pool
    p.close()
    # Wait for all processes to complete their tasks
    p.join()
    end = time()
    print('The total cost is %.2f second .' % (end - start))


if __name__ == '__main__':
    download_multiprocess_pool()

In this code, the Pool object first creates 5 worker processes, and each apply_async call submits a task for the pool to run in parallel. close() must be called before join(): close() shuts the pool to new tasks, so nothing more can be submitted afterwards, and join() then waits for all worker processes to finish their tasks.

The output is as follows:

Waiting for all subprocesses done...
Start the download Python.pdf...
Start the download nazha.mkv...
Start the download something.mp4...
Start the download lena.png...
Start the download lol.avi...
nazha.mkv Download complete ! It took 5 second
lena.png Download complete ! It took 6 second
something.mp4 Download complete ! It took 7 second
Python.pdf Download complete ! It took 8 second
lol.avi Download complete ! It took 9 second
The total cost is 9.80 second .
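
The download task above only prints, but apply_async also returns a result object whose get() method hands the task's return value back to the parent. The sketch below illustrates this under some assumptions: name_length is a made-up task that returns a value, run_pool is a hypothetical helper, and the pool is used as a context manager instead of calling close() and join() explicitly.

```python
from multiprocessing import Pool


def name_length(filename):
    """Hypothetical task: return a value so the parent has something to collect."""
    return filename, len(filename)


def run_pool(filenames, workers=3):
    """Submit one task per filename and return the results in submission order."""
    # Leaving the with-block shuts the pool down, so every result is
    # collected inside it.
    with Pool(workers) as pool:
        pending = [pool.apply_async(name_length, (name,)) for name in filenames]
        return [result.get(timeout=30) for result in pending]


if __name__ == "__main__":
    print(run_pool(["Python.pdf", "nazha.mkv", "lena.png"]))
```

Calling get() has a second benefit: an exception raised inside the task is re-raised in the parent instead of being lost silently.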

Subprocesses

In many cases the child process is not a copy of the current program but an external program. After creating such a child process, we also need to control its input and output.

The subprocess module lets us start a child process and manage its input and output.

The following shows how to run the command nslookup www.python.org from Python:

import subprocess
print('$ nslookup www.python.org')
r = subprocess.call(['nslookup', 'www.python.org'])
print('Exit code:', r)

Output:

$ nslookup www.python.org
Server: 192.168.19.4
Address: 192.168.19.4#53
Non-authoritative answer:
www.python.org canonical name = python.map.fastly.net.
Name: python.map.fastly.net
Address: 199.27.79.223
Exit code: 0
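
In the example above the child's output goes straight to the terminal. When the parent needs the output as a string, subprocess.run with capture_output=True (available since Python 3.7) is one option. The sketch below is an illustration only: run_and_capture is a made-up helper, and the child is a tiny Python program started through sys.executable so that the example does not depend on nslookup being installed.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command, wait for it, and return (exit code, stdout text)."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    code, out = run_and_capture([sys.executable, "-c", "print('hello from child')"])
    print("Exit code:", code)
    print(out, end="")
```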

If the child process needs input, it can be supplied through communicate():

import subprocess
print('$ nslookup')
p = subprocess.Popen(['nslookup'], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, err = p.communicate(b'set q=mx\npython.org\nexit\n')
print(output.decode('utf-8'))
print('Exit code:', p.returncode)

This code is equivalent to running nslookup and then typing:

set q=mx
python.org
exit

Output:

$ nslookup
Server: 192.168.19.4
Address: 192.168.19.4#53
Non-authoritative answer:
python.org mail exchanger = 50 mail.python.org.
Authoritative answers can be found from:
mail.python.org internet address = 82.94.164.166
mail.python.org has AAAA address 2001:888:2000:d::a6
Exit code: 0
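
The same communicate() pattern can be tried without network access. In the sketch below, which is an illustration and not part of the original example, the child is a one-line Python program (CHILD_CODE, a name made up here) that echoes its standard input in upper case, and send_input is a hypothetical helper wrapping the Popen call.

```python
import subprocess
import sys

# Child program for this sketch: read stdin and echo it in upper case.
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def send_input(text):
    """Feed `text` to the child's stdin and return (stdout text, exit code)."""
    p = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() writes the bytes, closes stdin, then reads until EOF,
    # which avoids the deadlocks that manual read/write calls can cause.
    output, _ = p.communicate(text.encode("utf-8"))
    return output.decode("utf-8"), p.returncode


if __name__ == "__main__":
    out, code = send_input("set q=mx\npython.org\nexit\n")
    print(out, end="")
    print("Exit code:", code)
```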

Interprocess communication

Processes need to communicate with each other, and the multiprocessing module provides Queue, Pipe, and other mechanisms for exchanging data.
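
As a brief illustration of Pipe, the sketch below connects a parent and one child through the two connection objects that Pipe() returns. The names child_reply and ping_child are made up for this example, and the upper-casing is only there to show that the message made a round trip.

```python
from multiprocessing import Pipe, Process


def child_reply(conn):
    """Runs in the child: receive one message and send back a reply."""
    message = conn.recv()
    conn.send(message.upper())
    conn.close()


def ping_child(message):
    """Send `message` to a child process and return the child's reply."""
    parent_conn, child_conn = Pipe()  # two connected endpoints
    p = Process(target=child_reply, args=(child_conn,))
    p.start()
    parent_conn.send(message)
    reply = parent_conn.recv()
    p.join()
    return reply


if __name__ == "__main__":
    print(ping_child("hello"))  # HELLO
```

A Pipe links exactly two endpoints, whereas a Queue can be shared by several writers and readers.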

Taking Queue as the example, the parent process creates two child processes: one writes data to the Queue and the other reads data from it. The code is as follows:

import os
import random
from multiprocessing import Process, Queue
from time import sleep


# Code executed by the writer process:
def write(q):
    print('Process to write: %s' % os.getpid())
    for value in ['A', 'B', 'C']:
        print('Put %s to queue...' % value)
        q.put(value)
        sleep(random.random())


# Code executed by the reader process:
def read(q):
    print('Process to read: %s' % os.getpid())
    while True:
        value = q.get(True)
        print('Get %s from queue.' % value)


def ipc_queue():
    '''Use a Queue for interprocess communication'''
    # The parent process creates the Queue and passes it to each child process:
    q = Queue()
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q,))
    # Start child process pw (the writer):
    pw.start()
    # Start child process pr (the reader):
    pr.start()
    # Wait for pw to finish:
    pw.join()
    # pr runs an infinite loop, so it cannot be waited on and must be terminated:
    pr.terminate()


if __name__ == '__main__':
    ipc_queue()

The output is as follows:

Process to write: 24992
Put A to queue...
Process to read: 22836
Get A from queue.
Put B to queue...
Get B from queue.
Put C to queue...
Get C from queue.
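
The example stops the reader with terminate() because its loop never ends. A common alternative, sketched below as a variation and not as the article's own method, is to have the writer put an agreed "sentinel" value on the queue so that the reader can leave its loop on its own. The names STOP, results and run_with_sentinel are assumptions made for this sketch.

```python
from multiprocessing import Process, Queue

STOP = None  # sentinel chosen for this sketch; any agreed marker works


def write(q, values):
    """Writer: put every value on the queue, then the sentinel."""
    for value in values:
        q.put(value)
    q.put(STOP)  # tell the reader that no more data is coming


def read(q, results):
    """Reader: forward values to `results` until the sentinel arrives."""
    while True:
        value = q.get()
        if value is STOP:
            break
        results.put(value)


def run_with_sentinel(values):
    """Return the values the reader received, in order."""
    q, results = Queue(), Queue()
    pw = Process(target=write, args=(q, values))
    pr = Process(target=read, args=(q, results))
    pw.start()
    pr.start()
    # Collect the results before join() so no process blocks on a full queue.
    received = [results.get(timeout=30) for _ in values]
    pw.join()
    pr.join()  # the reader exits by itself, so terminate() is not needed
    return received


if __name__ == "__main__":
    print(run_with_sentinel(["A", "B", "C"]))
```

With this approach both child processes end normally, so the parent can join() each of them.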
