[Python Tips] How to Speed Up Loops and NumPy Array Operations

lc013 2021-09-15 09:00:39

Preface

Although Python takes far less code to write than C++ or Java, it does not run as fast as they do, so all kinds of methods and tricks for speeding up Python have appeared. This time I want to introduce the Numba library, which accelerates time-consuming loops and NumPy operations.

An earlier post, "24 Ways to Speed Up Your Python", introduced methods for speeding up loops, and one of them was Numba. I recently came across an article about speeding up Python with Numba. It walks through two examples, which correspond to two of Numba's capabilities: accelerating loops and accelerating NumPy computations.

Original article: https://towardsdatascience.com/heres-how-you-can-get-some-free-speed-on-your-python-code-with-numba-89fdc8249ef3


Compared with other languages, Python really is slow at run time.

A common solution is to rewrite the code in C++ and then wrap it with Python, which gives C++ running speed while keeping Python's convenience in the main application.

The only difficulty with this approach is that rewriting part of the code in C++ takes a lot of time, especially if you are not familiar with C++.

Numba can improve speed without rewriting any part of the code in another programming language.

Introduction to Numba

Numba is a compiler library that translates Python code into optimized machine code. With this translation, numerical algorithms can run at speeds close to that of C code.

Using Numba does not require complicated code: just add one line before the function you want to optimize and leave the rest to Numba.

Numba can be installed with pip:

$ pip install numba

For code with many numerical operations, NumPy operations, or a large number of loops, Numba can greatly improve running speed.

Speed up Python loops

The most basic use of Numba is accelerating loops in Python.

Before writing a loop, first consider whether a NumPy function could replace it. In some cases there is no such replacement, and that is when Numba is worth considering.
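As a small illustration of checking for a NumPy replacement first, the sketch below computes a sum of squares once with an explicit loop and once with NumPy calls. The function names and the sample data are made up for this example and do not come from the original article.

```python
import numpy as np

def sum_of_squares_loop(values):
    # Explicit Python loop: one interpreted iteration per element
    total = 0
    for v in values:
        total += v * v
    return total

def sum_of_squares_numpy(values):
    # Same computation expressed with NumPy array operations
    arr = np.asarray(values, dtype=np.int64)
    return int(np.sum(arr * arr))

data = list(range(1000))
loop_result = sum_of_squares_loop(data)
numpy_result = sum_of_squares_numpy(data)
print(loop_result == numpy_result)
```

When a loop has no such array-level equivalent, as with the insertion sort that follows, Numba becomes the more natural option.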

The first example uses the insertion sort algorithm. We implement a function that takes an unsorted list and returns the sorted list.

We first generate a list containing 100,000 random integers, then run the insertion sort 50 times and compute the average time.

The code is as follows:

import time
import random

num_loops = 50
len_of_list = 100000

def insertion_sort(arr):
    for i in range(len(arr)):
        cursor = arr[i]
        pos = i
        while pos > 0 and arr[pos-1] > cursor:
            # Scan from back to front, shifting larger elements right (ascending order)
            arr[pos] = arr[pos-1]
            pos = pos-1
        # Found the position of the current element
        arr[pos] = cursor
    return arr

start = time.time()
# Build the list of random integers
list_of_numbers = list()
for i in range(len_of_list):
    num = random.randint(0, len_of_list)
    list_of_numbers.append(num)

# Run the sort num_loops times
for i in range(num_loops):
    result = insertion_sort(list_of_numbers)

end = time.time()
run_time = end-start
print('Average time={}'.format(run_time/num_loops))

Output:

Average time=22.84399790763855

From the code we can see that the insertion sort has a worst-case time complexity of O(n^2), because there are two nested loops: a while loop inside the for loop. With an input of 100,000 integers and 50 repetitions, this is a very time-consuming operation.
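To make the quadratic worst case concrete, here is a minimal sketch that counts how many element shifts the same algorithm performs. The counting variant insertion_sort_count and the input size of 200 are assumptions made for this illustration. A reversed list of n elements should need n(n-1)/2 shifts, while an already sorted list needs none.

```python
def insertion_sort_count(arr):
    """Sort a copy of arr and return (sorted_list, number_of_shifts)."""
    arr = list(arr)
    shifts = 0
    for i in range(len(arr)):
        cursor = arr[i]
        pos = i
        while pos > 0 and arr[pos - 1] > cursor:
            # Each pass through the inner loop moves one element to the right
            arr[pos] = arr[pos - 1]
            pos -= 1
            shifts += 1
        arr[pos] = cursor
    return arr, shifts

n = 200
worst_sorted, worst_shifts = insertion_sort_count(range(n, 0, -1))  # reversed input
best_sorted, best_shifts = insertion_sort_count(range(n))           # sorted input
print(worst_shifts, n * (n - 1) // 2)
print(best_shifts)
```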

The original author used a computer with an i7-8700k, so the average time there was 3.0104 s. My machine is much weaker, an i5-4210M laptop that has been in use for nearly 4 years, so my run took 22.84 s on average.
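When reading these numbers, note that the script above starts its timer before the list is built, and that insertion_sort sorts the list in place, so every call after the first receives input that is already sorted. A minimal sketch of a stricter measurement follows. The helper name average_sort_time and the small input size are choices made for this illustration and are not from the original article.

```python
import random
import time

def insertion_sort(arr):
    for i in range(len(arr)):
        cursor = arr[i]
        pos = i
        while pos > 0 and arr[pos - 1] > cursor:
            arr[pos] = arr[pos - 1]
            pos -= 1
        arr[pos] = cursor
    return arr

def average_sort_time(sort_fn, data, num_loops):
    """Time sort_fn on a fresh unsorted copy of data for every run."""
    total = 0.0
    for _ in range(num_loops):
        work = list(data)              # copy made outside the timed region
        start = time.perf_counter()
        sort_fn(work)
        total += time.perf_counter() - start
    return total / num_loops

random.seed(0)
data = [random.randint(0, 1000) for _ in range(1000)]
avg = average_sort_time(insertion_sort, data, num_loops=3)
print('Average time={}'.format(avg))
```

With a JIT-compiled function, calling it once before the timed loop would also keep the one-time compilation cost out of the average.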

So how do we use Numba to speed up the loop? The code is as follows:

import time
import random
from numba import jit

num_loops = 50
len_of_list = 100000

# Compile the function to machine code; everything else is unchanged
@jit(nopython=True)
def insertion_sort(arr):
    for i in range(len(arr)):
        cursor = arr[i]
        pos = i
        while pos > 0 and arr[pos-1] > cursor:
            # Scan from back to front, shifting larger elements right (ascending order)
            arr[pos] = arr[pos-1]
            pos = pos-1
        # Found the position of the current element
        arr[pos] = cursor
    return arr

start = time.time()
# Build the list of random integers
# (recent Numba versions warn that passing a plain Python list is deprecated;
# a NumPy array avoids the warning)
list_of_numbers = list()
for i in range(len_of_list):
    num = random.randint(0, len_of_list)
    list_of_numbers.append(num)

# Run the sort num_loops times
for i in range(num_loops):
    result = insertion_sort(list_of_numbers)

end = time.time()
run_time = end-start
print('Average time={}'.format(run_time/num_loops))

Output:

Average time=0.09438572406768798

As you can see, only two lines of code were added. The first imports the jit decorator:

from numba import jit

The second goes just before the function and applies the decorator:

@jit(nopython=True)
def insertion_sort(arr):

The jit decorator indicates that we want this function converted to machine code. The nopython parameter specifies whether we want Numba to use pure machine code or to mix in some Python code where necessary. It should be set to True to get the best performance, unless doing so causes an error.
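As a further sketch of nopython mode on a different loop, the example below uses njit, which Numba provides as a shorthand for jit(nopython=True). The function count_greater_pairs and its input are invented for this illustration, and the ImportError fallback to a do-nothing decorator is an assumption added only so the snippet still runs, uncompiled, where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # shorthand for jit(nopython=True)
except ImportError:
    # Fallback for this sketch only: leave the function as plain Python
    def njit(func):
        return func

@njit
def count_greater_pairs(arr):
    """Count pairs (i, j) with i < j and arr[i] > arr[j] using a double loop."""
    count = 0
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] > arr[j]:
                count += 1
    return count

values = np.array([3, 1, 2, 5, 4], dtype=np.int64)
pairs = count_greater_pairs(values)
print(pairs)
```

With Numba installed, the function is compiled the first time it is called, so that first call is slower than the ones that follow.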

The original author's average time was 0.1424 s, and on my computer it dropped to just 0.094 s, a very large improvement in speed.

Speed up NumPy operations

Another common use of Numba is accelerating NumPy operations.

This time we initialize three very large NumPy arrays, each about the size of an image, and then use the numpy.square() function to square their sum.

The code is as follows:

import time
import numpy as np

num_loops = 50
img1 = np.ones((1000, 1000), np.int64) * 5
img2 = np.ones((1000, 1000), np.int64) * 10
img3 = np.ones((1000, 1000), np.int64) * 15

def add_arrays(img1, img2, img3):
    return np.square(img1+img2+img3)

start1 = time.time()
for i in range(num_loops):
    result = add_arrays(img1, img2, img3)
end1 = time.time()
run_time1 = end1 - start1
print('Average time for normal numpy operation={}'.format(run_time1/num_loops))

Output:

Average time for normal numpy operation=0.040156774520874024

When we perform basic array computations on NumPy arrays, such as addition, multiplication, and squaring, NumPy vectorizes them internally. That is why it performs better than native Python code.
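To show what the vectorized call replaces, the sketch below writes the same computation as an explicit element-by-element loop and checks that both give identical results. The helper sum_square_loop and the small 50 x 50 arrays are assumptions chosen to keep the example quick.

```python
import numpy as np

def sum_square_loop(a, b, c):
    # Element-by-element version of np.square(a + b + c)
    out = np.empty_like(a)
    rows, cols = a.shape
    for r in range(rows):
        for col in range(cols):
            s = a[r, col] + b[r, col] + c[r, col]
            out[r, col] = s * s
    return out

small1 = np.ones((50, 50), np.int64) * 5
small2 = np.ones((50, 50), np.int64) * 10
small3 = np.ones((50, 50), np.int64) * 15

vectorized = np.square(small1 + small2 + small3)
looped = sum_square_loop(small1, small2, small3)
print(np.array_equal(vectorized, looped))
```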

The code above runs in 0.002288 s on the original author's computer, while my computer needs about 0.04 s.

But even NumPy code is not as fast as optimized machine code, so Numba can still be used to speed it up. The code is as follows:

# Speed up with Numba
# (continues the script above: reuses time, np, num_loops, img1, img2, img3)
from numba import vectorize, int64

@vectorize([int64(int64,int64,int64)], target='parallel')
def add_arrays_numba(img1, img2, img3):
    return np.square(img1+img2+img3)

start2 = time.time()
for i in range(num_loops):
    result = add_arrays_numba(img1, img2, img3)
end2 = time.time()
run_time2 = end2 - start2
print('Average time using numba accelerating={}'.format(run_time2/num_loops))

Output:

Average time using numba accelerating=0.007735490798950195

The vectorize decorator used here takes two parameters. The first specifies the data types of the NumPy arrays to operate on. It must be supplied here, because Numba needs it to compile the code into the best version of machine code and so increase speed.

The second parameter is target, which indicates how the function is run. It has three possible values:

  • cpu: runs on a single CPU thread

  • parallel: runs on a multi-core, multi-threaded CPU

  • cuda: runs on the GPU

The parallel option is faster than cpu in most cases, while cuda is generally used when the arrays are very large.
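The sketch below builds the same element-wise function for two of these targets and checks that they return identical results. The names sum_then_square, on_cpu, and on_parallel are invented for this illustration, and the ImportError fallback to the plain Python function is an assumption added only so the snippet runs where Numba is not installed. The cuda target is left out because it needs a GPU.

```python
import numpy as np

def sum_then_square(a, b, c):
    # Scalar logic; Numba applies it to every element of the input arrays
    s = a + b + c
    return s * s

try:
    from numba import vectorize, int64
    signature = [int64(int64, int64, int64)]
    on_cpu = vectorize(signature, target='cpu')(sum_then_square)
    on_parallel = vectorize(signature, target='parallel')(sum_then_square)
except ImportError:
    # Fallback for this sketch only: the plain function also works on arrays
    on_cpu = on_parallel = sum_then_square

a = np.full((4, 4), 5, dtype=np.int64)
b = np.full((4, 4), 10, dtype=np.int64)
c = np.full((4, 4), 15, dtype=np.int64)

cpu_result = on_cpu(a, b, c)
parallel_result = on_parallel(a, b, c)
print(np.array_equal(cpu_result, parallel_result))
```

Which target is faster depends on the array size and the machine, so it is worth timing both on your own data.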

On the original author's computer the code above runs in 0.001196 s, a speedup of about 2x. On my computer it takes 0.0077 s, a speedup of about 5x.

Summary

Numba is most effective at increasing speed in the following situations:

  • Places where Python code runs slower than C code, typically loops

  • Places where the same operation is repeated, for example applying the same operation to many elements, that is, NumPy array operations

In other cases Numba will not bring such a significant speedup. Even so, trying Numba is generally a worthwhile way to improve speed.

Finally, the practice code:

https://github.com/ccc013/Python_Notes/blob/master/Python_tips/numba_example.ipynb

You are welcome to follow my WeChat official account, The growth of algorithmic apes, so we can talk, learn, and improve together!
