# Quark App On-Device Intelligence: Practice and Application of Document Keypoint Detection

InfoQ 2021-10-14 05:08:14
{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c7/c7c73a83994db10394f5998ded9b8b81.gif","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Author: Shunda","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Recently, the Quark on-device intelligence team has been working on real-time, on-device document detection: given an RGB image, the model outputs the coordinates of the document's four corner keypoints. The whole pipeline is a keypoint detection algorithm, so I have recently been reading papers in related areas and experimenting with them.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A keypoint detection algorithm can be divided into the following modules, each of which has its own optimization methods:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Image processing: photometric augmentation, geometric transformation, resize, crop, and similar operations that increase the diversity of the images
;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Encoding: how the coordinates are converted during training into the labels used to supervise the model's output;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Network model: the network structure, which can include a backbone, FPN, detection head, and so on;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Decoding: how the model's inference output is converted into the required coordinate form, such as coordinates in a Cartesian coordinate system.","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/cb/cb0766449eb4a2edda588c0dd7d950a7.webp","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Related Works","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are two main technical approaches to keypoint detection:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"As in face keypoint detection, the model's output tensor is passed through an fc layer to directly produce a one-dimensional vector, usually the normalized keypoint coordinates;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"As in human pose estimation, the model outputs heatmaps; an operation such as argmax finds the coordinates of the maximum response in each heatmap, and those coordinates are then mapped back to the original image.","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In recent years, most keypoint detection approaches have been heatmap-based, mainly because heatmap-based methods perform better than direct regression with a fully connected layer. Our approach is therefore also heatmap-based. Some related papers from recent years are summarized below.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"DSNT","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] Nibali A, He Z, Morgan S, et al. Numerical Coordinate Regression with Convolutional Neural Networks[J]. 
2018.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Ideas","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At present, there are two ways to convert the heatmap output by the model into numerical coordinates:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Take the argmax of the heatmap to find the point of maximum response and convert it to numerical coordinates. This approach has good spatial generalization, but it has three problems. First, argmax is not differentiable, so training usually supervises the heatmap against an encoded Gaussian heatmap instead, which makes the loss function inconsistent with the final evaluation metric. Second, at inference only the point of maximum response is used to compute the numerical coordinates, whereas during training every point contributes to the loss. Third, converting a heatmap to numerical coordinates this way has a theoretical lower bound on the error;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Attach an fc layer after the heatmap to convert it to numerical coordinates. This allows the gradient to propagate from the numerical coordinates back to the input, but the result depends heavily on the data distribution (for example, if an object always appears on the left in the training set but appears on the right in the test set, the prediction will be wrong). 
Second, the fc transformation discards the spatial information in the heatmap.","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To address the weaknesses of these two schemes, the authors combine the advantages of both (end-to-end optimization and preserved spatial generalization) and propose a differentiable method for obtaining numerical coordinates.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/fa/fa46a80ee0951f2524fc59190995e5f3.webp","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Specific steps","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/08/08d7fbb65285a55386d73ef284f327ef.webp","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The model outputs heatmaps of shape 1×K×H×W, where K is the number of keypoints;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Normalize the heatmap of each channel so that all values are non-negative and sum to 1, giving norm_heatmap. 
Normalizing guarantees that the predicted coordinates fall within the spatial extent of the heatmap. norm_heatmap can also be interpreted as a two-dimensional discrete probability distribution;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Generate the X and Y matrices, ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"X_{(i,j)} = \\frac{2j-(w+1)}{w}"}},{"type":"text","text":", ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"Y_{(i,j)} = \\frac{2i-(h+1)}{h}"}},{"type":"text","text":", which hold the x-axis and y-axis coordinates of each pixel. This can be understood as rescaling the image so that its top-left corner maps to (-1,-1) and its bottom-right corner maps to (1,1);","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Take the inner product of the X and Y matrices with norm_heatmap (element-wise multiplication followed by summation) to obtain the final numerical coordinates. The reasoning is that norm_heatmap is a probability distribution and the X matrix holds the x coordinate of each pixel, so their inner product is the expected value of the predicted x (and likewise for Y and y). 
Representing the predicted coordinates by this mean has two advantages: a) it is differentiable; b) the theoretical lower bound on the error is small.","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/3e/3e19c2b247c889b3f53afcae213ed456.webp","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Loss function","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The loss function of the DSNT module consists of a Euclidean loss and a JS-divergence regularization term. The former regresses the coordinates; the latter constrains the generated heatmap to be closer to a Gaussian distribution.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"L_{euc}(u,p) = \\|p-u\\|_2"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"L_D(Z,p) = JS(p(c)\\,\\|\\,N(p,I))"}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Advantages","attrs":{}}]},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The whole model is trained end to end, and the loss function corresponds directly to the evaluation metric;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The lower 
bound on the theoretical error is very small;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Introducing the X and Y matrices can be understood as injecting a prior, which makes the model easier to train;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"It still performs well at low resolution.","attrs":{}}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Disadvantages","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In our experiments, we found that predictions are poor when a keypoint lies near the edge of the image.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"DARK","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] Zhang F, Zhu X, Dai H, et al. Distribution-Aware Coordinate Representation for Human Pose Estimation[J]. 2019.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Ideas","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The authors found that the way heatmaps are decoded has a large impact on the final numerical coordinates. 
They therefore study the shortcomings of the standard coordinate decoding method and propose distribution-aware decoding and encoding methods to improve the model's final accuracy.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In standard coordinate decoding, after the model produces its heatmaps, the maximum response point m and the second-largest response point s are located and used to compute the final response point p:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"p=m+0.25\\frac{s-m}{\\left\\| s-m \\right\\|_2}"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The formula shifts the maximum response point by 0.25 pixels toward the second-largest response point, in order to compensate for quantization error. 
The response point is then mapped back to the original image, where λ is the resolution reduction ratio between the original image and the heatmap:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"\\hat{p} = \\lambda p"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This also shows that the maximum response point in the heatmap does not correspond exactly to the keypoint in the original image; it only gives an approximate location.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To address this, and under the assumption that the distribution is known (Gaussian), the authors propose a new decoding method that recovers a precise position from the heatmap and minimizes quantization error. 
A matching encoding method is proposed as well.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Specific steps","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Decoding","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Assume the output heatmap follows a Gaussian distribution. The heatmap can then be expressed by the following function:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/a2/a2790d5b01f2da235c9dc0ba76001b14.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here μ is the location of the keypoint mapped onto the heatmap. 
We need to solve for the location μ, so the function g is converted into a likelihood function to be maximized:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/3f/3f5fec5d18c2b6bff704eff16a535ac9.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Take the Taylor expansion of P(μ):","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/a5/a55e72ecd0230da5af14eee572f4ce5d.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here m is the location of the maximum response in the heatmap. 
μ corresponds to the extreme point of the heatmap, so it has the following properties:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/a4/a40600e9a8680062cdbb198dffd254e6.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ca/ca2f617a603f7cb0ed5373b39dcb210f.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Combining the formulas above gives an expression for μ in terms of m and the derivatives at m.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Therefore, the location μ in the heatmap can be obtained from the first and second derivatives of the heatmap at m. The purpose of this step is to derive the offset distance mathematically.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The derivation above assumes that the output heatmap follows a Gaussian distribution. In practice this does not always hold, and the heatmap may be multimodal, so it needs to be modulated to satisfy the assumption as closely as possible. 
Concretely, the heatmap is smoothed with a Gaussian kernel K and then rescaled so that its amplitude stays consistent with the original heatmap.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"{h}'=K\\circledast h"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"{h}'=\\frac{{h}'-\\min({h}')}{\\max({h}')-\\min({h}')}\\cdot\\max(h)"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/96/965864711108398db094b73b1fec2d1f.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In summary, the steps are:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Modulate the heatmap with a Gaussian kernel and rescale it;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Compute the first and second derivatives to obtain ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"\\mu"}},{"type":"text","text":";","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Map ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"\\mu"}},{"type":"text","text":" back to the original image.","attrs":{}}]}]}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Encoding","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Encoding means mapping a keypoint onto the heatmap and generating a Gaussian distribution around it.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Previous work downsamples the coordinates, quantizes them (floor, ceil, or round), and then generates the Gaussian from the quantized coordinates.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because quantization is not differentiable and introduces quantization error, the authors propose to skip quantization and use the floating-point coordinates to generate the Gaussian, which yields an unbiased heatmap.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"UDP","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] Huang J, Zhu Z, Guo F, et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation[J]. 2019.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Ideas","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The authors improve performance by looking at data processing and coordinate representation. 
They found that current data processing methods are biased: in particular, a flipped result is not aligned with the original data, and the coordinate representation carries a statistical error. Together these two problems bias the results. They therefore propose unbiased data processing (UDP), which removes the errors introduced by image transformation and coordinate transformation.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Specific steps","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/2b/2b7cef6e17fc32945f87b496dad6507b.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Unbiased Coordinate System Transformation","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At test time, the flipped result ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"{k}'_{o,flip}"}},{"type":"text","text":" is usually combined with the original result ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"{k}'_o"}},{"type":"text","text":" to obtain the final prediction. However, ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"{k}'_o"}},{"type":"text","text":" and ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"{\\hat{k}}'_o"}},{"type":"text","text":" are not consistent with each other, so there is a bias. 
As the figure shows, the flipped heatmap is not aligned with the original heatmap; the resulting error depends on the resolution.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/7e/7e0fe711c7f32f0a95c0ee461faa16e4.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The authors therefore suggest measuring image size in unit lengths (the spacing between adjacent pixels) rather than in pixel counts: ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"w=w^p-1"}},{"type":"text","text":". With this definition, the flipped heatmap is aligned with the original.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f0/f04e8629d6b894787d9b7c104647c2e2.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Unbiased Keypoint Format Transformation","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An unbiased keypoint format transformation should be invertible, that is, ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"k=Decoding(Encoding(k))"}},{"type":"text","text":". 
The authors therefore propose two formats:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Combined classification and regression format","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This format borrows from anchor-based detection. Suppose the keypoint to be predicted is ","attrs":{}},{"type":"katexinline","attrs":{"mathString":"k=(m,n)"}},{"type":"text","text":"; it is encoded into the following form, where C indicates the region in which the keypoint lies and X and Y are the offsets to be predicted. At decoding time, take the argmax of heatmap C, read the offsets at that position from the X and Y heatmaps, and add them to obtain the numerical coordinates.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/5f/5f86fc29cfa3d0d97111658b33aa64af.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Classification format","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 
This is the same approach as DARK: a Taylor expansion is used to approximate the true position.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"AID","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] Huang J, Zhu Z, Huang G, et al. AID: Pushing the Performance Boundary of Human Pose Estimation with Information Dropping Augmentation[J]. 2020.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Contributions","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d2/d24e7e9f450d577aa8992f1b1c7eceb9.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For keypoint detection, appearance information and constraint information are equally important. Previous work tends to overfit the appearance information and neglect the constraint information. This paper therefore uses information dropping, which can be understood as masking, to emphasize the constraint information. Constraint information helps predict the exact location of a keypoint when it is occluded.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Earlier work did not use information dropping because the metrics got worse after this augmentation was applied. 
Through experiments, the authors found that information dropping does help improve model accuracy, but the training strategy has to be adjusted accordingly:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Double the number of training iterations;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"First train without masking; once a reasonably good model is obtained, add masking and continue training.","attrs":{}}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"RSN","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] Cai Y, Wang Z, Luo Z, et al. Learning Delicate Local Representations for Multi-Person Pose Estimation[J]. 2020.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Contributions","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This paper describes the winning solution of the 2019 COCO keypoint detection challenge. Its main idea is to aggregate features of the same spatial size as fully as possible in order to obtain rich local information, which helps produce more accurate localization. To this end, the paper proposes the RSN network, shown in Figure 1. 
As the figure shows, RSN fuses features with different receptive fields. The output of RSN contains both low-level, precise spatial information and high-level semantic information: spatial information helps localization, while semantic information helps classification. However, these two kinds of information do not carry the same weight in the final prediction, so a PRM module is needed to balance them. The PRM module is essentially a channel attention and spatial attention module.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ed/edc6600c9ab23f0b21b88f429fadbd60.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/1b/1bc43b844cc7d2cf329f0c6d38f31391.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Lite-HRNet","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] Yu C, Xiao B, Gao C, et al. Lite-HRNet: A Lightweight High-Resolution Network[J]. 2021.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Contributions","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This paper presents an efficient high-resolution network, a lightweight version of HRNet obtained by introducing the shuffle block from ShuffleNet into HRNet. 
The authors also found that the pointwise (1×1) convolutions used heavily in ShuffleNet are the computational bottleneck, so they introduce conditional channel weighting to replace the 1×1 convolutions in the shuffle block. The overall network structure is shown in the figure below. The model keeps high-resolution features throughout and continually fuses in high-level features.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/65/6528cbcbc96c966730eb6b3e4089e674.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The conditional channel weighting mentioned above is shown below. The left side is the shuffle block from ShuffleNet, and the right side is conditional channel weighting. As can be seen, a new module replaces the 1×1 convolution to provide cross-stage information exchange and local information exchange. Specifically, it consists of cross-resolution weight computation and spatial weight computation. 
Both of these modules are, in essence, attention mechanisms.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/68/68a90f16279f18d76407b74f0717eff6.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Experimental optimization results","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Model structure","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The model draws on related work such as CenterNet, RetinaFace and DBFace, and uses the DSNT scheme. The main reason is that the model has to run on-device, where real-time performance is the primary consideration, and DSNT has a clear advantage at low resolution.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MobileNet v3 uses the small version, and the FPN uses nearest upsampling + conv + BN + ReLU for upsampling. 
During training, the keypoints, mask and center branches are all used; at inference time, only the keypoints branch is used.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/13/13f716e4eed202558db5fd819b6583bd.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Optimization strategy","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The following optimization strategies were used in this experiment:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Use mask and center branches as auxiliary supervision, where mask is the mask of the document and center is the center point of the document;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Use deep supervision: both the 4× and the 8× downsampled feature maps are used for training, and the same loss function supervises both layers;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"DSNT performs poorly on points at the image edge, so the image is padded so that the points no longer lie on its edge;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Data augmentation: in addition to conventional photometric perturbation, random crop, random erase and random flip are also applied to the images;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Loss function experiments: for the keypoint branch, euclidean loss, l1 loss, l2 loss and smooth l1 loss were tried, and smooth l1 loss gave the best results.","attrs":{}}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Evaluation metrics","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MSE","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The mean squared error on the validation set, evaluated during training.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"mse = \\frac{\\sum_i \\|d_i - \\hat{d}_i\\|_2^2}{N}"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"oks-mAP","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"oks measures the similarity between the predicted and the ground-truth keypoints. mAP is computed in the same way as the COCO [0.5:0.05:0.95] evaluation, except that the thresholds here are [0.99:0.001:0.999]. The oks is slightly modified: $d_{p,i}$ is the Euclidean distance between the predicted and the ground-truth point, and $S_p$ is the area of the quadrilateral.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"oks_{p,i} = e^{-\\frac{d_{p,i}^2}{2S_p}}"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Latency","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Latency is the average time taken to run the model with the MNN inference framework on a Redmi 8.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Experimental results","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First we build a baseline. The baseline model is MobileNet v3 + FPN + SSH module + keypoints branch + DSNT. It uses none of the optimization strategies above and takes the 4× downsampled feature map as output.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/95/95be6c70113738d8743747c01128dc80.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On top of v2, different loss functions were substituted.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/04/0470d3773187edb54a47a3892b0f3a65.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, I tried some other tricks that turned out not to help:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Auxiliary tasks help improve a model's metrics, so an edge branch was added as auxiliary supervision. In experiments, however, adding this branch hurt the model's metrics. 
This is probably because the edges are generated from the ground-truth keypoints, and some of them may not be the real edges of the corresponding document;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The current task is to predict the 4 corner points of the document, so 4 more points were added for prediction, namely the midpoints of the 4 edges, so that the model predicts 8 keypoints. Experiments showed that the metrics dropped as well.","attrs":{}}]}]}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Demo","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=Mzg4MDY0ODk0Ng==&mid=2247484038&idx=1&sn=b3409892067b7b1a2fe0a6c2f0c3af0a&chksm=cf70b51bf8073c0dd6aaf41bb82e52c6eb552e4a3faf6312eff0c0c6ce5f9f5d7a5264c6db04&token=498222590&lang=zh_CN#rd","title":"","type":null},"content":[{"type":"text","text":"For the demo, please follow this link and watch the video at the end of the article.","attrs":{}}]}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Summary","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In summary, for on-device document keypoint detection, the heatmap + DSNT scheme has worked best among the approaches tried so far, and there is still room to improve oks-mAP. However, compared with regressing coordinates through an fc layer, one shortcoming of the heatmap-based scheme is that it cannot use the document's constraint information to predict keypoints that fall outside the image. As a result, document content is lost and the rectification result is poor. 
This shortcoming therefore needs to be addressed in future work.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0b/0b0ab19875525bd50aba09e2a5fa1892.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=Mzg4MDY0ODk0Ng==&mid=2247483731&idx=1&sn=9466e57a0f56124e25719740fa932016&chksm=cf70b6cef8073fd8db82357a28c4d276c11abf63d416baec51130dbc84337bd5800de6cd70ee&token=1213689564&lang=zh_CN#rd","title":"","type":null},"content":[{"type":"text","text":"How to develop on-device intelligent algorithms efficiently? A detailed look at Python debugging in the MNN workbench","attrs":{}}]}]}]}
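A note on the limitation described in the summary: in a heatmap + DSNT head, each coordinate is the expectation of a normalized heatmap over a fixed grid of pixel centers, so the decoded point can never leave the image. The sketch below illustrates this in plain Python. It is not the code used in this work: the function name `dsnt_decode`, the softmax normalization and the list-of-rows heatmap layout are assumptions made for illustration.

```python
import math


def dsnt_decode(heatmap):
    """Decode one keypoint from a 2D heatmap (list of rows) as a soft-argmax.

    Returns (x, y) in normalized image coordinates, where -1 and 1 are the
    image borders. Hypothetical helper, for illustration only.
    """
    h, w = len(heatmap), len(heatmap[0])

    # Normalize the heatmap into a probability map. Softmax is assumed here;
    # other normalizations are possible.
    peak = max(max(row) for row in heatmap)
    probs = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in probs)

    # Pixel-center coordinates: column j maps to (2j + 1) / w - 1.
    xs = [(2 * j + 1) / w - 1 for j in range(w)]
    ys = [(2 * i + 1) / h - 1 for i in range(h)]

    # Expected coordinate under the probability map.
    x = sum(probs[i][j] * xs[j] for i in range(h) for j in range(w)) / total
    y = sum(probs[i][j] * ys[i] for i in range(h) for j in range(w)) / total
    return x, y


# A 4x4 heatmap with almost all mass on the top-left pixel.
corner = [[0.0] * 4 for _ in range(4)]
corner[0][0] = 50.0
print(dsnt_decode(corner))
```

With nearly all of the mass on the top-left pixel, the decoded point is approximately (-0.75, -0.75), the center of that pixel. Because the result is a weighted average of pixel centers, it stays inside the image for any heatmap, and reaching a border pixel requires the heatmap to collapse onto it. This is consistent with padding the image to help edge points, and with the scheme being unable to place a corner outside the picture.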