The call stack (the chain of function calls) is as follows:

Case 1: When the downloader initializes

__init__
buildOpener  # build the opener
newProxy4Opener  # equip the opener with a proxy
getNewProxy  # get a proxy
maintainProxyPool  # maintain the proxy pool
replenishProxies  # replenish the proxies
getProxiesFromLib  # fetch the specified number of new proxies through a web service
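
The initialization chain above could be sketched roughly as follows. This is only an illustration under assumptions: the method names come from the chain, but every method body, the `POOL_MIN` threshold, and the placeholder proxy addresses are hypothetical. It also uses `urllib.request`, the Python 3 successor of `urllib2`, and stubs out `getProxiesFromLib` instead of calling a real web service.

```python
import urllib.request


class Downloader:
    """Illustrative only: method names follow the call chain; bodies are assumed."""

    POOL_MIN = 3  # hypothetical low-water mark for the proxy pool

    def __init__(self):
        self.proxyPool = []
        self.proxy = None
        self.opener = None
        self.buildOpener()

    def buildOpener(self):
        # Building the opener is what triggers the rest of the chain.
        self.newProxy4Opener()

    def newProxy4Opener(self):
        # Take a proxy and rebuild the opener around it.
        self.proxy = self.getNewProxy()
        handler = urllib.request.ProxyHandler({"http": self.proxy})
        self.opener = urllib.request.build_opener(handler)

    def getNewProxy(self):
        # Top up the pool if needed, then hand out the oldest entry.
        self.maintainProxyPool()
        return self.proxyPool.pop(0)

    def maintainProxyPool(self):
        # Refill only when the pool drops below the threshold.
        if len(self.proxyPool) < self.POOL_MIN:
            self.replenishProxies(self.POOL_MIN - len(self.proxyPool))

    def replenishProxies(self, count):
        self.proxyPool.extend(self.getProxiesFromLib(count))

    def getProxiesFromLib(self, count):
        # Stub: the original asks a web service for `count` proxies.
        # These local addresses are placeholders, not real proxies.
        return ["http://127.0.0.1:%d" % (8000 + i) for i in range(count)]
```

Constructing `Downloader()` performs no network access here: the opener is only built, not used.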

Case 2: During a download

safeDownload  # automatically retry on failure
download  # download
chgProxy  # check whether the proxy needs to be replaced
# The random condition is met: try to equip a new proxy
newProxy4Opener
getNewProxy
maintainProxyPool
replenishProxies
getProxiesFromLib
# The current proxy is invalid: try to equip a new proxy
dropAndChangeProxy
newProxy4Opener
getNewProxy
maintainProxyPool
replenishProxies
getProxiesFromLib
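
One plausible reading of the download chain above is sketched below. The flat listing does not say exactly which function calls `dropAndChangeProxy`, so this sketch assumes `download` calls it when a request fails and `safeDownload` then retries. The class name, the injected `fetch` callable (standing in for `opener.open`), the `swapChance` and `maxTries` parameters, and the simple list used as a proxy pool are all hypothetical.

```python
import random


class ProxyRotatingDownloader:
    """Hypothetical sketch; only the method names come from the call chain."""

    def __init__(self, fetch, proxies, swapChance=0.1, maxTries=3):
        self.fetch = fetch            # callable(url, proxy) -> body (assumed)
        self.proxies = list(proxies)  # stands in for the maintained proxy pool
        self.swapChance = swapChance  # probability of a random proxy change
        self.maxTries = maxTries
        self.proxy = None
        self.newProxy4Opener()

    def newProxy4Opener(self):
        # The real chain continues through getNewProxy ... getProxiesFromLib;
        # here the pool is simply rotated.
        if not self.proxies:
            raise RuntimeError("proxy pool exhausted")
        self.proxy = self.proxies[0]
        self.proxies.append(self.proxies.pop(0))

    def chgProxy(self):
        # Random condition met: equip a new proxy before downloading.
        if random.random() < self.swapChance:
            self.newProxy4Opener()

    def dropAndChangeProxy(self):
        # The current proxy is invalid: discard it and equip another one.
        self.proxies.remove(self.proxy)
        self.newProxy4Opener()

    def download(self, url):
        self.chgProxy()
        try:
            return self.fetch(url, self.proxy)
        except OSError:  # urllib.error.URLError is a subclass of OSError
            self.dropAndChangeProxy()
            raise

    def safeDownload(self, url):
        # Retry automatically; each failure has already swapped the proxy.
        for _ in range(self.maxTries):
            try:
                return self.download(url)
            except OSError:
                continue
        return None
```

Injecting `fetch` keeps the retry logic testable without network access; in a real downloader it would wrap the opener built in `newProxy4Opener`.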
