The article comes from ：wwww.iamshuaidi.com, One focuses on school recruitment , interview , Face to face programming website
subject ： I have a 40 Million integers , Give a new integer , I need to determine whether the new integer is 40 Among the billion integers , What would you do ？
【 Ask the great God 】
Xiao Shi goes back to school , I told Mr. Lu of the computer college about the interview .
Xiao Shi hurriedly pulled teacher Lu and asked , Why do I say points 8 Load data once , The interviewer will say too slowly ？
Miss Lu ： ha-ha , Loading data from disk is disk io operation , It's very slow , You load so much data every time , still more 8 Time , I guess you can find a number of minutes or even hours .
Minor history ： Then if it's you , What would you do ？
Miss Lu ： In fact, the interviewer has made it obvious , He said he would give you a batch of machines , It's a hint that you can use distributed algorithms . You spread the data around 8 On the machine , And then come up with a new data ,8 We'll find... Together , Finally, just summarize the results .
Minor history ： How much faster can it be ？
Miss Lu ： This should reach the second level . Minor history , You can analyze it yourself .
Minor history ： I think …… Oh , In doing so , Because each machine can read data into memory at one time , There is no need to load data back and forth during comparison , So you can save the cost of loading data ！ This is a good way .
【 Better plan 】
Miss Lu ： In fact, this is not the best way , I also have a millisecond method , Do you want to know ？
Minor history ： Of course I do , Teach me .
Minor history ： Oh , Yeah , So I apply for 40 Just a hundred million , The new number is converted to one bit , Then judge whether this bit is 0 still 1 That's it .
Miss Lu ： Xiao Shi , Consider the problem clearly , If it is 40 100 million , So this 40 What are the hundred million bits 0, Which are 1 Well ？ Here comes a new number , How to judge whether you are 40 Out of 100 million ？
Minor history ： I think , Yeah ,40 100 million ,40 Million number , So every bit is 1, this ...
Miss Lu ： Actually, you can think about ,32 position int The scope of the , All in all 2 Of 32 Power , Probably 42 More than 100 million . So you can apply for 2 Of 32 The second place .
Minor history ： It means that I cover the whole integer range , Oh , Yeah . thus , You can do it ,1 On behalf of the first ,2 On behalf of the second ,2 Of 32 The power represents the last bit .40 Of the billion , The existing number is in the corresponding position 1, The other bits are 0.
Miss Lu ： you 're right , Here comes a new number ？
Minor history ： The new number will find the corresponding bit , For example, here comes a 1234, Just find the first 1234 position , If it is 1 There is a , yes 0 It doesn't exist .
Miss Lu ： you 're right , In that case , How much memory do you need ？
Minor history ： Let me see ,2 Of 32 The second place , amount to 2 Of 29 Sub bytes , wow , only 500MB, It really saves a lot of memory .
Minor history ： Such a powerful algorithm , How do you think of ？
Miss Lu ： In fact, this is a very famous big data algorithm , It's called bitmap method , English name bitmap. seeing the name of a thing one thinks of its function , Is to use bits to represent the State , To save space . I have a class tomorrow , Let's talk about bitmap method , You can listen to .
【 Mr. Lu's class 】
the second day , Mr. Lu began his class , He threw out the interview question Xiao Shi encountered from the beginning .
Miss Lu ： The classmates , The question is BAT An interview question from the company , Do you have any ideas ？
Voice has just fallen. , Egg brother stood up and answered . Brother Dan is the most proud student of teacher Lu , Known for active thinking .
Egg brother ： I think it can be . First ,32 position int The range is 42 Billion ,40 Some of the billion integers must be continuous , We can sort the data externally first , Then an initial number and a length are used to form a data structure , To represent a continuous number , for instance .
If the data is 1 2 3 4 6 7…… Such , Then you can use it (1,4) and (6,2) To express , thus , Successive numbers become 2 The number means . After a new number , I used dichotomy to find .
thus , The worst case is 2 More than 100 million breakpoints , That is to say 2 More than 100 million structures , Each structure 8 Bytes , Probably 16 Gigabytes ,1.6GB, You can put... In memory .
Miss Lu ： Um. , very nice , Not only the scheme is given , It can also actively analyze space and feasibility .
Xiao Shi was deeply impressed after listening , There is definitely more than one solution to the problem , As long as you use your head , Even if you haven't learned bitmap Algorithm , There are other ways to solve the problem .
【 after class 】
After class , Xiao Shi found teacher LV again .
Miss Lu ： But your understanding is still very strong , Many things are easy to understand , Not everyone can do this .
Hello everyone , I'm handsome , It is also being more interview , Face the , Algorithm Wait for hard core articles , Click on My head is , You'll find it's too late to meet , If you think the article is just , Don't be stingy with your praise , Hee hee
More interview articles ：