1  Basic working principles of hard disks

1.1  Structure diagram of hard disk components

1.2  Key parameters and terminology

Head: when data is exchanged with a hard disk, read operations are much faster than write operations, so hard disk manufacturers developed separate read and write heads.

Rotational speed: the rotation speed of the spindle motor inside the hard disk, that is, the maximum number of revolutions the platters can complete in one minute. The faster the platters spin, the sooner the disk can locate a file, and the transfer rate rises accordingly. Common speeds on the market today are 5400 rpm, 7200 rpm, 10000 rpm and 15000 rpm. In theory, faster is better, because a higher speed shortens the wait for the target sector to rotate under the head (and therefore the average access time) as well as the actual read/write time. But a faster disk also produces more heat, which makes cooling harder. Mainstream hard disks currently run at 7200 rpm or above. SCSI hard disks typically spin at 7200-10000 rpm, and the fastest SCSI disks reach 15000 rpm.
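The effect of spindle speed on access time can be checked with a little arithmetic. The sketch below assumes the usual approximation that, on average, the head waits half a revolution for the target sector; the function name is illustrative and does not come from the original text.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average wait (in ms) for the target sector to rotate under the head.

    Assumes the common approximation of half a revolution on average.
    """
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


# The four spindle speeds listed in the text.
for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm:>5} rpm -> {avg_rotational_latency_ms(rpm):.2f} ms")
```

Under this approximation a 15000 rpm disk waits about 2 ms per access, against roughly 5.6 ms at 5400 rpm.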

Single-platter capacity: one of the most important parameters of a hard disk; to a certain extent it determines the grade of the disk. A hard disk is built from several storage platters, and the single-platter capacity is the maximum amount of data one platter can store.

Number of platters: platters are the medium on which a hard disk stores data. A hard disk consists of several platters stacked together and separated from each other by spacers. The more platters a disk has, the thicker it is and the more heat it generates.

Random seek time (unit: ms): differences in rotational speed show up directly in random read/write seek time. The lower the value, the better; it is also the most direct measure of how fast a disk feels in everyday use.

Average seek time: the time the hard disk takes to move the read/write head to the specified track in order to find the target data. It describes the disk's ability to read data and is measured in milliseconds. When single-platter capacity increases, the head has to seek less often and travel less far, so the average seek time falls and the disk becomes faster.

Data cache: high-speed memory inside the hard disk that acts as a buffer, temporarily holding data for reading and re-reading. Early hard disks had 512 KB to 2 MB of cache; mainstream SATA hard disks currently have 32 MB.

Track-to-track time (single track seek): the time the head takes to move from one track to another, in milliseconds (ms).

Full access time (max full seek): the total time from the moment the head starts moving until it finally locates the required data block, in milliseconds (ms).

Mean time between failures (MTBF): the average time a hard disk runs from the start of operation until it fails. A typical hard disk has an MTBF of at least 30,000 to 40,000 hours.

1.3  Hard disk types and their advantages and disadvantages

In chronological order of development, the types are:

1.3.1 IDE hard disk

IDE (Integrated Drive Electronics) refers to a hard disk drive that integrates the controller with the disk itself; the term also names the disk's transfer interface. It is also known as ATA (Advanced Technology Attachment), which is the same thing, and it uses parallel transmission (PATA).

It generally uses a 16-bit data bus, so each bus cycle transfers 2 bytes. For a typical bandwidth of 100 MB/s, the data bus must therefore run at 50 MHz. Ordinary IDE hard disks spin at 5400/7200 RPM. The transfer rate stalled at about 133 MB/s, and because of the limits of parallel technology the interface has gradually been phased out.
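The relationship between bus width, clock and bandwidth quoted above is simple multiplication. A minimal sketch, assuming one transfer per clock cycle and decimal megabytes; the function name is hypothetical:

```python
def parallel_bus_bandwidth_mb_s(bus_width_bits: int, transfers_per_second: int) -> float:
    """Peak bandwidth of a parallel bus in MB/s (1 MB = 1,000,000 bytes).

    Assumes one full-width transfer per clock cycle.
    """
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * transfers_per_second / 1_000_000


# 16-bit bus (2 bytes per transfer) clocked at 50 MHz, as in the text.
print(parallel_bus_bandwidth_mb_s(16, 50_000_000))
```

Raising throughput on such a bus means clocking all 16 lines faster in lockstep, which is the limitation the paragraph above refers to.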

1.3.2 SATA hard disk

A SATA (Serial ATA) hard disk is also called a serial-port hard disk; the name comes from its serial data transmission. During transmission the data lines and signal lines are used independently and the transmitted clock stays independent, so SATA can be clocked far higher than the earlier PATA. The often-quoted figure of 30 times is best read as the signalling rate of a single line, not overall throughput, as the interface rates below show.

The early SATA-1 standard reached 150 MB/s, the later SATA-2 standard reached 300 MB/s, and the third-generation SATA-3 standard reaches 600 MB/s. Rotational speed is typically 7200 RPM.
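These three figures follow from the line rate of the link. The sketch below is an illustration under two assumptions that the text above does not state: SATA line rates of 1.5, 3 and 6 Gbit/s, and the 8b/10b encoding used on SATA links, which spends 10 line bits on every payload byte.

```python
def sata_payload_mb_s(line_rate_gbit_s: float) -> float:
    """Payload bandwidth in MB/s for a given SATA line rate.

    With 8b/10b encoding every payload byte costs 10 bits on the wire.
    """
    line_rate_mbit_s = line_rate_gbit_s * 1000
    return line_rate_mbit_s / 10


for generation, line_rate in (("SATA-1", 1.5), ("SATA-2", 3.0), ("SATA-3", 6.0)):
    print(f"{generation}: {sata_payload_mb_s(line_rate):.0f} MB/s")
```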

SATA hard disks support hot swapping, but when a disk is damaged the system does not indicate which specific disk is bad, so hot swapping is of limited use. With a single thread or a small number of threads, performance is already very good; under multitasking or bulk data transfer, however, performance drops sharply, because the drive's mechanical platform is relatively low-end.

1.3.3 SCSI hard disk

SCSI (Small Computer System Interface) is a storage interface designed specifically for small computer systems. The computer sends a command to a SCSI device, and the disk then moves the actuator arm to position the head and transfers data between the disk media and the cache, with the whole process running in the background. Multiple commands can therefore be issued and processed at the same time, which suits heavy I/O workloads. Overall performance in a disk array is also much higher than that of an array built from ATA hard disks.

Mainstream SCSI hard disks use the Ultra 320 SCSI interface, which provides an interface transfer rate of 320 MB/s. The average seek time is 4-5 ms, CPU usage is low, parallel processing capability is strong, and data can be processed and transferred asynchronously. Ordinary SCSI hard disks spin at 10000/15000 RPM, but they are expensive.

1.3.4 SAS hard disk

SAS (Serial Attached SCSI) is a new generation of SCSI technology. Like the now popular Serial ATA (SATA) hard disks, it uses serial technology to achieve higher transfer speeds and shortens the connecting cables to free up internal space. SAS is also a transformative development of SCSI technology.

It supports transfer rates of 600 MB/s, and each SAS port provides 3 Gb of bandwidth, a throughput close to that of 4 Gb Fibre Channel and enough to rival FC hard disks. A SAS controller can attach not only SAS hard disks but also SATA hard disks. The average seek time is 3-4 ms. The price, however, is high: a SAS hard disk costs more than twice as much as an Ultra 320 SCSI hard disk of the same capacity, and building a RAID additionally requires a SAS card.

1.3.5 FC hard disk

FC (Fibre Channel) is generally regarded as an interconnect architecture between systems, or between a system and its subsystems. It uses a point-to-point (or switched) configuration, with optical cable connecting the systems. (The hard disk itself does not have an optical FC port; the enclosure that houses the disks provides the FC interface and connects to a Fibre Channel switch over optical fibre.)

The transfer rate reaches 200-400 MB/s and the average seek time is about 3 ms. FC offers high-performance transfers and excellent stability, but it is extremely expensive. Outside very high-end enterprise applications there is basically no market for it, and the rise of SAS has put FC under considerable pressure.

1.3.6 SSD hard disk

A solid state disk (Solid State Disk or Solid State Drive), also called an electronic hard disk or solid-state electronic disk, is a hard disk made up of a control unit and solid-state storage units (DRAM or flash chips).

It has excellent shock resistance, and the chips work over a very wide temperature range (-40 to 85℃). The cost is very high.

There are two kinds of SSD:

(1) Flash-based SSD (IDE Flash Disk, Serial ATA Flash Disk): this kind of SSD is portable, its data is retained without power, and it adapts to a wide variety of environments, but its service life is limited. It is suitable for individual users.

(2) DRAM-based SSD: it imitates the design of a traditional hard disk, can be formatted and managed by the file system tools of most operating systems, and provides FC and PCI interfaces. By application mode it is divided into SSD hard disks and SSD hard disk arrays.

2 RAID

The hard disk material above was all gathered from the Internet. In real applications, the biggest influences on the performance of our programs are network and disk I/O. CPUs are already very fast, and memory I/O has also reached a very high speed (probably around 5 GB per second), but our data lives on disk, and a running program has to read and store data constantly. Disk performance is therefore one of the biggest factors (leaving the network aside).

The drawbacks of modern disks are poor I/O performance and poor stability. We only discuss these qualitatively here. As for performance, if you are willing to spend the money, you can simply buy the expensive disks. As for stability, once a hard disk fails or is damaged it can no longer be used, which is unthinkable in places with especially strict data storage requirements. For this reason a new technology was born: RAID.

2.1 RAID concept

RAID stands for Redundant Array of Independent Disks, originally Redundant Array of Inexpensive Disks, and is called a disk array for short. The basic idea is to combine several relatively inexpensive hard disks into one array whose performance matches or even exceeds that of a single expensive, very large hard disk. Depending on the level chosen, RAID offers one or more of the following advantages over a single hard disk: better data integration, better fault tolerance, and higher throughput or capacity. In addition, to the computer a disk array looks like a single hard disk or logical storage unit. The levels include RAID-0, RAID-1, RAID-1E, RAID-5, RAID-6, RAID-7, RAID-10, RAID-50 and RAID-60.

A RAID level is evaluated on three main indicators: speed, disk utilization and redundancy.

2.2 RAID0

Merges multiple disks into one large disk. There is no redundancy, I/O is performed in parallel, and it is the fastest level. If one (physical) disk is damaged, all data is lost.
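How several disks are merged into one with parallel I/O can be sketched as a block mapping. The function below is a minimal illustration, assuming a stripe unit of one block and round-robin placement; `raid0_locate` is a hypothetical name, not part of any real RAID tool.

```python
def raid0_locate(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Return (disk index, block index on that disk) for a logical block.

    Assumes a stripe unit of exactly one block and round-robin placement.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    return logical_block % num_disks, logical_block // num_disks


# Six consecutive logical blocks spread over three disks.
for block in range(6):
    disk, offset = raid0_locate(block, 3)
    print(f"logical block {block} -> disk {disk}, block {offset}")
```

Consecutive blocks land on different disks, which is why sequential transfers can proceed in parallel, and also why losing any single disk leaves holes throughout the data.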

2.3 RAID1

Two or more sets of N disks mirror each other. Under some multithreaded operating systems this gives good read speed, at the cost of a slight drop in write speed. Unless a primary disk and the mirror holding the same data are damaged at the same time, the array keeps running as long as one disk is healthy, so reliability is the highest. It also has the highest unit cost of all RAID levels.

2.4 RAID2

An improved version of RAID 0. The data is encoded with a Hamming code, split into individual bits, and the bits are written to separate hard disks. Because an error correction code (ECC) is added to the data, the total amount stored is larger than the original data. When a data error occurs it can be corrected, which ensures the output is correct. The data transfer rate is quite high. RAID2 needs at least three disk drives to work. Several disks are required just to store the check and recovery information, which makes RAID2 more complex to implement, so it is rarely used in commercial environments.
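To show how a Hamming code lets a single wrong bit be located and corrected, here is a small sketch using the Hamming(7,4) code. The choice of the (7,4) variant is an assumption made for brevity; the text does not say which Hamming code a RAID2 array would use. In a RAID2 layout each of the seven code bits would be written to a different disk.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into 7 code bits, laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers code positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers code positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers code positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code_bits: list[int]) -> tuple[list[int], int]:
    """Fix at most one flipped bit; return (data bits, 1-based error position or 0)."""
    c = list(code_bits)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4  # the syndrome names the bad position
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], error_position


code = hamming74_encode([1, 0, 1, 1])
damaged = list(code)
damaged[4] ^= 1  # simulate one bad bit, i.e. one misbehaving disk
recovered, position = hamming74_correct(damaged)
print(code, damaged, recovered, position)
```

Three of every seven bits are check bits here, which illustrates why the stored data is larger than the original and why extra disks are needed for the check information.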

2.5 RAID3

Parallel transfer with a parity check code. It uses bit-interleaving (interleaved data storage); parity on its own can detect an error but cannot locate and correct it. It is mainly used for graphics (including animation) and similar workloads. It provides a good transfer rate for large amounts of sequential data, but for applications that perform many write operations the parity disk becomes the write bottleneck. Protecting data with a single dedicated parity disk is not as safe as mirroring, but disk utilization is greatly improved. At least three drives are required, and read/write rates are very high. Because there are few check bits, the computation time is relatively small.

2.6 RAID4

Independent disk structure with a parity check code. It is similar to RAID3, except that data is accessed in blocks, that is, per disk, one disk at a time. Recovery after a failure is much harder than with RAID3, the controller is much harder to design, and data access efficiency is not very good. The host always accesses the RAID card in units of blocks. When reading, RAID3 has to access all disks to obtain the data, whereas RAID4 only needs to access one disk. Because disk seek time is long, RAID4 handles concurrency more easily when reading large amounts of data, so its performance should be better. When writing, RAID3 can compute the check value directly and then write the data and the check value to the disks. RAID4 has to read the old data and the old check value, compute the new check value from the old data, the old check value and the new data, and then write the new data and the new check value.
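The RAID4 write path described above (read old data and old check value, then derive the new check value) can be sketched with XOR parity. This is a minimal illustration, assuming the check value is a plain byte-wise XOR of the data blocks in a stripe; the helper names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New check value from old data, old check value and new data only."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# One stripe with three data blocks and its parity block.
d0, d1, d2 = b"\x0f\x10", b"\xf0\x01", b"\xaa\x55"
parity = xor_bytes(xor_bytes(d0, d1), d2)

# Overwrite d1 without touching d0 or d2.
new_d1 = b"\x12\x34"
parity_after_update = updated_parity(d1, parity, new_d1)

# Recomputing the parity from all data blocks gives the same result.
parity_from_scratch = xor_bytes(xor_bytes(d0, new_d1), d2)
print(parity_after_update == parity_from_scratch)
```

The shortcut touches only the data disk being written and the parity disk, at the cost of two reads and two writes per update, and every such update hits the same dedicated parity disk.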

2.7 RAID5

Independent disk structure with distributed parity. It uses disk striping and is a storage solution that balances storage performance, data security and storage cost. The data and the corresponding parity information are spread across every disk in the RAID5 array, and each piece of parity information is stored on a different disk from the data it protects. When the data on one RAID5 disk is corrupted, the remaining data and the corresponding parity information are used to recover it. RAID 5 can be understood as a compromise between RAID 0 and RAID 1. Read efficiency is high, write efficiency is average, and block-level collective access is efficient. However, the parallelism of data transfer is not well solved, and the controller is quite hard to design. Each write operation causes four actual read/write operations: two reads for the old data and old parity information, and two writes for the new data and new parity information. When a disk is down, operating efficiency drops sharply.
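Recovery from a single failed disk and the spreading of parity across disks can be sketched as follows. This is an illustration only: it assumes XOR parity and one particular rotation scheme (parity moves one disk to the left on each stripe), and real controllers may use other layouts. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of any number of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe: int, num_disks: int) -> int:
    """Disk holding the parity of a stripe (assumed rotation, one step per stripe)."""
    return (num_disks - 1 - stripe) % num_disks


# One stripe on a four-disk array: three data blocks plus parity.
data = [b"AB", b"CD", b"EF"]
parity = xor_blocks(data)

# The disk holding b"CD" fails; rebuild its block from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])

# Parity location for the first five stripes.
print([parity_disk(stripe, 4) for stripe in range(5)])
```

Rebuilding one block requires reading the matching block from every surviving disk, which is consistent with the sharp drop in efficiency once a disk is down.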

2.8 RAID6

Independent disk structure with two kinds of distributed parity check codes. Compared with RAID 5, RAID 6 adds a second, independent parity block. The two independent parity systems use different algorithms, so data reliability is very high: even if two disks fail at the same time, the data remains usable. It is mainly used where data must never be wrong. The controller design becomes very complex and write speed is poor, because computing the parity values and verifying data correctness takes considerable time and adds extra load. RAID 6 needs at least four disks to work. Among the functions offered by hardware disk array cards, RAID 6 is also one of the most common array levels.

2.9 RAID10/01

RAID 1+0 mirrors the data first and then stripes it. All hard disks are divided into two groups, treated as the lowest-level RAID 0 combination, and each of the two groups is then operated as RAID 1.

RAID 0+1 does the opposite of RAID 1+0: it stripes the data first and then mirrors it onto two sets of hard disks. All hard disks are divided into two groups that form the lowest-level RAID 1 combination, and each of the two groups is operated as RAID 0.

In terms of performance, RAID 0+1 is described as reading and writing faster than RAID 1+0.

In terms of reliability, taking a four-disk array as the example: when one hard disk in a RAID 1+0 array is damaged, the other three keep working. In a RAID 0+1 array, as soon as one hard disk is damaged, the other disk in the same RAID 0 group stops working as well, leaving only two hard disks running, so reliability is lower.
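The reliability difference can be checked by enumerating every pair of failed disks in a four-disk array. The sketch below assumes a specific layout for illustration (disks 0 and 1 form one group, disks 2 and 3 the other); the names are hypothetical.

```python
from itertools import combinations

# Four disks numbered 0-3; the grouping is an assumption for illustration.
RAID10_MIRROR_PAIRS = [{0, 1}, {2, 3}]  # each pair is a mirror, pairs are striped
RAID01_STRIPE_SETS = [{0, 1}, {2, 3}]   # each set is a stripe, sets are mirrored


def raid10_survives(failed: set[int]) -> bool:
    """Every mirror pair must keep at least one healthy disk."""
    return all(pair - failed for pair in RAID10_MIRROR_PAIRS)


def raid01_survives(failed: set[int]) -> bool:
    """At least one stripe set must be completely healthy."""
    return any(not (stripe & failed) for stripe in RAID01_STRIPE_SETS)


two_disk_failures = [set(pair) for pair in combinations(range(4), 2)]
raid10_ok = sum(raid10_survives(failed) for failed in two_disk_failures)
raid01_ok = sum(raid01_survives(failed) for failed in two_disk_failures)
print(f"RAID 1+0 survives {raid10_ok} of {len(two_disk_failures)} double failures")
print(f"RAID 0+1 survives {raid01_ok} of {len(two_disk_failures)} double failures")
```

Both layouts survive any single failure, but under these assumptions RAID 1+0 survives four of the six possible double failures while RAID 0+1 survives only two.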

2.10 RAID50

RAID 50 is the combination of RAID 5 and RAID 0: RAID 5 is built first and RAID 0 on top of it, that is, several RAID 5 groups are striped together. Because RAID 50 is based on RAID 5, and RAID 5 needs at least 3 hard disks, building RAID 50 from multiple RAID 5 groups needs at least 6 hard disks. Taking the minimum 6-disk configuration as an example, the 6 hard disks are first divided into 2 groups of 3, each group forming a RAID 5. This yields two RAID 5 sets, which are then combined into a RAID 0.

When any one or more of the underlying RAID 5 groups each lose 1 hard disk, RAID 50 keeps working. But if any single RAID 5 group loses 2 or more hard disks, the whole RAID 50 fails.

Because RAID 50 stripes across multiple RAID 5 groups at the upper layer, its performance is higher than that of plain RAID 5. Its capacity utilization is the same as that of each underlying RAID 5 group, which is lower than a single RAID 5 built from the same total number of disks.

2.11 RAID60

RAID 60 is the combination of RAID 6 and RAID 0: RAID 6 is built first and RAID 0 on top of it. In other words, two or more RAID 6 groups are striped together. RAID 6 needs at least 4 hard disks, so RAID 60 needs at least 8 hard disks.

Because the bottom layer consists of RAID 6 groups, RAID 60 tolerates up to 2 damaged hard disks in every RAID 6 group and still works. But as soon as any underlying RAID 6 group loses 3 hard disks, the whole RAID 60 fails, although the odds of that are quite low.

Compared with plain RAID 6, RAID 60 stripes across multiple RAID 6 groups, so performance is higher. The entry threshold is high, however, and the low capacity utilization is a significant drawback.

2.12  Disk array comparison

| RAID level | Disks required | Minimum fault-tolerant disks | Usable capacity | Performance | Safety | Purpose | Typical applications |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | ≥2 | 0 | n | Highest | One abnormal disk makes the whole array abnormal | Maximum capacity and speed | Real-time rendering in the 3D industry, video editing cache |
| 1 | ≥2 | Half of the total | Half of the total capacity | Slightly higher | Highest | Maximum safety | Personal and enterprise backup |
| 10 | ≥4 | Half of the total | Half of the total capacity | High | Highest | Combines the advantages of RAID 0 and RAID 1, theoretically faster | Large databases, servers |
| 5 | ≥3 | 1 | n-1 | High | High | Maximum capacity on a minimum budget | Personal and enterprise backup |
| 6 | ≥4 | 2 | n-2 | Slightly slower than RAID 5 | Higher than RAID 5 | Same as RAID 5, but safer | Personal and enterprise backup |
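The usable-capacity figures in the comparison, together with the minimum disk counts given earlier for RAID 50 and RAID 60, can be condensed into one small helper. This is a sketch under stated assumptions: all disks are the same size, RAID 1 and RAID 10 keep two copies of the data, and RAID 50/60 are built from `groups` equally sized sub-arrays. The function name and parameters are hypothetical.

```python
def usable_disks(level: str, n: int, groups: int = 2) -> float:
    """Usable capacity, in units of one disk, for n equal-sized disks."""
    rules = {
        "0": (2, n),
        "1": (2, n / 2),
        "10": (4, n / 2),
        "5": (3, n - 1),
        "6": (4, n - 2),
        "50": (3 * groups, n - groups),       # one parity disk per RAID 5 group
        "60": (4 * groups, n - 2 * groups),   # two parity disks per RAID 6 group
    }
    minimum, capacity = rules[level]
    if n < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks, got {n}")
    return capacity


# Eight 4 TB disks under each level.
disk_tb = 4
for level in ("0", "1", "10", "5", "6", "50", "60"):
    print(f"RAID {level:>2}: {usable_disks(level, 8) * disk_tb:g} TB usable")
```

With eight disks, RAID 60 in its minimum two-group form keeps only half of the raw capacity, which matches the remark that low capacity utilization is its main drawback.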
