Difference between revisions of «BTRFS»

From Jose Castillo Aliaga

Revision as of 17:58, 18 Nov 2019

Case study

On a server configured as RAID 10 with 4 hard drives, the following message appears on screen:

BTRFS error (device sda1): bdev /dev/sdf1 errs: wr 51956436, rd 28739183, flush 124601, corrupt 0, gen 0
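The counters in a line like this can be pulled out with a short script. The following is a minimal sketch, not an official tool: the function name `parse_btrfs_errs` and the regular expression are assumptions based only on the layout of the message shown above.

```python
import re

# Sample line copied from the message above.
LINE = ("BTRFS error (device sda1): bdev /dev/sdf1 errs: "
        "wr 51956436, rd 28739183, flush 124601, corrupt 0, gen 0")

# Assumed layout: "bdev <device> errs: <name> <count>, <name> <count>, ..."
PATTERN = re.compile(r"bdev (?P<dev>\S+) errs: (?P<counters>.*)$")


def parse_btrfs_errs(line):
    """Return (device, {counter_name: count}), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    counters = {}
    for item in match.group("counters").split(","):
        name, value = item.split()
        counters[name] = int(value)
    return match.group("dev"), counters


device, counters = parse_btrfs_errs(LINE)
print(device, counters)
```

Here `device` is the member device the counters belong to (`/dev/sdf1`), not the device named in parentheses at the start of the message.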

The BTRFS error message indicates a problem in the file system: it reports cumulative write, read, flush, corruption and generation error counters for /dev/sdf1. We run a scrub to check:

 sudo btrfs scrub start /media/btrfs/
 scrub started on /media/btrfs/, fsid bb600f14-9fbb-4f27-af33-95c6ac1975fe (pid=16546)
 sudo btrfs scrub status /media/btrfs/
   scrub status for bb600f14-9fbb-4f27-af33-95c6ac1975fe
	scrub started at Mon Nov 18 13:54:43 2019, running for 00:15:05
	total bytes scrubbed: 84.04GiB with 5340488 errors
	error details: read=5340485 super=3
	corrected errors: 0, uncorrectable errors: 5340485, unverified errors: 0

Indeed, the scrub finds many errors that it cannot correct: all 5340485 read errors are reported as uncorrectable.
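To track these numbers over time, the scrub summary can be reduced to a few integers. This is a rough sketch that assumes the exact output layout shown above; other btrfs-progs versions may print the status differently, and the function name `parse_scrub_status` is hypothetical.

```python
import re

# Scrub status text copied from the output above.
SCRUB_STATUS = """\
scrub status for bb600f14-9fbb-4f27-af33-95c6ac1975fe
    scrub started at Mon Nov 18 13:54:43 2019, running for 00:15:05
    total bytes scrubbed: 84.04GiB with 5340488 errors
    error details: read=5340485 super=3
    corrected errors: 0, uncorrectable errors: 5340485, unverified errors: 0
"""


def parse_scrub_status(text):
    """Extract the error counters from the status layout shown above."""
    result = {}
    total = re.search(r"with (\d+) errors", text)
    if total:
        result["total_errors"] = int(total.group(1))
    # Matches "corrected errors: N", "uncorrectable errors: N", "unverified errors: N".
    pattern = r"(corrected|uncorrectable|unverified) errors: (\d+)"
    for name, value in re.findall(pattern, text):
        result[name] = int(value)
    return result


stats = parse_scrub_status(SCRUB_STATUS)
print(stats)
```

A monitoring script could, for example, raise an alert whenever `stats["uncorrectable"]` is greater than zero.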

Running smartctl on the disk that appears to be failing:

sudo smartctl /dev/sdf -a
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-70-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1ER164
Serial Number:    W4Z19BCP
LU WWN Device Id: 5 000c50 07d44b288
Firmware Version: CC25
User Capacity:    2.000.398.934.016 bytes [2,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Nov 18 15:53:14 2019 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(   89) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 ( 219) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x1085)	SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       162249504
  3 Spin_Up_Time            0x0003   100   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       175
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       89979761
  9 Power_On_Hours          0x0032   081   081   000    Old_age   Always       -       16693
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       85
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   089   000    Old_age   Always       -       1 2 12
189 High_Fly_Writes         0x003a   099   099   000    Old_age   Always       -       1
190 Airflow_Temperature_Cel 0x0022   064   048   045    Old_age   Always       -       36 (Min/Max 30/38)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       44
193 Load_Cycle_Count        0x0032   067   067   000    Old_age   Always       -       67287
194 Temperature_Celsius     0x0022   036   052   000    Old_age   Always       -       36 (0 9 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   069   000    Old_age   Always       -       262
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       16485h+48m+15.949s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       19632474908
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1261379460714

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
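The attribute table is long, so it can help to filter it down to a handful of counters. The sketch below parses a few rows copied from the table above and reports which of them have a nonzero raw value. The `WATCHED` list is an illustrative assumption, not a recommendation from smartmontools, and the function names are hypothetical.

```python
# A few rows copied from the attribute table above.
SMART_ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   069   000    Old_age   Always       -       262
"""

# Assumption: the counters we choose to watch; adjust to taste.
WATCHED = [
    "Reallocated_Sector_Ct",
    "Reported_Uncorrect",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
]


def parse_attributes(text):
    """Map attribute name to its RAW_VALUE column (kept as a string)."""
    attrs = {}
    for line in text.splitlines():
        # Ten columns: ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
        fields = line.split(None, 9)
        if len(fields) == 10 and fields[0].isdigit():
            attrs[fields[1]] = fields[9].strip()
    return attrs


def nonzero_watched(attrs):
    """Return the watched attributes whose raw value is a nonzero integer."""
    flagged = {}
    for name in WATCHED:
        raw = attrs.get(name)
        if raw is not None and raw.isdigit() and int(raw) != 0:
            flagged[name] = int(raw)
    return flagged


attrs = parse_attributes(SMART_ROWS)
flagged = nonzero_watched(attrs)
print(flagged)
```

For the output above, the only watched counter with a nonzero raw value is `UDMA_CRC_Error_Count` (262); the reallocated, pending and uncorrectable sector counters are all 0.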