Unable to label media

sub2o5 · Mar 19, 2022

Hello,

I just connected a new LTO8 drive to my PBS. The drive gets detected, and eject, erase, format etc. work fine.
But I'm unable to label the media.
I only get the following error:
"media read error - read failed - scsi command failed: transport error"

The tape itself is a brand new Sony LTO8.

I already checked the firmware of the drive - it's the newest - and updated PBS via apt.
Still no change.

Any ideas?
Kind regards
Stephan

dcsapak · Mar 22, 2022

can you post your pbs versions?

Code:

proxmox-backup-manager versions --verbose

sub2o5 · Mar 22, 2022

Code:

proxmox-backup             2.1-1        running kernel: 5.13.19-6-pve
proxmox-backup-server      2.1.5-1      running version: 2.1.5
pve-kernel-helper          7.1-13
pve-kernel-5.13            7.1-9
pve-kernel-5.13.19-6-pve   5.13.19-14
pve-kernel-5.13.19-1-pve   5.13.19-3
ifupdown2                  3.1.0-1+pmx3
libjs-extjs                7.0.0-1
proxmox-backup-docs        2.1.5-1
proxmox-backup-client      2.1.5-1
proxmox-mini-journalreader 1.3-1
proxmox-widget-toolkit     3.4-7
pve-xtermjs                4.16.0-1
smartmontools              7.2-1
zfsutils-linux             2.1.2-pve1

BTW: I did a fresh update before my initial post.

dcsapak · Mar 22, 2022

is there anything specially configured for that drive? encryption?
the tape is not a WORM tape, is it?

i'll try to check which call could fail here and write back later

sub2o5 · Mar 22, 2022

Thank you very much in advance!

The drive is a plain standard LTO8, freshly bought. Vendor diagnostics don't show any problems.
The media is nothing special.

proxmox-tape status shows some written data:

Code:

┌────────────────┬──────────────────────────┐
│ Name           │ Value                    │
╞════════════════╪══════════════════════════╡
│ blocksize      │ 0                        │
├────────────────┼──────────────────────────┤
│ density        │ LTO8                     │
├────────────────┼──────────────────────────┤
│ compression    │ 1                        │
├────────────────┼──────────────────────────┤
│ buffer-mode    │ 1                        │
├────────────────┼──────────────────────────┤
│ alert-flags    │ (empty)                  │
├────────────────┼──────────────────────────┤
│ file-number    │ 0                        │
├────────────────┼──────────────────────────┤
│ block-number   │ 0                        │
├────────────────┼──────────────────────────┤
│ manufactured   │ Sat Oct 30 02:00:00 2021 │
├────────────────┼──────────────────────────┤
│ bytes-written  │ 11.76 GiB                │
├────────────────┼──────────────────────────┤
│ bytes-read     │ 11.765 GiB               │
├────────────────┼──────────────────────────┤
│ medium-passes  │ 41                       │
├────────────────┼──────────────────────────┤
│ medium-wearout │ 0.26%                    │
├────────────────┼──────────────────────────┤
│ volume-mounts  │ 5                        │
└────────────────┴──────────────────────────┘

Code:

➜  ~ pmt status
using device /dev/tape/by-id/scsi-10WT124396-sg
{
  "vendor": "IBM",
  "product": "ULTRIUM-HH8",
  "revision": "N9M1",
  "blocksize": 0,
  "compression": true,
  "buffer-mode": 1,
  "density": "LTO8",
  "alert-flags": "(empty)",
  "file-number": 0,
  "block-number": 0,
  "manufactured": 1635552000,
  "bytes-read": 12632195072,
  "bytes-written": 12626952192,
  "volume-mounts": 5,
  "medium-passes": 41,
  "medium-wearout": 0.0025625
}
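
The table from `proxmox-tape status` and the JSON from `pmt status` appear to show the same counters in different units. A minimal Python sketch of the conversion, assuming the table uses binary GiB (2^30 bytes), shows wearout as a percentage, and prints the manufacture date in local time. The helper names are made up for illustration.

```python
from datetime import datetime, timezone

# Raw values copied from the `pmt status` JSON output.
status = {
    "bytes-read": 12632195072,
    "bytes-written": 12626952192,
    "medium-wearout": 0.0025625,
    "manufactured": 1635552000,
}

GIB = 1024 ** 3  # assumption: the table uses binary gibibytes


def to_gib(num_bytes: int) -> float:
    """Convert a raw byte counter to GiB."""
    return num_bytes / GIB


def to_percent(fraction: float) -> float:
    """Convert a 0..1 fraction to a percentage."""
    return fraction * 100


def to_utc(timestamp: int) -> datetime:
    """Interpret a Unix timestamp as a UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


print(f"bytes-written:  {to_gib(status['bytes-written']):.2f} GiB")
print(f"bytes-read:     {to_gib(status['bytes-read']):.3f} GiB")
print(f"medium-wearout: {to_percent(status['medium-wearout']):.2f}%")
print(f"manufactured:   {to_utc(status['manufactured']).isoformat()}")
```

Under these assumptions the values come out as 11.76 GiB written, 11.765 GiB read and 0.26% wearout, matching the table. The timestamp is midnight UTC on 2021-10-30, which fits the table's 02:00 if the host runs at UTC+2.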

Code:

➜  ~ pmt cartridge-memory
using device /dev/tape/by-id/scsi-10WT124396-sg
0|Remaining Capacity In Partition|11444091
1|Maximum Capacity In Partition|11444091
2|Tapealert Flags|(empty)
3|Load Count|5
4|MAM Space Remaining|3060
5|Assigning Organization|LTO-CVE
6|Formatted Density Code|5e
7|Initialization Count|6
9|Volume Change Reference|00000007
522|Device Vendor/Serial Number at Last Load|IBM     10WT124396
523|Device Vendor/Serial Number at Load-1|IBM     10WT124396
524|Device Vendor/Serial Number at Load-2|IBM     10WT124396
525|Device Vendor/Serial Number at Load-3|IBM     10WT124396
544|Total MBytes Written in Medium Life|12042
545|Total MBytes Read In Medium Life|12047
546|Total MBytes Written in Current Load|0
547|Total MBytes Read in Current/Last Load|14
548|Logical Position of First Encrypted Block|ffffffffffffffff
549|Logical Position of First Unencrypted Block After the First Encrypted Block|ffffffffffffffff
1024|Medium Manufacturer|SONY
1025|Medium Serial Number|S211030055
1026|Medium Length|960
1027|Medium Width|127
1028|Assigning Organization|LTO-CVE
1029|Medium Density Code|5e
1030|Medium Manufacture Date|20211030
1031|MAM Capacity|16352
1032|Medium Type|00
1033|Medium Type Information|0000
4096|Unique Cartridge Identify (UCI)|331369fa4a4a4c3641313139534f4e59202020200023b25c00800000
4097|Alternate Unique Cartridge Identify (Alt-UCI)|331369fa4a4a4c3641313139533231313033303035350080

sub2o5 · Mar 22, 2022

Code:

➜  ~ proxmox-tape label --label-text full --drive LTO8

TASK ERROR: media read error - read failed - scsi command failed: transport error

sub2o5 · Mar 22, 2022

Same thing on the LTO7 drive.

"2022-03-22T11:23:51+01:00: TASK ERROR: read failed - scsi command failed: transport error"

Formatting went fine!

dcsapak · Mar 22, 2022

mhmm... from the code i cannot really imagine what fails here, especially since you seem to use a fairly standard drive...

can you post the output of this command:

Code:

proxmox-tape scan --drive LTO8

?
maybe also for the LTO7 drive?

sub2o5 · Mar 22, 2022

Same error:

Code:

➜  ~ proxmox-tape scan --drive LTO8
rewinding tape
Error: read failed - scsi command failed: transport error
➜  ~ proxmox-tape scan --drive LTO7
rewinding tape
Error: read failed - scsi command failed: transport error
➜  ~

Are there some other low-level logs i can provide?

dcsapak · Mar 22, 2022

there is actually not really much that could go wrong here...
can you execute the following command and post the output?

Code:

sg_raw -r 1k /path/to/tape/device 12 00 00 00 60 00

the path to the tape device can be seen with 'proxmox-tape drive list'; it should look like '/dev/tape/by-id/scsi-ID-sg'

edit: that is not actually the thing that goes wrong... so you don't have to execute it...

the only thing i can see going wrong is the actual read from the drive with a transport error...
is there anything in 'dmesg' or the syslog while trying to label?

is the drive encrypted or did you set anything special?

sub2o5 · Mar 22, 2022

Code:

➜  ~ sg_raw -r 1k /dev/tape/by-id/scsi-10WT124396-sg 12 00 00 00 60 00
SCSI Status: Good

Received 96 bytes of data:
 00     01 80 06 12 41 01 10 02  49 42 4d 20 20 20 20 20    ....A...IBM
 10     55 4c 54 52 49 55 4d 2d  48 48 38 20 20 20 20 20    ULTRIUM-HH8
 20     4e 39 4d 31 00 00 6a 00  00 19 00 00 00 00 00 00    N9M1..j.........
 30     30 31 50 4c 35 34 39 20  00 00 00 a2 0c 28 04 60    01PL549 .....(.`
 40     05 20 0a 28 05 02 00 00  00 00 00 00 00 00 00 00    . .(............
 50     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00    ................

Code:

➜  ~ sg_raw -r 1k /dev/tape/by-id/scsi-90WT103605-sg 12 00 00 00 60 00
SCSI Status: Good

Received 96 bytes of data:
 00     01 80 06 12 41 01 10 02  51 55 41 4e 54 55 4d 20    ....A...QUANTUM
 10     55 4c 54 52 49 55 4d 2d  48 48 37 20 20 20 20 20    ULTRIUM-HH7
 20     4b 41 48 31 00 00 6a 00  00 12 00 00 00 00 00 00    KAH1..j.........
 30     33 38 4c 37 36 32 38 20  00 00 00 a2 0c 28 04 60    38L7628 .....(.`
 40     05 20 0a 28 05 02 00 00  00 00 00 00 00 00 00 00    . .(............
 50     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00    ................
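
For readers wondering what these dumps contain: the bytes `12 00 00 00 60 00` form a SCSI INQUIRY command (opcode 0x12) with an allocation length of 0x60, i.e. 96 bytes. In the standard INQUIRY response, the low five bits of byte 0 give the peripheral device type (1 means sequential access, i.e. tape), bytes 8 to 15 hold the vendor, bytes 16 to 31 the product, and bytes 32 to 35 the revision. A minimal Python sketch that decodes the first lines of the LTO8 dump, assuming every hexdump line carries an offset, 16 byte values and an ASCII column; the function names are made up for illustration.

```python
# First three lines of the sg_raw output for the LTO8 drive.
SAMPLE = """
 00     01 80 06 12 41 01 10 02  49 42 4d 20 20 20 20 20    ....A...IBM
 10     55 4c 54 52 49 55 4d 2d  48 48 38 20 20 20 20 20    ULTRIUM-HH8
 20     4e 39 4d 31 00 00 6a 00  00 19 00 00 00 00 00 00    N9M1..j.........
"""


def parse_hexdump(dump: str) -> bytes:
    """Turn sg_raw hexdump lines into raw bytes."""
    data = bytearray()
    for line in dump.strip().splitlines():
        fields = line.split()
        # fields[0] is the offset; the next 16 fields are the byte values.
        data.extend(int(value, 16) for value in fields[1:17])
    return bytes(data)


def decode_inquiry(data: bytes) -> dict:
    """Extract the identification fields of a standard INQUIRY response."""
    return {
        "device_type": data[0] & 0x1F,  # 1 = sequential-access device
        "vendor": data[8:16].decode("ascii").strip(),
        "product": data[16:32].decode("ascii").strip(),
        "revision": data[32:36].decode("ascii").strip(),
    }


print(decode_inquiry(parse_hexdump(SAMPLE)))
```

The decoded fields agree with the vendor, product and revision that `pmt status` reports, which shows that a simple command like INQUIRY does get through to the drive.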

The next things I could try are changing the HBA and reinstalling PBS ...

dcsapak · Mar 23, 2022

nothing out of the ordinary, but did you see my edit?

dcsapak said:
edit: that is not actually the thing that goes wrong... so you don't have to execute it...

the only thing i can see going wrong is the actual read from the drive with a transport error...
is there anything in 'dmesg' or the syslog while trying to label?

is the drive encrypted or did you set anything special?

sub2o5 · Mar 23, 2022

Sorry, didn't notice that.

The LTO8 is factory new; the LTO7 worked with Veritas and Linux (dd) for several years. No encryption or anything else out of the ordinary.

dmesg stays silent during the error.

Is there a way to dump the communication while doing a task? Maybe that way we can get further.

The odd thing: byte counters and passes increase on every labeling attempt.

dcsapak · Mar 23, 2022

mhmm.. strange

sub2o5 said:
Is there a way to dump the communication while doing a task? Maybe that way we can get further.

no not really without changing the source code...

sub2o5 said:
The odd thing: byte counters and passes increase on every labeling attempt.

an increasing read counter and pass count is not really surprising, since we rewind and try to read. but it seems that the read somehow fails in an unexpected way, and there does not seem to be much error information available...

i'll think about how to debug that in a practical way..

sub2o5 · Mar 29, 2022

After changing the controller from an Adaptec ASR-5805 to a dumb LSI one, at least the labeling works.
Right now i've started a backup job, drive is idling:

Code:

2022-03-29T09:09:11+02:00: update media online status
2022-03-29T09:09:11+02:00: media set uuid: 548b060a-9dc0-4750-9f41-e1a115063440
2022-03-29T09:09:11+02:00: found 51 groups
2022-03-29T09:09:12+02:00: backup snapshot ct/149/2022-02-06T17:33:18Z
2022-03-29T09:09:12+02:00: allocated new writable media 'full'
2022-03-29T09:09:12+02:00: Checking for media 'full' in drive 'LTO8'
2022-03-29T09:09:12+02:00: found media label full (2602327e-7bf2-4982-b115-9aa8784de541)
2022-03-29T09:09:12+02:00: writing new media set label (overwrite '00000000-0000-0000-0000-000000000000/0')
2022-03-29T09:09:14+02:00: moving to end of media

Don't know what it's doing in the background, maybe i just need to be a bit more patient.

dcsapak · Mar 29, 2022

ok, great that you found something. i was about to suggest doing a 'raw read' from the tape device with cat, since the logs indicate that the read fails...
it seems that you used a raid controller, which will probably not work, since it likely does not let the host speak 'raw' scsi with the drives, hence the read errors..
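
A 'raw read' of this kind could look roughly like the following Python sketch. It is only an illustration under several assumptions: the device path `/dev/nst0` is hypothetical, the tape is rewound beforehand (for example with `mt`), and the 256 KiB buffer is a guess that must be at least as large as the block on tape, since in variable block mode a single read returns at most one block. The function names are made up.

```python
import os


def read_first_block(device_path: str, buffer_size: int = 256 * 1024) -> bytes:
    """Read at most one block from the current position of the device."""
    fd = os.open(device_path, os.O_RDONLY)
    try:
        return os.read(fd, buffer_size)
    finally:
        os.close(fd)


def try_read(device_path: str) -> str:
    """Attempt a raw read and describe the outcome instead of raising."""
    try:
        block = read_first_block(device_path)
    except OSError as err:
        return f"read failed: {err}"
    return f"read ok: {len(block)} bytes"


# Example (hypothetical device path, tape rewound beforehand):
#   print(try_read("/dev/nst0"))
```

Note that such a read goes through the kernel's tape driver, whereas the `pmt` output in this thread shows the SCSI generic (`-sg`) device being used. A successful read here would therefore not rule out a pass-through problem on the controller.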

sub2o5 · Mar 29, 2022

The controller itself has never been a problem. It passes through tape devices.
Whether via mt/tar/dd directly on Linux, or via iSCSI or ESOS talking to a Windows Server VM with Veritas backup, everything was fine. So at first, it was not logical to search there. But because I had one on hand, I tried it and was pleasantly surprised. Maybe you can keep this for the record, as these Adaptec RAID controllers are cheap and can be found everywhere. I have 7 or so in my SAS controller box and 3 more at customer sites.

Backup is running now btw:

Code:

2022-03-29T09:23:09+02:00: backup snapshot ct/149/2022-03-16T20:40:26Z
2022-03-29T09:24:56+02:00: wrote 1134 chunks (4295.49 MB at 40.01 MB/s)
2022-03-29T09:25:26+02:00: wrote 1187 chunks (4296.80 MB at 142.01 MB/s)
2022-03-29T09:26:00+02:00: wrote 1121 chunks (4304.67 MB at 129.23 MB/s)
2022-03-29T09:26:29+02:00: wrote 1147 chunks (4297.06 MB at 147.20 MB/s)
2022-03-29T09:26:34+02:00: wrote 239 chunks (798.75 MB at 159.21 MB/s)
2022-03-29T09:26:34+02:00: end backup backup:ct/149/2022-03-16T20:40:26Z
2022-03-29T09:26:34+02:00: percentage done: 1.50% (0/51 groups, 39/51 snapshots in group #1)
2022-03-29T09:26:34+02:00: backup snapshot ct/149/2022-03-17T20:28:22Z
2022-03-29T09:27:37+02:00: wrote 1093 chunks (4297.59 MB at 68.38 MB/s)
2022-03-29T09:27:51+02:00: wrote 546 chunks (1927.28 MB at 135.37 MB/s)
2022-03-29T09:27:51+02:00: end backup backup:ct/149/2022-03-17T20:28:22Z
2022-03-29T09:27:51+02:00: percentage done: 1.54% (0/51 groups, 40/51 snapshots in group #1)
2022-03-29T09:27:51+02:00: backup snapshot ct/149/2022-03-18T20:34:32Z

dcsapak · Mar 29, 2022

sub2o5 said:
The controller itself has never been a problem. It passes through tape devices.
Whether via mt/tar/dd directly on Linux, or via iSCSI or ESOS talking to a Windows Server VM with Veritas backup, everything was fine. So at first, it was not logical to search there. But because I had one on hand, I tried it and was pleasantly surprised. Maybe you can keep this for the record, as these Adaptec RAID controllers are cheap and can be found everywhere. I have 7 or so in my SAS controller box and 3 more at customer sites.

interesting, since we (AFAIK) don't do anything differently than other tape reading tools. yes, we'll keep that in mind, sadly it's hard to investigate such things without the hardware on hand...

sub2o5 · Mar 29, 2022

I see you're from .AT ... can you please contact me via PM if you're interested in such a controller? (Happy to talk in German!)
Unfortunately, i wasn't able to contact you directly.

hmx · Mar 31, 2022

Hello,

In case it helps: same problem here, tested with two LTO drives (LTO5 Tandberg and LTO6 IBM).
I wasn't able to read or label a new tape, with the same error:

proxmox-backup-proxy[2878]: GET /api2/json/tape/drive/LTO-01/read-label?: 400 Bad Request: [client [::ffff:127.0.0.1]:48742] read failed - do_scsi_pt failed with err EINVAL: Invalid argument

Moving the drive from a MegaRAID SAS 2008 [Falcon]
to a SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] flashed to IT mode (other brand, Fuji > Dell)
solved the problem.

scsi host1: Avago SAS based MegaRAID driver MegaRAID SAS 2008 [Falcon]
scsi host0: Fusion MPT SAS Host

now I can test this
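
Since both reports in this thread point at the controller, it can help to check which kernel driver each SCSI host uses before connecting a tape drive. A minimal Python sketch that reads the `proc_name` attribute from sysfs; the function names and the short list of RAID driver names (`aacraid` for Adaptec, `megaraid_sas` for MegaRAID) are illustrative assumptions, not an exhaustive or official list.

```python
from pathlib import Path

# Assumption: driver names that suggest a RAID controller rather than a plain HBA.
RAID_DRIVER_HINTS = {"aacraid", "megaraid_sas"}


def list_scsi_hosts(sysfs_root: str = "/sys/class/scsi_host") -> dict:
    """Map each SCSI host (host0, host1, ...) to its driver name."""
    hosts = {}
    for host_dir in sorted(Path(sysfs_root).glob("host*")):
        try:
            hosts[host_dir.name] = (host_dir / "proc_name").read_text().strip()
        except OSError:
            hosts[host_dir.name] = "(unknown)"
    return hosts


def raid_hosts(hosts: dict) -> list:
    """Return the hosts whose driver name is in the RAID hint list."""
    return [name for name, driver in hosts.items() if driver in RAID_DRIVER_HINTS]


if __name__ == "__main__":
    found = list_scsi_hosts()
    for name, driver in found.items():
        print(f"{name}: {driver}")
    print("possible RAID controllers:", raid_hosts(found))
```

A host flagged here is only a hint: whether a given controller passes SCSI commands through to a tape drive unchanged depends on its firmware and mode.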
