Jorge Visentini

Hi everyone!

Sorry for the question...

I configured iSCSI and presented a LUN over two network links (one per storage controller).
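
For context, here is a minimal sketch of how the two sessions would typically be established with open-iscsi, using the portal IPs and target IQNs from the iscsiadm output further down (adjust to your own setup):
Code:
# discover targets on each portal (one per storage controller)
iscsiadm -m discovery -t sendtargets -p 172.16.22.1
iscsiadm -m discovery -t sendtargets -p 172.16.23.1

# log in to both targets so multipath sees two paths to the LUN
iscsiadm -m node -T iqn.2002-10.com.infortrend:raid.uid212847.001 -p 172.16.22.1 --login
iscsiadm -m node -T iqn.2002-10.com.infortrend:raid.uid212847.012 -p 172.16.23.1 --login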

Everything appears to be working, as shown by the output of the "multipath -ll" command:

3600d023100033f6f4b98cce0168a4056 dm-19 IFT,DS 1000 Series
size=50G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 9:0:0:0 sdj 8:144 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
`- 8:0:0:0 sdi 8:128 active ready running


However, the following messages keep appearing in the logs, and I do not know whether they indicate an error or not.

Jun 11 10:43:08 pve02 kernel: sd 9:0:0:0: alua: supports implicit TPGS
Jun 11 10:43:08 pve02 kernel: sd 9:0:0:0: alua: device naa.600d023100033f6f4b98cce0168a4056 port group 1 rel port 1
Jun 11 10:43:08 pve02 kernel: sd 9:0:0:0: alua: port group 01 state A preferred supports tolusNA
Jun 11 10:43:19 pve02 kernel: sd 8:0:0:0: alua: supports implicit TPGS
Jun 11 10:43:19 pve02 kernel: sd 8:0:0:0: alua: device naa.600d023100033f6f4b98cce0168a4056 port group 2 rel port 2
Jun 11 10:43:19 pve02 kernel: sd 8:0:0:0: alua: port group 02 state N non-preferred supports tolusNA
Jun 11 10:43:38 pve02 kernel: sd 9:0:0:0: alua: supports implicit TPGS
Jun 11 10:43:38 pve02 kernel: sd 9:0:0:0: alua: device naa.600d023100033f6f4b98cce0168a4056 port group 1 rel port 1
Jun 11 10:43:38 pve02 kernel: sd 9:0:0:0: alua: port group 01 state A preferred supports tolusNA


Does anyone know whether this is a standard Proxmox message or whether it comes from the iSCSI service itself?

Thank you.
 
For completeness, below is the output of the "iscsiadm -m session -P 2" command:

Target: iqn.2002-10.com.infortrend:raid.uid212847.012 (non-flash)
Current Portal: 172.16.23.1:3260,1
Persistent Portal: 172.16.23.1:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1993-08.org.debian:01:6786e13cdc6a
Iface IPaddress: 172.16.23.2
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
*********
Timeouts:
*********
Recovery Timeout: 5
Target Reset Timeout: 30
LUN Reset Timeout: 30
Abort Timeout: 15
*****
CHAP:
*****
username: <empty>
password: ********
username_in: <empty>
password_in: ********
************************
Negotiated iSCSI params:
************************
HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 262144
MaxXmitDataSegmentLength: 65536
FirstBurstLength: 65536
MaxBurstLength: 262144
ImmediateData: Yes
InitialR2T: Yes
MaxOutstandingR2T: 1
Target: iqn.2002-10.com.infortrend:raid.uid212847.001 (non-flash)
Current Portal: 172.16.22.1:3260,1
Persistent Portal: 172.16.22.1:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1993-08.org.debian:01:6786e13cdc6a
Iface IPaddress: 172.16.22.2
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 2
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
*********
Timeouts:
*********
Recovery Timeout: 5
Target Reset Timeout: 30
LUN Reset Timeout: 30
Abort Timeout: 15
*****
CHAP:
*****
username: <empty>
password: ********
username_in: <empty>
password_in: ********
************************
Negotiated iSCSI params:
************************
HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 262144
MaxXmitDataSegmentLength: 65536
FirstBurstLength: 65536
MaxBurstLength: 262144
ImmediateData: Yes
InitialR2T: Yes
MaxOutstandingR2T: 1
 
Hello Jorge,

I'm installing a new cluster with Proxmox 5.2 and a Dell Compellent SCv2020, and I see the same messages in syslog.
I'm looking for a solution but have not found anything yet. If I find one I will post it here, and if you find one, I would appreciate it if you posted it here as well.

Thanks in advance!
 
Hi there!

I was wondering whether you have had any luck with this issue?

I am in the same situation: I run a 4-node Proxmox cluster with a Dell Compellent SCv4020 over iSCSI with multipathd.
Everything was fine with Proxmox 3.x and 4.x. Last week I upgraded from 4.4 to the latest 5.2-9, and I immediately started getting these syslog messages about alua:

...
Sep 25 14:09:59 proxmox1 kernel: [356177.963996] sd 6:0:0:9: alua: supports implicit TPGS
Sep 25 14:09:59 proxmox1 kernel: [356177.964001] sd 6:0:0:9: alua: device naa.6000d310012b2e00000000000000000f port group f01d rel port 1d
Sep 25 14:09:59 proxmox1 kernel: [356177.966679] sd 6:0:0:22: alua: port group f01d state A non-preferred supports toluSNA
...

I noticed that multipathd was upgraded as well. The output of multipath -ll now looks like
...
dsm-cml (36000d310012b2e000000000000000013) dm-23 COMPELNT,Compellent Vol
size=64G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
|- 5:0:0:13 sdax 67:16 active ready running
`- 6:0:0:13 sday 67:32 active ready running
...

whereas before it used to be size=64G features='1 queue_if_no_path' hwhandler='0' wp=rw

Browsing the multipath.conf man page, I saw that some parameters have new default values, in particular retain_attached_hw_handler and detect_prio, which switched from a default of "no" to "yes".
Adding
retain_attached_hw_handler no
detect_prio no
to the defaults section and reloading multipath brings me back to
dsm-cml (36000d310012b2e000000000000000013) dm-23 COMPELNT,Compellent Vol
size=64G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 5:0:0:13 sdax 67:16 active ready running
`- 6:0:0:13 sday 67:32 active ready running

so I thought I had gotten rid of the alua stuff. But the syslog messages are still there, flooding my logs every 30 seconds. And with 30+ VMs / LVM volumes, that is really a lot.
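
For reference, a minimal sketch of the defaults section with those two options (assuming an otherwise unchanged /etc/multipath.conf):
Code:
defaults {
    retain_attached_hw_handler no
    detect_prio no
}
followed by reloading multipathd, for example with multipathd -k"reconfigure".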

I searched the web. I really did. I didn't find anything relevant, apart from your post https://forum.proxmox.com/threads/iscsi-message.44467/
and this one: https://forum.proxmox.com/threads/iscsi-login-negotiation-failed.41187/

They are relevant, but unsolved :/

From what I found and what you're saying, I understand that these messages are not errors but informational.
But they are flooding the logs. Should we just ignore them? Filter them out of the logs? That sounds bad, not a clean solution.

In addition, I run Munin to monitor the cluster nodes. It runs several plugins.
Since I upgraded, the 'diskstats' plugin behaves strangely.
When I run 'munin-run diskstats' locally on a node, it answers immediately, but when my Munin server runs it remotely it takes 90-100 seconds to answer, while it used to take around 4 seconds...
And as soon as I disable this plugin, the Munin answer time drops back to 4 seconds.
That seems to indicate something is wrong with the disks / multipath.
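
A rough way to compare the local and remote timings (assuming nc is installed and the Munin server is allowed by munin-node, which listens on TCP port 4949; replace <node> with the node's address):
Code:
# on the node itself
time munin-run diskstats

# from the munin server
time printf 'fetch diskstats\nquit\n' | nc <node> 4949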

Did you guys find any solution, explanation or workaround? I guess we are not the only ones with this issue?
If you need any further info, please let me know!

Many thanks in advance for your help,
kind regards,
Laurent
 
Take that as a no then... haha. Multipath seems to be working great, but it is still a constant onslaught of the same messages:

[469950.289741] sd 7:0:0:1: alua: supports implicit TPGS
[469950.289772] sd 7:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[469950.324157] sd 7:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[469950.424786] sd 8:0:0:1: alua: supports implicit TPGS
[469950.424795] sd 8:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[469950.468246] sd 8:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[469961.134564] sd 9:0:0:1: alua: supports implicit TPGS
[469961.134595] sd 9:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 1
[469961.176155] sd 9:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[469961.276708] sd 10:0:0:1: alua: supports implicit TPGS
[469961.276717] sd 10:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 1
[469961.320116] sd 10:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[469981.000640] sd 9:0:0:1: alua: supports implicit TPGS
[469981.000651] sd 9:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 1
[469981.040129] sd 9:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[469981.152827] sd 10:0:0:1: alua: supports implicit TPGS
[469981.152835] sd 10:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 1
[469981.196157] sd 10:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[470001.064394] sd 7:0:0:1: alua: supports implicit TPGS
[470001.064404] sd 7:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[470001.108146] sd 7:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[470001.208433] sd 8:0:0:1: alua: supports implicit TPGS
[470001.208442] sd 8:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[470001.236156] sd 8:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[470020.204452] sd 7:0:0:1: alua: supports implicit TPGS
[470020.204462] sd 7:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[470020.244065] sd 7:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[470020.344712] sd 8:0:0:1: alua: supports implicit TPGS
[470020.344720] sd 8:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[470020.388103] sd 8:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[470031.117963] sd 7:0:0:1: alua: supports implicit TPGS
[470031.117989] sd 7:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[470031.156205] sd 7:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[470031.240764] sd 8:0:0:1: alua: supports implicit TPGS
[470031.240773] sd 8:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[470031.284240] sd 8:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[470050.960111] sd 7:0:0:1: alua: supports implicit TPGS
[470050.960130] sd 7:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[470050.996100] sd 7:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
[470051.100725] sd 8:0:0:1: alua: supports implicit TPGS
[470051.100735] sd 8:0:0:1: alua: device naa.60014052555fa1ad6ffed4783d866fd9 port group 0 rel port 2
[470051.140185] sd 8:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
 
Same issue here:
Code:
[ 5294.324694] sd 10:0:0:0: alua: supports implicit and explicit TPGS
[ 5294.325607] sd 10:0:0:0: alua: device naa.6e843b696e00140d1af8d4937db522d7 port group 0 rel port 1
[ 5294.336094] sd 11:0:0:0: alua: supports implicit and explicit TPGS
[ 5294.337140] sd 11:0:0:0: alua: device naa.6e843b696e00140d1af8d4937db522d7 port group 0 rel port 1
[ 5294.337177] sd 10:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA
[ 5294.339297] sd 10:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA
It appears every 20 seconds or so...
This is quite bothersome, as it floods the syslog with useless status messages, but what concerns me most is that it generates unnecessary write operations on the SSDs the OS resides on.
After further research on the web I came across a post in the SUSE support knowledge base implying that this notification is informational only and does not report any issue. There is also a related post on the Red Hat support site, but it is not publicly accessible...
Has anyone managed to disable these inopportune notifications, or possibly filter them so that they no longer spoil the syslog?
(I found interesting posts here and here regarding the second option, but I haven't evaluated or tested them yet.)
 
I gave up looking for a solution to this; I saw that it was not an error, and as I had no more time to devote to it, I dropped it.
Today my company no longer uses those Compellent storage arrays; we have moved on to Ceph.
Sorry I can't help more.
 
Thanks Jesus for your feedback.
In case this can be useful to others...
I have continued investigating and managed to prevent my logs from being further flooded by these "alua: xxx" notifications; however, I couldn't stop them from spoiling the kernel ring buffer (dmesg)...
I basically applied the solution provided in the last link of my previous post:
Creating a dedicated ".conf" file in "/etc/rsyslog.d":
cat /etc/rsyslog.d/ignore-multipath-poll-notif.conf
Code:
if ($msg contains "device naa.6e843b696e00140d1af8d4937db522d7 port group 0 rel port 1" or $msg contains "port group 00 state A non-preferred supports TOlUSNA" or $msg contains "supports implicit and explicit TPGS") then stop
(one would have to adapt this to the content of the messages that appear, in particular the specific wwid)
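The new file's syntax can be quickly validated with:
Code:
rsyslogd -N1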
followed by a restart of the rsyslog daemon:
Code:
systemctl restart rsyslog
This method is certainly not the most elegant, but I believe it is still better than altering the log level as suggested in the same post.
I also tried fiddling with the multipathd daemon configuration; nothing produced the desired result:
- increasing "max_polling_interval" to 3600 secs (1 h) -> fail
- reducing "verbosity" from 2 to 1, or even 0 -> fail

I still don't understand the purpose of repeating the same notification over and over again... especially when everything is working as expected...
That is my temporary workaround until the issue gets solved by the maintainer(s) in charge of this module.
 
Thanks @oduL, your rsyslog filter helped out. For dmesg, this kind of filtering works for me (the alua notifications appear to be logged at the info level, so listing every level except info and debug hides them):
Bash:
dmesg -Tx --level=err,warn,notice,crit,alert,emerg --follow
 
A more generic version of the rsyslog filter, matching on patterns instead of a specific wwid:
Code:
if (re_match($msg, "alua: port group [0-9]{2} state [AN] (non-|)preferred supports tolUsNA") or re_match($msg, "alua: device naa.[0-9a-z]{32} port group [0-9] rel port [0-9a-z]") or re_match($msg, "alua: supports implicit and explicit TPGS")) then stop
 
Another option for dmesg filtering (edit to your needs):
Code:
alias dmesg='dmesg | grep -vE "alua: (supports implicit|device .* port group)"'
 
