Forum
Hi everyone.
We were working smoothly with our cluster, but memory suddenly went up and we needed to fail over. As soon as we did, some interfaces stopped working. We only have 4 virtual interfaces; while running on node1 everything was OK, but after the failover to node2, 2 of the interfaces stopped working even though they showed an OK status, and we couldn't use the farms that were on those interfaces. Do you know what could be happening?
Thanks in advance
Good day @rgonzaleza,
When the Community Cluster switches, the other node takes control by activating the virtual IP configured in the variable $cluster_ip. To clarify: can you confirm whether this IP is reachable (you can run ping) on the other node (the new MASTER) at the moment the cluster switches?
If not, ARP broadcast packets are not allowed in your network (it is quite common for cloud services to drop such packets). If you run on-premise, please allow that kind of packet.
If, after a cluster switch, the $cluster_ip is reachable but the other virtual IPs are not, then do the following:
Open the file /usr/local/skudonet/config/zlb-start and add the following line (you can also run it from the new MASTER for test purposes, as this command announces the MAC change to the network):
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_1
I assume Virtual_IP_1 runs over eth0; if not, please change eth0 based on your configuration.
Repeat the line per Virtual IP.
When arping is executed, ARP packets are sent to the network to announce the MAC change of Virtual_IP_1, and your switches have to update their ARP tables. If they do not, please check your network.
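Rather than writing one line per Virtual IP, the repeated commands can be generated with a small loop. This is a minimal sketch, not part of the product: the VIPs and interface are examples taken from this thread, and the loop prints the commands (so it can be dry-run without root); in zlb-start you would drop the echo and execute arping directly.

```shell
# Hypothetical helper: emit one gratuitous-ARP announcement per virtual IP.
# VIPS and IFACE are assumptions based on this thread; adapt to your config.
VIPS="10.1.1.143 10.1.1.144 10.1.1.145 10.1.1.146"
IFACE="eth0"
for vip in $VIPS; do
    # Dry run: print the command. In zlb-start, remove the echo so
    # arping (which needs root) actually runs.
    echo /usr/bin/arping -A -c 2 -I "$IFACE" "$vip"
done
```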
Any update will be welcomed.
Regards!
@emiliocm Thanks for your help, we did what you told us and these are our findings:
When we failed over to the second node, we were able to ping the cluster IP and the first node. Also, all the virtual IPs run over eth0, and the switches are actively updating the ARP table.
Another troubleshooting step we took:
We have another LB, and there we were able to configure the virtual IPs that stopped working on this new cluster.
Hi @rgonzaleza, I cannot understand where your problem is, so can you please describe it in more detail? From your first comment I understood that the cluster was not switching properly, as the virtual IP was not working on the second node, so I gave you indications to troubleshoot the switching, and you reported that it works.
I am a little bit lost, sorry.
Any additional information that would help me figure out what is going on would be appreciated.
Regards!
@emiliocm Hi Emilio, sorry if I didn't make myself clear.
This is the issue:
We have a cluster with 2 nodes: master (10.1.1.140), backup (10.1.1.141), and cluster (10.1.1.142). Everything seems to be working fine; the replication and failover are working. We created 4 virtual interfaces:
10.1.1.143
10.1.1.144
10.1.1.145
10.1.1.146
but here's the problem:
When we fail over to node 2, some of the virtual interfaces stop working, hence the farms also stop working, and we need to fail back to node 1. So we are not able to bring up those virtual interfaces on node 2, but the weird thing is that if we create those virtual interfaces on another LB we have, they start working.
Hi @rgonzaleza, when you say "we are not able to bring up those virtual interfaces on node 2", what kind of test did you run to determine that the interfaces on node 2 are not coming up?
Additionally, some information is missing in the description.
- When the failover is done to node2, are the virtual interfaces configured and up on node2? You can check this with commands like:
ifconfig
ip route list
ip route list table table_eth0
- When the failover is done to node2, are the farms up and running on node2?
If you run HTTP profiles, then check whether the "pound" binaries are running:
ps -ef | grep pound
If you run l4xnat profiles, then check whether nft is loaded with the rules:
nft list ruleset
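To gather all of these checks in one pass on the new MASTER, a throwaway script can dump them into a single report. This is a sketch only: the output path is my choice, and the guards let it run even where nft needs root or pound is absent.

```shell
#!/bin/sh
# Hypothetical post-failover report: collects the interface, routing,
# and farm state discussed above into one file on the new MASTER.
OUT=/tmp/failover-check.txt
{
    echo "== interfaces =="
    ip addr show
    echo "== main routing table =="
    ip route list
    echo "== per-interface table (name depends on your config) =="
    ip route list table table_eth0 2>/dev/null || echo "table_eth0 not present"
    echo "== pound processes (HTTP farms) =="
    ps -ef | grep '[p]ound' || echo "no pound processes found"
    echo "== nft ruleset (l4xnat farms; needs root) =="
    nft list ruleset 2>/dev/null || echo "nft unavailable or not run as root"
} > "$OUT" 2>&1
echo "report written to $OUT"
```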
This information is needed for me to understand whether the issue is a bug in the switching or a problem in your network configuration.
Regards!
@emiliocm Hi Emilio, we were finally able to replicate the issue. We created a new cluster, and I'm sending you a document with images; I hope it will make things clear for you.
Hi @rgonzaleza, thanks for the attached document; it is clear now.
Did you apply the workaround I explained in my previous comment (26/02/2025, 8:37 am)?
Add the following lines to /usr/local/skudonet/config/zlb-start (on both nodes of the cluster):
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_1
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_2
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_3
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_4
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_5
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_6
Then repeat the tests and share the results with us.
@emiliocm hi Emilio.
I added the commands to the file you told me, and it is updated on the other node automatically; I suppose that is because of the HA configuration. But the issue is the same: node 2 shows the interfaces are up, but we lose communication to the virtual interfaces. I am attaching a new document with the evidence; I also ran the command you sent last week.
Hi @rgonzaleza, I can confirm that the issue is related to your network. The SKUDONET Cluster is working properly; see the attached PDF with my tests.
When the cluster switches, the new master sends a gratuitous ARP packet to the network to tell networking devices that the MAC for the IP has changed. For some reason, this kind of packet is being discarded by your switches.
I would recommend you experiment with the arping command on the new MASTER once the cluster switches, making some changes to these commands:
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_1
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_2
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_3
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_4
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_5
/usr/bin/arping -A -c 2 -I eth0 Virtual_IP_6
replacing -A with -U (sending the arping packet in unsolicited mode, -U, instead of answered mode, -A):
/usr/bin/arping -U -c 2 -I eth0 Virtual_IP_1
/usr/bin/arping -U -c 2 -I eth0 Virtual_IP_2
/usr/bin/arping -U -c 2 -I eth0 Virtual_IP_3
/usr/bin/arping -U -c 2 -I eth0 Virtual_IP_4
/usr/bin/arping -U -c 2 -I eth0 Virtual_IP_5
/usr/bin/arping -U -c 2 -I eth0 Virtual_IP_6
You can do some troubleshooting on your switches to identify why these packets are not handled. But as I said, the issue seems to be related to your network, as the cluster works as expected and does what it should.
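One way to start that switch-side troubleshooting is to capture ARP on another host in the same VLAN while forcing a failover; if the gratuitous announcements never arrive there, the switch is dropping them. A sketch with example values from this thread (tcpdump needs root; the fallback message only keeps the snippet from aborting where capture is not possible):

```shell
# A gratuitous announcement appears as an ARP reply "is-at <new MAC>"
# for the virtual IP; -e prints MAC addresses, -c 2 stops after two packets.
timeout 15 tcpdump -n -e -c 2 -i eth0 arp and host 10.1.1.143 2>/dev/null \
    || echo "capture unavailable (needs root on a host in that VLAN)"
```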
Thanks!
@emiliocm Thanks Emilio. I ran the command with -U, and that updated the MAC address. But now, what do we have to do? Because I added the commands to the file you mentioned before (zlb-start), and when I failed over it didn't work.
Good day @rgonzaleza. The file zlb-start is called at the end of the Master startup process, so it is the place to configure additional actions once the node takes the MASTER role. You can add a sleep before executing arping -U and increase the number of packets from 2:
sleep 5
/usr/bin/arping -U -c 10 -I eth0 Virtual_IP_1
/usr/bin/arping -U -c 10 -I eth0 Virtual_IP_2
/usr/bin/arping -U -c 10 -I eth0 Virtual_IP_3
/usr/bin/arping -U -c 10 -I eth0 Virtual_IP_4
/usr/bin/arping -U -c 10 -I eth0 Virtual_IP_5
/usr/bin/arping -U -c 10 -I eth0 Virtual_IP_6
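The fragment above can also be written as a loop, which keeps the delay, packet count, and VIP list in one place and makes it easy to repeat the announcement a few times (a single burst can be lost while the switch ports reconverge). A sketch only: the VIPs, interface, timings, and the DRY_RUN switch are my assumptions, not SKUDONET features.

```shell
# Hypothetical zlb-start fragment. DRY_RUN=1 prints the commands instead
# of executing them (arping needs root), so the loop can be checked first.
DRY_RUN=1
VIPS="10.1.1.143 10.1.1.144 10.1.1.145 10.1.1.146"
announce() {
    if [ "$DRY_RUN" = 1 ]; then
        echo /usr/bin/arping -U -c 10 -I eth0 "$1"
    else
        /usr/bin/arping -U -c 10 -I eth0 "$1"
    fi
}
# Three rounds of unsolicited announcements, five seconds apart.
for round in 1 2 3; do
    [ "$DRY_RUN" = 1 ] || sleep 5
    for vip in $VIPS; do
        announce "$vip"
    done
done
```

Set DRY_RUN=0 (and keep the sleep) once the printed commands look right for your network.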