Register Now

Login

Lost Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

How To Troubleshoot IPSEC VPN When Some Multiple SA’s Are Inactive

Issue:

Here in this article we will discuss how to troubleshoot the issue when one of the IPSEC SA’s remains inactive. We can refer this article when some SA’s are established and some SA’s are down.

Security Associations Inactive
Security Association

Troubleshooting:

In the above image, we can notice security association between “192.168.80.0/24” and “192.168.64.0/20” is down and security association between “192.168.1.0/24” and “192.168.64.0/20” is active.

First step is to check when security association is changed in the charon logs. Login to CLI of sophos XG firewall then select “Device Management” and then select “Advanced Shell” and enter this command to keep charon service in debug mode: “service charon:debug -ds nosync”. Once we keep charon service in debug mode, it will give more detailed logs.

Now we need to filter the charon logs with this command: “cat charon.log | grep conn_name

**************************************************************************************************************************

2020-10-23 20:31:21 10[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 2
2020-10-23 20:34:21 32[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 2
2020-10-23 20:37:21 27[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 2
2020-10-23 20:40:21 28[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 2
2020-10-23 20:43:21 20[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 1
2020-10-23 20:46:21 30[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 1
2020-10-23 20:49:21 14[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 1
2020-10-23 20:52:21 27[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 1
2020-10-23 20:55:21 30[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 1
2020-10-23 20:58:21 16[APP] [COP-UPDOWN][STATUS] (db_status_update) conn_name: azuregw count: 1

**************************************************************************************************************************

In the above logs count 2 will resemble 2 security associations, and then after “2020-10-23 20:43:21” count became 1. So the two security associations changed to one security association, means one security association has been disconnected.

We need to narrow down the logs just before that specific time stamp in charon logs.

XG IP is 285.54.258.149 (Respond Only)

Azure IP: 51.136.198.59 (Initiator)

I am marking the important logs which will make us to understand what is happening at that specific time.

*********************************************************************************************************************

2020-10-23 20:40:07 15[CHD] <aztest1-1|143> SPI 0xe37df272, src 285.54.258.149 dst 51.136.198.59
2020-10-23 20:40:07 15[CHD] <aztest1-1|143> CHILD_SA azuregw-1{132} state change: INSTALLING => INSTALLED
2020-10-23 20:40:07 15[IKE] <aztest1-1|143> inbound CHILD_SA azuregw-1{132} established with SPIs c08f3a27_i e37df272_o and TS 192.168.1.0/24 === 192.168.64.0/20
2020-10-23 20:40:07 15[CHD] <aztest1-1|143> CHILD_SA azuregw-1{131} state change: INSTALLED => REKEYING
2020-10-23 20:40:07 15[CHD] <aztest1-1|143> CHILD_SA azuregw-1{131} state change: REKEYING => REKEYED
2020-10-23 20:40:07 15[ENC] <aztest1-1|143> generating CREATE_CHILD_SA response 6269 [ SA No TSi TSr ]
2020-10-23 20:40:07 15[NET] <aztest1-1|143> sending packet: from 285.54.258.149[500] to 51.136.198.59[500] (208 bytes)
2020-10-23 20:40:07 04[NET] sending packet: from 285.54.258.149[500] to 51.136.198.59[500]
2020-10-23 20:40:07 03[NET] received packet: from 51.136.198.59[500] to 285.54.258.149[500] on Port2
2020-10-23 20:40:07 03[NET] waiting for data on sockets
2020-10-23 20:40:07 11[NET] <aztest1-1|143> received packet: from 51.136.198.59[500] to 285.54.258.149[500] (80 bytes)
2020-10-23 20:40:07 11[ENC] <aztest1-1|143> parsed INFORMATIONAL request 6270 [ D ]
2020-10-23 20:40:07 11[IKE] <aztest1-1|143> received DELETE for ESP CHILD_SA with SPI 6d8c6dba
2020-10-23 20:40:07 11[IKE] <aztest1-1|143> closing CHILD_SA azuregw-1{131} with SPIs c300308c_i (532547358 bytes) 6d8c6dba_o (35133355 bytes) and TS 192.168.1.0/24 === 192.168.64.0/20
2020-10-23 20:40:07 11[IKE] <aztest1-1|143> sending DELETE for ESP CHILD_SA with SPI c300308c
2020-10-23 20:40:07 11[CHD] <aztest1-1|143> CHILD_SA azuregw-1{131} state change: REKEYED => DELETING
2020-10-23 20:40:07 11[IKE] <aztest1-1|143> CHILD_SA closed
2020-10-23 20:40:07 11[CHD] <aztest1-1|143> CHILD_SA azuregw-1{131} state change: DELETING => DELETED

***********************************************************************************************************************

From the above logs we can notice below steps:

  1. Packet received from Azure gateway to XG gateway.

2. XG received DELETE ESP CHILD SA from Azure gateway.

3. CHILD SA started deleting.

4. Finally CHILD SA has been deleted.

Conclusion:

The initiator sends a rekey request, however SA did not established properly. In this particular scenario issue lies on azure side as delete request for specific SA has been initiated from the azure side.

By referring this article you can solve similar scenarios where some security associations are inactive.

Hope this article helps you.