Discussion:
hostapd mixing EAP ids
Ahmet Basagalar
2005-10-18 21:44:24 UTC
Permalink
Hi,

We have recently upgraded to hostapd ver. 3.9 from ver 2.6. But there seems
to be a bug regarding Radius authentication. Randomly some of our users can
not authenticate although they are valid. When I check radius logs I see the
following message:

Either EAP-request timed out OR EAP-response to an unknown EAP-request

I did a packet capture to see the communication between radius server and
hostapd and seems like hostapd is mixing up EAP ids and this confuses RADIUS
server and causes a reject.

Below is a sample packet capture showing the problem.

Everything is normal up to Frame 65 where RADIUS sends MD5 challenge, but
hostapd again responds with a identity instead of challenge response, so
RADIUS sends another challenge for this new identity, then probably hostapd
responds to this challenge with the old EAP id I believe.

I will try to look into the code, but if somebody resolved this before, that
will be helpful. This happens randomly as I mentioned before.

Thanks,
Ahmet

Packet Capture
-------------------------------------------
Frame 63 (220 bytes on wire, 220 bytes captured)
Linux cooked capture
Internet Protocol, Src Addr: 62.12.37.95 (62.12.37.95), Dst Addr:
61.107.119.129 (61.107.119.129)
User Datagram Protocol, Src Port: 32771 (32771), Dst Port: radius (1812)
Radius Protocol
Code: Access Request (1)
Packet identifier: 0x6b (107)
Length: 176
Authenticator: 0xAFF10C9DD82E5820D891B90112FE7CFD
Attribute value pairs
t:User Name(1) l:9, Value:"metzler"
t:NAS IP Address(4) l:6, Value:62.12.37.95
t:NAS identifier(32) l:15, Value:"some_nas"
t:NAS Port(5) l:6, Value:40
t:Called Station Id(30) l:33, Value:"00-02-6F-07-C3-45:some_nas"
t:Calling Station Id(31) l:19, Value:"00-02-6F-07-7E-2E"
t:Framed MTU(12) l:6, Value:1400
t:NAS Port Type(61) l:6, Value:Wireless IEEE 802.11(19)
t:Connect Info(77) l:24, Value:"CONNECT 11Mbps 802.11b"
t:EAP Message(79) l:14
Extensible Authentication Protocol
Code: Response (2)
Id: 138
Length: 12
Type: Identity [RFC3748] (1)
Identity (7 bytes): metzler
t:Message Authenticator(80) l:18,
Value:779B20E889DC807CE9A785564DA2C5F3

Frame 65 (323 bytes on wire, 323 bytes captured)
Linux cooked capture
Internet Protocol, Src Addr: 61.107.119.129 (61.107.119.129), Dst Addr:
62.12.37.95 (62.12.37.95)
User Datagram Protocol, Src Port: radius (1812), Dst Port: 32771 (32771)
Radius Protocol
Code: Access challenge (11)
Packet identifier: 0x6b (107)
Length: 279
Authenticator: 0x8EFC0EFB6EE6EBD7853C406FEAF2FDB6
Attribute value pairs
t:Vendor Specific(26) l:193, Vendor:Cisco(9)
t:Acct Interim Interval(85) l:6, Value:1800
t:EAP Message(79) l:24
Extensible Authentication Protocol
Code: Request (1)
Id: 139
Length: 22
Type: MD5-Challenge [RFC3748] (4)
Value-Size: 16
Value: 00B5754047F66F1FEB8046FE9F1CB303
t:Message Authenticator(80) l:18,
Value:C4834492B327FD1B3F33422135F9395D
t:State(24) l:18, Value:190F2E2DF8FA36E59535A0B853F4ED5E

Frame 66 (238 bytes on wire, 238 bytes captured)
Linux cooked capture
Internet Protocol, Src Addr: 62.12.37.95 (62.12.37.95), Dst Addr:
61.107.119.129 (61.107.119.129)
User Datagram Protocol, Src Port: 32771 (32771), Dst Port: radius (1812)
Radius Protocol
Code: Access Request (1)
Packet identifier: 0x6d (109)
Length: 194
Authenticator: 0x21FEB636DEAD236CD292F840409AFB7D
Attribute value pairs
t:User Name(1) l:9, Value:"metzler"
t:NAS IP Address(4) l:6, Value:62.12.37.95
t:NAS identifier(32) l:15, Value:"some_nas"
t:NAS Port(5) l:6, Value:40
t:Called Station Id(30) l:33, Value:"00-02-6F-07-C3-45:some_nas"
t:Calling Station Id(31) l:19, Value:"00-02-6F-07-7E-2E"
t:Framed MTU(12) l:6, Value:1400
t:NAS Port Type(61) l:6, Value:Wireless IEEE 802.11(19)
t:Connect Info(77) l:24, Value:"CONNECT 11Mbps 802.11b"
t:EAP Message(79) l:14
Extensible Authentication Protocol
Code: Response (2)
Id: 139
Length: 12
Type: Identity [RFC3748] (1)
Identity (7 bytes): metzler
t:State(24) l:18, Value:190F2E2DF8FA36E59535A0B853F4ED5E
t:Message Authenticator(80) l:18,
Value:6E1AB597A228D65BBA27D0B0535EA45B

Frame 69 (323 bytes on wire, 323 bytes captured)
Linux cooked capture
Internet Protocol, Src Addr: 61.107.119.129 (61.107.119.129), Dst Addr:
62.12.37.95 (62.12.37.95)
User Datagram Protocol, Src Port: radius (1812), Dst Port: 32771 (32771)
Radius Protocol
Code: Access challenge (11)
Packet identifier: 0x6d (109)
Length: 279
Authenticator: 0xB65E5719D35053C0189D18DF67A60183
Attribute value pairs
t:Vendor Specific(26) l:193, Vendor:Cisco(9)
t:Acct Interim Interval(85) l:6, Value:1800
t:EAP Message(79) l:24
Extensible Authentication Protocol
Code: Request (1)
Id: 140
Length: 22
Type: MD5-Challenge [RFC3748] (4)
Value-Size: 16
Value: ACBC9E2DF2140A42E986511C899D3D12
t:Message Authenticator(80) l:18,
Value:C58FD6905CF891777D29E251FE93E80B
t:State(24) l:18, Value:8F403AADDCF9C9CA88B32A87589BEE54

Frame 70 (248 bytes on wire, 248 bytes captured)
Linux cooked capture
Internet Protocol, Src Addr: 62.12.37.95 (62.12.37.95), Dst Addr:
61.107.119.129 (61.107.119.129)
User Datagram Protocol, Src Port: 32771 (32771), Dst Port: radius (1812)
Radius Protocol
Code: Access Request (1)
Packet identifier: 0x6f (111)
Length: 204
Authenticator: 0xE889CB8BDEAAAC447E9471EC44DCBB80
Attribute value pairs
t:User Name(1) l:9, Value:"metzler"
t:NAS IP Address(4) l:6, Value:62.12.37.95
t:NAS identifier(32) l:15, Value:"some_nas"
t:NAS Port(5) l:6, Value:40
t:Called Station Id(30) l:33, Value:"00-02-6F-07-C3-45:some_nas"
t:Calling Station Id(31) l:19, Value:"00-02-6F-07-7E-2E"
t:Framed MTU(12) l:6, Value:1400
t:NAS Port Type(61) l:6, Value:Wireless IEEE 802.11(19)
t:Connect Info(77) l:24, Value:"CONNECT 11Mbps 802.11b"
t:EAP Message(79) l:24
Extensible Authentication Protocol
Code: Response (2)
Id: 139
Length: 22
Type: MD5-Challenge [RFC3748] (4)
Value-Size: 16
Value: 059DD3270C74D9097277BE0EE5145F5F
t:State(24) l:18, Value:8F403AADDCF9C9CA88B32A87589BEE54
t:Message Authenticator(80) l:18,
Value:1636B50009207281FE441CAF659990F3

Frame 73 (257 bytes on wire, 257 bytes captured)
Linux cooked capture
Internet Protocol, Src Addr: 61.107.119.129 (61.107.119.129), Dst Addr:
62.12.37.95 (62.12.37.95)
User Datagram Protocol, Src Port: radius (1812), Dst Port: 32771 (32771)
Radius Protocol
Code: Access Reject (3)
Packet identifier: 0x6f (111)
Length: 213
Authenticator: 0x09D75A74F1ECF6294F1429EC1F80BF96
Attribute value pairs
t:Vendor Specific(26) l:193, Vendor:Cisco(9)
Jouni Malinen
2005-10-19 00:53:36 UTC
Permalink
Post by Ahmet Basagalar
We have recently upgraded to hostapd ver. 3.9 from ver 2.6. But there seems
to be a bug regarding Radius authentication. Randomly some of our users can
not authenticate although they are valid. When I check radius logs I see the
Everything is normal up to Frame 65 where RADIUS sends MD5 challenge, but
hostapd again responds with a identity instead of challenge response, so
RADIUS sends another challenge for this new identity, then probably hostapd
responds to this challenge with the old EAP id I believe.
I will try to look into the code, but if somebody resolved this before, that
will be helpful. This happens randomly as I mentioned before.
How often does this happen? In other words, is this easy to reproduce?
If you can easily get a debug log from hostapd showing one of these
cases, it could provide useful information for finding out the details
of what happened with the client authentication. In many cases, there
will be two EAP-Request/Identity frames (one triggered by association
another by EAPOL-Start from client). It would be good to find out why
one of the EAP-Response/Identity packets does not get filtered out,
though.
--
Jouni Malinen PGP id EFC895FA
Ahmet Basagalar
2005-10-19 14:46:53 UTC
Permalink
This really happens very rarely, and I was not able to reproduce it so far.
Usually the first authentication goes in ok, but later during a reauth the
thing locks up. Interestingly in some occasions, after a couple of hours,
the problem goes away and the user authenticates again. Of course if you
restart hostapd, the problem goes away immediately.

Below are a couple of hostapd state dumps.

I will also try to get the debug logs and send it to the list.

Thanks,
Ahmet

Hostapd dumps
-----------------------------------
STA=00:02:6f:07:cf:50
AID=9 flags=0x3 [AUTH][ASSOC]
capability=0x1 listen_interval=0
supported_rates=82 84 8b 96 [1M][2M][5.5M][11M]
timeout_next=NULLFUNC POLL
IEEE 802.1X:
identity=bquintana
cached_packets=[RX RADIUS][EAP SUPPLICANT]
EAPOL state machine:
aWhile=8 quietWhile=38 reAuthWhen=3600
authAbort=FALSE authFail=TRUE authPortStatus=Unauthorized
authStart=FALSE
authTimeout=FALSE authSuccess=FALSE eapFail=TRUE eapolEap=TRUE
eapSuccess=FALSE eapTimeout=FALSE initialize=FALSE keyAvailable=FALSE
keyDone=FALSE keyRun=FALSE keyTxEnabled=FALSE portControl=Auto
portEnabled=TRUE portValid=TRUE reAuthenticate=FALSE
Authenticator PAE:
state=HELD
eapolLogoff=FALSE eapolStart=FALSE eapRestart=FALSE
portMode=Auto reAuthCount=2
quietPeriod=60 reAuthMax=2
authEntersConnecting=386
authEapLogoffsWhileConnecting=0
authEntersAuthenticating=257
authAuthSuccessesWhileAuthenticating=0
authAuthTimeoutsWhileAuthenticating=0
authAuthFailWhileAuthenticating=258
authAuthEapStartsWhileAuthenticating=0
authAuthEapLogoffWhileAuthenticating=0
authAuthReauthsWhileAuthenticated=0
authAuthEapStartsWhileAuthenticated=0
authAuthEapLogoffWhileAuthenticated=0
Backend Authentication:
state=IDLE
eapNoReq=FALSE eapReq=FALSE eapResp=TRUE
serverTimeout=30
backendResponses=516
backendAccessChallenges=258
backendOtherRequestsToSupplicant=516
backendAuthSuccesses=0
backendAuthFails=258
Reauthentication Timer:
state=INITIALIZE
reAuthPeriod=3600 reAuthEnabled=TRUE
Authenticator Key Transmit:
state=NO_KEY_TRANSMIT
Key Receive:
state=NO_KEY_RECEIVE
rxKey=FALSE
Controlled Directions:
state=FORCE_BOTH
adminControlledDirections=Both operControlledDirections=Both
operEdge=FALSE

STA=00:02:6f:07:a0:66
AID=12 flags=0x3 [AUTH][ASSOC]
capability=0x1 listen_interval=0
supported_rates=82 84 8b 96 [1M][2M][5.5M][11M]
timeout_next=NULLFUNC POLL
IEEE 802.1X:
identity=echen
cached_packets=[RX RADIUS][EAP SUPPLICANT]
EAPOL state machine:
aWhile=0 quietWhile=9 reAuthWhen=3600
authAbort=FALSE authFail=TRUE authPortStatus=Unauthorized
authStart=FALSE
authTimeout=FALSE authSuccess=FALSE eapFail=TRUE eapolEap=TRUE
eapSuccess=FALSE eapTimeout=FALSE initialize=FALSE keyAvailable=FALSE
keyDone=FALSE keyRun=FALSE keyTxEnabled=FALSE portControl=Auto
portEnabled=TRUE portValid=TRUE reAuthenticate=FALSE
Authenticator PAE:
state=HELD
eapolLogoff=FALSE eapolStart=FALSE eapRestart=FALSE
portMode=Auto reAuthCount=1
quietPeriod=60 reAuthMax=2
authEntersConnecting=174
authEapLogoffsWhileConnecting=0
authEntersAuthenticating=120
authAuthSuccessesWhileAuthenticating=14
authAuthTimeoutsWhileAuthenticating=0
authAuthFailWhileAuthenticating=107
authAuthEapStartsWhileAuthenticating=0
authAuthEapLogoffWhileAuthenticating=0
authAuthReauthsWhileAuthenticated=14
authAuthEapStartsWhileAuthenticated=0
authAuthEapLogoffWhileAuthenticated=0
Backend Authentication:
state=IDLE
eapNoReq=FALSE eapReq=FALSE eapResp=TRUE
serverTimeout=30
backendResponses=241
backendAccessChallenges=120
backendOtherRequestsToSupplicant=241
backendAuthSuccesses=14
backendAuthFails=107
Reauthentication Timer:
state=INITIALIZE
reAuthPeriod=3600 reAuthEnabled=TRUE
Authenticator Key Transmit:
state=NO_KEY_TRANSMIT
Key Receive:
state=NO_KEY_RECEIVE
rxKey=FALSE
Controlled Directions:
state=FORCE_BOTH
adminControlledDirections=Both operControlledDirections=Both
operEdge=FALSE
Ahmet Basagalar
2005-10-20 22:35:53 UTC
Permalink
I think I have found the problem. In ieee802_1x.c, function
ieee802_1x_new_auth_session , eapolEap flag must set to False, otherwise in
some cases when there is a left over eap response from supplicant, it is
send as the response to Radius server (BE_AUTH goes to RESPONSE state
without getting the response from supplicant)

So I just added:

sm->eapolEap = FALSE;

at the end of this function. It seemed to fix the problem. Let me know your
comments...

Thanks,
Ahmet
Jouni Malinen
2005-10-21 05:09:37 UTC
Permalink
Post by Ahmet Basagalar
I think I have found the problem. In ieee802_1x.c, function
ieee802_1x_new_auth_session , eapolEap flag must set to False, otherwise in
some cases when there is a left over eap response from supplicant, it is
send as the response to Radius server (BE_AUTH goes to RESPONSE state
without getting the response from supplicant)
Thanks! This is very helpful information.
Post by Ahmet Basagalar
sm->eapolEap = FALSE;
at the end of this function. It seemed to fix the problem. Let me know your
comments...
I was trying to reproduce this kind of problems in my test setup and was
easily able to get into this state with 0.4.x branch since it has a
change that ends up allowing received EAP packets to be processed before
all possible EAPOL state machine steps have been completed. However,
this issue did not happen with 0.3.9 that you are using since it runs
EAPOL steps to completion before receiving the next frame.

Setting eapolEap to FALSE in ieee802_1x_new_auth_session() sounds
reasonable, but it is also looking like it could be just hiding the real
problem. I would like to understand what is triggering this behavior
with the 0.3.x-style of EAPOL state machine stepping that does not
allow, e.g., EAPOL-Start and EAP-Identity/Response to be processed while
going through possible steps from the previous packet.

It is possible that something else is not initialized in case of
EAPOL-Start processing. If the exact trigger mechanism for this issue
can be figured out (e.g., from debug log, if this can be easily
reproduced). If this seems to take too much time, I'm considering of
just applying this changes since it looks like something that should not
cause problems (apart from maybe hiding the real bug in a way that makes
it even more difficult to find in the future, should there be a need to
really fix it).
--
Jouni Malinen PGP id EFC895FA
Ahmet Basagalar
2005-10-21 21:40:06 UTC
Permalink
I had a very huge debug log, but it is hard to find the exact place where it
gets triggered. Following is an interesting entry in the debug log

IEEE 802.1X: station 00:02:6f:07:c4:dc - new auth session, clearing State
IEEE 802.1X: Generated EAP Request-Identity for 00:02:6f:07:c4:dc
(identifier 33, timeout 30)
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state CONNECTING
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state DISCONNECTED
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state RESTART
IEEE 802.1X: station 00:02:6f:07:c4:dc - new auth session, clearing State
IEEE 802.1X: Generated EAP Request-Identity for 00:02:6f:07:c4:dc
(identifier 34, timeout 30)
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state CONNECTING
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state AUTHENTICATING
IEEE 802.1X: 00:02:6f:07:c4:dc BE_AUTH entering state REQUEST
IEEE 802.1X: Sending EAP Packet to 00:02:6f:07:c4:dc (identifier 34)
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc BE_AUTH entering state RESPONSE
Encapsulating EAP message into a RADIUS packet
Sending RADIUS message to authentication server
RADIUS message: code=1 (Access-Request) identifier=164 length=176

As you can see the auth session was restarted twice. I do not know if this
gives any clue.

My only concern for adding that modification was the fact that it may
trigger something else somewhere like a memory leak, but from what you said
it seems like it should not. To recreate it, you may try setting reauth
period to a very low value like 10 secs and associate as much clients as
possible. That was how I was able to reproduce it.

Thanks,
Ahmet
Jouni Malinen
2005-10-24 00:29:21 UTC
Permalink
Post by Ahmet Basagalar
I had a very huge debug log, but it is hard to find the exact place where it
gets triggered. Following is an interesting entry in the debug log
Would it be possible for you to send bit more larger extract from the
debug log? Ideally, I would like to see what happens from the
association all the way to the point when the EAP id gets mixed up.
Post by Ahmet Basagalar
IEEE 802.1X: station 00:02:6f:07:c4:dc - new auth session, clearing State
IEEE 802.1X: Generated EAP Request-Identity for 00:02:6f:07:c4:dc
(identifier 33, timeout 30)
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state CONNECTING
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state DISCONNECTED
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state RESTART
IEEE 802.1X: station 00:02:6f:07:c4:dc - new auth session, clearing State
IEEE 802.1X: Generated EAP Request-Identity for 00:02:6f:07:c4:dc
(identifier 34, timeout 30)
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state CONNECTING
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc AUTH_PAE entering state AUTHENTICATING
IEEE 802.1X: 00:02:6f:07:c4:dc BE_AUTH entering state REQUEST
IEEE 802.1X: Sending EAP Packet to 00:02:6f:07:c4:dc (identifier 34)
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc BE_AUTH entering state RESPONSE
Encapsulating EAP message into a RADIUS packet
Sending RADIUS message to authentication server
RADIUS message: code=1 (Access-Request) identifier=164 length=176
As you can see the auth session was restarted twice. I do not know if this
gives any clue.
Unfortunately, the debug log did not include lines showing what happened
before the first start of a new authentication session. In many cases,
this is because of an EAPOL-Start frame from the supplicant. However,
here AUTH_PAE state machine is moving from CONNECTING to DISCONNECTED
state which should only happen if reAuthCount > reAuthMax, i.e.,
re-authentication has been tried too many times without success.
Post by Ahmet Basagalar
My only concern for adding that modification was the fact that it may
trigger something else somewhere like a memory leak, but from what you said
it seems like it should not. To recreate it, you may try setting reauth
period to a very low value like 10 secs and associate as much clients as
possible. That was how I was able to reproduce it.
I don't expect it to lead to a memory leak, but I'm not sure whether
this change would resolve all cases that could trigger this kind of
behavior.

hostapd 0.4.x versions have a race condition that can trigger similar
problem if the connections between authentication server, authenticator,
and supplicant are very fast (e.g., in my tests that use emulated
network connection by just passing frames locally within one host).
However, after resolving this, I have not been able to reproduce this
issue. In addition to the latest development version, I tested with
0.3.x branch using 100 stations and five second EAPOL reauthentication
interval. Just having lots of stations and reauthentications does not
seem to be enough in my test setup; there needs to be different latency
between events etc. to trigger this.
--
Jouni Malinen PGP id EFC895FA
Jouni Malinen
2005-10-24 01:20:52 UTC
Permalink
Post by Ahmet Basagalar
IEEE 802.1X: 00:02:6f:07:c4:dc BE_AUTH entering state REQUEST
IEEE 802.1X: Sending EAP Packet to 00:02:6f:07:c4:dc (identifier 34)
IEEE 802.1X: 00:02:6f:07:c4:dc REAUTH_TIMER entering state INITIALIZE
IEEE 802.1X: 00:02:6f:07:c4:dc BE_AUTH entering state RESPONSE
Encapsulating EAP message into a RADIUS packet
I was able to reproduce similar case by forcing the supplicant to send
out all EAP packets twice. This transition from BE_AUTH REQUEST to
RESPONSE without receiving anything from the supplicant seems to be the
main problem. It means that there was a pending EAP-Response from the
station (likely EAP-Response/Identity) and the authenticator is just
sending it out when receiving a new request from the authentication
server.

I did not find any clear place in the state machine that would clear
eapolEap in cases where an extra EAP packet is received unexpectedly.
After going through the state machines for a while, it looks like the
best way to resolve this would be to set eapolEap to FALSE in
BE_AUTH::REQUEST. This is quite close to what you did in the function
generation EAP-Request/Identity, but actually done for all EAP packets
since this is by no means limited to just EAP-Request/Identity.

It would still be interesting to find out where this happened in real
life since my test case is done be forcing the supplicant to do odd
things.. If there is a clear scenario for this happening during normal
operations, it would be worthwhile to update the IEEE 802.1X standard to
take this into account.
--
Jouni Malinen PGP id EFC895FA
Loading...