0x230 - Time-Based Email Enumeration
This week I reviewed several AI-generated pentest reports.
The client had set up an AI agent to hack their app, configured it with credentials, and let it do its job.
While this produced ~10-12 reports, the work didn't end there.
They needed someone to look through all these reports and:
Reproduce the findings to confirm they are real (some were actual hallucinations with made-up HTTP responses)
Verify and decide how critical the findings actually are (a few of the "critical" ones were more like "medium")
However, one vulnerability in particular caught my attention.
A vulnerability that I (and probably many pentesters) would have overlooked.
A time-based email enumeration.
Common Email Enumeration
Common places for email enumeration include:
Login
Register
Password Reset
You try one of these functions with a known (already registered) email.
You check the HTTP response.
Then you try it again with an invalid (non-existing) email.
You look for differences between the two responses: Email already exists vs. Email not found.
Based on this difference you can infer if a user already has an account on the app or not.
However, in this case all responses were generic, e.g.: If the email is registered, an email will be sent.
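The response comparison described above can be sketched roughly as follows. This is only an illustration: the `Response` type, the `diff_responses` helper, and the status codes are assumptions made for the example, not details from the tested app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """Minimal stand-in for an HTTP response (hypothetical type)."""
    status: int
    body: str


def diff_responses(known: Response, unknown: Response) -> list[str]:
    """Return the names of the fields that differ between the two responses."""
    differences = []
    if known.status != unknown.status:
        differences.append("status")
    if known.body.strip() != unknown.body.strip():
        differences.append("body")
    return differences


# An endpoint that leaks: the message depends on whether the email exists.
# The status codes here are assumed for illustration.
leaky = diff_responses(
    Response(409, "Email already exists"),
    Response(404, "Email not found"),
)

# An endpoint with a generic message: nothing to tell the two cases apart.
generic_message = "If the email is registered, an email will be sent."
generic = diff_responses(
    Response(200, generic_message),
    Response(200, generic_message),
)

print(leaky)    # ['status', 'body']
print(generic)  # []
```

An empty list is where most checks stop, which is exactly the situation in this case.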
Time-Based Email Enumeration
What surprised me in this case was that the AI was still able to enumerate emails using the differences in the server's response time.
The technique is nothing new.
You can read more about it here.
But many pentesters would probably have overlooked it and moved on, concluding that the app is not vulnerable.
However, the AI caught it.
How it worked:
The AI tested the Login form with an existing account and invalid password
Got Incorrect email or password
Then checked the Login form again with a non-existing account
Got Incorrect email or password again
Tested both scenarios ~5 times
Compared the HTTP response time
Noticed that for an existing account with an invalid password -> 500 milliseconds
Noticed that for a non-existing account -> 200 milliseconds
Concluded that this can be used for enumeration
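The steps above could be sketched like this. It is a minimal sketch under assumptions, not the AI agent's actual method: `median_latency`, `timing_leak`, the `min_gap_s` threshold, and the `fake_login` stand-in (which simulates the server with `time.sleep` instead of sending real requests) are all hypothetical names introduced for the example.

```python
import statistics
import time
from typing import Callable


def median_latency(send: Callable[[], object], attempts: int = 5) -> float:
    """Call `send` several times and return the median duration in seconds."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        send()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def timing_leak(existing_s: float, missing_s: float, min_gap_s: float = 0.1) -> bool:
    """True if the existing-account case is slower by at least `min_gap_s`."""
    return existing_s - missing_s >= min_gap_s


# Hypothetical stand-in for the login endpoint. In an authorized test,
# `send` would wrap a real HTTP request to the Login form instead.
REGISTERED = {"alice@example.com"}


def fake_login(email: str) -> str:
    # Simulated delays (assumed values): slower when the account exists.
    time.sleep(0.10 if email in REGISTERED else 0.02)
    return "Incorrect email or password"  # same message in both cases


existing = median_latency(lambda: fake_login("alice@example.com"))
missing = median_latency(lambda: fake_login("nobody@example.com"))
print(f"existing: {existing:.3f}s, missing: {missing:.3f}s")
print("timing leak:", timing_leak(existing, missing, min_gap_s=0.04))
```

The median is used rather than the mean so that a single slow request (network jitter, a cold cache) does not skew the comparison. With the numbers from the report, `timing_leak(0.5, 0.2)` is true at the default 100 ms threshold.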
The difference was minimal. Easy to overlook for a human.
But when I tried to reproduce the finding -> it was indeed consistent across multiple checks.
Every time an existing account was checked -> the server took longer to reply.