This post was contributed by Frans Rosen, Bug Bounty Hunter and Knowledge Advisor at Detectify
TLDR: Sometimes you just need to spend a couple of months to exploit an XSS with a hygiene product.
For a couple of months this specific bug was on my "check later" list. I later reported it to the company running a private bug bounty. I had been messing with it back and forth and was never able to do something that actually made sense, and as soon as I made some progress, a new obstacle came crashing into my face. After a few months of returning to the same endpoint, I was finally able to create a PoC to show that a security issue was present.
It's a freaking XSS, but hey, the story is what counts, right..? :)
From the start I noticed that the company had a specific OEmbed endpoint that looked like this:
I noticed it was returning structured data as JSON so I googled for any sort of OEmbed APIs out there. I saw this:
Interesting. Let's read the Docs and see if this specific API endpoint behaves like Embed.ly. In the Docs there was an indication that you were able to change format into XML instead:
So I tried that out:
Well, we got XML back, so the format-parameter is working. Yeah, an error, but this must mean that the company has actually built their own API using Embed.ly in the back. If I could find some way to use Embed.ly that the company didn't know of, I could actually have something interesting here.
Back again to the documentation. The first thing I noticed was that only a specific set of URLs were allowed to be "oembed:ed" by Embed.ly. What Embed.ly basically does is enable OEmbed for a bunch of URLs, even if the site behind the URL hasn't provided it itself.
Embed.ly has a whitelist. This list contains regular expressions for what domains are validated as proper providers. One of these lists (there are some different versions out there) is located here: api.embed.ly/1/services
If one of the regular expressions in this JSON matches the URL being provided, the response will contain metadata about the content of the URL. The interesting part here is that Embed.ly will not actually resolve the URL that matched, but instead the URL which was connected to the regular expression inside Embed.ly. This is good. The problem though is when a certain regex is a bit too open. Take a look at Amazon's:
Woah. This basically means any domain that contains the word "amazon." with a path that contains "/dp/", which was great news for me...
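To make the "too open" part concrete, here is a minimal sketch. The actual Amazon entry from the services list is not reproduced in this post, so the pattern below is a hypothetical stand-in with the shape described above (something containing "amazon." followed later by "/dp/"), and the example hosts are made up.

```python
import re

# Hypothetical stand-in for an overly permissive provider pattern.
# This is NOT the actual entry from Embed.ly's services list.
LOOSE_AMAZON = re.compile(r"^https?://.*amazon\..*/dp/.*$")


def is_whitelisted(url: str) -> bool:
    """Return True if the URL matches the (hypothetical) provider pattern."""
    return LOOSE_AMAZON.match(url) is not None


# A genuine-looking product URL matches, as intended.
print(is_whitelisted("https://www.amazon.com/Some-Product/dp/B000000000"))  # True
# So does any host that merely contains "amazon." somewhere.
print(is_whitelisted("http://amazon.attacker.example/x/dp/1"))  # True
# A URL without "amazon." and "/dp/" does not.
print(is_whitelisted("http://example.org/index.html"))  # False
```

The point is only that a pattern built from `.*` wildcards validates the shape of a string, not who owns the domain.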
We now get an XML, and behold, we actually escaped from the anchor tag:
It returned this when viewing the response using
And when trying to watch the rendered page, I see this:
You see, all spaces are being converted to %20. And remember – we're in XML-context – we need attributes to create a namespace to enable HTML in it. The XML specification is much more strict in terms of valid nodes/attributes and namespaces. It needs a real freaking space before an attribute starts – no exceptions – and that sucks.
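The strictness is easy to reproduce with any XML parser. The sketch below uses Python's standard library and a made-up anchor tag (not the actual response from the endpoint) to show that an attribute only parses when it is preceded by real whitespace; a literal %20 or no separator at all makes the document not well-formed.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Illustrative tags only; the real response body is not shown in this post.
with_space = f'<a href="x" xmlns="{XHTML_NS}">link</a>'
with_percent20 = f'<a href="x"%20xmlns="{XHTML_NS}">link</a>'
without_space = f'<a href="x"xmlns="{XHTML_NS}">link</a>'

print(is_well_formed(with_space))      # True
print(is_well_formed(with_percent20))  # False
print(is_well_formed(without_space))   # False
```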
Here I went into mental space mode, since by trying the following URL:
I got this back:
No luck whatsoever, and I just could. not. get. any. space. in. there...
I saved the URL to my check-later.txt and thought I should probably try to figure something out another day...
2 months later on a Friday night I open my check-later.txt and find the URL which got me thinking again...
Remember this regular expression list with the services?
.* in a regular expression actually also includes spaces. And since Embed.ly was not resolving the URL using the pattern you provided, the domain doesn't necessarily need to exist. What if we instead inserted our spaces there? Would that make a difference compared to when we tried putting the spaces in the URL path?
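A quick way to convince yourself of this is sketched below. Both patterns are hypothetical (the real list entry is not shown here): one uses `.*` like the description above, the other uses `\S*`, which refuses whitespace, purely for contrast.

```python
import re

# Hypothetical patterns for illustration; neither is taken from Embed.ly.
LOOSE = re.compile(r"^https?://.*amazon\..*/dp/.*$")    # "." also matches a space
NO_WS = re.compile(r"^https?://\S*amazon\.\S*/dp/\S*$")  # "\S" rejects whitespace

# A made-up URL with a literal space inside the "domain" part.
url_with_space_in_host = "http://x amazon.example/dp/1"

print(LOOSE.match(url_with_space_in_host) is not None)  # True
print(NO_WS.match(url_with_space_in_host) is not None)  # False
```

In Python, `.` matches every character except a newline by default, so a space in the host part sails straight through the loose pattern.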
I crafted a URL to try it, putting a space inside the domain:
And holy shizzle dizzle, what do I see:
I CAN actually get spaces in there! Wow, this is actually getting somewhere. Breakthrough!
Now, let's craft a URL that utilizes the ability to make spaces in the domain part, enable the namespace using a proper xmlns-attribute on the anchor tag, then put the actual XSS-payload in the end of the URL since no more spaces are needed.
Suddenly, CloudFlare gets involved...
I say to myself:
I'll deal with that later, let's move on.
Couldn't embed content.
Damn. Okay, what if I escape the special characters in the domain, hell, even double escape them? And then also close the already open href-attribute, and let's start a new attribute after our namespace, to make the XML look nice and clean.
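For readers unfamiliar with double escaping, here is a small sketch of the idea using Python's standard library. Which layer of the real chain decoded which pass is not documented in this post, so treat the two decoding steps as an assumption; the sample characters are just the ones needed to close an attribute and a tag.

```python
from urllib.parse import quote, unquote


def double_encode(text: str) -> str:
    """Percent-encode twice, so one decoding pass still leaves encoded text."""
    # safe="" makes quote() encode every reserved character, including "/".
    return quote(quote(text, safe=""), safe="")


raw = '"><'
once = quote(raw, safe="")
twice = double_encode(raw)

print(once)            # %22%3E%3C
print(twice)           # %2522%253E%253C
print(unquote(twice))  # %22%3E%3C  (first decoding pass)
print(unquote(unquote(twice)) == raw)  # True (second pass restores the input)
```

Each decoding pass peels off exactly one layer, which is why a value can look harmless to the first component that decodes it and still turn into special characters further down the line.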
Payload inside the url parameter then turns into:
I SEE A BRAUN SHAVER! WINNING!
We now get real HTML using this payload, some XML-errors at the top, but who cares! We can now try our best to put a payload in there. Getting closer...
Now, hello Sir CloudFlare. I've been expecting you.
After some time it strikes me, remember those %09 being removed? What if we could bypass the CloudFlare WAF by using any of these characters, which are then removed by the API before being sent to Embed.ly? Almost like those spaces never existed in the first place. Could that work?
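The idea can be sketched with two toy functions. Everything here is hypothetical: `waf_allows` is a deliberately naive filter and not CloudFlare's actual rule set, `backend_normalize` only models the assumed behaviour of the API stripping tabs and newlines, and the payload is an illustration rather than the one from the report.

```python
import re

# Hypothetical, deliberately naive filter; NOT CloudFlare's real logic.
BLOCKED = re.compile(r"<script", re.IGNORECASE)


def waf_allows(value: str) -> bool:
    """Return True if the toy filter sees nothing suspicious."""
    return BLOCKED.search(value) is None


def backend_normalize(value: str) -> str:
    """Assumed backend behaviour: tabs and newlines are dropped before forwarding."""
    return value.replace("\t", "").replace("\n", "")


# Illustrative payload with a newline splitting each tag name.
payload = "<scr\nipt>alert(1)</scr\nipt>"

print(waf_allows(payload))                     # True: the filter sees "<scr\nipt"
print(backend_normalize(payload))              # <script>alert(1)</script>
print(waf_allows(backend_normalize(payload)))  # False: too late, it already passed
```

The mismatch is what matters: the filter inspects one string, and a later component quietly turns it into a different one.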
I craft the following payload on a Friday night at 02:55 AM, 4 months after I found this endpoint in the first place:
BOOM! IT WORKS! CloudFlare's WAF was easily bypassed just by inserting a new-line inside both script-tags:
Friday night 03:03 AM I send my report and go to bed happy. At last.
The company got back to me and paid me a good reward for my 8000-character-long report for a little XSS. They also told me that CloudFlare had been notified about and had fixed the WAF bypass, and that the endpoint now only allows URLs with allowed domains.
I hope you enjoyed this journey as much as I did. Sometimes, even an XSS needs to bide its time.