The people configuring WAF rules at CDNs tend to do a poor job understanding sites and services that discuss technical content. It's not just Cloudflare, Akamai has the same problem.
If your site discusses databases then turning on the default SQL injection attack prevention rules will break your site. And there is another ruleset for file inclusion where things like /etc/hosts and /etc/passwd get blocked.
I disagree with other posts here, it is partially a balance between security and usability. You never know what service was implemented with possible security exploits and being able to throw every WAF rule on top of your service does keep it more secure. It's just that those same rulesets are super annoying when you have a securely implemented service which needs to discuss technical concepts.
Fine-tuning the rules is time consuming. You often have to just completely turn off the ruleset, because when you try to keep the ruleset on and allow the use-case there are a ton of changes you need to get implemented (if it's even possible). Page won't load because /etc/hosts was in a query param? Okay, now that you've fixed that, all the XHR-included resources won't load because /etc/hosts is included in the referrer. Now that that's fixed, things still won't work because some random JS analytics lib put the URL visited in a cookie, etc, etc... There is a temptation to just turn the rules off.
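To make the failure mode concrete, here is a minimal sketch (Python, with the rule list and request fields invented for the example, not any vendor's actual logic) of how a single blanket substring rule keeps resurfacing in different parts of a perfectly legitimate request:

    BLOCKED_SUBSTRINGS = ["/etc/hosts", "/etc/passwd"]

    def waf_hits(request: dict) -> list:
        """Return the request parts a naive substring rule would flag."""
        return [part for part, value in request.items()
                if any(s in value for s in BLOCKED_SUBSTRINGS)]

    # A legitimate request from a blog that merely discusses sysadmin topics:
    request = {
        "query_string": "q=/etc/hosts",
        "referer": "https://example.com/post/etc/hosts-explained",
        "cookie": "last_url=/post/etc/hosts-explained; theme=dark",
        "body": '{"title": "How /etc/hosts works"}',
    }

    print(waf_hits(request))
    # ['query_string', 'referer', 'cookie', 'body']  # fix one and the next trips the same rule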
mjr003 days ago
> I disagree with other posts here, it is partially a balance between security and usability.
And economics. Many people here are blaming incompetent security teams and app developers, but a lot of seemingly dumb security policies are due to insurers. If an insurer says "we're going to jack up premiums by 20% unless you force employees to change their password once every 90 days", you can argue till you're blue in the face that it's bad practice, NIST changed its policy to recommend not regularly rotating passwords over a decade ago, etc., and be totally correct... but they're still going to jack up premiums if you don't do it. So you dejectedly sigh, implement a password expiration policy, and listen to grumbling employees who call you incompetent.
It's been a while since I've been through a process like this, but given how infamous log4shell became, it wouldn't surprise me if insurers are now also making it mandatory that common "hacking strings" like /etc/hosts, /etc/passwd, jndi:, and friends must be rejected by servers.
swiftcoder3 days ago
Not just economics, audit processes also really encourage adopting large rulesets wholesale.
We're SOC2 + HIPAA compliant, which either means convincing the auditor that our in-house security rules cover 100% of the cases they care about... or we buy an off-the-shelf WAF that has already completed the compliance process, and call it a day. The CTO is going to pick the second option every time.
simonw3 days ago
I wish IT teams would say "sorry about the password requirement, it's required by our insurance policy". I'd feel a lot less angry about stupid password expiration rules if they told me that.
betaby3 days ago
> but a lot of seemingly dumb security policies are due to insurers.
I keep hearing that often on HN, however I've personally never seen such demands from insurers.
I would greatly appreciate it if someone could share such an insurance policy.
Insurance policies are not trade secrets and are OK to be public. I can google plenty of commercial car insurance policies, for example.
lucianbr3 days ago
There should be some limits and some consequences to the insurer as well. I don't think the insurer is god and should be able to request anything no matter if it makes sense or not and have people and companies comply.
If anything, I think this attitude is part of the problem. Management, IT security, insurers, governing bodies, they all just impose rules with (sometimes, too often) zero regard for consequences to anyone else. If no pushback mechanism exists against insurer requirements, something is broken.
wvh2 days ago
Having worked with PCI-DSS, some rules seem to only exist to appease insurance. When criticising decisions, you are told that passing audits to be able to claim insurance is the whole game, even when you can demonstrate how you can bypass certain rules in reality. High-level security has more to do with politics (my definition) than purely technical ability. I wouldn't go as far as to call it security theatre; there's too much good stuff there that many don't think about without having a handy list, but the game is certainly a lot bigger than just technical skills and hacker vs anti-hacker.
I still have a nervous tic from having a screen lock timeout "smaller than or equal to 30 seconds".
II2II3 days ago
> If an insurer says "we're going to jack up premiums by 20% unless you force employees to change their password once every 90 days", you can argue till you're blue in the face that it's bad practice, NIST changed its policy to recommend not regularly rotating passwords over a decade ago, etc., and be totally correct... but they're still going to jack up premiums if you don't do it.
I would argue that password policies are very context dependent. As much as I detest changing my password every 90 days, I've worked in places where the culture encouraged password sharing. That sharing creates a whole slew of problems. On top of that, removing the requirement to change passwords every 90 days would encourage very few people to select secure passwords, mostly because they prefer convenience and do not understand the risks.
If you are dealing with an externally facing service where people are willing to choose secure passwords and unwilling to share them, I would agree that regularly changing passwords creates more problems than it solves.
620gelato3 days ago
> jack up premiums by 20% unless you force employees to change their password once every 90 days"
It always made me judge my company's security team for enabling this stupidity. Thankfully they got rid of it gradually, nearly 2 years ago now (90 days to 365 days to never). New passwords were just the old one shifted one key left/right/up/down on the keyboard.
Now I'm thinking maybe this is why the app for a govt savings scheme in my country won't allow password autofill at all. Imagine expecting a new password every 90 days and not allowing auto fill - that just makes passwords worse.
smeg_it3 days ago
I'm no expert, but I did take a CISSP course a while ago. One thing I actually remember ;P is that it recommended long passwords in lieu of the number, special character, upper, lower ... requirements. I don't remember the exact wording of course, and maybe it did recommend some of that, but it talked about having a sentence rather than all that mess in 6-8 characters. Many sites still want the short mess that I will never actually remember.
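For what it's worth, a rough back-of-the-envelope comparison of why length tends to beat complexity (assuming characters and words are picked uniformly at random, which real users rarely do, so these are upper bounds):

    import math

    charset = 26 + 26 + 10 + 32                  # upper, lower, digits, common symbols
    complex_8 = 8 * math.log2(charset)           # short "complex" password
    passphrase_5 = 5 * math.log2(7776)           # five words from a diceware-style list

    print(f"8-char complex password: ~{complex_8:.0f} bits")    # ~52 bits
    print(f"5-word passphrase:       ~{passphrase_5:.0f} bits")  # ~65 bits

The sentence-style password comes out stronger and is far easier to remember.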
afiori3 days ago
I believe that these kinds of decisions are mostly downstream of security audits/consultants with varying levels of up-to-date slideshows.
I believe that this is overall a reasonable approach for companies that are bigger than "the CEO knows everyone and trusted executives are also senior IT/dev/tech experts" and smaller than "we can spin up an internal security audit using in-house resources".
josephcsible3 days ago
Why wouldn't the IT people just tell the grumbling employees that exact explanation?
paxys3 days ago
"You never know..." is the worst form of security, and makes systems less secure overall. Passwords must be changed every month, just to be safe. They must be 20 alphanumeric characters (with 5 symbols of course), just to be safe. We must pass every 3-letter compliance standard with hundreds of pages of checklists for each. The server must have WAF enabled, because one of the checklists says so.
Ask the CIO what actual threat all this is preventing, and you'll get blank stares.
As an engineer what incentive is there to put effort into knowing where each form input goes and how to sanitize it in a way that makes sense? You are getting paid to check the box and move on, and every new hire quickly realizes that. Organizations like these aren't focused on improving security, they are focused on covering their ass after the breach happens.
ryandrake3 days ago
This looks like a variation of the Scunthorpe problem[1], where a filter is applied too naively, aggressively, and in this case, to the wrong content altogether. Applying the filter to "other stuff" sent to and among the servers might make sense, but there doesn't seem to be any security benefit to filtering actual text payload that's only going to be displayed as blog content. This seems like a pretty cut and dried bug to me.
I don't get why you'd have SQL injection filtering of input fields at the CDN level. Or any validation of input fields aside from length or maybe some simple type validation (number, date, etc). Your backend should be able to handle arbitrary byte content in input fields. Your backend shouldn't be vulnerable to SQL injection if not for a CDN layer that's doing pre-filtering.
1: https://en.wikipedia.org/wiki/Scunthorpe_problem
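For illustration, a minimal sketch (Python's built-in sqlite3 with an invented table; not Substack's actual code) of why a correctly written backend doesn't need the CDN to pre-filter hostile-looking strings:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")

    body = "How /etc/hosts works'; DROP TABLE posts; --"

    # The driver sends the value separately from the SQL text, so the content
    # is stored verbatim and never interpreted as SQL.
    conn.execute("INSERT INTO posts (body) VALUES (?)", (body,))
    print(conn.execute("SELECT body FROM posts").fetchone()[0])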
stingraycharles3 days ago
Yup. We're a database company that needs to be compliant with SOC2, and I've had extremely long and tiring arguments with our auditor about why we couldn't adhere to some of these standard WAF rulesets because they broke our site (we allow people to spin up a demo env and trigger queries).
We changed auditors after that.
kiitos3 days ago
> I disagree with other posts here, it is partially a balance between security and usability. You never know what service was implemented with possible security exploits and being able to throw every WAF rule on top of your service does keep it more secure. It's just that those same rulesets are super annoying when you have a securely implemented service which needs to discuss technical concepts.
I might be out of the loop here, but it seems to me that any WAF that's triggered when the string "/etc/hosts" is literally anywhere in the content of a requested resource, is pretty obviously broken.
RKFADU_UOFCCLEL3 days ago
There's no "trade-off" here. Blocking IPs that send requests with "1337 h4x0r buzzword /etc/passwd" in them is completely naive and obtrusive, and that is the modus operandi of the CDN being discussed here. There are plenty of other ways of hosting a website.
julik3 days ago
This is what surprises me in this story. I could not, at first glance, assume that either Substack people or Cloudflare people were incompetent.
Oh: I resisted tooth and nail against turning on a WAF at one of my gigs (there was no strict requirement for it, just cargo cult). Turns out I was right.
coldpie3 days ago
> There is a temptation to just turn the rules off
Definitely, though I have seen other solutions, like inserting non-printable characters in the problematic strings (e.g. "/etc/ho<b></b>sts" or whatever, you get the idea). And honestly that seems like a reasonable, if somewhat annoying, workaround to me that still retains the protections.
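A sketch of that workaround (assuming a fixed list of strings known to trip the filter; the zero-width space is one possible invisible separator), with the caveat that readers who copy-paste the path get the invisible character too:

    ZWSP = "\u200b"  # zero-width space

    FLAGGED = ["/etc/hosts", "/etc/passwd", "/etc/ssh/sshd_config"]

    def defang(text: str) -> str:
        """Break up filter-tripping strings with an invisible character."""
        for s in FLAGGED:
            text = text.replace(s, s[:4] + ZWSP + s[4:])
        return text

    print(defang("Add an override to /etc/hosts before testing DNS."))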
lsofzz2 days ago
> The people configuring WAF rules at CDNs tend to do a poor job understanding sites and services that discuss technical content. It's not just Cloudflare, Akamai has the same problem.
I agree. There is a business opportunity here. Right in the middle of your sentences.
Hint: Context-Aware WAF.
Many platforms have emerged in the last decade - some called it smart WAF, some called it nextgen WAF... All vaporware garbage that consumes tons and tons of system resources and still manages to do a shit job of _actually_ WAF'ing web requests.
To be truly context-aware, you need to reason a priori about the situation - the user, the page, the interactions, etc.
oakwhiz3 days ago
I've had the issue where filling out form fields for some company website triggers a WAF and then nobody in the company is able to connect me to the responsible party who can fix the WAF rules. So I'm just stuck.
matt-p2 days ago
In my experience, the amount of false-positive pain required to outweigh "WAF is best practice" thinking is just very, very high. Most big businesses would rather lose/frustrate a small percentage of customers to be "safe".
chrisjj2 days ago
> The people configuring WAF rules at CDNs tend to do a poor job understanding sites and services that discuss technical content
They shouldn't be doing that job at all. The content of user data is none of their business.
_blk3 days ago
100!
[good] security just doesn't work as a mixing pattern...
I'm not saying it's necessarily bad to use those additional protections but they come with severe limitations so the total value (as in cost/benefit) is hard to gauge.
gfiorav3 days ago
I agree. From a product perspective, I would also support the decision. Should we make the rules more complex by default, potentially overlooking SQL injection vulnerabilities? Or should we blanket prohibit anything that even remotely resembles SQL, allowing those edge cases to figure it out?
I favor the latter approach. That group of Cloudflare users will understand the complexity of their use case (accepting SQL in payloads) and will be well-positioned to modify the default rules. They will know exactly where they want to allow SQL usage.
From Cloudflare’s perspective, it is virtually impossible to reliably cover every conceivable valid use of SQL, and it is likely 99% of websites won’t host SQL content.
netsharc3 days ago
Reminds me of an anecdote about an e-commerce platform: someone coded a leaky webshop, so their workaround was to watch if the string "OutOfMemoryException" shows up in the logs, and then restart the app.
Another developer in the team decided they wanted to log what customers searched for, so if someone typed in "OutOfMemoryException" in the search bar...
PhilipRoman3 days ago
Careless analysis of free-form text logs is an underrated way to exploit systems. It's scary how much software blindly logs data without out-of-band escaping or sanitizing.
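One common mitigation is to keep untrusted input out of band, so a log-watching script (like the OutOfMemoryException restart hack above) can't be triggered by user-supplied text. A minimal sketch with invented event/field names:

    import json, logging

    logging.basicConfig(format="%(message)s", level=logging.INFO)

    def log_event(event: str, **fields):
        # The event name is chosen by the code; everything user-controlled goes
        # into JSON-encoded fields, where it can't masquerade as an event name.
        logging.info("%s %s", event, json.dumps(fields))

    log_event("search.performed", query="OutOfMemoryException")
    # -> search.performed {"query": "OutOfMemoryException"}
    # A watchdog that matches on the event name (first token) is not fooled.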
skipants3 days ago
I've actually gone through this a few times with our WAF. A user got IP-banned because the WAF thought a note with the string "system(..." was PHP injection.
Y_Y3 days ago
Does it block `/etc//hosts` or `/etc/./hosts`? This is a ridiculous kind of whack-a-mole that's doomed to failure. The people who wrote these should realize that hackers are smarter and more determined than they are and you should only rely on proven security, like not executing untrusted input.
jrockway3 days ago
Yeah, and this seems like a common Fortune 500 mandatory checkbox. Gotta have a Web Application Firewall! Doesn't matter what the rules are, as long as there are a few. Once I was told I needed one to prevent SQL injection attacks... against an application that didn't use an SQL database.
If you push back you'll always get a lecture on "defense in depth", and then they really look at you like you're crazy when you suggest that it's more effective to get up, tap your desk once, and spin around in a circle three times every Thursday morning. I don't know... I do this every Thursday and I've never been hacked. Defense in depth, right? It can't hurt...
nickdothutton3 days ago
See "enumerating badness" as a losing strategy. I knew this was a bad idea about 5 minutes after starting my first job in 1995.
tom13373 days ago
Well I've just created an account on substack to test this but turns out they've already fixed the issue (or turned off their WAF completely)
augusto-moura3 days ago
How would that be hard? Getting the absolute path of a string is in almost every language's stdlib[1]. You can just grep for any string containing slashes, try to resolve them, and voilà.
Resolving wildcards is trickier but definitely possible if you have a list of forbidden files.
Edit: changed link because C's realpath has a slightly different behavior
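A minimal sketch of that normalisation idea (assuming a fixed deny-list of sensitive paths; it's still substring blocking with all the false-positive problems discussed elsewhere in the thread, just harder to bypass with trivial path variants):

    import posixpath

    FORBIDDEN = {"/etc/hosts", "/etc/passwd", "/etc/shadow"}

    def contains_forbidden_path(text: str) -> bool:
        for token in text.split():
            if "/" in token and posixpath.normpath(token) in FORBIDDEN:
                return True
        return False

    for sample in ["/etc/hosts", "/etc//hosts", "/etc/./hosts", "/etc/h0sts"]:
        print(sample, contains_forbidden_path(sample))
    # the first three are caught despite the different spellings; the last is not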
eli3 days ago
Is a security solution worthless if it can't stop a dedicated attacker? A lot of WAF rules are blocking probes from off-the-shelf vulnerability scanners.
mystifyingpoi3 days ago
No one expects any WAF to be a 100% solution that catches all exfiltration attempts ever, and it should not be treated this way. But having it is generally better than not having it.
simonw3 days ago
"How could Substack improve this situation for technical writers?"
How about this: don't run a dumb as rocks Web Application Firewall on an endpoint where people are editing articles that could be about any topic, including discussing the kind of strings that might trigger a dumb as rocks WAF.
This is like when forums about web development implement XSS filters that prevent their members from talking about XSS!
Learn to escape content properly instead.
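For anyone unfamiliar with the distinction: escaping happens at output time, for the context the text is rendered into, so the stored content can contain anything. A minimal sketch using Python's stdlib (the principle is the same in any templating system):

    import html

    post_body = 'Edit /etc/hosts and add <script>alert("x")</script> for fun.'
    print(html.escape(post_body))
    # Edit /etc/hosts and add &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt; for fun.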
awoimbee2 days ago
I'm in the position where I have to run a WAF to pass security certifications.
The only open source WAFs are ModSecurity and its beta successor, Coraza.
These things are dumb; they just use OWASP's Core Rule Set, which is a big pile of unreadable garbage.
serial_dev3 days ago
Surprisingly simple solution
ZeroTalent3 days ago
Hire a cybersec person. I don't think they have one.
blenderob3 days ago
> This case highlights an interesting tension in web security: the balance between protection and usability.
But it doesn't. This case highlights a bug, a stupid bug. This case highlights that people who should know better, don't!
The tension between security and usability is real, but this is not it. That tension is usually a tradeoff: when you implement good security, it inconveniences the user. From simple things like 2FA, to locking out the user after 3 failed attempts, to rate limiting to prevent DoS. It's a tradeoff. You increase security and degrade the user experience, or you decrease security and improve the user experience.
This is neither. This is both bad security and bad user experience. What's the tension?
myflash133 days ago
I would say it’s a useful security practice in general to apply WAF as a blanket rule to all endpoints and then remove it selectively when issues like this occur. It’s much, much, harder to evaluate every single public facing endpoint especially when hosting third party software like Wordpress with plugins.
crabbone2 days ago
Precisely.
This also reminded me: I think in the PHP 3 era, PHP used to "sanitize" the contents of URL requests to blanket-combat SQL injections - or perhaps it was a configuration setting that was frequently turned on by shared hosting services. This, of course, would've been very quickly discovered by PHP site authors, and various techniques were employed to circumvent the restriction, overall giving probably even worse outcomes than if the "sanitization" hadn't been there to begin with.
SonOfLilit3 days ago
After having been bitten once (was teaching a competitive programming team, half the class got a blank page when submitting solutions, after an hour of debugging I narrowed it down to a few C++ types and keywords that cause 403 if they appear in the code, all of which happen to have meaning in Javascript), and again (working for a bank, we had an API that you're supposed to submit a python file to, and most python files would result in 403 but short ones wouldn't... a few hours of debugging and I narrowed it down to a keyword that sometimes appears in the code) and then again a few months later (same thing, new cloud environment, few hours burned on debugging[1]), I had the solution to his problem in mind _immediately_ when I saw the words "network error".
[1] the second time it happened, a colleague made our deployment script print "HAHAHA YOU'VE BEEN WAFFED" whenever we got a 403, and for that I am forever thankful because I saw that error more times than I expected
simonw3 days ago
Do you remember if that was Cloudflare or some other likely WAF?
netsharc3 days ago
+++ATH
pimanrules3 days ago
We faced a similar issue in our application. Our internal Red Team was publishing data with XSS and other injection attack attempts. The attacks themselves didn't work, but the presence of these entries caused our internal admin page to stop loading because our corporate firewall was blocking the network requests with those payloads in them. So an unsuccessful XSS attack became an effective DoS attack instead.
darkwater3 days ago
This is funny and sad at the same time.
mrgoldenbrown3 days ago
Everything old is new again :) We used to call this the Scunthorpe problem.
I remember back in the old days on the Eve Online forums when the word cockpit would always turn up as "c***pit". I was quite amused by that.
greghendershott3 days ago
See also: Recent scrubbing US government web sites for words like "diversity", "equity", and "inclusion".
Writing about biology, finance, or geology? Shrug.
Dumb filtering is bad enough when used by smart people with good intent.
odirf2 days ago
It is time to add the Substack case to this Wikipedia article.
reverendsteveii3 days ago
"I wonder why it's called Scunthorpe....?"
sits quietly for a second
"Oh nnnnnnnooooooooooooooo lol!"
petercooper3 days ago
I ran into a similar issue with OpenRouter last night. OpenRouter is a “switchboard” style service that provides a single endpoint from which you can use many different LLMs. It’s great, but last night I started to try using it to see what models are good at processing raw HTML in various ways.
It turns out OpenRouter’s API is protected by Cloudflare and something about specific raw chunks of HTML and JavaScript in the POST request body cause it to block many, though not all, requests. Going direct to OpenAI or Anthropic with the same prompts is fine. I wouldn’t mind but these are billable requests to commercial models and not OpenRouter’s free models (which I expect to be heavily protected from abuse).
esafak3 days ago
Did you report it?
jmmv3 days ago
I encountered this a while ago and it was incredibly frustrating. The "Network error" prevented me from updating a post I had written for months because I couldn't figure out why my edits (which extended the length and which I assumed was the problem) couldn't get through.
Trying to contact support was difficult too due to AI chatbots, but when I finally did reach a human, their "tech support" obviously didn't bother to look at this in any reasonable timeframe.
It wasn't until some random person on Twitter suggested the possibility of some magic string tripping over some stupid security logic that I found the problem and could finally edit my post.
robertlagrant3 days ago
> This case highlights an interesting tension in web security: the balance between protection and usability.
This isn't a tension. This rule should not be applied at the WAF level. It doesn't know that this field is safe from $whatever injection attacks. But the substack backend does. Remove the rule from the WAF (and add it to the backend, where it belongs) and you are just as secure and much more usable. No tension.
myflash133 days ago
I would say it’s a decent security practice to apply WAF as a blanket rule to all endpoints and then remove it selectively when issues like this occur. It’s much, much, harder to evaluate every single public facing endpoint especially when hosting third party software like Wordpress with plugins.
worewood3 days ago
There is a tension, but it's between paying developers enough to actually produce decent code and paying a 3rd party to firewall the application.
Content filtering should be highly context dependent. If the WAF is detached from what it's supposed to filter, this happens. If the WAF doesn't have the ability to discern between command and content contexts, then the filtering shouldn't be done via WAF.
This is like spam filtering. I'm an anti-spam advocate, so the idea that most people can't discuss spam because even the discussion will set off filters is quite old to me.
People who apologize for email content filtering usually say that spam would be out of control if they didn't have that in place, in spite of no personal experience on their end testing different kinds of filtering.
My email servers filter based on the sending server's configuration: does the EHLO / HELO string resolve in DNS? Does it resolve back to the connecting IP? Does the reverse DNS name resolve to the same IP? Does the delivery have proper SPF / DKIM? Et cetera.
My delivery-based filtering works worlds better than content-based filtering, plus I don't have to constantly update it. Each kind has advantages, but I'd rather have occasional spam with no false positives than risk blocking email because someone used the wrong words.
With web sites and WAF, I think the same applies, and I can understand when people have a small site and don't know or don't have the resources to fix things at the actual content level, but the people running a site like Substack really should know better.
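For reference, a minimal sketch of one of the delivery-time checks described above, forward-confirmed reverse DNS (does the connecting IP's PTR name resolve back to the same IP?); real mail filtering layers several such checks plus SPF/DKIM:

    import socket

    def fcrdns_ok(ip: str) -> bool:
        try:
            ptr_name, _, _ = socket.gethostbyaddr(ip)             # IP -> name
            _, _, addresses = socket.gethostbyname_ex(ptr_name)   # name -> IPs
            return ip in addresses
        except OSError:
            return False

    print(fcrdns_ok("8.8.8.8"))  # a well-configured sender; most spam-cannon IPs fail this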
Anamon1 day ago
Yes to smart filtering at the right layer. Those reverse DNS checks et al. are so effective. I recently moved my personal mailbox from a host who didn't do these kinds of checks to one that does. My received spam volume instantly went from about 20 a day (across all my aliases) to less than 1 a week.
myflash133 days ago
SPF and DKIM are now more commonly implemented correctly by spammers than by major email providers.
A few years ago I had an application that allowed me to set any password, but then gave mysterious errors when I tried to use that password to log in. Took me a bit to figure out what was going on, but their WAF blocked my "hacking attempt" of using a ' in the password.
The same application also stored my full password in localStorage and a cookie (without httponly or secure). Because reasons. Sigh.
I'm going to do a hot take and say that WAFs are bollocks mainly used by garbage software. I'm not saying a good developer can't make a mistake and write a path traversal, but if you're really worried about that then there are better ways to prevent that than this approach which obviously is going to negatively impact users in weird and mysterious ways. It's like the naïve /(fuck|shit|...)/g-type "bad word filter". It shows a fundamental lack of care and/or competency.
Aside: is anyone still storing passwords in /etc/passwd? Storing the password in a different root-only file (/etc/shadow, /etc/master.passwd, etc.) has been a thing on every major system since the 90s AFAIK?
tlb3 days ago
It's more that /etc/hosts and /etc/passwd are good for testing because they always exist with predictable contents on almost every system. If you inject "cat /etc/passwd" to various URLs you can grep for "root:" to see if it worked.
So it's really blocking doorknob-twisting scripts.
reverendsteveii3 days ago
My bank requires non-alphanumeric characters in its passwords but will reject a password if it has characters it associates with command injection attacks.
as far as WAFs being garbage, they absolutely are, but this is a great time for a POSIWID analysis. A WAF says its purpose is to secure web apps. It doesn't do that, but people keep buying them. Now we're faced with a crossroads: we either have to assume that everyone is stupid or that the actual purpose of a WAF is something other than its stated purpose.
I personally only assume stupidity as a last resort. I find it lazy and cynical, and it's often used to dismiss things as hopeless when they're not actually hopeless. To just say "Oh well, people are dumb" is a thought-terminating cliche that ignores potential opportunities. So we do the other thing and actually take some time to think about who decides to put a WAF in-place and what value it adds for them.
Once you do that, you see myriad benefits because a WAF is a cheap, quick solution that allows non-technical people to say they're doing something. You're the manager of a finance OU that has a development group in it whose responsibility is some small web app. Your boss just read an article about cyber security and wants to know what this group two levels below you is doing about cyber security. Would you rather come back with "We're gonna need a year, $1 million and every other dev priority to be pushed back in order to develop a custom solution" or "We can have one fired up tomorrow for $300/mo, it's developed and supported by Microsoft and it's basically industry standard."
The negative impact of these things is obvious to us because this is what we do, but we're not always the decision-makers for stuff like that. Often the decision-makers are actually that naive and/or they're motivated less by the ostensible goal of better web app security and more by the goal of better job security.
As far as /etc/passwd goes, you're right that passwords don't live there anymore, but user IDs often do, and those can indicate which services are running as daemons on a given system. This is vital because if you can figure out what services are running you can start version fingerprinting them and then cross-referencing those versions with the CVE database.
Osiris3 days ago
I understand applying path filters in URLS and search strings, but I find it odd that they would apply the same rules to request body content, especially content encoded as valid JSON, and especially for a BLOG platform where the content would be anything.
dvorack1013 days ago
Indeed a severe case of paranoia?
1. Create a new post.
2. Include an Image, set filter to All File types and select "/etc/hosts".
3. You get served with a weird error box displaying a weird error message.
4. After this the Substack posts editor is broken. Heck, every time I access the Dashboard, it waits forever to build the page.
where the referenced files contain the usual list of *nix suspects including the offending filename (lfi-os-files.data, "local file inclusion" attacks)
The advantage (whack-a-mole notwithstanding) of a WAF is that it's orders of magnitude easier to tweak WAF rules than to upgrade, say, WebLogic, or other teetering piles of middleware.
Habgdnv2 days ago
I have a lifetime Pastebin account that I hadn't used for some years. Last year I enrolled in a "linux administration" class and tried to use that pastebin (famous for sharing code) to share some code/configurations with other students. When I tried to paste my homework I kept getting a Cloudflare error page. I don't even remember what I was pasting, but it was normal linux stuff. I contacted pastebin support - of course I got ghosted.
I am sharing this in relation to the WAF comments and how much the companies implementing WAF care about your case.
As a card-carrying Substack hater, I'm not surprised.
> "How could Substack improve this situation for technical writers?"
They don’t care about (technical) writers. All they care about is building a TikTok clone to “drive discoverability” and make the attention-metrics go up. Chris Best is memeing about it on his own platform. Very gross.
paulpauper3 days ago
Reminds me of Slashdot and breaking the page by widening it with certain characters
donatj3 days ago
We briefly had a WAF forced upon us and it caused so many problems like this we were able to turn it off, for now. I'm sure it'll be back.
jkrems3 days ago
Could this be trivially solved client-side by the editor if it just encoded the slashes, assuming it's HTML or markdown that's stored? Replacing `/etc/hosts` with a slash-encoded form (e.g. `&#47;etc&#47;hosts`) for storage seems like an okay workaround. Potentially even doing so for anything that's added to the WAF rules automatically by syncing the rules to the editor code.
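A sketch of that idea, shown in Python for brevity (in practice it would run in the editor's JavaScript before the save request goes out); the trigger list is invented, and browsers render the entity-encoded form identically:

    WAF_TRIGGERS = ["/etc/hosts", "/etc/passwd", "/etc/ssh/sshd_config"]

    def encode_for_storage(html_body: str) -> str:
        """Store slashes as HTML entities so the literal trigger string
        never appears in the payload."""
        for s in WAF_TRIGGERS:
            html_body = html_body.replace(s, s.replace("/", "&#47;"))
        return html_body

    print(encode_for_storage("<p>Edit /etc/hosts to override DNS.</p>"))
    # <p>Edit &#47;etc&#47;hosts to override DNS.</p>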
vintermann3 days ago
That reminds me of issues I once had with Microsoft's boneheaded WAF. We had base64 encoded data in a cookie, and whenever certain particular characters were produced next to each other in the data - I think the most common was "--" - the WAF would tilt and stop the "attempted SQL injection attack". So every so often someone would get an illegal login cookie and just get locked out of the system until they deleted it or it expired. Took a while to find out what went wrong, and even longer to figure out how to remove the more boneheaded rules from the WAF.
aidog3 days ago
It's something I ran into quite a few times in my career. It's a weird call to get when a client can't save their CMS site because they typed something harmless. I think the worst was when a dropdown I had defined had a value that the mod rules did not allow.
sfoley3 days ago
I cannot reproduce this.
halffullbrain3 days ago
At least, in this case, the WAF in question had the decency to return 403.
I've worked with a WAF installation (totally different product), where the "WAF fail" tell was HTTP status 200 (!) and "location: /" (and some garbage cookies), possibly to get browsers to redirect using said cookies. This was part of the CSRF protection.
Other problems were with "command injection" patterns (like in the article, except with specific Windows commands, too - they clash with everyday words which the users submit), and obviously SQL injection rules, which cover some relevant words, too.
The bottom line is that WAFs in their "hardened/insurance friendly" standard configs are set up to protect the company from amateurs exposing buggy, unsupported software or architectures. WAFs are useful for that, but you still have all the other issues with buggy, unsupported software.
As others have written, WAFs can be useful to protect against emerging threats, like we saw with the log4j exploit which CloudFlare rolled out protection for quite fast.
Unless you want compliance more than customers, you MUST at least have a process to add exceptions to "all the rules"-circus they put in front of the buggy apps.
Whack-a-mole security filtering is bad, but whack-a-mole relaxation rule creation against an unknown filter is really tiring.
Too3 days ago
Almost equally fun are the ones that simply drop the connection and leave you waiting for a timeout.
0xDEAFBEAD3 days ago
Weird idea: What if user content was stored and transmitted encrypted by default? Then an attacker would have to either (a) identify a plaintext which encrypts to an attack ciphertext (annoying, and also you could keep your WAF rules operational for the ciphertext, with minimal inconvenience to users) or (b) attack the system when plaintext is being handled (could still dramatically reduce attack surface).
nicoledevillers3 days ago
it was a cf managed waf rule for a vulnerability that doesn't apply to us. we've disabled it.
SonOfLilit3 days ago
This comment deserves to be much higher, assuming this user speaks for Substack (no previous submissions or comments, but the comment implies it).
silverwind3 days ago
Why not review rules before applying them?
badgersnake3 days ago
Seems like a case of somebody installing something they couldn’t be bothered to understand to tick a box marked security.
The outcome is the usual one, stuff breaks and there is no additional security.
thayne3 days ago
As soon as I saw the headline, I knew this was due to a WAF.
I worked on a project where we had to use a WAF for compliance reasons. It was a game of whack-a-mole to fix all the places where standard rules broke the application or blocked legitimate requests.
One notable, and related example is any request with the string "../" was blocked, because it might be a path traversal attack. Of course, it is more common that someone just put a relative path in their document.
teddyh3 days ago
> For now, I'll continue using workarounds like "/etc/h*sts" (with quotes) or alternative spellings when discussing system paths in my Substack posts.
Ahh, the modern trend of ”unalived”¹ etc. comes to every corner of society eventually.
This reminds me of the time I was discussing with friends something we had done in our computer science class that day, and I realised that writing toString in the WhatsApp client for macOS would crash the application. At the time I didn't have the skills to understand why, so I recorded the bug on my phone to share with friends :)
sudb3 days ago
I had a problem recently trying to send LLM-generated text between two web servers under my control, from AWS to Render - I was getting 403s for command injection from Render's Cloudflare protection which is opaque and unconfigurable to users.
The hacky workaround which has been stably working for a while now was to encode the offending request body and decode it on the destination server.
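A minimal sketch of that workaround (hypothetical field names; the receiving server just reverses the encoding before processing):

    import base64, json

    def encode_body(payload: dict) -> bytes:
        """Wrap the JSON body in base64 so no hostile-looking substrings survive transit."""
        return base64.b64encode(json.dumps(payload).encode())

    def decode_body(wrapped: bytes) -> dict:
        return json.loads(base64.b64decode(wrapped))

    payload = {"text": "To see resolved names, run: cat /etc/hosts"}
    wire = encode_body(payload)         # send this as the request body
    assert decode_body(wire) == payload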
godelski3 days ago
I don't get it. Why aren't those files just protected so they have no read or write permissions? Isn't this like the standard way to do things? Put the blog in a private user space with minimal permissions.
Why would random text be parsed? I read the article but this doesn't make sense to me. They suggested directory traversal, but your text shouldn't have anything to do with that, and traversal is solved by permission settings.
tryauuum3 days ago
this is the usual approach with web application firewalls: block all the 100500 known attacks. Doesn't matter if they are not applicable to your website. Some of them are obviously OS-dependent (having .exe in the URLs) but it doesn't matter, it's blocked just in case
I do understand this approach. From the defence point of view it makes sense: if you have to create a solution to protect millions of websites, it doesn't make sense to tailor it to the specifics of a single one
chrisjj2 days ago
> Substack's filter is well-intentioned - protecting their platform from potential attacks.
There is sadly no evidence in this article that the supposed filter does protect the platform from potential attacks.
driverdan3 days ago
This is a common problem with WAFs and, more specifically, Cloudflare's default rulesets. If your platform has content that is remotely technical you'll end up triggering some rules. You end up needing a test suite to confirm your real content doesn't trigger the rules and if it does you need to disable them.
swyx3 days ago
substack also does wonderful things like preserve weird bullet points, lack code block displays, and make it impossible to customize the landing page of your site beyond the 2 formats they give you.
generally think that Substack has done a good thing for its core audience of longform newsletter writer creators who want to be Ben Thompson. however its experience for technical people, for podcasters, for people who want to start multi-channel media brands, and for people who write for reach over revenue (but with optional revenue) has been really poor. (all 4 of these are us with Latent.Space). I've aired all these complaints with them and theyve done nothing, which is their prerogative.
i'd love for "new Substack" to emerge. or "Substack for developers".
He gave a talk on it at WordCamp Asia at the start of last year, although I haven’t heard of any progress recently on it.
gitroom2 days ago
The amount of headaches I've had from WAFs blocking legit stuff is unreal. I just wish the folks turning those rules on had to use them for a week themselves.
paxys3 days ago
This isn't a "security vs usability" trade-off as the author implies. This has nothing to do with security at all.
/etc/hosts
See, HN didn't complain. Does this mean I have hacked into the site? No, Substack (or Cloudflare, wherever the problem is) is run by people who have no idea how text input works.
gav3 days ago
It's more so that Cloudflare has a WAF product that checks a box for security and makes the people whose job it is to care about boxes being checked happy.
For example, I worked with a client that had a test suite of about 7000 or so strings that should return a 500 error, including /etc/hosts and other ones such as:
We "failed" and were not in compliance as you could make a request containing one of those strings--ignoring that neither Apache, SQL, nor Windows was in use.
We ended up deploying a WAF to block all these requests, even though it didn't improve security in any meaningful way.
orlp3 days ago
This is like banning quotes from your website to 'solve' SQL injection...
mystifyingpoi3 days ago
> is run by people who have no idea how text input works
That's a very uncharitable view. It's far more likely that they are simply using some WAF with sane defaults and never caught this. They'll fix it and move on.
macspoofing3 days ago
My thought exactly - this isn't an example of balance between "security vs usability" - this is just wrong behaviour.
eli3 days ago
It's a text string that is frequently associated with attacks and vulnerabilities. In general you want your WAF to block those things. This is indeed the point of a WAF. Except you also don't want it to get in the way of normal functionality (too much). That is what the security vs usability trade off is.
This particular rule is obviously off. I suspect it wasn't intended to apply to the POST payload of user content. Perhaps just URL parameters.
On a big enough website, users are doing weird stuff all the time and it can be tricky to write rules that stop the traffic you don't want while allowing every oddball legitimate request.
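A sketch of that kind of scoping (hypothetical pattern list and request shape): apply the traversal check where these strings are actually suspicious, the URL path and query parameters, and leave the body of an authoring endpoint alone:

    from urllib.parse import urlsplit, parse_qsl

    SUSPICIOUS = ["/etc/passwd", "/etc/hosts", "../"]

    def url_is_suspicious(url: str) -> bool:
        parts = urlsplit(url)
        haystacks = [parts.path] + [v for _, v in parse_qsl(parts.query)]
        return any(s in h for s in SUSPICIOUS for h in haystacks)

    print(url_is_suspicious("/download?file=../../etc/passwd"))  # True
    print(url_is_suspicious("/api/v1/drafts/123"))               # False; the body is not inspected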
bastawhiz2 days ago
I once helped maintain some PHP software that was effectively a CMS. You'd drop a little PHP snippet into any page (e.g., that you make with Dreamweaver) and it would automatically integrate it with the CMS functionality.
We had unending trouble with mod_security. The worst issue I can remember was that any POST request whose body contained the word "delete" was automatically rejected. That was the full rule. To this day I still can't imagine what the developers were thinking.
toogan2 days ago
The title would be improved with "Writing the string ...". I first read it as "Writing the file" which was pretty weird.
TRiG_Ireland1 day ago
It's in quotation marks, which I'd say makes it clear enough for most people.
lofaszvanitt3 days ago
Using a WAF is the strongest indicator that someone doesn't know what's happening and where or something underneath is smelly and leaking profusely.
mike-cardwell3 days ago
Just rot13 any request data using javascript before posting, and rot13 it again on the server side. Problem solved. (jk)
ChrisArchitect3 days ago
Just tried to post a tweet with this article title and link and got a similar error (on desktop twitter.com). Lovely.
skybrian3 days ago
Did anyone try reporting this to Substack?
iefbr143 days ago
So "/etc/h*sts" is not stopped by the filters? Nice to know for the hackers :)
righthand3 days ago
Similar:
Writing `find` as the first word in your search will prevent Firefox from accepting the "return" key when it is pressed.
Pretty annoying.
apetresc3 days ago
I can't reproduce this; is it still the case, or some ancient thing?
jandrese3 days ago
Are you sure you don't have a custom search rule configured in Firefox? I just tried this on my local instance and there was no problem.
righthand2 days ago
EDIT: Apparently this is caused by the “findplus” extension. Removed!
nottorp3 days ago
So everyone should start looking for vulnerabilities in the substack site?
If that's their idea of security...
stefs3 days ago
this feels like blocking terms like "null" or "select" just because you failed to properly parameterize your SQL queries.
HenryBemis3 days ago
Aaaahh they are trying to prevent a Little Bobby Tables story..
t1234s3 days ago
writing "bcc: someone@email.com" sometimes triggers WAF rules
julik3 days ago
Ok so: there is a blogging/content publishing engine, which is somewhat of a darling of the startup scene. There is a cloud hosting company with a variety of products, which is an even dearer darling of the startup scene. Something is posted on the blogging/content publishing engine that clearly reveals that
* The product provided for blogging/content publishing did a shitty job of configuring WAF rules for its use cases (the utility of a "magic WAF that will just solve all your problems" being out of the picture for now)
* The WAF product provided by the cloud platform clearly has shitty, overreaching rules doing arbitrary filtering on arbitrary strings. That filtering absolutely can (and will) break unrelated content if the application behind the WAF is developed with a modicum of security-mindedness. You don't `fopen()` a string input (no, I will not be surprised - yes, sometimes you do `fopen()` a string input - when you are using software that is badly written).
So I am wondering:
1. Was this sent to Substack as a bug - they charge money for their platform, and the inability to store $arbitrary_string on a page you pay for, as a user, is actually a malfunction and dysfunction? It might not be the case "it got once enshittified by a CIO who mandated a WAF of some description to tick a box", it might be the case "we grabbed a WAF from our cloud vendor and haven't reviewed the rules because we had no time". I don't think it would be very difficult for me, as an owner/manager at the blogging platform, to realise that enabling a rule filtering "anything that resembles a Unix system file path or a SQL query" is absolutely stupid for a blogging platform - and go and turn it the hell off at the first user complaint.
2. Similarly - does the cloud vendor know that their WAF refuses requests with such strings in them, and do they have a checkbox for "Kill requests which have any character an Average Joe does not type more frequently than once a week"? There should be a setting for that, and - thinking about the cloud vendor in question - I can't imagine the skill level there would be so low as to not have a config option to turn it off.
So - yes, that's a case of "we enabled a WAF for some compliance/external reasons/big customer who wants a 'my vendor uses a WAF' on their checklist", but also the case of "we enabled a WAF but it's either buggy or we haven't bothered to configure it properly".
To me it feels like this would be 2 emails first ("look, your thing <X> that I pay you money for clearly and blatantly does <shitty thing>, either let me turn it off or turn it off yourself or review it please") - and a blog post about it second.
untill3 days ago
[flagged]
eli3 days ago
You figured all that out just because the headers indicate the site passed through Cloudflare at one point? That's quite a leap!
If Cloudflare had a default rule that made it impossible to write that string on any site with their WAF, wouldn't this be a lot more widespread? Much more likely someone entered a bad rule into Cloudflare, or Cloudflare isn't involved in that rule at all.
netsharc3 days ago
Huh, a bit like "adult-content" filters that would censor Scunthorpe or Wikipedia articles about genitals, maybe Cloudflare saw a market to sell protection for donkeys who can't protect their webapps from getting request-injected.
rainforest3 days ago
I think Cloudflare WAF is a good product compared to other WAFs - by definition a WAF is intended to layer on validation that properly built applications should be doing, so it's sort of expected that it would reject valid potentially harmful content.
I think you can fairly criticise WAF products and the people who advocate for them (and created the need for them) but I don't think the CF team responsible can really be singled out.
kccqzy3 days ago
Unfortunately this is probably a case where the market demands stupidity. The quality engineers don't have a say over market forces.
arnaudsm3 days ago
These WAF features are older than LLMs & vibe coding.
tester7563 days ago
Who knows how many attacks such a "stupid" thing blocks every month?
Sharo20253 days ago
[flagged]
0xbadcafebee3 days ago
Worth noting that people here are assuming that the author's assumption is correct: that his writing /etc/hosts is causing the 403, that this is a consequence of security filtering, or even that this combination of characters is what's causing the failure at all. The only evidence he has is that he gets back a 403 Forbidden to an API request when he writes certain content. There's a thousand different things that could be triggering that 403.
It's not likely to be a WAF or content scanner, because the HTTP request is using PUT (which browser forms don't use) and it's uploading the content as a JSON content-type in a JSON document. The WAF would have to specifically look for PUTs, open up the JSON document, parse it, find the sub-string in a valid string, and reject it. OR it would have to filter raw characters regardless of the HTTP operation.
Neither of those seem likely. WAFs are designed to filter on specific kinds of requests, content, and methods. A valid string in a valid JSON document uploaded by JavaScript using a JSON content-type is not an attack vector. And this problem is definitely not path traversal protection, because that is only triggered when the string is in the URL, not some random part of the content body.
apetresc3 days ago
It sure looks like the author did his due diligence; he has a chart of all the different phrases in the payload which triggered the 403 and they all corresponded to paths to common UNIX system configuration files.
Nobody could prove that's exactly what's happening without seeing Cloudflare's internal WAF rules, but can you think of any other reasonable explanation? The endpoint is rejecting a PUT whose payload contains exactly /etc/hosts, /etc/passwd, or /etc/ssh/sshd_config, but NOT /etc/password, /etc/ssh, or /etc/h0sts. What else could it be?
ryandrake3 days ago
If you change a single string in the HTTP payload and it works, what other explanation makes sense besides a text scanner somewhere along the path to deploying the content?
You're being downvoted because WAFs work exactly like this, and it's intentional and their vendors think this is a good thing. A WAF vendor would say that a WAF parsing JSON makes it weaker.
The people configuring WAF rules at CDNs tend to do a poor job understanding sites and services that discuss technical content. It's not just Cloudflare, Akamai has the same problem.
If your site discusses databases then turning on the default SQL injection attack prevention rules will break your site. And there is another ruleset for file inclusion where things like /etc/hosts and /etc/passwd get blocked.
I disagree with other posts here, it is partially a balance between security and usability. You never know what service was implemented with possible security exploits and being able to throw every WAF rule on top of your service does keep it more secure. Its just that those same rulesets are super annoying when you have a securely implemented service which needs to discuss technical concepts.
Fine tuning the rules is time consuming. You often have to just completely turn off the ruleset because when you try to keep the ruleset on and allow the use-case there are a ton of changes you need to get implemented (if its even possible). Page won't load because /etc/hosts was in a query param? Okay, now that you've fixed that, all the XHR included resources won't load because /etc/hosts is included in the referrer. Now that that's fixed things still won't work because some random JS analytics lib put the URL visited in a cookie, etc, etc... There is a temptation to just turn the rules off.
> I disagree with other posts here, it is partially a balance between security and usability.
And economics. Many people here are blaming incompetent security teams and app developers, but a lot of seemingly dumb security policies are due to insurers. If an insurer says "we're going to jack up premiums by 20% unless you force employees to change their password once every 90 days", you can argue till you're blue in the face that it's bad practice, NIST changed its policy to recommend not regularly rotating passwords over a decade ago, etc., and be totally correct... but they're still going to jack up premiums if you don't do it. So you dejectedly sigh, implement a password expiration policy, and listen to grumbling employees who call you incompetent.
It's been a while since I've been through a process like this, but given how infamous log4shell became, it wouldn't surprise me if insurers are now also making it mandatory that common "hacking strings" like /etc/hosts, /etc/passwd, jndi:, and friends must be rejected by servers.
Not just economics, audit processes also really encourage adopting large rulesets wholesale.
We're SOC2 + HIPAA compliant, which either means convincing the auditor that our in-house security rules cover 100% of the cases they care about... or we buy an off-the-shelf WAF that has already completed the compliance process, and call it a day. The CTO is going to pick the second option every time.
undefined
undefined
I wish IT teams would say "sorry about the password requirement, it's required by our insurance policy". I'd feel a lot less angry about stupid password expiration rules if they told me that.
undefined
undefined
> but a lot of seemingly dumb security policies are due to insurers.
I keep hearing that often on HN, however I've personally never seen seen such demands from insurers. I would greatly appreciate if one share such insurance policy. Insurance policies are not trade secrets and OK to be public. I can google plenty of commercial cars insurance policies for example.
undefined
undefined
undefined
undefined
There should be some limits and some consequences to the insurer as well. I don't think the insurer is god and should be able to request anything no matter if it makes sense or not and have people and companies comply.
If anything, I think this attitude is part of the problem. Management, IT security, insurers, governing bodies, they all just impose rules with (sometimes, too often) zero regard for consequences to anyone else. If no pushback mechanism exists against insurer requirements, something is broken.
undefined
undefined
undefined
Having worked with PCI-DSS, some rules seem to only exist to appease insurance. When criticising decisions, you are told that passing audits to be able to claim insurance is the whole game, even when you can demonstrate how you can bypass certain rules in reality. High-level security has more to do with politics (my definition) than purely technical ability. I wouldn't go as far as to call it security theatre, there's too much good stuff there that many don't think about without having a handy list, but the game is certainly a lot bigger than just technical skills and hacker vs anti-hacker.
I still have a nervous tick from having a screen lock timeout "smaller than or equal to 30 seconds".
> If an insurer says "we're going to jack up premiums by 20% unless you force employees to change their password once every 90 days", you can argue till you're blue in the face that it's bad practice, NIST changed its policy to recommend not regularly rotating passwords over a decade ago, etc., and be totally correct... but they're still going to jack up premiums if you don't do it.
I would argue that password policies are very context dependent. As much as I detest changing my password every 90 days, I've worked in places where the culture encouraged password sharing. That sharing creates a whole slew of problems. On top of that, removing the requirement to change passwords every 90 days would encourage very few people to select secure passwords, mostly because they prefer convenience and do not understand the risks.
If you are dealing with an externally facing service where people are willing to choose secure passwords and unwilling to share them, I would agree that regularly changing passwords creates more problems than it solves.
> "jack up premiums by 20% unless you force employees to change their password once every 90 days"
This always made me judge my company's security team for enabling this stupidity. Thankfully they got rid of it gradually, nearly 2 years ago now (90 days to 365 days to never). New passwords were just the old one shifted one key left/right/up/down on the keyboard.
Now I'm thinking maybe this is why the app for a govt savings scheme in my country won't allow password autofill at all. Imagine expecting a new password every 90 days and not allowing autofill - that just makes passwords worse.
I'm no expert, but I did take a CISSP course a while ago. One thing I actually remember ;P is that it recommended long passwords in lieu of the number/special character/upper/lower mess. I don't remember the exact wording, and maybe it did recommend some of that too, but it talked about using a sentence rather than cramming all that into 6-8 characters. Many sites still want the short mess that I will never actually remember.
I believe these kinds of decisions are mostly downstream of security audits/consultants with varying levels of up-to-date slideshows.
I believe this is overall a reasonable approach for companies that are bigger than "the CEO knows everyone and the trusted executives are also senior IT/dev/tech experts" and smaller than "we can spin up an internal security audit using in-house resources".
Why wouldn't the IT people just tell the grumbling employees that exact explanation?
"You never know..." is the worst form of security, and makes systems less secure overall. Passwords must be changed every month, just to be safe. They must be 20 alphanumeric characters (with 5 symbols of course), just to be safe. We must pass every 3-letter compliance standard with hundreds of pages of checklists for each. The server must have WAF enabled, because one of the checklists says so.
Ask the CIO what actual threat all this is preventing, and you'll get blank stares.
As an engineer what incentive is there to put effort into knowing where each form input goes and how to sanitize it in a way that makes sense? You are getting paid to check the box and move on, and every new hire quickly realizes that. Organizations like these aren't focused on improving security, they are focused on covering their ass after the breach happens.
This looks like a variation of the Scunthorpe problem[1], where a filter is applied too naively, aggressively, and in this case, to the wrong content altogether. Applying the filter to "other stuff" sent to and among the servers might make sense, but there doesn't seem to be any security benefit to filtering actual text payload that's only going to be displayed as blog content. This seems like a pretty cut and dried bug to me.
1: https://en.wikipedia.org/wiki/Scunthorpe_problem
I don't get why you'd have SQL injection filtering of input fields at the CDN level. Or any validation of input fields aside from length or maybe some simple type validation (number, date, etc). Your backend should be able to handle arbitrary byte content in input fields. Your backend shouldn't be vulnerable to SQL injection if not for a CDN layer that's doing pre-filtering.
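To make that concrete, a minimal sketch (Python stdlib sqlite3, hypothetical table and column names): with a parameterized query the field content never touches the SQL text, so there is nothing for an upstream filter to "save" you from.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")

    # User-supplied content: may contain /etc/hosts, quotes, SQL keywords, anything.
    body = "cat /etc/hosts; DROP TABLE posts; -- '"

    # The value is bound as a parameter, never spliced into the SQL string,
    # so its content cannot change the query structure.
    conn.execute("INSERT INTO posts (body) VALUES (?)", (body,))
    print(conn.execute("SELECT body FROM posts").fetchone()[0])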
Yup. We're a database company that needs to be compliant with SOC2, and I've had extremely long and tiring arguments with our auditor about why we couldn't adhere to some of these standard WAF rulesets, because they broke our site (we allow people to spin up a demo env and trigger queries).
We changed auditors after that.
> I disagree with other posts here, it is partially a balance between security and usability. You never know what service was implemented with possible security exploits and being able to throw every WAF rule on top of your service does keep it more secure. Its just that those same rulesets are super annoying when you have a securely implemented service which needs to discuss technical concepts.
I might be out of the loop here, but it seems to me that any WAF that's triggered when the string "/etc/hosts" is literally anywhere in the content of a requested resource, is pretty obviously broken.
There's no "trade-off" here. Blocking IPs that send requests with "1337 h4x0r buzzword /etc/passwd" in them is completely naive and obtrusive, which is the modus operandi of the CDN being discussed here. There are plenty of other ways of hosting a website.
This is what surprises me in this story. At first glance, I wouldn't have assumed that either the Substack people or the Cloudflare people were incompetent.
Oh, and: I resisted tooth and nail against turning on a WAF at one of my gigs (there was no strict requirement for it, just cargo cult). Turns out I was right.
> There is a temptation to just turn the rules off
Definitely, though I have seen other solutions, like inserting non-printable characters in the problematic strings (e.g. "/etc/ho<b></b>sts" or whatever, you get the idea). And honestly that seems like a reasonable, if somewhat annoying, workaround to me that still retains the protections.
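For plain text rather than HTML, a zero-width character does the same trick. A hedged sketch (assuming the rendering layer leaves the string visually intact):

    # Insert a zero-width space (U+200B) after each slash so a naive substring
    # match on "/etc/hosts" no longer fires, while readers still see "/etc/hosts".
    def defang(path: str) -> str:
        return path.replace("/", "/\u200b")

    print(defang("/etc/hosts"))                   # renders like /etc/hosts
    print("/etc/hosts" in defang("/etc/hosts"))   # False: the filter no longer matches

The obvious downside is that anyone copy-pasting the path gets the invisible characters along with it.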
> The people configuring WAF rules at CDNs tend to do a poor job understanding sites and services that discuss technical content. It's not just Cloudflare, Akamai has the same problem.
I agree. There is a business opportunity here. Right in the middle of your sentences.
Hint: Context-Aware WAF.
Many platforms have emerged in the last decade - some called it smart WAF, some called it nextgen WAF.. All vaporware garbage that consumes tons and tons of system resource and still manages to do a shit job of _actually_ WAF'ing web requests.
To be truly context-aware, you need a priori knowledge of the situation - the user, the page, the interactions, etc.
I've had the issue where filling out form fields for some company website triggers a WAF and then nobody in the company is able to connect me to the responsible party who can fix the WAF rules. So I'm just stuck.
In my experience, the pain from false positives required to outweigh "WAF is best practice" is just very, very high. Most big businesses would rather lose/frustrate a small percentage of customers to be "safe".
> The people configuring WAF rules at CDNs tend to do a poor job understanding sites and services that discuss technical content
They shouldn't be doing that job at all. The content of user data is none of their business.
100%! [Good] security just doesn't work as a mix-in pattern... I'm not saying it's necessarily bad to use those additional protections, but they come with severe limitations, so the total value (as in cost/benefit) is hard to gauge.
I agree. From a product perspective, I would also support the decision. Should we make the rules more complex by default, potentially overlooking SQL injection vulnerabilities? Or should we blanket prohibit anything that even remotely resembles SQL, allowing those edge cases to figure it out?
I favor the latter approach. That group of Cloudflare users will understand the complexity of their use case accepting SQL in payloads and will be well-positioned to modify the default rules. They will know exactly where they want to allow SQL usage.
From Cloudflare’s perspective, it is virtually impossible to reliably cover every conceivable valid use of SQL, and it is likely 99% of websites won’t host SQL content.
Reminds me of an anecdote about an e-commerce platform: someone coded a leaky webshop, so their workaround was to watch if the string "OutOfMemoryException" shows up in the logs, and then restart the app.
Another developer in the team decided they wanted to log what customers searched for, so if someone typed in "OutOfMemoryException" in the search bar...
Careless analysis of free-form text logs is an underrated way to exploit systems. It's scary how much software blindly logs data without out of band escaping or sanitizing.
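A toy sketch of that failure mode (entirely hypothetical names, not anyone's real code): the watchdog greps free-form logs, and the search handler logs raw user input into the same stream with nothing separating it from system messages.

    import logging

    logging.basicConfig(filename="app.log", format="%(message)s")
    log = logging.getLogger("shop")

    def handle_search(query: str) -> None:
        # User input goes into the log verbatim.
        log.warning("customer searched for: %s", query)

    def watchdog_should_restart() -> bool:
        # The "monitoring" side: restart whenever the magic string shows up.
        with open("app.log") as f:
            return "OutOfMemoryException" in f.read()

    handle_search("OutOfMemoryException")   # a curious customer
    print(watchdog_should_restart())        # True -> user-triggered restart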
I've actually gone through this a few times with our WAF. A user got IP-banned because the WAF thought a note with the string "system(..." was PHP injection.
Does it block `/etc//hosts` or `/etc/./hosts`? This is a ridiculous kind of whack-a-mole that's doomed to failure. The people who wrote these should realize that hackers are smarter and more determined than they are and you should only rely on proven security, like not executing untrusted input.
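A quick illustration of why that game is unwinnable (Python, stdlib only): every one of these resolves to the same file, but only the first matches a literal "/etc/hosts" substring rule.

    import os.path

    probes = ["/etc/hosts", "/etc//hosts", "/etc/./hosts", "/etc/x/../hosts"]

    for p in probes:
        blocked = "/etc/hosts" in p   # what a naive substring rule checks
        print(f"{p!r:22} blocked={blocked}  resolves to {os.path.normpath(p)}")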
Yeah, and this seems like a common Fortune 500 mandatory checkbox. Gotta have a Web Application Firewall! Doesn't matter what the rules are, as long as there are a few. Once I was told I needed one to prevent SQL injection attacks... against an application that didn't use an SQL database.
If you push back you'll always get a lecture on "defense in depth", and then they really look at you like you're crazy when you suggest that it's more effective to get up, tap your desk once, and spin around in a circle three times every Thursday morning. I don't know... I do this every Thursday and I've never been hacked. Defense in depth, right? It can't hurt...
See "enumerating badness" as a losing strategy. I knew this was a bad idea about 5 minutes after starting my first job in 1995.
Well I've just created an account on substack to test this but turns out they've already fixed the issue (or turned off their WAF completely)
How would that be hard? Resolving a string to an absolute path is in almost every language's stdlib[1]. You can just grep for any string containing slashes, try to resolve them, and voilà.
Resolving wildcards is trickier, but definitely possible if you have a list of forbidden files.
[1]: https://nodejs.org/api/path.html#pathresolvepaths
Edit: changed link because C's realpath has a slightly different behavior
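Roughly what that idea could look like - a hedged Python sketch of the approach, not how any real WAF works:

    import os.path
    import re

    FORBIDDEN = {"/etc/hosts", "/etc/passwd", "/etc/shadow"}

    def suspicious(text: str) -> list[str]:
        # Pull out anything that looks like a path, normalize it, and compare it
        # against the deny list instead of doing raw substring matching.
        hits = []
        for candidate in re.findall(r"[\w./-]*/[\w./-]+", text):
            if os.path.normpath(candidate) in FORBIDDEN:
                hits.append(candidate)
        return hits

    print(suspicious("try reading /etc//./hosts for fun"))   # ['/etc//./hosts']
    print(suspicious("see docs/setup.md for details"))       # []

Of course this still enumerates badness, as other comments point out - the deny list is the weak part, not the resolving.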
Is a security solution worthless if it can't stop a dedicated attacker? A lot of WAF rules are blocking probes from off-the-shelf vulnerability scanners.
No one expects any WAF to be a 100% solution that catches all exfiltration attempts ever, and it should not be treated this way. But having it is generally better than not having it.
"How could Substack improve this situation for technical writers?"
How about this: don't run a dumb as rocks Web Application Firewall on an endpoint where people are editing articles that could be about any topic, including discussing the kind of strings that might trigger a dumb as rocks WAF.
This is like when forums about web development implement XSS filters that prevent their members from talking about XSS!
Learn to escape content properly instead.
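"Escape content properly" is boring and has been solved for decades - a minimal sketch: store the raw text, escape at output time for the context you're emitting into.

    import html
    import json

    user_post = 'Edit /etc/hosts and add "127.0.0.1 ads.example.com" <script>alert(1)</script>'

    as_html = html.escape(user_post)   # safe inside an HTML element
    as_json = json.dumps(user_post)    # safe inside a JSON response

    print(as_html)
    print(as_json)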
I'm in the position where I have to run a WAF to pass security certifications. The only open source WAFs are ModSecurity and its beta successor, Coraza. These things are dumb; they just use the OWASP Core Rule Set, which is a big pile of unreadable garbage.
Surprisingly simple solution
hire a cybersec person. I don't think they have one.
> This case highlights an interesting tension in web security: the balance between protection and usability.
But it doesn't. This case highlights a bug, a stupid bug. This case highlights that people who should know better, don't!
The tension between security and usability is real, but this is not it. That tension is usually a tradeoff: when you implement good security, it inconveniences the user. From simple things like 2FA, to locking out the user after 3 failed attempts, to rate limiting to prevent DoS. You increase security and degrade the user experience, or you decrease security and improve the user experience.
This is neither. This is both bad security and bad user experience. What's the tension?
I would say it’s a useful security practice in general to apply WAF as a blanket rule to all endpoints and then remove it selectively when issues like this occur. It’s much, much, harder to evaluate every single public facing endpoint especially when hosting third party software like Wordpress with plugins.
Precisely.
This also reminded me: I think in the PHP 3 era, PHP used to "sanitize" the contents of URL requests as a blanket measure against SQL injection - or perhaps it was a configuration setting that was frequently turned on by shared hosting services. This, of course, would soon be discovered by the authors of PHP sites, and various techniques were employed to circumvent the restriction, overall giving probably even worse outcomes than if the "sanitization" hadn't been there to begin with.
After having been bitten once (was teaching a competitive programming team, half the class got a blank page when submitting solutions, after an hour of debugging I narrowed it down to a few C++ types and keywords that cause 403 if they appear in the code, all of which happen to have meaning in Javascript), and again (working for a bank, we had an API that you're supposed to submit a python file to, and most python files would result in 403 but short ones wouldn't... a few hours of debugging and I narrowed it down to a keyword that sometimes appears in the code) and then again a few months later (same thing, new cloud environment, few hours burned on debugging[1]), I had the solution to his problem in mind _immediately_ when I saw the words "network error".
[1] the second time it happened, a colleague added a check to our deployment script: if we got a 403, print "HAHAHA YOU'VE BEEN WAFFED". For that I am forever thankful, because I saw that error more times than I expected.
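For anyone who wants the same early warning, the check is tiny. A sketch using the requests library and a placeholder endpoint (nothing here is the real deploy script):

    import requests  # third-party: pip install requests

    resp = requests.put(
        "https://deploy.example.invalid/api/upload",              # placeholder URL
        json={"script": "print(open('/etc/hosts').read())"},
    )
    if resp.status_code == 403:
        print("HAHAHA YOU'VE BEEN WAFFED")   # fail loudly instead of burning hours
    resp.raise_for_status()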
Do you remember if that was Cloudflare or some other likely WAF?
+++ATH
We faced a similar issue in our application. Our internal Red Team was publishing data with XSS and other injection attack attempts. The attacks themselves didn't work, but the presence of these entries caused our internal admin page to stop loading because our corporate firewall was blocking the network requests with those payloads in them. So an unsuccessful XSS attack became an effective DoS attack instead.
This is funny and sad at the same time.
Everything old is new again :) We used to call this the Scunthorpe problem.
https://en.m.wikipedia.org/wiki/Scunthorpe_problem
I remember back in the old days on the Eve Online forums when the word cockpit would always turn up as "c***pit". I was quite amused by that.
See also: Recent scrubbing US government web sites for words like "diversity", "equity", and "inclusion".
Writing about biology, finance, or geology? Shrug.
Dumb filtering is bad enough when used by smart people with good intent.
It is time to add the Substack case to this Wikipedia article.
"I wonder why it's called Scunthorpe....?"
sits quietly for a second
"Oh nnnnnnnooooooooooooooo lol!"
I ran into a similar issue with OpenRouter last night. OpenRouter is a “switchboard” style service that provides a single endpoint from which you can use many different LLMs. It’s great, but last night I started to try using it to see what models are good at processing raw HTML in various ways.
It turns out OpenRouter’s API is protected by Cloudflare and something about specific raw chunks of HTML and JavaScript in the POST request body cause it to block many, though not all, requests. Going direct to OpenAI or Anthropic with the same prompts is fine. I wouldn’t mind but these are billable requests to commercial models and not OpenRouter’s free models (which I expect to be heavily protected from abuse).
Did you report it?
I encountered this a while ago and it was incredibly frustrating. The "Network error" prevented me from updating a post I had written for months because I couldn't figure out why my edits (which extended the length and which I assumed was the problem) couldn't get through.
Trying to contact support was difficult too due to AI chatbots, but when I finally did reach a human, their "tech support" obviously didn't bother to look at this in any reasonable timeframe.
It wasn't until some random person on Twitter suggested the possibility of some magic string tripping over some stupid security logic that I found the problem and could finally edit my post.
> This case highlights an interesting tension in web security: the balance between protection and usability.
This isn't a tension. This rule should not be applied at the WAF level. It doesn't know that this field is safe from $whatever injection attacks. But the substack backend does. Remove the rule from the WAF (and add it to the backend, where it belongs) and you are just as secure and much more usable. No tension.
I would say it’s a decent security practice to apply WAF as a blanket rule to all endpoints and then remove it selectively when issues like this occur. It’s much, much, harder to evaluate every single public facing endpoint especially when hosting third party software like Wordpress with plugins.
There is a tension, but it's between paying developers enough to actually produce decent code, and paying a 3rd party to firewall the application.
WAFs were created by people who read https://thedailywtf.com/articles/Injection_Rejection and didn't realize that TDWTF isn't a collection of best practices.
Content filtering should be highly context dependent. If the WAF is detached from what it's supposed to filter, this happens. If the WAF doesn't have the ability to discern between command and content contexts, then the filtering shouldn't be done via WAF.
This is like spam filtering. I'm an anti-spam advocate, so the idea that most people can't discuss spam because even the discussion will set off filters is quite old to me.
People who apologize for email content filtering usually say that spam would be out of control if they didn't have that in place, in spite of no personal experience on their end testing different kinds of filtering.
My email servers filter based on the sending server's configuration: does the EHLO / HELO string resolve in DNS? Does it resolve back to the connecting IP? Does the reverse DNS name resolve to the same IP? Does the delivery have proper SPF / DKIM? Et cetera.
My delivery-based filtering works worlds better than content-based filtering, plus I don't have to constantly update it. Each kind has advantages, but I'd rather occasional spam with no false positives than the chance I'm blocking email because someone used the wrong words.
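For anyone curious what those delivery-time checks amount to, here's a rough stdlib-only Python sketch of forward-confirmed reverse DNS (real MTAs do this through configuration, not code):

    import socket

    def forward_confirmed_rdns(connecting_ip: str, helo_name: str) -> bool:
        # Does the HELO name resolve to the connecting IP, and does the IP's
        # reverse DNS name resolve back to that same IP?
        try:
            helo_ips = {ai[4][0] for ai in socket.getaddrinfo(helo_name, None)}
            ptr_name = socket.gethostbyaddr(connecting_ip)[0]
            ptr_ips = {ai[4][0] for ai in socket.getaddrinfo(ptr_name, None)}
        except socket.error:
            return False
        return connecting_ip in helo_ips and connecting_ip in ptr_ips

    print(forward_confirmed_rdns("8.8.8.8", "dns.google"))  # True on most networks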
With web sites and WAF, I think the same applies, and I can understand when people have a small site and don't know or don't have the resources to fix things at the actual content level, but the people running a site like Substack really should know better.
Yes to smart filtering at the right layer. Reverse DNS checks et al. are so effective. I recently moved my personal mailbox from a host that didn't do these kinds of checks to one that does. My received spam volume instantly went from about 20 a day (across all my aliases) to less than 1 a week.
SPF and DKIM are now more commonly implemented correctly by spammers than by major email providers.
https://news.ycombinator.com/item?id=43468995
Few years ago I had an application that allowed me to set any password, but then gave mysterious errors when I tried to use that password to login. Took me a bit to figure out what was going on, but their WAF blocked my "hacking attempt" of using a ' in the password.
The same application also stored my full password in localStorage and a cookie (without httponly or secure). Because reasons. Sigh.
I'm going to do a hot take and say that WAFs are bollocks mainly used by garbage software. I'm not saying a good developer can't make a mistake and write a path traversal, but if you're really worried about that then there are better ways to prevent that than this approach which obviously is going to negatively impact users in weird and mysterious ways. It's like the naïve /(fuck|shit|...)/g-type "bad word filter". It shows a fundamental lack of care and/or competency.
Aside: is anyone still storing passwords in /etc/passwd? Storing the password in a different root-only file (/etc/shadow, /etc/master.passwd, etc.) has been a thing on every major system since the 90s AFAIK?
It's more that /etc/hosts and /etc/passwd are good for testing because they always exist with predictable contents on almost every system. If you inject "cat /etc/passwd" to various URLs you can grep for "root:" to see if it worked.
So it's really blocking doorknob-twisting scripts.
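That's essentially all the doorknob-twisting scripts do - something on the order of this sketch (placeholder target, requests library), which is why the same handful of strings shows up in every ruleset:

    import requests  # third-party: pip install requests

    base = "https://target.example.invalid/download"   # placeholder: a URL you're authorized to test
    for probe in ("/etc/passwd", "../../../../etc/passwd"):
        r = requests.get(base, params={"file": probe}, timeout=5)
        if "root:" in r.text:                          # predictable marker line in /etc/passwd
            print("file disclosure likely via", probe)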
My bank requires non-alphanumeric characters in passwords, but will reject a password if it contains characters it associates with command injection attacks.
As far as WAFs being garbage, they absolutely are, but this is a great time for a POSIWID analysis. A WAF says its purpose is to secure web apps. It doesn't do that, but people keep buying them. So we're at a crossroads: either we assume that everyone is stupid, or the actual purpose of a WAF is something other than its stated purpose. I personally only assume stupidity as a last resort. I find it lazy and cynical, and it's often used to dismiss things as hopeless when they're not actually hopeless. "Oh well, people are dumb" is a thought-terminating cliche that ignores potential opportunities.
So we do the other thing and actually take some time to think about who decides to put a WAF in place and what value it adds for them. Once you do that, you see myriad benefits, because a WAF is a cheap, quick solution that allows non-technical people to say they're doing something. Say you're the manager of a finance OU with a development group in it whose responsibility is some small web app. Your boss just read an article about cyber security and wants to know what this group two levels below you is doing about it. Would you rather come back with "We're going to need a year, $1 million, and every other dev priority pushed back in order to develop a custom solution", or "We can have one fired up tomorrow for $300/mo, it's developed and supported by Microsoft, and it's basically industry standard"? The negative impact of these things is obvious to us because this is what we do, but we're not always the decision-makers, and the decision-makers are often actually that naive - and/or motivated less by the ostensible goal of better web app security than by the goal of better job security.
As far as /etc/passwd goes, you're right that passwords don't live there anymore, but user IDs often do, and those can indicate which services are running as daemons on a given system. This is vital because once you know which services are running, you can start version-fingerprinting them and cross-referencing those versions with the CVE database.
I understand applying path filters in URLS and search strings, but I find it odd that they would apply the same rules to request body content, especially content encoded as valid JSON, and especially for a BLOG platform where the content would be anything.
Indeed a severe case of paranoia?
1. Create a new post.
2. Include an image, set the file filter to "All File Types", and select "/etc/hosts".
3. You get served a weird error box displaying a weird error message.
4. After this, the Substack post editor is broken. Every time I access the Dashboard, it waits forever to build the page.
Did find this text while browsing the source for an error (see original ascii art: https://pastebin.com/iBDsuer7):
SUBSTACK WANTS YOU
TO BUILD A BETTER BUSINESS MODEL FOR WRITING
"Who signed off on your WAF rules" would be a great reverse interview question then.
This looks like it was caused by this update https://developers.cloudflare.com/waf/change-log/2025-04-22/ rule 100741.
It references this CVE https://github.com/tuo4n8/CVE-2023-22047 which allows the reading of system files. The example given shows them reading /etc/passwd
AFAICT it's also (though I'm very rusty) in ModSecurity, if XML content processing is enabled then rules like these will trip:
where the referenced files contain the usual list of *nix suspects, including the offending filename (lfi-os-files.data, "local file inclusion" attacks).
The advantage (whack-a-mole notwithstanding) of a WAF is that it's orders of magnitude easier to tweak WAF rules than to upgrade, say, WebLogic, or other teetering piles of middleware.
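For readers who haven't seen these rules: conceptually they boil down to a phrase match of the inspected payload against a word list. A Python stand-in for the SecRule/@pmFromFile machinery (the three-entry list here is made up; the real lfi-os-files.data has hundreds of entries):

    # Tiny stand-in for lfi-os-files.data.
    LFI_PHRASES = ["/etc/hosts", "/etc/passwd", "/etc/shadow"]

    def waf_blocks(inspected_payload: str) -> bool:
        # The rule fires on a case-insensitive phrase match anywhere in whatever
        # the WAF inspects: URL, headers, or (with body processing on) the body.
        lowered = inspected_payload.lower()
        return any(phrase in lowered for phrase in LFI_PHRASES)

    print(waf_blocks('PUT /api/drafts {"body": "how to edit /etc/hosts"}'))   # True
    print(waf_blocks('PUT /api/drafts {"body": "how to edit /etc/h*sts"}'))   # False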
I have a lifetime Pastebin account that I hadn't used for some years. Last year I enrolled in a "linux administration" class and tried to use that pastebin (famous for sharing code) to share some code/configurations with other students. When I tried to paste my homework I kept getting a Cloudflare error page. I don't even remember what I was pasting, but it was normal linux stuff. I contacted pastebin support - of course I got ghosted.
I am sharing this in relation to the WAF comments and how much the companies implementing WAF care about your case.
The problem with WAF is discussed in https://users.ece.cmu.edu/~adrian/731-sp04/readings/Ptacek-N....
One of the authors of the paper has said "WAFs are just speed bump to a determined attacker."
> "WAFs are just speed bump to a determined attacker."
We wish. Speed bumps don't totally immobilise a pseudo-random selection of innocent vehicles.
Locks are a speedbump for a lockpick.
Doors are a speedbump for a car.
Well yeah, sure, doesn't mean I'm going to have an open doorframe or a door without a lock.
https://en.wikipedia.org/wiki/Bush_hid_the_facts
As a card carrying Substack hater, I’m not suprised.
> "How could Substack improve this situation for technical writers?"
They don’t care about (technical) writers. All they care about is building a TikTok clone to “drive discoverability” and make the attention-metrics go up. Chris Best is memeing about it on his own platform. Very gross.
Reminds me of Slashdot and breaking the page by widening it with certain characters
We briefly had a WAF forced upon us and it caused so many problems like this we were able to turn it off, for now. I'm sure it'll be back.
Could this be trivially solved client-side by the editor, if it just encoded the slashes (assuming it's HTML or markdown that's stored)? Replacing `/etc/hosts` with a slash-escaped equivalent for storage seems like an okay workaround. Potentially even doing so for anything that's added to the WAF rules automatically, by syncing the rules to the editor code.
That reminds me of issues I once had with Microsoft's boneheaded WAF. We had base64 encoded data in a cookie, and whenever certain particular characters were produced next to each other in the data - I think the most common was "--" - the WAF would tilt and stop the "attempted SQL injection attack". So every so often someone would get an illegal login cookie and just get locked out of the system until they deleted it or it expired. Took a while to find out what went wrong, and even longer to figure out how to remove the more boneheaded rules from the WAF.
It's something I ran into quite a few times in my career. It's a weird call to get when the client can't save their CMS page because they typed something harmless. I think the worst was a dropdown I had defined whose value matched something disallowed in the mod_security rules.
I cannot reproduce this.
At least, in this case, the WAF in question had the decency to return 403.
I've worked with a WAF installation (totally different product) where the "WAF fail" tell was HTTP status 200 (!) and "location: /" (and some garbage cookies), possibly to get browsers to redirect using said cookies. This was part of the CSRF protection. Other problems were with "command injection" patterns (like in the article, except with specific Windows commands too - they clash with everyday words that users submit), and obviously with the SQL injection rules, which cover some relevant words as well.
The bottom line is that WAFs in their "hardened/insurance friendly" standard configs are set up to protect the company from amateurs exposing buggy, unsupported software or architectures. WAFs are useful for that, but you still have all the other issues that come with buggy, unsupported software.
As others have written, WAFs can be useful to protect against emerging threats, like we saw with the log4j exploit which CloudFlare rolled out protection for quite fast.
Unless you want compliance more than customers, you MUST at least have a process to add exceptions to "all the rules"-circus they put in front of the buggy apps.
Whack-a-mole security filtering is bad, but whack-a-mole relaxation rule creation against an unknown filter is really tiring.
Almost equally fun are the ones that simply drop the connection and leave you waiting for a timeout.
Weird idea: What if user content was stored and transmitted encrypted by default? Then an attacker would have to either (a) identify a plaintext which encrypts to an attack ciphertext (annoying, and also you could keep your WAF rules operational for the ciphertext, with minimal inconvenience to users) or (b) attack the system when plaintext is being handled (could still dramatically reduce attack surface).
it was a cf managed waf rule for a vulnerability that doesn't apply to us. we've disabled it.
This comment deserves to be much higher, assuming this user speaks for Substack (no previous submissions or comments, but the comment implies it).
Why not review rules before applying them?
Seems like a case of somebody installing something they couldn’t be bothered to understand to tick a box marked security.
The outcome is the usual one, stuff breaks and there is no additional security.
As soon as I saw the headline, I knew this was due to a WAF.
I worked on a project where we had to use a WAF for compliance reasons. It was a game of wack-a-mole to fix all the places where standard rules broke the application or blocked legitimate requests.
One notable, and related example is any request with the string "../" was blocked, because it might be a path traversal attack. Of course, it is more common that someone just put a relative path in their document.
> For now, I'll continue using workarounds like "/etc/h*sts" (with quotes) or alternative spellings when discussing system paths in my Substack posts.
Ahh, the modern trend of ”unalived”¹ etc. comes to every corner of society eventually.
1. <https://knowyourmeme.com/memes/unalive>
It's /con/con all over again
This reminds me of that time I was discussing with friends about something we did in our computer science class that day and I realised writing toString in the Whatsapp client for macOS would crash the application. At the time I didn’t have the skills to understand why so I recorded the bug on my phone to share with friends :)
I had a problem recently trying to send LLM-generated text between two web servers under my control, from AWS to Render - I was getting 403s for command injection from Render's Cloudflare protection which is opaque and unconfigurable to users.
The hacky workaround which has been stably working for a while now was to encode the offending request body and decode it on the destination server.
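The workaround is essentially this (a sketch with the requests library and placeholder endpoint/field names, not the actual services involved):

    import base64
    import requests  # third-party: pip install requests

    def send(text: str) -> None:
        # Sender: wrap the body so no "offending" substring survives in the wire format.
        wrapped = base64.b64encode(text.encode("utf-8")).decode("ascii")
        requests.post("https://receiver.example.invalid/ingest",   # placeholder URL
                      json={"payload_b64": wrapped}, timeout=10)

    def receive(payload_b64: str) -> str:
        # Receiver: undo the wrapping before doing anything else with the text.
        return base64.b64decode(payload_b64).decode("utf-8")

    print(receive(base64.b64encode(b"cat /etc/passwd").decode("ascii")))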
I don't get it. Why aren't those files just protected so they have no read or write permissions? Isn't that the standard way to do things? Put the blog in a private user space with minimal permissions.
Why would random text be parsed at all? I read the article, but this doesn't make sense to me. They suggested directory traversal, but your text shouldn't have anything to do with that, and traversal is solved by permission settings.
This is the usual approach with web application firewalls: block all 100500 known attacks, whether or not they're applicable to your website. Some of them are obviously OS-dependent (having .exe in the URLs), but it doesn't matter - they're blocked just in case.
I do understand this approach. From the defence point of view it makes sense: if you have to build a solution that protects millions of websites, it doesn't make sense to tailor it to the specifics of a single one.
> Substack's filter is well-intentioned - protecting their platform from potential attacks.
There is sadly no evidence in this article that the supposed filter does protect the platform from potential attacks.
This is a common problem with WAFs and, more specifically, Cloudflare's default rulesets. If your platform has content that is remotely technical you'll end up triggering some rules. You end up needing a test suite to confirm your real content doesn't trigger the rules and if it does you need to disable them.
Substack also does wonderful things like preserve weird bullet points, lack proper code block display, and make it impossible to customize the landing page of your site beyond the 2 formats they give you.
I generally think that Substack has done a good thing for its core audience of longform newsletter writers who want to be Ben Thompson. However, its experience for technical people, for podcasters, for people who want to start multi-channel media brands, and for people who write for reach over revenue (but with optional revenue) has been really poor. (All 4 of these are us with Latent.Space.) I've aired all these complaints with them and they've done nothing, which is their prerogative.
i'd love for "new Substack" to emerge. or "Substack for developers".
Ben Thompson is working on Passport, which seems to be a self-hosted (WordPress-based) Substack: https://stratechery.com/2021/passport/
He gave a talk on it at WordCamp Asia at the start of last year, although I haven’t heard of any progress recently on it.
The amount of headaches I've had from WAFs blocking legit stuff is unreal. I just wish the folks turning those rules on had to use them for a week themselves.
This isn't a "security vs usability" trade-off as the author implies. This has nothing to do with security at all.
/etc/hosts
See, HN didn't complain. Does this mean I have hacked into the site? No, Substack (or Cloudflare, wherever the problem is) is run by people who have no idea how text input works.
It's more that Cloudflare has a WAF product that checks a box for security and makes people whose job it is to care about boxes being checked happy.
For example, I worked with a client that had a test suite of about 7000 or so strings that should return a 500 error, including /etc/hosts and other ones such as:
We "failed" and were not in compliance as you could make a request containing one of those strings--ignoring that neither Apache, SQL, or Windows were in use.We ended up deploying a WAF to block all these requests, even though it didn't improve security in any meaningful way.
This is like banning quotes from your website to 'solve' SQL injection...
> is run by people who have no idea how text input works
That's a very uncharitable view. It's far more likely that they are simply using some WAF with sane defaults and never caught this. They'll fix it and move on.
My thought exactly - this isn't an example of balance between "security vs usability" - this is just wrong behaviour.
It's a text string that is frequently associated with attacks and vulnerabilities. In general you want your WAF to block those things. This is indeed the point of a WAF. Except you also don't want it to get in the way of normal functionality (too much). That is what the security vs usability trade off is.
This particular rule is obviously off. I suspect it wasn't intended to apply to the POST payload of user content. Perhaps just URL parameters.
On a big enough website, users are doing weird stuff all the time and it can be tricky to write rules that stop the traffic you don't want while allowing every oddball legitimate request.
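A sketch of the scoping idea (hypothetical rule code, not Cloudflare's): the same pattern list applied to the request line, but not to the JSON body of an authenticated editor.

    from urllib.parse import unquote

    PATTERNS = ["/etc/hosts", "/etc/passwd", "../"]

    def block_request(path_and_query: str, body: str, authed_editor: bool) -> bool:
        # Always inspect the URL: nobody needs a literal /etc/passwd in a query string.
        if any(p in unquote(path_and_query).lower() for p in PATTERNS):
            return True
        # Only inspect the body for anonymous traffic; an authenticated author
        # saving a draft is exactly where these strings legitimately appear.
        if not authed_editor and any(p in body.lower() for p in PATTERNS):
            return True
        return False

    print(block_request("/download?file=../../etc/passwd", "", False))              # True
    print(block_request("/api/v1/drafts/42", '{"body": "edit /etc/hosts"}', True))  # False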
I once helped maintain some PHP software that was effectively a CMS. You'd drop a little PHP snippet into any page (e.g., that you make with Dreamweaver) and it would automatically integrate it with the CMS functionality.
We had unending trouble with mod_security. The worst issue I can remember was that any POST request whose body contained the word "delete" was automatically rejected. That was the full rule. To this day I still can't imagine what the developers were thinking.
The title would be improved with "Writing the string ...". I first read it as "Writing the file" which was pretty weird.
It's in quotation marks, which I'd say makes it clear enough for most people.
Using a WAF is the strongest indicator that someone doesn't know what's happening and where or something underneath is smelly and leaking profusely.
Just rot13 any request data using javascript before posting, and rot13 it again on the server side. Problem solved. (jk)
Just tried to post a tweet with this article title and link and got a similar error (on desktop twitter.com). Lovely.
Did anyone try reporting this to Substack?
So "/etc/h*sts" is not stopped by the filters? Nice to know for the hackers :)
Similar:
Writing `find` as the first word in your search will prevent Firefox from acting on the Return key when it's pressed.
Pretty annoying.
I can't reproduce this; is it still the case, or some ancient thing?
Are you sure you don't have a custom search rule configured in Firefox? I just tried this on my local instance and there was no problem.
EDIT: Apparently this is caused by the “findplus” extension. Removed!
So everyone should start looking for vulnerabilities in the substack site?
If that's their idea of security...
this feels like blocking terms like "null" or "select" just because you failed to properly parameterize your SQL queries.
Aaaahh they are trying to prevent a Little Bobby Tables story..
writing "bcc: someone@email.com" sometimes triggers WAF rules
Ok so: there is a blogging/content publishing engine, which is somewhat of a darling of the startup scene. There is a cloud hosting company with a variety of products, which is an even dearer darling of the startup scene. Something posted on the blogging/content publishing engine clearly reveals that:
* The product provided for blogging/content publishing did a shitty job of configuring WAF rules for its use cases (the utility of a "magic WAF that will just solve all your problems" being out of the picture for now).
* The WAF product provided by the cloud platform clearly has shitty, overreaching rules doing arbitrary filtering on arbitrary strings. That filtering absolutely can (and will) break unrelated content if the application behind the WAF is developed with a modicum of security-mindedness. You don't `fopen()` a string input (no, I will not be surprised - yes, sometimes you do `fopen()` a string input - when you are using software that is badly written).
So I am wondering:
1. Was this sent to Substack as a bug? They charge money for their platform, and the inability to store $arbitrary_string on a page you pay for, as a user, is actually a malfunction and a dysfunction. It might not be the case that "it got enshittified by a CIO who mandated a WAF of some description to tick a box"; it might be the case that "we grabbed a WAF from our cloud vendor and haven't reviewed the rules because we had no time". I don't think it would be very difficult for me, as an owner/manager at the blogging platform, to realise that enabling a rule filtering "anything that resembles a Unix system file path or a SQL query" is absolutely stupid for a blogging platform - and go and turn it the hell off at the first user complaint.
2. Similarly - does the cloud vendor know that their WAF refuses requests with such strings in them, and do they have a checkbox for "Kill requests which have any character an Average Joe does not type more frequently than once a week"? There should be a setting for that, and - thinking about the cloud vendor in question - I can't imagine the skill level there would be so low as to not have a config option to turn it off.
So - yes, that's a case of "we enabled a WAF for some compliance/external reasons/big customer who wants a 'my vendor uses a WAF' on their checklist", but also the case of "we enabled a WAF but it's either buggy or we haven't bothered to configure it properly".
To me it feels like this would be 2 emails first ("look, your thing <X> that I pay you money for clearly and blatantly does <shitty thing>, either let me turn it off or turn it off yourself or review it please") - and a blog post about it second.
You figured all that out just because the headers indicate the site passed through Cloudflare at one point? That's quite a leap!
If Cloudflare had a default rule that made it impossible to write that string on any site with their WAF, wouldn't this be a lot more widespread? Much more likely someone entered a bad rule into Cloudflare, or Cloudflare isn't involved in that rule at all.
Huh, a bit like "adult-content" filters that would censor Scunthorpe or Wikipedia articles about genitals, maybe Cloudflare saw a market to sell protection for donkeys who can't protect their webapps from getting request-injected.
I think Cloudflare WAF is a good product compared to other WAFs - by definition a WAF is intended to layer on validation that properly built applications should be doing, so it's sort of expected that it would reject valid potentially harmful content.
I think you can fairly criticise WAF products and the people who advocate for them (and created the need for them) but I don't think the CF team responsible can really be singled out.
Unfortunately this is probably a case where the market demands stupidity. The quality engineers don't have a say over market forces.
These WAF features are older than LLMs & vibe coding.
Who knows how many attacks such a "stupid" thing blocks every month?
Worth noting that people here are assuming the author's guess is correct: that writing /etc/hosts is what causes the 403, that this is a consequence of security filtering, and that it's this combination of characters at all that's causing the failure. The only evidence he has is that he gets a 403 Forbidden from an API request when he writes certain content. There are a thousand different things that could be triggering that 403.
It's not likely to be a WAF or content scanner, because the HTTP request is using PUT (which browser forms don't use) and it's uploading the content as a JSON content-type in a JSON document. The WAF would have to specifically look for PUTs, open up the JSON document, parse it, find the sub-string in a valid string, and reject it. OR it would have to filter raw characters regardless of the HTTP operation.
Neither of those seem likely. WAFs are designed to filter on specific kinds of requests, content, and methods. A valid string in a valid JSON document uploaded by JavaScript using a JSON content-type is not an attack vector. And this problem is definitely not path traversal protection, because that is only triggered when the string is in the URL, not some random part of the content body.
It sure looks like the author did his due diligence; he has a chart of all the different phrases in the payload which triggered the 403 and they all corresponded to paths to common UNIX system configuration files.
Nobody could prove that's exactly what's happening without seeing Cloudflare's internal WAF rules, but can you think of any other reasonable explanation? The endpoint is rejecting a PUT whose payload contains exactly /etc/hosts, /etc/passwd, or /etc/ssh/sshd_config, but NOT /etc/password, /etc/ssh, or /etc/h0sts. What else could it be?
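It's also easy for anyone with an account to falsify: the whole test is a loop like this (hedged sketch; the endpoint and payload shape are made up, requests library assumed):

    import requests  # third-party: pip install requests

    CANDIDATES = ["/etc/hosts", "/etc/h0sts", "/etc/passwd", "/etc/password",
                  "/etc/ssh/sshd_config", "/etc/ssh"]

    for text in CANDIDATES:
        r = requests.put("https://example.invalid/api/v1/drafts/123",   # placeholder endpoint
                         json={"draft_body": f"some prose mentioning {text}"},
                         timeout=10)
        print(f"{text!r:24} -> HTTP {r.status_code}")

    # A clean split (403 for the real paths, success for the near misses) is hard
    # to explain by anything other than a string/pattern filter along the way.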
If you change a single string in the HTTP payload and it works, what other explanation makes sense besides a text scanner somewhere along the path to deploying the content?
See https://developers.cloudflare.com/waf/change-log/2025-04-22/ rule 100741.
It references this CVE https://github.com/tuo4n8/CVE-2023-22047 which allows the reading of system files. The example given shows them reading /etc/passwd
You're being downvoted because WAFs work exactly like this, and it's intentional and their vendors think this is a good thing. A WAF vendor would say that a WAF parsing JSON makes it weaker.