@premist
Last active April 20, 2018 05:51
CloudFlare has an outage at one of its edge locations (ICN) but is not notifying its customers

Minku Lee :

I'm located in Seoul, Korea, and traffic to api.fmf.io always gets routed via LAX servers. My server is located in Tokyo, Japan, so this adds a lot of delay to each request. I tried different ISPs and got the same result with all of them. My friends are experiencing this with websites on the free plan, and I'm also experiencing it on paid Pro plans.

This has lasted for quite a long time (several months). My friend got a response from a CloudFlare rep saying that it was a temporary routing problem and would be resolved soon. However, that does not seem to be the case. If this does not get resolved, I'm considering, and already testing, a migration of all my company and personal paid plans to Fastly.


Minku Lee :

Attaching cdn-cgi results
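
For context, Cloudflare-proxied hostnames serve a plain-text diagnostic page at /cdn-cgi/trace made of key=value lines, and its colo field names the edge location that answered the request. Below is a minimal Python sketch of reading that field. The sample text is hypothetical (it is not the attachment from this ticket), and parse_trace is an illustrative helper name, not part of any library.

```python
def parse_trace(text: str) -> dict:
    """Parse key=value lines (the /cdn-cgi/trace format) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that have no '=' separator
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical sample for illustration only; the IP is from a documentation range.
SAMPLE = """\
h=api.fmf.io
ip=203.0.113.7
colo=LAX
"""

trace = parse_trace(SAMPLE)
print("Served from edge:", trace.get("colo", "unknown"))
```

A client in Seoul that expects the ICN edge could compare the colo value against "ICN" to confirm whether requests are being served elsewhere.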

CloudFlare :

Hi Minku,

Thank you for contacting CloudFlare support. Sorry to hear you're experiencing some network connectivity issues here. Your ticket has been assigned, and will be reviewed by a member of our network team.

To better troubleshoot this, please share the ticket # that your friend previously opened on this matter, as well as a traceroute + mtr in text format into this ticket.

Best Regards, [rep name]


Minku Lee :

Hi [rep name],

My friend’s ref # is [rep #].

Also attaching traceroute results for two domains.

(Moved traceroute result to traceroute-cloudflare file)

(Moved traceroute result to traceroute-apifmfio file)

I’m using Korea Telecom (KT) DNS 168.126.63.1 / 168.126.63.2, but the nslookup/traceroute results are similar when I use SK BroadBand or LG Uplus.


Minku Lee :

Also attaching MTR results.

(Moved MTR result to mtr-apifmfio file)

(Moved MTR result to mtr-cloudflare file)


Minku Lee :

I found this on GitHub; it seems other people are experiencing this as well: cdnjs/cdnjs#3395


CloudFlare :

Hi,

Currently we're awaiting an increase in capacity from a provider in Seoul. Currently traffic is temporarily re-routed.

It seems as though your ISP prefers to transit you to LAX rather than somewhere else in Asia, due to where they have connectivity.

Once the routes are re-introduced to Seoul after the capacity upgrade, traffic should again flow to there.


Minku Lee :

According to CloudFlare status page, however, Seoul is listed as 'Operational'. If status page shows false results, what's the point of having status page?

I'm sad that I can't trust CloudFlare, which is such an awesome service and one I have recommended to friends a lot. Also, I have 2 paid domains (one on [my personal email], one on [my work email]), and it's too bad that CloudFlare didn't send any notice even to paid customers.


CloudFlare :

Hi,

"According to CloudFlare status page, however, Seoul is listed as 'Operational'. If status page shows false results, what's the point of having status page?"

Because not all traffic is affected in that manner & only some customers may be in the situation. And, as Marty mentioned, it may also be an issue with how ISPs prefer to connect due to things like peering arrangements that they have (we don't have direct control over this).

We will update you as soon as the circumstance changes on our end relative to the upgrades, and we hope that changes the behavior. Until then, however, there's nothing we can do to force it to route the way you want it to.

Regards, [rep name]


Minku Lee :

Hi [rep name],

"Because not all traffic is affected in that manner & only some customers may be in the situation." Normally, Statuspage.io and other companies classify a component's status into four levels:

  • Operational
  • Degraded Performance
  • Partial Outage
  • Major Outage

In particular, Statuspage.io describes 'Partial Outage' as follows: This component is partially down or is experiencing an outage only affecting a small percentage of constituents. (example: 1 of 25 file servers offline)

I believe many CloudFlare customers read a component's status the same way and, like me, will not understand why some group of customers is experiencing a problem while the status page says 'Operational'.

Also, the CF websites I looked into (including cloudflare.com and cdnjs.com) were all routed to faraway regions (LAX, SJC, HKG). There were no exceptions.

"And, as Marty mentioned, it may also be an issue with how ISPs prefer to connect due to things like peering arrangements that they have"

I tried the 3 major ISPs in Korea: KT (Korea Telecom/Kornet), LG Uplus, and SK BroadBand (Powercomm). KT usually routes CF traffic to LAX, while LG Uplus and SK BroadBand route traffic to SJC/HKG. Of course we can blame the ISPs, but customers will blame us when CloudFlare says there is no problem on the network. Some people seem to have contacted their ISP, but its technical team said that CF's Anycast DNS seems to be having a problem.

Anyway, if CloudFlare is not honest about the status of some of its components (and is not willing to change that), it means I can't rely on CloudFlare. If CloudFlare posted an honest notice on the status page saying that the ICN edge has a problem, the situation would be much, much better.

Word of this network problem is spreading through Korean web communities, and I'm sure people's opinion of CF in Korea will get even worse before it gets better.


CloudFlare :

[ticket marked as solved]

====== traceroute www.cloudflare.com ======
traceroute: Warning: www.cloudflare.com has multiple addresses; using 198.41.214.163
traceroute to www.cloudflare.com.cdn.cloudflare.net (198.41.214.163), 64 hops max, 52 byte packets
1 10.0.1.1 (10.0.1.1) 1.292 ms 1.035 ms 0.986 ms
2 175.196.71.126 (175.196.71.126) 6.792 ms 6.811 ms 7.620 ms
3 59.10.107.17 (59.10.107.17) 5.221 ms 6.016 ms 5.269 ms
4 61.78.42.171 (61.78.42.171) 28.842 ms 4.599 ms 5.143 ms
5 112.189.28.249 (112.189.28.249) 5.327 ms
112.189.29.181 (112.189.29.181) 4.466 ms
121.138.3.137 (121.138.3.137) 4.612 ms
6 112.174.58.1 (112.174.58.1) 6.424 ms
112.174.18.9 (112.174.18.9) 4.646 ms
112.174.58.13 (112.174.58.13) 7.247 ms
7 112.174.25.134 (112.174.25.134) 4.638 ms 4.284 ms 3.898 ms
8 112.174.93.245 (112.174.93.245) 3.903 ms 4.303 ms
112.174.93.249 (112.174.93.249) 3.829 ms
9 112.174.84.98 (112.174.84.98) 4.477 ms 4.593 ms
112.174.84.154 (112.174.84.154) 9.161 ms
10 112.174.88.170 (112.174.88.170) 146.260 ms
112.174.88.186 (112.174.88.186) 150.670 ms 156.245 ms
11 xe-0-1-0.edge01.lax01.as13335.net (206.223.123.156) 149.752 ms 148.935 ms 148.400 ms
12 198.41.214.163 (198.41.214.163) 157.402 ms 148.344 ms 157.148 ms
====== end of traceroute www.cloudflare.com ======
====== traceroute api.fmf.io ======
traceroute: Warning: api.fmf.io has multiple addresses; using 162.159.240.95
traceroute to fmf.io (162.159.240.95), 64 hops max, 52 byte packets
1 10.0.1.1 (10.0.1.1) 1.188 ms 0.963 ms 0.956 ms
2 175.196.71.126 (175.196.71.126) 8.418 ms 7.308 ms 6.435 ms
3 59.10.107.17 (59.10.107.17) 8.003 ms 8.494 ms 7.365 ms
4 61.78.42.171 (61.78.42.171) 6.441 ms 5.782 ms 6.497 ms
5 121.138.3.141 (121.138.3.141) 3.840 ms
121.138.4.17 (121.138.4.17) 5.948 ms
121.138.3.141 (121.138.3.141) 4.243 ms
6 112.174.18.1 (112.174.18.1) 8.854 ms
112.174.18.21 (112.174.18.21) 7.244 ms
112.174.18.1 (112.174.18.1) 7.462 ms
7 112.174.65.94 (112.174.65.94) 4.350 ms
112.174.25.134 (112.174.25.134) 4.504 ms
112.174.65.94 (112.174.65.94) 4.429 ms
8 112.174.93.245 (112.174.93.245) 6.981 ms
112.174.93.249 (112.174.93.249) 5.478 ms 8.288 ms
9 112.174.84.206 (112.174.84.206) 5.378 ms
112.174.84.78 (112.174.84.78) 4.779 ms
112.174.84.98 (112.174.84.98) 4.643 ms
10 112.174.87.78 (112.174.87.78) 147.340 ms
112.174.88.186 (112.174.88.186) 149.094 ms
112.174.87.66 (112.174.87.66) 148.543 ms
11 xe-0-1-0.edge01.lax01.as13335.net (206.223.123.156) 147.263 ms 148.371 ms 160.081 ms
12 162.159.240.95 (162.159.240.95) 149.592 ms 147.874 ms 150.815 ms
====== end of traceroute api.fmf.io ======
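
Both traces above enter CloudFlare's network (AS13335) at hop 11 through xe-0-1-0.edge01.lax01.as13335.net. A short Python sketch of pulling the location code out of such a hostname is shown here. It assumes the three-letter label before the digits is an airport-style site code, a pattern inferred only from this one hostname, and edge_code is an illustrative name.

```python
import re

# Assumed pattern: "<interface>.<role>.<3-letter site><digits>.as13335.net".
# Inferred from the single hostname in the traces, not from CloudFlare docs.
EDGE_HOST = re.compile(r"\.([a-z]{3})\d+\.as13335\.net$")


def edge_code(hostname: str):
    """Return the upper-cased site code, or None if the name does not match."""
    match = EDGE_HOST.search(hostname.lower())
    return match.group(1).upper() if match else None


print(edge_code("xe-0-1-0.edge01.lax01.as13335.net"))
print(edge_code("112.174.88.186"))
```

Run over the hop list of a trace, this gives a quick way to see which site a given ISP hands traffic to, which is the comparison made between KT, LG Uplus, and SK BroadBand in the ticket.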
============== mtr api.fmf.io ==============
My traceroute [v0.86]
Vivid-Redux.local (0.0.0.0) Wed Jan 28 10:12:01 2015
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
1. 10.0.1.1 0.0% 3 1.1 1.1 1.0 1.2 0.0
2. 175.196.71.126 0.0% 3 6.2 5.6 5.0 6.2 0.0
3. 59.10.107.17 0.0% 3 4.8 4.0 3.5 4.8 0.0
4. 61.78.42.171 0.0% 3 1.5 1.9 1.5 2.0 0.0
5. 112.189.28.181 0.0% 2 38.7 20.2 1.7 38.7 26.2
6. 112.174.18.5 0.0% 2 3.9 3.4 2.8 3.9 0.0
7. 112.174.25.134 0.0% 2 2.1 2.3 2.1 2.5 0.0
8. 112.174.93.249 0.0% 2 2.6 2.8 2.6 3.0 0.0
9. 112.174.84.102 0.0% 2 2.0 2.3 2.0 2.5 0.0
10. 112.174.87.66 0.0% 2 148.1 147.0 145.9 148.1 1.4
11. xe-0-1-0.edge01.lax01.as13335.net 0.0% 2 145.8 145.8 145.8 145.9 0.0
12. 162.159.240.95 0.0% 2 146.5 146.7 146.5 147.0 0.0
=========== end of mtr api.fmf.io ===========
=========== mtr www.cloudflare.com ===========
My traceroute [v0.86]
Vivid-Redux.local (0.0.0.0) Wed Jan 28 10:12:24 2015
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
1. 10.0.1.1 0.0% 2 1.0 2.4 1.0 3.8 1.7
2. 175.196.71.126 0.0% 2 5.9 11.6 5.9 17.3 8.1
3. 59.10.107.17 0.0% 2 3.9 4.4 3.9 4.8 0.0
4. 61.78.42.171 0.0% 2 1.9 2.3 1.9 2.7 0.0
5. 112.189.28.249 0.0% 2 2.6 8.9 2.6 15.3 8.9
6. 112.174.18.13 0.0% 2 5.9 5.8 5.6 5.9 0.0
7. 112.174.25.134 0.0% 2 2.7 2.9 2.7 3.1 0.0
8. 112.174.93.245 0.0% 2 2.8 2.7 2.6 2.8 0.0
9. 112.174.84.46 0.0% 2 2.5 7.6 2.5 12.8 7.3
10. 112.174.88.186 0.0% 2 147.2 147.2 147.2 147.2 0.0
11. xe-0-1-0.edge01.lax01.as13335.net 0.0% 1 144.8 144.8 144.8 144.8 0.0
12. 198.41.214.163 0.0% 1 146.3 146.3 146.3 146.3 0.0
=========== end of mtr www.cloudflare.com ===========
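
The mtr report for www.cloudflare.com above shows single-digit or low double-digit averages through hop 9 and roughly 147 ms from hop 10 onward. The following Python sketch locates that step by finding the largest increase in average latency between consecutive hops. The hop numbers, hosts, and Avg values are copied from the report; largest_jump is an illustrative helper, and treating the jump as the long-haul leg toward LAX is an interpretation, not something mtr itself reports.

```python
# (hop, host, average latency in ms) copied from the mtr www.cloudflare.com report.
HOPS = [
    (1, "10.0.1.1", 2.4),
    (2, "175.196.71.126", 11.6),
    (3, "59.10.107.17", 4.4),
    (4, "61.78.42.171", 2.3),
    (5, "112.189.28.249", 8.9),
    (6, "112.174.18.13", 5.8),
    (7, "112.174.25.134", 2.9),
    (8, "112.174.93.245", 2.7),
    (9, "112.174.84.46", 7.6),
    (10, "112.174.88.186", 147.2),
    (11, "xe-0-1-0.edge01.lax01.as13335.net", 144.8),
    (12, "198.41.214.163", 146.3),
]


def largest_jump(hops):
    """Return (hop, host, delta_ms) for the biggest rise over the previous hop."""
    best = None
    for (_, _, prev_avg), (hop, host, avg) in zip(hops, hops[1:]):
        delta = avg - prev_avg
        if best is None or delta > best[2]:
            best = (hop, host, delta)
    return best


hop, host, delta = largest_jump(HOPS)
print(f"Largest increase at hop {hop} ({host}): +{delta:.1f} ms")
```

With only one or two probes per hop in the report, the averages are noisy, so this is a rough locator rather than a measurement of the link itself.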