Created February 17, 2013 14:50
SRT file for https://gist.github.com/pimterry/4971500, which is a transcript of http://www.youtube.com/watch?v=RzjUw47ZIg0. Tim Perry owns all the rights to this derivative work.
1
00:00:00,076 --> 00:00:02,593
This talk is on XML attacks,

2
00:00:02,593 --> 00:00:04,441
which are very easy to become vulnerable to,

3
00:00:04,441 --> 00:00:05,677
because XML is insane

4
00:00:05,677 --> 00:00:07,674
and extremely dangerous, especially if you're

5
00:00:07,674 --> 00:00:09,613
running web services or similar.

6
00:00:09,613 --> 00:00:11,921
First up, Billion Laughs.

7
00:00:11,921 --> 00:00:16,588
Essentially, you can do text substitutions in XML,

8
00:00:16,588 --> 00:00:18,521
because obviously it can rewrite itself as you parse it.

9
00:00:18,521 --> 00:00:21,424
And you do them like this.

10
00:00:21,424 --> 00:00:23,026
So, you define a whole load of rules,

11
00:00:23,026 --> 00:00:26,196
and then at the bottom, &lol9 gets replaced by

12
00:00:26,196 --> 00:00:28,175
10 &lol8's, which each then get replaced by

13
00:00:28,175 --> 00:00:30,896
10 &lol7's, and eventually gives you one billion

14
00:00:30,896 --> 00:00:35,772
lols. Byte for each character, 3 bytes for a lol, gives you

15
00:00:35,772 --> 00:00:39,776
3GB of string. Parsing that will take a long time and will

16
00:00:39,776 --> 00:00:44,144
probably break things when you write it anywhere.

17
00:00:44,144 --> 00:00:47,350
On top of that, you can also substitute

18
00:00:47,350 --> 00:00:52,591
things for other resources, such as files.

19
00:00:52,591 --> 00:00:56,178
If you do this you'll read from the random bit on the disk,

20
00:00:56,178 --> 00:00:58,728
and it will keep giving you data, forever.

21
00:00:58,728 --> 00:01:00,469
You will never parse this.

22
00:01:00,469 --> 00:01:03,838
In the meantime it'll also fill all of your memory.

23
00:01:03,838 --> 00:01:06,870
It's not very good.

24
00:01:06,870 --> 00:01:09,778
In addition, imagine you've got some web service where

25
00:01:09,778 --> 00:01:11,757
you take some XML and maybe you look at it

26
00:01:11,757 --> 00:01:14,711
for a bit, and you give it back, in an error page or similar.

27
00:01:14,711 --> 00:01:16,746
If you do this, you give them your passwords, or

28
00:01:16,746 --> 00:01:19,763
your private keys, or anything else you can possibly get at.

29
00:01:19,763 --> 00:01:23,453
And also, even more insane, you can read from the Internet.

30
00:01:23,453 --> 00:01:28,441
So, this could now read anything, so you can do things like

31
00:01:28,441 --> 00:01:30,960
the previous attack, and you can give them the contents

32
00:01:30,960 --> 00:01:33,302
of your private intranet site, or they can provide you a

33
00:01:33,302 --> 00:01:35,570
website and they can just block you, you can just wait for

34
00:01:35,570 --> 00:01:38,271
a while, or they can give you whatever they want to give

35
00:01:38,271 --> 00:01:40,810
you, and you will just sit and take it.

36
00:01:40,810 --> 00:01:43,473
These look relatively easy to block, but also

37
00:01:43,473 --> 00:01:46,843
there's XInclude, which does the same thing, but again.

38
00:01:46,843 --> 00:01:50,046
And then, you can mix them together.

39
00:01:50,046 --> 00:01:52,682
This will connect to that website one billion times.

40
00:01:52,682 --> 00:01:57,621
If you send this to a webserver, it will attack somebody else

41
00:01:57,621 --> 00:01:58,592
a lot.

42
00:01:58,592 --> 00:02:00,762
If you send this multiple times, all your multithreads on your

43
00:02:00,762 --> 00:02:03,793
little multithreaded webserver will all go and attack

44
00:02:03,793 --> 00:02:06,060
them, and it'll load balance it and everything.

45
00:02:06,060 --> 00:02:07,731
This isn't theoretical!

46
00:02:07,731 --> 00:02:10,200
Last week, had this, this works.

47
00:02:10,200 --> 00:02:14,904
If you point it to itself, it takes the entire server down.

48
00:02:14,904 --> 00:02:17,407
It's absolutely mad.

49
00:02:17,407 --> 00:02:20,710
As well as that, you can do XML injection,

50
00:02:20,710 --> 00:02:22,812
which is like SQL injection, but "enterprisey".

51
00:02:22,812 --> 00:02:27,183
If you expect to send this off your order-processing

52
00:02:27,183 --> 00:02:29,383
website somewhere, that's then going to build

53
00:02:29,383 --> 00:02:31,516
them and sell on this product and so on.

54
00:02:31,516 --> 00:02:33,340
And you're going to take some user input and you're going

55
00:02:33,340 --> 00:02:35,097
to put it into here, and you're going to sell them what

56
00:02:35,097 --> 00:02:37,994
they want. You take some user input, but oh no, they've

57
00:02:37,994 --> 00:02:41,430
given you XML, and they've written a new price in there and

58
00:02:41,430 --> 00:02:43,092
now everything's free.

59
00:02:43,092 --> 00:02:48,261
Lots of XML parsers are quite naïve and foolish, and

60
00:02:48,261 --> 00:02:51,614
will give you the last price, if you ask for a single element,

61
00:02:51,614 --> 00:02:54,043
and it'll be zero. But you've got a clever XML parser,

62
00:02:54,043 --> 00:02:55,478
so it's fine.

63
00:02:55,478 --> 00:02:59,346
But they've commented it now! There's only one

64
00:02:59,346 --> 00:03:01,758
interpretation of this, and this is that everything is free.

65
00:03:01,758 --> 00:03:04,619
If you take XML that goes in like this and you don't

66
00:03:04,619 --> 00:03:10,132
sanitize it, everything goes horribly horribly wrong.

67
00:03:10,132 --> 00:03:12,013
So what do you do?

68
00:03:12,013 --> 00:03:14,348
Essentially, this.

69
00:03:14,348 --> 00:03:17,809
If you're building XML yourself, you sanitize your inputs.

70
00:03:17,809 --> 00:03:20,570
You make sure you're not putting in special characters

71
00:03:20,570 --> 00:03:22,644
and then there's a different set of special characters

72
00:03:22,644 --> 00:03:25,308
if you're putting in attributes and so on and so on.

73
00:03:25,308 --> 00:03:27,544
Or sensibly, you build XML with some kind of framework,

74
00:03:27,544 --> 00:03:30,487
which will do all of this for you, but you should test it,

75
00:03:30,487 --> 00:03:31,680
as well.

76
00:03:31,680 --> 00:03:34,852
Secondly, you disable all the bits of XML which are

77
00:03:34,852 --> 00:03:37,887
absolutely mad, which is lots of them, and are enabled

78
00:03:37,887 --> 00:03:41,587
by default in lots of things. Again, needs testing.

79
00:03:41,587 --> 00:03:45,064
Then, when you get exceptions in your XML, you don't

80
00:03:45,080 --> 00:03:48,372
show them to the user! If you're trying to do XML injection

81
00:03:48,388 --> 00:03:51,018
and the thing tells you you're missing this element,

82
00:03:51,018 --> 00:03:52,186
you add the element,

83
00:03:52,186 --> 00:03:54,269
and it's easy and it just works.

84
00:03:54,269 --> 00:03:57,763
You don't want to let people do that, so that covers that.

85
00:03:57,763 --> 00:03:59,765
But then if you do need to use any of the special features,

86
00:03:59,765 --> 00:04:01,567
you lock it down.

87
00:04:01,567 --> 00:04:04,603
So you watch it and you don't let it parse for half an hour,

88
00:04:04,603 --> 00:04:07,112
or you don't let it use 3GB of memory, and if you really

89
00:04:07,112 --> 00:04:10,008
need to pull in resources from somewhere else,

90
00:04:10,008 --> 00:04:13,545
you only let in certain ones, not all of the Internet.

91
00:04:13,545 --> 00:04:17,745
In Java, this is relatively easy, you just set a load of things

92
00:04:17,745 --> 00:04:20,616
to false. I'll send this 'round for reference,

93
00:04:20,616 --> 00:04:22,421
but it's not terribly complicated.

94
00:04:22,421 --> 00:04:24,421
In .NET it's even easier and you can actually set

95
00:04:24,421 --> 00:04:26,592
the XmlResolver bit, which is the bit that looks up external

96
00:04:26,592 --> 00:04:29,062
resources, to null, so it can't get any, and it's

97
00:04:29,062 --> 00:04:31,096
just better.

98
00:04:31,096 --> 00:04:33,765
And that's basically it.