SRT file for https://gist.github.com/pimterry/4971500 which is a transcript of http://www.youtube.com/watch?v=RzjUw47ZIg0 Tim Perry owns all the rights to this derivative work.
1
00:00:00,076 --> 00:00:02,593
This talk is on XML attacks,
2
00:00:02,593 --> 00:00:04,441
which are very easy to become vulnerable to,
3
00:00:04,441 --> 00:00:05,677
because XML is insane
4
00:00:05,677 --> 00:00:07,674
and extremely dangerous, especially if you're
5
00:00:07,674 --> 00:00:09,613
running web services or similar.
6
00:00:09,613 --> 00:00:11,921
First up, Billion Laughs.
7
00:00:11,921 --> 00:00:16,588
Essentially, you can do text substitutions in XML,
8
00:00:16,588 --> 00:00:18,521
because obviously it can rewrite itself as you parse it.
9
00:00:18,521 --> 00:00:21,424
And you do them like this.
10
00:00:21,424 --> 00:00:23,026
So, you define a whole load of rules,
11
00:00:23,026 --> 00:00:26,196
and then at the bottom, &lol9; gets replaced by
12
00:00:26,196 --> 00:00:28,175
10 &lol8;s, which each then get replaced by
13
00:00:28,175 --> 00:00:30,896
10 &lol7;s, and eventually that gives you one billion
14
00:00:30,896 --> 00:00:35,772
lols. One byte for each character, 3 bytes for a lol, gives you
15
00:00:35,772 --> 00:00:39,776
3GB of string. Parsing that will take a long time and will
16
00:00:39,776 --> 00:00:44,144
probably break things when you write it anywhere.
17
00:00:44,144 --> 00:00:47,350
On top of that, you can also substitute
18
00:00:47,350 --> 00:00:52,591
things for other resources, such as files.
19
00:00:52,591 --> 00:00:56,178
If you do this you'll read from the random bit on the disk,
20
00:00:56,178 --> 00:00:58,728
and it will keep giving you data, forever.
21
00:00:58,728 --> 00:01:00,469
You will never parse this.
22
00:01:00,469 --> 00:01:03,838
In the meantime it'll also fill all of your memory.
23
00:01:03,838 --> 00:01:06,870
It's not very good.
24
00:01:06,870 --> 00:01:09,778
In addition, imagine you've got some web service where
25
00:01:09,778 --> 00:01:11,757
you take some XML and maybe you look at it
26
00:01:11,757 --> 00:01:14,711
for a bit, and you give it back, in an error page or similar.
27
00:01:14,711 --> 00:01:16,746
If you do this, you give them your passwords, or
28
00:01:16,746 --> 00:01:19,763
your private keys, or anything else you can possibly get at.
29
00:01:19,763 --> 00:01:23,453
And also, even more insane, you can read from the Internet.
30
00:01:23,453 --> 00:01:28,441
So, this could now read anything, so you can do things like
31
00:01:28,441 --> 00:01:30,960
the previous attack, and you can give them the contents
32
00:01:30,960 --> 00:01:33,302
of your private intranet site, or they can provide you a
33
00:01:33,302 --> 00:01:35,570
website and they can just block you, and you just wait for
34
00:01:35,570 --> 00:01:38,271
a while, or they can give you whatever they want to give
35
00:01:38,271 --> 00:01:40,810
you, and you will just sit and take it.
36
00:01:40,810 --> 00:01:43,473
These look relatively easy to block, but also
37
00:01:43,473 --> 00:01:46,843
there's XInclude, which does the same thing, but again.
38
00:01:46,843 --> 00:01:50,046
And then, you can mix them together.
39
00:01:50,046 --> 00:01:52,682
This will connect to that website one billion times.
40
00:01:52,682 --> 00:01:57,621
If you send this to a webserver, it will attack somebody else
41
00:01:57,621 --> 00:01:58,592
a lot.
42
00:01:58,592 --> 00:02:00,762
If you send this multiple times, all the threads on your
43
00:02:00,762 --> 00:02:03,793
little multithreaded webserver will all go and attack
44
00:02:03,793 --> 00:02:06,060
them, and it'll load balance it and everything.
45
00:02:06,060 --> 00:02:07,731
This isn't theoretical!
46
00:02:07,731 --> 00:02:10,200
Last week, we had this, and it works.
47
00:02:10,200 --> 00:02:14,904
If you point it to itself, it takes the entire server down.
48
00:02:14,904 --> 00:02:17,407
It's absolutely mad.
49
00:02:17,407 --> 00:02:20,710
As well as that, you can do XML injection,
50
00:02:20,710 --> 00:02:22,812
which is like SQL injection, but "enterprisey".
51
00:02:22,812 --> 00:02:27,183
Say you expect to send this off from your order-processing
52
00:02:27,183 --> 00:02:29,383
website somewhere, that's then going to build
53
00:02:29,383 --> 00:02:31,516
them and sell on this product and so on.
54
00:02:31,516 --> 00:02:33,340
And you're going to take some user input and you're going
55
00:02:33,340 --> 00:02:35,097
to put it into here, and you're going to sell them what
56
00:02:35,097 --> 00:02:37,994
they want. You take some user input, but oh no, they've
57
00:02:37,994 --> 00:02:41,430
given you XML, and they've written a new price in there and
58
00:02:41,430 --> 00:02:43,092
now everything's free.
59
00:02:43,092 --> 00:02:48,261
Lots of XML parsers are quite naïve and foolish, and
60
00:02:48,261 --> 00:02:51,614
will give you the last price, if you ask for a single element,
61
00:02:51,614 --> 00:02:54,043
and it'll be zero. But you've got a clever XML parser,
62
00:02:54,043 --> 00:02:55,478
so it's fine.
63
00:02:55,478 --> 00:02:59,346
But they've commented it out now! There's only one
64
00:02:59,346 --> 00:03:01,758
interpretation of this, and that is that everything is free.
65
00:03:01,758 --> 00:03:04,619
If you take XML that goes in like this and you don't
66
00:03:04,619 --> 00:03:10,132
sanitize it, everything goes horribly, horribly wrong.
67
00:03:10,132 --> 00:03:12,013
So what do you do?
68
00:03:12,013 --> 00:03:14,348
Essentially, this.
69
00:03:14,348 --> 00:03:17,809
If you're building XML yourself, you sanitize your inputs.
70
00:03:17,809 --> 00:03:20,570
You make sure you're not putting in special characters
71
00:03:20,570 --> 00:03:22,644
and then there's a different set of special characters
72
00:03:22,644 --> 00:03:25,308
if you're putting in attributes and so on and so on.
73
00:03:25,308 --> 00:03:27,544
Or sensibly, you build XML with some kind of framework,
74
00:03:27,544 --> 00:03:30,487
which will do all of this for you, but you should test it,
75
00:03:30,487 --> 00:03:31,680
as well.
76
00:03:31,680 --> 00:03:34,852
Secondly, you disable all the bits of XML which are
77
00:03:34,852 --> 00:03:37,887
absolutely mad, which is lots of them, and are enabled
78
00:03:37,887 --> 00:03:41,587
by default in lots of things. Again, needs testing.
79
00:03:41,587 --> 00:03:45,064
Then, when you get exceptions in your XML, you don't
80
00:03:45,080 --> 00:03:48,372
show them to the user! If you're trying to do XML injection
81
00:03:48,388 --> 00:03:51,018
and the thing tells you you're missing this element,
82
00:03:51,018 --> 00:03:52,186
you add the element,
83
00:03:52,186 --> 00:03:54,269
and it's easy and it just works.
84
00:03:54,269 --> 00:03:57,763
You don't want to let people do that, so that covers that.
85
00:03:57,763 --> 00:03:59,765
But then if you do need to use any of the special features,
86
00:03:59,765 --> 00:04:01,567
you lock it down.
87
00:04:01,567 --> 00:04:04,603
So you watch it and you don't let it parse for half an hour,
88
00:04:04,603 --> 00:04:07,112
or you don't let it use 3GB of memory, and if you really
89
00:04:07,112 --> 00:04:10,008
need to pull in resources from somewhere else,
90
00:04:10,008 --> 00:04:13,545
you only let in certain ones, not all of the Internet.
91
00:04:13,545 --> 00:04:17,745
In Java, this is relatively easy: you just set a load of things
92
00:04:17,745 --> 00:04:20,616
to false. I'll send this 'round for reference,
93
00:04:20,616 --> 00:04:22,421
but it's not terribly complicated.
94
00:04:22,421 --> 00:04:24,421
In .NET it's even easier and you can actually set
95
00:04:24,421 --> 00:04:26,592
the XmlResolver bit, which is the bit that looks up external
96
00:04:26,592 --> 00:04:29,062
resources, to null, so it can't get any, and it's
97
00:04:29,062 --> 00:04:31,096
just better.
98
00:04:31,096 --> 00:04:33,765
And that's basically it.
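
The Java and .NET settings mentioned in the talk are not reproduced in this transcript. As a rough Python analogue of "disable all the bits of XML which are absolutely mad", the sketch below refuses any document that carries a DOCTYPE. Custom entities can only be declared inside a DTD, so this rules out both Billion Laughs and external-entity reads before the real parse happens. The names ForbiddenXml and parse_untrusted are made up for illustration; this is one possible approach under those assumptions, not the speaker's code.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class ForbiddenXml(ValueError):
    """Hypothetical error type: the document uses a feature we refuse to handle."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this once it reaches the DOCTYPE declaration, before any
    # entity in its internal subset has been declared or expanded.
    raise ForbiddenXml("DOCTYPE declarations are not accepted")


def parse_untrusted(data: bytes) -> ET.Element:
    """Parse XML from an untrusted source, refusing any DOCTYPE."""
    # First pass: a bare expat parser whose only job is to spot a DOCTYPE.
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _reject_doctype
    scanner.Parse(data, True)
    # Second pass: the document has no DTD, so no custom entities exist.
    return ET.fromstring(data)
```

Scanning first and parsing second costs two passes over the input, but keeps the check independent of how the main parser is configured. ElementTree does not process XInclude unless you call it explicitly, so that feature stays off here as well. If the check trips, catch ForbiddenXml on the server side and return a generic error rather than the exception text, in line with the advice above about not showing XML exceptions to the user.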