kentfredric/00_TLDR.mkdn

## 00_TLDR.mkdn

      
CGI.pm is shit
CGI is shit
<"ARGV"> shouldn't work under use strict, because that's a string being dereferenced as a symbolic ref.
Hash Keys can't retain tainting and so can be used to propagate un-vetted data into safe spaces:

  my $hash = unsafe_thing_that_returns_a_hash();
  $dbh->query(join ',', keys %{$hash}); # keys will be untainted regardless of what they are.

CGI.pm should probably do something smarter than simply returning the first param when >1 params
are passed and you only expected there to be one.
^ Doubly so when one of those params is a file
CGI.pm should probably not unify GET parameters and uploaded files like that.
CGI should not introduce bugs when this data is seen as the effective params set:

[
  file => "ARGV",
  file => \$handle,
]
because clearly, when you ask "Is that a file?":

  if ( $cgi->upload('file') ) {       # true
    my $file = $cgi->param('file');   # this should not be a string, but it is
  }

This is not good.

  
## 32c3_Perljam_2.mkdn

      
### CGI Params return list debacle.

Yep. CGI.pm made a bad choice. CGI is not Perl, it is written in Perl.
Programmers can and do make dumb decisions.
Perl does not stop you from making dumb decisions, because Perl doesn't
want to stop you making good decisions.
CGI is not recommended, and has not been for some time, and is no longer
shipped in the Perl distribution to reflect this.
If you have something to bitch about with CGI: File a bug. Stop Using CGI.
Tell people who are using CGI not to.
Software doesn't magically stop being shit because you complain about it
on a stage.
Those are literally all your options. Being written in Perl instead of Ruby
doesn't magically make it more shit. Writing it in some other language
wouldn't make it inherently non-shit.
But, you know, go on, hold up the shittiest modules as a representation of
the language itself and then present the problem as an intractable one.
Obviously, because Internet Explorer is shit, whatever language it was
written in is shit too, and that language will continue to be shit simply
because people continue to use Internet Explorer.
Stop using C++ guys, it has bugs!
K

### "The Community"

Perl Monks is not "The Perl Community", it is one face of a subset of
"The Perl Community". There's also RT, IRC, MetaCPAN, StackOverflow,
blogs.perl.org, twitter, Facebook, and undoubtedly many more places that
"The Perl Community" congregate in.
The community can't really be considered "Gated" in this manner, you've
just chosen to find one semi-gated community and decide "That's the
Community". That'd be like me referring to a pub as being "The General
Public" and then calling "The General Public" a "Gated Community" because
pubs have bouncers.
WAT

### "You can declare variables without specifying a data type"

Yes. This is true to an extent. And as you said, other languages have this.
But the slide given demonstrates 2 exceptions, because it demonstrates the
use of first-class "Array" data types and "Hash" data types, which require
specific syntax to manipulate.
This is not your typical "FooType variable" type constraint, but it does
clearly separate "Hash" and "Array" "types" from "Scalar" "types".
For contrast, without the "@" and "%" prefixes, you'd only have the options
of

    $int   = 0;
    $str   = "hello";
    $array = [ ];
    $hash  = { };

If this was our reality, we'd be closer to PHP's

    $int = array( );

Except that array is sort of like Schrödinger's Array, where it's a quantum
superposition of array and hash, and it behaves like the opposite of the
one you wanted at least some of the time.

    $int = array( "an", "array" );
    $int = array( "key" => "value" );            # WAT
    $int = (object)(array( "key" => "value" ));  # WAT

It's clear we have a little more granularity in types than that.
And there's JavaScript of course that similarly blurs the lines (in useful
ways)

    x = { "0": "hello", "1": "world" };
    y = [ "hello", "world" ];
    x[0] == y[0] && x[1] == y[1]    // Wat?
    y["foo"] = "bar";               // WAT
    JSON.stringify(y);              // WAT? Where did "foo" go?

### Data is an unknown data type and you can't force it to be.

Yes. And this is a standard feature of Dynamic programming languages.
JavaScript, Ruby, Python, etc. They all do this.
### Multiple paths for different data types.

Again, Yes, you can have different handling for different types of
"detected" types.
This is also a function of most dynamic programming languages.
Languages without this flexibility may support multiple dispatch instead.
This is why things like

    $("Query String");
    $(function(jQuery) {   });

are supportable in jQuery.
Because it's often more convenient to have an API that allows multiple
signatures for a common function than it is to have to define multiple
function names for the different argument options.
"A Perl Standard" ... well, I don't know about that. It's a common convention
you see in many places. And how applicable it is to a situation depends on
a whole bunch of design trade-offs.
You can do it, and I would even argue, you should!
Because that's exactly the behaviour that is implied by default values
for arguments.
Because:

    object.function(argument)

and

    object.function(argument, secondargument)

is essentially asking the question "if the second argument is of magical type
NOTSPECIFIED, substitute it with an alternative value".
And that's not a whole lot different from "Type Undefined".
And "do something different if the user forgot to pass an argument" is
pretty much a standard in every language.
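
A minimal Perl sketch of that idea, assuming a hypothetical `greet()` function (the name and the default value are made up for illustration): a missing second argument simply shows up as undef, and the function substitutes a default.

```perl
use strict;
use warnings;

# greet() is a hypothetical function, used only for illustration.
sub greet {
    my ( $name, $greeting ) = @_;

    # A missing (or explicitly undef) second argument gets the default.
    $greeting //= 'Hello';

    return "$greeting, $name!";
}

print greet('world'), "\n";                # one-argument form
print greet( 'world', 'Goodbye' ), "\n";   # two-argument form
```

Note that this sketch cannot tell "not passed" apart from "passed as undef", which is pretty much the "Type Undefined" point.
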
There's similarly no way to tell JavaScript "hey, this argument must not be
an array", so you have to detect that at runtime.
So obviously JavaScript sucks and everyone should stop using it.
Incidentally, Perl6 supports these if you want them, so ... yay for Perl6!

### "The problem is when Hashes and Arrays are considered as Secure Data types"

And that's just the Crux of it. They aren't. Anyone who writes code with an
API knows this.
Anyone making this mistake is the idiot.
Neither the language itself nor the culture surrounding it claims that
structured data is inherently secure as a function of being structured.
Because whether or not a data structure is secure is clearly a function
of where that data came from.
For instance, PHP permits forms to send POST/GET requests which result in
arbitrarily structured arrays.
And people who are stupid take that data verbatim as trustworthy.
And that's just it: They're stupid for doing that.
If you don't sanitize your input at the border prior to passing it to
something that requires that input to be "Safe", then your data is not
"Safe".
You didn't sanitize it, there's no way it can be safe.
And some layers must assume the caller did due diligence.
For instance, if you have a function called "rm" and it takes a string naming a
file to delete, it's not rm's job to work out whether or not the user
calling it was permitted, and it's not rm's job to work out if the string
passed is a file-that-should-be-deleted or not.
You fix that before you pass it to rm, and if rm deletes it, rm did its job,
and did exactly what you told it to do. If you told it to do the wrong
thing, that's all on you.
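
As a rough sketch of that split in responsibilities: `vetted_filename()` below is a hypothetical border check, and its allow-list pattern is an assumption picked for illustration, not a universal rule. The `rm` sub is equally hypothetical and deliberately does nothing but what it is told.

```perl
use strict;
use warnings;

# vetted_filename() is a hypothetical border check; the allow-list
# pattern is an assumption chosen for illustration.
sub vetted_filename {
    my ($input) = @_;
    die "filename must be a plain string\n"
        if !defined $input || ref $input;

    # Accept only a short, flat name: no slashes, no leading dot or dash.
    my ($clean) = $input =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]{0,63})\z/
        or die "rejected filename\n";
    return $clean;
}

# The low-level layer does exactly what it is told and nothing more.
sub rm {
    my ($file) = @_;
    return unlink $file;
}

# Only the vetting step runs here; rm() is never called in this demo.
for my $candidate ( 'report.txt', '../etc/passwd', 'echo broken|' ) {
    my $ok = eval { vetted_filename($candidate) };
    printf "%-16s %s\n", $candidate, defined $ok ? 'accepted' : 'rejected';
}
```

The caller would only ever hand the return value of the check to `rm`, never the raw input.
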
### [ACTUAL POSSIBLE ISSUE] Hash Keys are Not Tainted.

This is the only place I've seen a justifiable complaint about taint mode.
Given the point of taint mode is to make things explode if you didn't
sanitize things correctly, it would be a problem indeed if somebody passed
JSON like:

    { "DROP TABLES *": "Harmless value" }

And somebody, somehow, managed to sanitize the values properly, but managed
to pass those keys unsanitized to a query.
Because:

    ->query( join ',', keys %HASH )

would silently work, regardless of whether or not the hash key itself was ever
sanitized.
The downside is that this problem is possibly too hard to solve, because
hash keys are heavily used in the VM itself, and underlie things like
package stashes, and so any taint checks here could slow down perl in
entirely unacceptable ways.
Additionally, people have relied on the taintless behaviour of hash keys on
purpose, and have copied variables into hashes simply to remove taint flags,
which makes changing that behaviour without breaking a lot of code
difficult.
I would still maintain that your coding ethics SHOULD NOT rely on taint
mode to keep you safe, and you should exercise sufficient levels of
diligence.
But taint mode should still be reliable enough to catch where you've left a
barn door open by accident.
But your criticism focused wrongly on hash values as being the attack
vector, when it's hash keys.
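
A small way to see the key/value difference for yourself, sketched under the assumption that `$ENV{PATH}` is set and stands in for "un-vetted external data" (any tainted value would do). It needs to be run as `perl -T`; without `-T` taint tracking is off and nothing is ever tainted.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Run as: perl -T this_script.pl
# $ENV{PATH} is only a stand-in for un-vetted external data.
my $unsafe = $ENV{PATH} // '';

# Use the same untrusted string as both key and value.
my %hash = ( $unsafe => $unsafe );
my ($key) = keys %hash;

printf "taint mode on : %s\n", ${^TAINT} ? 'yes' : 'no';
printf "original value: %s\n", tainted($unsafe)     ? 'tainted' : 'clean';
printf "hash value    : %s\n", tainted( $hash{$key} ) ? 'tainted' : 'clean';
printf "hash key      : %s\n", tainted($key)        ? 'tainted' : 'clean';
```

Under `-T` the value should keep its taint flag while the key comes back clean, which is exactly the barn door described above.
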
### Bugzilla treats hash values as secure.

Yes. This is a bug in Bugzilla.
No, this is not something the Perl community promotes or encourages.
All data passed to a function must be treated as equally suspicious,
structured or not.
Given the syntax of the _load_from_db function based on the way the slide
presents it, the _load_from_db function is intended as a lower level
function that just "does what it's told", which is why it allows passing
a free-form component of an SQL query in verbatim in the first place.
I would posit that the error here is not that it takes a hash as an argument
and treats it as "Secure", it's that the second method with "detaint" is
detainting at the wrong level.
_load_from_db should also be treating $param as "Secure", where any
quoting/escaping for passing the value to the database should only be
considered an API convenience, not a security feature.
And _load_from_db should be barfing if any tainted data is passed to
it, not transparently detainting it.

### "On all 3 CGI Modules"

You cite one CGI module.
The other two are Web Platforms, which happen to support the CGI
interface under the hood, but are otherwise not related to CGI.pm
And the other two are generally NOT used with the CGI interface,
because CGI is shit, and CGI.pm is shit.
The rest of this part of your analysis seems to be complaining that
different web platforms do different things, and I'm struggling to
understand how that could be construed to be a point, let alone, a point
that demonizes Perl itself as being "Bad".
These are all design decisions of the independent modules in question,
which all just happen to be written in Perl.
Any language can make such a decision, and Any dynamic language
could be equally vulnerable to these sorts of issues.
I imagine if you compared the exact same metrics with different languages
and competing web frameworks, you'd again, get a variety of different
results.
And I would naturally expect something to happen automatically server-side
as a result of extra parameters, because that's a common idiom on the
web
( PHP might be more consistent here, because unlike the other languages
where parsing GET requests requires dedicated modules that implement parts
of the HTTP Specification, PHP implements those parts natively in the
language itself and magically pumps the values into global variables. )

### "You can't tell what your datatype is made of!"

Wait. You just argued you could, earlier, and complained that it wasn't
automatic.
The example used ref, which indicates the type of data you have.
This is, again, a common pattern in dynamic programming languages, and
you've used this false claim with other false claims to construct an
argument that paints Perl as being radically different from other
weakly typed programming languages.
You just have to apply diligence and write code that handles the different
cases.
It's when you don't care what your data is and then you try to do things
with that data that require you to care, that things go amok.
The trick is to not do that.
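
For illustration, a minimal sketch of "handling the different cases" with `ref`. The `describe()` helper is hypothetical, and which types it accepts is an arbitrary choice for the example.

```perl
use strict;
use warnings;

# describe() is a hypothetical helper, used only for illustration.
sub describe {
    my ($data) = @_;
    return 'undef' if !defined $data;

    my $type = ref $data;    # '' for plain scalars, 'ARRAY', 'HASH', ...
    return "scalar: $data" if $type eq '';
    return 'array of ' . scalar( @{$data} ) . ' item(s)' if $type eq 'ARRAY';
    return 'hash with keys: ' . join( ', ', sort keys %{$data} )
        if $type eq 'HASH';

    # Refuse anything unexpected instead of guessing.
    die "unsupported data type: $type\n";
}

print describe($_), "\n" for 'hello', [ 1, 2, 3 ], { b => 1, a => 2 };
```

The important part is the last line of the sub: anything you didn't plan for blows up loudly rather than being treated as whatever you hoped it was.
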
### CGI.pm Exploit.

Yes. CGI.pm is shit, stop using it. Stop running conferences where you tell
people how not to do things with software we're telling people not to use.
Tell people not to use CGI.pm, because it is shit.
CGI.pm is not "Perl".
### [ACTUAL POSSIBLE ISSUE] <"ARGV"> Exploit

You can demonstrate this bug much more clearly with

    use strict;
    use warnings;

    # A "filename" ending in a pipe is run as a command by the magic open.
    @ARGV = ('echo broken|');

    while( <"ARGV"> ) {
       print $_;
    }

This problem is again exacerbated because CGI is shit, as CGI created the
environment wherein you have a list of arbitrary user data in @ARGV, and
CGI.pm made things worse by its ->param() bullshit.
But your understanding is wrong.

    my $file = $cgi->param('file');

This does NOT return a list of files in your situation.

    use strict;
    use warnings;

    # Report the context this sub was called in.
    sub print_context {
        print defined wantarray
            ? ( wantarray ? 'list' : 'scalar' )
            : 'void';
    }

    my $file = print_context; # scalar

And CGI.pm now spews a nice big warning if you're dumb enough to use it in
list context:

CGI::param called in list context from {...}, this can lead to
vulnerabilities. See the warning in "Fetching the value or values of a
single named parameter" at {...}

Instead, CGI.pm does what you'd expect any other thing to do in a situation
where the user passed two variables: It returns only one of them.
So your demon about list context is nowhere to be seen here.
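
To make the scalar-versus-list distinction concrete, here is a sketch using a hypothetical `fake_param()` that mimics a context-sensitive accessor. It is not CGI.pm's implementation, and the request data is invented.

```perl
use strict;
use warnings;

# Hypothetical request data: the client sent "name" three times.
my %query = ( name => [ 'bob', 'mode', 'unsafe' ] );

# fake_param() is a hypothetical stand-in for a context-sensitive
# accessor; it is not CGI.pm's implementation.
sub fake_param {
    my ($key) = @_;
    my @values = @{ $query{$key} || [] };
    return wantarray ? @values : $values[0];
}

# Scalar assignment imposes scalar context: exactly one value comes back.
my $name = fake_param('name');

# A hash constructor imposes list context: every value is spliced in,
# and the later "mode" pair overrides the earlier one.
my %args = ( mode => 'safe', name => fake_param('name') );

# Forcing scalar context keeps the call to a single value.
my %fixed = ( mode => 'safe', name => scalar fake_param('name') );

print "scalar: $name\n";
print "list  : mode=$args{mode}\n";
print "forced: mode=$fixed{mode}\n";
```

The plain `my $file = ...` assignment is the first case, which is why there is no list to be seen there.
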
However, <"ARGV"> being treated the same as <ARGV> is clearly a bug that should
be fixed, because @ARGV is a symbol, and strings dereferencing to symbols
transparently like that is typically something strictures prohibit.
### 2-Arg Open And Friends Suck

Well. Yes, that's why we've been recommending against using them in code
since 5.8 shipped.
while(<>) implying 2-arg open is also a thing we've recognised as "bad", so we now
have while(<<>>).
But it's worth mentioning that if you are writing a command line application,
the ability to pass a filename, or a string that can be executed to
produce content, can be useful.
Yes, in a server context, this is very much not what you want, because you
generally don't want users to pass in executables to invoke.
It's kinda useful being able to have

    while(<>) {
    }

in a CLI app and having that transparently support:

    echo foo | perl ./script.pl -
    perl ./script.pl ./source_file_1 ./source_file_2
    perl ./script.pl ./source_file_1 ./source_file_2 'gzcat ./source_file_3|'

But you of course don't want magic data in ARGV when you're running a
server.
But ... that's just an argument to "STOP USING CGI"
Because CGI stupidly makes the CLI and the HTTP request a unified interface,
so you get the worst of both worlds.
So, again, you've got an exploit that fundamentally relies on ancient
technology that we've been setting on fire and trying to kill for ages now.
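
For contrast with the magic behaviour, a small sketch of what the explicit three-argument open does with the same hostile "filename". It assumes no file with that literal name exists in the current directory.

```perl
use strict;
use warnings;

# The same hostile "filename" used in the <"ARGV"> demonstration.
my $name = 'echo broken|';

# Three-argument open: the mode is explicit, so the trailing pipe is just
# part of a (probably nonexistent) filename and nothing gets executed.
my $opened = open my $fh, '<', $name;

if ($opened) {
    print "opened a file literally named '$name'\n";
    close $fh;
}
else {
    print "open failed: $!\n";
}

# On perl 5.22 and later, while (<<>>) { ... } gives each entry of @ARGV
# the same literal-filename treatment.
```

Nothing gets executed either way: the string is only ever looked up as a filename.
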
### Misleading Documentation

What did you do, step in a time machine? You're giving presentations in
December using a screenshot of an example dated 2009, that was updated back
in April.
https://github.com/leejo/CGI.pm/blob/58dd6be3899d2cb24a2dcd873729b8be0cc43052/examples/file_upload.cgi#L16
I can't confirm if the example has been updated to accommodate this, and I
still maintain CGI is shit and you should expect bad things to happen if you
use it.
Either way, that's what we call "A bug in CGI.pm", and I was under the
impression the right approach is to file a bug when they're found, so that
they can be fixed.
Unless of course you want to be a 1337 hacker geek and keep that to yourself
for 8 months so you can grand-stand on a stage and feel popular while making
out like it's some programming language that's to blame for bad software existing.
golf clap

### "I'm not blaming CGI developers"

But you should. All these problems are design decisions made by
programmers.
They could have easily made decisions that avoided these pitfalls.
The language holding your hand wouldn't magically make this go away.
Yes, there are bugs in the language, and CGI's poor design choices make
those bugs bigger.
But let's not fool ourselves that CGI.pm is the shining standard of decent
Perl modules on CPAN, like you seem to have a hardon for trying to argue.
And you should blame people who use CGI.pm for using CGI.pm, because...

... CGI.pm is no longer considered good practice for developing web
applications, including quick prototyping and small web scripts. There
are far better, cleaner, quicker, easier, safer, more scalable, more
extensible, more modern alternatives available at this point in time.

And it has said that for over a year now (since May 2014).
That's right up there in The Manual, and anyone who is part of "The Perl
Community" will gladly tell you that CGI.pm is shit, and you shouldn't use
it.
And if you're not going to listen to either of those very loud signals, you
very much should be considered part of the problem.
I mean, what else do we have to do to convince you not to use CGI.pm?
Maybe we should be adding security holes instead of removing them to
discourage its use?
### CLOSING.

In short, your diagnosis revolves mostly around needing CGI.pm, and revolves
around mistakes that CGI.pm encourages you to make that don't make sense
in a Web context, but do make sense in a CLI context.
And then you conflate that with interpretations of fundamentals that aren't
true, and you think Perl is the problem.
But maybe you should just stop using CGI.pm, and tell people "Stop using
CGI.pm".
And maybe you could consider filing bugs about things and getting them
fixed, instead of bitching that something is broken and wondering why nobody
has fixed it yet.
But hey, at least you know how Perl5 and Perl6 are fundamentally different
;), so you got something right :p
### AFTERTHOUGHTS on Questions

#### Why do Perl people bitch when it's attacked?

Well, it's become a bit of an annoying trope.
It's an "in thing" to go around complaining about Perl, and at least half the
people who are "attacking it" are using arguments about it that are false.
So attacking Perl with stupid arguments has become a meme in the programming
community, and it is largely perpetuated by people who don't know the
language, who see its syntax, never invest the effort to properly understand
it, and then criticise it, not objectively, but out of ignorance.
So there's naturally a level of kickback.
And we also try to be better than having debates where we're constantly
insulting languages for non-objective reasons, because we'd rather shut the
fuck up and write some code ;)
If only people could learn to criticise it objectively ( and file fucking
bugs plz ), then maybe we could consume that objective criticism and use it
to improve the language.