Skip to content

Instantly share code, notes, and snippets.

@bwoebi
Last active August 29, 2015 14:15
Show Gist options
  • Save bwoebi/b4c5564388ecd004ba96 to your computer and use it in GitHub Desktop.
Save bwoebi/b4c5564388ecd004ba96 to your computer and use it in GitHub Desktop.
Do NOT work around weak typehints, but fix them

To start, the current version (0.2) of the scalar type hints RFC is the correct way to do it, if we really want to have two modes (whether declare(strict_types=1) or <?php strict doesn't matter a lot here).

Type hinting in general is useful, no question. We can guarantee in the function body what we types are working with. That avoids bad surprises like working with a non-numeric string when we expect an integer. Very nice.

But two modes? Why? What issue are we solving with strict typing?

Strict typing

Only equal parameters can be passed in. There also a "numeric" type hint was proposed, to solve the issue where we just want a number.

That allows to check that we also only pass in what we expect. Should prevent passing absolutely wrong values in. So, for example

function foo(numeric $bar) {
	return sin($bar) * cos($bar);
}

foo("5 fruits");

That should fail.

Or that great example from Daniel Lowrey:

// Third paramter, expects integer:
// 1: only verify the peer cert HAS a name
// 2: verify the name matches the domain you connected to
// true: auto-casts to 1
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, true)

with current weak hints, that's absolutely dangerous... And found by strict hints.

But at the same time, it has its downsides:

function foo(int $bar) { /* ... */ }
foo((int) $stringContainingInteger);

We need some integer for the function (example: bitmask) when we get it in string form (e.g. database value). So, we need to cast it to get an integer. It just works. It never can fail, no matter what string we pass. It actively hides potential bugs by the string being casted. You may pass in garbage, you still get an integer.

Solution?

Improved weak type casting. Also especially fix internal function casting.

Yes, it will involve a BC break in case of internal functions. I know, we should avoid them, but rather fix it once than never.

Only pass things you can directly convert. (Actually assuming bigints will come, else we make our life harder than needed.) … and which make also sense to pass. Just like it makes no sense to pass true or false to a parameter expecting integer.

When expecting integer, only accept bare numbers. Means reject everything except -?[0-9]+(\.[0-9]*)(e[0-9]*)? (and make attention to not have more digits behind the dot than indicated after the exponent - we want an integer). And if a float is passed, fail if it doesn't represent an exact number.

When expecting a float, accept every integer and numeric string (as long as there's nothing else in the string, but the number. "5 apples" should be disallowed.)

When having a numeric type, use the same rules as for float, but if possible, return an integer rather than a float.

When expecting a boolean, accept an integer (0 == false, everything else is true) or a boolean. (Integers should be accepted, isn't it handy to just pass in a literal 1 instead of true?)

When expecting an array, only accept array.

When expecting a string, accept ints, floats and strings. (booleans and arrays don't really produce useful values here)

When being passed null, just cast null to the specific type.

This simple set of rules is relatively basic, yet it covers nearly every need of implicit casting. It fixes everything why we would need strict types. It fits all the examples above. Even when you want to pass a string to an integer type hint, it just allows valid integers, not every garbage, like when you'd have to do an explicit cast due to strict hints. Also, it probably will break nearly no PHP code which is not already broken in some way.

Is there any reason why we'd need a stricter set than what I just proposed here?

Conclusion

Fix weak type casts, add weak scalar type hints based on them. And we don't need strict type hints at all.

That's mainly why I voted no on the RFC. Not because it is a bad RFC nor a bad concept, it just tries to fix the problem the wrong way. It is a workaround around the too relax current type casting, which basically is the root problem to should be solved.

I think, working around the core problem is definitely not the way to go. Fix the core problem, then we don't need strict scalar hints (as weak hints already will cover everything strict hints aim to solve) and unicorns stop dying so fast.

The proposed rules, resumed

  • Strings which aren't numbers shouldn't be accepted for floats.
  • Strings which don't exactly represent an integer (either pure digits with eventual sign or resulting expression which an exponent shouldn't have any decimal part after expanding) shouldn't be accepted for integer.
  • Only accept floats for ints if they're exactly representable as int (means you should do the rounding yourself, relying on truncation often is risky).
  • For floats accept strings if they're representing a number or an integer. (like java, widening)
  • For numeric accept everything what float accepts, but if losslessly possible cast to int.
  • For strings accept float and integer.
  • For bool accept integer and float.
  • So what breaks compared to now:
  • Strings which aren't strictly numeric are rejected to int, float and numeric.
  • Floats, if they don't exactly represent an integer are rejected to int. (or rather: reject if there exists a nearer representation to the integer)
  • Booleans are rejected to everything else.
@flaupretre
Copy link

Hi,

What's you opinion on what we should allow in numeric strings ? leading blanks, leading zeroes, trailing blanks (just blanks not any char) ? The current STH RFC removes support for all of this but I am not sure it shouldn't be kept...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment