@gkinsman
Created September 18, 2013 00:23
Slight modification of http://tomasp.net/blog/2013/tuples-in-csharp/index.html to group by not only parameter type, but name as well
(*@
Note that this is directly from his blog post on https://github.com/tpetricek/TomaspNet.Website/blob/master/source/blog/2013/tuples-in-csharp.fsx
The only minor addition is on L162:
yield [ for p in meth.GetParameters() -> p.ParameterType.ToString() + ": " + p.Name ] }
where we added the name of the parameter in the key for the grouping. Results are interesting!
Also added a Dump on L180 for LinqPad :)
*)
(*@
Layout = "post";
Title = "How many tuple types are there in C#?";
Tags = "c#, f#, functional programming";
Date = "2013-09-17T09:11:57.7562922-04:00";
Description = "In a recent StackOverflow question, the poster asked about the choice between " +
"a function that takes a tuple and a function that uses the curried form. In this article " +
"I look at the problem from the C# perspective.";
*)
(***hide***)
open System
(**
How many tuple types are there in C#?
=====================================
In a [recent StackOverflow question](http://stackoverflow.com/questions/18718232/when-should-i-write-my-functions-in-curried-form/18721711)
the poster asked about the difference between the _tupled_ and _curried_ forms of a function in F#.
In F#, you can use pattern matching to easily define a function that takes a tuple as an argument.
For example, the poster's function was a simple calculation that multiplies the number
of units sold _n_ by the price _p_:
*)
let salesTuple (price, count) = price * (float count)
(**
The function takes a single argument of type `Tuple<float, int>` (or, using the nicer F# notation,
`float * int`) and immediately decomposes it into two variables, `price` and `count`. The
alternative is to write a function in the _curried_ form:
*)
let salesCurried price count = price * (float count)
(**
Here, we get a function of type `float -> int -> float`. Usually, you can read this just as a
function that takes `float` and `int` and returns `float`. However, you can also use _partial
function application_ and call the function with just a single argument - if the price of
an apple is $1.20, we can write `salesCurried 1.20` to get a _new_ function that takes just
an `int` and gives us the price of the specified number of apples. The poster's question was:

> So when I want to implement a function that would have taken _n > 1_ arguments,
> should I for example always use a curried function in F# (...)? Or should I take
> the simple route and use regular function with an n-tuple and curry later on
> if necessary?

You can see [my answer on StackOverflow](http://stackoverflow.com/questions/18718232/when-should-i-write-my-functions-in-curried-form/18721711#18721711).
The point of this short introduction was that the question inspired me to think about how
the world looks from the C# perspective...

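
As a small sketch of the partial application described above (the names `applePrice` and `totals` are made up for this illustration and do not come from the StackOverflow question), fixing the first argument of `salesCurried` yields a reusable function:

```fsharp
// Curried form, as defined earlier in the post
let salesCurried price count = price * (float count)

// Supplying only the price returns a new function of type int -> float
let applePrice = salesCurried 1.20

// The partially applied function can be reused for any quantity
let totals = [ 1; 5; 10 ] |> List.map applePrice
```

With the tupled `salesTuple`, the same effect would need an explicit lambda such as `fun count -> salesTuple (1.20, count)`.
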
To curry or not to curry?
-------------------------
I will not repeat the whole answer in the blog post. The key idea is that you should use
a tuple when the tuple has some _logical meaning_. For example, if you have a function that
takes a range or 2D coordinates, it makes sense to use `float * float`.

This makes sense because you can then nicely compose multiple functions that work with
ranges. For example, let's say we have two functions, `normalizeRange` and `expandRange`:
*)
let normalizeRange (lo, hi) =
  if lo > hi then (hi, lo) else (lo, hi)
let expandRange offset (lo, hi) =
  (lo - offset, hi + offset)
(**
Now we can easily write code that takes some range, normalizes it and expands it by 10:
*)
expandRange 10 (normalizeRange(50, 30))
// [fsi:val it : int * int = (20, 60)]
(**
So, if your tuple has some logical meaning, taking a tuple as an argument leads to more
composable code that is easier to understand. On the other hand, if there is no
logical connection, it is better to use the curried form - this makes it possible to
use partial function application.

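
To illustrate the composability point, here is a minimal sketch that combines the two range functions from above with the `>>` operator. The name `normalizeAndExpand` is introduced only for this example:

```fsharp
let normalizeRange (lo, hi) =
  if lo > hi then (hi, lo) else (lo, hi)
let expandRange offset (lo, hi) =
  (lo - offset, hi + offset)

// The tuple returned by normalizeRange is exactly what the
// partially applied expandRange expects as its input
let normalizeAndExpand = normalizeRange >> expandRange 10

let result = normalizeAndExpand (50, 30)
// result is (20, 60), the same value as the nested call shown earlier
```

Note that this uses both ideas at once: the range travels as a tuple, while the unrelated `offset` stays a separate curried argument.
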
How about tuples in C#?
-----------------------
In C#, we can work with tuples using the `Tuple<T1, T2, ...>` family of types. This is
certainly possible, but it is not particularly convenient, because you need to write
the long type name repeatedly (you can use `var` inside a method, but not in the method
declaration).

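
For comparison, an F# tuple written with the lightweight syntax is the very same .NET type. The sketch below (with illustrative names `viaSyntax` and `viaConstructor`) checks this against an explicitly constructed `Tuple<int, string>`:

```fsharp
open System

// Lightweight F# syntax
let viaSyntax = (42, "Hello world")
// Spelling out the .NET type, much as C# code has to
let viaConstructor = Tuple<int, string>(42, "Hello world")

// Both values have the same runtime type and compare equal structurally
let sameType = (viaSyntax.GetType() = typeof<Tuple<int, string>>)
let sameValue = (viaSyntax = viaConstructor)
```
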
However, there is another place where tuples appear in C# - it is perfectly reasonable
to treat all .NET methods as functions that take a single tuple as the input and return
some other type as the result. This is how .NET methods look when you call them from
F#:
*)
Math.Round(4.5, MidpointRounding.ToEven)
(**
We do not usually think about this as a tuple - it is just a method call - but what if
C# had (in [some future version](http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2405699-build-list-dictionary-and-tuple-into-the-language))
syntactic support for tuples and let you write `(42, "Hello world")` to create a tuple
value of type `Tuple<int, string>`?

How many tuple types are there in .NET?
---------------------------------------
This inspired me to do a quick analysis of the standard .NET libraries to have a look
at the tuples that standard .NET methods take. How many of them follow the good practice
and take a tuple that actually means something? And how many of them should instead use
the curried form, because the tuple has no logical meaning?

Checking the logical meaning will be difficult, but we can see how many of the tuples
are used by more than one or two methods. If they are used in multiple places, it
likely means that they represent some common pattern or some common single-purpose
data structure.

This is a pretty easy analysis to do using F# Interactive. Let's first look at all the types
in the current `AppDomain` (this uses assemblies that are loaded by default in F# - so
nothing fancy). We also only look at assemblies whose names start with "mscorlib" or "System":
*)
open System
open System.Reflection
// Get all types in currently loaded assemblies
let types = seq {
  for asm in AppDomain.CurrentDomain.GetAssemblies() do
    if asm.FullName.StartsWith("System") ||
       asm.FullName.StartsWith("mscorlib") then
      yield! asm.GetTypes() }
types |> Seq.length
(**
The code is a simple _sequence expression_ that iterates over all assemblies and
yields all types. On my machine, this gives us some 17000 types. Now, let's get a
list with all tuples - we'll iterate over all methods in each type and generate a
list with the type and name of each parameter. We skip all methods with fewer than 2 parameters:
*)
let tuples = seq {
  for typ in types do
    // Get declared, public, both instance and static methods
    let flags = BindingFlags.DeclaredOnly ||| BindingFlags.Public |||
                BindingFlags.Static ||| BindingFlags.Instance
    let methods = typ.GetMethods(flags)
    // Generate tuples with parameters types for each method
    for meth in methods do
      let pars = meth.GetParameters()
      if pars.Length > 1 then
        yield [ for p in meth.GetParameters() -> p.ParameterType.ToString() + ": " + p.Name ] }
tuples |> Seq.length
(**
So, on my machine there are 16463 methods in .NET that take some tuple as an argument.
Now, the question is, how many of them are used repeatedly? We can easily group the
tuples by the list of strings (F# implements structural comparison, so this is easy to do),
calculate the counts for each group and sort the results:
*)
let counts =
  tuples
  |> Seq.groupBy (id)
  |> Array.ofSeq
  |> Array.map (fun (k, vs) -> k, Seq.length vs)
  |> Array.sortBy snd
  |> Array.rev
// Dump is LINQPad's output helper; it prints `counts`, the value defined above
counts |> Dump
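(**
The grouping step relies on F# lists comparing by content. As a rough illustration on made-up data (the `sample` keys below are hypothetical, not real reflection output), two separately built lists with the same strings end up in the same group:

```fsharp
// Hypothetical parameter lists in the same string format as the keys above
let sample =
  [ [ "System.String: a"; "System.String: b" ]
    [ "System.Int32: x"; "System.Int32: y" ]
    [ "System.String: a"; "System.String: b" ] ]

// Structural equality: lists with equal elements are equal keys
let sameKey = (List.item 0 sample = List.item 2 sample)

// Same pipeline shape as in the post, applied to the sample data
let sampleCounts =
  sample
  |> Seq.groupBy id
  |> Array.ofSeq
  |> Array.map (fun (k, vs) -> k, Seq.length vs)
  |> Array.sortBy snd
  |> Array.rev
```

Because the parameter name is part of each string, two methods only share a key when both the types and the names of their parameters match.
*)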
(**
Most common tuples in .NET
--------------------------
If we run `Seq.length counts`, we get 5805 as the result. This means that there are nearly 6 thousand
distinct tuples (among roughly 16 thousand different methods). So it certainly does not look as if
most of them have some logical connection. But some of the top ones certainly do - here are the
top 8 (ignoring generics) with their counts:
1. `string * string` (714) - looks like many methods take two strings - not sure if there
is any logical meaning, but there probably are a few common uses
2. `byte[] * int * int` (341) - this one looks like an array with offset and length - clearly
this is a nice tuple with logical meaning
3. `int * int` (327) - similar to two strings
4. `object * object` (180) - hmm, maybe .NET likes untyped API :-)
5. `int * object` (165) - I was a bit puzzled by this one, so I checked the methods that
use this type. Good old untyped collections from the .NET 1.0 days!
6. `char[] * int * int` (159) - similarly to the number 2, another nice logical tuple!
7. `string * string * string` (156) - wow, so many methods take 3 strings
8. `ITypeDescriptorContext * Type` (152) - huh??
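
As an aside on item 2, .NET itself ships a named version of the array, offset and length triple: `System.ArraySegment<T>`. Below is a small sketch, with an illustrative helper `slice` (not part of any library) that takes the same three values as an F# tuple:

```fsharp
open System

let buffer = [| 10uy; 20uy; 30uy; 40uy; 50uy |]

// ArraySegment bundles array, offset and count into one value
let segment = ArraySegment<byte>(buffer, 1, 3)

// Hypothetical helper taking the same triple as a single tuple argument
let slice (arr: byte[], offset: int, count: int) = Array.sub arr offset count

// Both routes describe the same three bytes of the buffer
let viaTuple = slice (buffer, 1, 3)
let viaSegment = Array.sub segment.Array segment.Offset segment.Count
```
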
How many are actually useful?
-----------------------------
It looks like there are quite a few tuple types that actually mean something useful. But what
is the distribution? Let's use [the FSharp.Charting](http://fsharp.github.io/FSharp.Charting/)
library to draw a quick column chart plotting the count for every single one
of the 5805 tuple types:
*)