with apologies to Neil Young
We interrupt this k-Means series to bring you an important message about broadcasting.
Take two vectors, x and y, and create a matrix C from a function f of the values of each element pair, such that C(i,j) = f(x(i), y(j)).
Cribbing from NumPy's outer function: it looks like it hands you back an iterator over all pairs, which you then give to your function. PDL's outer primitive only computes the outer product, so close, but no cigar.
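One way to get the "function of every pair" behaviour in PDL is to lean on broadcasting instead of an iterator. The sketch below is an illustration, not necessarily where this post ends up: the vectors and the choice of f (a hypotenuse-style distance) are made up for the example, and it relies only on `dummy`, `outer` and `all` from core PDL.

```perl
use strict;
use warnings;
use PDL;

# Two vectors of different lengths, so the two axes are easy to tell apart.
my $x = pdl(1, 2, 3);    # dims: (3)
my $y = pdl(10, 20);     # dims: (2)

# dummy(0) inserts a size-1 dimension in front of $y, giving dims (1,2).
# Broadcasting then stretches both operands to (3,2), so that
# $c->at($i, $j) equals f($x->at($i), $y->at($j)).
my $yd = $y->dummy(0);

# f can be any elementwise expression; this one is purely illustrative.
my $c = sqrt($x**2 + $yd**2);

# With f = multiplication, the result should match PDL's outer primitive.
my $prod = $x * $yd;
my $same = all($prod == outer($x, $y));

print $c, "\n";
print "matches outer: $same\n";
```

The nice part of this approach is that no pairs are ever materialised in Perl space; the element-by-element work stays inside PDL.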
I know what. I'll ...
#!/usr/bin/perl
#
# Benchmark script comparison with https://github.com/duffee/plb2/blob/master/src/perl/matmul.pl
# generate 2 N-square matrices and multiply them together
#
# Faster matrix multiplication using PDL
# N = 200: 6.4 times faster
# N = 400: 33.2 times faster
# N = 700: 65.5 times faster
# N = 1000: 63.5 times faster
Check matrix multiplication with *
Grab an orthogonal matrix (cos 30° = 0.866, sin 30° = 0.5). Construct the identity matrix, with 1s on the diagonal and zeroes everywhere else. Multiply the matrix by its transpose.
use Test::PDL;
$m = pdl([0.866, 0.5],[-0.5, 0.866]);
[
[0.866 0.5]
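The check described above (multiply the rotation matrix by its transpose and compare with the identity) might be sketched like this. It uses `approx` from core PDL rather than Test::PDL, and the 1e-3 tolerance is an assumption chosen because the matrix entries are rounded to three places.

```perl
use strict;
use warnings;
use PDL;

# Rotation by 30 degrees, using the rounded values from the text.
my $m = pdl([0.866, 0.5], [-0.5, 0.866]);

# The identity matrix, written out by hand for this 2x2 case.
my $identity = pdl([1, 0], [0, 1]);

# 'x' is PDL's matrix multiplication operator;
# '*' multiplies element by element.
my $product = $m x $m->transpose;

# The entries are rounded, so compare with a tolerance, not exactly.
# The tolerance value is an assumption for this sketch.
my $ok = all(approx($product, $identity, 1e-3));

print $product;
print $ok ? "orthogonal within tolerance\n" : "not orthogonal\n";
```

An exact `==` comparison would fail here, since 0.866² + 0.5² is 0.999956 rather than 1.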
Banner: Histogram of sepal widths for Iris versicolor from Fisher's Iris flower data set (SVG redraw of original image), https://en.wikipedia.org/wiki/Statistics#/media/File:Fisher_iris_versicolor_sepalwidth.svg
Attribution: en:User:Qwfp (original); Pbroks13 (redraw), 13 July 2008. Source: en:Image:Fisher iris versicolor sepalwidth.png. License: Creative Commons Attribution-Share Alike 3.0, https://creativecommons.org/licenses/by-sa/3.0/deed.en
If you're doing statistics on vast swathes of data, you could use PDL!
Sometimes PDL doesn't scratch the particular itch you have. PDL isn't like Vegas: what happens there doesn't have to stay there. After crunching your data, you can get it out to Perl and beyond.
As any Yule fule know, PDL has plenty of graphing options so I'll need something unusual to make it worth the time. I was impressed by D3.js,
Santa has discovered quality control issues in some of the scientific equipment that was delivered to children last Christmas. The results from the Early Elf Joint Internship Training program were more No-No-No than Ho-Ho-Ho! However, the EEJITs were consistent, and testing revealed that the first 3 values in every 10 measurements were faulty. An update was sent to all affected children so they could recalculate their results, knowing which readings to ignore.
Problem: How should they treat the suspect values?
You could flag those values by setting them to some out-of-range number, such as 0, or -999 for temperatures.
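Here is a rough sketch of both options on made-up data: the sentinel approach described above, and, as an alternative, PDL's own bad-value support via `setbadif`. The 30 readings, the variable names and the choice of -999 are assumptions for illustration only.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical data: 30 readings, i.e. three batches of 10 measurements.
my $readings = sequence(30) + 100;

# View a copy as a (10,3) array: dim 0 is the position within a batch.
my $batches = $readings->copy->reshape(10, 3);

# Option 1: overwrite the first 3 readings of every batch with a sentinel.
my $flagged = $batches->copy;
$flagged->slice('0:2,:') .= -999;

# Option 2 (alternative): mark the same readings as bad values.
# xvals gives each element's index along dim 0, here 0..9 per batch.
my $clean = $batches->setbadif($batches->xvals < 3);

# Statistics on $clean skip the bad values; the sentinel copy would
# need filtering first or -999 would drag the average down.
print "good readings: ", $clean->ngood, "\n";
print "mean of good:  ", $clean->avg, "\n";
print "naive mean with sentinels: ", $flagged->avg, "\n";
```

The sentinel is simple but easy to forget about later, whereas bad values travel with the data and are ignored by functions such as `avg`.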
If you have a bunch of numerical data that you need crunched fast, you need PDL!
PDL stores its values in a "vectorized data structure" which is compact in memory, usually as doubles with pre-declared sizes.
This allows for fast traversal and manipulation.
The underlying code is written in C for speed, with access to the internal structure for those who feel the need to tinker.
A PDL object is sometimes referred to as an ndarray (N-dimensional array) to conform with usage in other languages. Simply put, PDL gives you the ability to process large chunks of data at once.
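To make "large chunks of data at once" concrete, here is a minimal sketch. The size of one million elements and the arithmetic are arbitrary choices for the example.

```perl
use strict;
use warnings;
use PDL;

# One million doubles, 0 .. 999_999, held in a single ndarray.
my $x = sequence(1_000_000);

# One expression touches every element; there is no Perl-level loop.
my $y = 2 * $x + 1;

print "type:     ", $y->type, "\n";
print "elements: ", $y->nelem, "\n";
print "sum:      ", $y->sum, "\n";
```

The same calculation with a Perl array would need a `map` or `for` loop over a million scalars, each carrying its own per-value overhead.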
I wrote my first Perl 6 program (that is, one that worked) the day before the London Perl Workshop where I proudly told people. So, JJ says "Why don't you write an Advent calendar post for Perl 6?"
With only one program under my belt, what do I have to write about? Well ... I authored Astro::Constants in Perl 5, so how hard would it be to migrate it to Perl 6?
Since I couldn't keep my big gob shut, I give you the tale of a Perl 5 module author wandering through the Perl 6 landscape.