A team working under a DARPA contract moved a system for geolocating images from a local cluster to AWS.
- aws s3 => always sync thrice
- do your own key store in the root of your bucket?
- map high volume IO to local ephemeral disks
- if you are using a NAT within a VPC, know that all traffic moves through that instance so size it accordingly
- snapshots are cheap, compressed, differential
- https://libcloud.apache.org/ is a library that abstracts different cloud providers into the same API
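
The "sync thrice" tip could be scripted roughly as follows. This is a minimal sketch, assuming the tip means re-running `aws s3 sync` so that later passes pick up anything an earlier pass missed. The function name `sync_repeatedly`, the `passes` parameter, and the injectable `runner` are illustrative and not from the talk.

```python
import subprocess


def sync_repeatedly(source, destination, passes=3, runner=subprocess.run):
    """Run `aws s3 sync SOURCE DESTINATION` several times in a row.

    `runner` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without the AWS CLI installed.
    """
    command = ["aws", "s3", "sync", source, destination]
    for _ in range(passes):
        # check=True raises CalledProcessError if a pass fails.
        runner(command, check=True)
    return passes
```

A call might look like `sync_repeatedly("./tiles", "s3://example-bucket/tiles")`, where the bucket name is a placeholder.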
A research scientist presented on a ton of cool big data applications, mostly around tracking ecological health.
Slides available here: https://t.co/3scXM0GCYT
HHypermap is an opendata.arcgis.com-like project from Harvard and Terranodo to index all the geospatial services on the web. There is a lot of overlap with our project and many areas for collaboration, e.g. we could share our index with them or use their uptime monitoring statistics.
- http://hypersearch.cga.terranodo.io/
- Every remote service is cached
- Initial top level view is cached
- Using map proxy in the background
- Harvest initial view
- Lucene backed
- Building feature-level search over 20 million features
- coming in the next couple months
- Uptime stats on everything, and using them in ranking
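
Folding uptime into search ranking could look something like the sketch below. The talk did not describe the actual formula, so the weighted blend, the `uptime_weight` parameter, and the `relevance` / `uptime` field names are all assumptions made for illustration.

```python
def rank_services(services, uptime_weight=0.5):
    """Sort services by a blend of text relevance and measured uptime.

    Each service is a dict with hypothetical keys:
      relevance: search score in [0, 1]
      uptime:    fraction of successful health checks in [0, 1]
    """
    def score(service):
        return ((1.0 - uptime_weight) * service["relevance"]
                + uptime_weight * service["uptime"])

    # Highest combined score first.
    return sorted(services, key=score, reverse=True)
```

With a weight of 0 this degrades to plain relevance ranking, so the effect of uptime can be tuned gradually.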
Hagander.net/talks/PostgreSQL_9.6.pdf
- Parallelism
- helps with cpu bound operations
- mark function parallel safe
- no json
- no string
- no array
- Foreign data wrappers have more pushdown capabilities
- Datetimes faster
- Heavy write loads faster due to a better locking scheme
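
To try the parallelism notes above, the setup is mostly two SQL statements. The sketch below only builds those statements as strings, so it runs without a database. `max_parallel_workers_per_gather` and `ALTER FUNCTION ... PARALLEL SAFE` are PostgreSQL 9.6 features; the helper name and the example function signature are hypothetical.

```python
def parallel_query_setup(function_signature, workers=4):
    """Return SQL statements for experimenting with 9.6 parallel query.

    function_signature: e.g. "area_km2(geometry)" (a made-up function).
    The caller is responsible for passing a trusted signature string.
    """
    return [
        # Allow up to `workers` parallel workers under one Gather node.
        f"SET max_parallel_workers_per_gather = {int(workers)};",
        # Declare the function safe to execute inside parallel workers.
        f"ALTER FUNCTION {function_signature} PARALLEL SAFE;",
    ]
```

Running `EXPLAIN` on a CPU-bound query afterwards should show a `Gather` node if the planner chose a parallel plan.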
GeoWave is a project sponsored by NGA for indexing billions of features. Its key innovation is using a Hilbert space-filling curve.
- https://github.com/ngageoint/geowave
- uses dimensionality reduction
- space filling curve
- can deploy through AWS EMR
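
The space-filling curve idea is easier to see in code. The sketch below is the textbook 2D Hilbert curve mapping, not GeoWave's own implementation: it turns a grid cell into a single integer so that nearby cells tend to get nearby keys, which is the dimensionality reduction mentioned above. The `lonlat_to_hilbert` helper and its 16-bit default resolution are assumptions for illustration.

```python
def hilbert_index(n, x, y):
    """Distance along the Hilbert curve of cell (x, y) on an n x n grid.

    n must be a power of two; x and y must be in [0, n - 1].
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def lonlat_to_hilbert(lon, lat, bits=16):
    """Quantize a lon/lat pair to a 2**bits grid and return its 1D key."""
    n = 1 << bits
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((lat + 90.0) / 180.0 * n))
    return hilbert_index(n, x, y)
```

The resulting integer can serve as a sortable row key in a key-value store, so a bounding-box query becomes a small number of key-range scans.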
Valhalla is Mapzen's routing implementation.
- Valhalla now supports multimodal
- Mapzen has a mobility team
- Valhalla can be embedded due to its dynamic runtime costing
- single dataset, multiple route types + options
- super granular on just biking
- bike type
- use roads
- hills
- very personalized routing
- routing tiles
- local
- roads and paths
- arterial
- remove road and paths
- local
- highways
- trunks, highways
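
The "dynamic runtime costing" point is what lets one dataset serve many route types: the graph stays fixed and only the edge cost function changes per request. Below is a toy sketch of that idea using plain Dijkstra. It is not Valhalla's API or cost model; the option names `use_roads` and `use_hills` echo the biking options in the notes, but the penalty formulas and graph layout are invented for illustration.

```python
import heapq


def edge_cost(edge, options):
    """Cost of one edge under per-request options (each in [0, 1])."""
    cost = edge["length_km"]
    if edge["is_road"]:
        # The less the rider wants roads, the more a road edge costs.
        cost *= 1.0 + 2.0 * (1.0 - options["use_roads"])
    # Steeper grades cost more for riders who want to avoid hills.
    cost *= 1.0 + 10.0 * edge["grade"] * (1.0 - options["use_hills"])
    return cost


def route(graph, start, goal, options):
    """Dijkstra over graph = {node: [(neighbor, edge_dict), ...]}."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, edge in graph.get(node, []):
            heapq.heappush(
                queue,
                (cost + edge_cost(edge, options), neighbor, path + [neighbor]),
            )
    return None
```

The same graph yields different routes for different riders simply by passing different `options`, with no reprocessing of the underlying data.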
tractors are autosteered
- connected to large WIFI network
- 200k pts of data per day
- multiple crop management zones
- plant seeds based on the soil characteristics
- know exactly where the fields are
sats, drones, stationary sensors
1 billion points a day
all the apis are different
3-4 week turnaround for download
99% has spatial attributes
big mongo bucket => postgis
filo db? cassandra with spatial hooks
US Sugar has an Esri license they aren't using
hardware vendors are fighting sharing with opendata
using Spotfire for analytics
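
Since every vendor API is different but nearly all readings carry a location, one plausible first step before loading into PostGIS is to normalize each payload into a common point feature. The sketch below is only an illustration of that step: the field spellings (`lon`, `longitude`, `x`, and so on) are hypothetical, and the talk did not describe the actual pipeline.

```python
# Hypothetical coordinate field names seen across vendor payloads.
LON_KEYS = ("lon", "longitude", "x")
LAT_KEYS = ("lat", "latitude", "y")


def first_present(payload, keys):
    """Return the first non-missing value among the candidate keys."""
    for key in keys:
        if payload.get(key) is not None:
            return payload[key]
    return None


def to_feature(payload):
    """Normalize one reading into a GeoJSON-style Point Feature.

    Returns None for the small share of readings with no location.
    """
    lon = first_present(payload, LON_KEYS)
    lat = first_present(payload, LAT_KEYS)
    if lon is None or lat is None:
        return None
    properties = {
        key: value
        for key, value in payload.items()
        if key not in LON_KEYS + LAT_KEYS
    }
    return {
        "type": "Feature",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [float(lon), float(lat)]},
        "properties": properties,
    }
```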
earth observatory
sharing spatial imagery formats is a challenge
lots of different products: processing levels, coverage, time
NISAR is going to dramatically increase the data coming in per day
github nasa-gibs
all this stuff is developed in house
TIE brings everything together into a Meta-Raster-Format
investigating move to the cloud