@pm5
Created September 17, 2015 03:26

FOSS4G Day 2

Keynote: Open Standards

Slayer, GeoThing.

  • SensorThings, IoT, open standard.
  • Typhoon Soudelor, Taipei City Open Data.
    • Open standard for emergency information? In the status quo, non-standardized JSON, XML, CAP, and KML are all mixed together.

Lessons from Open Mapping

Alyssa Wright, Mapzen Vice President

  • Entire career in open mapping. Some lessons learned over the years that guide me in the journey.
  • Lesson 1, People are tired. When I was young I was in it for the win. Then I graduated into the sophisticated world of open mapping. What? Cranky guys arguing? The fact is organizations are doing the same thing over (and over)^4 again. Open source is about spaces to let people get in.
  • Lesson 2, You are not alone. We are an ecosystem. We are a greater whole. Shadow colleagues. Everyone can find value.
    • Mapzen: Open. Data, Search, Navigation, Design.
  • Lesson 3, Geo is everywhere. It's so everywhere that we don't see it anymore. Geo is critical but not a threshold.
  • Lesson 4, Now is the time. The world is not willing to pay for access anymore. The world is ready for open, to pay for strategy.
  • Lesson 5, The journey is the destination. Take the time to ask who I am in the journey.
  • Lesson 6, Do not take shortcuts (on open). Open is not perfect. Thought, discipline, patience, persistence. We are in the marathon.
    • At Mapzen: "Start where you are."
    • Shortcuts are a slippery slope. Instead learn to be honest and learn where and who you are.
  • Humility, integrity.

Data in vector tile

Robert Nordan, @robpvn, github.com/Norkart

  • Our pipeline
    • Data in PostGIS & SLD files
    • Use SLD as source
    • Create a Mapbox Studio YML & Mapnik XML as data def
    • Tilelive.js to render into vector mbtiles
    • Serve and style clientside
  • Zoom levels
    • Vector tiles usually only go to Z16, then overzoom
  • Tile size limits
    • 500KB per tile is a hard limit
  • Never use SELECT * in PostGIS
    • Usually SELECT geom is enough, only add the attributes you really need.
  • Generalization choices
    • Mapbox like Tippecanoe
  • Keep the originals
    • I think of vector tiles as lossy compression for display rather than a data format.
  • The spec vs best/general practice
    • Z16 and 500KB limits are not in the spec!
    • Mapbox just configures all their stuff like that
    • Why fight against it?
  • Make Mapbox Studio beg for mercy
    • Projects with 200+ layers take minutes to open; 450 layers is an instant crash.
    • We created Mapnik XMLs and generated vector mbtiles directly with tilelive, then used them as a source in MBS to inspect
  • Generation times
    • Not magically faster than raster tiles (same amount of work)
    • SSDs help a lot
    • Test scanline vs pyramid strategies for your bbox!
  • File sizes
    • Magically smaller than raster tiles, from TB to GB.
  • Converting styles from SLD
    • Mapbox says to create GL styles from scratch
    • We didn't listen, deployed the power of summer interns
    • Machine-converted styles make a good starting point but need a lot of tweaking
      • Redundant data that used to be painted over
      • Rendering order
      • Separate style rules for the same geometry, different stylings in zoom
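
A minimal Python sketch of the two conventions noted above (tiles stored only up to Z16 and then overzoomed, and a 500KB per-tile ceiling). The function names are hypothetical and the constants are the conventions from the talk, not values from the vector tile spec; it also assumes 1KB = 1024 bytes.

```python
# Conventions mentioned in the talk; neither is part of the vector tile spec.
MAX_ZOOM = 16
MAX_TILE_BYTES = 500 * 1024  # assumption: 1KB = 1024 bytes


def source_tile(z, x, y, max_zoom=MAX_ZOOM):
    """Return the (z, x, y) of the stored tile that covers the requested tile.

    Requests deeper than max_zoom are served by overzooming: each extra zoom
    level halves the tile extent, so the covering ancestor is found by
    shifting x and y right by the zoom difference.
    """
    if z <= max_zoom:
        return z, x, y
    shift = z - max_zoom
    return max_zoom, x >> shift, y >> shift


def within_size_limit(tile_bytes, limit=MAX_TILE_BYTES):
    """True if an encoded tile fits under the per-tile size convention."""
    return len(tile_bytes) <= limit
```

A tile generator could call `within_size_limit` on each encoded tile and, when it fails, drop attributes or generalize harder rather than raise the limit.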

Andrea Aime, GeoSolutions

  • raster input. optimizing the data is the most important step.
    • Objectives: fast extraction of a subset of data at the desired resolution
    • avoid having to open a large number of files per request, or parsing complex structures
    • Know bottlenecks
    • experiments with format, compression, color
    • don't use PNG/JPEG (bad formats, no tiling, memory and CPU heavy), avoid ASCII formats
    • JPEG2000 is good: extensible and rich
    • MrSID can work but needs tuning
    • ECW licensing issue
    • GeoTIFF for the win, Swiss Army Knife, can do a lot.
      • Use gdal_translate, gdaladdo, optionally gdalwarp
      • Possible structures: single GeoTIFF (< 2GB, BigTIFF < 20~50GB), Mosaic of GeoTIFF (<500GB, not too many files), Pyramid (beyond)
      • Proper mosaic preparation: you don't want very large tile files, but not too many files either
      • Multidimensional mosaics
      • Prepare the pyramid. Use gdalbuildvrt to create a virtual single source for gdal_retile, then use gdal_retile to create the pyramid.
      • Prepare the list of tiles to be retiled.
  • vector inputs.
    • Objectives. Binary data, no complex parsing of data structures, fast extraction
    • Slow formats: WFS, GML, DXF
    • Good: Shapefile, directory of shapefiles, SDE
    • Choose PostGIS if you can, the fastest source we have.
      • use connection pooling, validate connections with pooling, Table clustering, spatial indexing, alphanumeric indexing, indexes
    • Shapefile preparation. Remove the .qix file; GeoServer will recreate it optimized. Remove large DBF attributes you don't need.
      • Show highways first, then show streets when zoomed in.
  • optimize styling
    • Scale dependencies
      • Never show too much data.
      • Eagerly add MinScaleDenominator to your SLD rules
    • Labeling.
      • Conflict resolution is expensive, limit it to the innermost zoom levels
      • Halo is important for readability but slow
    • FeatureTypeStyle
    • z-ordering
      • If using a DBMS as the data source
      • Add indexes on fields for z-ordering
  • Tiling and caching
    • GWC
      • Useful for stable layers, mostly static layer.
    • Space consideration
      • Tiles take a lot of space
      • Set a disk quota
    • Client side cache
      • Client-side caching of tiles
    • Right format
      • JPEG for background data
      • PNG8 + precomputed palette for background data
      • Vector tiles: still new, but
        • PNG encoding takes 50% of the time; vector tiles don't need it
        • more compact
  • Resource limits
    • Servers need to defend themselves
    • Enforce fairness
    • Control-flow. Control how many requests are executed in parallel.
  • JVM
    • There's no GO FAST option in JVM
    • Nothing will be gained from JVM tuning unless all of the above is done.
    • Raster subsystem
    • Marlin renderer
  • Upgrade upgrade upgrade!
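
The "Control-flow" point under resource limits above (cap how many requests execute in parallel) can be illustrated with a small sketch. This is not GeoServer's control-flow module; it is a generic Python example of the same idea using a semaphore, and `RequestLimiter` and `fake_render` are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RequestLimiter:
    """Allow at most max_parallel handlers to run at once; others wait."""

    def __init__(self, max_parallel):
        self._slots = threading.BoundedSemaphore(max_parallel)
        self._lock = threading.Lock()
        self._running = 0
        self.peak = 0  # highest concurrency observed, kept for inspection

    def run(self, handler, *args):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self._running += 1
                self.peak = max(self.peak, self._running)
            try:
                return handler(*args)
            finally:
                with self._lock:
                    self._running -= 1


def fake_render(i):
    """Stand-in for an expensive map rendering request."""
    time.sleep(0.01)
    return f"tile-{i}"
```

Even if many client threads call `limiter.run(fake_render, i)` at once, only `max_parallel` of them render concurrently, which is how a server defends itself and keeps one heavy client from starving the rest.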

Sometimes Geo is not about spatial.

  • A big place of things, and everything has an id. A big list of stuff.
  • Sometimes it's good to just say California, and point to that with a gesture.
  • Gadgeteer. A pen: point it at anything in a museum, then the record is saved.
  • the data is not the database
    • databases come and go, we are not optimizing for any of the databases.
    • we want our work to exist beyond our endeavor
    • the most important things about data is portability
    • standardize on text files (GeoJSON); in the short term this is difficult to use, so there is work to do
  • stable permanent IDs
  • example
    • search, return URL to GeoJSON resources, load the document, draw tiling.
    • super inefficient, but we try to kick the tire along the way
  • a consensus geom, and a list of accompanying alternative geoms.
  • concordances
  • relationships
  • supersedes
    • every geom has two properties: supersedes, superseded_by
  • "spelunking"
  • spelunker
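
A rough Python sketch of how the supersedes / superseded_by pair noted above could be followed to find the current record for a stable ID. The record layout here (a dict of GeoJSON-like features keyed by ID, with `superseded_by` as a list of IDs) is an assumption for illustration, not the project's actual schema.

```python
def resolve_current(records, record_id):
    """Follow superseded_by links until reaching a record nothing supersedes.

    records: dict mapping id -> GeoJSON-like feature (hypothetical layout).
    Raises ValueError if the links form a cycle.
    """
    seen = set()
    current = record_id
    while True:
        if current in seen:
            raise ValueError(f"superseded_by cycle at id {current}")
        seen.add(current)
        successors = records[current]["properties"].get("superseded_by", [])
        if not successors:
            return current
        # Assumption: when several successors exist, follow the first one.
        current = successors[0]


# Hypothetical records: 1 was replaced by 2, which was replaced by 3.
records = {
    1: {"type": "Feature", "properties": {"supersedes": [], "superseded_by": [2]}},
    2: {"type": "Feature", "properties": {"supersedes": [1], "superseded_by": [3]}},
    3: {"type": "Feature", "properties": {"supersedes": [2], "superseded_by": []}},
}
```

Because old IDs stay resolvable, a link saved against record 1 still leads to the current record instead of breaking.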