Skip to content

Instantly share code, notes, and snippets.

@tomkralidis
Last active March 27, 2018 12:50
Show Gist options
  • Save tomkralidis/006594382d6339c2047b441108a3991b to your computer and use it in GitHub Desktop.
Save tomkralidis/006594382d6339c2047b441108a3991b to your computer and use it in GitHub Desktop.
MapServer PostGIS query performance

Overview

  • works on 6.4.x
  • times out on 7.0.2

Environment

MapServer 7.0.2 PostgreSQL 9.3.14 PostGIS 2.2

MapServer LAYER object

LAYER
 NAME "totalozone"
 STATUS ON
 DEBUG 5
 CONNECTION "host=localhost dbname=foo user=foo password=foo"
 CONNECTIONTYPE POSTGIS
 PROCESSING "CLOSE_CONNECTION=DEFER"
 DATA "the_geom from (select * from totalozone order by instance_datetime) as subquery using unique data_payload_id using srid=4326"
 TYPE POINT
 DUMP TRUE
 TEMPLATE "ttt.html"
 PROJECTION
  "init=epsg:4326"
 END
 METADATA
  "ows_title" "totalozone"
  "gml_include_items" "all"
 END
END

Request:

http://example.org/ows?map=/tmp/foo.map&service=WFS&version=1.1.0&request=GetFeature&outputformat=GeoJSON&typename=totalozone&filter=<ogc:Filter><ogc:And><ogc:BBOX><PropertyName>msGeometry</PropertyName><Box srsName="EPSG:4326"><coordinates>-188.4375,-85.67541458664353 190.546875,83.44032649527307</coordinates></Box></ogc:BBOX><ogc:PropertyIsBetween><ogc:PropertyName>instance_datetime</ogc:PropertyName><ogc:LowerBoundary>1924-01-01 00:00:00</ogc:LowerBoundary><ogc:UpperBoundary>2016-12-31 23:59:59</ogc:UpperBoundary></ogc:PropertyIsBetween></ogc:And></ogc:Filter>&startindex=0&maxfeatures=1

Filter pretty-printed

<ogc:Filter>
 <ogc:And>
  <ogc:BBOX>
   <PropertyName>msGeometry</PropertyName>
   <Box srsName="EPSG:4326">
    <coordinates>-188.4375,-85.67541458664353 190.546875,83.44032649527307</coordinates>
   </Box>
  </ogc:BBOX>
  <ogc:PropertyIsBetween>
   <ogc:PropertyName>instance_datetime</ogc:PropertyName>
   <ogc:LowerBoundary>1924-01-01 00:00:00</ogc:LowerBoundary>
   <ogc:UpperBoundary>2016-12-31 23:59:59</ogc:UpperBoundary>
  </ogc:PropertyIsBetween>
 </ogc:And>
</ogc:Filter>

MapServer 6.4.2

select "data_payload_id","data_payload_uri","instance_datetime","url","agency","platform_type","platform_id","platform_name","gaw_id","instrument_name","instrument_model","instrument_number","monthly_date","monthly_columno3","monthly_stddevo3","monthly_npts","daily_date","daily_wlcode","daily_obscode","daily_columno3","daily_stddevo3","daily_utc_begin","daily_utc_end","daily_utc_mean","daily_nobs","daily_mmu","daily_columnso2","latest_observation",encode(ST_AsBinary(ST_Force2D("the_geom"),'NDR'),'hex') as geom,"data_payload_id" from (select * from totalozone order by instance_datetime ASC) as subquery where the_geom && ST_GeomFromText('POLYGON((-188.4375 -85.6754145866435,-188.4375 83.4403264952731,190.546875 83.4403264952731,190.546875 -85.6754145866435,-188.4375 -85.6754145866435))',4326) and ( ("instance_datetime" BETWEEN '1924-01-01 00:00:00' AND '2016-12-31 23:59:59')) limit 1 offset 0

MapServer 7.0.2

select "data_payload_id","data_payload_uri","instance_datetime","url","agency","platform_type","platform_id","platform_name","gaw_id","instrument_name","instrument_model","instrument_number","monthly_date","monthly_columno3","monthly_stddevo3","monthly_npts","daily_date","daily_wlcode","daily_obscode","daily_columno3","daily_stddevo3","daily_utc_begin","daily_utc_end","daily_utc_mean","daily_nobs","daily_mmu","daily_columnso2","latest_observation",encode(ST_AsBinary(ST_Force2D("the_geom"),'NDR'),'hex') as geom,"data_payload_id" from (select * from totalozone order by instance_datetime ASC) as subquery where the_geom && ST_GeomFromText('POLYGON((-85.6754145866435 -188.4375,-85.6754145866435 190.546875,83.4403264952731 190.546875,83.4403264952731 -188.4375,-85.6754145866435 -188.4375))',4326) and ((st_intersects(the_geom,ST_GeomFromText('POLYGON ((-85.6754145866435266 -188.4375000000000000, -85.6754145866435266 190.5468750000000000, 83.4403264952730694 190.5468750000000000, 83.4403264952730694 -188.4375000000000000, -85.6754145866435266 -188.4375000000000000))',4326)) = TRUE and ("instance_datetime" >= date_trunc('second',timestamp '1924-01-01 00:00:00') and "instance_datetime" <= (date_trunc('second',timestamp '2016-12-31 23:59:59') + interval '1 second' - interval '1 second')))) ORDER BY "instance_datetime"
@sdlime
Copy link

sdlime commented Dec 1, 2016

I've got a test environment setup using total ozone data I nabbed off the web, looks like what you're using. I have instance_datetime set as a timestamp w/o timezone in PostgreSQL. Is that correct?

Anyway with 6.4.3 I get the following SQL from the MapServer log file:

select "gid","data_payload_id","instance_datetime_as_char","url","agency","platform_type","platform_id","platform_name","gaw_id","instrument_name","instrument_model","instrument_number","monthly_date","monthly_column03","monthly_stddev03","monthly_npts","daily_date","daily_wlcode","daily_obscode","daily_column03","daily_stddev03","daily_utc_begin","daily_utc_end","daily_utc_mean","daily_nobs","daily_mmu","daily_columns02","latest_obsevation","instance_datetime",encode(ST_AsBinary(ST_Force2D("the_geom"),'NDR'),'hex') as geom,"data_payload_id" from (select * from totalozone order by instance_datetime) as subquery where the_geom && ST_GeomFromText("POLYGON((-188.4375 -85.6754145866435,-188.4375 83.4403264952731,190.546875 83.4403264952731,190.546875 -85.6754145866435,-188.4375 -85.6754145866435))",4326) and ( ("instance_datetime" BETWEEN 1924-01-01 00:00:00 AND 2016-12-31 23:59:59)) limit 1 offset 0

which of course pukes because the dates in the BETWEEN section aren't quoted. Any idea why that would be? I must be missing something in my mapfile. Am running the exact query you sent minus the GeoJSON output format. If I quote them properly in pgAdmin the query runs fine.

@sdlime
Copy link

sdlime commented Dec 1, 2016

With 7.0.2 get the following SQL from the MapServer log file:

select "gid","data_payload_id","instance_datetime_as_char","url","agency","platform_type","platform_id","platform_name","gaw_id","instrument_name","instrument_model","instrument_number","monthly_date","monthly_column03","monthly_stddev03","monthly_npts","daily_date","daily_wlcode","daily_obscode","daily_column03","daily_stddev03","daily_utc_begin","daily_utc_end","daily_utc_mean","daily_nobs","daily_mmu","daily_columns02","latest_obsevation","instance_datetime",encode(ST_AsBinary(ST_Force2D("the_geom"),'NDR'),'hex') as geom,"data_payload_id" from (select * from totalozone order by instance_datetime) as subquery where the_geom && ST_GeomFromText('POLYGON((-85.6754145866435 -188.4375,-85.6754145866435 190.546875,83.4403264952731 190.546875,83.4403264952731 -188.4375,-85.6754145866435 -188.4375))',4326) and ((st_intersects(the_geom,ST_GeomFromText('POLYGON ((-85.6754145866435266 -188.4375000000000000, -85.6754145866435266 190.5468750000000000, 83.4403264952730694 190.5468750000000000, 83.4403264952730694 -188.4375000000000000, -85.6754145866435266 -188.4375000000000000))',4326)) = TRUE and ("instance_datetime"::text >= '1924-01-01 00:00:00' and "instance_datetime"::text <= '2016-12-31 23:59:59')))

Returns 1 record in GML. Interesting the limit and offset are not part of the SQL in 7.0.2. I believe that's because we wanted to apply those across all drivers. Anyway I don't know why your generated SQL is so different for 7.0.2. In my log file output I see a line like so:

[Wed Nov 30 21:12:45 2016].142666 msPostGISLayerTranslateFilter. String: (intersects([shape],fromText('POLYGON ((-85.6754145866435266 -188.4375000000000000, -85.6754145866435266 190.5468750000000000, 83.4403264952730694 190.5468750000000000, 83.4403264952730694 -188.4375000000000000, -85.6754145866435266 -188.4375000000000000))')) = TRUE And ("[instance_datetime]" >= "1924-01-01 00:00:00" AND "[instance_datetime]" <= "2016-12-31 23:59:59")).

That's what the OWS filter code spits out as a MapServer expression. What are you seeing? I notice the date/time values are treated like strings with quotes and not backticks that would indicate a date/time value, I suppose there's no way for the OWS filter handling to know what they are unless that is being expressly set someplace. I wonder if that's the issue with 6.4.3 where I'm seeing the missing quotes. Seems like our configs aren't the same - need to resolve that.

@sdlime
Copy link

sdlime commented Dec 1, 2016

BTW There's a little cut and paste error in your 7.0.2 SQL snippet. If I fix and run that in pgAdmin against my data, which may be way off from yours (for example, I'm missing data_payload_uri column) it runs fine. My dataset has 25,000 rows.

@tomkralidis
Copy link
Author

TODO:

  • change WFS filter to update srsName="urn:ogc:def:crs:OGC:1.3:CRS:84"
  • test SQL without the extra st_intersects query

@tomkralidis
Copy link
Author

cc @rouault

It turns out that the issue is 6.4 ignored the WFS sortby parameter, whereas 7.0 does not. This results in a double order by clauses in the SQL if the users specifies a subquery which includes an order by in their LAYER.DATA directive. So in summary a workaround is not including order by in LAYER.DATA statements and delegate to WFS proper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment