Benchmarking 2010 IRC meeting logs Aug 4th

From OSGeo Wiki
  • Logs from the meeting

<acuster> Hey all, shall we have a meeting?
<acuster> I put up a provisional agenda on:
<sigq> Title: Benchmarking 2010 - OSGeo Wiki (at
<acuster> basically: status of machines / status of data / status of styling / status of testing 
<msmith_> ok. well by the agenda the machine status is first so thats where I'll start
<msmith_> first, did everyone get the email that the ip's were switched between the jmeter and linux wms machines?
<msmith_> .78 is the wms machine, .76 is the jmeter machine
<acuster> is that why the disk was not big enough or was that a separate issue?
<msmith_> I've moved the files on the jmeter machine over to the correct wms machine. If anyone has problems, let me know
<msmith_> acuster: thats why the disk was not big enough
<msmith_> it is now. we were installing on the 160gb drive of the jmeter machine rather than the 2tb drive on the wms machine
<msmith_> several groups have gotten access to the windows box and installed their software
<msmith_> and FrankW has processed all the raster images
<acuster> ah, okay that moves half way to the next topic
<msmith_> so thats where we are at
<ascollignon> hello, could we put the geotiff data on the windows server as well, please ?
<acuster> the shapefiles are all in place and the data inserted into the postgis database, right?
<msmith_> yes. I'll do that today
<ascollignon> great, thanks
<acuster> and the geotiffs are geotiffs or bigtiffs?
- marco ( has joined #foss4g
<acuster> i.e what size are they?
<msmith_> they are bigtiffs. and about 8gb per
<acuster> that's a blocker for us.
<acuster> i.e. for cedricbr and constellation
<msmith_> we can convert the rasters to a different geotiff format. just give the specs and frank can convert again to a diff dormat
<msmith_> c/dormat/format
<acuster> just standard geotiff. Here we broke them into 8
<acuster> four wide by two high
<acuster> makes for 112 images total
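The four-wide by two-high split described above can be sketched in Python as follows. This is only an illustration: the function name tile_windows and the 40000 x 20000 pixel image size are hypothetical, not taken from the meeting, and the log does not say which tool was actually used for the tiling. The one figure that does follow from the log is that 112 tiles at 8 per source image implies 14 source images.

```python
def tile_windows(width, height, cols=4, rows=2):
    """Return (x_off, y_off, x_size, y_size) pixel windows covering an image.

    Integer division is applied to the running edge positions, so the
    windows cover every pixel exactly once even when the image size is
    not a multiple of cols or rows.
    """
    windows = []
    for r in range(rows):
        y0 = r * height // rows
        y1 = (r + 1) * height // rows
        for c in range(cols):
            x0 = c * width // cols
            x1 = (c + 1) * width // cols
            windows.append((x0, y0, x1 - x0, y1 - y0))
    return windows


# Hypothetical source image size, chosen only for illustration.
windows = tile_windows(40000, 20000)

tiles_per_image = len(windows)           # 4 wide x 2 high
source_images = 112 // tiles_per_image   # 112 tiles total, from the log
```

Each window has the (xoff, yoff, xsize, ysize) shape that gdal_translate accepts through its -srcwin option, so a loop over these windows could drive the actual split.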
<msmith_> if you want to create your own set from the source ecw's that fine also
<acuster> oh, I thought we needed to all use the same files
<IvanSanchez> oh frak
- IvanSanchez apologizes for being late
<msmith_> no. we simply specified geotiff as the base, but there are so many different variants that each can have their own geotiff
<msmith_> just like the overviews and spatial indexes can be customized
<acuster> ah? so, for the baseline, we share the shapefiles and the postgis but can create our own geotiff's?
<msmith_> i think so. Not everyone can use the same geotiff formats. And for the base I think its just shapefiles. The postgis is more for the "best" testing
<msmith_> thats my understanding. I can be told I'm wrong
<acuster> does that seem reasonable to all the people here?
<mpdaly> Seems like apples to oranges to me
<acuster> => action: Adrian will confirm by email
<acuster> let's move on
<acuster> any other questions about the status of the data on the machines?
<acuster> ...going twice
<msmith_> one issue that came up from aaime, is that the contour files are split but not spatially split
- FrankW ( has joined #foss4g
<acuster> msmith_, okay we move to (2) Data status / issues
<msmith_> so if you are just displaying one contour file, it will display in many different areas, and not clustered
<acuster> yes, we noticed that as well. it's definitely sub-optimal
<acuster> but this data set was really not created for use in cross-scale wms anyhow
<msmith_> agreed
<IvanSanchez> msmith_: are the non-spatially-split contours a problem?
- marco has quit (Remote host closed the connection)
<msmith_> i don't think so, just lower performance since you'll be hitting multiple files for each area
<acuster> another question I had about the shapefiles was related to character encoding
<acuster> it seems the labels in catalan have encoding errors which impact their display
<acuster> and their insertion into a PostGIS database
<IvanSanchez> acuster: just catalan? sounds strange
<acuster> did anyone else have character encoding issues?
<acuster> no, I just looked in catalonia
- IvanSanchez goes off to check the data specs
<acuster> it's probably the spanish symbols
<IvanSanchez> acuster: I suspect the problem is spain-wide, we've got á, é, í, ó, ú and ñ all over the place
<dmonie> We did not. But we had to mention the appropriate encoding on the server and in the browser.
<IvanSanchez> catalonia also has à, è, ì, ò, ù and ç
<cedricbr> I've tried to insert the shapefiles into a postgis database, and I get some encoding error in fact
<cedricbr> using shp2pgsql
<cedricbr> and specifying encoding UTF8
<acuster> who built the postgis db?
<msmith_> pramsey
<acuster> and he's not here. okay. next
<IvanSanchez> no, the docs don't say anything about character encoding
<dmonie> We also tried to insert in postgis, but encoding must be set to iso-8859-1
- FrankW pops back in after entertaining guests for three days.
- IvanSanchez stops being entertained
<acuster> dmonie, you confirmed it to be or you suspect it to be iso-8859-1?
<FrankW> The raster data is processed into bigtiff files in /benchmarking/data/raster and seems ok. 
<FrankW> msmith_: perhaps you can copy to windows if you haven't already?
<cedricbr> latin1 ?
<acuster> or 15
<dmonie> acuster: i am sure. And in the browser, we mentioned "western iso-8859-1".
<cedricbr> so the database should be created in latin1 too?
<acuster> dmonie, thanks
<msmith_> FrankW: I will
<acuster> cedricbr, the import tool should be able to convert on the fly if it knows what it's handling
<acuster> perhaps it was merely the dbf files that were strange. I'll look into the character encoding issue some more.
<acuster> any other issues related to data?
<dmonie> If we allow bigtiff or geotiff for baseline, can be also allow ecw?
<dmonie> c/be/we
<acuster> lol
<FrankW> dmonie: I think we have established that the baseline is geotiff.
<FrankW> I'm still suspecting that someone will find bigtiff is a problem and we will have to split up these files into quarters so they are regular tiff.
<FrankW> Of course ecw is fine for the "do your best" track.
<dmonie> Constellation has a problem with bigtiff apparently.
<acuster> FrankW, we have already been over that. Suggestion was that we might each generate our own tiff images
<acuster> a mail has just gone out asking about this.
<cedricbr> dmonie: right it is a problem for us bigtiff
<FrankW> ah, I haven't seen it.  Big email backlog.
<acuster> it probably has not yet arrived.
<FrankW> I suppose there is no harm in folks making their own geotiffs if they want since we have a reasonable amount of space. 
<acuster> locally we have broken each ecw into 8 tiles
<acuster> so each geotiff is ~770MB
- pramsey ( has joined #foss4g
<acuster> any other issues related to data?
<FrankW> btw, are we in a meeting now?  
- pirmin_k has quit (Remote host closed the connection)
<acuster> yes
<FrankW> ah, sorry for butting in.
<msmith_> no FrankW: you are part of the meeting
<acuster> ...going twice
<IvanSanchez> FrankW: resistance is futile
<dmonie> It is very surprising that each participant can run the tests with his own shape of the dataset.
<dmonie> What do we compare, then?
<FrankW> dmonie: would you prefer we have one canonical set of geotiff files?
<acuster> dmonie, yes, that seems strange
<FrankW> I would be fine with that.
<mpdaly> +1 for one set of files
<mpdaly> Don't care which
<acuster> would they be well aligned?
<FrankW> aligned?
<IvanSanchez> the ecws are aligned AFAIK
<acuster> yeah, so the overviews are aligned on an overall grid, not merely image by image
<IvanSanchez> they overlap a bit on the edges but, besides that, they fit in a grid
<acuster> the pixels fit on a grid
- FrankW notes that file sets not on a regular grid are common and not unreasonable for us to work with.
<acuster> the images are kind of a mess
<mpdaly> I thought, for the baseline, the data was take it or leave it
- FrankW too, within reason.
<acuster> right, but  we are manipulating it to start with
<mpdaly> acuster: define "we"
<acuster> the question is what manipulations we choose to do
<acuster> foss4g
<IvanSanchez> hey, if you want to use the original 4000 shapefiles, be my guest
<acuster> we have chosen not to work with ecw
<FrankW> and we chose to build overviews.
<FrankW> Those do seem to be the agreed adjustments (and I added the coordinate system in the geotiff).
<acuster> okay, that is fine, we will probably not use the overviews in that case
<FrankW> why is that?
<acuster> but the images don't break down nicely into eights either
<mpdaly> By all means split the BigTiffs into smaller files, but everyone should use the same set, or choose to skip the test
<FrankW> Right, I'm ok with splitting into regular tiffs. 
<acuster> because we conceptually work with 'coverages' not images
<FrankW> ah, that must be challenging sometimes.
<acuster> so the overviews we use are of the 'coverage' not of the individual images
<acuster> but that's fine, we will simply have completely different performances in the best effort and the baseline
<FrankW> I would not object to your creating your own overviews as long as the full res data files are still used for "full res" requests.
<acuster> okay, we will work on this some more and see how it goes
<FrankW> And I will reprocess the data, splitting it into tiff rather than bigtiff files. 
<FrankW> Any objections?
- FrankW takes that as concurrence.
<acuster> Next item: styles
<acuster> we have been working on styles on our end and there were some styles posted on the wiki
<acuster> what/how do we get to a common set of styles?
<acuster> unfortunately sld leaves a lot of stuff 'that is machine dependent'
<acuster> so probably, until the OGC can clarify these issues in the next SLD/SE spec, there will be some vagueness
<acuster> so are people expecting to each write their own styles? are we going to use a set of common styles for benchmarking and 
<acuster> our own set of styles for the 'beautiful' mapping?
<acuster> and if we are using a common set of styles, how close are we to creating them?
<jmckenna> are you referring to layer styles that were discussed months ago and are posted on the wiki?  
<jmckenna> sorry i might have missed the discussion here.  i will follow along
<acuster> sure, if you want
<jmckenna> ha!
<pramsey> regarding encoding, I just read the log, and it looks to me that you're interpreting valid UTF8 as LATIN1. The answer, if you really want LATIN1 on your client, is to set your client encoding to LATIN1 and the database will automagically spit that out for you. If you fail to do so, you'll get the UTF8 that is stored in the database
<acuster> it seemed there was a desire to put labels along contours
<acuster> but doesn't have any labels
<cedricbr> pramsey: in fact it was when I launch the command shp2pgsql
<cedricbr> it fails with an encoding problem
<acuster> pramsey, it's more than that. OOffice doesn't like the dbfs, gnumeric reads them logging errors,
<acuster> and what cedricbr just said
<pramsey> when loading the data, use -W latin1 on the commandline
<pramsey> to transcode automatically
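A minimal Python sketch of the two failure modes discussed here, assuming (as dmonie reports) that the dbf attribute bytes are ISO-8859-1. The label text is a made-up example, not a value from the dataset. Reading Latin-1 bytes as UTF-8 fails outright, which is consistent with the shp2pgsql error when UTF8 is declared, while reading UTF-8 bytes as Latin-1 silently produces garbled characters.

```python
# Hypothetical label containing Spanish accented characters.
label = "Cañada del Río"

# Assumption: the dbf stores its text as ISO-8859-1 (Latin-1) bytes.
raw = label.encode("latin-1")

# Case 1: declare the wrong source encoding (UTF-8) for Latin-1 bytes.
# The byte for "ñ" is not a valid UTF-8 sequence, so decoding fails.
try:
    raw.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

# Case 2: declare the right source encoding and transcode to UTF-8,
# which is the effect of telling the loader the input is latin1.
transcoded = raw.decode("latin-1").encode("utf-8")

# Case 3: the reverse mistake, valid UTF-8 displayed as Latin-1.
# No error is raised, but each accented character becomes two symbols.
mojibake = transcoded.decode("latin-1")
```

The design point is that the declared encoding has to describe the bytes actually stored in the file, and the conversion to the database encoding happens afterwards.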
<cedricbr> ok I've tried with -W UTF8
<jmckenna> acuster: correct. please submit an sld to the group for contour labels.  my SLDs were done before we agreed on labels.
<cedricbr> will try again, thanks
<dmonie> acuster: we already did the styling processing based on the mentioned styles. Hope you won't change them now. We'll have to redo a good bunch of work.
<acuster> jmckenna, well there is a fair amount of work to do on the styles if they are to match the rules in an implementation independent way
<acuster> dmonie, I am merely asking where we stand
- Eclesia ( has left #foss4g
<dmonie> acuster: np
<jmckenna> agreed we discussed and agreed upon styles long ago.  not sure why acuster is causing trouble so late now.  where were your comments earlier?
<acuster> oh, if you consider them done, great.
<jmckenna> it is often about doing it and submitting your changes to the email list for discussion.  please feel free to do so
- ofonts (5c38aecd@gateway/web/freenode/ip. has joined #foss4g
<acuster> we are here asking questions about status of various aspects which seem broken or incompletely defined
<acuster> ok, the next topic is out of time but kind of critical since it's about what we will actually test. I guess it gets bumped to next week.
- pramsey has quit (Quit: pramsey)
<acuster> any closing issues or are we done?
<acuster> okay, ciao all. Thanks msmith_ for the work on the machines and FrankW (and paul) for the data massaging.