Benchmarking 2010 IRC meeting logs Aug 4th


 * Logs from the meeting

Hey all, shall we have a meeting? I put up a provisional agenda on: http://wiki.osgeo.org/wiki/Benchmarking_2010#Communication
Title: Benchmarking 2010 - OSGeo Wiki (at wiki.osgeo.org)
basically: status of machines / status of data / status of styling / status of testing
ok. well by the agenda the machine status is first so that's where I'll start
first, did everyone get the email that the ip's were switched between the jmeter and linux wms machines?
.78 is the wms machine, .76 is the jmeter machine
is that why the disk was not big enough or was that a separate issue?
I've moved the files on the jmeter machine over to the correct wms machine. If anyone has problems, let me know
acuster: that's why the disk was not big enough
it is now. we were installing on the 160gb drive of the jmeter machine rather than the 2tb drive on the wms machine
several groups have gotten access to the windows box and installed their software
and FrankW has processed all the raster images
ah, okay that moves half way to the next topic
so that's where we are at
hello, could we put the geotiff data on the windows server as well, please?
the shapefiles are all in place and the data inserted into the postgis database, right?
yes. I'll do that today
great, thanks
and the geotiffs are geotiffs or bigtiffs?
- marco (~quassel@dyn.83-228-168-188.dsl.vtx.ch) has joined #foss4g
i.e what size are they?
they are bigtiffs. and about 8gb per
that's a blocker for us. i.e. for cedricbr and constellation
we can convert the rasters to a different geotiff format. just give the specs and frank can convert again to a diff dormat
c/dormat/format
just standard geotiff. Here we broke them into 8 four wide by two high makes for 112 images total
if you want to create your own set from the source ecw's that's fine also
oh, I thought we needed to all use the same files
oh frak
- IvanSanchez apologizes for being late
no.
we simply specified geotiff as the base, but they are so many different that each can have their own geotiff
<msmith_> just like the overviews and spatial indexes can be customized
ah? so, for the baseline, we share the shapefiles and the postgis but can create our own geotiff's?
<msmith_> i think so. Not everyone can use the same geotiff formats. And for the base I think it's just shapefiles. The postgis is more for the "best" testing
<msmith_> that's my understanding. I can be told I'm wrong
does that seem reasonable to all the people here?
Seems like apples to oranges to me
=> action: Adrian will confirm by email
let's move on
any other questions about the status of the data on the machines?
...going twice
<msmith_> one issue that came up from aaime, is that the contour files are split but not spatially split
- FrankW (~FrankW@ip-174-142-75-109.static.privatedns.com) has joined #foss4g
msmith_, okay we move to (2) Data status / issues
<msmith_> so if you are just displaying one contour file, it will display in many different areas, and not clustered
yes, we noticed that as well. it's definitely sub-optimal
but this data set was really not created for use in cross-scale wms anyhow
<msmith_> agreed
<IvanSanchez> msmith_: are the non-spatially-split contours a problem?
- marco has quit (Remote host closed the connection)
<msmith_> i don't think so, just lower performance since you'll be hitting multiple files for each area
another question I had about the shapefiles was related to character encoding
it seems the labels in catalan have encoding errors which impacts their display and their insertion into a PostGIS database
<IvanSanchez> acuster: just catalan? sounds strange
did anyone else have character encoding issues?
no, I just looked in catalonia
- IvanSanchez goes off to check the data specs
it's probably the spanish symbols
<IvanSanchez> acuster: I suspect the problem is spain-wide, we've got á, é, í, ó, ú and ñ all over the place
We did not.
But we had to mention the appropriate encoding on the server and in the browser.
<IvanSanchez> catalonia also has à, è, ì, ò, ù and ç
I've tried to insert the shapefiles into a postgis database, and I get some encoding error
in fact using shp2pgsql and specifying encoding UTF8
who built the postgis db?
<msmith_> pramsey
and he's not here. okay. next
<IvanSanchez> no, the docs don't say anything about character encoding
We also tried to insert in postgis, but encoding must be set to iso-8859-1
- FrankW pops back in after entertaining guests for three days.
- IvanSanchez stops being entertained
dmonie, you confirmed it to be or you suspect it to be iso-8859-1?
<FrankW> The raster data is processed into bigtiff files in /benchmarking/data/raster and seems ok.
<FrankW> msmith_: perhaps you can copy to windows if you haven't already?
latin1 ? or 15
acuster: i am sure. And in the browser, we mentioned "western iso-8859-1".
so the database should be created in latin1 too?
dmonie, thanks
<msmith_> FrankW: I will
cedricbr, the import tool should be able to convert on the fly if it knows what it's handling
perhaps it was merely the dbf files that were strange.
I'll look into the character encoding issue some more.
any other issues related to data?
If we allow bigtiff or geotiff for baseline, can be also allow ecw?
c/be/we
lol
<FrankW> dmonie: I think we have established that the baseline is geotiff.
<FrankW> I'm still suspecting that someone will find bigtiff is a problem and we will have to split up these files into quarters so they are regular tiff.
<FrankW> Of course ecw is fine for the "do your best" track.
Constellation has a problem with bigtiff apparently.
FrankW, we have already been over that. Suggestion was that we might each generate our own tiff images
a mail has just gone out asking about this.
dmonie: right it is a problem for us bigtiff
<FrankW> ah, I haven't seen it. Big email backlog.
it probably has not yet arrived.
<FrankW> I suppose there is no harm in folks making their own geotiffs if they want since we have a reasonable amount of space.
locally we have broken each ecw into 8 tiles so each geotiff is ~770Mb
- pramsey (~pramsey@S01060014bf492c47.gv.shawcable.net) has joined #foss4g
any other issues related to data?
<FrankW> btw, are we in a meeting now?
- pirmin_k has quit (Remote host closed the connection)
yes
<FrankW> ah, sorry for butting in.
<msmith_> no FrankW: you are part of the meeting
...going twice
<IvanSanchez> FrankW: resistance is futile
It is very surprising that each participant can run the tests with his own shape of the dataset. What do we compare, then?
<FrankW> dmonie: would you prefer we have one canonical set of geotiff files?
dmonie, yes, that seems strange
<FrankW> I would be fine with that.
+1 for one set of files
Don't care which
would they be well aligned?
<FrankW> aligned?
<IvanSanchez> the ecws are aligned AFAIK
yeah, so the overviews are aligned on an overall grid, not merely image by image
<IvanSanchez> they overlap a bit on the edges but, besides that, they fit in a grid
the pixels fit on a grid
- FrankW notes that file sets not on a regular grid are common and not unreasonable for us to work with.
the images are kind of a mess
I thought, for the baseline, the data was take it or leave it
- FrankW too, within reason.
right, but we are manipulating it to start with
acuster: define "we"
the question is what manipulations we choose to do
foss4g
<IvanSanchez> hey, if you want to use the original 4000 shapefiles, be my guest
we have chosen not to work with ecw
<FrankW> and we chose to build overviews.
<FrankW> Those do seem to be the agreed adjustments (and I added the coordinate system in the geotiff).
okay, that is fine, we will probably not use the overviews in that case
<FrankW> why is that?
but the images don't break down nicely into eights either
By all means split the BigTiffs into smaller files, but everyone should use the same set, or choose to skip the test
<FrankW> Right, I'm ok with splitting into regular tiffs.
because we conceptually work with 'coverages' not images
<FrankW> ah, that must be challenging sometimes.
so the overviews we use are of the 'coverage' not of the individual images
but that's fine, we will simply have completely different performances in the best effort and the baseline
<FrankW> I would not object to your creating your own overviews as long as the full res data files are still used for "full res" requests.
okay, we will work on this some more and see how it goes
<FrankW> And I will reprocess the data, splitting it into tiff rather than bigtiff files.
<FrankW> Any objections?
- FrankW takes that as concurrence.
Next item: styles
we have been working on styles on our end and there were some styles posted on the wiki
what/how do we get to a common set of styles?
unfortunately sld leaves a lot of stuff 'that is machine dependent'
so probably, until the OGC can clarify these issues in the next SLD/SE spec, there will be some vagueness
so are people expecting to each write their own styles?
are we going to use a set of common styles for benchmarking and our own set of styles for the 'beautiful' mapping?
and if we are using a common set of styles, how close are we to creating them?
are you referring to layer styles that were discussed months ago and are posted on the wiki?
sorry i might have missed the discussion here. i will follow along
sure, if you want
ha!
regarding encoding, I just read the log, and it looks to me that you're interpreting valid UTF8 as LATIN1.
The answer, if you really want LATIN1 on your client, is to set your client encoding to LATIN1 and the database will automagically spit that out for you, if you fail to do so, you'll get the UTF8 that is stored in the database
it seemed there was a desire to put labels along contours
export PGCLIENTENCODING=LATIN1
but http://labs.gatewaygeomatics.com/benchmarking/sld/contour-sld.xml doesn't have any labels
pramsey: in fact it was when I launch the command shp2pgsql it fails with an encoding problem
pramsey, it's more than that. OOffice doesn't like the dbfs, gnumeric reads them logging errors, and what cedricbr just said
when loading the data, use -W latin1 on the commandline to transcode automatically
ok I've tried with -W UTF8
acuster: correct. please submit an sld to the group for contour labels. my SLDs were done before we agreed on labels.
will try again, thanks
acuster: we already did the styling processing based on the mentioned styles. Hope you won't change them now. We'll have to redo a good bunch of work.
jmckenna, well there is a fair amount of work to do on the styles if they are to match the rules in an implementation independent way
dmonie, I am merely asking where we stand
- Eclesia (~jsorel@mtd203.teledetection.fr) has left #foss4g
acuster: np
agreed
we discussed and agreed upon styles long ago. not sure why acuster is causing trouble so late now. where were your comments earlier?
oh, if you consider them done, great.
it is often about doing it and submitting your changes to the email list for discussion. please feel free to do
- ofonts (5c38aecd@gateway/web/freenode/ip.92.56.174.205) has joined #foss4g
we are here asking questions about status of various aspects which seem broken or incompletely defined
ok, the next topic is out of time but kind of critical since it's about what we will actually test. I guess it gets bumped to next week.
- pramsey has quit (Quit: pramsey)
any closing issues or are we done?
okay, ciao all.
Thanks msmith_ for the work on the machines and FrankW (and paul) for the data massaging.
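
Note on the encoding thread in the log: the two symptoms discussed (garbled labels, and shp2pgsql failing when told the input is UTF8) are what one would expect if the dbf files hold iso-8859-1 bytes, as dmonie reported. The following is only a rough Python sketch of that mismatch, not a reproduction of the actual import. The sample label is an arbitrary Catalan place name chosen for illustration and is not taken from the benchmark data.

```python
# Illustration only: the label is a hypothetical sample, not benchmark data.
label = "Castelló d'Empúries"

utf8_bytes = label.encode("utf-8")
latin1_bytes = label.encode("latin-1")

# Case 1: valid UTF-8 read as Latin-1. Nothing fails, but every accented
# character shows up as two wrong characters (garbled labels).
misread = utf8_bytes.decode("latin-1")

# Case 2: Latin-1 bytes read as UTF-8. The decode fails outright, which is
# comparable to telling a loader the input is UTF8 when it is really Latin-1.
try:
    latin1_bytes.decode("utf-8")
    latin1_as_utf8_ok = True
except UnicodeDecodeError:
    latin1_as_utf8_ok = False

# Declaring the real source encoding and transcoding recovers the text.
recovered = latin1_bytes.decode("latin-1").encode("utf-8").decode("utf-8")

print(misread)
print(latin1_as_utf8_ok)
print(recovered == label)
```

This matches the advice given in the log: declare the source encoding when loading (-W latin1) so the data is transcoded on the way in, and set the client encoding (PGCLIENTENCODING) only if the client really needs LATIN1 back.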