
Problems running under CentOS 7 x86_64

Hi there. I know nothing about modelling per se, but I am running a model on behalf of others in my research group; I provide the computing expertise. We are trying to set up the nutrient model so that we can run it in the cloud, specifically on Google's IaaS offering, GCE. The idea is that we can offload any heavy modelling to the cloud ad hoc. We are initially experimenting with models that we already have working, so that we can compare the outputs. To do this I am trying to run Linux versions of the model (so we don't have to pay extra for licenses), in particular invest_natcap.nutrient version 3.0.0. There are 108 instances of the parameters for the model, and it runs over a tiling of the water catchment for the whole of Scotland.

We have successfully run all 108 instances under Windows 7, using Python 2.7.7, but I cannot get them all to run under Linux: about 10% of them fail. I have tried 32-bit Linux, 64-bit Linux (all CentOS 7 versions), Python 2.7.5 (the default for CentOS 7) and Python 2.7.7 using virtualenv. The versions I am currently working with are:

scipy  0.19.1
tables 3.4.2
gdal 1.11.4-1

and typically the errors I get are:

```
08/22/2017 08:02:20  hydropower_water_yield INFO         Starting Water Yield Core Calculations
08/22/2017 08:02:20  raster_utils       INFO         clip_dataset nodata value is 255.0
08/22/2017 08:02:21  raster_utils       INFO         building aoi mask
08/22/2017 08:02:22  raster_utils       INFO         masking out each output dataset
08/22/2017 08:02:27  hydropower_water_yield INFO         Reclassifying temp_Kc raster
08/22/2017 08:02:27  raster_utils       INFO         Reclassifying
08/22/2017 08:02:27  raster_utils       INFO         Creating lookup numpy array
Traceback (most recent call last):
  File "failure.py", line 58, in <module>
    invest_natcap.nutrient.nutrient.execute(args)
  File "/usr/lib64/python2.7/site-packages/invest_natcap/nutrient/nutrient.py", line 40, in execute
    invest_natcap.hydropower.hydropower_water_yield.execute(water_yield_args)
  File "/usr/lib64/python2.7/site-packages/invest_natcap/hydropower/hydropower_water_yield.py", line 173, in execute
    out_nodata)
  File "/usr/lib64/python2.7/site-packages/invest_natcap/raster_utils.py", line 1661, in reclassify_dataset_uri
    exception_flag=exception_flag)
  File "/usr/lib64/python2.7/site-packages/invest_natcap/raster_utils.py", line 1706, in reclassify_dataset
    (1, map_array_size), dtype=type(value_map.values()[0]))
TypeError: 'float' object cannot be interpreted as an index
```
We are going to upgrade to 3.3.3 (which will involve quite a lot of changes to our code), but I was wondering if anybody could shed any light on our problem. I have been told that NatCap is developed on Linux, so hopefully somebody can point me in the right direction on this one.

Many thanks in advance.
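
For readers hitting the same traceback: the final `TypeError` is what Python 2.7 raises when a float reaches a place that needs an integer size or index (Python 3 words it as "cannot be interpreted as an integer"). Below is a minimal sketch of that mechanism. It assumes, and the traceback does not confirm this, that the lookup size is derived from reclassification keys that arrive as floats. The helpers `lookup_size`, `build_lookup` and `with_integer_keys` are hypothetical, and plain lists stand in for the NumPy array that the real `raster_utils` code allocates.

```python
import operator


def lookup_size(value_map):
    """Size of a dense lookup table covering every key in value_map.

    Hypothetical helper: the real raster_utils code may compute this differently.
    """
    keys = list(value_map)
    return max(keys) - min(keys) + 1


def build_lookup(value_map, nodata=None):
    """Build a dense, list-based lookup table from a {code: value} mapping."""
    # operator.index accepts only true integers, so a float size such as 3.0
    # raises TypeError here, much like a float array dimension would.
    size = operator.index(lookup_size(value_map))
    low = min(value_map)
    table = [nodata] * size
    for key, value in value_map.items():
        table[key - low] = value
    return table


def with_integer_keys(value_map):
    """Return a copy with int keys, refusing keys that are not whole numbers."""
    converted = {}
    for key, value in value_map.items():
        if float(key) != int(key):
            raise ValueError("code %r is not a whole number" % (key,))
        converted[int(key)] = value
    return converted
```

With float keys such as `{1.0: 0.5, 3.0: 0.7}`, `build_lookup` fails with a `TypeError`, while `build_lookup(with_integer_keys(...))` succeeds. If the same holds for the real inputs, casting the codes to integers before the model runs would be one thing to try.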

Comments

  • Rich (Administrator, NatCap Staff)
    Hello, 3.0.0 is a very old version.  We've fixed hundreds of issues and made order-of-magnitude performance improvements in our recent releases, including in the PyGeoprocessing library.  The error you're posting doesn't look like it's associated with any particular version or OS issue, but I hesitate to debug it since there's a good chance it's not an issue in a modern version.

    You mention upgrading to 3.3.3 would involve a lot of code changes... What would that entail?  Anything we can help with?

  • Rich (Administrator, NatCap Staff)
    Oh, now that I look at that specific error more closely: are you using a landcover map that uses a floating point type rather than an integer type?  We may have relaxed that requirement in a later version of PyGeoprocessing.  But without other context, that's what I'd guess is going on there.
  • Thanks Rich for the prompt response. It is much appreciated.

    What I meant by changing the code is that most of our input files are preprocessed by Python scripts that use the Python API to ArcGIS to create the inputs needed by the NatCap nutrient model.

    We are going to move to 3.3.3 as soon as we can, initially with just one instance of the set of parameters I mentioned above, so that I can validate the procedure before moving on to the rest of the data. However, the reason we have remained on 3.0.0 is the one you mention above: performance. My colleagues have informed me that the older version is substantially faster than the newer versions of the code.

    This is curious, because when the 3.0.0 model works for me, it seems to run considerably faster on Linux than it does on Windows. I had put this down to the difference between 64 and 32 bit.  I know you guys develop on Linux, so I was wondering whether this was related, and whether you had come across other people seeing a slow-down on Windows with the newer versions of the model. My supposition is that most people using the model will be doing so in a Windows environment. Based on your wider experience of people running it, are we doing something weird and wonderful to slow down the model?
    On the performance issues, if you want I can try and get some hard figures on the slow-down.

    Cheers and thanks,
  • Rich (Administrator, NatCap Staff)
    Hi Douglas, in almost all cases the recent versions of InVEST are faster than the older ones.  But even if they weren't, almost all of them are more *correct*.  As a major example, InVEST 3.0.0 has a sediment and nutrient model that is exponentially sensitive to the stream threshold value.  We've since made SDR and NDR models that correctly model the biophysical part.  In general, that's happened to one degree or another across all the InVEST models since v3.0.0.

    There could be many reasons why you're experiencing a faster runtime on Linux than on Windows.  Emulation layers?  Different versions of platform-specific Python libraries?  Different hardware?  There's otherwise nothing inherent in how we developed InVEST that makes it run faster or slower on a particular operating system.

  • Hi Rich. What you've said is what I kind of expected. My original suspicion was a software issue: some weird and wonderful combination of something not patched correctly. But I too am beginning to suspect hardware. I run bog-standard CentOS 7 VMs in the cloud, which is about as vanilla as you can get, and these are running fine (well, apart from the stuff above), so yes, I think you may be correct: the hardware running the Windows version is probably flaky (and it is bare metal).

    My original inclination was to ask for the code to be upgraded, and at least I can use the above to argue that this should be the case.

    Cheers and thanks for the responses.
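
Since the thread compares runs across Windows and Linux, 32 and 64 bit, and different library versions, one way to make that comparison concrete is to record the same facts on each machine. The sketch below uses only the standard library. The function name `describe_environment` is hypothetical, and which module names to pass is an assumption based on the packages listed in the original post.

```python
import importlib
import platform
import struct


def describe_environment(module_names):
    """Return a dict describing the interpreter and the requested modules."""
    report = {
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
        # Pointer size in bits: 32 or 64 for the running interpreter.
        "bits": struct.calcsize("P") * 8,
    }
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "not installed"
        else:
            # Not every module exposes __version__, so fall back to a marker.
            report[name] = getattr(module, "__version__", "unknown")
    return report
```

Calling `describe_environment(["numpy", "scipy", "tables", "osgeo"])` on both the Windows and the CentOS machine and comparing the two results would show whether the interpreters and libraries actually match before attributing a speed difference to the operating system or the hardware.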