This forum is shutting down! Please post new discussions at

[Pygeoprocessing] Comparison between TauDEM and Pygeo CTI computation workflow - need help

I am a spatial analyst so I don't really have general hydrologist knowledge apart from what I have read for this project. My job was to script a CTI computation with TauDEM, but then I found Pygeoprocessing which sounds really good to me as it runs on our big servers and would allow us to process huge amount of data.
So thanks to your help I was able to get a workflow coded that takes a DEM and go through all the steps to compute the CTI with Pygeoprocessing. The problem is that the values are really different between TauDEM and Pygeoprocessing

So here is my question, how can I convince people that Pygeo solution is better from a method point of view, are the results given by Pygeoprocessing as correct as the TauDEM solution. I guess that TauDEM has the advantage of being built by the group who actually published the D-inf method for flow direction and flow accumulation so people tend to use that as "the best solution".

I noted in difference in the flow accumulation calculation as TauDEM has a D-inf flow accumulation algorithm and Pygeoprocessing doesn't really indicate which method is used. There is a difference within the flats as well as TauDEM leaves really few or no flats at all while Pygeoprocessing seems to struggle a bit more with that.

My lack of knowledge in hydrology and coding don't allow me to really dig in the code and compare the two methods so if someone is familiar with the differences, if you use Pygeo to compute CTI for research, if you have feedback on what method give the best results and why I would be really grateful.

In any case thanks a lot for this great module, beautifully self documented and easy to use for python beginners.

TauDEM workflow :
  • - fill pits
  • - calculation of D-inf flow direction
  • - calculation of D-inf flow accumulation
  • - Calculation of CTI using the traditionnal formula

Pygeo workflow :
  • pygeoprocessing.routing.fill_pits
  • - pygeoprocessing.geoprocessing.align_dataset_list 
  • - pygeoprocessing.routing.flow_direction_d_inf
  • - pygeoprocessing.routing.flow_accumulation
  • - Calculation of CTI using customized python code
workflow mainly based on the advice of "rich" in this discussion and study of the source code


  • RichRich Administrator, NatCap Staff
    Oh thanks so much for this. For reference, PyGeoprocessing has a d_infinity flow accumulation algorithm.

    And it's been a bit since we did a side-by-side comparison with TauDEM.  Two factors that would make different flow accumulation:

    * PyGeoprocessing doesn't fill pits yet, although we do have the API hook for it.  We *might* fill single pixel pits, but it's not a robust algorithm at all.  If you're using TauDEM to resolve pits, AND there are hydrological pits in the raster, then you'd get a different flow direction and ultimately flow accumulation algorithm.

    * The plateau resolution algorithm is different than what TauDEM had implemented a year ago when we looked at it.  I believe it was an older algorithm that resolved ambiguous flow directions by a multipass iteration; not great for some degenerate, but common cases.  So, in cases where your DEM has a plateau the flow direction may be slightly different.  

    So in the context of those constraints, we believe PyGeoprocessing is correctly draining and routing the DEM based on our own internal consistency checks of the math, algorithms, implementation, and outputs of our models.

    As for how you can convince people one is better than another, I tend to think that's our job more than yours, but just happens to be a lower priority for us at the moment.  We developed PyGeoprocessing as a high performance and memory efficient Python-based implementation of common hydrological routing routines that can be packaged along with InVEST, RIOS, OPAL, and other NatCap tools.  In that sense it's 'better' for us than TauDEM.  But as PyGeoprocessing matures we'll do a better job of comparing it against other solutions.

    Sound okay?
  • j_schroj_schro Member
    Thanks for the quick answer, so when I am using  pygeoprocessing.routing.flow_accumulation, it is actually a d_infinity algorithm? if yes that is great news!

    *Good to know for the pit filling, I will try to find another solution in the meantime, if you plan to integrate a better algorithm later.
    *I did notice that and saw your reference to a recent algorithm in the source code so that makes me believe that your methods is probably more up to date so more likely to be adopted by the people with whom I work.

    Well you summarized about everything I need to know to present my method to the group which is great, in a way it is my job as well as I adopted your module and have to "sell it" to the group.
    I am following closely your work and looking forward to the future improvement, thanks again!

    PS : I am actually doing some output comparison for the two methods if you are interested, I can send you an email with some samples.

  • j_schroj_schro Member
    I might as well compare the results with this new python module if you are interested
  • RichRich Administrator, NatCap Staff
    Oh wonderful!  I will, in time, dig deeply into this.
Sign In or Register to comment.