Changed multithreaded behavior for GM_GetPathProfileLOSEx()?

jkenneally
jkenneally Global Mapper User Trusted User
edited April 2013 in SDK
Hello,

Quite a while back I got some assistance with a call to GM_GetPathProfile. It amounted to our software loading/unloading elevation data through GM_LoadLayerList/GM_CloseLayer on a sub thread, while the main thread would call GM_GetPathProfileLOSEx, passing known valid layers (i.e. layers that have been loaded and are not being unloaded by the sub thread) into the call. This worked very well and has been in use for some time with a fairly old version of the SDK (GlobalMapperInterface.DLL v 1.35.0.0).

We had downloaded a newer version at one point, but had not switched over to it as we had no need. Recently I've needed a fast way to load the elevation data into my own memory buffer, so I switched over to that 'newer' version when I found it had a GM_GetPixelElevationRow() that didn't exist in our version of the SDK. The newer version is GlobalMapperInterface.DLL v 1.37.0.0. We didn't upgrade to the absolute newest version as we are hesitating on the license fee, and just needed this one new function.

However, I'm finding that I now get crashes when calling GM_GetPathProfileLOSEx() from the main thread while loading/unloading data on a sub thread. I'm sure I'm not passing layers to that call which are actually being created/destroyed. The sub thread is also only ever creating OR destroying layers at a given time.

Can you comment on this? Was there something changed about the SDK that would make this setup start crashing?

If necessary we will bite the bullet and upgrade to the newest version of the SDK, but I would like some confidence that this issue will be resolved in the new builds.

For reference, here is my original thread that resulted in GM_GetPathProfileLOSEx being created:
http://www.globalmapperforum.com/forums/sdk/5904-multi-threading-getpathprofilelos.html

Thanks!
Comments

  • global_mapper
    global_mapper Administrator
    edited November 2012
    I took a look at the code in the GM_GetPathProfileLOSEx function and found a few places where unprotected access from multiple threads could cause problems if the timing was exactly wrong, even if the layer being accessed wasn't the one being unloaded. I actually would have expected the same things to be hit in earlier versions of the SDK, but since these are timing-dependent issues it could just have been a matter of luck.

    I have fixed the one issue that I found from a cursory examination and can provide a new v14 SDK build if you upgrade to that. If you upgrade and there are additional issues I can work with you to track those down. While the SDK is not in general guaranteed to work with calls from multiple threads at once doing any operations, specific cases have been addressed over the years and so long as you aren't doing anything severe (i.e. changing the projection) we should be able to get the SDK to do the necessary synchronization itself.

    Thanks,

    Mike
    Global Mapper Guru
    gmsupport@bluemarblegeo.com
    http://www.globalmapper.com
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited November 2012
    Hi Mike,

    Thanks for getting back quickly on this. I'm fairly confident we are not using Global Mapper for anything that you would consider severe. We essentially just have the data being loaded or unloaded from a worker thread, and the main thread is performing the calls to GM_GetPathProfileLOSEx, passing only layer handles that we can guarantee have not been queued for destruction.

    I will upgrade SDK versions at the first of the week and retest our issue.

    Thanks!
  • global_mapper
    global_mapper Administrator
    edited November 2012
    Let me know once you have v14 and I can provide you with a new SDK build with the changes that I just made to protect access to the internal layer info list that GM_GetPathProfileLOSEx accesses and the GM_CloseLayer/GM_LoadLayerList functions modify.

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited November 2012
    Will do! Just out of curiosity, I was under the impression that I was avoiding these kinds of issues by passing the list of layer handles to use into GM_GetPathProfileLOSEx(). I didn't realize that the function was still accessing lists that the load/unload calls would be modifying internally.


  • global_mapper
    global_mapper Administrator
    edited November 2012
    Passing in the list will avoid most of them, but there is still a check to make sure the layer handle that you pass in is valid before the SDK tries to access it, otherwise passing in a bad handle would cause a crash. That list is what might get modified by another thread that is opening or closing layers. So if the path profile check for validity of the handle is happening right when another thread opens/closes a layer, the layer list would change out from under it and cause a crash.

    Thanks,

    Mike
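The race described above (a handle validity check reading a list that another thread is modifying) can be illustrated with a small standalone sketch. This is not the SDK's code: `LayerHandle`, `LayerRegistry` and `ValidityCheckSurvivesChurn` are hypothetical names invented for the illustration, and the only assumption is that the validity check and the open/close paths share one container.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <set>
#include <thread>

// Hypothetical stand-in for an opaque layer handle; not the SDK's type.
using LayerHandle = std::uintptr_t;

// Hypothetical registry of live handles. Every access takes the same mutex,
// so a validity check can never observe the set while it is being modified.
class LayerRegistry {
public:
    void Add(LayerHandle h) {
        std::lock_guard<std::mutex> lock(mMutex);
        mLive.insert(h);
    }
    void Remove(LayerHandle h) {
        std::lock_guard<std::mutex> lock(mMutex);
        mLive.erase(h);
    }
    bool IsValid(LayerHandle h) const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mLive.count(h) != 0;
    }

private:
    mutable std::mutex mMutex;
    std::set<LayerHandle> mLive;
};

// A worker thread opens and closes unrelated handles while the calling
// thread keeps validating one handle that is never closed.
bool ValidityCheckSurvivesChurn(int iterations) {
    LayerRegistry registry;
    const LayerHandle kept = 1;
    registry.Add(kept);

    std::thread worker([&registry, iterations] {
        for (int i = 0; i < iterations; ++i) {
            const LayerHandle h = static_cast<LayerHandle>(i) + 2;
            registry.Add(h);
            registry.Remove(h);
        }
    });

    bool alwaysValid = true;
    for (int i = 0; i < iterations; ++i) {
        alwaysValid = registry.IsValid(kept) && alwaysValid;
    }
    worker.join();
    return alwaysValid;
}
```

Without the lock in `IsValid`, the lookup could walk the set while `Add` or `Remove` is rebalancing it, which is the kind of timing-dependent crash discussed in this thread even though the handle being checked is never closed.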
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited January 2013
    Hi Mike,

    We just finished the process of purchasing our v14 license, and I've got it downloaded/running here now. Can you provide me the build with the threading fixes you've made?

    Thanks!
  • global_mapper
    global_mapper Administrator
    edited January 2013
    The latest SDK build with those changes is available at http://www.globalmapper.com/GlobalMapperSDK_v14_latest_beta.zip .

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited January 2013
    I've updated to the version of the SDK in the link above, but the same issue seems to be present (although crashing slightly differently).

    After anywhere from a few seconds to several minutes of high-frequency calls to GM_GetPathProfileLOSEx on the main thread (passing layer handle lists), with GM_LoadLayerList/GM_CloseLayer running on a sub thread, I eventually get heap corruption.
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited January 2013
    Just as an update to this, at one point during testing this afternoon, I actually got a series of dialogs popping up from Global Mapper saying 'unknown error closing layer' just before my heap corruption crash.
  • global_mapper
    global_mapper Administrator
    edited January 2013
    If you thread-protect the calls, do the errors go away? As I'm sure you're aware, tracking down issues in multi-threaded applications like this can be extremely difficult if the issue isn't easily reproducible. Is there any way you could make a version of your app that calls the SDK and reproduces the error, so that I could run it locally to get more information about what is going wrong? It's hard to even say for certain that the issue is in the SDK and not somewhere else with threading issues like this, but if thread-protecting the calls into the SDK makes the problem go away that would be pretty suspicious.

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited January 2013
    Haha yeah multi-threaded issues can be a huge pain to track down.

    I'll take a look at mutexing access to GM as an experiment. The closest I've done to this so far is running everything normally but commenting out just the call to GM_GetPathProfileLOSEx that happens on the main thread (i.e. the rest of the application source is fully exercised and tiles are being loaded/unloaded by the sub thread through GM). The crash only starts appearing once we've started doing the GM_GetPathProfileLOSEx calls. I will do another round of checks to ensure we aren't ever passing handles queued for unload, etc.

    Giving you our test app might be possible, although it has a bunch of dependencies, and a few hard coded data paths at the moment so packaging it up would be a bit of a pain.

    One question regarding the underlying GM data manager...what happens if we try to load a dted (gmg) tile that is already loaded in GM? Internally is that tile data shared between 2 unique handles (i.e. reference counted)? Are the handles returned the same for the second instance? Or is the data/handle set completely separate?

    The reason I ask is that our system could potentially queue a given DTED tile for load that has already been loaded but is queued for unload through our data manager thread. Usually this wouldn't occur, but it's possible that in one update a tile will be seen as unused and added to a queue that the data manager thread unloads at its convenience. The next update could see that tile as needed and queue it for load. This new load could potentially occur before the old tile instance is actually released. It would be a rare occurrence, so I don't want to add the overhead of searching through the unload queue for each tile I would queue for load. Sorry if that's convoluted, but I figured it might be useful information. Hopefully this doesn't create some scenario where a shared tile is freed by unloading the first instance of it while the second is being used in GM_GetPathProfileLOSEx...
  • global_mapper
    global_mapper Administrator
    edited January 2013
    You can load an already loaded tile without any issues. It will be completely separate from the other and have its own cache. This is useful in case you want to apply different settings to the same file and have both versions loaded at once. So no issues there.

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited January 2013
    Hi Mike,

    I went ahead and temporarily added mutexed access to the GM API from our test app. By ensuring only one thread was ever getting through to Global Mapper at a time, the crash seems to have been eliminated. The test app ran for over an hour before I shut it down, whereas it would usually crash within 5 minutes.

    Just to re-iterate, we are using 3 calls from the API at the moment. The GM_LoadLayerList/GM_CloseLayer are called from a data manager sub thread, along with GM_GetPathProfileLOSEx on the main thread.

    I can package our test app for you if necessary, although it is a bit cumbersome with a bunch of 3rd party dependencies, and we use a proprietary meta file per *.gmg file containing basic attributes of the elevation tile. Because of this I'll have to send you the elevation data set as well as the binaries.

    Just let me know how you'd like to proceed!

    Thanks,
    Jeff
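The experiment described above (letting only one thread into the library at a time) can be sketched as a single gate that every call goes through. This is a minimal illustration rather than the test app's actual code: `SdkGate`, `FakeSdk` and `RunSerialized` are hypothetical names, and `FakeSdk::Work` merely stands in for any SDK entry point so the sketch can run without the SDK.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

// One shared lock guarding every call into a library that is assumed not to
// be safe for concurrent use.
class SdkGate {
public:
    template <typename Fn>
    auto Call(Fn&& fn) -> decltype(fn()) {
        std::lock_guard<std::mutex> lock(mMutex);
        return std::forward<Fn>(fn)();
    }

private:
    std::mutex mMutex;
};

// Hypothetical stand-in for the library. It records whether two threads
// were ever inside Work() at the same time.
struct FakeSdk {
    std::atomic<int> inside{0};
    std::atomic<bool> overlapped{false};

    int Work(int x) {
        if (inside.fetch_add(1) != 0) {
            overlapped = true;  // another thread was already inside
        }
        const int result = x * 2;
        inside.fetch_sub(1);
        return result;
    }
};

// Runs a "loader" thread and a "profile" loop through the same gate and
// returns true if the two never overlapped inside the fake library.
bool RunSerialized(int iterations) {
    SdkGate gate;
    FakeSdk sdk;

    std::thread loader([&gate, &sdk, iterations] {
        for (int i = 0; i < iterations; ++i) {
            gate.Call([&] { return sdk.Work(i); });
        }
    });
    for (int i = 0; i < iterations; ++i) {
        gate.Call([&] { return sdk.Work(i); });
    }
    loader.join();
    return !sdk.overlapped.load();
}
```

The trade-off is that a long load on the sub thread blocks every profile call on the main thread for its whole duration, so this is more useful as a diagnostic (does the crash disappear?) than as a permanent design.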
  • global_mapper
    global_mapper Administrator
    edited January 2013
    Jeff,

    I've made some more updates to fix potential issues with multi-threaded access to GM_GetPathProfileLOSEx along with load and close calls on different layers at the same time. In theory it should be working so long as the list of layers that you pass into GM_GetPathProfileLOSEx doesn't contain anything that gets closed during the call. However without an application that makes these calls it's hard to test for sure since I don't know where the actual failure is occurring. I have placed a new build at http://www.globalmapper.com/GlobalMapperSDK_v14_latest_beta.zip for you to try.

    Thanks,

    Mike
  • global_mapper
    global_mapper Administrator
    edited January 2013
    Jeff,

    If this doesn't help can you provide me with one of your sample calls to GM_GetPathProfileLOSEx so I can see what flags you are passing in?

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited January 2013
    Ok thanks, Mike. I will try the new build asap and let you know how it goes. In our case we always call GM_LoadLayerList/GM_CloseLayer on the same sub thread, so at the very least only one of those two should ever be called at the same time as GetPathProfileLOS.

    Here is our GetPathProfileLOS param set and call:
    GM_PathProfileLOSParams_t aParams;

    aParams.mSize = sizeof( GM_PathProfileLOSParams_t );
    aParams.mFlags = GM_PathProfile_LOSValid | GM_PathProfile_LOSIgnoreEndpoints |
                     GM_PathProfile_LOSFromHeightAbsolute | GM_PathProfile_LOSToHeightAbsolute;
    aParams.mStartX = pointA_Lon;
    aParams.mStartY = pointA_Lat;
    aParams.mEndX = pointB_Lon;
    aParams.mEndY = pointB_Lat;
    aParams.mElevList = NULL;
    aParams.mListSize = lookupResolution;
    aParams.mDfltElev = 0.0;
    aParams.mDetailsStr = NULL;
    aParams.mDetailsStrMaxLen = 0;
    aParams.mAtmosphericCorr = 1.0;
    aParams.mLOSFromHeight = pointA_AltMetresMSL;
    aParams.mLOSToHeight = pointB_AltMetresMSL;
    aParams.mLOSMinClearance = 0.0;
    aParams.mFresnelFreq = 0.0;
    aParams.mFresnelPctClear = 0;

    GM_Error_t32 err = GM_GetPathProfileLOSEx( &(tmpCombinedLayers[0]), (uint32)tmpCombinedLayers.Count(), &aParams );

    tmpCombinedLayers is a wxArray class instance containing our handles.

    I'm still doing my own tests to ensure we can't somehow pass handles into GM_GetPathProfileLOSEx that are being closed, but so far I don't see a way this could be happening.

    Thanks!
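One way to make "never pass a handle that is being closed" hold by construction is to pin handles for the duration of a profile call and have the unload thread skip anything still pinned. The sketch below is only an illustration of that idea under stated assumptions: `LayerHandle`, `LayerPins` and its methods are hypothetical, not part of the SDK or of the application discussed here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <vector>

// Hypothetical stand-in for an opaque layer handle; not the SDK's type.
using LayerHandle = std::uintptr_t;

// Tracks how many in-flight calls are using each loaded layer.
class LayerPins {
public:
    // Called by the loader once a layer is fully loaded.
    void Register(LayerHandle h) {
        std::lock_guard<std::mutex> lock(mMutex);
        mPins.emplace(h, 0);
    }

    // Pins every still-registered handle in `wanted` and returns that subset.
    // Only the returned handles should be passed to the profile call.
    std::vector<LayerHandle> PinAll(const std::vector<LayerHandle>& wanted) {
        std::lock_guard<std::mutex> lock(mMutex);
        std::vector<LayerHandle> pinned;
        for (LayerHandle h : wanted) {
            auto it = mPins.find(h);
            if (it != mPins.end()) {
                ++it->second;
                pinned.push_back(h);
            }
        }
        return pinned;
    }

    // Releases pins taken by PinAll once the profile call has returned.
    void UnpinAll(const std::vector<LayerHandle>& pinned) {
        std::lock_guard<std::mutex> lock(mMutex);
        for (LayerHandle h : pinned) {
            auto it = mPins.find(h);
            if (it != mPins.end() && it->second > 0) {
                --it->second;
            }
        }
    }

    // Returns true if the handle was unpinned and is now forgotten, so the
    // caller may close it. Returns false if it is in use (retry later) or
    // was never registered.
    bool TryRetire(LayerHandle h) {
        std::lock_guard<std::mutex> lock(mMutex);
        auto it = mPins.find(h);
        if (it == mPins.end() || it->second > 0) {
            return false;
        }
        mPins.erase(it);
        return true;
    }

private:
    std::mutex mMutex;
    std::map<LayerHandle, int> mPins;
};
```

The main thread would wrap each profile call in `PinAll`/`UnpinAll`, and the unload thread would only close a layer after `TryRetire` returns true. This guards the application side only; it does not address contention inside the library itself.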
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited January 2013
    Hi Mike,

    I've downloaded and tested the latest beta you linked. There seems to be a new issue not necessarily related to the threads themselves. In this build I am consistently getting an access violation/crash the first time I call GM_GetPathProfileLOSEx after loading my first data tile(s).

    As far as I can see, at the point of the crash GM_CloseLayer has not yet been called, although the sub thread could potentially be making other GM_LoadLayerList calls.
  • global_mapper
    global_mapper Administrator
    edited January 2013
    That's what I get for coding right before bed time many hours separated from caffeine :) I did some tests and found the problem so now I can get the path profile, at least in a single-threaded environment. The bug I fixed would have killed both single and multi-threaded. I have placed a new build at http://www.globalmapper.com/GlobalMapperSDK_v14_latest_beta.zip with the fix.

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited January 2013
    Haha well I appreciate you plugging away at this so late. I'll download the latest and give it a spin. I'm also in the process of packaging up my test application in case we can't get the problem resolved this way. I think I can get a reasonably simple set of binaries, data, etc together that should demonstrate the issue.
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited January 2013
    Hi Mike,

    The latest build seems to have done the trick! Given the sporadic nature of the crash, I won't completely relax til it's had some real soak testing, but it has run stably for over half an hour through our test application. Previous builds would have corrupted the heap in under ~5 mins.

    Thanks very much for all your effort around this. If we do see the crash again I'll let you know, but so far so good!
  • global_mapper
    global_mapper Administrator
    edited January 2013
    That's great news! These things are always so hard to find, but I did fix some things last night which could definitely affect this or anything accessing raster layers when some may be closing, even if the ones actually being used for an operation were not being closed. So this could fix things for other multi-threaded users.

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited April 2013
    Hi Mike,

    I've been investigating reports that we are still experiencing a software crash related to our dted functionality. Unfortunately it looks like we still have the problem that I thought we had fixed here previously. What seems to have changed is that the problem occurs less frequently and is harder to reproduce.

    With my test application, I often run for upwards of 1-2 hours of constant GM_GetPathProfileLOSEx() calls while loading/unloading unrelated sets of layers on a sub thread before I encounter the crash. When it does occur it appears identical to the previous issue. At the moment of the fatal access violation, our sub thread is loading/unloading a tile while the main thread is calling GM_GetPathProfileLOSEx() with one or more layers (often just one) that were loaded previously and should be fine.

    If it is any help I can package up my test application and a sample dted data set. The app is just a console window that spits out general info regarding our GetPathProfile request params/results as they occur. It loads an intermediate DLL that manages our tile loading/unloading through the GM API.

    Cheers!
  • global_mapper
    global_mapper Administrator
    edited April 2013
    I took a further look and discovered additional possibilities for thread contention and have added protection for them. I have placed a new SDK build with this change at http://www.globalmapper.com/GlobalMapperSDK_v14_latest_beta.zip .

    If this doesn't help, I could add a bunch of debug calls to a message callback (from GM_SetMessageCallback) so that you would have a constant log of where you are in the path profile code so when it crashes I could narrow it down to a specific section. But hopefully the change I just made will fix it and this won't be necessary.

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited April 2013
    Hi Mike,

    It seems that you've improved the stability again, but eventually I am still getting a crash. I ran my test app for a few hours in debug mode with no issues yesterday afternoon, then left a release build running overnight. Unfortunately it did crash sometime in the evening.

    I'd be happy to take the debug build of the SDK if you think that might narrow down where we are having problems. In the meantime, I'm looking at ways to reproduce the issue more quickly without complicating the test case.
  • global_mapper
    global_mapper Administrator
    edited April 2013
    Darn, must be another thread issue somewhere. I will add the debug logging support and get you a new build so you can run it and log the messages and then I can tell exactly where the path profile stuff is when it crashes.

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited April 2013
    Yep, these kinds of issues are always hard to isolate in a big piece of software.

    I really appreciate you working through this though, as the core functionality is critical to our application.

    I'll keep an eye out for your new test build and set it up when it's ready.

    Thanks!
  • global_mapper
    global_mapper Administrator
    edited April 2013
    I have added a bunch of debug error messages to the path profile operation so when it crashes we can check a log and see exactly where it happened. If you run 'regedit' and add a string named 'LOG_FILENAME' in 'HKEY_CURRENT_USER\Software\GlobalMapper' with the full name and path of a text file to write to, you should get a steady stream of messages with the stage of the path profile operation that is happening. Then if it crashes the last logged message should clue me in as to where to look.

    Note if you use GM_SetRegistryKey to have your setting stored somewhere else then the LOG_FILENAME will go there instead.

    The new SDK build with this change is at http://www.globalmapper.com/GlobalMapperSDK_v14_latest_beta.zip .

    Thanks,

    Mike
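Since the last logged message is what matters after a crash, a small helper that pulls the final non-empty line out of each log file can save some scrolling. This is a rough sketch that assumes the log is plain text with one message per line (the actual log format is not shown in this thread); `LastNonEmptyLine` and `LastLoggedMessage` are hypothetical helper names.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Returns the last non-empty line read from `in`, or an empty string if
// the stream holds no non-empty lines.
std::string LastNonEmptyLine(std::istream& in) {
    std::string line;
    std::string last;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings when the log is read elsewhere.
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        if (!line.empty()) {
            last = line;
        }
    }
    return last;
}

// Convenience wrapper for a log file on disk. A file that cannot be opened
// yields an empty string.
std::string LastLoggedMessage(const std::string& path) {
    std::ifstream file(path);
    return LastNonEmptyLine(file);
}
```

Calling `LastLoggedMessage` on the path stored in LOG_FILENAME after each crash gives the one line to compare across runs.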
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited April 2013
    Great, thanks.

    I'm running the build now. I initially ran my test app in debug config, and twice in a row encountered crashes quite quickly.

    I ran a third test in release. It ran longer but then had a similar crash. The call stack at the time of crashing looks different than what I'm used to. In all three cases the call stack ended up in the "_invalid_parameter_noinfo(void)" function of invarg.c within the Microsoft runtime stuff.

    I've attached a zip containing logs from the 3 runs. global_mapper_log_D1.txt and global_mapper_log_D2.txt are the two initial debug build runs of my test app, and global_mapper_log_R1.txt is the third run in release mode.
  • global_mapper
    global_mapper Administrator
    edited April 2013
    Can you tell which thread is hitting the _invalid_parameter_noinfo function? The logs all end in completely different places for the path profile stuff, with one not being in it at all when it crashes. And of the 2 crashes, only one is in a place that might have anything to do with layer access, so I'm wondering where the other threads are as it doesn't look like the path profile one is necessarily the one that's crashing.

    Thanks,

    Mike
  • jkenneally
    jkenneally Global Mapper User Trusted User
    edited April 2013
    The behavior generally seems to be that my child thread is calling GM_LoadLayerList while the main thread is calling GM_GetPathProfileLOSEx.