View Issue Details

IDProjectCategoryLast Update
0026263AI War 2Crash/ExceptionFeb 5, 2022 9:19 pm
ReporterDaniexpert Assigned ToChris_McElligottPark  
Status resolvedResolutionfixed 
Product VersionBeta 3.794 Synchronicity 
Fixed in VersionBeta 3.794 Synchronicity 
Summary0026263: Called CreateExternalBaseInfo<Arcen.AIW2.External.SappersPerUnitBaseInfo>() when BaseInfo was actually a different type
DescriptionRevision 16089
TagsNo tags attached.

Relationships

related to 0026270 resolvedChris_McElligottPark Hit exception in UpdateWatchtowers debugCode 300 and Sappers Stage3 error: System.Exception 
related to 0026277 resolvedChris_McElligottPark Sapper exceptions - Watchtowers have wrong BaseInfo 

Activities

Daniexpert

Feb 5, 2022 3:26 pm

manager  

exception.txt (5,598 bytes)   
2/5/2022 9:21:19 PM	3.793	SINGLEP	DELAYED2 TID29 Alerted Aggression Eye #19106 AI Sentinels on Kuroe is sending 351x VWing against test (Necro Empire)
2/5/2022 9:21:19 PM	3.793	SINGLEP	DELAYED3 TID29 Alerted Aggression Eye #19106 AI Sentinels on Kuroe is sending 5x MLRSGuardian against test (Necro Empire)
2/5/2022 9:22:05 PM	3.793	SINGLEP	DELAYED4 TID6 Error A!  Called CreateExternalBaseInfo<Arcen.AIW2.External.SappersPerUnitBaseInfo>() when BaseInfo was actually a different type, Arcen.AIW2.External.ElderlingsPerUnitBaseInfo on SapperEmeraldBlaster
  at System.Environment.get_StackTrace () [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at Arcen.Universal.ArcenDebugging.ArcenDebugLog (System.String Message, Arcen.Universal.Verbosity Verbosity) [0x00000] in <2b7024b982f642e4ace865cef8e0523e>:0 
  at Arcen.AIW2.Core.GameEntity_Squad.CreateExternalBaseInfo[T] (System.String BaseInfoSourceName) [0x00000] in <fec75340adc94ef4a99956478c4ea719>:0 
  at Arcen.AIW2.External.SappersFactionDeepInfo+<>c__DisplayClass14_2.<UpdateWatchtowers>b__0 (Arcen.Universal.KeyValuePair`2[TKey,TValue] pair) [0x00000] in <813acd289d5a493c91af159ea484335e>:0 
  at Arcen.Universal.Dictionary`2[TKey,TValue].DoFor (Arcen.Universal.DictionaryProcessor`2[TKey,TValue] Processor) [0x00000] in <2b7024b982f642e4ace865cef8e0523e>:0 
  at Arcen.AIW2.External.SappersFactionDeepInfo.UpdateWatchtowers (Arcen.AIW2.Core.ArcenHostOnlySimContext Context) [0x00000] in <813acd289d5a493c91af159ea484335e>:0 
  at Arcen.AIW2.External.SappersFactionDeepInfo.DoPerSecondLogic_Stage3Main_OnMainThreadAndPartOfSim_HostOnly (Arcen.AIW2.Core.ArcenHostOnlySimContext Context) [0x00000] in <813acd289d5a493c91af159ea484335e>:0 
  at Arcen.AIW2.Core.Faction.Safe_DeepInfo_DoPerSecondLogic_Stage3Main_OnMainThreadAndPartOfSim_HostOnly (Arcen.AIW2.Core.ArcenHostOnlySimContext Context) [0x00000] in <fec75340adc94ef4a99956478c4ea719>:0 
  at Arcen.AIW2.External.EntitySimLogicImplementation_BaseInfo+<>c__DisplayClass110_0.<DoWorldSecondLogic_FromSimBGThread>b__4 (Arcen.AIW2.Core.Faction faction) [0x00000] in <7f2d4c20a2834e8d94dcaecd85f88cb3>:0 
  at Arcen.AIW2.Core.World_AIW2.DoForFactions (Arcen.AIW2.Core.Faction+ProcessorDelegate Processor) [0x00000] in <fec75340adc94ef4a99956478c4ea719>:0 
  at Arcen.AIW2.External.EntitySimLogicImplementation_BaseInfo.DoWorldSecondLogic_FromSimBGThread (Arcen.AIW2.Core.ArcenClientOrHostSimContextCore Context) [0x00000] in <7f2d4c20a2834e8d94dcaecd85f88cb3>:0 
  at Arcen.AIW2.External.EntitySimLogicImplementation_BaseInfo.DoWorld_PerStep_TimeKeeping (Arcen.AIW2.Core.ArcenClientOrHostSimContextCore Context) [0x00000] in <7f2d4c20a2834e8d94dcaecd85f88cb3>:0 
  at Arcen.AIW2.External.EntitySimLogicImplementation_BaseInfo.DoWorldStepLogic_ClientOrHost_FromSimBGThread (Arcen.AIW2.Core.ArcenClientOrHostSimContextCore Context) [0x00000] in <7f2d4c20a2834e8d94dcaecd85f88cb3>:0 
  at Arcen.AIW2.External.SimExecution.Execute () [0x00000] in <7f2d4c20a2834e8d94dcaecd85f88cb3>:0 
  at Arcen.AIW2.External.ArcenClientOrHostSimPlanningContext.BackgroundThreadRunHandler (System.Single SuicidesAfterTime, System.Boolean FailSilentlyIfNotFinishedYet) [0x00000] in <7f2d4c20a2834e8d94dcaecd85f88cb3>:0 
  at Arcen.AIW2.External.ArcenClientOrHostSimPlanningContext+<>c__DisplayClass15_0.<RunOnBackgroundThread>b__0 () [0x00000] in <7f2d4c20a2834e8d94dcaecd85f88cb3>:0 
  at System.Threading.Tasks.Task.InnerInvoke () [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at System.Threading.Tasks.Task.Execute () [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at System.Threading.Tasks.Task.ExecutionContextCallback (System.Object obj) [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, System.Object state, System.Boolean preserveSyncCtx) [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at System.Threading.Tasks.Task.ExecuteWithThreadLocal (System.Threading.Tasks.Task& currentTaskSlot) [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at System.Threading.Tasks.Task.ExecuteEntry (System.Boolean bPreventDoubleExecution) [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at System.Threading.Tasks.Task.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at System.Threading.ThreadPoolWorkQueue.Dispatch () [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 
  at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback () [0x00000] in <695d1cc93cca45069c528c15c9fdd749>:0 

2/5/2022 9:22:05 PM	3.793	SINGLEP	DELAYED5 TID6 Hit exception in UpdateWatchtowers debugCode 2500 System.NullReferenceException: Object reference not set to an instance of an object
  at Arcen.AIW2.External.SappersFactionDeepInfo+<>c__DisplayClass14_2.<UpdateWatchtowers>b__0 (Arcen.Universal.KeyValuePair`2[TKey,TValue] pair) [0x0011b] in <813acd289d5a493c91af159ea484335e>:0 
  at Arcen.Universal.Dictionary`2[TKey,TValue].DoFor (Arcen.Universal.DictionaryProcessor`2[TKey,TValue] Processor) [0x0001e] in <2b7024b982f642e4ace865cef8e0523e>:0 
  at Arcen.AIW2.External.SappersFactionDeepInfo.UpdateWatchtowers (Arcen.AIW2.Core.ArcenHostOnlySimContext Context) [0x00527] in <813acd289d5a493c91af159ea484335e>:0 
exception.txt (5,598 bytes)   

Chris_McElligottPark

Feb 5, 2022 5:54 pm

administrator   ~0064143

Ow. This proves that new units are being accessed by two different threads at once:

                            for ( int j = 0; j < data.ShipsInside[pair.Key]; j++ )
                            {
                                debugCode = 2300;
                                ArcenPoint spawnLocation = entity.Planet.GetSafePlacementPoint_AroundEntity( Context, pair.Key, entity, FInt.FromParts( 0, 005 ), FInt.FromParts( 0, 030 ) );
                                GameEntity_Squad newEntity = GameEntity_Squad.CreateNew_ReturnNullIfMPClient( pFaction, pair.Key, entity.CurrentMarkLevel,
                                       pFaction.Faction.LooseFleet, 0, spawnLocation, Context, "Sappers-FromWatchtower" ); //is fine, main sim thread
                                debugCode = 2400;
                                if ( newEntity != null )
                                {
                                    newEntity.ShouldNotBeConsideredAsThreatToHumanTeam = true; //just in case
                                    newEntity.Orders.SetBehaviorDirectlyInSim( EntityBehaviorType.Attacker_Full, -1 ); //is fine, main sim thread
                                }
                                debugCode = 2500;
                                SappersPerUnitBaseInfo newData = newEntity.CreateExternalBaseInfo<SappersPerUnitBaseInfo>( "SappersPerUnitBaseInfo" );
                                newData.HomeWatchtowerId = entity.PrimaryKeyID;
                            }

Either that, or that the external data is not being cleared properly, I suppose. So it could still be either. I'll have to think about this.

Chris_McElligottPark

Feb 5, 2022 6:40 pm

administrator   ~0064144

If we see this again, then we'll know this time that it's a cleanup issue. Cheers!

* Based on some of the recent bugs we have had, and in particular one that DaniExpert ran into, I am starting to strongly suspect that in a very small minority of cases, the game is actually giving back a single item from a pool to multiple calling threads.
** In general, there are two types of pools that are relevant for these purposes: TimeBasedPool<> and ConcurrentPool<>. Both of these previously used traditional locks with internal Queue<>s, but were something I transitioned over to internally having ConcurrentQueue<>, instead.
** ConcurrentQueue<>, ConcurrentDictionary<>, and ConcurrentBag<> are all three data types that I am getting increasingly suspicious-of. They seem to prize throughput over accuracy, whereas we need absolutely pristine accuracy (these ships that are double-returned are very much a tiny minority case, and only on some machines some of the time, one or two bad returns out of hundreds of thousands, but this is still way too frequent for us).
*** A build or so ago I already transitioned our central dictionaries away from ConcurrentDictionary<> over to regular Dictionary<> collections with write-specific locks surrounding it to maximize throughput while not allowing for write-based errors of any sort.
*** Now I'm doing the same for TimeBasedPool<> and ConcurrentPool<>, except in their case it's for read and write. We do multiple orders of magnitude fewer reads from these than we do from the central dictionaries, and read and write are both equally volatile from our perspective here.
** So far, the performance seems functionally identical, which is not surprising given the other architectural improvements to the game. The end result of this should hopefully be that we no longer get units jumping ownership, or the many other strange related-seeming bugs that might come from that.
*** Originally I made the switch to the newer .NET data structures because I assumed that they would maintain perfect accuracy like locks do (I have not seen anything stating that they do not), but without the risk of deadlocks, and potentially with some improvements to performance.
*** That said, over the last 8 months I've wound up making a lot of other performance improvements to the game, and it turns out that these are not part of the critical path for performance (either they are not now, or they never were). So all the performance improvements since the last stable version have mostly come from structural shifts to code, shifts to the threading model, and etc. With that in mind, we don't lose anything by switching back to simpler and more battle-tested locking strategies like this.

Issue History

Date Modified Username Field Change
Feb 5, 2022 3:26 pm Daniexpert New Issue
Feb 5, 2022 3:26 pm Daniexpert File Added: exception.txt
Feb 5, 2022 5:54 pm Chris_McElligottPark Note Added: 0064143
Feb 5, 2022 6:40 pm Chris_McElligottPark Assigned To => Chris_McElligottPark
Feb 5, 2022 6:40 pm Chris_McElligottPark Status new => resolved
Feb 5, 2022 6:40 pm Chris_McElligottPark Resolution open => fixed
Feb 5, 2022 6:40 pm Chris_McElligottPark Fixed in Version => Beta 3.794 Synchronicity
Feb 5, 2022 6:40 pm Chris_McElligottPark Note Added: 0064144
Feb 5, 2022 8:32 pm BadgerBadger Relationship added related to 0026270
Feb 5, 2022 9:19 pm Daniexpert Relationship added related to 0026277