Autonomous Island Defense (AID) Algorithm Introduction

Background: as advanced algorithms (that’s “AI”) become more common and useful, there is a constant background fear of Skynet and Terminator-style AI robots destroying everything. The truth is less exciting: killer robots require lots of very expensive sensors and compute capability. Their programming tends to be much more brittle, and slower to update, than humans are. They integrate more poorly with other systems, whereas humans integrate well with most existing infrastructure. The simple truth is that killer humans remain the go-to tool on the battlefield, for cost and utility reasons, and likely will remain so for quite some time.

Put another way, you may have seen how hard self-driving cars have been to develop, with many efforts failing despite billions in development dollars, and those that are succeeding still struggling to show a clear value advantage over human drivers. Autonomous killer robots are an order of magnitude more complex than that.

That’s not to say computers and algorithms won’t become even more common. Indeed, the demands of the military created many modern electronics and computers to begin with. It is just unlikely that fully autonomous systems will become major players anytime soon.

The main challenge with autonomous systems is the complexity of the battlefield, and the exponential increase in costs required to handle more complexity. But what if there was a way to keep things simple, by focusing on only a single, relatively simple but useful scenario? I very much wanted to come up with such a scenario that would help Ukraine, but so far I have not come up with a clear example. However, I did eventually think of one scenario that could be of value, one that might just help prevent the next world war.

It’s no secret that a certain powerful nation really wants to attack and destroy a peaceful democratic island nation, basically a case of sibling rivalry taken to an extreme. Doing so would start a major war, because said island nation is both ideologically and economically very important to many other democratic nations. That attack will require a large-scale amphibious invasion, the type of operation that is quite risky to attempt even in training. Anything which can make that operation even more risky to attempt should significantly reduce the probability of that war starting in the first place.

This is our simple scenario for a viable and useful autonomous system. The goal is to target the amphibious vehicles and small to medium-sized boats that are required to move everything onto shore, using an algorithm that can run on very cheap hardware. Almost any little computer and cheap little camera should suffice, likely mounted to a fixed-wing one-way attack drone. The scenario is ideal because basically anything truck-like or boat-like in the water in the area of a large-scale amphibious landing can be assumed to be an enemy combatant.

The algorithm as presented here was written so that it could be adapted for a microcontroller, but for a general frame of reference, a $15 Raspberry Pi Zero should be sufficient for the algorithm processing and guidance calculations. A complete drone could likely be built and delivered for as little as a few hundred dollars. While that may still sound like a decent sum of money to many people, in the world of military expenditure that is dirt cheap. A thousand of these could be launched per day and still be considered a tiny expense.

In order to make this ethical in application (as far as one can in war), there are a few considerations:

- Doesn’t target individual humans, only large equipment
- Red cross filter: objects with large red and white crosses (like a hospital ship) are removed
- Civilians are unlikely to be foolish enough, or to be allowed, to go out fishing on a boat in the middle of an invasion fleet

Then there is the question of whether it is ethical to make this open source, where any person can access it:

- Water environments only: not much appeal for abuse or terror
- Not particularly effective on its own. This wouldn’t be much of a threat alone, only when used in massive numbers alongside all the other tools of an advanced military (like some level of airspace denial)
- Non-selective, indeed error prone (as in attacking rocks sometimes): only useful in target-rich environments and not for tasks like selective commerce harassment
- This is just an algorithm, not a complete final product, and still requires significant skill and resources to make into anything more
- Dual use: there is utility in these algorithms for civilian watercraft, simple toys, or beach safety

The primary advantage of a system with such an algorithm is not raw destruction, but to decoy, saturate, and exhaust enemy air defenses and their operators to clear the way for more expensive weapons, all while being immune to many countermeasures such as most electronic warfare. They could be used by defenders with limited training and severely degraded communications. All they need to know is the general area of a large-scale landing.

Note that the main limitation with using a true microcontroller is RAM rather than compute cycles. Most have less than 1 MB of RAM. A device with 8 MB or so of PSRAM is likely the minimum viable hardware.
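To make the RAM point concrete, here is some rough arithmetic. The frame sizes and the set of buffers below (one RGB frame, one float32 smoothed copy, one 32-bit label map) are assumptions chosen for illustration, not measurements from the proof of concept.

```python
MB = 1024 * 1024


def frame_bytes(width, height, channels=1, bytes_per_sample=1):
    """Bytes needed to hold one uncompressed image buffer."""
    return width * height * channels * bytes_per_sample


def working_set_bytes(width, height):
    """Assumed working set: RGB frame + float32 smoothed copy + int32 label map."""
    rgb = frame_bytes(width, height, channels=3)
    smoothed = frame_bytes(width, height, bytes_per_sample=4)
    labels = frame_bytes(width, height, bytes_per_sample=4)
    return rgb + smoothed + labels


for name, (w, h) in {"QVGA 320x240": (320, 240), "VGA 640x480": (640, 480)}.items():
    total = working_set_bytes(w, h)
    print(f"{name}: {total} bytes ({total / MB:.2f} MB)")
```

Under these assumptions, even a 320x240 working set takes up most of 1 MB before any code, stack, or model weights are counted, and 640x480 needs several MB, which is consistent with the 8 MB estimate above.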

Algorithm Outline:

Phase 0: launch, simple flight controls, and inertial guidance toward the beachhead. The expectation is that the landing zone (the target area) is large enough that precise arrival localization is not required.
Phase 1: beach arrival detection, ongoing during Phase 0. Images are segmented for water (sea), sky, and obstacle (everything else). Essentially, wait until the image becomes mostly sea and sky, then enter Phase 2.
Phase 2: target detection and selection. Option 1 here is to use the obstacles in the water from the Phase 1 segmentation, applying heuristics to identify likely targets. Option 2 is a lightweight object detection neural network run on obstacles in the water. When an appropriate target is found, select it and begin Phase 3. If multiple targets are found, choose based on criteria like confidence score and proximity to the center of the current flight path.
Phase 3: terminal guidance. Use object tracking algorithm to track target frame to frame and adjust course as needed.

Algorithm Technical Outline

Phase 1:
- Separable Gaussian filter for noise removal
- Local window feature extraction
- Segmentation prediction with decision tree
- Postprocess:
  - skyline detection and sea/sky cleaning
  - smoke removal and dilation
  - Connected Component Labeling (CCL) (Hoshen-Kopelman) and filtering
- Calculate beach arrival criteria for Phase 2
  - Note this is tolerant of some segmentation error, Phase 2 is more sensitive
Phase 2:
- Option 1: heuristics with obstacles from CCL (pictured here)
- Option 2: lightweight object detection NN, e.g. SSD MobileNet V2 on objects in water (boat, truck at 25% confidence or higher)
- Red cross filter
Phase 3:
- MOSSE correlation filter for tracking, gradual update on high confidence
- region of interest calculation
- small random offset for distributed arrival
- MAVLink outputs
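Two of the generic image-processing steps listed under Phase 1, the separable Gaussian filter and connected component labeling, are standard textbook techniques and can be sketched in a few lines. This is a minimal NumPy illustration, not the code from the proof of concept: the function names, the default `sigma`, the edge padding, and the use of 4-connectivity are all assumptions, and the labeling is a two-pass union-find in the Hoshen-Kopelman style.

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius=None):
    """Normalized 1-D Gaussian kernel (radius defaults to about 3 sigma)."""
    if radius is None:
        radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x * x) / (2.0 * sigma * sigma))
    return k / k.sum()


def separable_gaussian(image, sigma=1.0):
    """Blur a 2-D image with two 1-D passes (rows, then columns)."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    img = np.asarray(image, dtype=np.float64)
    padded = np.pad(img, ((0, 0), (r, r)), mode="edge")
    rows = np.stack([np.convolve(row, k, mode="valid") for row in padded])
    padded = np.pad(rows, ((r, r), (0, 0)), mode="edge")
    return np.stack([np.convolve(col, k, mode="valid") for col in padded.T]).T


def label_components(mask):
    """Two-pass, 4-connected labeling of True pixels using union-find."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=np.int32)
    parent = [0]  # parent[i] is the union-find parent of provisional label i

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # First pass: assign provisional labels and record equivalences.
    for y in range(h):
        for x in range(w):
            if not mask[y, x]:
                continue
            up = int(labels[y - 1, x]) if y > 0 else 0
            left = int(labels[y, x - 1]) if x > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))
                labels[y, x] = len(parent) - 1
            elif up and left:
                ru, rl = find(up), find(left)
                parent[max(ru, rl)] = min(ru, rl)
                labels[y, x] = min(ru, rl)
            else:
                labels[y, x] = up or left

    # Second pass: replace provisional labels with compact final labels 1..n.
    remap = {}
    for y in range(h):
        for x in range(w):
            if labels[y, x]:
                root = find(int(labels[y, x]))
                labels[y, x] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)


def component_areas(labels):
    """Pixel count per label, index 0 corresponding to label 1."""
    return np.bincount(labels.ravel())[1:]
```

Filtering by size, as mentioned in the outline, then amounts to comparing `component_areas(labels)` against a minimum pixel count. The pure-Python loops are written for clarity. An optimized version for a small device would look quite different.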

As of writing, only a limited proof of concept has been put together. Here are some observations from the work so far.

Initial testing suggested, contrary to many people’s expectations, that neural network based approaches, both image segmentation and image classifiers (“seashore”), perform poorly for Phase 1. Fine tuning might help, but there is also a compute runtime concern, particularly for advanced image segmentation. The local window feature and decision tree method here is likely the best option when properly optimized.

For Phase 2, neural network object detection looks promising. For example, SSD MobileNet V2, a fast and lightweight pretrained model, looked useful for target identification when using the COCO output labels of “boat” and “truck” at 25% confidence or higher. More fine-tuned options might work, but there is also a concern that a more specific and more finely tuned model might be more brittle and easier to trick. Lightweight models tend to run best on small image sizes, which is helped here by cropping based on the Phase 1 segmentation to feed in only the areas of highest relevance.

Phase 3 is the trickiest from a flight control perspective, and has the most stringent needs for low-latency updates. It is a fairly well-studied problem, however, and there is plenty of existing research to borrow ideas from for optimization.

Proof of Concept accessible here: https://github.com/winedarksea/aid_algorithm

Colin Catlin, January 2025
