Text search using Stochastic Diffusion Search

Stochastic Diffusion Search (SDS), a multi-agent population-based global search and optimization algorithm, is a distributed mode of computation utilizing interaction between simple agents. SDS shows off a strong mathematical framework. It is robust, has minimal convergence criteria and linear time complexity.

SDS has been applied to diverse problems such as text search, object recognition, feature tracking, mobile robot self-localization and site selection for wireless networks. As a part of my Optimization Techniques laboratory project, I implemented a text search using SDS.

Basic SDS Algorithm

SDS algorithm has many variations. Following is the basic version.

1. For all agents do
2. INITIALIZE: Agent picks a random hypothesis 
3. TEST: Agent partially evaluates her hypothesis 
    • If test criterion = TRUE, agent = Active (satisfied) 
    • Else agent = Inactive (dissatisfied) 
4. DIFFUSE
    • Inactive agent meets a randomly chosen agent 
    • Inactive agent update/change hypothesis 
5. REPEAT until Halting criterion.

SDS Text Search Algorithm

INITIALIZATION PHASE
Each agents selects a haystack offset (hypothesis) at random.
WHILE (NOT all agents are active)
    TESTING PHASE
        Each agent randomly selects an offset less than or equal to the length of needle (needle offset) and match the character in haystack at (haystack offset + needle offset) with the character in needle at needle offset
        IF the letters match
            Agent is active.
        ELSE
           Agent is inactive.

    DIFFUSION PHASE
    Each inactive agent selects another agent at random.
    
    IF the selected agent is active
        The inactive agent adopts the hypothesis (index) of that agents.
    ELSE
        The inactive agent selects a new index (hypothesis) at random.
END WHILE

Each agent haystack offset is the starting index of needle.

Implementation

The following link links to the Java implementation of above algorithm on GitHub.

https://github.com/rajbdilip/stochastic-diffusion-search

Observation

Stochastic Diffusion Search text search is linear and fast. Agents initially search finish communicating with each other and hence finish searching in very less number of iterations. i.e. Around 100 iterations with 5 agents in the haystack of length around 500. For smaller increments in the number of agents the number of iterations required to converge to the solution decreases. But it cannot be ensured the big increments will significantly reduce the number of iterations. The number of iterations can rather increase.

However, if the needle is present in more than one offset, it cannot be guaranteed that this algorithm will find all of the occurrences regardless of the number of agents used. Since SDS is random, a different offset is returned every time.

References

al-Rifaie, Mohammad Majid, and John Mark Bishop. “Stochastic diffusion search review.” Paladyn, Journal of Behavioral Robotics 4.3 (2013): 155-173.

2 thoughts on “Text search using Stochastic Diffusion Search”

Andrew Martin says:

April 28, 2017 at 3:48 pm

I’ve just glanced my eye over the article and code, but couldn’t work out why you’ve made a Random number generator that can exclude a number.

SDS works fine with agents selecting themselves for example.

1. Dilip Raj Baral says:
  
  May 1, 2017 at 7:00 pm
  
  Hi, Andrew.
  
  The Random generator number generates a random number that hasn’t been generated before. It generates positions (hypothesis) which are assigned to agents. Since two agents shouldn’t work on the same hypothesis, it is required to generate a new random number that hasn’t been used before.

Dilip Raj Baral

A blog