Informed Search Strategies for State Space Search Solving

NEHAAL Last Updated : 02 Oct, 2022


This article was published as a part of the Data Science Blogathon.

Introduction

In the last article, we learned about various blind search algorithms, so called because they are given no information beyond the constraints laid out in the problem. As a result, these algorithms traverse many different states before reaching the goal state. Their main disadvantage is very poor time complexity, which makes them impractical for most real-world problems.

This problem can be overcome by providing heuristics, i.e., ‘experience from exposure’, to guide the search. This gives way to Informed Search Strategies. With a good heuristic, these algorithms usually reach a solution after exploring far fewer states, so their running time in practice is much lower, although a heuristic by itself does not guarantee that a solution is found.

Informed Search Strategies use a heuristic function to choose which node to expand next, favouring the most promising way to reach the goal. A heuristic function, h(n), is the most common way to give the search more information about the problem: it is the estimated cost of reaching the goal state from a particular node, n. Any problem-specific function is acceptable as long as it is nonnegative, and h(n) = 0 if n is a goal node.
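As a concrete illustration, here is a minimal sketch of one possible heuristic. It assumes a grid world in which states are (x, y) cells and every move to a neighbouring cell costs 1. The grid setting and the name manhattan are assumptions made for this example, not part of the original problem statement.

```python
def manhattan(node, goal):
    """Estimated cost from node to goal on a grid with unit-cost moves.

    Nonnegative for every node, and 0 exactly when node is the goal.
    """
    (x1, y1), (x2, y2) = node, goal
    return abs(x1 - x2) + abs(y1 - y2)


goal = (3, 4)
print(manhattan((0, 0), goal))  # 7
print(manhattan(goal, goal))    # 0
```

The function satisfies both requirements stated above: it never returns a negative value, and it returns 0 at the goal.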

The two Informed Search strategies which we will see in this article are:

Best First Search Algorithm
A* Search Algorithm

Best First Search Algorithm

Best-First Search (abbreviated here as Informed BFS, not to be confused with breadth-first search) follows a greedy approach for state transitions to reach a goal. For every node, we maintain an evaluation function, f(n), which provides a cost estimate. The idea is to expand the node with the lowest f(n) every time. Here the evaluation function consists only of the heuristic function h(n), i.e.

f(n) = h(n)

The node that seems to be closest to the goal is therefore expanded first. The algorithm is implemented using a priority queue ordered by the evaluation function of each node. The pseudocode looks like this:

function Best_First_Search(problem)
{
    node = problem.initial_state
    frontier = priority queue ordered by f(n), with node as the only element
    explored_set = empty set
    loop (until goal state is found)
        if (frontier is empty)
            return failure
        node = frontier.pop()    // node with the lowest f(n)
        if (node == problem.goal_state)
            return path from initial state up to the node
        add node to explored_set
        for each successor of node
            if (successor not in frontier and not in explored_set)
                add successor to frontier
    end loop
}
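The pseudocode above can be sketched in Python roughly as follows. This is an illustrative version under a few assumptions: the graph, the heuristic table h_table, and the callables neighbors and h are hypothetical names invented for the example, and a single parent dictionary stands in for both the frontier membership test and the explored set.

```python
import heapq


def best_first_search(start, goal, neighbors, h):
    """Greedy best-first search: always expand the node with the lowest h(n)."""
    frontier = [(h(start), start)]   # priority queue ordered by f(n) = h(n)
    parent = {start: None}           # every node generated so far, with its back-link
    while frontier:
        _, node = heapq.heappop(frontier)
        if node == goal:
            path = []
            while node is not None:  # follow back-links to the initial state
                path.append(node)
                node = parent[node]
            return path[::-1]
        for succ in neighbors(node):
            if succ not in parent:   # not in the frontier and not explored
                parent[succ] = node
                heapq.heappush(frontier, (h(succ), succ))
    return None                      # frontier is empty: failure


# Hypothetical example graph and heuristic values.
graph = {"S": ["A", "B"], "A": ["G"], "B": ["C"], "C": ["G"], "G": []}
h_table = {"S": 5, "A": 3, "B": 2, "C": 1, "G": 0}

path = best_first_search("S", "G", lambda n: graph[n], lambda n: h_table[n])
print(path)  # ['S', 'B', 'C', 'G']
```

In this example the search returns a three-step path even though S, A, G needs only two steps, because B looks closer to the goal than A according to h. This is the greedy behaviour that makes the algorithm non-optimal.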

Properties of Best-First Search Algorithm:

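Completeness: No in general. Without tracking repeated states it can get stuck in a loop, and in an infinite state space it can follow a path that never reaches the goal. With an explored set, as in the pseudocode above, it does terminate on finite state spaces.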

Space Complexity: O(b^m)

Time Complexity: O(b^m) (but a good heuristic can make a drastic improvement.)

Optimal: No

(b is the branching factor, and m is the maximum depth of the search space.)

A* Search Algorithm

A* Search is one of the most widely used strategies because it finds an optimal solution when its heuristic is admissible. The idea is to avoid expanding paths that are already expensive. Like Best-First Search, it uses an evaluation function, f(n), to evaluate every node (state) on the path.

f(n) = g(n) + h(n)

Where g(n) is the actual cost from the initial state to the current node.
h(n) is the estimated cost from the current node to the goal state.

The A* algorithm finds the lowest-cost path and, with an informative heuristic, typically expands far fewer nodes than an uninformed search. This makes it one of the most widely accepted approaches to state space search. The optimality of the A* solution depends on the admissibility of the heuristic function we choose, which is discussed later in this article.

Algorithm for A* Search Strategy

Step 1: Initialise an empty set (explored). Initialise a priority queue (frontier) ordered by the evaluation function of the nodes. Add the initial state to the queue with its evaluation function, f(start) = g(start) + h(start) = 0 + h(start), since g(start) = 0.

Step 2: Loop until a goal is found. If the frontier is empty, return failure. Otherwise, pop the node with the lowest evaluation function (best_node) from the frontier and add it to the explored set. Check whether best_node is the goal state; if yes, return the solution. Otherwise, generate its successors.

Step 3: For every successor generated, do the following:

3.1: Set successor to point back to the best_node. (These backward links help recover the solution path from the initial state to the goal state.)

3.2: Compute the actual cost for the successor, i.e. g(successor) = g(best_node) + actual cost of getting from best_node to the successor.

3.3: if the successor is already in frontier(i.e., a path already exists to reach this node), we call it old_node.

3.4: Add old_node to the list of best_node’s successors. Now, we will compare the actual cost(g(n)) for reaching the old_node via its previous path and the current new path(i.e., via best_node)

3.5: If the old path is cheaper, we keep it and move on to the next successor. Otherwise, we update old_node’s parent link to point to best_node and update g(old_node) and the evaluation function f(old_node).

3.6: If the successor is present in explored (i.e., we have visited this node before), we call it old_node. We repeat steps 3.4 and 3.5 to see if we get a new, better path. If we do, we must also propagate the improvement to old_node’s successors, because their g values depend on g(old_node).

3.7: If the successor is in neither the frontier nor the explored set, we add it to the frontier and to the list of best_node’s successors. We compute its evaluation function,

f(successor) = g(successor) + h(successor)
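The steps above can be sketched in Python as follows. This is a simplified variant rather than a line-by-line transcription: instead of re-linking old_node in place (steps 3.3 to 3.6), it pushes a new queue entry whenever a cheaper path is found and skips outdated entries when they are popped. The graph, the heuristic table h_table, and the callables neighbors and h are hypothetical names chosen for the example.

```python
import heapq


def a_star(start, goal, neighbors, h):
    """A* search. neighbors(n) yields (successor, step_cost) pairs."""
    g = {start: 0}                      # cheapest known cost from start (g values)
    parent = {start: None}              # back-links for rebuilding the path
    frontier = [(h(start), 0, start)]   # entries are (f, g, node), ordered by f
    while frontier:
        _, g_best, best = heapq.heappop(frontier)
        if g_best > g[best]:
            continue                    # outdated entry: a cheaper path was found later
        if best == goal:
            path = []
            node = best
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g[best]
        for succ, cost in neighbors(best):
            new_g = g[best] + cost                  # step 3.2
            if succ not in g or new_g < g[succ]:    # new node, or a cheaper path to an old one
                g[succ] = new_g
                parent[succ] = best                 # steps 3.1 and 3.5
                heapq.heappush(frontier, (new_g + h(succ), new_g, succ))
    return None                         # frontier is empty: failure


# Hypothetical weighted graph and heuristic values.
graph = {
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 2)],
    "G": [],
}
h_table = {"S": 4, "A": 3, "B": 2, "G": 0}

result = a_star("S", "G", lambda n: graph[n], lambda n: h_table[n])
print(result)  # (['S', 'A', 'B', 'G'], 5)
```

In this run, B is first reached directly from S with g = 4 and later through A with g = 3, so its parent link and cost are updated, which is the situation described in step 3.5.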

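The worst-case time complexity of this algorithm is O(b^d); a good heuristic can reduce the number of nodes expanded considerably.

The worst-case space complexity of this algorithm is O(b^d), because the generated nodes are kept in memory.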

(where b is the branching factor of the tree and d is the depth of the solution node.)

The optimality of the solution depends on the heuristic being admissible. So, what are admissible heuristics?

Admissible Heuristics

A heuristic function h(n) is admissible if, for every node n, h(n) is less than or equal to h*(n), where h*(n) is the actual cost of the cheapest path from n to the goal state. (Note that this is different from g(n), which measures the cost from the initial state to n.) Therefore, an admissible heuristic never overestimates the cost of reaching the goal. It is always optimistic about finding the best path to the goal node. If h(n) is admissible, the A* algorithm described above returns an optimal solution to our problem.
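On a small explicit graph, admissibility can be checked directly by comparing h(n) with the true cheapest cost from n to the goal. The sketch below takes admissibility to mean that h never overestimates that true cost, and computes the true costs with Dijkstra's algorithm on the reversed graph. The graph, the heuristic tables, and the function names are hypothetical and exist only for this illustration.

```python
import heapq


def true_costs_to_goal(graph, goal):
    """Cheapest cost from every node to goal (Dijkstra on the reversed graph)."""
    reverse = {n: [] for n in graph}
    for n, edges in graph.items():
        for succ, cost in edges:
            reverse[succ].append((n, cost))
    dist = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist[n]:
            continue                    # outdated queue entry
        for pred, cost in reverse[n]:
            if pred not in dist or d + cost < dist[pred]:
                dist[pred] = d + cost
                heapq.heappush(heap, (d + cost, pred))
    return dist


def is_admissible(h, graph, goal):
    """True if h never overestimates the true cost for any node that can reach goal."""
    actual = true_costs_to_goal(graph, goal)
    return all(h[n] <= actual[n] for n in actual)


# Hypothetical weighted graph: every node appears as a key.
graph = {
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 2)],
    "G": [],
}
optimistic = {"S": 4, "A": 3, "B": 2, "G": 0}
pessimistic = {"S": 4, "A": 5, "B": 2, "G": 0}   # overestimates at A (true cost is 4)

print(is_admissible(optimistic, graph, "G"))    # True
print(is_admissible(pessimistic, graph, "G"))   # False
```

For real problems the state space is usually too large to enumerate like this, so admissibility is normally argued from the structure of the problem instead of being checked exhaustively.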

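If we have two admissible heuristics, h1 and h2, where h2(n) ≥ h1(n) for every node n, we say that h2 dominates h1. A dominating heuristic is closer to the true cost, so A* guided by h2 generally expands no more nodes than A* guided by h1.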
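A standard illustration of two heuristics ordered this way comes from the 8-puzzle, which is used here only as an assumed example and is not discussed elsewhere in this article. The number of misplaced tiles (h1) and the sum of Manhattan distances of the tiles from their goal positions (h2) are both admissible, and h2(n) ≥ h1(n) for every state because each misplaced tile is at least one move away from its target. The state encoding and function names below are hypothetical.

```python
# States are 9-tuples read row by row; 0 marks the blank square.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def misplaced_tiles(state):
    """h1: number of tiles (blank excluded) that are not in their goal position."""
    return sum(1 for tile, target in zip(state, GOAL) if tile != 0 and tile != target)


def manhattan_distance(state):
    """h2: sum over tiles of the row plus column distance to the goal position."""
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        goal_index = GOAL.index(tile)
        total += abs(index // 3 - goal_index // 3) + abs(index % 3 - goal_index % 3)
    return total


state = (8, 1, 3, 4, 0, 2, 7, 6, 5)
print(misplaced_tiles(state))     # 5
print(manhattan_distance(state))  # 10
```

For this state h2 gives 10 and h1 gives 5, so h2 provides the tighter estimate while still being optimistic.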

Conclusion

We learned about two informed search strategies in this article. A heuristic function gives an informed search strategy more information about the problem. We looked at the algorithms for both strategies, i.e., Best-First Search and A* Search. Comparing their properties, A* with an admissible heuristic is the stronger of the two, since it returns an optimal solution, whereas Best-First Search does not guarantee one.

In the upcoming articles, we will learn about Local Search Algorithms and Constraint Satisfaction Problems for State Space Search. If you like my article, connect with me on LinkedIn.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

NEHAAL

🎓 I'm a Final-year Undergrad pursuing my BTech in Computer Science from MIT Academy Of Engineering, Pune. As a curious learner, I am also working on my honors in Data Science to expand my expertise in this domain.
