Elasticsearch: No alive nodes found in your cluster (fails repeatedly, breaks the inventory index, all products disappear, customers can't place orders!)

I know this isn’t a help forum for ElasticSearch, but as there is already a recent (locked) post about ElasticSearch failing, I’m posting in case this becomes a more common question. If we can find a solution, it will help others out.


Summary of Issue

I’m running Magento 2.4.1 and ElasticSearch crashes randomly. When it crashes:
(1) the Inventory Index and Catalog Search Indexes get broken
(2) on the frontend, multiple product category pages show “We can’t find products matching the selection” even though there should be products in there
(3) customers can’t complete their purchases and end up retrying multiple times, leaving unpaid orders on the system
(4) the search box shows no results


Restarting ElasticSearch

It looks like ElasticSearch is the cause. To reset it and get everything working, I do this (renamed usernames to match Craig’s installation instructions):

(logged in as user magento)
su craig
sudo systemctl restart elasticsearch (takes a couple of minutes)
exit (user craig)
(as user magento)
php bin/magento indexer:status
php bin/magento indexer:reset inventory
php bin/magento indexer:reindex

Then everything works again for a random amount of time, anywhere from 5 hours to 7 days, and then it falls over again.
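The manual recovery above could be wrapped in a small script. This is only a rough sketch, not a tested fix: the function name `recover_search` and the `MAGENTO_ROOT` default path are made up for illustration, and it assumes the user running it is allowed to run both `sudo systemctl` and the Magento CLI (in the setup above those are two different users, so it would need adapting).

```shell
#!/bin/sh
# Sketch: restart Elasticsearch if it is down, then rebuild the Magento indexes.
# MAGENTO_ROOT is an assumed path; set it to your own install directory.
MAGENTO_ROOT="${MAGENTO_ROOT:-/var/www/html/magento}"

# recover_search is a hypothetical helper name, not part of Magento.
recover_search() {
    # Nothing to do while the service is still running.
    if systemctl is-active --quiet elasticsearch; then
        echo "elasticsearch is running"
        return 0
    fi
    echo "elasticsearch is down, restarting"
    sudo systemctl restart elasticsearch || return 1
    # Same indexer commands as in the manual steps above.
    php "$MAGENTO_ROOT/bin/magento" indexer:reset inventory &&
        php "$MAGENTO_ROOT/bin/magento" indexer:reindex
}
```

Calling `recover_search` at the end of the script and running it from cron every few minutes would be one way to use it. Since the restart can take a couple of minutes, the reindex may need a delay or a retry before ElasticSearch accepts connections again.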


Magento system.log

I have narrowed it down: either ElasticSearch is the problem, or something else is killing ElasticSearch!
main.CRITICAL: No alive nodes found in your cluster [] []


Check of ElasticSearch logs

su craig
sudo journalctl -u elasticsearch
gives…
systemd[1]: elasticsearch.service: Main process exited, code=killed, status=9/KILL
systemd[1]: elasticsearch.service: Failed with result ‘signal’.

sudo nano /var/log/elasticsearch/elasticsearch.log
opens the log file, but there are no entries in it since the date ElasticSearch was installed (2 months ago)


Server specifications

Ionos Dedicated Cloud Server XL
4 vCore
8GB RAM
160GB SSD
Set up exactly as per Craig’s amazing Installing Magento 2.4 from scratch video (I’ve installed 2.4.1, but everything else is the same)
php.ini memory_limit set at 8GB
Magento root .htaccess memory_limit is set at 4GB


Tried alternative Search Engine

Is ElasticSearch used for more than just the Search Box?

I installed Algolia Search, which replaces ElasticSearch and runs its own search from their servers (we used this very successfully in the M1.9 store, and were going to do this anyway). Of course, I was hoping that this would mean no more ElasticSearch failures, but it’s just happened again today.


Questions!!!

  1. What would cause the errors
    systemd[1]: elasticsearch.service: Main process exited, code=killed, status=9/KILL
    systemd[1]: elasticsearch.service: Failed with result ‘signal’.
  2. How do I diagnose further
  3. Can I set it up to restart automatically after failing (and ideally reindex everything and also email me to let me know!)
  4. Should I just give up, go back to Magento 1.9 and bury my head in the sand?

Looking forward to some suggestions and discussion on this. My concern is that the MSI module in 2.4.1 is fragile and will lead to more people experiencing the same issue.


I have the same behavior here. I have a demo shop and a self-made shop.
I am quite new to Magento, so I am trying a lot of settings.
I do not have clear answers.

When I invalidated all the indexes and then, as a second step, set all the indexes to update by schedule, the problem was solved. So my question is: could it be something with the indexing? Hope it sheds some light.

My other question would be: is it wise to invest in an indexing extension, for instance Improved Asynchronous Reindexing for Magento 2 by Mirasvit?

ElasticSearch is the cause of the index breaking, so buying a plugin won’t solve the cause of the problem; it just acts as a sticking plaster. I solved the problem by enabling automatic restart on failure for ElasticSearch.

ElasticSearch and M2.4
Unlike in M2.3.5, ElasticSearch does more than just search in M2.4 and can’t be disabled, because it also serves the category product listings, as described in the link below. So with the ElasticSearch service stopped, the next time indexing runs (on save or by schedule) it fails: https://magento.stackexchange.com/questions/318745/stopping-elasticsearch-service-removes-category-products-in-2-4

Turn on ElasticSearch AutoRestart on Failure
https://stackoverflow.com/questions/52624720/how-to-auto-restart-elasticsearch-search-once-crashed-on-linux-server (other places say the same thing)

su craig
sudo systemctl edit elasticsearch.service
this creates…
/etc/systemd/system/elasticsearch.service.d/override.conf
sudo nano /etc/systemd/system/elasticsearch.service.d/override.conf
(nano shows /etc/systemd/system/elasticsearch.service.d/.#override.conf20db41b52b20238b - don’t worry this is fine)
add this to the file

[Service]
Restart=always

save (ctrl-X, Y to save changes)
check it has added it at the bottom…
sudo systemctl daemon-reload
sudo systemctl cat elasticsearch.service
sudo systemctl restart elasticsearch
exit (user craig)
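To double-check that the override is really in effect, something like the following could be used. It is only a sketch: the function names `restart_policy` and `check_auto_restart` are invented for illustration, and it relies on `systemctl show -p Restart` printing a line of the form `Restart=always`.

```shell
#!/bin/sh
# Sketch: report whether systemd will restart Elasticsearch after a crash.
# Both function names are hypothetical helpers, not systemd commands.

restart_policy() {
    # "systemctl show -p Restart <unit>" prints e.g. "Restart=always";
    # keep only the value after the "=".
    systemctl show -p Restart "$1" | cut -d= -f2
}

check_auto_restart() {
    policy=$(restart_policy elasticsearch.service)
    if [ "$policy" = "always" ]; then
        echo "auto-restart enabled"
    else
        echo "auto-restart NOT enabled (Restart=$policy)"
        return 1
    fi
}
```

Running `check_auto_restart` after the `daemon-reload` step should print "auto-restart enabled" if the override file was picked up.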


Thank you for sharing!
What a great way to go forward.

I followed the steps and I hope it will work. I’ll keep you updated!

I had a similar issue. It could be the OOM killer in the Linux kernel that kills the process.
In my case I had to increase RAM to fix the issue.

Hello Shrikant,
I am rather new to all this.
I have not heard about the OOM killer.
Can this only be solved by increasing your RAM? What was your configuration and what is the new one?

Since I adopted the solution above I have had no problems so far.
I use a server configuration suggested by Craig.

OOM Killer is “Out of Memory Killer”
I don’t know the technical details of how it works, but basically when the server detects it is low on memory, the kernel checks all its processes and kills one to free memory, usually one that is using a lot of it (which fits ElasticSearch being killed with status=9/KILL).
So increasing server memory or adding AutoRestart to ElasticSearch as described above are solutions to the problem. Of course, increasing memory is likely to avoid the situation in the first place.
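One way to check whether the OOM killer really was responsible is to search the kernel log for its messages. A minimal sketch, assuming the kernel logged the usual "Out of memory" / "Killed process" lines and that your user is allowed to read the kernel journal (otherwise prefix `journalctl` with `sudo`); the function name `oom_events` is made up for illustration.

```shell
#!/bin/sh
# Sketch: list OOM-killer entries from the kernel log.
# oom_events is a hypothetical helper name.
oom_events() {
    # "journalctl -k" shows kernel messages from the current boot only.
    # The OOM killer normally logs lines containing "Out of memory"
    # and "Killed process", including the name of the killed process.
    journalctl -k --no-pager | grep -i -E 'out of memory|killed process'
}
```

If `oom_events` prints lines naming the ElasticSearch Java process around the time of a crash, memory pressure is the likely cause. No output means the current boot's kernel log has no such entries.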

Thanks for the explanation!
Since I applied your solution everything has worked great, also with 2.4.2.

