Elastic Agent stuck in "Update Rollback" loop

Elastic Agent stuck in "Update Rollback" loop

There are some cases where the Elastic Agent gets stuck in an "Update Rollback" loop. This can happen for various reasons, such as network issues, configuration problems, or bugs in the agent itself.

Those states can be quite annoying, as the agent will keep trying to update itself but fail, leading to a continuous rollback state.

This can be resolved via the API.
Requirements are:

  • You have a user with access to the Elastic Stack API
  • API Tool like curl or Postman

Then you can start with the following steps:

  • First we need a token to authenticate with the Elastic Stack API. You can create a token using the following command:
curl -X POST --user yourusername:${cat .secret_password_file} -H 'x-elastic-product-origin:fleet' -H' content-type:application/json' "https://elasticserver:9200/_security/service/elastic/fleet-server/credential/token/fix-agents"
  • Next, you need to use the token to authenticate your requests to the Elastic Stack API. You can do this by including the token in the Authorization header of your API requests:
curl -XPOST -H "Authorization: Bearer <YOURTOKEN>" -H 'x-elastic-product-origin:fleet' -H'content-type:application/json' "https://elasticserver:9200/.fleet-agents/_update_by_query" -d '{
    "query": {
      "bool": {
        "must": [
          {
            "exists": {
              "field": "upgrade_details"
            }
          },
          {
            "bool": {
              "must_not": [
                {
                  "term": {
                    "upgrade_details": ""
                  }
                },
                {
                  "term": {
                    "upgrade_details": "null"
                  }
                }
              ]
            }
          }
        ]
      }
    },
    "script": {
      "source": "ctx._source.upgrade_details = null",
      "lang": "painless"
    }
  }'
  • After this you should destroy the token, as it is no longer needed:
curl -k -XDELETE --user yourusername:${cat .secret_password_file} -H 'x-elastic-product-origin:fleet' -H'content-type:application/json' "https://elasticsearchserver:9200/_security/service/elastic/fleet-server/credential/token/fix-agents"
  • Finally, you can check if the agents are no longer in the "Update Rollback" state by querying the agents:

I hope this helps you to resolve the "Update Rollback" loop of the Elastic Agent. If you have any questions or need further assistance, feel free to ask.