Thursday, March 29, 2018

Dealing with failed hosts: Ansible playbooks - Part 2

By default, Ansible will continue executing actions as long as there are hosts in the batch that have not yet failed. The batch size for a play is determined by the serial parameter. If serial is not set, the batch contains all the hosts matched by the hosts: field. In some situations, such as rolling updates, it may be desirable to abort the play once a certain threshold of failures has been reached. To achieve this, you can set a maximum failure percentage on a play as follows:
- hosts: webservers
  max_fail_percentage: 30
  serial: 10
In the above example, if more than 3 of the 10 servers in a batch were to fail (that is, more than 30% of the batch), the rest of the play would be aborted.
Note: The percentage set must be exceeded, not merely equaled. For example, if serial were set to 4 and you wanted the play to abort when 2 of the systems failed (exactly 50% of the batch), the percentage should be set to 49 rather than 50.
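
As a rough sketch of the case described in the note, a play with a batch size of 4 might look like the following. The tasks section and the ping task are illustrative assumptions added only to make the play complete; they are not part of the original example.

```yaml
# Illustrative sketch only: the task below is a placeholder, not from the original example.
- hosts: webservers
  serial: 4                  # process hosts in batches of 4
  max_fail_percentage: 49    # 2 failures of 4 = 50%, which exceeds 49, so the play aborts
  tasks:
    - name: Check that each host in the batch is reachable (placeholder task)
      ping:
```

With these values, a single failure in a batch (25%) stays under the threshold and the play continues, while a second failure (50%) exceeds it. Setting the value to 50 instead would require 3 failures out of 4 before the play is aborted.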
