Discussion on action retry.
@ivoire, I'd like to raise a separate issue about action retry for upcoming MRs:
#1176 could fix the nested retry timeout issue, but during local testing we found another problem: if the device really can't boot, the action first retries the internal "login" action, and then the outer "uboot" action retries as well.
So if we define failure_retry=4 in the job, it results in 4*4 action runs because the retries are nested, which costs a lot of time.
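To make the multiplication concrete, here is a minimal toy model of two nested retry wrappers. The classes `AlwaysFails` and `Retry` and the function `count_nested_attempts` are hypothetical names for illustration only and are not LAVA's actual RetryAction implementation; the sketch assumes the innermost action fails every time.

```python
# Toy model only: these classes are hypothetical and are not LAVA's RetryAction.
class AlwaysFails:
    """Leaf action that counts how often it is run and never succeeds."""

    def __init__(self):
        self.runs = 0

    def run(self):
        self.runs += 1
        raise RuntimeError("login prompt not found")


class Retry:
    """Re-run a child action up to failure_retry times before giving up."""

    def __init__(self, child, failure_retry):
        self.child = child
        self.failure_retry = failure_retry

    def run(self):
        last_error = None
        for _ in range(self.failure_retry):
            try:
                return self.child.run()
            except RuntimeError as exc:
                last_error = exc
        raise last_error


def count_nested_attempts(failure_retry):
    """Return how many times the leaf runs when two retries are nested."""
    login = AlwaysFails()
    inner = Retry(login, failure_retry)  # stands in for the internal "login" retry
    outer = Retry(inner, failure_retry)  # stands in for the outer "uboot" retry
    try:
        outer.run()
    except RuntimeError:
        pass  # the job fails in the end; we only care about the attempt count
    return login.runs


print(count_nested_attempts(4))  # 16
```

Each outer attempt exhausts all inner attempts before failing, so the leaf runs failure_retry * failure_retry times, and the total wall-clock time grows the same way if every attempt waits for its timeout.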
As you know, in MR1147 I removed all nested actions. But in MR903 there is still an internal login action which needs to be retried, and for that we use RetryAction.
This led to today's MR1776, but I'd like your advice: do you think we need MR1776, and should we allow nested actions in LAVA? As mentioned above, nesting leads to the NxN retry and timeout behaviour.
Do we need to somehow move the 2 RetryAction instances to the same tree level of the pipeline, to get rid of the multiplied nested retries?
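As a rough illustration of what "same tree level" could mean for the attempt count, the sketch below runs two retried steps as siblings instead of nesting one inside the other. The helper `retry` and the function `count_sibling_attempts` are hypothetical and do not reflect LAVA's pipeline API; it assumes the bootloader stage succeeds and the login stage always fails.

```python
# Toy model only: hypothetical names, not LAVA's pipeline API.
def retry(step, failure_retry):
    """Call step() up to failure_retry times; return True on first success."""
    for _ in range(failure_retry):
        if step():
            return True
    return False


def count_sibling_attempts(failure_retry):
    """Run "uboot" and "login" as sibling retried steps and count their calls."""
    calls = {"uboot": 0, "login": 0}

    def uboot():
        calls["uboot"] += 1
        return True  # assumption: the bootloader stage itself succeeds

    def login():
        calls["login"] += 1
        return False  # assumption: the login prompt never appears

    # Both retries sit at the same level: a login failure does not re-run uboot.
    ok = retry(uboot, failure_retry) and retry(login, failure_retry)
    return ok, calls


print(count_sibling_attempts(4))  # (False, {'uboot': 1, 'login': 4})
```

In this layout the worst case is bounded by the sum of the two retry counts rather than their product. The trade-off is that a login retry no longer gets a fresh boot, which may or may not be acceptable depending on why the login failed.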
I'd really appreciate your suggestions here, as this frequently causes trouble for us, especially for new NPI, which may have lots of small exceptions. Thanks.