lava / lava / Issues / #551

Created Jun 27, 2022 by Larry Shen (@atline, Contributor)

Discussion for action retry.

@ivoire, I'd like to raise a separate issue about action retry, related to the following MRs:

  • !903 (merged)
  • !1147 (merged)
  • !1776

#1176 could fix the nested retry timeout issue, but during local testing we found another problem: if the device really can't boot up, the retry happens first for the inner "login" action and then for the outer "uboot" action.

So if we define failure_retry=4 in the job, this results in 4*4 attempts because the retries are nested, which costs a lot of time.
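To make the multiplication concrete, here is a minimal sketch of the nesting described above. It is not LAVA's actual RetryAction code: `JobError`, `retry` and `count_nested_attempts` are hypothetical names, and the only assumption is that both the outer "uboot" step and the inner "login" step are wrapped in a retry loop with the same failure_retry value.

```python
class JobError(Exception):
    """Hypothetical stand-in for an action failure."""


def retry(action, failure_retry):
    """Call action up to failure_retry times; re-raise the last failure."""
    if failure_retry < 1:
        raise ValueError("failure_retry must be at least 1")
    last_error = None
    for _ in range(failure_retry):
        try:
            return action()
        except JobError as exc:
            last_error = exc
    raise last_error


def count_nested_attempts(failure_retry):
    """Count attempts when a retried login sits inside a retried uboot."""
    attempts = {"uboot": 0, "login": 0}

    def login():
        attempts["login"] += 1
        # Model a device that never reaches the login prompt.
        raise JobError("login prompt never appeared")

    def uboot():
        attempts["uboot"] += 1
        # The inner retry runs to exhaustion on every outer attempt.
        retry(login, failure_retry)

    try:
        retry(uboot, failure_retry)
    except JobError:
        pass  # the job fails in the end; we only care about the counts
    return attempts
```

With failure_retry=4, the inner login is attempted 16 times while the outer uboot is attempted 4 times, which is the 4*4 cost mentioned above.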

As you know, in MR1147 I removed all nested actions. But in MR903 there is still an internal login action that needs to be retried, and we use RetryAction for it.

This led to today's MR1776, but I'd like to ask for your advice: do you think we need MR1776, and should we allow nested actions in LAVA? As mentioned above, nesting results in NxN retries and the corresponding timeout cost.

Do we need to somehow move the two RetryActions to the same tree level of the pipeline to avoid the nested retry problem?

I'd appreciate your suggestion here, as this frequently causes trouble for us, especially for new NPI, which may have lots of small exceptions. Thanks.
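For comparison, a rough sketch of the "same tree level" idea, again using hypothetical helpers (`JobError`, `retry`, `count_flat_attempts`) rather than LAVA's real pipeline classes. It assumes the two retried steps run one after the other as siblings, and that retrying login on its own is meaningful without repeating the boot, which is exactly the design question being asked here.

```python
class JobError(Exception):
    """Hypothetical stand-in for an action failure."""


def retry(action, failure_retry):
    """Call action up to failure_retry times; re-raise the last failure."""
    if failure_retry < 1:
        raise ValueError("failure_retry must be at least 1")
    last_error = None
    for _ in range(failure_retry):
        try:
            return action()
        except JobError as exc:
            last_error = exc
    raise last_error


def count_flat_attempts(failure_retry):
    """Count attempts when uboot and login are retried as siblings."""
    attempts = {"uboot": 0, "login": 0}

    def uboot():
        # In this model the bootloader step itself succeeds.
        attempts["uboot"] += 1

    def login():
        attempts["login"] += 1
        # Model a device that never reaches the login prompt.
        raise JobError("login prompt never appeared")

    try:
        # Siblings: each step has its own retry budget, run in sequence.
        for action in (uboot, login):
            retry(action, failure_retry)
    except JobError:
        pass  # the job fails in the end; we only care about the counts
    return attempts
```

In this model the worst case is bounded by N + N attempts instead of N * N: with failure_retry=4, login is attempted 4 times and uboot once.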
