• Circuit breakers

    There are several reasons why you don’t want to retry too many times over too long a period:

    • Too many users persistently retrying failed requests might degrade other users’ experience. If millions of people are all making repeated retry requests, you could tie up IIS dispatch queues and prevent your app from servicing requests that it otherwise could handle successfully.

    • If everyone is retrying an operation because of a service failure, so many requests could be queued up that the service gets flooded when it starts to recover.

    • If the error is the result of throttling and there’s a window of time the service uses for throttling, continued retries could move that window out and cause the throttling to continue.

    • You might have a user waiting for a webpage to render. Making people wait too long might be more annoying than relatively quickly advising them to try again later.

    Exponential back-off addresses some of these issues by limiting the frequency of retries that a service can get from your application. But you also need circuit breakers: at a certain retry threshold your app stops retrying and takes some other action, such as one of the following:

    • Custom fallback. If you can’t get a stock price from Reuters, maybe you can get it from Bloomberg; or if you can’t get data from the database, maybe you can get it from cache.

    • Fail silently. If what you need from a service isn’t all-or-nothing for your app, just return null when you can’t get the data. For example, if you're displaying a Fix It task and the Blob service isn't responding, you could display the task details without the image.

    • Fail fast. Error out the user to avoid flooding the service with retry requests that could cause service disruption for other users or extend a throttling window. You can display a friendly “try again later” message.
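
    The three actions above can be sketched as a small circuit breaker. This is a minimal illustration in Python, not code from the book: the class name, the failure_threshold and reset_timeout parameters, and the CircuitOpenError exception are all assumptions made for the example. After a set number of consecutive failures the breaker "opens" and stops calling the service until a timeout has passed, and in the meantime it either runs a caller-supplied fallback or fails fast.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and no fallback was supplied (fail fast)."""


class CircuitBreaker:
    """Hypothetical breaker: opens after `failure_threshold` consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed value, tune per service
        self.reset_timeout = reset_timeout          # seconds to stay open
        self.clock = clock                          # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, operation, fallback=None):
        # While the circuit is open, do not touch the service at all.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                if fallback is not None:
                    return fallback()
                raise CircuitOpenError("service unavailable, try again later")
            # Timeout elapsed: fall through and allow one trial call.

        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()       # open (or re-open) the circuit
            if fallback is not None:
                return fallback()                   # custom fallback or fail silently
            raise                                   # no fallback: surface the error

        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

    With this sketch, a custom fallback is a function that reads from a second source such as a cache, failing silently is a fallback that returns None, and failing fast is calling with no fallback and turning CircuitOpenError into a friendly "try again later" message.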

    There is no one-size-fits-all retry policy. You can retry more times and wait longer in an asynchronous background worker process than you would in a synchronous web app where a user is waiting for a response. You can wait longer between retries for a relational database service than you would for a cache service. Here are some sample recommended retry policies to give you an idea of how the numbers might vary. ("Fast First" means no delay before the first retry.)
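
    To make the "Fast First" idea and the difference between contexts concrete, here is a minimal Python sketch that computes the wait before each retry. The function name, the policy names, and every number in it are illustrative assumptions, not the book's recommended values.

```python
def retry_delays(max_retries, base_delay, fast_first=False, max_delay=None):
    """Return the wait in seconds before each retry, doubling each time.

    With fast_first=True the first retry happens immediately and the
    exponential back-off starts with the second retry.
    """
    delays = []
    for attempt in range(max_retries):
        if fast_first:
            delay = 0.0 if attempt == 0 else base_delay * 2 ** (attempt - 1)
        else:
            delay = base_delay * 2 ** attempt
        if max_delay is not None:
            delay = min(delay, max_delay)  # cap the wait so it cannot grow unbounded
        delays.append(delay)
    return delays


# Hypothetical policies: the numbers are placeholders chosen for illustration.
POLICIES = {
    # A user is waiting, and a cache should answer quickly: few, short retries.
    "web_request_to_cache": dict(max_retries=3, base_delay=0.1, fast_first=True),
    # Nobody is waiting, and a database may need longer to recover.
    "background_worker_to_database": dict(max_retries=6, base_delay=2.0, max_delay=30.0),
}
```

    Under these assumed settings, retry_delays(**POLICIES["web_request_to_cache"]) gives waits of 0, 0.1, and 0.2 seconds, while the background worker policy gives 2, 4, 8, 16, 30, and 30 seconds.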

    Source of Information : Building Cloud Apps With Microsoft Azure

