Summary: A failover group defines failover between spaces in the cluster.

Overview

A group can define a failover policy, in which case clients of spaces belonging to the group receive and use a clustered proxy instead of a regular proxy. The failover policy of the group determines the failover behavior for the clustered proxy of any space in the group.
A group that defines a failover policy is called a failover group. A failover group can also be a replication group and a load-balancing group.

A space cannot reside in different load-balancing and failover groups. In other words, the only way to apply both load balancing and failover to a space is to define these policies for one group, which the space belongs to.

If an operation performed on a space fails because the space server fails, the clustered proxy tries to locate another space in the failover group and redirects the operation to it, subject to the failover policy defined. Operations performed under a transaction are an exception: the clustered proxy aborts the transaction and throws an exception to the application. The application should then start a new transaction, perform the operations again and re-commit.
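The retry pattern for transactional operations can be sketched as follows. This is a minimal illustration in Python, not GigaSpaces API code: SpaceFailedError, run_with_retry, the begin and work callables and the max_attempts limit are all hypothetical names introduced here.

```python
class SpaceFailedError(Exception):
    """Hypothetical stand-in for the exception the clustered proxy throws."""


def run_with_retry(begin, work, max_attempts=3):
    """Run work(txn) and commit, restarting in a new transaction on failure.

    begin() must return a fresh transaction object with a commit() method.
    Returns the number of attempts used. max_attempts is an assumption of
    this sketch, not a setting described in the text.
    """
    for attempt in range(1, max_attempts + 1):
        txn = begin()          # always start a new transaction
        try:
            work(txn)          # perform the operations again
            txn.commit()       # re-commit
            return attempt
        except SpaceFailedError:
            if attempt == max_attempts:
                raise          # give up and surface the failure
```

The aborted transaction is never reused; every attempt begins with a new one, as the text requires.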
To enable the Fail Over panel, check the Fail Over box in the Cluster Options panel.

For more details, see the Cluster Configuration section.

The Policy tab of the Fail Over panel, shown above, allows you to define the circumstances under which failover within the group should take place.
Failover policies are defined per operation: Write, Read, Take and Notify. In the policy table, select the Enable check-box next to an operation's name to turn on failover for that operation. For example, if you select the check-box for Write, all write operations made on a failed space will be transparently routed to a live space in the group. If you do not select this check-box, write operations made on a failed space will not be routed.
This holds only if the Default check-box is not selected. The Default row in the policy table has a special function: if you select it, operations you did not explicitly enable assume the failover policy defined under Default (instead of having no failover policy at all). You define the Default policy in the same way as a failover policy for a specific operation, as explained below.
The Policy Type column is a drop-down menu that allows you to specify how failover should occur for a specific operation:

  • Fail to Available - specifies that if a space is down, the operation should be routed to a live space in the group, according to the load-balancing policy.
  • Fail to Backup - specifies that if a space is down, the operation will only be routed to one of the spaces in the internal tab Backup Members below. Note that there is a separate Backup Members tab for each operation - Write, Read, Take and Notify (and Default, which applies to operations you did not explicitly define).
  • Immediately Fail to Alternate - specifies that if a space is down, the operation should be routed to an available space in one of the alternate groups, defined in the Alternate Groups tab, described later.
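How the per-operation rows and the Default row combine can be sketched as a small lookup. This is an illustrative model only: the PolicyType enum, the dictionary layout and the effective_policy function are assumptions made for this example and do not reflect the product's configuration format.

```python
from enum import Enum


class PolicyType(Enum):
    """The three policy types offered in the Policy Type drop-down."""
    FAIL_TO_AVAILABLE = "Fail to Available"
    FAIL_TO_BACKUP = "Fail to Backup"
    FAIL_TO_ALTERNATE = "Immediately Fail to Alternate"


def effective_policy(policies, operation):
    """Return the policy type that applies to an operation, or None.

    policies holds only the rows whose Enable check-box is selected,
    keyed by "write", "read", "take", "notify" or "default" (hypothetical
    keys). None means the operation is not routed on failure.
    """
    if operation in policies:
        return policies[operation]      # explicitly enabled row wins
    return policies.get("default")      # otherwise fall back to Default
```

For example, with only Write and Default enabled, a read operation takes the Default policy; with only Write enabled, a read operation gets no failover at all.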

The Backup Members tab, shown above, allows you to define a set of spaces that should be used for failover for a specific operation (or Default). It is only enabled if you select Fail to Backup as the failover policy of the operation.
The group shown in the Edit Cluster Group dialog above has three spaces: sp1, sp2 and sp3. The Backup Members tab for the write operation has sp2 defined as a backup member of sp1. This means that if sp1 fails, operations are routed to sp2 and not to sp3.
This partial failover configuration is useful if you want to guarantee high performance for spaces with a high workload.
The opposite configuration is also possible - you can designate spaces that will be dedicated to failover. This will guarantee performance in case of a space failure. This is done using the Backup Only tab, which is also defined per operation, as an internal tab of the operation tab.
To specify and prioritize backup members for a specific operation:

  1. In the Edit Cluster Group dialog, click the Fail Over tab, then the internal tab Policy Description.
  2. Make sure the operation you want to work on is enabled, and that its policy is set to Fail to Backup. Click on its tab within Policy Description.
  3. Click the Backup Members tab, and press Add. A new row is created in the Members table.
  4. Click the Member Name column to open the drop-down menu, and select the name of the space for which you want to define backup members. If this space fails, connections will be routed to the backup members.
  5. Click the Backup Members column. The Define backup members dialog opens, allowing you to select backup members from the spaces in the group. Do not select all the spaces in the group, because this is the same as not defining backup members at all.
  6. If you add more than one space, you can prioritize backup members by selecting them and clicking the up/down arrows at the right of the dialog. The higher a space appears in the list of backup members, the higher its priority. If the focal space fails, and more than one backup member is available, the operation is routed to the member with the highest priority.
  7. When you are done selecting and prioritizing backup members, press OK in the Define backup members dialog.
  8. Repeat steps 3-7 to define backup members for other spaces in the group. You do not have to define backup members for all spaces - keep in mind that the default action is failover to all spaces.
  9. Click the Backup Only tab. This tab allows you to specify that a space that serves as a backup member is dedicated to this function. "Backup only" spaces are not used for anything other than routed failover operations.
  10. The spaces shown here are all the backup members defined previously, but excluding the spaces for which backup members were defined. Select the Enabled check-box next to the name of a space to specify that this space should only be used for routed failover operations.
    Before setting a space as "backup only," make sure that it is not needed in the cluster's normal operations. Once you set a space to "backup only," the cluster proxy will stop routing operations to it (until another space fails).
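The effect of the settings made in the steps above can be sketched as follows. This is a simplified model, not the clustered proxy's implementation: pick_backup, normal_targets and their parameters are hypothetical names, and the ordering used when a space has no backup members defined (group order) is an assumption, since the text only says that failover then goes to all spaces.

```python
def pick_backup(failed_space, backup_members, group, live_spaces):
    """Return the space that takes over for failed_space, or None.

    backup_members maps a space name to its backup members, listed from
    highest to lowest priority.
    """
    # Spaces without defined backup members fail over to all other spaces
    # (assumption: tried in group order).
    candidates = backup_members.get(
        failed_space, [s for s in group if s != failed_space]
    )
    for candidate in candidates:
        if candidate in live_spaces:
            return candidate            # highest-priority live member
    return None


def normal_targets(group, backup_only):
    """Spaces that receive operations while no space has failed."""
    return [s for s in group if s not in backup_only]
```

With the group sp1, sp2, sp3 and sp2 defined as the backup member of sp1, pick_backup("sp1", {"sp1": ["sp2"]}, ["sp1", "sp2", "sp3"], {"sp2", "sp3"}) returns "sp2" and never "sp3". Marking sp2 as "backup only" removes it from normal_targets.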

The Alternate Groups tab within the Fail Over tab, shown above, allows you to define one or more alternate groups. These groups serve as backups for the focal group (the group you are defining in the Edit Cluster Group dialog). In other words, if all spaces in the focal group fail, the cluster proxy routes the operation to an available space in one of the alternate groups.
Select a group and use the right arrow to move it to the Selected Items box. This specifies that it should back up the group being edited.
If you add more than one alternate group, you can prioritize them using the up/down arrows at the right of the dialog. The higher a group appears in the list, the higher its priority. If all spaces in the focal group fail, and more than one alternate group is available, the operation will be routed to an available space in the group that has the highest priority.
Find Timeout Msec is the amount of time the cluster proxy waits, after receiving no reply from an alternate group, before deciding it is unavailable and trying another one (or giving up, if there are no more alternate groups).
The alternate groups defined here are relevant not only when all of the focal group's spaces fail. They are also used if you specify a policy type of Immediately Fail to Alternate for a certain operation. Use the internal tab Policy Description to set policy types for specific operations.
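The interaction between alternate group priority and Find Timeout Msec can be sketched like this. It is an illustrative model only: route_to_alternate and the find_space callable are hypothetical, and find_space is assumed to return a space name, or None when the group does not reply within the timeout.

```python
def route_to_alternate(alternate_groups, find_space, find_timeout_msec):
    """Try alternate groups in priority order.

    alternate_groups is ordered from highest to lowest priority.
    find_space(group, timeout_seconds) is a hypothetical lookup that
    returns an available space in the group, or None on timeout.
    Returns (group, space), or None if no alternate group replied.
    """
    timeout_seconds = find_timeout_msec / 1000.0
    for group in alternate_groups:
        space = find_space(group, timeout_seconds)
        if space is not None:
            return group, space         # first group that replies wins
    return None                         # no more alternate groups: give up
```

Because each unavailable group costs up to one full timeout, the worst-case delay before giving up in this model grows with the number of alternate groups listed.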
To apply cluster group definitions, click Create to return to the Create Groups window.
For more details, see the Cluster Configuration section.
