Summary: This section describes GigaSpaces IMDG Load-balancing and data-partitioning facilities.
Data Load-balancing

Load-balancing is essential in any truly scalable architecture, as it enables scaling beyond the physical resources of a single server machine. In GigaSpaces, load-balancing is the mechanism used by the clustered proxy to distribute space operations among the different cluster members, each of which can run on a different physical machine. If a space belongs to a load-balancing group, the clustered proxy for this space contains logical references to all space members in the load-balancing group. The references are "logical" in the sense that no active connection to a space member is opened until it is needed.
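To make the clustered proxy concrete, here is a minimal sketch of obtaining one with the OpenSpaces API; the space name and lookup URL are illustrative assumptions, not taken from this page:

import org.openspaces.core.GigaSpace;
import org.openspaces.core.GigaSpaceConfigurer;
import org.openspaces.core.space.UrlSpaceConfigurer;

public class ClusteredProxyExample {
    public static void main(String[] args) {
        // Locate all members of a clustered space named "mySpace" (assumed name).
        // The proxy holds logical references to the members; physical connections
        // are opened lazily, only when an operation needs them.
        UrlSpaceConfigurer urlConfigurer = new UrlSpaceConfigurer("jini://*/*/mySpace");
        GigaSpace gigaSpace = new GigaSpaceConfigurer(urlConfigurer.space()).gigaSpace();

        // All space operations now go through the load-balancing clustered proxy.
        System.out.println("Obtained clustered proxy: " + gigaSpace);
    }
}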
Partitioning Types

GigaSpaces ships with a number of built-in load-balancing policies. These range from relatively static policies, where each proxy "attaches" to a specific server and directs all operations to it, to extremely dynamic policies, where the target of an operation is chosen based on the operation data and the relative strength of the server machine. In each load-balancing group, a load-balancing policy is specified per basic space operation: write, read, take and notify. This allows you to direct different kinds of operations to different spaces, ensuring correct semantics for the application. The following table describes the built-in load-balancing types.
Hash-Based Load-Balancing

This is the default mode, and is applicable for most applications. When the hash-based load-balancing policy is used, the client proxy spreads newly written space objects across the relevant cluster space nodes. The space to which an operation is routed is determined by the hashcode of the space object's routing field (also known as the routing index). This value, together with the number of cluster partitions, is used to calculate the target space for the operation:

Target partition space ID = safeABS(routing field hashcode) % (# of partitions)

int safeABS(int value) {
    return value == Integer.MIN_VALUE ? Integer.MAX_VALUE : Math.abs(value);
}

The routing field's type must implement the hashCode method, and the routing field is used both when performing write and read operations. This approach assumes that routing field hashcode values are evenly distributed, so that the data is spread evenly across all the cluster partitions.
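As a quick illustration of this formula (a standalone sketch in plain Java, not a product API; the class name and partition count are chosen arbitrarily), note that for Integer routing fields the hashcode is the value itself:

public class RoutingDemo {
    // Same safeABS logic as above: guards against Integer.MIN_VALUE,
    // whose absolute value overflows int.
    static int safeABS(int value) {
        return value == Integer.MIN_VALUE ? Integer.MAX_VALUE : Math.abs(value);
    }

    public static void main(String[] args) {
        int partitions = 3;
        // Integer.hashCode() returns the value itself, so routing reduces
        // to value % partitions for integer routing fields.
        for (int routingValue : new int[] {10, 11, 12}) {
            int target = safeABS(Integer.valueOf(routingValue).hashCode()) % partitions;
            System.out.println("Routing value " + routingValue + " -> partition " + target);
        }
    }
}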
Example

A cluster is defined with 3 partitions, where each partition has one primary and one backup space. The cluster is configured to use the hash-based load-balancing policy. The application writes the Account space object into the clustered space. The accountID field of the Account class is defined as the routing field using the @SpaceRouting annotation. The Account class implementation:

@SpaceClass
public class Account {
    Integer accountID;
    String accountName;

    @SpaceRouting
    public Integer getAccountID() {
        return accountID;
    }

    public String getAccountName() {
        return accountName;
    }

    // setter methods ...
}

The accountID field value is used by the space client proxy, together with the number of partitions, in the following manner to determine the target space for write, read, or take operations:

Target Space number = (accountID hashCode value) modulo (Partitions Amount)

If we write 30 Account space objects with different accountID values into the cluster, the space objects are routed across the 3 partitions according to this calculation.
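Here is a minimal usage sketch for writing and reading such objects, assuming a clustered GigaSpace proxy (as in the earlier sketch) and the setter methods elided above:

// Write 30 Account objects; the proxy hashes the accountID routing field
// to pick the target partition for each write.
for (int i = 0; i < 30; i++) {
    Account account = new Account();
    account.setAccountID(i);                // routing field (assumed setter)
    account.setAccountName("Account " + i); // assumed setter
    gigaSpace.write(account);
}

// A template with the routing field set lets the proxy route the read
// directly to the owning partition rather than querying the whole cluster.
Account template = new Account();
template.setAccountID(7);
Account result = gigaSpace.read(template);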
Hash-Based Load-Balancing Calculator

The following Hash-Based Load-Balancing Calculator calculates the target space of a given routing value:

import java.util.Hashtable;
import java.util.Random;

public class LoadBalancingCalc {
    public static void main(String[] args) {
        int partitions = 10;

        // Generate 1000 random routing field values as Strings.
        // String values[] = {"A", "B", "C", "AAAAAA", "BBB", "CCCC", "QQQQ", "C", "Y"};
        Random rand = new Random(1000);
        int maxObject = 1000;
        String[] values = new String[maxObject];
        for (int i = 0; i < maxObject; i++) {
            values[i] = String.valueOf(rand.nextInt(maxObject));
        }

        // Count how many objects land in each partition.
        Hashtable<Integer, Integer> dist = new Hashtable<Integer, Integer>();
        for (int i = 0; i < values.length; i++) {
            int hc = values[i].hashCode();
            int targetPartitionID = safeABS(hc) % partitions;
            Integer distValue = dist.get(targetPartitionID);
            if (distValue == null) {
                distValue = 0;
            }
            dist.put(targetPartitionID, distValue + 1);
        }

        System.out.println("Total amount of objects:" + maxObject);
        System.out.println("Total amount of partitions:" + partitions);
        for (int i = 0; i < partitions; i++) {
            System.out.println("Partition " + i + " has " + dist.get(i) + " objects");
        }
        System.out.println();
        System.out.println("Routing field values:");
        for (int i = 0; i < maxObject; i++) {
            System.out.print(values[i] + ",");
            if ((i % 80 == 0) && (i > 80)) {
                System.out.println();
            }
        }
    }

    static int safeABS(int value) {
        return value == Integer.MIN_VALUE ? Integer.MAX_VALUE : Math.abs(value);
    }
}

Here is an example output:

Total amount of objects:1000
Total amount of partitions:10
Partition 0 has 107 objects
Partition 1 has 104 objects
Partition 2 has 104 objects
Partition 3 has 99 objects
Partition 4 has 104 objects
Partition 5 has 92 objects
Partition 6 has 103 objects
Partition 7 has 103 objects
Partition 8 has 90 objects
Partition 9 has 94 objects

Routing field values:
487,935,676,124,....

Load-balancing with Replication/Failover

Load-balancing can be combined with replication. Depending on the application needs and the load-balancing policy used, this may or may not be necessary. For example, if users tend to perform read operations, and the policy used is round-robin, replication will be necessary to ensure the requested space object exists on the target space. If, by contrast, the policy used is distribute-by-class, it may not be necessary to replicate, because the class requested should have been written to the same space. Load-balancing can also be combined with failover, to achieve both scalability and fault tolerance.

Broadcast

There are three Broadcast options:
Batch Operation Execution Mode

The following table specifies when the different batch operations are executed in a parallel manner and when in a serial manner, when the space is running in partitioned mode:
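For example, whether the proxy can route a batch operation to a single partition depends on whether the template carries a routing value; a minimal sketch, assuming the GigaSpace proxy and the Account class (with its setters) from the example above:

// Template with the routing field set: the proxy can compute the target
// partition, so the batch read goes to a single partition.
Account routedTemplate = new Account();
routedTemplate.setAccountID(7);
Account[] fromOnePartition = gigaSpace.readMultiple(routedTemplate, Integer.MAX_VALUE);

// Template with a null routing field: the proxy cannot pick a partition,
// so the operation is broadcast to all partitions.
Account[] fromAllPartitions = gigaSpace.readMultiple(new Account(), Integer.MAX_VALUE);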
Parallel Operation Thread Pool

The client performs parallel operations using a dedicated thread pool managed at the proxy. You may tune this pool size via the following settings in the cluster config:

<proxy-broadcast-threadpool-min-size>4</proxy-broadcast-threadpool-min-size>
<proxy-broadcast-threadpool-max-size>64</proxy-broadcast-threadpool-max-size>

Considerations