Summary: This tutorial shows how an application interacts with a GigaSpaces Data Grid, clustered in either a replicated, partitioned, master-local, or local-view topology. The application either actively reads data or registers for notifications.
This tutorial contains a client application that runs on GigaSpaces 6.6. You must have GigaSpaces 6.6 installed before proceeding. You can download the product here. Both Java and .NET implementations are provided:
This icon specifies instructions relevant only for Java. This icon specifies instructions relevant only for .NET.
Java
Code that differs in Java and .NET is shown in two tabs, like this.
Java code
.NET
Code that differs in Java and .NET is shown in two tabs, like this.
Different applications might have different caching requirements. Some applications require on-demand loading from a remote cache, due to limited memory; others use the cache for read-mostly purposes; transactional applications need a cache that handles both write and read operations and maintains consistency.
In order to address these different requirements, GigaSpaces provides an In-Memory Data Grid that is policy-driven. Most of the policies do not affect the actual application code, but rather affect the way each Data Grid instance interacts with other instances. The policies allow the Data Grid to be configured in almost any topology; most common topologies are predefined in the GigaSpaces product and do not require editing policies.
In this tutorial, you will use GigaSpaces to implement a simple application that writes and retrieves user accounts from the GigaSpaces In-Memory Data Grid, clustered in the most common topologies - replicated, partitioned, master-local and local-view. The application will either actively read data or ask to be notified when data is written to or modified in the Data Grid.
GigaSpaces Data Grid - Basic Terms
Data Grid instance - an independent data storage unit, also called a cache. The Data Grid is comprised of all the Data Grid instances running on the network.
Space - a distributed, shared, memory-based repository for objects. A space runs in a space container - this is usually transparent to the developer. In GigaSpaces each Data Grid instance is implemented as a space, and the Data Grid is implemented as a cluster of spaces organized in one of several predefined topologies.
Grid Service Container - a generic container that can run one or more space instances (together with their space containers) and other services. This container is launched on each machine that participates in the Data Grid, and hosts the Data Grid instances.
Replication - a relationship in which data is copied between two or more Data Grid instances, with the aim of having the same data in some or all of them.
Synchronous replication - replication in which applications using the Data Grid are blocked until their changes are propagated to all Data Grid instances. This guarantees that everyone sees the same data, but reduces performance.
Asynchronous replication - replication in which changes are propagated to Data Grid instances in the background; applications do not have to wait for their changes to be propagated. Asynchronous replication does not negatively affect performance, but on the other hand, changes are not instantly available to everyone.
Partitioning - new data or operations on data are routed to one of several Data Grid instances (partitions). Each Data Grid instance holds a subset of the data, with no overlap. Partitioning is done according to an index field in the data - operations are routed to partitions based on the value of this field.
Topology - a specific configuration of Data Grid instances. For example, a replicated topology is a configuration in which some or all Data Grid instances replicate data between them. In GigaSpaces, Data Grid topologies are defined by cluster policies (explained in the following section).
Reading - one way to retrieve data from the Data Grid, which will be used in this tutorial, is to call the space read operation, supplying a read template object which specifies what needs to be read.
Notifications - GigaSpaces allows applications to be notified when changes are made to objects in the Data Grid. Applications register in advance to be notified about specific events. When these events occur, a notification is triggered on the application, which delivers the actual data that triggered the event.
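To make the idea of a read template more concrete, here is a minimal sketch of template matching against an in-memory list. It is not the GigaSpaces API: the TemplateReadSketch class, its write and read methods, and the sample user names are all made up for illustration, and the sketch assumes that a null field in the template acts as a wildcard.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory "space" illustrating template-based reads; not the GigaSpaces API. */
final class TemplateReadSketch {

    /** Simplified account entry; null fields in a template act as wildcards (an assumption). */
    static final class Account {
        final String userName;
        final Integer accountID;

        Account(String userName, Integer accountID) {
            this.userName = userName;
            this.accountID = accountID;
        }
    }

    private final List<Account> entries = new ArrayList<>();

    void write(Account account) {
        entries.add(account);
    }

    /** A stored entry matches when every non-null template field is equal. */
    static boolean matches(Account template, Account candidate) {
        return (template.userName == null || template.userName.equals(candidate.userName))
            && (template.accountID == null || template.accountID.equals(candidate.accountID));
    }

    /** Returns the first matching entry, or null when nothing matches. */
    Account read(Account template) {
        for (Account candidate : entries) {
            if (matches(template, candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TemplateReadSketch space = new TemplateReadSketch();
        space.write(new Account("alice", 1)); // hypothetical sample data
        space.write(new Account("bob", 2));
        Account found = space.read(new Account(null, 2)); // any user, accountID 2
        System.out.println(found.userName);
    }
}
```

A template with all fields set to null matches every entry of the class, which is how the readers in this tutorial ask for all accounts.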
GigaSpaces Clustering Concepts
In GigaSpaces, a cluster is a grouping of several spaces running in one or more containers. For an application trying to access data, the cluster appears as one space, but in fact consists of several spaces which may be distributed across several physical machines. The spaces in the cluster are also called cluster members.
A cluster group is a logical collection of cluster members, which defines how these members interact. The only way to define relationships between clustered spaces in GigaSpaces is to add them to a group and define policies. A cluster can contain several, possibly overlapping groups, each of which defines some relations between some cluster members - this provides much flexibility in cluster configuration.
A GigaSpaces cluster group can have one or more of the following policies:
Replication Policy - defines replication between two or more spaces in the cluster, and replication options such as synchronous/asynchronous and replication direction.
Load Balancing Policy - because user requests are submitted to the entire cluster, there is a need to distribute the requests between cluster members. The load balancing policy defines an algorithm according to which requests are routed to different members. For example, in a replicated topology, requests are divided evenly between cluster members; in a partitioned topology they are routed according to the partitioning key.
Failover Policy - defines what happens when a cluster member fails. Operations on the cluster member can be transparently routed to another member in the group, or to another cluster group.
A cluster schema is an XML file which defines a cluster - the cluster name, which spaces are included in the cluster, which groups are defined on them, and which policies are defined for each group. GigaSpaces provides predefined cluster schemas for all common cluster topologies. Each topology is a certain combination of replication, load balancing and failover policies.
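The interplay between a load balancing policy and a failover policy can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a round-robin algorithm to divide requests evenly (the actual algorithm is defined in the cluster schema), and the ClusterPolicySketch and Member names are hypothetical, not part of GigaSpaces.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy cluster group showing round-robin load balancing with failover; not the GigaSpaces API. */
final class ClusterPolicySketch {

    /** A cluster member that can be marked as failed. */
    static final class Member {
        final String name;
        boolean alive = true;

        Member(String name) {
            this.name = name;
        }
    }

    private final List<Member> members = new ArrayList<>();
    private int next = 0; // index of the member to try first on the next request

    void add(Member member) {
        members.add(member);
    }

    /**
     * Load balancing: members are picked in turn (round robin).
     * Failover: members that are down are skipped and the next one is tried.
     */
    Member route() {
        for (int attempts = 0; attempts < members.size(); attempts++) {
            Member candidate = members.get(next);
            next = (next + 1) % members.size();
            if (candidate.alive) {
                return candidate;
            }
        }
        throw new IllegalStateException("no live cluster members");
    }

    public static void main(String[] args) {
        ClusterPolicySketch group = new ClusterPolicySketch();
        Member first = new Member("space-1");  // hypothetical member names
        Member second = new Member("space-2");
        group.add(first);
        group.add(second);
        System.out.println(group.route().name);
        System.out.println(group.route().name);
        first.alive = false;                   // simulate a failed member
        System.out.println(group.route().name);
    }
}
```

From the caller's point of view, route() always returns a usable member, which mirrors the way the cluster appears to the application as a single space.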
Data Grid Topologies Shown in this Tutorial
Topology: Replicated
Description: Two or more space instances with replication between them.
Common use: Allowing two or more applications to work with their own dedicated data store, while working on the same data as the other applications.
Options: Replication can be synchronous (slower but guarantees consistency) or asynchronous (fast but less reliable, as it does not guarantee identical content). Space instances can run within the application (embedded - allows faster read access) or as a separate process (remote - allows multiple applications to use the space, easier management).
In this tutorial: Two remote spaces, synchronous replication.

Topology: Partitioned
Description: Data and operations are split between two spaces (partitions) according to an index field defined in the data. An algorithm, defined in the Load-Balancing Policy, maps values of the index field to specific partitions.
Common use: Allows the In-Memory Data Grid to hold a large volume of data, even if it is larger than the memory of a single machine, by splitting the data into several partitions.
Options: Several routing algorithms to choose from. With or without a backup space for each partition.
In this tutorial: Two spaces, hash-based routing, with backup.

Topology: Master-Local
Description: Each application has a lightweight, embedded cache, which is initially empty. The first time data is read, it is loaded from a master cache to the local cache (lazy load); the next time the same data is read, it is loaded quickly from the local cache. Later on, data is either updated from the master or evicted from the cache.
Common use: Boosting read performance for frequently used data. A useful rule of thumb is to use a local cache when over 80% of all operations are read operations.
Options: The master cache can be clustered in any of the other topologies: replicated, partitioned, etc.
In this tutorial: The master cache comprises two spaces in a partitioned topology.

Topology: Local-View
Description: Similar to master-local, except that data is pushed to the local cache. The application defines a filter, using a spaces read template or an SQL query, and data matching the filter is streamed to the cache from the master cache.
Common use: Achieving maximal read performance for a predetermined subset of data.
Options: The master cache can be clustered in any of the other topologies: replicated, partitioned, etc.
In this tutorial: The master cache comprises two spaces in a partitioned topology.
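The master-local behavior described above (lazy load on the first read, fast local reads afterwards, eventual eviction) can be sketched as follows. This is an illustration only: the MasterLocalSketch class is hypothetical, a plain Map stands in for the master cache, and least-recently-used eviction is an assumption chosen to keep the example short.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy master-local cache: lazy load on first read, LRU eviction when full; not the GigaSpaces API. */
final class MasterLocalSketch {

    private final Map<Integer, String> master;          // stands in for the master cache
    private final LinkedHashMap<Integer, String> local; // bounded local cache
    int masterReads = 0;                                 // counts round trips to the master

    MasterLocalSketch(Map<Integer, String> master, final int capacity) {
        this.master = master;
        // Access-ordered map that drops the least recently used entry once capacity is exceeded.
        this.local = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > capacity;
            }
        };
    }

    String read(int id) {
        String cached = local.get(id);
        if (cached != null) {
            return cached;              // fast path: served from the local cache
        }
        masterReads++;                  // slow path: lazy load from the master
        String value = master.get(id);
        if (value != null) {
            local.put(id, value);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<Integer, String> master = new HashMap<>();
        master.put(1, "account-1");     // hypothetical sample data
        master.put(2, "account-2");
        MasterLocalSketch cache = new MasterLocalSketch(master, 1);
        cache.read(1);                  // loaded from the master
        cache.read(1);                  // served locally
        System.out.println("master reads: " + cache.masterReads);
    }
}
```

A local view differs in that matching data would be pushed into the local map ahead of time instead of being loaded on the first read.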
The topologies above are provided in the GigaSpaces product as predefined cluster schemas. Schemas can be found inside the <GigaSpaces Root>/lib/JSpaces.jar, under the schemas/config directory. The schema names are:
Partitioned with backup - partitioned-sync2backup-cluster-schema.xsl
The master-local and local-view topologies do not need their own schemas, because the local cache is defined on the client side.
Now that you have a little background about the GigaSpaces Data Grid and the topologies used in this tutorial, the first step is to deploy the Data Grid.
To deploy the Data Grid instances, you will first launch two GigaSpaces Grid Service Containers (generic containers that can run Data Grid instances) on the same machine. Each container will host one cluster node. In real life, each cluster node usually runs on a different physical machine.
You will also start a Grid Service Manager that will manage the two Grid Service Containers.
Then, using the GigaSpaces Management Center (GS-UI), you will launch two spaces, clustered together according to one of the Data Grid topologies discussed above.
Start by choosing the Data Grid topology that interests you most, and launching it using the instructions below. After you start the client application and test this topology (as described in the following sections), you can return to this section, deploy another topology, and try it out as well.
To run the Grid Service Containers:
Start the GS-UI, by executing <GigaSpaces Root>\bin\gs-ui.bat (or .sh).
From the upper toolbar select Launch -> Local Service (GSM/GSC) -> Grid Service Manager to start a local (running in this machine) Grid Service Manager, which manages the containers.
From the upper toolbar select Launch -> Local Service (GSM/GSC) -> Grid Service Container to start a local (running in this machine) Grid Service Container.
Start another Grid Service Container by selecting Launch -> Local Service (GSM/GSC) -> Grid Service Container again.
To deploy the Data Grid:
Inside the GS-UI, on the toolbar at the top, click the Launch Data Grid button. This is how you deploy a data grid.
The following page showing the Data Grid attribute fields is displayed:
In the Data Grid Name field, type the name myDataGrid as shown above. This name represents the Data Grid you are deploying in the GS-UI. This name will be given to all spaces in the cluster. Remember this space name - you will use it when running the client application and connecting to the Data Grid.
In the Space Schema field, leave the space schema as default. This field allows you to specify whether the space instances in the cluster should be persistent (data automatically persisted to a database) or not. You will not use persistency in this tutorial.
On this page of the wizard, you define the Data Grid topology by filling in the Cluster Info area. Do one of the following:
If you want to deploy the Data Grid in a replicated topology, from the Cluster schema drop-down menu, select the sync_replicated option. This option uses the sync_replicated-cluster-schema, which has synchronous replication between all cluster members. This option refers to a single space or a cluster of spaces (in one of several common topologies) with no backup.
Select the number of spaces (Data Grid instances) in your replicated cluster. Deploy a cluster with 2 spaces, by typing the number 2 into Number of Instances field. The following shows the settings for the replicated topology:
If you want one of the other topologies (partitioned, master-local or local-view), from the Cluster schema drop-down menu, select the partitioned option. This option refers to a single space with a backup, or a partitioned cluster of spaces with backups.
You need to select the number of partitions. Specify two partitions by typing 2 into the Number of Instances field. This option uses the partitioned-cluster-schema. Specify one backup for each partition, by typing 1 into the Number of backups field. When using the partitioned cluster with backups the cluster schema used is the partitioned-sync2backup-cluster-schema. The following shows the settings for the partitioned (with backup) topology:
For both topologies you need to select a Grid Service Manager (GSM) for deployment from the table placed in the bottom area of the page. The table might include more than one Grid Service Manager. If so, look for the specific manager you launched - you can find it according to the Machine field (look for the machine on which you ran the Grid Service Manager). Click your Grid Service Manager to select it.
Click Deploy to deploy the cluster. Deployment status is displayed (Here for the two replicated Data Grid instances):
In the master-local and local-view topologies, the master cache can in principle be clustered in any topology - partitioned, replicated, etc. (or can be a single space). The master-local/local-view aspect of the topology is specified on the client side: when the client connects to the cluster or space (the master cache), it specifies if it wants to start a local cache and how this cache should operate.
Depending on the type of deployment you performed, you should see that either two spaces (two replicated Data Grid instances) or four spaces (two Data Grid partitions with one backup each) were provisioned to the host running the Grid Service Containers.
You deployed the Data Grid using the GS-UI and its Deployment Wizard. An alternative way to deploy is to start the cluster manually, by executing the gsInstance script (<GigaSpaces Root>\bin\gsInstance.bat or .sh). Manual deployment requires the use of Space URLs, which might take different arguments for different topologies.
For more details on deploying a cluster manually, refer to Space URL.
In this tutorial, we provide a sample application that consists of the following components:
A Data Loader that writes data to the Data Grid.
A Simple Reader that reads data directly from the Data Grid (using spaces read).
A Notified Reader that registers for notifications on the Data Grid and is notified when data is written by the Data Loader. You can run one or more readers of either or both types.
An Account object, defined as a POJO (Java) or PONO (.NET), which represents the data in the Data Grid. It has the following fields: userName, accountID and balance.
Getting Source Code and Full Client Package
The source code of all three components, and the scripts used to run them, remains the same for all Data Grid topologies described above. To view the source code, use the links below:
The full Java client package including execution scripts is included, together with other GigaSpaces examples and tutorials. Find the client package for this tutorial at <GigaSpaces Root>\examples\tutorials\datagrid\topologies.
The full .NET client package can be found at the following path: <GigaSpaces Root>\dotnet\examples\DataGrid. If you don't see this path, it is because the dotnet directory is initially zipped when you download the product. Extract the ZIP file in the dotnet directory into <GigaSpaces Root>\dotnet, then look for this tutorial's client package under <GigaSpaces Root>\dotnet\examples\DataGrid.
Client Operating Process (In Brief)
When you run the Data Loader, it:
Connects to the Data Grid and clears it from all data.
Creates a new Account object, with a certain userName and accountID. The Account also has a balance (Java) or Balance (.NET) field, which is obtained by calculating accountID*10 (Java) or AccountID*10 (.NET).
Writes 100 Account instances with IDs 1 through 100 to the Data Grid, using JavaSpaces write.
When you run a Simple Reader, it reads all the Account instances in the Data Grid, then reads them again every few seconds, until you close it.
When you run a Notified Reader, it registers for notification on the Account class, and starts listening for notifications. When Account objects are written to the Data Grid, the Notified Reader immediately receives notifications from the Data Grid. The notifications include the Account objects themselves.
If you run more 'Simple Readers' or 'Notified Readers', they repeat step 2 or 3 above, respectively.
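The Data Loader steps above can be sketched without a running space. In the example below a plain list stands in for the Data Grid, and the DataLoaderSketch class, its load method, and the "user" + id naming scheme are assumptions made for illustration; the real loader writes to the space instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the Data Loader steps against an in-memory list instead of a space. */
final class DataLoaderSketch {

    static final class Account {
        final String userName;
        final int accountID;
        final int balance;

        Account(String userName, int accountID) {
            this.userName = userName;
            this.accountID = accountID;
            this.balance = accountID * 10; // balance is derived from the account ID
        }
    }

    /** Clears the target, then writes accounts with IDs 1 through count. */
    static void load(List<Account> grid, int count) {
        grid.clear();
        for (int id = 1; id <= count; id++) {
            grid.add(new Account("user" + id, id)); // "user" + id is a made-up naming scheme
        }
    }

    public static void main(String[] args) {
        List<Account> grid = new ArrayList<>();
        load(grid, 100);
        Account last = grid.get(grid.size() - 1);
        System.out.println(grid.size() + " accounts, last balance " + last.balance);
    }
}
```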
How the Client Application Connects to the Data Grid
The application connects to the space using the GigaSpaces SpaceFinder.find() (Java) or SpaceProxyProviderFactory.Instance.FindSpace(spaceUrl) (.NET) method. This is a method that accepts a space URL, discovers the space, and returns a proxy that allows the application to work with the space. The URL is usually not defined in the client application itself, but is supplied to it as an argument when it is started.
In this tutorial, we will use a space connection URL similar to the following:
jini://*/*/myDataGrid
This URL uses the Jini protocol, which enables dynamic discovery of the space (the client does not need to know which machines are participating in the Data Grid).
*/*/myDataGrid specifies that the client wants to connect to a cluster in which all the spaces are called myDataGrid, regardless of which physical machines participate in the cluster.
useLocalCache is an additional parameter, not shown above, which launches a local cache in the connecting application. This is necessary for the master-local and local-view topologies.
The URL above is used by the application to connect to the space (a cluster of spaces in this case), so it is called a space connection URL. This should not be confused with a space start URL, a similar form of URL which can be used to start a space. In this tutorial, you will not use a space start URL; rather, you will start the spaces using the GS-UI, as described in Deploying the Data Grid.
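To illustrate how the parts of a space connection URL fit together, here is a small parser sketch. It is not the GigaSpaces URL parser: the SpaceUrlSketch class is hypothetical, and it assumes that the three path segments are host, container and space name, and that extra parameters such as useLocalCache are appended after a question mark and separated by ampersands.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Splits a space connection URL into its parts; an illustration, not the GigaSpaces parser. */
final class SpaceUrlSketch {
    final String protocol;
    final String host;       // "*" means any machine (assumed meaning of the first segment)
    final String container;  // "*" means any container (assumed meaning of the second segment)
    final String spaceName;
    final Set<String> options = new LinkedHashSet<>();

    SpaceUrlSketch(String url) {
        int schemeEnd = url.indexOf("://");
        if (schemeEnd < 0) {
            throw new IllegalArgumentException("missing protocol: " + url);
        }
        protocol = url.substring(0, schemeEnd);

        // Separate the optional parameter list from the path.
        String rest = url.substring(schemeEnd + 3);
        String query = "";
        int queryStart = rest.indexOf('?');
        if (queryStart >= 0) {
            query = rest.substring(queryStart + 1);
            rest = rest.substring(0, queryStart);
        }

        String[] parts = rest.split("/");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host/container/space: " + url);
        }
        host = parts[0];
        container = parts[1];
        spaceName = parts[2];

        if (!query.isEmpty()) {
            options.addAll(Arrays.asList(query.split("&")));
        }
    }

    public static void main(String[] args) {
        SpaceUrlSketch url = new SpaceUrlSketch("jini://*/*/myDataGrid?useLocalCache");
        System.out.println(url.protocol + " " + url.spaceName + " " + url.options);
    }
}
```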
How Notifications Work
In a GigaSpaces Data Grid, applications can ask to be notified when changes are made to objects in the Data Grid. A request for notification has two components: a template and a mask:
The template specifies the class type and attribute values the application is interested in.
The mask (also called NotifyActionType in Java or DataEventType in .NET) specifies which events the application wants to be notified about - new data written to the Data Grid, data taken from the Data Grid, and so on.
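How a template and a mask together decide which events reach a listener can be sketched with a toy registry. Everything in this example is illustrative: NotifySketch, EventType and the register method are hypothetical names and not the GigaSpaces API, and null template fields are assumed to act as wildcards.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Toy notification registry; the names here are illustrative, not the GigaSpaces API. */
final class NotifySketch {

    enum EventType { WRITE, TAKE }

    static final class Account {
        final String userName;   // null in a template means "any user"
        final Integer accountID; // null in a template means "any ID"

        Account(String userName, Integer accountID) {
            this.userName = userName;
            this.accountID = accountID;
        }
    }

    private static final class Registration {
        final Account template;
        final Set<EventType> mask;
        final Consumer<Account> listener;

        Registration(Account template, Set<EventType> mask, Consumer<Account> listener) {
            this.template = template;
            this.mask = mask;
            this.listener = listener;
        }
    }

    private final List<Account> entries = new ArrayList<>();
    private final List<Registration> registrations = new ArrayList<>();

    /** Registers interest in the given event types on entries matching the template. */
    void register(Account template, Set<EventType> mask, Consumer<Account> listener) {
        registrations.add(new Registration(template, mask, listener));
    }

    void write(Account account) {
        entries.add(account);
        fire(EventType.WRITE, account);
    }

    /** Removes and returns the first entry matching the template, or null. */
    Account take(Account template) {
        for (Iterator<Account> it = entries.iterator(); it.hasNext();) {
            Account candidate = it.next();
            if (matches(template, candidate)) {
                it.remove();
                fire(EventType.TAKE, candidate);
                return candidate;
            }
        }
        return null;
    }

    private static boolean matches(Account template, Account candidate) {
        return (template.userName == null || template.userName.equals(candidate.userName))
            && (template.accountID == null || template.accountID.equals(candidate.accountID));
    }

    /** Delivers the entry to every listener whose mask and template both accept it. */
    private void fire(EventType type, Account account) {
        for (Registration r : registrations) {
            if (r.mask.contains(type) && matches(r.template, account)) {
                r.listener.accept(account);
            }
        }
    }

    public static void main(String[] args) {
        NotifySketch space = new NotifySketch();
        space.register(new Account(null, null), EnumSet.of(EventType.WRITE),
                a -> System.out.println("written: " + a.accountID));
        space.write(new Account("alice", 1)); // listener fires
        space.take(new Account(null, 1));     // TAKE is outside the mask, so nothing is printed
    }
}
```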
GigaSpaces provides a mechanism that handles this process without requiring remote calls. The mechanism works as follows:
Java
The application instantiates a Notify Container and connects it to the space.
This notify container is registered to get notifications when an Account object matching the template is written to the space (notifyWrite(true)).
When a notification arrives, the Account object is read from the space.
The inner event listener class (SpaceDataEventListener) is then triggered, and its onEvent method is invoked on the Account object that was read.
Here is how the Notified Reader registers for notifications:
notifyContainer = new SimpleNotifyContainerConfigurer(gigaSpace)
        .template(new Account()).fifo(true).notifyWrite(true)
        .eventListener(new SpaceDataEventListener<Account>() {
            public void onEvent(Account account, GigaSpace gigaSpace,
                    TransactionStatus txStatus, Object source) {
                System.out.println("Read account info [" + account.toString() + "]");
            }
        }).notifyContainer();
System.out.println("Listening...");
.NET
The application instantiates an IDataEventSession and connects it to the space.
The application creates an EventSessionConfig object, which can specify different options for the notification.
The application passes the configuration object to the CreateDataEventSession method, and gets an IDataEventSession.
The application uses the IDataEventSession.AddListener method to generate an IEventRegistration - the object that actually receives the notifications from the space. The AddListener method accepts the notification template, the DataEventType and other parameters, but most importantly the listener object.
The listener object is an EventHandler which is fired when a notification is received.
When the Data Grid detects relevant operations, it contacts the IEventRegistration.
IEventRegistration then fires the listener.
For every relevant space event, the Data Grid provides an object of type SpaceDataEventArgs, which contains information about the event that occurred (e.g. new data was written), and also the actual object that was involved (e.g. the object that was written). The code implemented in the listener can extract the object and perform operations on it.
Here is how the Notified Reader registers for notifications:
space.DefaultDataEventSession.AddListener<Account>(
    // Template
    new Account(),
    // Listener
    Space_AccountChanged,
    DataEventType.Write);
And here is the callback method invoked when the application is notified:
The Data - Defined as a POJO (Java) or PONO (.NET)
In this tutorial all the objects written to the space instances, which make up the Data Grid, are Plain Old Java Objects - POJOs (Java) or Plain Old .NET Objects - PONOs (.NET). This is in contrast to the tutorials in the Parallel Processing Track of this Quick Start Guide, in which objects written to the space implement the Entry class, as in the JavaSpaces standard.
To demonstrate use of POJOs (Java) or PONOs (.NET), the Account class is implemented with private fields, and with set/get methods (Java) or Properties (.NET) for each field, which enable the space to read and write the field value. For example:
private string _userName = null;
[...]
public string UserName
{
    get { return _userName; }
    set { _userName = value; }
}
Index Field for Partitioning
Inside the Account object, one of the data fields is defined as a routing index field for the purposes of partitioning. If this object is used in a Data Grid deployed in a partitioned topology, the routing index field is used to distribute data between the Data Grid instances, and to retrieve data from the relevant Data Grid instance when it is read.
In this tutorial, the routing index field is AccountID, and the partitioning algorithm is a hash. This means the operations on accounts are distributed evenly, based on the AccountID, between the Data Grid instances. You deployed two spaces (Data Grid instances), so all the operations on half the accounts - those with even IDs - go to one space, and all operations on the other half - with the odd IDs - go to the other space.
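The even/odd split described above follows from simple arithmetic, which the sketch below reproduces. It assumes that the partition is chosen as the hash code of the routing value modulo the number of partitions, and that the hash code of an integer ID is the ID itself; the HashRoutingSketch class and the partitionFor method are hypothetical names, not part of GigaSpaces.

```java
/** Sketch of hash-based routing: partition = hash(routing value) mod number of partitions. */
final class HashRoutingSketch {

    /** Maps a routing value to a partition index in the range [0, partitions). */
    static int partitionFor(Object routingValue, int partitions) {
        // Clear the sign bit so negative hash codes still give a valid index.
        return (routingValue.hashCode() & Integer.MAX_VALUE) % partitions;
    }

    public static void main(String[] args) {
        int partitions = 2;
        int[] counts = new int[partitions];
        for (int accountID = 1; accountID <= 100; accountID++) {
            counts[partitionFor(accountID, partitions)]++;
        }
        // With IDs 1 through 100 and two partitions, each partition receives 50 accounts.
        System.out.println(counts[0] + " / " + counts[1]);
    }
}
```

Because the same function is applied on reads, a lookup by AccountID only needs to contact the one partition that holds the account.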
Here is how AccountID is defined as the index field (Java) or property (.NET), inside the Account object - in Java, using annotations before the get/set methods; in .NET, using attributes before the property:
[SpaceID]
[SpaceRouting]
[SpaceProperty(NullValue = -1)]
public int AccountID
{
    get { return _accountID; }
    set { _accountID = value; }
}
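For Java, a comparable annotated getter might look like the sketch below. The @SpaceId and @SpaceRouting annotations here are local stand-ins declared inside the sketch so that it compiles without the GigaSpaces libraries; the actual annotation names and packages should be checked against the product documentation for your version. The routingValueOf helper is hypothetical and only shows how such metadata could be discovered by reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Sketch of a POJO whose routing field is marked by an annotation on its getter. */
final class RoutingAnnotationSketch {

    // Local stand-ins so the sketch compiles on its own; not the real GigaSpaces annotations.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SpaceId {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SpaceRouting {}

    public static final class Account {
        private Integer accountID;

        @SpaceId
        @SpaceRouting
        public Integer getAccountID() {
            return accountID;
        }

        public void setAccountID(Integer accountID) {
            this.accountID = accountID;
        }
    }

    /** Returns the value of the getter marked with @SpaceRouting, found by reflection. */
    static Object routingValueOf(Object pojo) {
        for (Method method : pojo.getClass().getMethods()) {
            if (method.isAnnotationPresent(SpaceRouting.class)) {
                try {
                    return method.invoke(pojo);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("cannot read routing value", e);
                }
            }
        }
        throw new IllegalArgumentException("no @SpaceRouting getter on " + pojo.getClass());
    }

    public static void main(String[] args) {
        Account account = new Account();
        account.setAccountID(42); // arbitrary sample ID
        System.out.println(routingValueOf(account));
    }
}
```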
When using JDK 1.4, instead of using annotations, an Account.gs.xml file should be placed in a folder named config\mapping. The file should contain the following:
persist="false" replicate="false" fifo="false" >
For more information on using a gs.xml file instead of annotations (in Java), refer to C++ Mapping File.
Running Client, Testing Notifications and Verifying Data Grid Topologies
Now that you have started the Data Grid topology of your choice, you can run the client application, described in the previous section, verify that the Notified Reader receives notifications, and then test that the Data Grid topology is functioning as expected (for example, that data is really being replicated between the spaces).
Before you begin - download and compile the client application:
If you haven't done so already, extract the client application (.NET only): if your <GigaSpaces Root>\dotnet folder contains a ZIP file, extract it.
The client application package should appear at the following path: <GigaSpaces Root>\examples\tutorials\datagrid\topologies (Java) or <GigaSpaces Root>\dotnet\examples\DataGrid (.NET).
Compile the client's source files by executing \bin\compile.bat (or .sh) from the example folder.
Select the topology you deployed from the tabs below.
Replicated
To run the client, test notifications and see that replication is working:
Start a console and cd to the folder containing this tutorial's client application: <GigaSpaces Root>\examples\tutorials\datagrid\topologies (Java) or <GigaSpaces Root>\dotnet\examples\DataGrid\bin (.NET).
Run the Notified Reader by executing the following command:
(Windows/UNIX)
<example dir>\bin\run_NotifiedReader.(bat/sh)
The script passes the parameter "jini://*/*/myDataGrid" to the application
start run_NotifiedReader.bat
The .NET batch file contains the line DataGrid.exe Events "jini://*/*/myDataGrid"
The string at the end is the Space Connection URL. myDataGrid is the name you defined for all spaces (Data Grid instances) in the cluster. The client uses this URL to search the network for Data Grid instances with this name (the client is unaware of the cluster topology, replicated in this case).
Run the Data Loader by executing the following command:
(Windows/UNIX)
<example dir>\bin\run_DataLoader.(bat/sh)
The script passes the parameter "jini://*/*/myDataGrid" to the application
start run_DataLoader.bat
The .NET batch file contains the line: DataGrid.exe Loader "jini://*/*/myDataGrid"
Switch to the command console that is running the Notified Reader, maximize the window, and see that it receives all 100 accounts.
You have just verified that notifications work! The Data Loader connects to the cluster of spaces (Data Grid instances), and writes 100 Account objects which are automatically replicated between the two spaces. Then, one of these spaces generates a series of events which deliver the new Account objects to the notified reader.
If the GS-UI is not currently running, start it by executing <GigaSpaces Root>\bin\gs-ui.bat (or .sh).
In the GS-UI, click the Space Browser tab on the left. This allows you to view running spaces and perform operations on them.
In the Grid Tree on the left, under Spaces, there should be two spaces called myDataGrid (the space name you used when deploying the cluster). Expand the first space with this name, and click Classes. This shows all the classes of data stored in the space.
The Classes List pane is displayed on the right. At the top-center of this pane, click Start to read the list of classes from the space. One of the classes listed should be the class written by the Data Loader, com.gigaspaces.examples.tutorials.topologies.Account.
The Instance Count column shows, for each class, how many objects exist in the space. Check that the instance count for the Account class is 100.
In the Grid Tree on the left, under Spaces, expand the second space called myDataGrid, and click Classes.
In the Classes List, check that the instance count for the class com.gigaspaces.examples.tutorials.topologies.Account is 100.
You have just verified that replication works! Although the Data Loader only wrote the Account objects once, they were replicated, and the same 100 accounts now exist in each of the two Data Grid instances.
If you want to test another topology, click the Deployments, Details tab on the left. In the Service Grid Network tree on the left, click the myDataGrid node.
Click the undeploy button and click Yes to approve. Once you have successfully undeployed, return to Deploying the Data Grid.
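The replication check in this section can be pictured with a small sketch of synchronous replication, in which a write returns only after every replica has applied it. The SyncReplicationSketch class and its methods are hypothetical and use in-memory lists in place of spaces; they are not the GigaSpaces replication mechanism.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy synchronous replication: a write returns only after every replica has applied it. */
final class SyncReplicationSketch {

    /** One replica, holding the IDs of the accounts written to it. */
    static final class Replica {
        final List<Integer> accountIds = new ArrayList<>();
    }

    private final List<Replica> replicas = new ArrayList<>();

    SyncReplicationSketch(int replicaCount) {
        for (int i = 0; i < replicaCount; i++) {
            replicas.add(new Replica());
        }
    }

    /** Applies the write to all replicas before returning to the caller. */
    void write(int accountId) {
        for (Replica replica : replicas) {
            replica.accountIds.add(accountId);
        }
    }

    /** Number of accounts currently held by the replica at the given index. */
    int countOn(int replicaIndex) {
        return replicas.get(replicaIndex).accountIds.size();
    }

    public static void main(String[] args) {
        SyncReplicationSketch cluster = new SyncReplicationSketch(2);
        for (int id = 1; id <= 100; id++) {
            cluster.write(id);
        }
        System.out.println(cluster.countOn(0) + " and " + cluster.countOn(1));
    }
}
```

An asynchronous variant would queue the write for the other replicas and return immediately, so a count taken right after the write could differ between replicas.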
Partitioned with Backup
To run the client, test notifications and see that partitioning is working:
Start a console and cd to the folder containing this tutorial's client application: <GigaSpaces Root>\examples\tutorials\datagrid\topologies (Java) or <GigaSpaces Root>\dotnet\examples\DataGrid\bin (.NET).
Run the Notified Reader by executing the following command:
(Windows/UNIX)
<example dir>\bin\run_NotifiedReader.(bat/sh)
The script passes the parameter "jini://*/*/myDataGrid" to the application
start run_NotifiedReader.bat
The .NET batch file contains the line: DataGrid.exe Events "jini://*/*/myDataGrid"
The string at the end is the Space Connection URL. myDataGrid is the name you defined for all spaces (Data Grid instances) in the cluster. The client uses this URL to search the network for Data Grid instances with this name (the client is unaware of the cluster topology, partitioned in this case).
Behind the scenes: The Notified Reader connects to the cluster of spaces (Data Grid instances), and registers for notifications on the Account class. From this point onwards, the reader functions as a listener which waits to receive events from the space.
Run the Data Loader by executing the following command:
(Windows/UNIX)
<example dir>\bin\run_DataLoader.(bat/sh)
The script passes the parameter "jini://*/*/myDataGrid" to the application
start run_DataLoader.bat
The .NET batch file contains the line: DataGrid.exe Loader "jini://*/*/myDataGrid"
Switch to the command console that is running the Notified Reader, maximize the window, and see that it receives all 100 accounts.
You have just verified that notifications work! The Data Loader connects to the cluster of spaces (Data Grid instances), and writes 100 Account objects. These objects are divided between the two spaces according to the partitioning algorithm: accounts with even IDs go to one space; accounts with odd IDs go to the other space. The Notified Reader automatically receives all the even-ID accounts from one space, and all the odd-ID accounts from the other space (without being aware that the data is partitioned).
If the GS-UI is not currently running, start it by executing <GigaSpaces Root>\bin\gs-ui.bat (or .sh).
In the GS-UI, click the Space Browser tab on the left. This allows you to view running spaces and perform operations on them.
In the Grid Tree on the left, under Spaces, there should be four spaces called myDataGrid (the space name you used when deploying the cluster). There are four spaces because you started two partitions with one backup each. Expand one of the two primary spaces (Primary space icon is green and marked P) with this name, and click Classes. This shows all the classes of data stored in the space.
The Classes List pane is displayed on the right. At the top-center of this pane, click Start to read the list of classes from the space. One of the classes listed should be the class written by the Data Loader, com.gigaspaces.examples.tutorials.topologies.Account.
The Instance Count column shows, for each class, how many objects exist in the space. Check that the instance count for the Account class is 50.
In the Grid Tree on the left, for each of the other three spaces named myDataGrid, expand the space node and click Classes. Check that the instance count for the class com.gigaspaces.examples.tutorials.topologies.Account is also 50, for each of these spaces.
You have just verified that partitioning and backup work! Although the Data Loader wrote the Account objects once, they were split between the two primary spaces (50 in each space), and replicated to the two backup spaces (which now also contain 50 accounts each).
Double click on one of the primary spaces to expand the space node and click Classes.
In the Classes List on the right, right-click the Account class, and from the context menu, select Query. The Classes List is replaced by a query panel. At the top of this panel is an SQL query requesting all objects belonging to this class. Below the query is a result list, currently ordered by object ID. The AccountID column shows the ID for each account.
Scroll through the results to see that this space contains only accounts with odd IDs (1, 3, ..., 99), or only accounts with even IDs (2, 4, ..., 100).
Select the other space running under a container without _1 at the end of its name - this is the other Data Grid partition. Repeat the previous steps to see that this space contains the other half of the 100 accounts.
You have just verified that partitioning uses the index field! The Account objects written to the Data Grid were split between the two spaces according to the index field we defined in the Account class (the AccountID field). All even-ID accounts were written to one partition, while all odd-ID accounts were written to the other partition.
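The even/odd split described above can be reproduced with a few lines of plain Java. This is a minimal sketch and not GigaSpaces code: the class `RoutingSketch` and its methods are hypothetical, and it assumes that the partition is chosen from the hash of the index field modulo the number of partitions, which is consistent with the behavior observed in this tutorial.

```java
// Hypothetical sketch: picks a partition for an account from its index field.
// Not the GigaSpaces API; the routing rule below is an assumption.
class RoutingSketch {

    // Assumption: partition = hash of the index field value, modulo the partition count.
    static int partitionFor(int accountId, int partitions) {
        return Math.floorMod(Integer.hashCode(accountId), partitions);
    }

    // Counts how many of the accounts 1..accounts land in the given partition.
    static int countInPartition(int partition, int accounts, int partitions) {
        int count = 0;
        for (int id = 1; id <= accounts; id++) {
            if (partitionFor(id, partitions) == partition) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Even-ID partition: " + countInPartition(0, 100, 2));
        System.out.println("Odd-ID partition:  " + countInPartition(1, 100, 2));
    }
}
```

With two partitions, every even ID maps to one partition and every odd ID to the other, so each holds half of the 100 accounts.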
If you want to test another topology, click the Deployments, Details tab on the left. In the Service Grid Network tree on the left, click the myDataGrid node.
Click the undeploy button and click Yes to approve. Once you have successfully undeployed, return to Deploying the Data Grid.
Master-Local
To run the client, test notifications, verify master cache partitioning, and see the master-local pattern in action:
Start a console and cd to the folder containing this tutorial's client application: <GigaSpaces Root>\examples\tutorials\datagrid\topologies (Java) or <GigaSpaces Root>\dotnet\examples\DataGrid\bin (.NET).
Run the Notified Reader by executing the following command:
Java (Windows/UNIX)
<example dir>\bin\run_NotifiedReader.(bat/sh)
The script passes the parameter "jini://*/*/myDataGrid" to the application.
.NET
start run_NotifiedReader.bat
The .NET batch file contains the line: DataGrid.exe Events "jini://*/*/myDataGrid"
The string at the end is the Space Connection URL. myDataGrid is the name you defined for all spaces (Data Grid instances) in the cluster. The client uses this URL to search the network for Data Grid instances with this name (the client is unaware of the cluster topology, partitioned in this case).
The Notified Reader does not use a local cache - it doesn't need one, because it receives the data once from the Data Grid and doesn't need to read it again.
Behind the scenes: The Notified Reader connects to the cluster of spaces (Data Grid instances), and registers for notifications on the Account class. From this point onwards, the reader functions as a listener which waits to receive events from the space.
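The register-then-listen flow described above can be illustrated with a small observer-style sketch in plain Java. It does not use the GigaSpaces notification API: `NotifySketch`, `register`, and `write` are hypothetical names, and the "space" here is just an in-memory object that calls every registered listener on each write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a space that notifies listeners on writes.
// Not the GigaSpaces API.
class NotifySketch {
    private final List<Consumer<Integer>> listeners = new ArrayList<>();

    // The reader registers once, then simply waits for events.
    void register(Consumer<Integer> listener) {
        listeners.add(listener);
    }

    // Every write triggers one notification per registered listener.
    void write(int accountId) {
        for (Consumer<Integer> listener : listeners) {
            listener.accept(accountId);
        }
    }

    // Registers a listener, writes the accounts 1..accounts, and returns
    // how many notifications the listener received.
    static int receivedAfterLoading(int accounts) {
        NotifySketch space = new NotifySketch();
        List<Integer> received = new ArrayList<>();
        space.register(received::add);
        for (int id = 1; id <= accounts; id++) {
            space.write(id);
        }
        return received.size();
    }

    public static void main(String[] args) {
        System.out.println("Notifications received: " + receivedAfterLoading(100));
    }
}
```

The listener never polls the space; it receives one event per written account, which is why the Notified Reader needs no local cache.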
Run the Data Loader by executing the following command:
Java (Windows/UNIX)
<example dir>\bin\run_DataLoader.(bat/sh)
The script passes the parameter "jini://*/*/myDataGrid" to the application.
.NET
start run_DataLoader.bat
The .NET batch file contains the line: DataGrid.exe Loader "jini://*/*/myDataGrid"
Switch to the command console that is running the Notified Reader, maximize the window, and see that it receives all 100 accounts.
You have just verified that notifications work! The Data Loader connects to the cluster of spaces (Data Grid instances), and writes 100 Account objects. These objects are divided between the two spaces according to the partitioning algorithm: accounts with even IDs go to one space; accounts with odd IDs go to the other space. The Notified Reader automatically receives all the even-ID accounts from one space, and all the odd-ID accounts from the other space (without being aware that the data is partitioned).
If the GS-UI is not currently running, start it by executing <GigaSpaces Root>\ServiceGrid\bin\gs-ui.bat (or .sh).
In the GS-UI, on the navigation bar at the left, click the Space Browser tab. This allows you to view running spaces and perform operations on them.
In the Grid Tree on the left, make sure the Spaces node at the top is selected. In the panel on the right, a table lists the four spaces you deployed earlier (two partitions with one backup each, which together comprise the master cache). In the Count column of the table, you can see that each of the four spaces contains either 50 or 51 objects - 50 accounts that were written by the Data Loader, and possibly one more used to administrate the space. Altogether there are ~200 account objects - 100 in the two partitions, and another 100 in the two backups.
You have just verified partitioning and backup in the master cache! Although the Data Loader wrote the Account objects once, they were split between the two primary spaces (50 in each space), and replicated to the two backup spaces (which now also contain 50 accounts each).
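The object counts seen in the GS-UI follow from simple arithmetic, which the following plain-Java sketch reproduces. It is an illustration only: `BackupSketch` is a hypothetical class, the even/odd split is modeled as ID modulo 2, and replication is modeled as copying each written ID to the backup of its partition.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for two partitions with one backup each.
// Not the GigaSpaces API; it only mirrors the counts shown in the GS-UI.
class BackupSketch {
    static final int PARTITIONS = 2;

    // Index 2 * p holds the primary of partition p; index 2 * p + 1 holds its backup.
    static List<List<Integer>> load(int accounts) {
        List<List<Integer>> spaces = new ArrayList<>();
        for (int i = 0; i < PARTITIONS * 2; i++) {
            spaces.add(new ArrayList<>());
        }
        for (int id = 1; id <= accounts; id++) {
            int p = id % PARTITIONS;        // even IDs -> partition 0, odd IDs -> partition 1
            spaces.get(2 * p).add(id);      // written once, to the primary
            spaces.get(2 * p + 1).add(id);  // replicated to the backup
        }
        return spaces;
    }

    // Total number of stored objects across primaries and backups.
    static int totalObjects(int accounts) {
        int total = 0;
        for (List<Integer> space : load(accounts)) {
            total += space.size();
        }
        return total;
    }

    public static void main(String[] args) {
        for (List<Integer> space : load(100)) {
            System.out.println("Objects in space: " + space.size());
        }
        System.out.println("Total: " + totalObjects(100));
    }
}
```

The sketch leaves out the optional administrative object, which is why the GS-UI may show 51 rather than 50 for some spaces.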
In the Grid Tree on the left, expand the first space called myDataGrid (out of four spaces with this name), and click the Statistics node under it.
The pane on the right shows activity statistics for the space:
Each bar on the horizontal axis represents a type of space operation (e.g. Read), and the height of the bar is the accumulated number of operations of that type. You should see 50 or 51 Write operations - 50 account objects (one half of the accounts written by the Data Loader, which were routed to this partition), and possibly 1 administrative object, written when the cluster was set up.
Leave the statistics pane open. In the command console, run the Simple Reader by executing the following command:
Java
The script passes the parameter "jini://*/*/myDataGrid?useLocalCache" to the application.
.NET
start run_SimpleReaderLocalCache.bat
The .NET batch file contains the line: DataGrid.exe Reader "jini://*/*/myDataGrid?useLocalCache"
The string at the end is the Space Connection URL:
myDataGrid is the name you defined for all spaces (Data Grid instances) in the cluster.
useLocalCache is a parameter which instructs the client to load an embedded local cache. When this parameter appears on its own, without the views parameter, it instructs the cache to lazy-load data from the space.
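To make the structure of the Space Connection URL concrete, here is a minimal plain-Java sketch that splits such a URL into the space name and its parameters. It is not how the GigaSpaces client parses the URL; `SpaceUrlSketch` and its methods are hypothetical and rely only on simple string splitting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that pulls apart a Space Connection URL by plain string splitting.
// Not the GigaSpaces parser.
class SpaceUrlSketch {

    // Returns the last path segment before any '?', e.g. "myDataGrid".
    static String spaceName(String url) {
        String path = url.split("\\?", 2)[0];
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Returns the parameters after '?'; a flag without a value maps to an empty string.
    static Map<String, String> parameters(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String[] parts = url.split("\\?", 2);
        if (parts.length < 2) {
            return params;
        }
        for (String pair : parts[1].split("&")) {
            String[] keyValue = pair.split("=", 2);
            params.put(keyValue[0], keyValue.length > 1 ? keyValue[1] : "");
        }
        return params;
    }

    public static void main(String[] args) {
        String url = "jini://*/*/myDataGrid?useLocalCache";
        System.out.println("Space name: " + spaceName(url));
        System.out.println("Parameters: " + parameters(url).keySet());
    }
}
```

For the URL used in this step, the space name is myDataGrid and the only parameter is the useLocalCache flag.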
After executing the reader, maximize the command console window. The Simple Reader reads all 100 accounts from the space, sleeps and repeats. Wait for it to read the accounts 4 times, then proceed to the next step.
In the GS-UI, the statistics panel should now look like this:
On the right, you can see the 50 notifications received by the Notified Reader from this space (one notification for each of the 50 account objects written to this space by the Data Loader).
At the center, the Read bar shows 50 or 51 read operations. The important point is that the count stays at about 50, even though the reader has already read the accounts 4 or more times.
You have just verified that the master-local pattern is working! The space shows 50 read operations, because the first time the Simple Reader tried to read these accounts, its local cache read them from the space (lazy load). However, from the second read cycle and onwards, the accounts were not read from the space, but from the local cache. This is why, although the read operations succeeded, the space does not show additional read activity.
If you wait 5 minutes, space statistics should show 50 more read operations. This is because of eviction of objects in the local cache. Eviction ensures that objects in the local cache do not become outdated. By default, any object loaded to the local cache is evicted after five minutes; for this reason, every five minutes the Simple Reader's local cache reads the 50 accounts again from the space. But the Simple Reader continues to read these accounts from its local cache, not directly from the space.
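The lazy-load and eviction behavior described above can be modeled with a small read-through cache in plain Java. This is only a sketch under stated assumptions: `LocalCacheSketch` is a hypothetical class, time is passed in explicitly instead of being read from a clock, and eviction is modeled as a fixed five-minute lifetime per object, as the text above describes. It counts reads across the whole grid (100 accounts), whereas the GS-UI statistics show one partition at a time (50 accounts).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a lazy-loading local cache with time-based eviction.
// Not the GigaSpaces local cache implementation.
class LocalCacheSketch {
    static final long EVICT_AFTER_MS = 5 * 60 * 1000L; // five minutes, per the tutorial text

    private final Map<Integer, Long> loadedAt = new HashMap<>(); // account ID -> load time
    int masterReads = 0;

    // Reads one account at time nowMs. The master is contacted only on a
    // first read (lazy load) or after the cached copy has been evicted.
    void read(int accountId, long nowMs) {
        Long loaded = loadedAt.get(accountId);
        if (loaded == null || nowMs - loaded >= EVICT_AFTER_MS) {
            masterReads++;
            loadedAt.put(accountId, nowMs);
        }
    }

    // Runs the given number of read cycles over accounts 1..100, spaced
    // cycleGapMs apart, and returns how many reads reached the master.
    static int masterReadsAfter(int cycles, long cycleGapMs) {
        LocalCacheSketch cache = new LocalCacheSketch();
        for (int c = 0; c < cycles; c++) {
            for (int id = 1; id <= 100; id++) {
                cache.read(id, c * cycleGapMs);
            }
        }
        return cache.masterReads;
    }

    public static void main(String[] args) {
        System.out.println("4 quick cycles: " + masterReadsAfter(4, 1000L));
        System.out.println("2 cycles, 5 minutes apart: " + masterReadsAfter(2, EVICT_AFTER_MS));
    }
}
```

Four quick cycles reach the master only once per account, while a second cycle five minutes later reloads every account, matching the extra read operations the statistics show after eviction.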
If you want to test another topology, click the Deployments, Details tab on the left. In the Service Grid Network tree on the left, click the myDataGrid node.
Click the undeploy button and click Yes to approve. Once you have successfully undeployed, return to Deploying the Data Grid.
Local-View
To run the client, test notifications, verify master cache partitioning, and see the local-view pattern in action:
Start a console and cd to the folder containing this tutorial's client application: <GigaSpaces Root>\examples\tutorials\datagrid\topologies.
Run the Notified Reader by executing the following command:
Java (Windows/UNIX)
<example dir>\bin\run_NotifiedReader.(bat/sh)
The script passes the parameter "jini://*/*/myDataGrid" to the application.
.NET
start run_NotifiedReader.bat
The .NET batch file contains the line: DataGrid.exe Events "jini://*/*/myDataGrid"
The string at the end is the Space Connection URL. myDataGrid is the name you defined for all spaces (Data Grid instances) in the cluster. The client uses this URL to search the network for Data Grid instances with this name (the client is unaware of the cluster topology, partitioned in this case).
The Notified Reader does not use a local cache - it doesn't need one, because it receives the data once from the Data Grid and doesn't need to read it again.
Behind the scenes: The Notified Reader connects to the cluster of spaces (Data Grid instances), and registers for notifications on the Account class. From this point onwards, the reader functions as a listener which waits to receive events from the space.
Run the Data Loader by executing the following command:
Java (Windows/UNIX)
<example dir>\bin\run_DataLoader.(bat/sh)
The script passes the parameter "jini://*/*/myDataGrid" to the application.
.NET
start run_DataLoader.bat
The .NET batch file contains the line: DataGrid.exe Loader "jini://*/*/myDataGrid"
Switch to the command console that is running the Notified Reader, and maximize the window. See that the Notified Reader receives all 100 accounts from the Data Grid.
You have just verified that notifications work! The Data Loader connects to the cluster of spaces (Data Grid instances), and writes 100 Account objects. These objects are divided between the two spaces according to the partitioning algorithm: accounts with even IDs go to one space; accounts with odd IDs go to the other space. The Notified Reader automatically receives all the even-ID accounts from one space, and all the odd-ID accounts from the other space (without being aware that the data is partitioned).
If the GS-UI is not currently running, start it by executing <GigaSpaces Root>\ServiceGrid\bin\gs-ui.bat (or .sh).
In the GS-UI, on the navigation bar at the left, click the Space Browser tab. This allows you to view running spaces and perform operations on them.
In the Grid Tree on the left, make sure the Spaces node at the top is selected. In the panel on the right, a table lists the four spaces you deployed earlier (two partitions with one backup each, which together comprise the master cache). In the Count column of the table, you can see that each of the four spaces contains either 50 or 51 objects - 50 accounts that were written by the Data Loader, and possibly one more used to administrate the space. Altogether there are ~200 account objects - 100 in the two partitions, and another 100 in the two backups.
You have just verified partitioning and backup in the master cache! Although the Data Loader wrote the Account objects once, they were split between the two primary spaces (50 in each space), and replicated to the two backup spaces (which now also contain 50 accounts each).
In the command console, run the Simple Reader with a local view by executing the following command:
Java
The script passes the parameter "jini://*/*/myDataGrid?useLocalCache&views={com.gigaspaces.examples.tutorials.topologies.Account:accountID=20 AND balance=200,com.gigaspaces.examples.tutorials.topologies.Account:accountID=17}&groups=${LOOKUPGROUPS}" to the application.
.NET
start run_SimpleReaderLocalView.bat
The .NET batch file contains the line: DataGrid.exe Reader-LocalView "jini://*/*/myDataGrid"
The string at the end is the Space Connection URL:
myDataGrid is the name you defined for all spaces (Data Grid instances) in the cluster.
useLocalCache is a parameter which instructs the client to load an embedded local cache.
views={...} is the view filter. This parameter instructs the cache to pull from the space the objects matching the query inside the curly brackets. The query used here is structured as follows: the class name, followed by a colon, followed by the name of the field being queried, followed by an SQL condition.
When the parameter Reader-LocalView is used, a Local View is created in the code:
public static SimpleReader CreateLocalViewReader(ISpaceProxy proxy)
{
    // Create an account template:
    Account template = new Account();
    template.AccountID = 20;
    template.Balance = 200;
    // Create a view query based on the template:
    View view = new View(template, "Balance < ? and AccountID < ?");
    // Create a local view:
    IReadOnlySpaceProxy localView = proxy.CreateLocalview(view);
    SimpleReader reader = new SimpleReader(localView);
    return reader;
}
Class Name | Field Being Queried | Condition
GigaSpaces.Examples.Tutorials.Topologies.Account | Balance | <200
GigaSpaces.Examples.Tutorials.Topologies.Account | AccountID | <20
Behind the scenes: The Simple Reader attempts to read all the Account objects using the Read operation. But it is actually reading from its local cache, which only contains a subset of the data in the Data Grid - the Local-View pattern.
Check that the Simple Reader follows your Local View instructions:
Java: The Reader should only read 5 accounts from the space - the accounts that meet the condition you specified, accountID>95 - so the accounts read are 96, 97, 98, 99, and 100.
.NET: The Reader should only read 19 accounts from the space - the accounts that meet the conditions specified in the code, Balance<200 and AccountID<20.
You have just verified that the local cache is working with a local view! Although there are 100 accounts in the space, the Simple Reader only reads a small subset - those 5 accounts (Java) or 19 accounts (.NET) that match your respective view filter.
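The effect of a view filter can be sketched in plain Java as a predicate applied to the 100 accounts. This is an illustration, not the GigaSpaces local view API: `LocalViewSketch` is a hypothetical class, and because the tutorial does not state the account balances, the sketch assumes a made-up rule (balance = 10 x account ID) purely so that the two-field condition can be evaluated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical model of a local view: only accounts matching a filter are held locally.
// Not the GigaSpaces API.
class LocalViewSketch {

    // Assumption for illustration only: the tutorial does not state the balances.
    static int balanceOf(int accountId) {
        return 10 * accountId;
    }

    // Returns the IDs, out of accounts 1..100, that the view filter lets through.
    static List<Integer> view(IntPredicate filter) {
        List<Integer> ids = new ArrayList<>();
        for (int id = 1; id <= 100; id++) {
            if (filter.test(id)) {
                ids.add(id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Single-field condition: accountID > 95.
        System.out.println(view(id -> id > 95));
        // Two-field condition taken from the template values in the .NET code:
        // AccountID < 20 and Balance < 200.
        System.out.println(view(id -> id < 20 && balanceOf(id) < 200).size());
    }
}
```

The reader still issues an unrestricted read, but it only ever sees the subset that the filter admitted into the view.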
If you want to test another topology, click the Deployments, Details tab on the left. In the Service Grid Network tree on the left, click the myDataGrid node.
Click the undeploy button and click Yes to approve. Once you have successfully undeployed, return to Deploying the Data Grid.