Summary: GigaSpaces offers several data persistency paradigms to suit different application needs. This section gives a basic overview of each paradigm.
Persisting Space Data into Permanent Storage
There are many situations where space data needs to be persisted to permanent storage and retrieved from it. For example:
Bridging the Gap Between Object and Relational
Object-oriented development dominates the enterprise, and most client applications today are written in Java, C#, and C++. However, the majority of business-critical data is stored in relational database management systems (RDBMS) or similar systems that use record-based (non-object-oriented) storage and are read through query-based search schemes. Because of this mismatch, an intermediate object-relational mapping (ORM) step is required: it translates objects to records when writing data to a database, and records to objects when reading data from a database. This intermediate step is implemented in middleware that is detached from, and transparent to, the client application. Client calls to standard API read and write methods trigger the middleware functionality without the client needing to intervene. Advanced middleware systems let the client API formulate and pass a database query to use when reading from the database. The Hibernate library, an ORM persistence and query service for Java, can provide this service for an RDBMS. Hibernate allows you to express queries in its own portable SQL extension (HQL), as well as in native SQL. However, Hibernate is restricted to running at the client level, and does not address read-through or write-through caching.
Migrating Legacy Hibernate API Applications to GigaSpaces API
To benefit from data caching and other capabilities, it is worthwhile to migrate a legacy application that uses the Hibernate API to the GigaSpace or GigaMap API. Such applications can then scale by using the GigaSpaces Data Grid. This is achieved by partitioning the data across different spaces running on different machines, and colocating the business logic with each partition. The space and the business logic then run in the same memory address space, which eliminates remote calls when accessing the data.
The following tables show the correspondence between the basic Hibernate API methods and the GigaSpace API and Map API methods.
The "Moving from Hibernate to Space" best practice includes step-by-step instructions for moving a Hibernate-based application to the GigaSpaces Data Grid as the data access layer. It uses Hibernate as the space persistency layer, with a write-through approach for pushing updates into the database.
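The partitioning and colocation idea described in this section can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the GigaSpaces API: the class `PartitionedStore`, its methods, and the hash-modulo routing rule are all hypothetical, and each "partition" is just an in-memory map standing in for a space instance on its own machine.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: each partition is a plain map standing in for a space instance.
class PartitionedStore {
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedStore(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // Assumed routing rule: hash of the routing key modulo the partition count.
    // floorMod keeps the result non-negative even for negative hash codes.
    int partitionFor(String routingKey) {
        return Math.floorMod(routingKey.hashCode(), partitions.size());
    }

    void write(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    String read(String key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    // Runs business logic against the one partition that owns the key, so the
    // logic only touches local data (the colocation idea, in miniature).
    <R> R executeColocated(String routingKey, Function<Map<String, String>, R> task) {
        return task.apply(partitions.get(partitionFor(routingKey)));
    }
}
```

A caller might write `store.write("order-1", "pending")` and later run `store.executeColocated("order-1", data -> data.size())` to process only the data held by the owning partition.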
Caching Policies and the External Data Source
The External Data Source supports the All In Cache and LRU Cache policies.
All In Cache Policy
With the All In Cache policy, the assumption is that the space holds all of the data in memory. In this case, the space communicates with the data source at startup and loads all the data. If data within the space is updated, added, or removed, the space calls the relevant external data source interface to update the underlying data source. All data activities leverage the data in memory.
LRU Cache Policy - Read-Ahead
The LRU persistency model is based on the eviction model: some of the data is stored in memory (based on an auto-expiration mechanism or explicit data eviction), and all of the data is stored on disk, where the preferred disk medium is a database. You may use Hibernate as the mapping layer when data is persisted, or implement a custom persistency mapping using the External Data Source API.
Using a database to store the data allows you to:
Database technology has proven able to store vast amounts of data very efficiently and with very good high availability. You may use relational SQL databases (MySQL, Oracle, Sybase, DB2) or NoSQL databases (MongoDB, MarkLogic, AllegroGraph) as the space persistency layer.
With the LRU policy, the assumption is that only some of the data (the recently used part) is stored in memory. The amount of data stored in memory is limited by the cache size parameter, the memory usage watermark threshold parameters, and the available free GSC JVM heap size. In this case, once the space is started, it loads data up to 50% (you may tune this value) of the defined maximum cache size (the total number of objects per partition). If data within the space is updated, added, or removed, the space calls the relevant external data source interface to update the underlying data source. When a read operation for a single object (read/readById/readIfExists) finds no matching object in memory (a cache miss), the relevant external data source is called to search for matching data, which is loaded back into the space and from there sent to the client application (read-ahead). If a query is executed (readMultiple) and the maximum number of objects to read exceeds the number of matching objects in memory, the relevant external data source is called to search for matching data elements, which are loaded back into the space and from there sent to the client application. In this case, the client might receive both objects that were originally within the space and objects that were read from the external data source and loaded into the space as a result of the query operation.
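The single-object cache-miss behavior described above can be sketched with the standard `java.util.LinkedHashMap` in access order. This is a minimal illustration, not the GigaSpaces implementation: `LruReadThroughCache` and its methods are hypothetical names, the lookup function stands in for the external data source, and the watermark thresholds and initial load are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of an LRU cache with read-through on a cache miss.
class LruReadThroughCache<K, V> {
    private final Map<K, V> inMemory;
    private final Function<K, V> dataSourceLookup; // stands in for the external data source
    private int dataSourceReads = 0;

    LruReadThroughCache(final int maxSize, Function<K, V> dataSourceLookup) {
        this.dataSourceLookup = dataSourceLookup;
        // accessOrder = true orders entries from least to most recently used.
        this.inMemory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evicts from memory only; the data source still holds the entry.
                return size() > maxSize;
            }
        };
    }

    V read(K key) {
        V cached = inMemory.get(key);
        if (cached != null) {
            return cached; // cache hit: served from memory
        }
        dataSourceReads++; // cache miss: fall back to the data source
        V loaded = dataSourceLookup.apply(key);
        if (loaded != null) {
            inMemory.put(key, loaded); // load back into memory before returning
        }
        return loaded;
    }

    boolean isInMemory(K key) {
        return inMemory.containsKey(key); // does not change the LRU order
    }

    int dataSourceReads() {
        return dataSourceReads;
    }
}
```

With a maximum size of 2, reading keys `a`, `b`, `a`, `c` in that order leaves `a` and `c` in memory: `b` is the least recently used entry when `c` is loaded, so it is evicted, and a later read of `b` goes back to the data source.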
In both cases (ALL_IN_CACHE and LRU cache policy), you can customize the data load phase to speed up the space initialization phase.
The External Data Source
The space can load data from external data sources, store data into external data sources, and persist data into a relational data source or any other medium via a custom External Data Source implementation. The Hibernate External Data Source supports RDBMS. The Cassandra Space Persistency Solution allows applications to use the NoSQL Cassandra database, which provides a distributed database infrastructure, as an alternative to an RDBMS.
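The load and store roles of an external data source, combined with the All In Cache policy, can be sketched as follows. This is an illustration only: `SketchDataSource`, `MapBackedDataSource`, and `AllInCacheSpace` are hypothetical names and do not reproduce the GigaSpaces External Data Source API. A map stands in for the database table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface; method names are illustrative, not the GigaSpaces API.
interface SketchDataSource<K, V> {
    Map<K, V> initialLoad();    // called once, when the space starts
    void store(K key, V value); // called on every add or update
    void remove(K key);         // called on every removal
}

// Stand-in for a database: a map plays the role of the table.
class MapBackedDataSource<K, V> implements SketchDataSource<K, V> {
    final Map<K, V> table = new HashMap<>();

    public Map<K, V> initialLoad() {
        return new HashMap<>(table); // return a copy of everything in the table
    }

    public void store(K key, V value) {
        table.put(key, value);
    }

    public void remove(K key) {
        table.remove(key);
    }
}

// All In Cache sketch: everything is loaded at startup, reads never leave
// memory, and every change is written through to the data source.
class AllInCacheSpace<K, V> {
    private final Map<K, V> memory = new HashMap<>();
    private final SketchDataSource<K, V> dataSource;

    AllInCacheSpace(SketchDataSource<K, V> dataSource) {
        this.dataSource = dataSource;
        memory.putAll(dataSource.initialLoad());
    }

    V read(K key) {
        return memory.get(key); // served from memory only
    }

    void write(K key, V value) {
        memory.put(key, value);
        dataSource.store(key, value); // write-through
    }

    void remove(K key) {
        memory.remove(key);
        dataSource.remove(key); // removal is propagated as well
    }

    int size() {
        return memory.size();
    }
}
```

Swapping `MapBackedDataSource` for another implementation of the same interface changes where the data is persisted without touching the space logic, which is the point of routing persistence through a pluggable data source.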