Introduction

The OptIPuter is a distributed cyberinfrastructure that supports data-intensive scientific research and collaboration. The OptiStore project aims to develop a data management system that bridges the gap between very large data sets (usually three-dimensional volume data sets and time-varying spatial data sets) and compute-intensive applications in the context of the OptIPuter. In the OptIPuter model, distributed components such as rendering clusters, data storage clusters, and computation clusters are interconnected by a wide-area optical network. The data that a computation cluster needs at one location may therefore reside on remote data storage clusters at other locations. The goal of OptiStore is to access large amounts of data (from terabytes to petabytes) at remote locations, query them on the distributed servers, transfer them among OptIPuter components quickly and reliably, and transform them from one data model to another in real time.

Motivation

OptiStore is intended as a universal distributed data management system for very large time-varying spatial information. The data is usually stored in different locations and may be affiliated with different organizations. Typically, each local server organizes its data with its own data management system, such as a relational database, an object-oriented database, or a repository as simple as an indexed file system (in some cases, raster images and time-varying volume data exist as files). OptiStore should provide an interface to query these heterogeneous data repositories, access the spatial data, and maintain the systems. Visualization tools do not need to care where the data is, in what format it is organized, how to access it, or which portion of it to crop. They simply request the data through the OptiStore client API, and OptiStore handles the rest. In addition, OptiStore should support data modeling and data mining, which can discover relationships hidden within the raw data.

Related Work

1. LambdaRAM

LambdaRAM is an application being developed to address long-haul latency in optical networks. It pools memory across a compute cluster and allocates it as a cache to minimize the effects of latency over long-distance, high-speed networks. LambdaRAM takes advantage of multi-gigabit networks (available on the StarLight and OMNInet testbeds) to prefetch information before an application is likely to need it, much as RAM caches work in computers today, and extends this concept over high-speed networks. OptiStore will run on top of LambdaRAM. Just as a conventional database system manages memory buffers and disks, OptiStore treats local LambdaRAM as a large shared memory and remote LambdaRAM as a disk system, swapping the data that the application needs in and out of this client/server cache hierarchy.
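The swap-in/swap-out behavior described above can be illustrated with a small sketch. This is not the LambdaRAM or OptiStore API; the class name TwoTierCache, the least-recently-used eviction policy, the sequential prefetch rule, and the use of a plain dictionary to stand in for remote LambdaRAM are all assumptions made for illustration.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy model (hypothetical): a small local tier in front of a remote tier."""

    def __init__(self, remote, capacity, prefetch_depth=1):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.remote = remote                  # dict standing in for remote LambdaRAM
        self.capacity = capacity              # number of blocks the local tier holds
        self.prefetch_depth = prefetch_depth  # how many following blocks to prefetch
        self.local = OrderedDict()            # block_id -> data, least recent first
        self.hits = 0
        self.misses = 0

    def _swap_in(self, block_id):
        """Bring a block into the local tier, evicting old blocks if needed."""
        if block_id in self.local:
            self.local.move_to_end(block_id)  # mark as most recently used
            return
        if block_id not in self.remote:
            return                            # nothing to fetch
        self.local[block_id] = self.remote[block_id]
        while len(self.local) > self.capacity:
            self.local.popitem(last=False)    # swap out the least recently used block

    def read(self, block_id):
        """Return a block; raises KeyError if it exists in neither tier."""
        if block_id in self.local:
            self.hits += 1
        else:
            self.misses += 1
        self._swap_in(block_id)
        data = self.local[block_id]
        # Assumed access pattern: sequential, so prefetch the next blocks.
        for offset in range(1, self.prefetch_depth + 1):
            self._swap_in(block_id + offset)
        return data


remote = {i: f"block-{i}" for i in range(10)}
cache = TwoTierCache(remote, capacity=3, prefetch_depth=1)
for block_id in (0, 1, 2):
    cache.read(block_id)
```

With sequential reads, only the first access goes to the remote tier on demand; later blocks are already local because they were prefetched. A real system would prefetch asynchronously and move data over the network rather than copy it from a dictionary.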

2. LambdaStream

LambdaStream is a transport protocol designed specifically to support gigabit-level streaming, which is required by streaming applications over the OptIPuter. The protocol takes advantage of the characteristics of photonic networks. It adapts the sending rate to dynamic network conditions while maintaining a constant sending rate whenever possible. One advantage of this scheme is that the protocol avoids deliberately provoking packet loss when probing for available bandwidth, a common strategy in other congestion control schemes. Another is that it significantly decreases fluctuations in the sending rate, so streaming applications experience little jitter and react smoothly to congestion. The protocol also extends congestion control to an end-to-end scope: it distinguishes between different kinds of packet loss and updates the sending rate accordingly, thus increasing throughput.
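The rate-adaptation idea can be sketched as follows. This is not the LambdaStream algorithm: the congestion signal used here (packets arriving further apart than they were sent), the class name RateController, and every constant are assumptions chosen only to show a controller that holds a constant rate when possible, backs off on congestion, and does not treat every loss as congestion.

```python
from dataclasses import dataclass


@dataclass
class RateController:
    """Hypothetical sender-side rate controller; all parameters are illustrative."""

    rate_mbps: float
    target_rate_mbps: float           # rate the streaming application wants to hold
    min_rate_mbps: float = 1.0
    decrease_factor: float = 0.875    # assumed multiplicative back-off
    increase_step_mbps: float = 10.0  # assumed gentle recovery step
    tolerance: float = 0.05           # allowed relative stretch of packet spacing

    def update(self, send_interval_us, recv_interval_us, lost_packets):
        """Update the rate from one receiver report and return the new rate."""
        if send_interval_us <= 0:
            raise ValueError("send_interval_us must be positive")
        # Assumed congestion signal: the receiver sees packets spaced further
        # apart than the sender emitted them, so a queue is building up.
        stretch = (recv_interval_us - send_interval_us) / send_interval_us
        if stretch > self.tolerance:
            # Congestion: back off, but never below the floor.
            self.rate_mbps = max(self.min_rate_mbps,
                                 self.rate_mbps * self.decrease_factor)
        elif lost_packets > 0:
            # Loss without stretched spacing: assume it is not caused by
            # network congestion and hold the current rate.
            pass
        else:
            # Clean report: recover toward the target, then stay constant.
            self.rate_mbps = min(self.target_rate_mbps,
                                 self.rate_mbps + self.increase_step_mbps)
        return self.rate_mbps


controller = RateController(rate_mbps=800.0, target_rate_mbps=800.0)
steady = controller.update(100, 100, 0)     # no congestion: rate is held
backed_off = controller.update(100, 120, 0) # spacing stretched: rate decreases
held = controller.update(100, 101, 3)       # loss without congestion: rate is held
recovering = controller.update(100, 100, 0) # clean again: rate steps up
```

Because the controller reacts to packet spacing rather than waiting for drops, it never has to push the rate up until loss occurs, which is the property the paragraph above attributes to the protocol.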

3. Ethereon

Ethereon is the new version of the visualization tool. It merges the 2D scalable visualization tool JuxtaView and the volume rendering tool Vol-a-Tile into a new interface and is implemented with the support of LambdaRAM, OptiStore, and SAGE. Ethereon is one of the applications for which OptiStore provides data management services.

OptiStore Architecture

Figure: Diagram of the OptiStore architecture.