A Path Approach to Processing and Retrieving Large Scale Linked Data within a Super-Computing Environment

May 14th, 2015

Categories: MS / PhD Thesis, Supercomputing

M. Lewis, PhD candidate


EVL PhD candidate Michael Lewis presents his thesis research:
Thursday, May 14, 2015, 11:00 AM
EVL Cyber-Commons
Room 2068 ERF

Many large-scale clustered systems have incorporated infrastructure for processing Resource Description Framework (RDF) data in order to provide richer semantics and finer-grained access to the data. As the amount of RDF data expands at a rapid pace, it is critical that query systems be able to process it at scale and provide users with new ways to correlate linked data. Much research has been done on distributed processing and retrieval of linked data for conjunctive queries; however, the current research and the existing frameworks have only been shown to scale on clusters of limited size. Query languages have been developed to formulate triple-pattern-based expressions and limited path expressions for retrieving properties and graph primitives covering a specified path; however, these queries are designed for frameworks of limited scale, and no RDF query language can cover a range of connectedness expressions. This research introduces an in-memory approach to processing and retrieving RDF queries that can scale to thousands of machines within a super-computing environment. In addition, this research specifies an approach for the user to query key/URI correlations.