Introduction

RDF Editor was created in response to a unique situation in which Linked Data was exchanged between parties, with around 200 million triples (or facts) per exchange. Exchanging this volume of data creates its own problems. At my agency, experts can dedicate 10% of their time to study, and I decided to use this study time to investigate solutions to the problem I encountered.

The results are presented here. I assume that the reader is familiar with Linked Data and Lisp. If not, here are some resources to get you started:

Linked Data

  1. Introduction to Linked Data (by Linked Data Tools): Intro to Linked Data

  2. What is Linked Data? (W3C): W3C Linked Data Overview

  3. Tim Berners-Lee’s Design Principles for Linked Data: Linked Data Principles

  4. W3C Linked Data Primer: Linked Data Primer (W3C)

  5. LOD Cloud: Linked Open Data Cloud Diagram

  6. RDF, Resource Description Framework (W3C): W3C RDF Overview

  7. SPARQL Query Language for RDF (W3C): W3C SPARQL Standard

  8. OWL, Web Ontology Language (W3C): W3C OWL Overview

  9. JSON-LD 1.1, A JSON-based Serialization for Linked Data (W3C): W3C JSON-LD Documentation

  10. SKOS, Simple Knowledge Organization System (W3C): W3C SKOS Reference

  11. DASH Data Shapes (TopQuadrant): not an official standard, but material that could become one. Highly recommended reading, and the foundation that makes RDF Editor a feasible solution.

Lisp

  • Beginner:

    • Practical Common Lisp: A practical introduction to Common Lisp, excellent for grasping fundamental concepts. (Focuses on CL but concepts transfer to other Lisps)
  • Beginner/Intermediate (Specifically Elisp):

  • Intermediate:

    • On Lisp: A comprehensive study of advanced Lisp techniques, with bottom-up programming as the unifying theme.

Central Questions

When I started this project, two key questions arose:

  1. Is it possible to understand the contents of a dataset without prior communication about it?
  2. Is it possible to create a Linked Data editor that works with data stored in an RDF store, rather than a file?

A third question came up while working on these two:

  3. Is it possible to achieve this using open-source tools?

Understanding Large Datasets

Ontologies and concept schemes are integral to Linked Data, and many tools are available for exploring them. Protégé, for example, allows viewing class hierarchies, and there are numerous other options.

However, many tools struggle when the ontology is combined with the instance data. For example, Protégé can display both the ontology and the instance data, but it struggles (taking 20 minutes or more) when the ontology contains 25,000 classes and the data comprises more than 200 million triples. In such cases, SPARQL queries become the only viable option. My goal was to enable quick insight into the structure of the data and easy access to the associated data.
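To give an idea of the kind of SPARQL query that yields a first impression of a dataset's structure, here is a minimal sketch that counts instances per class. It is not the query RDF Editor itself uses: the function name `class_overview_query` and its `limit` parameter are assumptions made for illustration, and the query relies only on standard SPARQL 1.1 aggregates.

```python
def class_overview_query(limit=25):
    """Build a SPARQL 1.1 query that counts instances per class.

    Hypothetical helper for illustration; not part of RDF Editor.
    """
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return "\n".join([
        # One row per class, with the number of subjects typed as that class.
        "SELECT ?class (COUNT(?s) AS ?instances)",
        "WHERE { ?s a ?class }",
        "GROUP BY ?class",
        # Largest classes first, so the dominant structure shows up at the top.
        "ORDER BY DESC(?instances)",
        f"LIMIT {limit}",
    ])


# Print the query text; it can be pasted into any SPARQL 1.1 endpoint.
print(class_overview_query(10))
```

On a store with hundreds of millions of triples, an aggregate like this may itself take a while to run, so keeping the `LIMIT` small and running it once per exchange is probably the sensible way to use it.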

Editing RDF Data

RDF data is often file-based during transit. Eventually, the data is stored in an RDF store. If minor changes are required, the input file must be edited and reloaded, which can be a challenge because of the sheer file size. I wanted to know whether it was possible to edit data directly within the server. While SPARQL Update offers INSERT and DELETE operations, I wanted to avoid writing SPARQL by hand for simple updates.
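The kind of statement that would otherwise have to be written by hand for a small correction can be sketched as follows. This is an illustrative assumption, not RDF Editor's implementation: the helper `replace_value_update` is hypothetical, the `http://example.org/` IRIs are placeholders, and the sketch only handles plain single-line string literals.

```python
def replace_value_update(subject, predicate, old_literal, new_literal):
    """Build a SPARQL 1.1 Update request that swaps one literal for another.

    Hypothetical helper for illustration; not part of RDF Editor.
    """
    def quote(text):
        # Escape backslashes and double quotes for a SPARQL string literal.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return "\n".join([
        # Remove the old triple, then add the corrected one.
        # The ';' separates the two operations within a single request.
        f"DELETE DATA {{ <{subject}> <{predicate}> {quote(old_literal)} }} ;",
        f"INSERT DATA {{ <{subject}> <{predicate}> {quote(new_literal)} }}",
    ])


# Placeholder IRIs and values, for demonstration only.
print(replace_value_update(
    "http://example.org/item/1",
    "http://example.org/label",
    "Old label",
    "New label",
))
```

Even for a one-value change, the request needs the full subject and predicate IRIs plus the exact old value, which is the sort of overhead an editor working directly on the store is meant to hide.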

Open Source

Having worked with open-source software for most of my career, I am biased towards open-source solutions. Therefore, I decided that part of the challenge would be to use open-source tools. As an avid Emacs user, I found Emacs the natural choice. My limited programming experience also motivated me to become more proficient in that area.