RVO, the Research Variable Ontology, proposes a schema that enterprises can use to record their data analytics experiments and to build a knowledge base for learning and recommendation support. RVO is designed around research variables, which form the basis of the hypotheses that analysts test by building models. RVO can answer a range of questions that data analysts raise when they start designing a solution, and it can provide recommendations and alternatives. All facts and expert knowledge recorded in RVO are traceable to their origin (i.e. a person, a publication, or a validated model). RVO follows best practices in ontology design and reuses existing data models and vocabularies (such as DBPedia, RDF-Cube, FABIO) to facilitate efficient reuse in real-world applications through semantic technologies and open standards (RDF, OWL, SPARQL).
The purpose of RVO is to:
The components of the ontology are shown in the figure below. The main classes of the ontology are Variable, Measure, Model, LinkedVariables and Origin. The LinkedVariables class captures a link between any two variables. LinkType represents the details of the different links between variables as an extendable catalogue of link types (Causal, InvestigativeCausal, NoLink). The Origin class represents details about the origin of, or reference to, any concept. The main origin types in RVO are ViaModel, ViaLiterature, and FromExperts. One strength of this ontology is that it can be linked with any domain-specific ontology via the operationalized relationship: any concept (thing or property) defined in a third-party ontology can be treated as an rvo:Concept and linked to a variable. In this way, the context of the variable is readily available through the domain-specific ontology.
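As a rough illustration of how these classes relate, the following Python sketch models a variable link together with its origin. It is not RVO's actual schema or any official API: the field names (`operationalizes`, `reference`), the helper `links_for`, the `example.org` IRI, and the sample variables are all hypothetical and chosen only for illustration. Only the class names and the link and origin type names come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class LinkType(Enum):
    # Link types named in the text; the catalogue is described as extendable.
    CAUSAL = "Causal"
    INVESTIGATIVE_CAUSAL = "InvestigativeCausal"
    NO_LINK = "NoLink"


class OriginType(Enum):
    # Main origin types named in the text.
    VIA_MODEL = "ViaModel"
    VIA_LITERATURE = "ViaLiterature"
    FROM_EXPERTS = "FromExperts"


@dataclass(frozen=True)
class Concept:
    iri: str  # IRI of a thing or property in a third-party ontology


@dataclass(frozen=True)
class Variable:
    name: str
    # Hypothetical field standing in for the "operationalized" relationship.
    operationalizes: Optional[Concept] = None


@dataclass(frozen=True)
class Origin:
    origin_type: OriginType
    reference: str  # hypothetical: who or what the fact is traced back to


@dataclass(frozen=True)
class LinkedVariables:
    source: Variable
    target: Variable
    link_type: LinkType
    origin: Origin


def links_for(variable: Variable, links: List[LinkedVariables]) -> List[LinkedVariables]:
    """Return every recorded link in which the variable takes part."""
    return [link for link in links if variable in (link.source, link.target)]


# Illustrative data only; none of these values come from RVO itself.
income = Variable("household_income", Concept("http://example.org/domain#Income"))
spending = Variable("monthly_spending")
links = [
    LinkedVariables(
        source=income,
        target=spending,
        link_type=LinkType.CAUSAL,
        origin=Origin(OriginType.FROM_EXPERTS, "Domain expert A"),
    )
]

for link in links_for(income, links):
    print(link.source.name, link.link_type.value, link.target.name,
          "origin:", link.origin.origin_type.value)
```

Because every `LinkedVariables` instance carries an `Origin`, each recorded link stays traceable to the person, publication, or model that established it, which is the property the ontology emphasizes.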
|Concept||Concept or Thing refers to an abstraction of any entity that may exist in an ontology.|
|Variable||Variables exist as metrics to quantify concepts.|
|Measure||Measures are the metrics that represent actual values for a variable.|
|Variable Link||A link describes a connection between two variables, the nature of this connection and how this connection was established.|
|Origin||Origin is where a concept (e.g. Variable, Variable Link, Measure) is first defined or mentioned. RVO has three origins: an expert person, a reference to a publication, and an analytics model.|
|Link Type||Link Type describes the type of link that exists between two or more variables. Examples of Link Types are: Causal, InvestigativeCausal, NoLink.|
|Data Source||Data Source refers to the source from which the data originates.|
|Dataset||Dataset refers to the data file provided by a data source, which contains the data for one or more measures. We recommend publishing datasets following the RDF-Cube structure.|
|Data Structure||Data Structure refers to the metadata for a dataset, such as which measures are captured and their units of measure.|
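
The relationship between Data Source, Dataset, and Data Structure in the table above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not RVO's actual vocabulary: the field names, the helper `datasets_with_measure`, and all sample values (file names, measure names, units) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DataSource:
    name: str  # hypothetical: label for where the data comes from


@dataclass
class DataStructure:
    # Metadata for a dataset: measure name -> unit of measure.
    measure_units: Dict[str, str]


@dataclass
class Dataset:
    file_name: str
    source: DataSource
    structure: DataStructure


def datasets_with_measure(measure: str, datasets: List[Dataset]) -> List[Dataset]:
    """Return the datasets whose structure declares the given measure."""
    return [d for d in datasets if measure in d.structure.measure_units]


# Illustrative data only.
survey = DataSource("Household survey provider")
datasets = [
    Dataset("survey_2020.csv", survey,
            DataStructure({"income_usd": "USD", "household_size": "persons"})),
    Dataset("weather_2020.csv", DataSource("Weather service"),
            DataStructure({"rainfall": "mm"})),
]

for d in datasets_with_measure("income_usd", datasets):
    print(d.file_name, "from", d.source.name,
          "unit:", d.structure.measure_units["income_usd"])
```

Keeping the structure metadata separate from the data file mirrors the table's distinction between a Dataset and its Data Structure, and makes it possible to find which datasets carry a given measure without opening the files.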