ReleaseToolsProposal

Proposal Name : ReleaseTools

Proposal Type : Improvement

Editor(s): DanCorwin

Input from: AlanRuttenberg

  1. Executive Summary
  2. Introduction
    1. Goals/Motivation
  3. Biological Questions / Use Cases
  4. Requirements
    1. User Requirements
    2. Practical Software Requirements
  5. Proposed Implementation
  6. Worked Examples
  7. Open Issues
  8. Expected growth and plan for growth
  9. Relation to external ontologies and plan for use
  10. OWL considerations
  11. Backward Compatibility
  12. Notes



Executive Summary

This page proposes a "core" ontology for future BioPAX.org releases. It would focus on improved top-level concepts and support for best-practice Semantic Web software R&D, together with established paradigms for scientific attribution, end-user training, validated DB conversions, and interoperability. The CorePhysicalOntologyGuide would guide both the nature of the ontology and the R&D methodology used to build it.

Special features include integrated web utilities to assist data suppliers; support for OWL-DL reasoners; and linkage to upper ontologies. Around this imported general core, an open suite of modular extension ontologies could arise and evolve, in which DX workers, external suppliers, data consumers and curators might all publish new models for exchanging triples on pathways. Standards too could become more modular, as BioPAX.org collected end-user feedback.



Introduction

Goals/Motivation

BioPAX 2.0 was released late, with bugs and caveats. The push to develop DX releases is also late by 3+ months, and it has not yet produced clear plans to improve BioPAX standards in OWL-DL, R&D processes, documentation quality, or interoperability with other major ontologies. The loss of BioPAX's credibility as a standards publisher, even within our membership, is a growing threat.

To counter such problems and dangers, I propose a new R&D approach to post-2.0 releases - the layering of release modules around a "core" ontology within which software engineering and data exchange, not biochemistry, take center stage. Relative to 2.0, "core" will address

A first goal is a 2.1 "core", released in roughly the same time frame as a DX module. It would be similar to 2.0 but generalized and formalized in its handling of the biochemical details of PARTICIPANTS, so that what remains provides a clear, formal base upon which the community can build BioPAX 3.0.

Biological Questions / Use Cases

Biologically speaking, "core" 2.1 will address pathway models in general, with minimal restrictions placed on PARTICIPANTS beyond solid documentation and attribution. Extensions can then add these restrictions, if, as and when their authors deem that desirable for particular needs.

  1. What will "core" remove relative to 2.0? - Basically, all "utility classes" and other imaginary or conflated concepts. DX releases can import them if required from BioPAX 2.0. To minimize confusion, every extension to "core" will be expected to use a separate name space, then import all external concepts it requires from available sources.

  2. What will "core" add relative to 2.0? - Whatever makes sense for its goals as a base, and can be defined in a reasonable number of months. New models will tap EXTERNAL engineering process, upper-ontology, and attribution paradigms, seeking a solid base for disciplined releases that support or exceed the original goals for 3.0.
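The namespace rule in item 1 can be illustrated with a small check. This is only a sketch under stated assumptions: the example.org URIs, the `check_extension` helper, and the use of `rdfs:subClassOf` as the only way an extension refers to outside concepts are all hypothetical and are not part of the proposal. Only the RDF, RDFS and OWL vocabulary URIs are standard.

```python
# Standard vocabulary URIs.
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Hypothetical URIs, invented for this illustration only.
CORE_ONT = "http://example.org/biopax/core-2.1"
CORE_NS = CORE_ONT + "#"
EXT_ONT = "http://example.org/my-lab/phospho-ext"
EXT_NS = EXT_ONT + "#"


def namespace_of(uri):
    """Return the URI up to and including its last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1]


def check_extension(ontology_uri, triples, own_ns):
    """Return a list of problems; an empty list means both rules hold.

    Rule 1: every class the extension declares lives in its own namespace.
    Rule 2: every outside superclass comes from an imported ontology.
    """
    problems = []
    imported = {o for s, p, o in triples
                if s == ontology_uri and p == OWL_IMPORTS}
    for s, p, o in triples:
        if p == RDF_TYPE and o == OWL_CLASS and namespace_of(s) != own_ns:
            problems.append("declares class outside own namespace: " + s)
        if p == RDFS_SUBCLASS and namespace_of(o) != own_ns:
            if namespace_of(o).rstrip("#/") not in imported:
                problems.append("uses term from non-imported source: " + o)
    return problems


# A hypothetical extension that imports "core" and adds one subclass.
extension = [
    (EXT_ONT, OWL_IMPORTS, CORE_ONT),
    (EXT_NS + "Phosphorylation", RDF_TYPE, OWL_CLASS),
    (EXT_NS + "Phosphorylation", RDFS_SUBCLASS, CORE_NS + "Interaction"),
]
print(check_extension(EXT_ONT, extension, EXT_NS))
```

Dropping the `owl:imports` triple from the list makes the check report the `Interaction` superclass as coming from a source that was never imported.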

Primary use cases for "core" 2.1 are restoring confidence in BioPAX.org's ability to deliver on the software goals that it publicly set for itself; re-establishing ontological quality as the top working group value; speeding validated data conversions; and experimentally discovering how 3.0 goals can best be advanced by using all available tools, paradigms and standards for industrial information exchange and professional software release processes.

Requirements

User Requirements

  1. See "BioPAX-discuss" thread on "Some Requirements", starting May 25, 2006

  2. Continued expansion of this email thread and related spinoffs is encouraged

  3. (Others will be added after on-going analysis by "core" participants)

Practical Software Requirements

  1. Commonly used upper ontology classes and properties will be integrated

  2. Design patterns from other metadata-exchange efforts will be supported

  3. The standards imposed on "core" ontology models will include OWL-DL

  4. (Others will be added after on-going analysis by "core" participants)

Proposed Implementation

  1. Core 2.1 will itself be assembled from separate, integrated modules

  2. Core concepts will be documented using Self-Annotating_Identifers

  3. Model-attribution standards will let Dublin Core metadata be located

  4. Models will conform to principles in the CorePhysicalOntologyGuide
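As a rough illustration of item 3, the sketch below attaches Dublin Core metadata to a model and then locates it again. The Dublin Core element namespace and the `creator`, `date` and `source` elements are standard; the model URI, the placeholder values, and all function names are assumptions made for this example, not attribution standards defined by the proposal.

```python
# Standard Dublin Core element set namespace.
DC = "http://purl.org/dc/elements/1.1/"


def attribution_triples(model_uri, creator, date, source):
    """Build (subject, predicate, literal) triples attributing a model."""
    return [
        (model_uri, DC + "creator", creator),
        (model_uri, DC + "date", date),
        (model_uri, DC + "source", source),
    ]


def to_ntriples(triples):
    """Serialize literal-valued triples as N-Triples lines.

    Minimal escaping only: backslash and double quote.
    """
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('<%s> <%s> "%s" .' % (s, p, escaped))
    return "\n".join(lines)


def find_attribution(triples, model_uri):
    """Locate the Dublin Core metadata recorded for one model."""
    return {p[len(DC):]: o for s, p, o in triples
            if s == model_uri and p.startswith(DC)}


# Placeholder values; none of these come from a real BioPAX model.
model = "http://example.org/models/core-example"
metadata = attribution_triples(
    model, "Example Curator", "2006-01-01", "http://example.org/source-db")
print(to_ntriples(metadata))
print(find_attribution(metadata, model))
```

Keeping the metadata as ordinary triples about the model URI means any RDF-aware tool can find it without knowing anything else about the model's contents.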

Worked Examples

Explanations of these are posted in the wiki - see OntologyDesignPatterns.

  1. The basic patterns for triple exchange-formats include associations:

  2. Ways to restrict extensions are exemplified for phosphorylation

  3. A web utility agent for DB conversion is exemplified for interactions
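The conversion idea in item 3 can be sketched in a few lines. This is not the example posted under OntologyDesignPatterns; the CSV column names, the example.org namespace, and the `hasParticipant` and `interactionType` property names are all hypothetical, chosen only to show how tabular interaction records might be turned into triples.

```python
import csv
import io

# Hypothetical namespace for this illustration only.
EX = "http://example.org/interactions#"


def rows_to_triples(csv_text):
    """Turn CSV rows (id, left, right, type) into a list of triples."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        interaction = EX + row["id"]
        # One association triple per participant in the interaction.
        triples.append((interaction, EX + "hasParticipant", EX + row["left"]))
        triples.append((interaction, EX + "hasParticipant", EX + row["right"]))
        # The interaction type is kept as a plain string value.
        triples.append((interaction, EX + "interactionType", row["type"]))
    return triples


sample = "id,left,right,type\ni1,proteinA,proteinB,binding\n"
for triple in rows_to_triples(sample):
    print(triple)
```

A real conversion agent would also need validation and attribution steps, which this sketch leaves out.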

Open Issues

  1. Timing and content of initial "core" releases (TBDL by the SW group)

  2. Acceptable exchange file formats (OWL, RDF, XTM, CSV, etc.)

Expected growth and plan for growth

  1. Biological extensions from many parties can be layered on top of "core"

  2. Each new layer should take an incremental step toward a 3.0 class tree

  3. BioPAX.org will solicit submissions and feedback from expert sources

  4. Periodic 2.x standard "core" upgrades can arise under group governance

Relation to external ontologies and plan for use

  1. Interoperability use cases benefit from shared upper-ontology concepts

  2. User-built ontologies can also be developed widely as extensions to "core"

  3. Exchange files based on open combinations of both are encouraged

  4. Relative to 2.0, this should "open up" BioPAX use cases considerably

  5. It should considerably boost compatibility, with GO most especially

OWL considerations

  1. OWL tools continue to operate only in early release stages

  2. "Core" R&D will offer suggested ways to address this problem

  3. Design Patterns (see Worked Examples) are a central method

Backward Compatibility

  1. 2.0-compatible "core" extensions should become possible in Q3

  2. If usage guides were followed, few or no 2.0 instances should need changes

  3. Proposed DX releases suggest very different (incompatible) standards

Notes

  1. Please use "discuss" email, not wiki pages, to debate the above so that dates, authors, and opinions go into a more clearly persistent group record.

  2. Discussion below can summarize conclusions.



Discuss this proposal (see note 1).

last edited 2006-09-29 18:05:29 by DanCorwin