Alan Ruttenberg/Re-expressing a use case

The purpose of this page is to make a start at clarifying use cases, and at separating each use case from any particular implementation, so that as we potentially embark on a choice of representations in 2.5/3.0 or DX/SW we can share these cases and write our proposals against them.

As an example, I've taken one of the biological questions from the StatesProposal, and tried to rework it into a suitable format for reference in an implementation proposal.


Original:

How are signals from NfKB transmitted to p53?

Graph queries, such as shortest path or their modified forms can be helpful for this case. However traversing over reference physical entities for such a question is not meaningful, as for example not every signal received by B-catenin from NfKB is transmitted to p53. We need to traverse the graph where nodes are states (in BioPAX Level 2, this would require us to identify identical PEPs, which is inconvenient).


To define this case more clearly, we need to be more specific.

Definitions

Edge: A pair of physical entities related in some way. Let's write the edge as ei = (vi1,ri,vi2) (v stands for vertex, i is an index, ri is the relation between the two).

Edge constructor: Some procedure that, given a BioPAX structure, creates all edges that make sense for the task you want to do. A constructor would define, for example, what sorts of things are vertices, what the set of relations is, and how specific BioPAX things are transformed into them. It should be noted that the specific constructors can vary from task to task.

Connected edge: A pair of edges e1 = (v11,r1,v12), e2 = (v21,r2,v22) where v12 =conn v21. The comparison operator =conn is a function that lets us compare whether two physical entities are the same for the purpose at hand. It should be noted that different comparison operators might be used for different tasks.

Path: A sequence of edges e1, e2, ..., en where each consecutive pair ei, ei+1 is a connected edge. n is the path length.

Path between two entities A and B: A path e1..en where (v11 =endpoint A) and (vn2 =endpoint B), that is, the first vertex of the first edge matches A and the second vertex of the last edge matches B. The comparison operator =endpoint is a function that lets us compare whether two physical entities are the same for the purpose at hand. It should be noted that different comparison operators might be used for different tasks, and that =endpoint may differ from =conn.
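As a rough illustration of the definitions above, here is a minimal Python sketch. The names Edge, is_path and is_path_between are made up for this page, vertices are plain strings, and both comparison operators are passed in as functions, since the definitions leave them open.

```python
from typing import Callable, NamedTuple, Sequence

# A comparison operator (=conn or =endpoint) is any two-argument predicate.
Eq = Callable[[str, str], bool]


class Edge(NamedTuple):
    """An edge e_i = (v_i1, r_i, v_i2)."""
    source: str    # v_i1
    relation: str  # r_i
    target: str    # v_i2


def is_path(edges: Sequence[Edge], conn_eq: Eq) -> bool:
    """True if the sequence is non-empty and every consecutive pair is connected."""
    return len(edges) > 0 and all(
        conn_eq(first.target, second.source)
        for first, second in zip(edges, edges[1:])
    )


def is_path_between(edges: Sequence[Edge], a: str, b: str,
                    conn_eq: Eq, endpoint_eq: Eq) -> bool:
    """True if edges form a path whose ends match A and B under =endpoint."""
    return (
        is_path(edges, conn_eq)
        and endpoint_eq(edges[0].source, a)
        and endpoint_eq(edges[-1].target, b)
    )


def same(x: str, y: str) -> bool:
    """Placeholder comparison: plain string equality."""
    return x == y


# "A", "M" and "B" are placeholder vertex names, not real entities.
candidate = [Edge("A", "controls", "M"), Edge("M", "controls", "B")]
print(is_path_between(candidate, "A", "B", same, same))
```

Keeping the comparison operators as parameters mirrors the point that they can vary from task to task.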

The shortest path problem can now be stated (for example) as follows: given two entities A and B, find a path between A and B whose path length n is minimal.

An implementation proposal should specify desired edge constructors, =endpoint and =conn and how they are to be computed from the proposed BioPAX representation.
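To show how these three parameters fit together, here is a small sketch of a shortest path search, assuming the edges have already been produced by some edge constructor. It uses a breadth-first search over edges, which is only one possible technique; the function name shortest_path and the tuple layout are assumptions made for this example.

```python
from collections import deque
from typing import Callable, List, Optional, Tuple

Edge = Tuple[str, str, str]       # (v1, relation, v2)
Eq = Callable[[str, str], bool]   # =conn or =endpoint


def shortest_path(edges: List[Edge], a: str, b: str,
                  conn_eq: Eq, endpoint_eq: Eq) -> Optional[List[Edge]]:
    """Return a path from A to B with the fewest edges, or None if there is none."""
    queue = deque()
    visited = set()
    # Seed the search with every edge whose first vertex matches A under =endpoint.
    for i, edge in enumerate(edges):
        if endpoint_eq(edge[0], a):
            queue.append([i])
            visited.add(i)
    while queue:
        path = queue.popleft()
        last = edges[path[-1]]
        # Stop as soon as the last vertex matches B under =endpoint.
        if endpoint_eq(last[2], b):
            return [edges[i] for i in path]
        # Extend the path with every unvisited edge connected under =conn.
        for j, nxt in enumerate(edges):
            if j not in visited and conn_eq(last[2], nxt[0]):
                visited.add(j)
                queue.append(path + [j])
    return None


def same(x: str, y: str) -> bool:
    """Placeholder comparison: plain string equality."""
    return x == y


# Placeholder vertices: a two-edge route and a three-edge route from A to B.
example_edges = [
    ("A", "controls", "M"), ("M", "controls", "B"),
    ("A", "controls", "N"), ("N", "controls", "O"), ("O", "controls", "B"),
]
print(shortest_path(example_edges, "A", "B", same, same))
```

The scan over all edges at each step is quadratic in the number of edges; it is kept that way here because =conn is an arbitrary predicate and cannot be assumed to support indexing.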


Example

Some example edge constructors based on BioPAX level 2:

Given a catalysis with CONTROLLER a physicalEntityParticipant with PHYSICAL-ENTITY protein C, and CONTROLLED a biochemicalReaction which has a RIGHT which is a physicalEntityParticipant with PHYSICAL-ENTITY protein P, create an edge (C,controls,P).

Given a catalysis with CONTROLLER a physicalEntityParticipant with PHYSICAL-ENTITY complex C, and CONTROLLED a biochemicalReaction which has a RIGHT which is a physicalEntityParticipant with PHYSICAL-ENTITY protein P, then for each COMPONENT of C, a physicalEntityParticipant with PHYSICAL-ENTITY CPi, create an edge (CPi,controls,P).
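The two constructors above could be sketched roughly as follows. This assumes a toy nested-dictionary layout invented for this example (a "type" key, a "name" key, and keys named after the properties mentioned above); it is not the BioPAX OWL serialization or any existing library's data model.

```python
from typing import Dict, Iterator, List, Tuple

Edge = Tuple[str, str, str]  # (v1, relation, v2)


def controls_edges(catalysis: Dict) -> Iterator[Edge]:
    """Yield (controller, "controls", product) edges for one catalysis."""
    controller = catalysis["CONTROLLER"]["PHYSICAL-ENTITY"]
    reaction = catalysis["CONTROLLED"]
    if reaction["type"] != "biochemicalReaction":
        return
    if controller["type"] == "protein":
        # First constructor: the controlling protein itself is the source vertex.
        sources: List[str] = [controller["name"]]
    elif controller["type"] == "complex":
        # Second constructor: each component of the complex is a source vertex.
        sources = [part["PHYSICAL-ENTITY"]["name"]
                   for part in controller["COMPONENT"]]
    else:
        sources = []
    for participant in reaction["RIGHT"]:
        product = participant["PHYSICAL-ENTITY"]
        if product["type"] == "protein":
            for source in sources:
                yield (source, "controls", product["name"])


# Toy data using the placeholder names from the text (C, CP1, CP2, P).
reaction = {
    "type": "biochemicalReaction",
    "RIGHT": [{"PHYSICAL-ENTITY": {"type": "protein", "name": "P"}}],
}
by_protein = {
    "CONTROLLER": {"PHYSICAL-ENTITY": {"type": "protein", "name": "C"}},
    "CONTROLLED": reaction,
}
by_complex = {
    "CONTROLLER": {"PHYSICAL-ENTITY": {
        "type": "complex",
        "name": "C",
        "COMPONENT": [
            {"PHYSICAL-ENTITY": {"type": "protein", "name": "CP1"}},
            {"PHYSICAL-ENTITY": {"type": "protein", "name": "CP2"}},
        ],
    }},
    "CONTROLLED": reaction,
}
print(list(controls_edges(by_protein)))
print(list(controls_edges(by_complex)))
```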

An example vertex comparison function, based on the edges above:

A =conn B is true if there is some A XREF a unificationXref X1, and some B XREF a unificationXref X2, such that the DB strings of X1 and X2 are the same and the ID strings of X1 and X2 are the same. =endpoint is defined to be the same as =conn.
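A minimal sketch of this comparison function, again over a toy dictionary layout invented for the example (each entity carries an "XREF" list whose items have "type", "DB" and "ID" keys). The database name and identifiers in the sample data are made up.

```python
from typing import Dict, Set, Tuple


def unification_keys(entity: Dict) -> Set[Tuple[str, str]]:
    """Collect the (DB, ID) pairs of the entity's unificationXrefs."""
    return {
        (xref["DB"], xref["ID"])
        for xref in entity.get("XREF", [])
        if xref.get("type") == "unificationXref"
    }


def conn_eq(a: Dict, b: Dict) -> bool:
    """A =conn B: some unificationXref of A matches one of B on both DB and ID."""
    return not unification_keys(a).isdisjoint(unification_keys(b))


# In this example =endpoint is defined to be the same as =conn.
endpoint_eq = conn_eq

# Made-up entities and identifiers, for illustration only.
first = {"XREF": [{"type": "unificationXref", "DB": "ExampleDB", "ID": "0001"}]}
second = {"XREF": [
    {"type": "unificationXref", "DB": "ExampleDB", "ID": "0001"},
    {"type": "unificationXref", "DB": "OtherDB", "ID": "77"},
]}
third = {"XREF": [{"type": "unificationXref", "DB": "OtherDB", "ID": "0001"}]}
print(conn_eq(first, second), conn_eq(first, third))
```

Note that entities with no unificationXref at all never compare equal under this definition, not even to themselves.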

Citing the use case in an implementation proposal

I would call this a computational use case motivated by a biological use case, since there may be other ways to compute an answer to the question "How are signals from NfKB transmitted to p53?". To structure this, I would name the biological use case something like "How are signals transmitted from A to B", and treat this shortest path problem as a computational use case that can provide an answer to the biological one. An implementation proposal would then claim that it answers biological use cases by reference to both the biological and the computational cases that it proposes to support.

Comments on the original

In the original it appears that =endpoint doesn't depend on the state variables, as no particular state of p53 is mentioned, and that =conn does depend on the state variables. It appears that vertices may be complexes (NfKB). The precise comparison operators are not completely specified. The edge constructors are not specified. (The editor of that proposal is invited to correct any mistake in this assessment.)

In the normal process of BioPAX development, we would negotiate until everyone agreed on the use case statement. After that, any implementation proposal that purports to support this use case would define the parameters, in this case =conn, =endpoint and the edge constructors. Based on that we should be able to evaluate whether it satisfies the use case, and compare a particular solution to others.

/Discussion

last edited 2006-01-07 08:27:30 by AlanRuttenberg