Query optimization - Prolog logic gates aggregation recursion optimisation
I am trying to implement a logic-gate type of aggregation operation, and I am having trouble writing an implementation that performs the calculations in a reasonable amount of time. I think what I have works logically, but it is slow, and I don't think it needs to be. I think it should be possible to do this without using so many findall calls or cuts.
I have a table of approximately 10,000 columns and 70 rows. The rows correspond to samples and the columns to probes. Each value in the table is either 1 or 0 (the state of the probe in that sample).
Multiple probes code for a protein (a many-to-one relationship). I want to aggregate the probe columns into protein columns with a logical OR operation.
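As a minimal sketch of the OR aggregation described above: the predicate names probe_protein/2, probe_state/3 and protein_state/3 and the toy facts are assumptions made for illustration, not the predicates from the question. The if-then-else commits to the first probe found at 1, so no list of values is built. It assumes Sample and Protein are bound when called.

```prolog
% Hypothetical toy facts (not from the question):
% probe_protein(Probe, Protein): many probes map to one protein.
probe_protein(p1, prot_a).
probe_protein(p2, prot_a).
probe_protein(p3, prot_b).

% probe_state(Sample, Probe, State): one cell of the sample/probe table.
probe_state(s1, p1, 0).
probe_state(s1, p2, 1).
probe_state(s1, p3, 0).

% protein_state(+Sample, +Protein, -State)
% Logical OR over the probes of Protein: State is 1 as soon as one probe
% is found at 1 in Sample, otherwise 0.
protein_state(Sample, Protein, State) :-
    (   probe_protein(Probe, Protein),
        probe_state(Sample, Probe, 1)
    ->  State = 1
    ;   State = 0
    ).
```

With these facts, the query protein_state(s1, prot_a, S) binds S to 1 and protein_state(s1, prot_b, S) binds S to 0. Note that in this sketch a protein with no probes at all is also reported as 0, which may or may not be what is wanted.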
In addition, proteins can be part of a protein complex or a protein set. Both protein complexes and protein sets, in addition to containing proteins, may in turn contain protein complexes or protein sets, so the relation can be recursive. I want to model protein sets as OR gates and protein complexes as AND gates. I collectively refer to proteins, protein_sets and complexes as 'entities'.
So overall, I want to have a predicate that I can ask whether a protein or entity is on or off in a sample, and that works quickly.
If any of the other predicates are not clear, I can explain what they do.
protein(Sample, Reactome_ID, State) :-
    setof(Sample, Probe^samples(Sample, Probe, ProbeValue), Samples), % samples/3 is a set of facts that corresponds to the table described above
    member(X, Samples),   % used to generate the sample ids; this seems wasteful
    protein_reactome_id_to_uniprot_id(Reactome_ID, UniprotID), % a set of facts matching the two types of id; used to generate the UniProt ids
    findall(Value, uniprot_sample_probes(UniprotID, X, _, Value), Vs),
    Vs = [_|_],           % check that the list is not empty
    delete(Vs, 0, ListOfOnes),
    (   ListOfOnes = []
    ->  State = 0, write('off')
    ;   State = 1, write('on')
    ).
% As this is an OR, I think I should be able to find a single 1 and cut in that case, and if that is not possible then it is off.

% If the (simple) entity is a protein set and its state is on.
% This is the base case: the entity does not have complexes or sets inside it.
state_of_entity(Entity, State, Sample) :-
    all_children_proteins(Entity),   % checks that all children are of type protein
    type(Entity, protein_set),
    child_component(Entity, Child),  % generates the children of the entity
    protein(Sample, Child, 1),
    State = 1, !.

% If the (simple) entity is a protein set and its state is off.
% This is the base case: the entity does not have complexes or sets inside it.
% I find all the proteins for the sample, get the list of values, delete the
% zeros, and the remaining list should unify with the empty list.
state_of_entity(Entity, State, Sample) :-
    all_children_proteins(Entity),
    type(Entity, protein_set),
    child_component(Entity, Child),
    bagof(Value, Value^protein(Sample, Child, Value), Vs),
    delete(Vs, 0, ListOfOnes), ListOfOnes = [],
    State = 0, !.

% If the (simple) entity is a complex and it is off.
% This is the base case: the entity does not have complexes or sets inside it.
state_of_entity(Entity, State, Sample) :-
    all_children_proteins(Entity),
    type(Entity, complex),
    child_component(Entity, Child),
    protein(Sample, Child, 0),
    State = 0, !.

% If the (simple) entity is a complex and it is on.
% This is the base case: the entity does not have complexes or sets inside it.
% I find the proteins in the sample, get the list of values, delete the
% ones, and the remaining list should unify with the empty list.
state_of_entity(Entity, State, Sample) :-
    all_children_proteins(Entity),
    type(Entity, complex),
    child_component(Entity, Child),
    bagof(Value, Value^protein(Sample, Child, Value), Vs),
    delete(Vs, 1, ListOfZeros), ListOfZeros = [],
    State = 1, !.

% If a complex with components is off.
% Recursive case.
state_of_entity(Entity, State, Sample) :-
    type(Entity, complex),
    child_component(Entity, Child),
    (   state_of_entity(Child, 0, Sample)
    ;   protein(Sample, Child, 0)    % if it has proteins as inputs as well as other components
    ),
    State = 0, !.

% If a complex with components is on.
% Recursive case.
state_of_entity(Entity, State, Sample) :-
    type(Entity, complex),
    child_component(Entity, Child),
    bagof(Value, Value^state_of_entity(Child, Value, Sample), Vs),  % if it has component inputs
    bagof(Value2, Value2^protein(Sample, Child, Value2), Vs2),      % if it has protein inputs
    append(Vs, Vs2, Vs3),
    delete(Vs3, 1, ListOfZeros), ListOfZeros = [],  % delete the ones; the list of zeros is empty if all inputs are on
    State = 1, !.

% If a protein set with components is on.
% Recursive case.
state_of_entity(Entity, State, Sample) :-
    type(Entity, protein_set),
    child_component(Entity, Child),
    (   state_of_entity(Child, 1, Sample)
    ;   protein(Sample, Child, 1)    % if it has proteins as inputs as well as other entities
    ),
    State = 1, !.

% If a protein set with components is off.
% Recursive case.
state_of_entity(Entity, State, Sample) :-
    type(Entity, protein_set),
    child_component(Entity, Child),
    bagof(Value, Value^state_of_entity(Child, Value, Sample), Vs),  % if it has entity inputs
    bagof(Value2, Value2^protein(Sample, Child, Value2), Vs2),      % if it has protein inputs
    append(Vs, Vs2, Vs3),                                           % join the lists of inputs
    delete(Vs3, 0, ListOfOnes), ListOfOnes = [],  % delete the zeros; the list of ones is empty if all inputs are off
    State = 0, !.
Update: I ended up getting the protein bit to work as I wanted.

samples(Samples) :-
    setof(Sample_in, Probe^samples(Sample_in, Probe, ProbeValue), Samples).

sample(Sample) :-
    once(samples(Samples)),   % why do I need this?!
    member(Sample, Samples).

protein_stack(Sample, Reactome_ID, State) :-
    (   protein_reactome_id_to_uniprot_id(Reactome_ID, UniprotID),
        uniprot_sample_probes(UniprotID, Sample, Probe, 1),
        !,
        State = 1
    ;   State = 0
    ).

protein_good(Sample, Reactome_ID, State) :-
    sample(Sample),
    protein_reactome_id_to_uniprot_id(Reactome_ID, _),
    protein_stack(Sample, Reactome_ID, State).
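On the "why do I need this?!" comment, a likely explanation (assuming samples/3 contains both 0 and 1 values) is that ProbeValue is a free variable in the setof/3 goal and is not quantified with ^. In that case setof/3 returns one solution per distinct ProbeValue on backtracking, which once/1 then hides. The sketch below uses hypothetical toy facts, and the names samples_per_value/1 and all_samples/1 are made up for illustration.

```prolog
% Hypothetical toy facts in the shape samples(Sample, Probe, ProbeValue).
samples(s1, p1, 0).
samples(s1, p2, 1).
samples(s2, p1, 1).
samples(s2, p2, 1).

% Only Probe is quantified. The third argument is still a free variable,
% so setof/3 groups its solutions by that value and succeeds once for
% each distinct ProbeValue.
samples_per_value(Samples) :-
    setof(Sample, Probe^samples(Sample, Probe, _ProbeValue), Samples).

% Both Probe and ProbeValue are existentially quantified, so setof/3
% succeeds exactly once with the full sorted list of samples.
all_samples(Samples) :-
    setof(Sample, Probe^ProbeValue^samples(Sample, Probe, ProbeValue), Samples).
```

With these facts, samples_per_value/1 first yields [s1] (the samples that have a probe at 0) and then [s1, s2] on backtracking, while all_samples/1 yields [s1, s2] once. Under the same assumption, wrapping the unquantified version in once/1 would keep only the first group and could silently drop a sample such as s2, so quantifying ProbeValue may be the safer fix.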
Let's take the first rule of protein/3.

Is the relation between Reactome_ID and UniprotID unique? If yes, move it before setof(Sample, ...), member(X, Samples), and put a cut after it. Otherwise Prolog will try to satisfy it for every result of setof(...), member(X, Samples). More than that, it is a green cut that helps in terms of performance.

The rule has one purpose: to see if there is at least one value of 1 in Vs. You should not generate all the members of Vs and then search for a value of 1; stop when you find the first 1 that satisfies uniprot_sample_probes(UniprotID, X, _, Value).

protein(Sample, Reactome_ID, State) :-
    (   protein_reactome_id_to_uniprot_id(Reactome_ID, UniprotID),
        setof(Sample, Probe^samples(Sample, Probe, ProbeValue), Samples),
        member(X, Samples),
        uniprot_sample_probes(UniprotID, X, _, 1),
        !,
        State = 1, write('on')
    ;   State = 0, write('off')
    ).
The other rules can be optimized using the same pattern:

state_of_x(X, State) :- goal, !, State = 1.
state_of_x(X, State) :- State = 0.

or, more concisely,

state_of_x(X, 1) :- goal, !.
state_of_x(X, 0).
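As a rough sketch of how this pattern might carry over to the recursive entities: type/2 and child_component/2 follow the question, while entity_state/3, its helper entity_state_/4, protein_state/3 and all the toy facts are assumptions made for illustration. The sketch uses if-then-else rather than the concise two-clause form, because the concise form succeeds when called with the state already bound to 0 even if the goal holds (the first clause head fails to unify and the second clause matches), and the recursive checks below call the predicate with a bound state.

```prolog
% Hypothetical toy hierarchy: cplx1 (AND) contains set1 (OR) and prot_c.
type(prot_a, protein).
type(prot_b, protein).
type(prot_c, protein).
type(set1, protein_set).
type(cplx1, complex).

child_component(set1, prot_a).
child_component(set1, prot_b).
child_component(cplx1, set1).
child_component(cplx1, prot_c).

% Stand-in for the protein-level predicate: protein_state(Sample, Protein, State).
protein_state(s1, prot_a, 0).
protein_state(s1, prot_b, 1).
protein_state(s1, prot_c, 0).
protein_state(s2, prot_a, 0).
protein_state(s2, prot_b, 1).
protein_state(s2, prot_c, 1).

% entity_state(+Sample, +Entity, -State): dispatch on the entity type.
entity_state(Sample, Entity, State) :-
    type(Entity, Type),
    entity_state_(Type, Sample, Entity, State).

% Leaf: a protein takes its state from the protein-level facts.
entity_state_(protein, Sample, Entity, State) :-
    protein_state(Sample, Entity, State).
% OR gate: on as soon as one child is found on, otherwise off.
entity_state_(protein_set, Sample, Entity, State) :-
    (   child_component(Entity, Child),
        entity_state(Sample, Child, 1)
    ->  State = 1
    ;   State = 0
    ).
% AND gate: off as soon as one child is found off, otherwise on.
entity_state_(complex, Sample, Entity, State) :-
    (   child_component(Entity, Child),
        entity_state(Sample, Child, 0)
    ->  State = 0
    ;   State = 1
    ).
```

With these facts, entity_state(s1, cplx1, S) gives S = 0 (prot_c is off) and entity_state(s2, cplx1, S) gives S = 1. In this sketch an entity with no children falls through to the else branch (a set is off, a complex is on), and a child protein with no data counts as neither on nor off, so both cases would need a deliberate decision in real use.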