Case File
Date
Unknown
Source
DOJ Data Set 9
Reference
efta-efta01125174
Pages
49
Persons
0
Integrity
No Hash Available
Extracted Text (OCR)
Text extracted via OCR from the original document. May contain errors from the scanning process.
Advanced AI for Longevity Genomics
Dr. Ben Goertzel
OpenCog Foundation
Hong Kong Poly U
Hanson Robotics
Aidyia Limited
Stevia First
EFTA01125174
OpenCog
• Open-source AI project aimed at Artificial General Intelligence
• Integrated system aimed at controlling autonomous, generally intelligent agents
• Components of OpenCog currently in use for various practical applications...
• ... such as analyzing genomics data
EFTA01125175
[Book covers: Ben Goertzel, Cassio Pennachin and Nil Geisweiller, "Engineering General Intelligence, Part 1: A Path to Advanced AGI via Embodied Learning and Cognitive Synergy" and "Engineering General Intelligence, Part 2: The CogPrime Architecture for Integrative, Embodied AGI". Atlantis Thinking Machines, Series Editor: K.-U. Kühnberger. Atlantis Press.]
EFTA01125176
OpenCog stores and manipulates knowledge in the form of complex graphs
(weighted, labeled hypergraphs)
[Figure: example hypergraph. Perception, action and feeling nodes (timestamped Sept. 15, 2006, e.g. a pixel at (100,50) and a Joint_53 actuator) link to specific objects, composite actions and complex feelings (e.g. raise_arm_55), which in turn link to abstract concepts (some corresponding to named concepts, some not; e.g. "raise arm").]
EFTA01125177
[Figure: types of knowledge in OpenCog and the cognitive processes that act on them]
PROCEDURAL KNOWLEDGE: MOSES (probabilistic evolutionary learning), hillclimbing
KNOWLEDGE (type label illegible): "deep learning" hierarchy of memory/processing units
DECLARATIVE KNOWLEDGE: Probabilistic Logic Networks, concept blending, language comprehension and generation
ATTENTIONAL & INTENTIONAL KNOWLEDGE: economic attention networks, adaptive goal hierarchy
EPISODIC KNOWLEDGE: internal world simulation engine
OpenCog features multiple cognitive algorithms, each acting on different sorts
of knowledge within the common "Atomspace" dynamic knowledge store.
The aim is to achieve high levels of general intelligence via "cognitive synergy"
between the different cognitive algorithms, cooperating together to help an
agent choose actions based on goals, context and experience.
EFTA01125178
[Figure: detail of the same diagram showing PROCEDURAL KNOWLEDGE (MOSES, probabilistic evolutionary learning), DECLARATIVE KNOWLEDGE (Probabilistic Logic Networks, concept blending, language comprehension) and EPISODIC KNOWLEDGE (internal world simulation engine).]
So far, two of OpenCog's cognitive algorithms, MOSES and Probabilistic Logic
Networks (PLN), are being used to help understand genomics data. In time,
the full integrated OpenCog architecture will be used to serve the role of an
"artificial scientist."
EFTA01125179
OpenCog AI for Genomics:
Two Examples
• MOSES for identifying patterns differentiating supercentenarians from healthy ~80-year-olds based on SNP combinations
• PLN (probabilistic logic networks) for using bio-ontologies to identify genes indirectly connected to the longevity phenotype, via a combination of genomic data and ontological knowledge
EFTA01125180
phenotype classification of whole-genome sequenced samples with
boolean models derived via MOSES supervised machine learning
(Mike Duncan & Ben Goertzel)
EFTA01125181
abstract
• A boolean classification function was constructed using a novel supervised machine learning algorithm to distinguish healthy from chronically ill geriatric subjects. From an evenly divided sample set of 783 subjects, a population of boolean functions averaging 130 variables each was evolved, with a mean out-of-sample accuracy of 0.851, compared to an in-sample accuracy of 0.860.
• The same analysis pipeline was used to distinguish 17 super centenarians from a subset of the above data set consisting of 230 healthy geriatric females. Five significant functions were evolved, four binary and one with a single variable, with perfect out-of-sample accuracy. These functions involved 5 distinct SNP variants.
EFTA01125182
meta-optimizing semantic evolutionary search (MOSES)
• MOSES is a 2-level genetic programming algorithm that searches the space of categorization functions, allowing detailed exploration of multiple local fitness maxima.
• Functions from the meta-level population are selected and "mutated" (their neighborhood in function space is searched).
• Variants with improved fitness (better at categorizing) are simplified and returned to the meta-population.
• In addition, integrated feature selection and multiple tunable search and fitness functions improve on standard genetic programming algorithms.
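The two-level loop described above can be illustrated with a toy sketch. This is not the MOSES implementation: the representation (a program is a single AND of possibly negated variables), the function names and the tie-breaking are all assumptions made for illustration. It only shows the shape of the idea: pick an exemplar from a meta-population, search its neighborhood, simplify the improved variants and return them to the meta-population.

```python
from itertools import product

def predict(program, sample):
    """A program is a frozenset of (variable_index, negated) literals, read as one AND."""
    return all(bool(sample[i]) != negated for i, negated in program)

def fitness(program, samples, labels):
    """Fraction of samples the program categorizes correctly."""
    hits = sum(predict(program, s) == y for s, y in zip(samples, labels))
    return hits / len(samples)

def neighborhood(program, n_vars):
    """All programs one edit away: drop a literal or add a non-conflicting one."""
    for i in range(n_vars):
        for negated in (False, True):
            literal = (i, negated)
            if literal in program:
                yield program - {literal}
            elif (i, not negated) not in program:
                yield program | {literal}

def simplify(program, samples, labels):
    """Remove literals whose removal does not lower fitness."""
    baseline = fitness(program, samples, labels)
    for literal in sorted(program):
        smaller = program - {literal}
        if fitness(smaller, samples, labels) >= baseline:
            program = smaller
    return program

def two_level_search(samples, labels, n_vars, rounds=10):
    start = frozenset()
    meta = {start: fitness(start, samples, labels)}  # meta-population: program -> fitness
    explored = set()
    for _ in range(rounds):
        pending = [p for p in meta if p not in explored]
        if not pending:
            break
        exemplar = max(pending, key=lambda p: meta[p])  # best unexplored exemplar
        explored.add(exemplar)
        for candidate in neighborhood(exemplar, n_vars):
            if fitness(candidate, samples, labels) > meta[exemplar]:
                candidate = simplify(candidate, samples, labels)
                meta[candidate] = fitness(candidate, samples, labels)
    best = max(meta, key=lambda p: meta[p])
    return best, meta[best]

# Toy data: every 3-bit genotype, labeled "case" when x0 is set and x2 is not.
samples = list(product([0, 1], repeat=3))
labels = [bool(x0 and not x2) for x0, x1, x2 in samples]
best_program, best_fitness = two_level_search(samples, labels, n_vars=3)
```

On this toy data the search recovers the two-literal rule exactly; real MOSES programs are nested and/or/not trees rather than flat conjunctions.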
EFTA01125183
epistatic boolean classification models
• MOSES evolves programs coded in a simple programming language called combo.
• Binary variables are valued "0" if a sample is homozygous for the reference allele and "1" for any alternate alleles for a particular variant. A "true" value indicates "case" status.
• An example boolean combo program applied to simulated genomic data:
or( $rs1234 and( !$rs5678 or( or( $rs2468 $rs7531 ) and( $rs3142 $rs2001 ))))
variable sample 1 sample 2 rs1234 ref (0) ref (0) rs2001 ref (0)
alt (1) ref (0) rs3142 ref (0) alt (1) ref (0)
alt (1) rs2468 ref (0) rs5678 ref (0) rs7531 ref (0) program value
control
(false) case (true)
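As a rough illustration of how such a combo program is evaluated, the sketch below transcribes the example program into Python. The genotype dictionaries are hypothetical inputs chosen for illustration and are not the values from the slide's table.

```python
def example_combo(g):
    """or( $rs1234 and( !$rs5678 or( or( $rs2468 $rs7531 ) and( $rs3142 $rs2001 ))))"""
    return bool(
        g["rs1234"]
        or (
            not g["rs5678"]
            and ((g["rs2468"] or g["rs7531"]) or (g["rs3142"] and g["rs2001"]))
        )
    )

VARIANTS = ["rs1234", "rs2001", "rs3142", "rs2468", "rs5678", "rs7531"]

# Hypothetical samples: 0 = homozygous reference, 1 = any alternate allele.
all_reference = {rs: 0 for rs in VARIANTS}
two_alternates = dict(all_reference, rs3142=1, rs2001=1)
blocked = dict(two_alternates, rs5678=1)  # the negated variable switches the branch off

status = {
    "all_reference": example_combo(all_reference),
    "two_alternates": example_combo(two_alternates),
    "blocked": example_combo(blocked),
}
```

A `True` result corresponds to "case" and `False` to "control", following the convention stated in the bullets.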
EFTA01125184
supervised machine learning strategy
• A cross validation strategy is used where the data set is randomly partitioned into training and testing sets at a ratio of 4:1.
• Accuracy scores on training and testing sets are compared for each combo to assess over-fitting.
• Ensembles of combos can be averaged to increase accuracy on out-of-sample data.
• Ranked lists of variants can be constructed by counting variable occurrence in combo ensembles.
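A minimal sketch of the strategy above, assuming combos are available as Python callables that return True for "case". The helper names, the fixed seed and the use of a simple majority vote as the "averaging" step are illustrative assumptions, not details taken from the slides.

```python
import random

def split_4_to_1(n_samples, seed=0):
    """Randomly partition sample indices into training and testing sets at 4:1."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = (4 * n_samples) // 5
    return indices[:cut], indices[cut:]

def accuracy(combo, samples, labels, indices):
    """Fraction of the indexed samples that the combo labels correctly."""
    hits = sum(combo(samples[i]) == labels[i] for i in indices)
    return hits / len(indices)

def overfit_gap(combo, samples, labels, train, test):
    """In-sample minus out-of-sample accuracy; a large gap suggests over-fitting."""
    return accuracy(combo, samples, labels, train) - accuracy(combo, samples, labels, test)

def ensemble_predict(combos, sample):
    """Average the boolean outputs of an ensemble and threshold at one half."""
    votes = sum(1 for combo in combos if combo(sample))
    return 2 * votes > len(combos)

train, test = split_4_to_1(783)  # 783 is the wellderly/illderly sample count
```

For comparison, the abstract's reported means (0.860 in-sample, 0.851 out-of-sample) correspond to a gap of about 0.009.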
EFTA01125185
whole genome variation data sets
wellderly and illderly data set
• From Scripps
• 783 samples aged 80 and above
• 342 males and 441 females
• 397 wellderly cases and 386 illderly controls
• 230 samples in wellderly female subset
super centenarian data set
• From Stanford
• 17 samples aged 110 and above
• 16 females and 1 male
• 14 whites, 2 Latinas, and 1 African American
EFTA01125186
wellderly vs. illderly MOSES analysis
[Figure: sample variant load histogram; x-axis: variant count, grouped by phenotype]
Example combo variables are gemini db reference ID numbers.
• There were 900 combos with accuracies significantly greater than the case prevalence (p < 0.05, McNemar's test)
• Mean of 130 features per combo
• The mean out-of-sample accuracy of combo ensembles was 0.884.
• Means for all combos in each cross validation set:
                      accuracy   precision   recall
mean out-of-sample    0.851      0.863       0.843
mean in-sample        0.860      0.871       0.850
example combo         0.880      0.863       0.909
Example combo:
and(or(and(or(and(or(and(or(and(or(and(!$X106020 !$X168745) $X763139) !$X735449 !$X297852) and(or($X53710 $X297852 $X552647) !$X766840 $X808350))
or($X14463 $X766840)) and(or(and(or(!$X735449 $X54045) !$X67669) and(or($X135964 !$X808350) $X558377) $X434945) or(and($X735449 $X522743) and($X497883
$X702846) $X431028 !$X480341) or(and($X735449 $X552647) and(!$X135964 !$X194619))) and(or(and($X217849 $X256079 !$X808350) $X297852) !$X735449
$X695782)) or(and(!$X735449 !$X67669 $X434945) !$X427182)) and(or(and(or($X256079 $X808350) $X135964) !$X217849 $X434945) or($X735449 $X766840) $X67669
$X427182) and(or(and($X735449 !$X165425 $X808350) !$X67669 $X434945) or(and($X16581 !$X695782) $X128740 !$X135964 $X434945) !$X14463 $X217849)
$X379084) or(and($X14463 $X217849) !$X67669 $X106020 $X379084 !$X427182 !$X480341) or(!$X379084 !$X379090)) and(or(and(or(and($X194619 !$X807580)
and($X256079 !$X763139) $X217849 $X427182) or(!$X14463 !$X434945) !$X135964) and($X735449 $X431028) and(!$X106020 $X434945)) or(and(or(!$X165425 !
$X217849) $X558377) and(or($X807580 !$X808350) $X695782) and(!$X735449 !$X14463) and($X480341 !$X807580) $X135964) or(and($X427182 $X497883) !$X67669)
or(and(!$X558377 !$X807580) !$X14463 $X53710 $X427182) !$X297852 $X379084)) or(and(or(!$X379090 $X427182) !$X128740) and(!$X67669 $X135964) and(!
$X480341 !$X577132) !$X808350))
EFTA01125187
wellderly vs. illderly top combo SNPs
• The top 25 variants ranked by number of occurrences in the 10000 best combos from 10 cross validation runs.
• "Category" indicates if variable is negated, i.e. if variant is negated in combo then category is "control" because combos are "true" for cases. Note variables can have different categories in different combos.
• Alternate allele frequencies (AAFs) are shown for the data set and 2 reference genome sets: the Exome Aggregation Consortium (ExAC) and the 1000 Genomes.
• Annotations are from gemini [1] v1.7.0 (ensembl v75, dbSNP v141, ExAC v0.3)
1. http://gemini.readthedocs.org/en/latest/index.html

rs id        combo count  category  cyto-band     gene     transcript       data AAF  adjusted ExAC AAF  1k Genomes AAF
rs10953303   2719         control   chr7q22.1     ZAN      ENST00000546213  0.211     0.236              0.198
rs1050348    2206         control   chr6q21       LAMA4    ENST00000389463  0.405     0.665              0.758
rs6942733    2120         case      chr7q22.1     ZAN      ENST00000538115  0.255     0.235              0.199
rs1050348    1308         case      chr6q21       LAMA4    ENST00000389463  0.405     0.665              0.758
rs6942733    1004         control   chr7q22.1     ZAN      ENST00000538115  0.255     0.235              0.199
rs2243191    1002         control   chr1q32.1     IL19     ENST00000270218  0.203     0.748              0.673
rs1688005    901          case      chr19q13.12   FXYD5    ENST00000588699  0.261     0.322              0.412
rs4842978    900          control   chr15q25.2    WDR73    ENST00000561447  0.432     NA                 0.726
rs1977420    899          control   chr11p13      APIP     ENST00000395787  0.359     0.404              0.457
rs7905784    802          case      chr10p13      MCM10    ENST00000378694  0.152     0.118              0.064
rs7905784    801          control   chr10p13      MCM10    ENST00000378694  0.152     0.118              0.064
rs1381057    801          control   chr3q13.33    POLQ     ENST00000264233  0.327     0.722              0.745
rs1977420    797          case      chr11p13      APIP     ENST00000395787  0.359     0.404              0.457
rs10953303   720          case      chr7q22.1     ZAN      ENST00000546213  0.211     0.236              0.198
rs2228331    703          case      chr2q37.3     GPC1     ENST00000264039  0.307     0.664              0.664
rs4842978    701          case      chr15q25.2    WDR73    ENST00000561447  0.432     NA                 0.726
rs2397084    604          case      chr6p12.2     IL17F    ENST00000336123  0.102     0.069              0.033
rs6587467    603          control   chr1q44       OR2T6    ENST00000355728  0.309     0.721              0.773
rs912174     601          case      chr9p24.3     KANK1    ENST00000382293  0.225     0.219              0.204
rs671694     601          control   chr7p22.2     SDK1     ENST00000404826  0.268     0.752              0.799
rs11895564   600          case      chr2q31.1     ITGA6    ENST00000264106  0.294     0.281              0.252
rs7386783    600          case      chr8q24.22    OC90     ENST00000254627  0.277     0.729              0.737
rs11250      600          control   chr4q13.2     CENPC    ENST00000273853  0.392     0.657              0.701
rs4802648    599          case      chr19q13.33   ZNF473   ENST00000595661  0.221     NA                 0.197
rs10277      598          case      chr5q35.3     C5orf45  ENST00000376931  0.433     0.626              0.688
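The ranking and the case/control "category" described for this table can be sketched by scanning combo program text for variable literals. The regular expression, the function name and the choice to count each (variant, category) pair at most once per combo are assumptions for illustration; the slides do not say how repeated occurrences within one combo were counted.

```python
import re
from collections import Counter

# A literal is an optional "!" (negation) followed by "$" and the variable name.
LITERAL = re.compile(r"(!?)\$(\w+)")

def count_categories(combos):
    """Count, per (variant, category), how many combos contain that literal.

    A negated literal is tallied as "control" and a plain one as "case",
    because combos evaluate to true for cases.
    """
    counts = Counter()
    for combo in combos:
        seen = {
            (name, "control" if bang else "case")
            for bang, name in LITERAL.findall(combo)
        }
        counts.update(seen)
    return counts

# Small demonstration on three two-variable combo strings.
demo_combos = [
    "and(!$rs17521570 $rs5905720)",
    "and(!$rs2230681 $rs5905720)",
    "and(!$rs557337 $rs5905720)",
]
counts = count_categories(demo_combos)
ranked = counts.most_common()
```

Because a variant can be negated in one combo and not in another, the same rs id can appear in both categories, which is why several variants occupy two rows of the table.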
EFTA01125188
wellderly vs. illderly combo SNPs effects
• Predicted translation effects of variants in top combos
• Selected from feature set of 13,242 SNPs classified in gemini db [1] as "high" and "medium" impact
• Sequence Ontology (SO) impact classification determines gemini impact severity for variant filtering.
• The combined annotation scoring tool (CAROL) [2] combines SIFT [3] and PolyPhen-2 [4] nucleotide scores to predict SNP effect on translated protein.
1. http://gemini.readthedocs.org/en/latest/content/database_schema.html#details-of-the-impact-and-impact-severity-columns
2. Lopes MC, Joyce C, Ritchie GRS, John SL, Cunningham F, Asimit J, Zeggini E. A combined functional annotation score for non-synonymous variants. Human Heredity (in press)
3. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nature Protocols 4(8):1073-1081 (2009) doi:10.1038/nprot.2009.86
4. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nature Methods 7(4):248-249 (2010)

SNPdb ID     gene     gene name                                       category        ensembl transcript  SO impact              CAROL prediction
rs1977420    APIP     APAF1 interacting protein                       case & control  ENST00000395787     missense variant       Neutral (0.876)
rs10277      C5orf45  chromosome 5 open reading frame 45              case            ENST00000376931     missense variant       Neutral (0.773)
rs11250      CENPC    centromere protein C                            control         ENST00000273853     missense variant       Neutral (0.000)
rs1688005    FXYD5    FXYD domain containing ion transport regulator 5  case          ENST00000588699     missense variant       Neutral (0.705)
rs2228331    GPC1     glypican 1                                      case            ENST00000264039     missense variant       Neutral (0.000)
rs2397084    IL17F    interleukin 17F                                 case            ENST00000336123     missense variant       Deleterious (1.000)
rs2243191    IL19     interleukin 19                                  control         ENST00000270218     missense variant       Neutral (0.000)
rs11895564   ITGA6    integrin, alpha 6                               case            ENST00000264106     missense variant       Neutral (0.724)
rs912174     KANK1    KN motif and ankyrin repeat domains 1           case            ENST00000382293     missense variant       Neutral (0.000)
rs1050348    LAMA4    laminin, alpha 4                                case & control  ENST00000389463     missense variant       Neutral (0.307)
rs7905784    MCM10    minichromosome maintenance complex 10           case & control  ENST00000378694     missense variant       Neutral (0.380)
rs7386783    OC90     otoconin 90                                     case            ENST00000254627     missense variant       Neutral (0.436)
rs6587467    OR2T6    olfactory receptor, family 2T, member 6         control         ENST00000355728     missense variant       Neutral (0.670)
rs1381057    POLQ     polymerase (DNA directed), theta                control         ENST00000264233     missense variant       Neutral (0.000)
rs671694     SDK1     sidekick cell adhesion molecule 1               control         ENST00000404826     missense variant       Neutral (0.200)
rs4842978    WDR73    WD repeat domain 73                             case & control  ENST00000561447     splice region variant  nan
rs10953303   ZAN      zonadhesin (gene/pseudogene)                    case & control  ENST00000546213     missense variant       Deleterious (0.999)
rs6942733    ZAN      zonadhesin (gene/pseudogene)                    case & control  ENST00000538115     missense variant       Neutral (0.036)
rs4802648    ZNF473   zinc finger protein 473                         case            ENST00000595661     splice donor variant   nan
EFTA01125189
super centenarian vs. wellderly female MOSES input data
[Figure: sample variant load histogram; x-axis: variant count, grouped by phenotype (wellderly, super centenarian)]
• Super centenarian cases were matched to wellderly female controls to
attempt to find SNPs associated with extreme longevity.
• Feature set constructed from intersection of SNPs in case & control
sets.
• Cases have almost 30% more variants per sample than controls.
EFTA01125190
super centenarian vs. wellderly female MOSES results
• There were 5 combos with accuracies significantly greater than the case prevalence (p < 0.05, McNemar's test)
• The out-of-sample accuracy for all significant combos was 1.0
1. and(!$rs17521570 $rs5905720)
2. and(!$rs2230681 $rs5905720)
3. and(!$rs557337 $rs5905720)
4. and($rs2230681 $rs5905720)
5. $rs5905720

rs id       gene    name                                  location     transcript       data AAF [1]  ExAC [2] AAF  1kG [3] AAF
rs5905720   MAGIX   MAGI family member, X-linked          chrXp11.23   ENST00000425661  0.021         0.002         0.0003
rs2230739   ADCY9   adenylate cyclase 9                   chr16p13.3   ENST00000294016  0.302         0.357         0.260
rs17521570  RAI14   retinoic acid induced 14              chr5p13.2    ENST00000515799  0.116         0.119         0.099
rs2230681   PSMD9   proteasome 26S subunit, non-ATPase 9  chr12q24.31  ENST00000261817  0.123         0.856         0.834
rs557337    TBC1D4  TBC1 domain family, member 4          chr13q22.2   ENST00000377636  0.065         0.083         0.174

1. Alternate Allele Frequency
2. Exome Aggregation Consortium
3. 1000 Genomes
EFTA01125191
summary
MOSES can find a diverse set of accurate boolean categorization functions even in
data with very large feature sets and highly imbalanced sample category sizes.
EFTA01125192
Probabilistic Logic for
Connecting Genomic Data
Patterns with Bio-Ontologies:
A simple example
(Eddie Monroe & Ben Goertzel)
EFTA01125193
Logic: very general, flexible framework for carrying out abstract reasoning.
Encompasses both mathematical and commonsense reasoning.
Probability theory: very general, flexible framework for carrying out reasoning based on uncertainty.
Used in a huge variety of areas including data mining, robotics, vision processing, etc.
EFTA01125194
"Progic" = probability + logic
• Various approaches to synthesizing probability and logic exist
• Probabilistic Logic Networks (PLN) is a "progic" framework oriented toward artificial general intelligence.
EFTA01125195
Probabilistic Logic Networks
• OpenCog represents knowledge in its "Atomspace" in terms of
nodes and links of various types
• PLN contains a set of probabilistic logic rules, that transform sets of
nodes/links into other sets of nodes/links
• PLN can do deduction, induction, abduction, analogy and other
types of reasoning
• PLN can reason on any kind of data, including data-patterns
("combo models") learned in genomic data by MOSES, or data imported into OpenCog from bio-ontologies
• Due to its ability to process huge amounts of information in subtle
ways, PLN can identify data patterns the human mind will miss
• A fundamentally different paradigm than currently popular
"machine learning" or "deep learning" architectures, with more
capability for abstract symbolic understanding — but can work
together with more standard ML algorithms
EFTA01125196
Term Logic vs. Predicate Logic
[Figure: inference diagrams contrasting term logic and predicate logic, including term-logic deduction (A → B, B → C |- A → C), induction and abduction]
EFTA01125197
Multiple PLN Relationship Types
PLN involves more than a dozen logical relationship types, each with particular semantics.
For instance, the same inference pattern could be interpreted in many ways, including

ExtensionalInheritance A B
ExtensionalInheritance B C
ExtensionalInheritance A C

IntensionalInheritance A B
IntensionalInheritance B C
IntensionalInheritance A C
EFTA01125198
"Higher-Order" PLN
Following Pei Wang's usage in NARS, in PLN we refer
to logic regarding variables or higher-order functions as "higher-order"
ImplicationLink
EvaluationLink has($X, mouth)
EvaluationLink eats($X, food)
EFTA01125199
Quantifying Truth Values
Each PLN relationship has a truth value attached to it. PLN supports truth value objects of different types, e.g.
• Single probability
• SimpleTruthValue:
- (s,c) = (probability, confidence level)
- (s,n) = (probability, amount of evidence)
• Imprecise truth value
- (L,U) interval, e.g. (.4,.6)
• Indefinite truth value
- (L,U,b,k): interval plus confidence level b and "personality parameter" k, e.g. (.4,.6,.9,2)
• Distributional truth value
- first or second order pdf
Example PLN rule+formula: deduction
B <sB>
C <sC>
ExtensionalInheritance A B <sAB>
ExtensionalInheritance B C <sBC>
|-
ExtensionalInheritance A C <sAC>
$$ s_{AC} = s_{AB}\, s_{BC} + \frac{(1 - s_{AB})\,(s_C - s_B\, s_{BC})}{1 - s_B} $$
As given above, this acts on single-probability truth values.
It can be extended to other truth value forms.
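A small numeric check of the deduction formula, written as a plain function over single-probability truth values. The function and argument names are illustrative. The input values are the strengths that appear in step (9) of the LY96 inference chain later in the deck (0.3273496 and 0.3 for the two implications, 0.2 for the middle term, 0.15 for LongLived).

```python
def deduction_strength(s_ab, s_bc, s_b, s_c):
    """s_AC = s_AB * s_BC + (1 - s_AB) * (s_C - s_B * s_BC) / (1 - s_B)."""
    if not 0.0 <= s_b < 1.0:
        raise ValueError("s_b must lie in [0, 1) for the formula to be defined")
    return s_ab * s_bc + (1.0 - s_ab) * (s_c - s_b * s_bc) / (1.0 - s_b)

# Strengths taken from the premises of step (9) in the inference chain.
s_ac = deduction_strength(s_ab=0.3273496, s_bc=0.3, s_b=0.2, s_c=0.15)
```

With these inputs the result is about 0.173878, which agrees with the strength 0.17387806 shown in the step (9) conclusion to within rounding.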
EFTA01125201
PLN rules
Each rule maps a tuple of relationships into a relationship
Example: deduction rule
Subset A B
Subset B C
|-
Subset A C
PLN formulas
Each formula maps a tuple of truth values into a truth value
Example: deduction formula
$$ s_{AC} = s_{AB}\, s_{BC} + \frac{(1 - s_{AB})\,(s_C - s_B\, s_{BC})}{1 - s_B} $$
EFTA01125202
Inversion (Bayes Rule)
Subset A B
|-
Subset B A
In PLN, simple first-order induction and abduction are obtained by
combining deduction and Bayes rule.
More advanced induction and abduction result from using intensional relationships.
[Figure: term-logic diagrams of deduction (A → B, B → C |- A → C), induction and abduction]
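The statement that induction comes from combining Bayes rule with deduction can be sketched numerically. The reading of induction used here (from A → B and A → C, infer B → C) is the usual term-logic one and is an assumption, since the slide's diagrams did not survive extraction; the function names and example probabilities are also illustrative.

```python
def inversion(s_ab, s_a, s_b):
    """Bayes rule on strengths: P(A|B) = P(B|A) * P(A) / P(B)."""
    return s_ab * s_a / s_b

def deduction(s_xy, s_yz, s_y, s_z):
    """Deduction formula from the earlier slide, with middle term Y."""
    return s_xy * s_yz + (1.0 - s_xy) * (s_z - s_y * s_yz) / (1.0 - s_y)

def induction(s_ab, s_ac, s_a, s_b, s_c):
    """Assumed reading: from A -> B and A -> C, estimate B -> C.

    Invert A -> B to get B -> A, then chain B -> A and A -> C by deduction.
    """
    s_ba = inversion(s_ab, s_a, s_b)
    return deduction(s_ba, s_ac, s_a, s_c)

# Hypothetical node and link probabilities, for illustration only.
s_ba = inversion(s_ab=0.5, s_a=0.2, s_b=0.4)
s_bc = induction(s_ab=0.5, s_ac=0.6, s_a=0.2, s_b=0.4, s_c=0.3)
```

Abduction can be built the same way by inverting the other premise before applying deduction.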
EFTA01125203
Glossary of Link Types
AttractionLink
Indicates the extent to which one concept is a pattern or property helping
to characterize another.
(AttractionLink A B) indicates the extent to which B is a property that
characterizes A.
ConceptNode
A node representing any concept.
ExecutionOutputLink
Indicates execution of a function with a list of arguments to that function.
This allows for atomspace representation of the execution of arbitrary
code.
GeneNode
A node representing a particular gene.
EFTA01125204
Glossary of Link Types (cont.)
GroundedSchemaNode
Specifies the name of a predefined procedure that is to be called.
ImplicationLink
Expresses an if...then... relation, or that the truth of one predicate implies the
truth of another.
(ImplicationLink A B) denotes that A implies B.
IntensionalEquivalenceLink
Indicates that the properties associated with one predicate being true are similar
to the properties associated with another predicate being true.
(IntensionalEquivalenceLink A B) denotes that the properties associated with A
being true are similar to the properties associated with B being true.
IntensionalImplicationLink
Expresses an if... then... relation between the properties of 2 predicates.
(IntensionalImplicationLink A B) denotes that the properties of A imply the
properties of B.
EFTA01125205
Glossary of Link Types (cont.)
IntensionalSimilarityLink
Indicates that two concepts have similar properties.
(IntensionalSimilarityLink A B) denotes that the properties of A are similar to the
properties of B.
ListLink
Used for grouping Atoms for some purpose, typically to specify a set of arguments
to some function or relation.
MemberLink
Indicates set membership.
(MemberLink x S) denotes that element x is a member of set S. The TruthValue
associated with a MemberLink is meant to indicate fuzzy set membership.
NotLink
Corresponds to the negation of a concept or predicate.
EFTA01125206
Glossary of Link Types (cont.)
PredicateNode
Names the predicate of a relation. Predicates are functions that have
arguments and produce a truth value as output.
SetLink
A type of link used to group its arguments into a set
(SetLink x y z) simply indicates that there is a set {x,y,z}
SubsetLink
Denotes extensional inheritance, which is inheritance between sets based
on their members. It specifies an "is-an-instance-of" relationship.
(SubsetLink A B) specifies that A is an instance of B.
EFTA01125207
Example Inference: Goal
• Through MOSES analysis, we found that overexpression of LY96 appears to distinguish nonagenarians from controls.
• Using PLN, what can we infer about the relationship between LY96 and longevity based on background domain and experimental knowledge?
Our target conclusion is:
(ImplicationLink
  (ExecutionOutputLink
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (GeneNode "LY96"))
  (PredicateNode "LongLived"))
Interpretation: "Overexpression of LY96 implies longevity."
EFTA01125208
Background Information
There is pre-existing evidence that over-expression of gene TBK1 is associated with increased lifespan
(Source: Lifespan Observations Database)
(IntensionalImplicationLink (stv 0.3 0.7)
  (ExecutionOutputLink (stv 0.2 0.7)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "TBK1" (stv .0004 0.9))))
  (PredicateNode "LongLived" (stv 0.15 0.8)))
Interpretation: "Overexpression of TBK1 implies longevity"
Genes are associated with Gene Ontology terms and other categories.
(MemberLink
  (GeneNode "TBK1" (stv .0004 0.9))
  (ConceptNode "GO:0005515" (stv 0.001 0.9)))
(MemberLink
  (GeneNode "TBK1" (stv .0004 0.9))
  (ConceptNode "GO:0045087" (stv 0.001 0.9)))
Interpretation: "TBK1 is a member of GO category 0005515,"
"TBK1 is a member of GO category 0045087,"
... for each gene category annotation
EFTA01125209
Inference Chain Steps
(1) Member-to-Subset Rule
(Member A B) |- (Subset (Set A) B)
Premises:
(MemberLink
  (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998))
  (ConceptNode "GO:0051607" (stv 0.001 0.89999998)))
"TBK1 is a member of GO category 0051607"
Conclusions:
(SubsetLink
  (SetLink
    (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998)))
  (ConceptNode "GO:0051607" (stv 0.001 0.89999998)))
"The singleton set containing TBK1 is a subset of GO category 0051607"
EFTA01125210
Intensional Similarity
• We will infer a relationship between the gene
LY96 and the predicate LongLived through the similarity of LY96 with gene TBK1, which is
already known to be related to longevity.
• Intensional similarity is based on common properties of the genes.
• Steps 2-5 that follow are needed for creating the IntensionalSimilarity relationship.
EFTA01125211
(2) Compare gene properties
• We are using GO category annotations for gene properties.
• At the start of the inference, we need to get the supersets of {TBK1} and {LY96} and determine the intersection and union of the supersets.
LY96: member of 25 GO categories
TBK1: member of 34 GO categories
Common categories (intersection):
GO:0005515 protein binding
GO:0045087 innate immune response
GO:0006954 inflammatory response
GO:0010008 endosome membrane
GO:0002224 toll-like receptor signaling pathway
GO:0002756 MyD88-independent toll-like receptor signaling pathway
GO:0007249 I-kappaB kinase/NF-kappaB signaling
GO:0034138 toll-like receptor 3 signaling pathway
GO:0034142 toll-like receptor 4 signaling pathway
GO:0035666 TRIF-dependent toll-like receptor signaling pathway
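The intersection and union computed in this step can be sketched with ordinary sets. The category sets below are deliberately tiny and partly made up (the "only" entries are placeholder strings, not real GO identifiers); only the counts 25, 34 and 10 come from the slide.

```python
def overlap(categories_a, categories_b):
    """Return (intersection, union) of two sets of GO category identifiers."""
    return categories_a & categories_b, categories_a | categories_b

def jaccard_from_counts(n_a, n_b, n_common):
    """Plain overlap ratio |A ∩ B| / |A ∪ B| computed from set sizes."""
    return n_common / (n_a + n_b - n_common)

# Hypothetical miniature category sets for illustration.
ly96_demo = {"GO:0005515", "GO:0045087", "LY96-only-placeholder"}
tbk1_demo = {"GO:0005515", "GO:0045087", "TBK1-only-placeholder-1", "TBK1-only-placeholder-2"}
common, union = overlap(ly96_demo, tbk1_demo)

# Counts reported on the slide: 25 and 34 categories, 10 in common.
plain_ratio = jaccard_from_counts(25, 34, 10)
```

The plain ratio for the slide's counts is 10/49, roughly 0.204. The intensional similarity reported in step (5), 0.19570713, is derived from AttractionLink strengths rather than from this unweighted ratio, so the two values need not coincide.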
EFTA01125212
(3) Subset NotA B Direct Evaluation
(Inheritance A B) |- (Inheritance (Not A) B)
For each common category relationship (LinkType A B), create (LinkType (Not A) B)
Premises:
(SubsetLink (stv 1 0.99999982)
  (SetLink (GeneNode "LY96" (stv 4.1666666e-05 0.89999998)))
  (ConceptNode "GO:0045087" (stv 0.001 0.89999998)))
"{LY96} is a subset of GO:0045087"
Conclusions:
(SubsetLink (stv 0.028667862 0.99999982)
  (NotLink
    (SetLink (GeneNode "LY96" (stv 4.1666666e-05 0.89999998))))
  (ConceptNode "GO:0045087" (stv 0.001 0.89999998)))
"A random gene (exclusive of LY96) belongs to GO:0045087 (with a low probability)"
EFTA01125213
(4) AttractionRule
(And (Subset A B) (Subset (Not A) B)) |- (AttractionLink A B)
Make AttractionLinks for LY96 and TBK1 for each common relationship (in other words, for each relationship in the intersection of the supersets).
Premises:
(SubsetLink (stv 1 0.99999982)
  (SetLink (GeneNode "LY96" (stv 4.1666666e-05 0.89999998)))
  (ConceptNode "GO:0045087" (stv 0.001 0.89999998)))
(SubsetLink (stv 0.028667862 0.99999982)
  (NotLink
    (SetLink (GeneNode "LY96" (stv 4.1666666e-05 0.89999998))))
  (ConceptNode "GO:0045087" (stv 0.001 0.89999998)))
"{LY96} is a subset of GO:0045087,"
"A random gene not in {LY96} is a subset of GO:0045087 (with a low probability)"
Conclusions:
(AttractionLink (stv 0.97133213 0.99999982)
  (SetLink (GeneNode "LY96" (stv 4.1666666e-05 0.89999998)))
  (ConceptNode "GO:0045087" (stv 0.001 0.89999998)))
"GO:0045087 is a property of/pattern in {LY96}"
EFTA01125214
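The strength of the AttractionLink in step (4) is consistent with taking the difference between the two premise strengths, P(B|A) minus P(B|not A), floored at zero. That formula is not written on the slide, so the sketch below is an assumption that happens to reproduce the slide's number; the function name is illustrative.

```python
def attraction_strength(s_subset_a_b, s_subset_not_a_b):
    """Assumed formula: how much more likely B is given A than given not-A."""
    return max(0.0, s_subset_a_b - s_subset_not_a_b)

# Premise strengths from step (4): (Subset {LY96} GO:0045087) and
# (Subset (Not {LY96}) GO:0045087).
s_attraction = attraction_strength(1.0, 0.028667862)
```

The result, about 0.971332, matches the 0.97133213 shown in the step (4) conclusion to within rounding.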
(5) IntensionalSimilarity Direct Evaluation
(And (Attraction P A) (Attraction P B) (Attraction Q A) (Attraction Q B) ...) |-
(IntensionalSimilarity A B)
Premises:
(AttractionLink (stv 0.97133213 0.99999982)
  (SetLink (GeneNode "LY96" (stv 4.1666666e-05 0.89999998)))
  (ConceptNode "GO:0045087" (stv 0.001 0.89999998)))
(AttractionLink (stv 0.97133213 0.99999982)
  (SetLink (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998)))
  (ConceptNode "GO:0045087" (stv 0.001 0.89999998)))
"GO:0045087 is a property of {LY96}"
"GO:0045087 is a property of {TBK1}"
Etc.
Conclusion:
(IntensionalSimilarityLink (stv 0.19570713 0.99999982)
  (SetLink (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998)))
  (SetLink (GeneNode "LY96" (stv 4.1666666e-05 0.89999998))))
"{TBK1} properties are similar to {LY96} properties"
EFTA01125215
(6) Singleton-Similarity-Rule
(Similarity {A} {B}) |- (Similarity A B)
Premise:
(IntensionalSimilarityLink (stv 0.19570713 0.99999982)
  (SetLink
    (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998)))
  (SetLink
    (GeneNode "LY96" (stv 4.1666666e-05 0.89999998))))
"{TBK1} properties are similar to {LY96} properties"
Conclusion:
(IntensionalSimilarityLink (stv 0.19570713 0.99999982)
  (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998))
  (GeneNode "LY96" (stv 4.1666666e-05 0.89999998)))
"TBK1 properties are similar to LY96 properties"
EFTA01125216
(7) Gene-Similarity-to-Overexpression-Equivalence
(Similarity (Gene A) (Gene B)) |- (Equivalence (A-overexpressed) (B-overexpressed))
Premise:
(IntensionalSimilarityLink (stv 0.19570713 0.99999982)
  (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998))
  (GeneNode "LY96" (stv 4.1666666e-05 0.89999998)))
"TBK1 properties are similar to LY96 properties"
Conclusion:
(IntensionalEquivalenceLink (stv 0.19570713 0.99999982)
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998))))
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "LY96" (stv 4.1666666e-05 0.89999998)))))
"Properties associated with overexpression of TBK1 are similar to properties associated with
overexpression of LY96"
EFTA01125217
(8) Equivalence-Transformation Rule
(Equivalence A B) |- (And (Implication A B) (Implication B A))
Premise:
(IntensionalEquivalenceLink (stv 0.19570713 0.99999982)
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998))))
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "LY96" (stv 4.1666666e-05 0.89999998)))))
"'Overexpression of TBK1' properties are similar to 'overexpression of LY96' properties"
Conclusion:
(IntensionalImplicationLink (stv 0.3273496 0.99999982)
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "LY96" (stv 4.1666666e-05 0.89999998))))
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998)))))
"Having properties associated with overexpression of LY96 implies having properties associated with
overexpression of TBK1"
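The jump in strength from 0.19570713 (equivalence) to 0.3273496 (implication) in step (8) is consistent with a standard similarity-to-implication conversion. Treating similarity as |A ∩ B| / |A ∪ B| and implication as |A ∩ B| / |A| gives s_AB = (1 + s_B / s_A) · sim / (1 + sim). The slide does not state this formula, so the sketch is an assumption that reproduces the slide's value; the function name is illustrative.

```python
def similarity_to_implication(sim, s_a, s_b):
    """Assumed conversion: s_AB = (1 + s_B / s_A) * sim / (1 + sim)."""
    if s_a <= 0.0:
        raise ValueError("s_a must be positive")
    return (1.0 + s_b / s_a) * sim / (1.0 + sim)

# Step (8): both overexpression predicates carry strength 0.2,
# and the equivalence strength is 0.19570713.
s_implication = similarity_to_implication(0.19570713, s_a=0.2, s_b=0.2)
```

Because the two predicates have equal strength here, the conversion reduces to 2·sim / (1 + sim), giving about 0.32735.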
EFTA01125218
(9) Implication Deduction Rule
(And (Implication A B) (Implication B C)) |- (Implication A C)
(Part 1)
Premises:
(IntensionalImplicationLink (stv 0.3273496 0.99999982)
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "LY96" (stv 4.1666666e-05 0.89999998))))
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "TBK1" (stv 4.1666666e-05 0.89999998)))))
"Having properties associated with overexpression of LY96 implies having properties associated with
overexpression of TBK1"
(IntensionalImplicationLink (stv 0.3 0.7)
  (ExecutionOutputLink (stv 0.2 0.7)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "TBK1" (stv .0004 0.9))))
  (PredicateNode "LongLived" (stv 0.15 0.8)))
"Having properties associated with overexpression of TBK1 implies having properties associated with
longevity"
EFTA01125219
(9) Implication Deduction Rule
(And (Implication A B) (Implication B C)) |- (Implication A C)
(Part 2)
Conclusion:
(IntensionalImplicationLink (stv 0.17387806 0.69999999)
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "LY96" (stv 4.1666666e-05 0.89999998))))
  (PredicateNode "LongLived" (stv 0.15000001 0.80000001)))
"Having properties associated with 'overexpression of LY96' implies
having properties associated with longevity"
EFTA01125220
(10) Implication Conversion Rule
(IntensionalImplication A B) |- (Implication A B)
Premise:
(IntensionalImplicationLink (stv 0.17387806 0.69999999)
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "LY96" (stv 4.1666666e-05 0.89999998))))
  (PredicateNode "LongLived" (stv 0.15000001 0.80000001)))
"Having properties associated with 'overexpression of LY96' implies having properties associated with
longevity"
Conclusion:
(ImplicationLink (stv 0.17387806 0.48999998)
  (ExecutionOutputLink (stv 0.2 0.69999999)
    (GroundedSchemaNode "scm: make-overexpression-predicate")
    (ListLink
      (GeneNode "LY96" (stv 4.1666666e-05 0.89999998))))
  (PredicateNode "LongLived" (stv 0.15000001 0.80000001)))
"Overexpression of LY96 implies longevity" (Our target conclusion)
Next big AI challenge here:
Fully automated, scalable inference control (choice of which inference steps to
take), via data-mining of inference history
EFTA01125221
Broad Vision: AI Scientist
• Integrated knowledge-base of all biological (+ chemical etc.) knowledge, in the Atomspace, built in semi-automated way
• Knowledge comes from: datasets, databases, texts, simulations, automated use of lab equipment
• MOSES, PLN and other AI methods used for hypothesis discovery and validation
• Connect OpenCog w/ simulation engine, use OpenCog data/inferences to help set simulation parameters
• AI to design experiments, run robotized experiments
• Language generation to produce written reports
• Full-on AI Scientist!!
EFTA01125222