@automenta
Created December 27, 2022 09:27
saveModel: False
system:
    seed: 0
    work_dir: ./out/chargpt
data:
    block_size: 1024
model:
    model_type: gpt-mini
    activation: LeakyReLU(negative_slope=0.01)
    n_layer: None
    n_head: None
    n_embd: None
    vocab_size: None
    block_size: None
    attn_pdrop: 0.25
    resid_pdrop: 0.25
    embd_pdrop: 0.25
trainer:
    device: auto
    max_iters: 100000.0
    batch_size: 8
    learning_rate: 0.001
    betas: (0.9, 0.999)
    eps: 1e-08
    weight_decay: 0.2
    grad_norm_clip: None
    grad_accumulation_steps: 1
data has 611144 characters, 95 unique.
number of parameters: 6605056
running on device cuda
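
The `data has 611144 characters, 95 unique.` line above points to a character-level vocabulary. Below is a minimal sketch of how such a vocabulary could be built. It is not the code that produced this log: `build_vocab`, `encode`, `decode`, `uniform_baseline_loss` and the toy string are all hypothetical. The last helper gives the loss of a uniform guess over 95 symbols, assuming the logged train loss is a mean cross-entropy in nats, which offers a rough reference point for reading the loss lines.

```python
import math


def build_vocab(text):
    """Return (stoi, itos) mappings for a character-level vocabulary."""
    chars = sorted(set(text))  # one entry per unique character
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(text, stoi):
    """Map each character to its integer id."""
    return [stoi[ch] for ch in text]


def decode(ids, itos):
    """Map integer ids back to a string."""
    return "".join(itos[i] for i in ids)


def uniform_baseline_loss(vocab_size):
    """Cross-entropy (nats) of guessing uniformly over vocab_size symbols."""
    return math.log(vocab_size)


# Toy stand-in for the corpus (hypothetical; the real data is not in this log).
sample = "(subclass Process Physical)"
stoi, itos = build_vocab(sample)
ids = encode(sample, stoi)
print(f"data has {len(sample)} characters, {len(stoi)} unique.")
print(f"uniform baseline for 95 symbols: {uniform_baseline_loss(95):.3f} nats")
```

Under that assumption, a loss near the uniform baseline means the model has learned nothing about the character distribution, and lower values mean it is predicting the next character better than chance.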
iter_dt 273.86ms; iter 50: train loss 3.30561
iter_dt 272.74ms; iter 100: train loss 2.73717
iter_dt 276.84ms; iter 150: train loss 2.53481
iter_dt 280.28ms; iter 200: train loss 2.44460
iter_dt 271.47ms; iter 250: train loss 2.40042
iter_dt 273.40ms; iter 300: train loss 2.32349
iter_dt 274.60ms; iter 350: train loss 2.24970
iter_dt 275.97ms; iter 400: train loss 2.21038
iter_dt 279.32ms; iter 450: train loss 2.15722
iter_dt 273.27ms; iter 500: train loss 2.07836
iter_dt 282.23ms; iter 550: train loss 2.01161
iter_dt 273.65ms; iter 600: train loss 1.97935
iter_dt 274.61ms; iter 650: train loss 1.89972
iter_dt 272.05ms; iter 700: train loss 1.83847
iter_dt 274.69ms; iter 750: train loss 1.77218
iter_dt 272.83ms; iter 800: train loss 1.70948
iter_dt 276.41ms; iter 850: train loss 1.67512
iter_dt 277.77ms; iter 900: train loss 1.60441
iter_dt 274.43ms; iter 950: train loss 1.54688
iter_dt 273.33ms; iter 1000: train loss 1.49976
(domene that timaly in ?X the a &%Composits.")
(subclasss Parthession WayProcate)
(documentation Passit EnglishLanguage "Any &%Process that the &%Phesss chat which see pant ins of the &%Propesss of the the ats &%Areanion is that &%Prosition an that &%Pentions whis o an whe ishat con &%TemposionalAgenizaton a &%Grveanes, &%Gephich is cor &%Organish the &%Granization.")
(subclass PaphPation Praroce)
(documentation Pasion EnglishLanguage "An &%Glices fowherich coris &%Groment) a duss a what sph a ano contion e se the the r the ot
a &%GraphNosman that is so se arsoue the &%Gronismat an &%GaphArangent).")
(=>
(and
(instance ?GENT Gram)
(instance ?DENT Weooura)
(exists (?DENT ?DENT)
(and
(instance ?DETIG)
(instance ?DENT TatialAtim)
(instance ?DENT Noumatia)
(instance GraposssseAricaion)
(documentation GaposicParocept EnglishLanguage "Any &%GaumulAge an the
&%Gromeom) that subPos) &%GraniseAgentin whaphPossiong curong
&%GubiossPesconss aure arepent ans phe biares tof ilic s the &%Gaphicas cals t of anommoce, by wh &%Gomuss. In.")
(subclasss GeoniminalAmanicGioumePosionalOrganization GiphicalPossition)
(documentation GeacGesmaneg EnglishLanguage "Ony &%Gisis the
of an &%GeaphPosicDuration of that gome st the ne &%GamephDove is, the
tre n the &%GraphAreanion.")
(=>
(exists (?TTYA1 ?WE2)
(and
(instance ?GENT GraphecalParat Graphin)
(instance GaposurophPhosiophicalAtte)
(documentation GeaphPosicalDephoreopodiosDe EnglishLanguage "A &%Aubject) tre as &%GealNomoganal win sthe in
&%Growhin is s &%GeanimosAbject) owhe &%Gramph that &%GeopomphacPhich omesochis beros asuing s cof &%Grompososic a a ceereromposiong ocon the
eat the naprpossesed thach omaned the same t &%GimePhPeshPamphPosse thic as ponich the dishe othe bas s wouge serapptit thans ctronmang omas s a &%GrophPhog hocos it
&%Graphiong t wharo oraicomes, ame
the pros.")
(=>
(instance ?GEAT1 Gromumeain)
(exists (?GENTENT1 ?GENTEST2)
(and
(instance ?LATEN
iter_dt 272.70ms; iter 1050: train loss 1.45216
iter_dt 271.01ms; iter 1100: train loss 1.37424
iter_dt 269.58ms; iter 1150: train loss 1.34264
iter_dt 275.84ms; iter 1200: train loss 1.26647
iter_dt 277.76ms; iter 1250: train loss 1.20971
iter_dt 276.52ms; iter 1300: train loss 1.14465
iter_dt 279.40ms; iter 1350: train loss 1.07798
iter_dt 270.69ms; iter 1400: train loss 1.06483
iter_dt 269.66ms; iter 1450: train loss 1.00246
iter_dt 273.16ms; iter 1500: train loss 0.97595
iter_dt 277.68ms; iter 1550: train loss 0.96347
iter_dt 274.54ms; iter 1600: train loss 0.92758
iter_dt 271.27ms; iter 1650: train loss 0.87983
iter_dt 278.67ms; iter 1700: train loss 0.88905
iter_dt 276.85ms; iter 1750: train loss 0.83129
iter_dt 277.93ms; iter 1800: train loss 0.85293
iter_dt 279.54ms; iter 1850: train loss 0.80754
iter_dt 277.32ms; iter 1900: train loss 0.84876
iter_dt 277.99ms; iter 1950: train loss 0.79808
iter_dt 279.51ms; iter 2000: train loss 0.79935
(?LIME ?FORM)))
(meeets ?NUMBER1 ?NUMBER2))
(subclass Body BodyFn UnitOfMeasure)
(documentation BodyFn EnglishLanguage "The &%Class of &%RealNumber that relations are of the &%Number and to extrm the &%Class
&%BodyFn &%Class of &%Class which can the relations of the instance of changed relation for
&%BodyFn &%For example, i.e. the for a &%Class of &%For &%Frun &%For &%Frun &%FrFor which is
cosne of the &%Frunctions a
&%Frunction, &%Frunction the &%Frunctions returns the &%Frunction that relationshes
for an &%Frunction that dimens an express a of &%Frunctions that dimens the specification the are to af &%Frunction o
&%Frunctions the &%Frunction the &%Frunction.")
(subclass Process ColladAttribute)
(documentation Process EnglishLanguage "The &%Process of &%Frunctions that relations the
&%Frunctionall angument that &%Frunction the &%Frunctions are
the &%Frunctions the &%Frunctions argument of an
&%Frunction.")
(subclass ColllecadAttribute)
(documentation ColladAttribute EnglishLanguage "The &%CladAttributes the the
&%Frunction the are number frunction instance of f the &%Frunctions the att the &%Frunction and
process a &%Frunction the the &%Frunctions instances of &%Frunction, and &%Frunction the &%Frunction ?FRRUNCT ?FRUNCTIONCTIns ?FRUNCTITIT is the
the
&%Frunction ?FRUNCTIONCTION ?FRMERUNCTITION ?FRUNCTIONCTION
the is equal to the &%Frunction ?FRUNCTIONCTION ?FRUNCTIONCTIONCTIT.")
(=>
(and
(FunctionFn ?FRUNCTION Frunction)))
(Frugure ?FRUNCTION ?FRUNCTIONCTION ?FRUNCTIONCTION ?FRUNCTIONCTIONCTINCTIONCTION ?INCTITIONCTIONCTINctis ?FRUNCTIONCTIONCTION
?FRE ?FRUNCTIONCTIONCTIONCTION and the clade ?FRUNCTIONCTION ?FRUNCTIONCTIONCT, ?FRUNCTIONCTION ?FRUNCTIONCTION
and the &%Flunctions ?FRUNCTION ?FRUNCTION ?FRCTIONCTIONCTIONCTIN ?FRUNCTIONCTIONCTINCTINCTINCTIONCT
res that ?FRUNCTIONCTION that the requal to the ?FRUNCTION cladestions the equal to the requal to to and the &%Frunctions be the reauto
the &%Flunctions (?FRUNCTION ?FRUNCTIONCTIONCTINCTION ?FRUNCTIONCTINCTIONCTIONCTINCTINCE ?FRUNCTIONCTION
iter_dt 276.98ms; iter 2050: train loss 0.76834
iter_dt 278.20ms; iter 2100: train loss 0.76341
iter_dt 272.22ms; iter 2150: train loss 0.74587
iter_dt 277.65ms; iter 2200: train loss 0.73961
iter_dt 274.52ms; iter 2250: train loss 0.74089
iter_dt 271.87ms; iter 2300: train loss 0.73581
iter_dt 277.52ms; iter 2350: train loss 0.71030
iter_dt 271.29ms; iter 2400: train loss 0.72548
iter_dt 277.05ms; iter 2450: train loss 0.71072
iter_dt 272.20ms; iter 2500: train loss 0.68768
iter_dt 274.49ms; iter 2550: train loss 0.65939
iter_dt 274.95ms; iter 2600: train loss 0.67188
iter_dt 277.66ms; iter 2650: train loss 0.67524
iter_dt 278.06ms; iter 2700: train loss 0.67980
iter_dt 270.37ms; iter 2750: train loss 0.66468
iter_dt 279.51ms; iter 2800: train loss 0.65221
iter_dt 274.44ms; iter 2850: train loss 0.66039
iter_dt 279.49ms; iter 2900: train loss 0.65066
iter_dt 278.90ms; iter 2950: train loss 0.65989
iter_dt 276.06ms; iter 3000: train loss 0.64640
(documentation ?PARESORE ?PRESON))
(instance ?PARESORE PareSolid)
(exists (?PLACE)
(and
(parent ?PARESORE ?PROPESORE)
(part ?PARESORE ?PRESORE))))))
(=>
(and
(instance ?PARESORE Parting)
(part ?PARESORE ?PARESORE)
(instance ?PARESORE Predicate)
(part ?PARESORE ?PARESORE))
(instance ?PARESORE Predicate))
(=>
(and
(instance ?PARESORE PareSolid)
(part ?PARESORE ?ORG)
(instance ?ORG Organization)
(part ?PARESORE ?ORG)
(instance ?ORG Organization)
(part ?PARESORE ?ORG)))
(=>
(and
(instance ?PARESORE Organization)
(part ?PARESORE ?ORG)
(instance ?ORG Organization)
(part ?PARESORE ?ORG)))
(exists (?ORG)
(and
(parent ?PARESORE ?PARESORE)
(parent ?PARESORE ?ORG))))
(=>
(instance ?ORG Organization)
(exists (?HOLE)
(and
(instance ?HOLE HoleRegion)
(part ?HOLE ?HOLE))))
(instance HoleRegion)
(instance HoleRegion)
(instance HoleRegion)
(instance HoleRegion)
(stance HoleRegion)
(instance HoleRegion)
(domain HoleRegion)
(domain HoleRegion)
(domain HoleRegion)
(domain HoleRegion 1 Organization)
(domain HoleRegion)
(domain HoleRegion 2 Organization)
(relatedIternalConcept HoleRegion)
(documentation HoleRegion EnglishLanguage "(&%HoleRegion ?ORG ?ORG) means that ?ORG
is a single of the part of ?ORG. Note that he the cases of ?HOLE.")
(instance HoleRegion)
(documentation HoleRegion EnglishLanguage "The &%Class of &%HoleRegions an instance of the &%HoleRegions,
e.g. overe part or organizations and or &%HoleRegions regions.")
(=>
(instance ?HOLERGION HoleRegion)
(exists (?HOLERGION)
(and
(parent ?HOLERGION ?ORGION)
(parent ?HOLERGION ?ORGANION)))
(=>
(instance ?HOLERGION HoleRegion)
(parent ?HOLERGION)
(hole (WhenFn ?HOLERGION)
(WhenFn ?HOLERGION)))))
(subclass Purpose HoleRegion)
(documentation Purpose EnglishLanguage "The &%Class of &%HoleRegions where an in
iter_dt 278.33ms; iter 3050: train loss 0.64828
iter_dt 276.51ms; iter 3100: train loss 0.63551
iter_dt 282.74ms; iter 3150: train loss 0.61629
iter_dt 277.60ms; iter 3200: train loss 0.61752
iter_dt 274.22ms; iter 3250: train loss 0.63446
iter_dt 273.99ms; iter 3300: train loss 0.60640
iter_dt 279.13ms; iter 3350: train loss 0.60960
iter_dt 278.15ms; iter 3400: train loss 0.61260
iter_dt 277.46ms; iter 3450: train loss 0.62097
iter_dt 279.32ms; iter 3500: train loss 0.59302
iter_dt 276.36ms; iter 3550: train loss 0.60874
iter_dt 273.79ms; iter 3600: train loss 0.57848
iter_dt 271.14ms; iter 3650: train loss 0.59723
iter_dt 271.91ms; iter 3700: train loss 0.59324
iter_dt 274.90ms; iter 3750: train loss 0.58716
iter_dt 270.45ms; iter 3800: train loss 0.59025
iter_dt 273.89ms; iter 3850: train loss 0.56011
iter_dt 272.51ms; iter 3900: train loss 0.56838
iter_dt 271.16ms; iter 3950: train loss 0.56707
iter_dt 279.14ms; iter 4000: train loss 0.57376
(?INST2))))
(=>
(instance ?INST1 CaseRole)
(instance ?INST2 CaseRole))
(=>
(instance ?INST2 CaseRole)
(equal ?INST1 ?INST1))
(subclass CaseRole)
(documentation CaseRole EnglishLanguage "A &%CaseRole is a caseRole in a caseRole. Note
that indicate computed in an instance of an instance of &%CaseRole.")
(=>
(instance ?INST1 CaseRole)
(exists (?INST2)
(and
(instance ?INST1 ?CSS)
(not
(equal ?INST2 ?INST2)))))
(subclass CaseRole CaseRole)
(documentation CaseRole EnglishLanguage "A &%CaseRole is a subclass of
&%CaseRole, i.e. indicate the &%CaseRole is a served by a caseRole. In other words, excertly
are instances of &%CaseRoles.")
(=>
(instance ?CASER CaseRole)
(exists (?FORMULA)
(and
(instance ?FORMULA Formula)
(not
(equal ?FORMULA ?CASER)
(instance ?FORMULA Formula))))
(subclass CaseRole Relation)
(documentation CaseRole EnglishLanguage "A &%Relation is a &%Role is a sequal to the
&%CaseRole. In part of &%CaseRole is a &%CaseRoles in a subclass of
&%CaseRole.")
(=>
(instance ?CASER CaseRole)
(exists (?CASSER)
(and
(instance ?CASER CaseRole)
(instance ?CASER CaseRole)
(instance ?CASSER CaseRole)))
(subclass CaseRole)
(documentation CaseRole EnglishLanguage "A &%Rele is a &%CaseRole which is an exactly into a service in
the service by dividual recipace. A &%CaseRole is a service in which is a service in a subclass of ex
is something is part of an &%CaseRole.")
(=>
(instance ?CASSERRLE CaseRole)
(exists (?PART ?CASS1)
(and
(instance ?CASS1ERLE CaseRole)
(instance ?PART CaseRole)
(instance ?CAS1 CaseRole)
(case ?CASS1 ?CASS2)
(instance ?CASS2 ?CASS2)))))
(subclass NationalPsychologicalPsychologicalProcess InheritableRole)
(documentation NationalPsychologicalProcess EnglishLanguage "Any &%InheritableRelation is a relationship in
also related by a &%CaseRoles. Note that this in in the acse in service processed to determined from a
&%NationalPsych
iter_dt 277.66ms; iter 4050: train loss 0.58355
iter_dt 277.55ms; iter 4100: train loss 0.56801
iter_dt 278.81ms; iter 4150: train loss 0.55275
iter_dt 278.72ms; iter 4200: train loss 0.57238
iter_dt 280.44ms; iter 4250: train loss 0.55889
iter_dt 272.40ms; iter 4300: train loss 0.55090
iter_dt 276.23ms; iter 4350: train loss 0.55316
iter_dt 297.23ms; iter 4400: train loss 0.55345
iter_dt 277.38ms; iter 4450: train loss 0.55947
iter_dt 272.73ms; iter 4500: train loss 0.53318
iter_dt 274.72ms; iter 4550: train loss 0.54480
iter_dt 278.18ms; iter 4600: train loss 0.53526
iter_dt 279.83ms; iter 4650: train loss 0.54048
iter_dt 277.49ms; iter 4700: train loss 0.53378
iter_dt 277.38ms; iter 4750: train loss 0.54286
iter_dt 274.36ms; iter 4800: train loss 0.53511
iter_dt 273.12ms; iter 4850: train loss 0.52737
iter_dt 274.61ms; iter 4900: train loss 0.52727
iter_dt 272.23ms; iter 4950: train loss 0.52376
iter_dt 272.71ms; iter 5000: train loss 0.51186
(instance ?S ?A)
(during ?S ?S))
(=>
(instance ?S ?O)
(part ?S ?O)
(during ?S ?T)
(subclass Discring SelfConnectedObject)
(documentation Discring EnglishLanguage "The &%Class of all &%SelfConnectedObject that can be a &%Object which
the internal &%Object is discringly when a &%Object is discringly the
result of a &%SelfConnectedObject.")
(=>
(and
(instance ?S Discring)
(part ?S ?OBJ)
(part ?S ?P))
(instance ?OBJ ?P))
(subclass SelfConnectedObject SelfConnectedObject)
(documentation SelfConnnectedObject EnglishLanguage "The &%Class of &%SelfConnectedObjects that
connected by a &%SelfConnectedObject that result of a &%SelfConnectedObjects
that have a &%SelfConnectedObject, i.e. the results of a &%SelfConnectedObject,
and &%Proposition that selfConnectedObject, but is used with the
&%Result.")
(subclass Object SelfConnectedObject)
(documentation Object EnglishLanguage "&%SelfConnectedObjects that is connected by
&%Object connected by &%Object that have a &%Proposition.
&%Object is means a &%Object.")
(subclass SelfConnectedObject SelfConnnectedObject)
(subclass SelfConnectedObject Object)
(documentation SelfConnectedObject EnglishLanguage "The &%Class of
&%SelfConnnectedObjects that is discringly &%Proposition in the
selfConnected by &%Object.")
(=>
(and
(instance ?SELfConnnectedObject)
(patient ?SEL ?OBJ))
(part ?SELL ?OBJ))
(or
(not ?SELL ?OBJ))
(not (connected ?SELL ?OBJ)))
(=>
(and
(instance ?SELL SelfConnnectedObject)
(patient ?SELL ?OBJ))
(exists (?PROP)
(and
(part ?SELL ?OBJ)
(patient ?SELL ?OBJ))))
(subclass SelfConnectedObject SelfConnnectedObject)
(documentation SelfConnectedObject EnglishLanguage "The &%Class of &%SelfConnnectedObject is not a
&%Object in the &%Object is no &%Processes when the &%Object is not
root a selfConnnectedObject, which are selfConnnected by internal
&%Proposition.")
(=>
(and
(instance ?SELL SelfConnnectedObject)
(patient ?SELL ?OBJ))
(and
(part ?SEL ?O
iter_dt 271.32ms; iter 5050: train loss 0.54352
iter_dt 274.87ms; iter 5100: train loss 0.51441
iter_dt 271.11ms; iter 5150: train loss 0.51655
iter_dt 271.65ms; iter 5200: train loss 0.53204
iter_dt 274.61ms; iter 5250: train loss 0.50122
iter_dt 271.86ms; iter 5300: train loss 0.51573
iter_dt 273.90ms; iter 5350: train loss 0.51194
iter_dt 273.45ms; iter 5400: train loss 0.49755
iter_dt 270.68ms; iter 5450: train loss 0.50284
iter_dt 275.92ms; iter 5500: train loss 0.50810
iter_dt 277.17ms; iter 5550: train loss 0.49522
iter_dt 279.87ms; iter 5600: train loss 0.48856
iter_dt 276.80ms; iter 5650: train loss 0.50170
iter_dt 279.02ms; iter 5700: train loss 0.48380
iter_dt 274.80ms; iter 5750: train loss 0.49427
iter_dt 271.66ms; iter 5800: train loss 0.47926
iter_dt 274.44ms; iter 5850: train loss 0.48904
iter_dt 277.14ms; iter 5900: train loss 0.50740
iter_dt 274.49ms; iter 5950: train loss 0.49027
iter_dt 271.88ms; iter 6000: train loss 0.49644
(instance ?LEAVE Leaving)
(during ?T1 ?LEAVE)
(instance ?LEAVE Leaving)
(agent ?LEAVE ?AGENT)
(patient ?LEAVE ?OBJ)
(holdsDuring (EndFn (WhenFn ?LEAVE)) (attribute ?OBJ Leaving)))))
(subclass Artifact Leaving)
(documentation Artifact EnglishLanguage "The &%Class of &%Leaving that are
converted by &%Objects.")
(=>
(and
(instance ?ARTIFACT Artifact)
(holdsDuring ?T1 (attribute ?OBJ Leaving)))
(holdsDuring ?T1 ?ARTIFACT))
(holdsDuring ?T2 ?D))
(holdsDuring ?T1 ?D)))
; (equal ?OBJ2 ?PARTIFACT)))))))
;; The following functions as that at there following power by &%Object coverted form the
;; systems constant be about the constant becompted by tween the Ontology of the two contract whing the
;; following water is the ontology formal of the Ostructure, by which chema. The olowing to make from the
;;; special to all that chema. The side following defined a resh-pee constant the same in which the spepart of the Ostract
;; system changed by something of the Ostructure.
;;; ; (=>
;; (and
;; (instance ?STUFFFFFFOrmall)
;; (overlapsSpatially (Cases ?OBJ1 ?OBJ2) ?ARG1)
;; (not
; (equal ?STUFFFFFF ?OBJ2))
; (instance ?OBJ1 ?STUFFFFFFFFF U)))))
;; (holdsDuring (EndFn (WhenFn ?STUFFFFFFFF ?ORMULA))
;;; (holdsDuring (EndFn (WhenFn ?STUFFFFFFFF ?ORMULA))
;; (holdsDuring ?TIMULA (attribute ?OBJ2 ?PLACE)))))
;; NS: delete. Some examples to all recentation
;;; ; (and
;;; (sub
iter_dt 280.50ms; iter 6050: train loss 0.48649
iter_dt 276.03ms; iter 6100: train loss 0.47793
iter_dt 276.59ms; iter 6150: train loss 0.48527
iter_dt 271.51ms; iter 6200: train loss 0.47849
iter_dt 278.46ms; iter 6250: train loss 0.47251
iter_dt 277.71ms; iter 6300: train loss 0.46888
iter_dt 271.48ms; iter 6350: train loss 0.49052
iter_dt 278.29ms; iter 6400: train loss 0.46272
iter_dt 279.00ms; iter 6450: train loss 0.46489
iter_dt 273.79ms; iter 6500: train loss 0.49129
iter_dt 275.61ms; iter 6550: train loss 0.47921
iter_dt 279.76ms; iter 6600: train loss 0.47359
iter_dt 272.02ms; iter 6650: train loss 0.46576
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
<ipython-input-96-8cb4f797c05b> in <module>
112
113 # run the optimization
--> 114 trainer.run()
2 frames
/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
195 # some Python versions print out the first line of a multi-line function
196 # calls in the traceback and some print out the last line
--> 197 Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
198 tensors, grad_tensors_, retain_graph, create_graph, inputs,
199 allow_unreachable=True, accumulate_grad=True) # Calls into the C++ engine to run the backward pass
KeyboardInterrupt:
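
To plot or summarize the loss curve, the `iter_dt ...; iter N: train loss X` lines can be pulled out of a log like this one with a regular expression. Below is a minimal sketch, assuming the exact line format shown above. `parse_log` is a hypothetical helper; the three loss lines in the example are copied from this log, and the non-matching line stands in for sampled text that the parser should skip.

```python
import re

# Matches lines such as: "iter_dt 273.86ms; iter 50: train loss 3.30561"
LINE_RE = re.compile(
    r"iter_dt (?P<dt>\d+(?:\.\d+)?)ms; "
    r"iter (?P<it>\d+): train loss (?P<loss>\d+(?:\.\d+)?)"
)


def parse_log(text):
    """Return a list of (iteration, iter_dt_ms, train_loss) tuples."""
    records = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:  # sampled text and tracebacks simply do not match
            records.append((int(m["it"]), float(m["dt"]), float(m["loss"])))
    return records


log = """\
iter_dt 273.86ms; iter 50: train loss 3.30561
(subclass PaphPation Praroce)
iter_dt 273.33ms; iter 1000: train loss 1.49976
iter_dt 272.02ms; iter 6650: train loss 0.46576
"""

records = parse_log(log)
mean_dt_ms = sum(dt for _, dt, _ in records) / len(records)
last_iter, _, last_loss = records[-1]

print(f"{len(records)} loss records, final train loss {last_loss}")
# Rough estimate only: assumes the sampled iter_dt values are representative
# of every iteration and ignores time spent generating samples.
print(f"approx. training time to iter {last_iter}: "
      f"{last_iter * mean_dt_ms / 1000 / 60:.1f} min")
```

The resulting tuples can be passed to any plotting or smoothing code. Keeping `iter_dt` alongside the loss also makes slow iterations, such as the 297.23 ms one at iter 4400, easy to spot.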