Last active Oct 22, 2019
Deliberation as a method to find the "actual preferences" of humans

Some recent discussion about what Paul Christiano means by "short-term preferences" got me thinking more generally about deliberation as a method of figuring out the human user's or users' "actual preferences". (I can't give a definition of "actual preferences" because we have such a poor understanding of meta-ethics that we don't even know what the term should mean or whether such preferences even exist.)

To set the framing of this post: We want good outcomes from AI. To get this, we probably want to figure out the human user's or users' "actual preferences" at some point. There are several options for this:

  • Directly solve meta-ethics. We figure out whether there are normative facts about what we should value, and use this solution to clarify what "actual preferences" means and to find the human's or humans' "actual prefere
riceissa /
Last active Oct 30, 2019
Attempt to pass Paul's ITT for strategy-stealing stuff

warning: I'm currently making a bunch of changes to this

Understanding strategy-stealing in the Corrigible Contender scenario

Here is my current best guess for Paul's strategy-stealing position:

There is a tension between (a) doing things that the human user understands; and (b) being competitive, doing the "optimal" thing for the long term, stealing unaligned AIs' strategies, etc. Paul resolves this tension by giving up on (a), and focusing just on (b). This means that the human user will basically not understand what's going on in the world (the world is changing too quickly and too dramatically, the aligned AI is taking actions that are too difficult to understand, etc.).

If we were talking about the Sovereign Singleton scenario (I will be using terminology from Wei Dai's success stories post), giving up on (a) seems fine, since the AI would have a CEV-like specification of the human user's values. But in the Corrigibl

#!/usr/bin/env python3
import datetime
import mysql.connector
# import matplotlib
# matplotlib.use('Agg')
import matplotlib.pyplot as plt
cnx = mysql.connector.connect(user='issa', database='donations')
riceissa /
Last active Sep 9, 2019
Funding chains in the x-risk/AI safety ecosystem
#!/usr/bin/env python3
# License: CC0
from graphviz import Digraph
whitelist = {
'Open Philanthropy Project': "Open Phil",
'Future of Humanity Institute': "FHI",
'Machine Intelligence Research Institute': "MIRI",
'Berkeley Existential Risk Initiative': "BERI",
riceissa / eliezer_non_sequence_posts.csv
Last active Jun 30, 2019
Eliezer Yudkowsky's non-sequence posts on LessWrong
2018-12-12T01:40:13.298Z,71,Should ethicists be inside or outside a profession?,
2018-12-07T22:24:17.072Z,82,Transhumanists Don't Need Special Dispositions,
2018-12-05T20:12:13.114Z,86,Transhumanism as Simplified Humanism,
2018-11-16T23:06:29.506Z,115,Is Clickbait Destroying Our General Intelligence?,
2018-10-28T20:09:32.056Z,108,On Doing the Improbable,
2018-10-04T00:38:58.795Z,150,The Rocket Alignment Problem,
#!/usr/bin/env python3
# List from
meta_dict = {
"English": "en",
"Cebuano": "ceb",
"Swedish": "sv",
"German": "de",
"French": "fr",
"Dutch": "nl",
riceissa / dump.tsv
Last active Mar 29, 2019
select donor,donations_url,website,notes from donors where donor_type = 'Individual' or donor_type = 'Couple'
donor donations_url website notes
Jeff Kaufman and Julia Wise It looks like the 50% of AGI became a target as a result of a compromise between Jeff and Julia: Jeff sought a little less and Julia sought a little more. In 2017, the couple targeted 30% initially, while Jeff was working at a lower-pay but potentially higher direct-impact job at money transfer company Wave. However, it went back up to 50% after he was fired from Wave and returned to Google. In 2011, the couple did not donate because Jeff was working at a startup and Julia was paying for graduate school out of pocket, see note [1] at Also, note that donations listed in italics are not included as separate line items in the donations list on the Donations List Website; however, the ones among these that are employer matches are noted as employer matches to the corresponding donation. Consistency of totals against the totals listed on top at https://www.jefftk.c
riceissa / dump.tsv
Created Mar 29, 2019
select donor,sum(amount),group_concat(distinct url separator ' ') from donations group by donor order by sum(amount) desc
donor sum(amount) group_concat(distinct url separator ' ')
Vitalik Buterin 4295605.00
Thiel Foundation 1627000.00
Gordon Irlam 1333026.00
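
The query above groups donations by donor, sums the amounts, collects each donor's distinct URLs, and sorts by total in descending order. A minimal sketch of the same aggregation, assuming an in-memory SQLite database instead of the MySQL `donations` database: the rows, the column types, and the `donor_totals` helper are hypothetical, and SQLite's `group_concat(distinct ...)` does not accept the MySQL `separator` clause, so URLs are joined with the default comma here.

```python
import sqlite3

# Hypothetical rows for illustration only; not taken from the real database.
ROWS = [
    ("Donor A", 100, "https://example.org/a1"),
    ("Donor A", 250, "https://example.org/a2"),
    ("Donor B", 500, "https://example.org/b1"),
    ("Donor B", 25, "https://example.org/b1"),
]


def donor_totals(rows):
    """Return (donor, total, urls) tuples sorted by total, largest first."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: only the three columns the query touches.
    conn.execute(
        "create table donations (donor text, amount integer, url text)"
    )
    conn.executemany("insert into donations values (?, ?, ?)", rows)
    # SQLite variant of the query: default comma separator instead of ' '.
    result = conn.execute(
        "select donor, sum(amount), group_concat(distinct url) "
        "from donations group by donor order by sum(amount) desc"
    ).fetchall()
    conn.close()
    return result


totals = donor_totals(ROWS)
for donor, total, urls in totals:
    print(donor, total, urls)
```

Each donor appears once in the result, with repeated URLs collapsed by `distinct`, which is why the real dump has one row per donor.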
riceissa / dump.csv
Created Feb 1, 2019
Day Index Pageviews
1/22/19 0
1/23/19 26
1/24/19 159
1/25/19 36
1/26/19 18
1/27/19 21
1/28/19 11
1/29/19 17
1/30/19 5
riceissa / analysis_theorems.tex
Last active Dec 7, 2018
dependency graph of theorems in real analysis
\node[draw,text width=3.5cm] (lub) at (5,10) {Least upper bound property};
\node[draw,text width=4cm] (bw) at (0,0) {Bolzano--Weierstrass theorem};
\node[draw,text width=3cm] (nested) at (11,0) {Nested intervals theorem};
\node[draw,text width=3cm] (ivt) at (11,5) {Intermediate value theorem};
\node[draw,text width=3cm] (bounded) at (5,5) {Boundedness theorem};
\node[draw,text width=3cm] (evt) at (0,5) {Extreme value theorem};
\draw[->] (lub) -- (nested) node[midway, fill=white, text width=1.5cm] {Stillwell, Folland};
\draw[->] (lub) -- (evt) node[midway, fill=white] {Spivak};
