"Humans Provide An Untapped Wealth Of Evidence About Alignment" By TurnTrout & Quintin Pope LessWrong Curated podcast

"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope

1+ y ago 22:48

שתפו

תוכן מסופק על ידי LessWrong. כל תוכן הפודקאסטים כולל פרקים, גרפיקה ותיאורי פודקאסטים מועלים ומסופקים ישירות על ידי LessWrong או שותף פלטפורמת הפודקאסט שלהם. אם אתה מאמין שמישהו משתמש ביצירה שלך המוגנת בזכויות יוצרים ללא רשותך, אתה יכול לעקוב אחר התהליך המתואר כאן https://he.player.fm/legal.

https://www.lesswrong.com/posts/CjFZeDD6iCnNubDoS/humans-provide-an-untapped-wealth-of-evidence-about#fnref7a5ti4623qb

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

TL;DR: To even consciously consider an alignment research direction, you should have evidence to locate it as a promising lead. As best I can tell, many directions seem interesting but do not have strong evidence of being “entangled” with the alignment problem such that I expect them to yield significant insights.

For example, “we can solve an easier version of the alignment problem by first figuring out how to build an AI which maximizes the number of real-world diamonds” has intuitive appeal and plausibility, but this claim doesn’t have to be true and this problem does not necessarily have a natural, compact solution. In contrast, there do in fact exist humans who care about diamonds. Therefore, there are guaranteed-to-exist alignment insights concerning the way people come to care about e.g. real-world diamonds.

“Consider how humans navigate the alignment subproblem you’re worried about” is a habit which I (TurnTrout) picked up from Quintin Pope. I wrote the post, he originated the tactic.

254 פרקים

https://www.lesswrong.com/posts/CjFZeDD6iCnNubDoS/humans-provide-an-untapped-wealth-of-evidence-about#fnref7a5ti4623qb

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

“Consider how humans navigate the alignment subproblem you’re worried about” is a habit which I (TurnTrout) picked up from Quintin Pope. I wrote the post, he originated the tactic.

פודקאסטים ששווה להאזין

LessWrong Curated Podcast « »
"Humans provide an untapped wealth of evidence about alignment" by TurnTrout & Quintin Pope