Learning From Human Preferences
One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior. In collaboration with DeepMind’s safety team, we’ve developed an algorithm which can infer what humans want by being told which of two proposed behaviors is better.
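To make the idea concrete, below is a minimal sketch of the core technique the episode describes: fitting a reward model to pairwise human preference labels with a Bradley-Terry style loss, rather than hand-writing a goal function. This is an illustrative reconstruction, not the authors' implementation; the class and function names, network sizes, and the toy data are all assumptions.

```python
# Illustrative sketch: learning a reward model from pairwise preferences.
# Trajectory segments are assumed to be summarised as per-step feature vectors.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Maps a trajectory segment (steps x features) to a scalar reward."""
    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, segment: torch.Tensor) -> torch.Tensor:
        # Score a whole segment by summing predicted per-step rewards.
        return self.net(segment).sum(dim=1).squeeze(-1)

def preference_loss(model, seg_a, seg_b, prefers_a):
    """Cross-entropy on P(a preferred over b) given by a softmax of summed rewards."""
    r_a, r_b = model(seg_a), model(seg_b)
    logits = torch.stack([r_a, r_b], dim=1)   # shape (batch, 2)
    targets = (~prefers_a).long()             # 0 if segment a was preferred, else 1
    return nn.functional.cross_entropy(logits, targets)

# Toy usage with random segments and random "human" labels.
obs_dim, steps, batch = 8, 25, 16
model = RewardModel(obs_dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
seg_a = torch.randn(batch, steps, obs_dim)
seg_b = torch.randn(batch, steps, obs_dim)
prefers_a = torch.rand(batch) < 0.5

loss = preference_loss(model, seg_a, seg_b, prefers_a)
loss.backward()
opt.step()
```

In the full method, the learned reward model would then be used as the reward signal for a reinforcement learning agent, with new pairs of behavior clips periodically sent back to the human for comparison.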
Original article:
https://openai.com/research/learning-from-human-preferences
Authors:
Dario Amodei, Paul Christiano, Alex Ray
A podcast by BlueDot Impact.
Learn more on the AI Safety Fundamentals website.