
Content provided by Machine Learning Street Talk (MLST). All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Machine Learning Street Talk (MLST) or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://he.player.fm/legal.

Ryan Greenblatt - Solving ARC with GPT4o

2:18:01
 


Ryan Greenblatt from Redwood Research recently published "Getting 50% (SoTA) on ARC-AGI with GPT-4o," in which he used GPT-4o to reach state-of-the-art accuracy on François Chollet's ARC Challenge by generating many Python programs.

Sponsor:

Sign up to Kalshi here: https://kalshi.onelink.me/1r91/mlst. The first 500 traders who deposit $100 will get a free $20 credit! Important disclaimer, in case it's not obvious: this is basically gambling and a *high risk* activity. Only trade what you can afford to lose.

We discuss:

- Ryan's unique approach to solving the ARC Challenge and achieving impressive results.

- The strengths and weaknesses of current AI models.

- How AI and humans differ in learning and reasoning.

- Combining various techniques to create smarter AI systems.

- The potential risks and future advancements in AI, including the idea of agentic AI.

https://x.com/RyanPGreenblatt

https://www.redwoodresearch.org/

Refs:

Getting 50% (SoTA) on ARC-AGI with GPT-4o [Ryan Greenblatt]

https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt

On the Measure of Intelligence [Chollet]

https://arxiv.org/abs/1911.01547

Connectionism and Cognitive Architecture: A Critical Analysis [Jerry A. Fodor and Zenon W. Pylyshyn]

https://ruccs.rutgers.edu/images/personal-zenon-pylyshyn/proseminars/Proseminar13/ConnectionistArchitecture.pdf

Software 2.0 [Andrej Karpathy]

https://karpathy.medium.com/software-2-0-a64152b37c35

Why Greatness Cannot Be Planned: The Myth of the Objective [Kenneth Stanley]

https://amzn.to/3Wfy2E0

Biographical account of Terence Tao’s mathematical development [M. A. (Ken) Clements]

https://gwern.net/doc/iq/high/smpy/1984-clements.pdf

Model Evaluation and Threat Research (METR)

https://metr.org/

Why Tool AIs Want to Be Agent AIs

https://gwern.net/tool-ai

Simulators - Janus

https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators

AI Control: Improving Safety Despite Intentional Subversion

https://www.lesswrong.com/posts/d9FJHawgkiMSPjagR/ai-control-improving-safety-despite-intentional-subversion

https://arxiv.org/abs/2312.06942

What a Compute-Centric Framework Says About Takeoff Speeds

https://www.openphilanthropy.org/research/what-a-compute-centric-framework-says-about-takeoff-speeds/

Global GDP over the long run

https://ourworldindata.org/grapher/global-gdp-over-the-long-run?yScale=log

Safety Cases: How to Justify the Safety of Advanced AI Systems

https://arxiv.org/abs/2403.10462

The Danger of a "Safety Case"

http://sunnyday.mit.edu/The-Danger-of-a-Safety-Case.pdf

The Future Of Work Looks Like A UPS Truck (~02:15:50)

https://www.npr.org/sections/money/2014/05/02/308640135/episode-536-the-future-of-work-looks-like-a-ups-truck

SWE-bench

https://www.swebench.com/

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model

https://arxiv.org/pdf/2201.11990

Algorithmic Progress in Language Models

https://epochai.org/blog/algorithmic-progress-in-language-models

