Using the Smartest AI to Rate Other AI
Manage episode 477833255 series 2343127
In this episode, I walk through a Fabric Pattern that assesses how well a given model performs on a task relative to humans. The system uses your smartest AI model to evaluate the performance of other AIs, scoring them across a range of tasks and comparing the results to human intelligence levels.
I talk about:
1. Using One AI to Evaluate Another
The core idea is simple: use your most capable model (like Claude 3 Opus or GPT-4) to judge the outputs of another model (like GPT-3.5 or Haiku) against a task and input. This gives you a way to benchmark quality without manual review.
2. A Human-Centric Grading System
Models are scored on a human scale—from “uneducated” and “high school” up to “PhD” and “world-class human.” Stronger models consistently rate higher, while weaker ones rank lower—just as expected.
3. Custom Prompts That Push for Deeper Evaluation
The rating prompt includes instructions to emulate a 16,000+ dimensional scoring system, using expert-level heuristics and attention to nuance. The system also asks the evaluator to describe what would have been required to score higher, making this a meta-feedback loop for improving future performance.
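The judge-based setup in the three points above can be sketched roughly as follows. This is a minimal illustration, not the actual Fabric pattern: the scale labels follow the episode's description, but the prompt wording, function names, and the stubbed judge call are all illustrative assumptions (a real run would send the prompt to your strongest model's API).

```python
# Sketch of an "LLM-as-judge" evaluation loop. The human-level scale
# mirrors the grading system described above; everything else here is
# a hypothetical stand-in for the real Fabric pattern.

HUMAN_SCALE = [
    "uneducated", "high school", "bachelor", "master",
    "PhD", "world-class human",
]

def build_judge_prompt(task: str, candidate_output: str) -> str:
    """Compose the evaluation prompt given to the stronger model."""
    scale = ", ".join(HUMAN_SCALE)
    return (
        "You are a world-class evaluator. Rate the following output "
        f"on this human scale: {scale}.\n"
        f"Task: {task}\n"
        f"Output to rate: {candidate_output}\n"
        "Reply with the scale label, then describe what would have "
        "been required to score higher."
    )

def parse_rating(judge_reply: str) -> str:
    """Return the highest scale label mentioned in the judge's reply."""
    found = [label for label in HUMAN_SCALE
             if label.lower() in judge_reply.lower()]
    return found[-1] if found else "unrated"

# Stub standing in for an API call to the strongest model (the judge).
def fake_judge(prompt: str) -> str:
    return "PhD. To score higher, the answer needed novel synthesis."

prompt = build_judge_prompt("Summarize the article", "A decent summary...")
print(parse_rating(fake_judge(prompt)))  # → PhD
```

The "what would have been required to score higher" instruction is what turns this into the meta-feedback loop described in point 3: the judge's critique can be fed back into the weaker model's next attempt.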
Note: This episode was recorded a few months ago, so the AI models mentioned may not be the latest—but the framework and methodology still work perfectly with current models.
Subscribe to the newsletter at:
https://danielmiessler.com/subscribe
Join the UL community at:
https://danielmiessler.com/upgrade
Follow on X:
https://x.com/danielmiessler
Follow on LinkedIn:
https://www.linkedin.com/in/danielmiessler
See you in the next one!
Become a Member: https://danielmiessler.com/upgrade
See omnystudio.com/listener for privacy information.