LW - Confusing the metric for the meaning: Perhaps correlated attributes are "natural" by NickyP
Epistemic status: possibly trivial, but I hadn't heard it before.
TL;DR: What I thought of as a "flaw" in PCA - its inability to isolate pure metrics - might actually be a feature that aligns with our cognitive processes. We often think in terms of composite concepts (e.g., "Age + correlated attributes") rather than pure metrics, and this composite thinking might be more natural and efficient.
Introduction
I recently found myself describing Principal Component Analysis (PCA) and pondering its potential drawbacks. However, upon further reflection, I'm reconsidering whether what I initially viewed as a limitation might actually be a feature. This led me to think about how our minds - and, potentially, language models - might naturally encode information using correlated attributes.
An important aspect of this idea is the potential conflation between the metric we use to measure something and the actual concept we're thinking about. For instance, when we think about a child's growth, we might not be consciously separating the concept of "age" from its various correlated attributes like height, cognitive development, or physical capabilities. Instead, we might be thinking in terms of a single, composite dimension that encompasses all these related aspects.
Having looked at active inference a while ago, my impression is that many human heuristics and biases exist to encode real-world relationships more efficiently, and that they only seem "irrational" when strained in out-of-distribution experimental settings.
I think the easiest way to explain this is with a couple of examples:
1 - Age and Associated Attributes in Children
Suppose we plot two attributes for children: Age (in years) against Height (in cm). If we perform Principal Component Analysis on them, we get two components. Because the two attributes are highly correlated, these components will not line up with separate Age and Height axes. Instead, we will find an "Age + Height" direction, and a "Height relative to what is standard for that age" direction.
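To make this concrete, here is a minimal sketch using NumPy on synthetic data. The growth numbers (a linear trend of 6 cm per year with 6 cm of individual noise) and the helper name `pca_directions` are assumptions made purely for illustration, not real measurements. Both columns are standardised first, so that years and centimetres are on a comparable scale.

```python
import numpy as np

def pca_directions(data):
    """Return (components, explained_variance_ratio) for the rows of `data`.

    Columns are standardised first so that different units are comparable.
    Components are returned as rows, sorted by decreasing variance.
    """
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    cov = np.cov(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Eigenvector signs are arbitrary; flip so the Age coordinate is non-negative.
    signs = np.where(eigvecs[0] < 0, -1.0, 1.0)
    eigvecs = eigvecs * signs
    return eigvecs.T, eigvals / eigvals.sum()

# Assumption: made-up data, not real growth charts.
rng = np.random.default_rng(0)
age = rng.uniform(2, 16, size=2000)                    # years
height = 85 + 6 * age + rng.normal(0, 6, size=2000)    # cm

components, ratio = pca_directions(np.column_stack([age, height]))
print("component 1 (age, height):", components[0])
print("component 2 (age, height):", components[1])
print("share of variance:", ratio)
```

With two standardised, positively correlated variables, the first component weights age and height equally with the same sign (the "Age + Height" direction), and the second weights them equally with opposite signs (older-but-shorter versus younger-but-taller, i.e. height relative to what is typical for that age). Without standardising, the first component would instead lean towards whichever variable has the larger numerical spread.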
While one can think of this as a "failure" of PCA to find the "true things we are measuring", I think this is perhaps not the correct way to think about it.
For example, if I told you to imagine a 10-year-old, you would probably imagine them to be of height ~140 ± 5 cm. And if I told you they were 2.0 m tall or 0.5 m tall, you would be very surprised. On the other hand, one often hears phrases like "about the height of a 10-year-old".
That is, when we think about a child's development, we don't typically separate each attribute into distinct vectors like "age," "height," "voice pitch," and so on. Instead, we might encode a single "age + correlated attributes" vector, with some adjustments for individual variations.
This approach is likely more efficient than encoding each attribute separately. It captures the strong correlations that exist in typical development, while allowing for deviations when necessary.
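One way to illustrate the efficiency claim is to store a single number per child (their position along the first principal component) and see how much of the original data that one number recovers. The sketch below assumes synthetic data in which several attributes each track age plus individual noise. The attribute names, coefficients, and noise levels are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
age = rng.uniform(2, 16, size=n)

# Assumption: hypothetical attributes, each a made-up linear function of age plus noise.
height = 85 + 6 * age + rng.normal(0, 6, size=n)
vocabulary = 500 * age + rng.normal(0, 800, size=n)
shoe_size = 18 + 1.2 * age + rng.normal(0, 1.5, size=n)

data = np.column_stack([age, height, vocabulary, shoe_size])
z = (data - data.mean(axis=0)) / data.std(axis=0)   # standardise each column

# SVD of the standardised data: the rows of vt are the principal directions.
u, s, vt = np.linalg.svd(z, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)

# Encode every child with one number, then reconstruct all four attributes from it.
scores = z @ vt[0]
reconstruction = np.outer(scores, vt[0])
residual_fraction = np.mean((z - reconstruction) ** 2) / np.mean(z ** 2)

print("variance captured by one component:", explained[0])
print("fraction lost in reconstruction:", residual_fraction)
```

In this toy setup, the one-number code keeps most of the variance across all four attributes, and the remaining components play the role of the "adjustments for individual variations". How much is retained depends entirely on how strongly the attributes are correlated, which here is an assumption baked into the synthetic data.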
When one talks about age, one can define it as:
"number of years of existence" (independent of anything else)
but when people talk about "age" in everyday life, the definition is more akin to:
"years of existence, and all the attributes correlated to that".
2 - Price and Quality of Goods
Our tendency to associate price with quality and desirability might not be a bias, but an efficient encoding of real-world patterns. A single "value" dimension that combines price, quality, and desirability could capture the most relevant information for everyday decision-making, with additional dimensions only needed for finer distinctions.
That is, "cheap" can be conceptualised ...