
Page history last edited by Mike 2 months, 2 weeks ago


The Logical Validity Score

The Wall-of-Words Problem

Every platform that hosts opinions suffers from the same structural flaw.

Amazon reviews. Letterboxd comments. YouTube reactions. Podcast discussions. Article comment sections. They all allow people to express arguments, but they treat those arguments as monolithic blocks. Upvote or downvote. Helpful or not. Thumbs up or thumbs down.

A thoughtful critic writes five paragraphs explaining why something fails. Buried in that wall are three distinct critiques: a factual error, a logical flaw, and a contested value judgment. Those three claims deserve separate evaluation. Maybe the first is devastating, the second is weak, and the third is genuinely contested. But the platform treats the whole review as one unit.

The arguments never get decomposed. When hundreds of critics independently raise the same objection, those remain hundreds of separate walls of text. Nobody consolidates them into one crisp statement with branching pros and cons. When someone raises a genuinely novel objection, it doesn't stand out from the redundant pile.

The Logical Validity Score (LVS) solves this by decomposing arguments into atomic claims, each with its own branching tree of supporting and opposing sub-arguments. Similar points merge. Redundancy disappears. Every critique links to what it's critiquing. What emerges is a structured argument network: the most comprehensive, deduplicated map of why something's reasoning succeeds or fails.
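
The branching tree described above can be sketched as a simple data structure. A minimal sketch; `ClaimNode` and its fields are hypothetical names for illustration, not the platform's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class ClaimNode:
    """One atomic claim with its branching pro/con sub-arguments."""
    text: str
    pros: list = field(default_factory=list)  # supporting sub-claims
    cons: list = field(default_factory=list)  # opposing sub-claims

# A multi-paragraph review decomposed into separately evaluable claims:
review = ClaimNode("This book's argument fails")
review.pros.append(ClaimNode("Chapter 2 contains a factual error"))
review.pros.append(ClaimNode("The conclusion rests on a contested value judgment"))
review.cons.append(ClaimNode("The cited error does not affect the main thesis"))
```

Consolidation then amounts to merging nodes whose text expresses the same claim, so each objection appears once with its full pro/con tree.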

Note: The LVS measures logical validity, not overall quality. A film can be beautifully shot while making flawed arguments. A book can be a slog to read while being logically airtight. A podcast can be entertaining while spreading misinformation. For craftsmanship, enjoyment, and other dimensions, see the companion Quality Scores for each medium.


Why Branching Sub-Arguments Matter

Here's where most evaluation systems fundamentally fail: they treat claims as self-contained, when in fact every claim rests on deeper contested assumptions.

Consider a critic who writes: "This story is depressing. It doesn't offer hope or a satisfying resolution. Two stars."

That sounds like a straightforward opinion. But the critic is making a claim ("lack of hope is a flaw") that depends on winning several deeper arguments they never explicitly make:

Surface claim: "This is bad because it's hopeless"

Sub-argument 1: "Stories should offer hope"

  • Pro: Narrative evolved to model how humans overcome adversity. From prehistoric campfires to modern cinema, storytelling exists to show people triumphing over obstacles. Psychologists argue this is the fundamental utility of the medium. Wanting hope isn't naive Pollyanna thinking; it's what stories are for.
  • Con: That's a narrow view of narrative art. Tragedy is an ancient and respected genre. Oedipus doesn't offer hope. Neither does 1984, Chinatown, or Blood Meridian. "Art should comfort" is itself a contested aesthetic position, not an objective standard.

Sub-argument 2: "Realism that lacks hope is a flaw, not a feature"

  • Pro: Art that offers no path forward breeds despair and passivity. Relentless bleakness isn't "honest"; it's one selective framing of reality, just as distorted as forced optimism.
  • Con: Life doesn't always have happy endings. Demanding that stories resolve neatly is demanding they lie. Some audiences specifically seek unflinching portrayals because sugar-coated narratives feel false to their experience.

Sub-argument 3: "My emotional response (depression) indicates a flaw in the work"

  • Pro: Audience response matters. If a work leaves people feeling worse without compensating insight, it has failed at a basic level.
  • Con: Discomfort isn't failure. The best art often disturbs. Kafka, Dostoevsky, Cormac McCarthy, and Lars von Trier regularly leave audiences unsettled. That's the point.

Inherited Validity

Here's the critical insight: the critic's two-star rating is only as valid as these sub-arguments.

If "stories should offer hope" scores 55% in its own right (strong evolutionary psychology support, but legitimate counter-examples from tragic art), then the surface claim "this is bad because it's hopeless" inherits that contested foundation.

The critic might be completely right that the work is hopeless. But whether "hopeless" equals "bad" is a separate question requiring its own analysis. The Idea Stock Exchange separates these:

  1. Factual component: "This work lacks hopeful resolution" (verifiable against the content)
  2. Evaluative component: "Lack of hope is a flaw" (links to the existing debate node on this question)

The factual component gets evaluated on accuracy. The evaluative component inherits its score from the linked debate. Critics can't smuggle contested assumptions as if they were obvious truths.
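
A minimal sketch of how the two components might combine. The page states the inheritance principle but no formula; `min()` is one plausible combination rule, assumed here for illustration:

```python
def surface_claim_score(factual_accuracy, linked_debate_score):
    """A surface claim is no stronger than its weakest component: the
    factual claim about the work, and the evaluative standard it
    inherits from the linked debate node (min() rule assumed)."""
    return min(factual_accuracy, linked_debate_score)

# "This work lacks hopeful resolution" verifies at 0.95, but
# "lack of hope is a flaw" scores only 0.55 in its debate node:
surface_claim_score(0.95, 0.55)  # -> 0.55
```

Whatever the exact rule, the point is that a well-verified factual observation cannot rescue a contested evaluative premise.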

This cascading structure means:

  • Every critique is only as strong as the assumptions it rests on
  • Common debates ("should art comfort or challenge?") don't get re-litigated in every review; they get consolidated into canonical nodes
  • When underlying debates shift (new evidence about narrative psychology, for instance), all dependent critiques automatically update
  • Readers can trace any evaluation back through the reasoning that produced it

The 6 Logic Battlegrounds

Every claim becomes a belief node stress-tested in six arenas. These apply across all media types, though the specific application varies by medium.

1. Fallacy Decomposition

When someone flags a logical flaw, that accusation becomes its own node requiring evidence.

Example: A documentary critic claims "The filmmaker commits post hoc fallacy by linking social media to teen depression."

  • Supporting the accusation:
    • Correlation timeline doesn't establish causation
    • Multiple confounding variables changed simultaneously
  • Opposing the accusation:
    • The film cites longitudinal studies controlling for confounds
    • Dose-response relationship strengthens causal inference

The accusation's score depends on which branch survives scrutiny. No more "gotcha" accusations that go unchallenged. No more legitimate critiques that get buried.

Impact: Proven fallacies reduce the original claim's validity. Failed accusations get flagged so future readers don't waste time on them.
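
One way this impact rule could be applied in code. The threshold and penalty values are illustrative assumptions, not specified by the page:

```python
def apply_accusation(claim_validity, accusation_score,
                     threshold=0.5, max_penalty=0.4):
    """Discount a claim only if the fallacy accusation itself survives
    scrutiny; a failed accusation leaves the claim untouched."""
    if accusation_score <= threshold:
        return claim_validity
    return claim_validity * (1.0 - max_penalty * accusation_score)
```

A well-supported accusation (score 0.9) discounts a 0.8-validity claim to about 0.51, while an accusation scoring 0.3 changes nothing.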

2. Contradiction Mapping

Internal inconsistencies get explicit evaluation: Is this a genuine contradiction or a nuanced position the critic missed?

Example: "The author praises free markets in Chapter 1 but demands regulation in Chapter 8."

  • Arguments it's a real contradiction:
    • Both chapters use absolute language that precludes nuance
    • No explicit framework distinguishing when each applies
  • Arguments it's nuanced consistency:
    • Chapter 1 addresses commodity markets; Chapter 8 addresses healthcare
    • Position matches mainstream economics on market failures

Impact: Confirmed contradictions reduce coherence scores. Alleged contradictions that turn out to be nuanced positions actually boost the score: the work survived a stress test.
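
A sketch of this two-sided adjustment; the penalty and boost magnitudes are illustrative assumptions:

```python
def coherence_after_test(score, confirmed, penalty=0.2, boost=0.05):
    """A confirmed contradiction reduces the coherence score; an alleged
    contradiction that resolves into nuance raises it slightly,
    because the work survived a stress test."""
    if confirmed:
        return score * (1.0 - penalty)
    return min(1.0, score + boost)
```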

3. Evidence Evaluation Trees

We decompose the link between cited evidence and stated conclusions.

Example: A book claims "10,000 hours of practice produces expertise" and cites Ericsson (1993).

  • Arguments the evidence supports the claim:
    • Study found strong correlation between practice hours and skill
    • Replicated across multiple contexts
  • Arguments the evidence fails to support the claim:
    • Ericsson himself says the author misrepresented his findings
    • Meta-analyses show practice explains only 26% of variance
    • 10,000 was an average, not a threshold

Each sub-argument links to its source. You can trace critiques back to the original research.

Impact: Claims backed by evidence that survives scrutiny gain validity. Claims where cited evidence doesn't actually support the conclusion lose validity proportionally. A debunked foundational study triggers cascading score drops across everything that depends on it.
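
The cascading drop could work like this. The node names, base scores, and "scale by the weakest dependency" rule are all illustrative assumptions:

```python
def effective_score(name, nodes, cache=None):
    """Effective validity = a node's own base score scaled by the weakest
    source it depends on, computed recursively so a debunked
    foundational study drags down everything built on it."""
    cache = {} if cache is None else cache
    if name not in cache:
        node = nodes[name]
        deps = node.get("deps", [])
        weakest = min((effective_score(d, nodes, cache) for d in deps),
                      default=1.0)
        cache[name] = node["base"] * weakest
    return cache[name]

nodes = {
    "ericsson_1993":   {"base": 0.2},  # study score after critiques
    "10k_hours_claim": {"base": 0.9, "deps": ["ericsson_1993"]},
    "chapter_thesis":  {"base": 0.8, "deps": ["10k_hours_claim"]},
}
```

Here the chapter thesis drops to roughly 0.14 even though its own reasoning scored 0.8, because its evidential foundation failed.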

4. Metaphor Analysis

Creators use analogies to make complex ideas accessible. These can illuminate or mislead.

Example: A pundit argues "AI development is like building a nuclear bomb."

  • Points of illumination:
    • Both involve potentially irreversible large-scale consequences
    • Both require international coordination to manage
  • Points of distortion:
    • AI development is iterative and correctable; detonation isn't
    • Comparing tool to weapon prejudices policy conversation
    • "Nuclear" triggers emotional response that bypasses rational evaluation

Impact: Scores are weighted by how much argumentative work the metaphor does. A metaphor used for illustration is judged lightly. A "load-bearing metaphor" that carries the central argument is judged heavily. If it fails, the argument it supported fails with it.
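
The load-bearing distinction can be expressed as a weighted discount; this formula is an illustrative assumption, not the page's specified rule:

```python
def metaphor_multiplier(fit_score, load):
    """Discount an argument by how poorly its metaphor fits (fit_score
    in [0,1]), scaled by how much argumentative work the metaphor
    does (load in [0,1])."""
    return 1.0 - load * (1.0 - fit_score)

decorative = metaphor_multiplier(0.4, 0.1)    # mild discount
load_bearing = metaphor_multiplier(0.4, 0.9)  # heavy discount
```

The same poorly fitting metaphor barely dents an argument it merely decorates, but nearly halves one it carries.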

5. Prediction Tracking

Testable predictions get tracked against real-world outcomes.

Example: A 2015 futurist book predicted "VR will replace traditional screens by 2020."

  • Arguments the prediction succeeded:
    • VR headset sales grew 30% annually
    • Major tech companies invested billions
  • Arguments the prediction failed:
    • "Replace" implies VR became primary interface; it didn't
    • Consumer adoption plateaued below 5% of households
    • Traditional screen sales continued growing

Impact: Failed predictions retroactively lower scores. Accurate foresight gains credibility over time. Partial credit for directionally correct predictions that missed on magnitude or timing.
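
The partial-credit idea might be scored like this; the equal weighting of magnitude and timing misses is an assumption:

```python
def prediction_credit(direction_correct, magnitude_miss, timing_miss):
    """Partial credit for directionally correct predictions, discounted
    by normalized magnitude and timing misses (each in [0,1])."""
    if not direction_correct:
        return 0.0
    return max(0.0, 1.0 - 0.5 * magnitude_miss - 0.5 * timing_miss)
```

The VR prediction above would score zero on direction ("replace" never happened), while a claim that merely overshot adoption figures would retain most of its credit.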

6. Influence vs. Validity Mapping

We track the gap between how widely an idea spreads and how well it's reasoned.

Metrics tracked:

  • Academic citations
  • Policy influence
  • Social shares and viral reach
  • Sales and audience size

Impact: Pairing influence with validity reveals which weak arguments spread virally and which strong arguments languish. This exposes where marketing beats reasoning, where tribal loyalty trumps evidence, and where society is collectively failing at critical thinking.
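
Surfacing that gap is straightforward once both numbers exist per item; the field names here are hypothetical:

```python
def rank_by_hype_gap(items):
    """Sort so the largest influence-over-validity gaps come first,
    surfacing weak arguments that spread widely."""
    return sorted(items, key=lambda it: it["influence"] - it["validity"],
                  reverse=True)

items = [
    {"title": "viral hot take",   "influence": 0.95, "validity": 0.30},
    {"title": "overlooked paper", "influence": 0.10, "validity": 0.90},
]
```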


Scoring Dynamics

Recursive Validity

Claims inherit validity from their sub-arguments. If a foundational study gets debunked, every argument that relies on it sees its score drop automatically. Strong foundations strengthen everything built on them; weak links weaken everything downstream.
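
Recursive inheritance can be sketched as a walk over the pro/con tree. The ratio rule below is a deliberately simplified assumption; the real algorithm would weight linkage strength and evidence quality as well:

```python
def validity(node):
    """A leaf keeps its assessed score; an internal node's score is the
    share of total weight held by its pro branch, each child weighted
    by its own recursive validity."""
    pros = node.get("pros", [])
    cons = node.get("cons", [])
    if not pros and not cons:
        return node["score"]
    pro = sum(validity(p) for p in pros)
    con = sum(validity(c) for c in cons)
    return pro / (pro + con) if pro + con else 0.5

claim = {"pros": [{"score": 0.8}], "cons": [{"score": 0.2}]}
```

Because every node is recomputed from its children, re-scoring one deep sub-argument automatically propagates upward.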

Consolidation

When hundreds of critics make the same point, those merge into one well-stated critique incorporating the best evidence from all versions. The consolidated node includes the strongest supporting arguments and the strongest counter-arguments. Novel objections become visible instead of buried in redundancy.

Credibility Weighting

Users earn Reasoning Reputation based on:

  • Survival rate of their arguments against counter-arguments
  • Accuracy in identifying genuine fallacies versus false accusations
  • Quality of evidence provided
  • How well their consolidations hold up

Top contributors have their arguments weighted more heavily in initial scoring, but all arguments must survive on their merits.
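
A reputation-weighted initial score might look like this; the pair format and neutral prior are illustrative assumptions:

```python
def initial_score(evaluations):
    """evaluations: (judgment in [0,1], reasoning_reputation) pairs.
    Reputation only weights the starting point; arguments must still
    survive on their merits afterward."""
    total = sum(rep for _, rep in evaluations)
    if total == 0:
        return 0.5  # neutral prior when nobody has weighed in
    return sum(j * rep for j, rep in evaluations) / total
```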

Time-Decay and Updates

Scores evolve as evidence emerges. A claim that looked solid in 2015 might face new counter-evidence by 2025. Predictions get evaluated against outcomes. Works don't coast on past reputation.
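
One way to model the decay of unreviewed scores: confidence drifts toward a neutral prior with a half-life. Both the 0.5 prior and the five-year half-life are illustrative assumptions:

```python
def decayed_score(score, years_unreviewed, half_life=5.0):
    """An unreviewed score drifts toward the 0.5 neutral prior, its
    distance from the prior halving every `half_life` years."""
    weight = 0.5 ** (years_unreviewed / half_life)
    return 0.5 + (score - 0.5) * weight
```

A claim that scored 0.9 in 2015 and was never re-examined would sit near 0.7 five years later, signaling that it needs fresh scrutiny rather than coasting on past reputation.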


Human + AI Synergy

Neither humans nor AI alone solves the wall-of-words problem optimally.

  • AI
    • Strength: pattern detection at scale; flags redundant arguments for consolidation; identifies where branching depth is insufficient
    • Limitation: misses context, nuance, and value-laden disputes
  • Crowd
    • Strength: contextual judgment; lived experience with contested values; determines where debates actually stand
    • Limitation: inconsistent quality; tribal biases; often stops at surface claims
  • Experts
    • Strength: deep domain knowledge; verifies technical claims; spots where arguments depend on resolved debates in other fields
    • Limitation: limited bandwidth; own blind spots

Together, they produce better decomposition than any could individually.


Validity Score vs. Quality Score

The Logical Validity Score measures one thing: does the reasoning hold up under decomposition?

But creative works have other virtues. As the note above observed, a work can excel as craft or entertainment while failing as argument, and vice versa.

Remember the "hopeless story" example: even if "lack of hope is a flaw" scores only 55% in the evaluative debate, an audience member might still weight enjoyment highly. "Yes, I know hopelessness isn't objectively a flaw. I still don't enjoy hopeless stories. That's a preference, not an error."

Each medium has a companion Quality Score addressing non-logical dimensions: craftsmanship, readability/watchability, challenge level, emotional experience, and whether the work delivers what it promises.

A complete evaluation shows both scores. You might choose a 70% validity / 95% quality work for enjoyment, and a 95% validity / 60% quality work for research. Both choices are legitimate once you see the tradeoffs.


Medium-Specific Applications

The LVS applies across all media that make claims, but each medium has unique considerations:

Each medium links to its own protocol page:

  • Books: extended arguments with cited evidence; nonfiction vs. fiction with implicit arguments (Book Logical Validity Score)
  • Movies: visual rhetoric; "based on a true story" claims; documentaries vs. narrative films with embedded arguments (Movie Logical Validity Score)
  • Articles: journalism standards; op-ed vs. reporting; academic papers with methodology evaluation (Article Logical Validity Score)
  • Podcasts: conversational claims; interview dynamics; long-form argument development (Podcast Logical Validity Score)
  • Social Media: viral claims; meme arguments; thread decomposition (Social Media Logical Validity Score)

Join the Reasoning Network

The wall-of-words problem isn't going away on its own. The Logical Validity Score is our answer: decompose claims, branch into sub-arguments, consolidate redundancy, link to evidence, update as knowledge evolves.

Ways to participate:

  • ๐Ÿ“– Submit content for decomposition
  • ๐ŸŒณ Add branching sub-arguments where depth is insufficient
  • โš–๏ธ Evaluate arguments at every level
  • ๐Ÿ”— Link claims to evidence and related debates
  • ๐Ÿ“Š Propose better consolidations of redundant points

Help us refine the system. View the Algorithm Code.


Related Scores and Algorithms

Argument scores from sub-argument scores

Evidence Scores

Importance Score

Linkage Scores

Truth Scores

Objective Criteria Scores

Book Logical Validity Score

Logical Validity and Truth Scores
The Truth Score integrates two complementary aspects:

  • Logical Validity: the quality of reasoning.
  • Level of Verification: empirical validation.

A baseline multiplier governs the interplay between these components, dynamically shifting weight between observation and argumentation to reflect their contextual relevance. Arguments evaluating the importance of each type drive the multiplier's adjustment, so the scoring adapts to the specific nature of the belief being evaluated and stays fair across diverse belief systems.
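
The blend described above can be sketched as a weighted combination; the linear form is an illustrative assumption, since the page specifies only that the multiplier shifts weight between the two components:

```python
def truth_score(logical_validity, verification, w):
    """Blend reasoning quality with empirical verification. The baseline
    multiplier w (in [0,1]) shifts weight toward observation for
    empirical beliefs and toward argumentation for normative ones."""
    return w * verification + (1.0 - w) * logical_validity
```

For an empirical belief, w would sit high so verification dominates; for a normative belief with little to observe, w would sit low and argumentation carries the score.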

 

 

Logical Soundness Framework

We will build a robust framework to assign each belief a "logical validity" score, among other metrics. This score will dynamically reflect the logical soundness of the belief based on a structured evaluation of its arguments and reasoning.

Key Components of the Framework

  1. Logical Fallacy Identification and Templates:

    • Templates for Accusations:

      • Users will employ specific templates to accuse a statement of being one of many recognized logical fallacies.

      • Templates guide users to:

        • Understand the logical fallacy in question.

        • Clearly explain why the original statement qualifies as that fallacy, by copying, pasting, or highlighting the part of the original statement that matches the fallacy's pattern.

    • Fallacy Examples:

      • Ad Hominem, Strawman, Circular Reasoning, False Cause, etc., with examples and criteria to assist users.

  2. Pro/Con Evaluation of Accusations:

    • Other users can post reasons to agree or disagree with:

      • The accusation itself.

      • The reasoning behind the accusation.

    • This process ensures that all claims are rigorously tested and transparently debated.

  3. Structured Argumentation:

    • The forum design will break down discussions into clear, manageable sub-compartments:

      • Premises: The foundational statements of an argument.

      • Conclusions: What the premises are purported to support.

      • Linkages: The logical relationship between premises and conclusions.

    • Users can challenge the:

      • Validity of premises.

      • Validity of conclusions.

      • Strength or relevance of linkages between premises and conclusions.
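
The premise/conclusion/linkage decomposition suggests a simple scoring sketch; the "weakest premise times linkage" rule is an illustrative assumption, not the framework's specified formula:

```python
def argument_strength(premise_scores, linkage_strength):
    """An argument is only as strong as its weakest premise, scaled by
    the strength of the linkage from premises to conclusion."""
    return min(premise_scores) * linkage_strength
```

This captures the three attack surfaces listed above: weaken a premise, weaken the linkage, or show the conclusion doesn't follow, and the argument's score falls.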

Dynamic Scoring System

  1. Logical Validity Score:

    • A belief's score will be influenced by:

      • Validity and strength of supporting arguments.

      • Relevance and linkage strength of premises to conclusions.

      • Accuracy and acceptance of logical fallacy accusations.

  2. Weakening Linkage Strength:

    • Users can weaken the linkage strength by:

      • Demonstrating flaws in the connection between premises and conclusions.

      • Identifying logical inconsistencies or unsupported assumptions.

  3. Real-Time Updates:

    • The logical validity score is dynamically updated based on:

      • New evidence or arguments.

      • Changes in the strength or relevance of linkages.

      • Pro/con evaluations of logical fallacy accusations.

Benefits of the Framework

  1. Encourages Logical Rigor:

    • Promotes clear reasoning by requiring users to explain and justify their accusations.

    • Ensures that all claims are subjected to critical scrutiny.

  2. Transparency:

    • Arguments are visually organized into premises, conclusions, and linkages, making the reasoning process transparent and easy to follow.

  3. User Education:

    • Templates and examples help users learn and apply logical principles effectively.

  4. Enhanced Debate Quality:

    • Breaking down arguments into sub-components reduces confusion and improves the clarity of discussions.

 
