Language Methods & Models Individual Project · 2025 UCD MSc HCI

Prompting Strategies
for Sentiment Analysis

Controlled 6-experiment study — how strategy, role and temperature together shape LLM sentiment accuracy across technical, journalistic and informal text.

6 Experiments · 3 Doc Genres · 4 Strategies Tested · 3 Variables Controlled

Experiment Design

The 6-Experiment Matrix

Each document is tested with a baseline strategy and a more structured one, isolating what changes as prompting complexity increases.

| Exp | Document | Strategy | Temp | Role | Result |
|-----|----------|----------|------|------|--------|
| E1 | NASA Blog | Direct | 0.4 | None | Surface positive |
| E2 | NASA Blog | Role-Based | 0.1 | Tech Analyst | Scientific, precise (BEST) |
| E3 | BBC Article | Direct | 0.2 | None | Over-confident positive |
| E4 | BBC Article | Combined CoT + Role | 0.4 | Tech Analyst | Framing detected (BEST) |
| E5 | Reddit | Chain-of-Thought | 0.4 | None | Sarcasm caught (BEST) |
| E6 | Reddit | Role-Based | 0.7 | Creative Asst. | Expressive, hallucination risk |

Performance Comparison

Strategy Effectiveness by Content Type

Relative effectiveness (conceptual — based on output quality)

[Chart: relative effectiveness (High / Mid / Low) of Role-Based, Chain-of-Thought and Direct strategies across Technical, Journalistic and Social Media content]

Key insight: Role-Based excels on technical texts; Chain-of-Thought wins on social media; Combined (CoT + Role) is the best all-rounder for complex, mixed-sentiment content.
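This insight can be encoded as a simple strategy picker. The mapping follows the E1-E6 results above; the function name and content-type labels are our own:

```python
def pick_strategy(content_type: str) -> str:
    """Map a content type to the best-performing strategy from E1-E6."""
    return {
        "technical": "role-based",        # E2: scientific, precise output
        "journalistic": "cot+role",       # E4: detects framing in mixed-sentiment news
        "social": "chain-of-thought",     # E5: catches sarcasm in informal posts
    }.get(content_type, "cot+role")       # Combined is the safest all-rounder
```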

Variable Analysis

Temperature Effect

Temperature vs. Output Characteristic

[Chart: temperature scale from 0.1 to 1.0, running from "More Precise" to "More Expressive", with Technical, Journalistic and Social Media tasks positioned along it]
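The temperature guidance can be summarised in a small helper. The values follow the E1-E6 settings; the function itself is a hypothetical sketch:

```python
def recommend_temperature(content_type: str) -> float:
    """Suggest a temperature consistent with the E1-E6 outcomes."""
    return {
        "technical": 0.1,     # E2: factual, precise analysis
        "journalistic": 0.4,  # E4: some interpretive latitude for framing
        "social": 0.4,        # E5: expressive enough, unlike E6's 0.7 hallucination risk
    }.get(content_type, 0.4)
```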

Failure Analysis

Failure Modes Identified

High Risk

High-Temp Hallucination

At temperature 0.7, the Creative Assistant role invented details not present in the source text.

↳ Lower temp for factual tasks

Medium Risk

Over-Confident Labels

Direct prompting confidently labelled ambiguous journalistic text as "positive".

↳ Use CoT + Role for nuanced texts

Low Risk

Strategy-Content Mismatch

Direct prompting on social media missed sarcasm entirely — wrong but confident.

↳ Match strategy to content type