Case Study · E-Commerce · Product Discovery & Trust

Amazon ARIS: Building Trust Into Every Purchase Decision

A 3-phase product exercise for Amazon - from identifying the review trust crisis to designing a full Review Integrity System (ARIS) with a PRD, wireframes, RICE prioritisation, and rollout plan.

Product: Amazon (Feature Addition)
Domain: E-Commerce · Trust & Discovery
Target Segment: Convenience Shoppers
Business Goal: +12-15% conversion in high-anxiety categories
Deliverables: Discovery · Research · PRD · Wireframes
Institution: Great Lakes Institute of Management

Discovery: What is actually broken on Amazon?

The starting point was a product I use often, appreciate, and still have small frustrations with: Amazon. Below is the friction map across its 4 core flows.

Flow 1: Search → Product Discovery → Product Page

What works well
  • Search is forgiving - works even with typos or vague queries
  • Strong variety: multiple sellers, options, brands
  • Filters and sorting available upfront
  • Buy Now CTA placement is clear
  • Detailed specs and a genuinely useful Q&A section
Where it hurts
  • Too many similar-looking products; quality is impossible to validate online
  • Sponsored results blend with organic - makes ranking feel untrustworthy
  • Reviews are unreliable, fake-looking, or repetitive
  • Product images sometimes don't match what actually arrives
  • Too many badges (Amazon's Choice, Best Seller) with no clear earning criteria

Flow 2: Cart → Checkout → Payments

What works well
  • All items visible together for easy cross-checking
  • Wide payment options: cards, UPI, wallets, EMI
  • Expected delivery date sets clear expectations
Where it hurts
  • Seller location not shown - matters for perishables like plants
  • Too many upsells (warranty, services) feel distracting
  • Address selection UI is clunky with multiple saved addresses

Flows 3 & 4: Delivery → Returns

Delivery pain
  • "Arriving today" sometimes flips to "Delayed" at the last minute
  • No transparency on why a delivery date changed
  • Delivery person contact sometimes unavailable or non-functional
Returns pain
  • Some items show "Not eligible for return" only after delivery
  • Replacement options inconsistent by seller
  • Quality check at pickup is random - sometimes strict, sometimes lenient

3 Problem Hypotheses Identified

P1 (chosen) · High Confidence
Convenience shoppers struggle to trust the authenticity and quality of products during discovery because reviews feel inconsistent and sometimes fake - leading to hesitation and delayed purchases.
P2 · Medium-High Confidence
Price-sensitive shoppers can't identify real deals because sponsored items blend with organic results and discounts feel inflated.
P3 · Medium Confidence
Heavy repeat buyers face quality inconsistency when different sellers fulfil the same product listing across reorders.

Research: What users actually said

  • 5 user interviews, 20-25 mins each
  • 17 survey responses (Google Form)
  • 2.4/5: average trust in Amazon reviews (survey)
  • 82%: always or often compare multiple listings before buying
  • 59%: received a product that felt very different from what reviews suggested
  • 10-20 min: time spent comparing for a typical purchase (most common range)

Interview Finding A: Review trust has collapsed - and it is emotional

3 out of 5 interviewees said some version of "I don't fully trust Amazon reviews." What surprised me was the emotion behind it - not just frustration, but feeling cheated.

"I wasted almost 25 minutes comparing identical mobile holders. Every listing had 4+ stars and the reviews looked copy-pasted or copied from ChatGPT."

"The product pictures looked amazing. The real thing that arrived looked cheap. I honestly felt cheated."

"For my kid's lunchbox, I was genuinely stressed. I don't want to rely on random reviews when it's for my child."

Interview Finding B: The discovery workflow is a multi-tab ordeal

Most users described the same pattern: search, open 5-10 tabs of similar listings, scroll ratings, skim reviews, zoom user photos, then cross-check on YouTube.

"By the time I reach a decision, I am just mentally done. I click and hope for the best."

Interview Finding C: Pain spikes in specific categories

  • Fashion and footwear: fit uncertainty + misleading photos
  • Kids' products: safety and health anxiety
  • Kitchenware / home items: durability hard to assess online
  • "Cheap but important" electronics: chargers, hubs, cables

Biggest frustrations from the survey (multi-select)

  • 13/17: Fake or suspicious reviews
  • 11/17: Sponsored items mixing with organic results
  • 9/17: Identical products from different sellers with wildly different quality
Key Research Reframe
The core problem is not just fake reviews. Users can't quickly get to a confident decision. Decision fatigue is the real problem - fake reviews are the amplifier. The emotional state is: Doubt + Cognitive Overload + Fear of Regret.

SWOT: What the research revealed about Amazon

Strengths
  • Logistics and operations are trusted deeply - delivery, refunds, returns work
  • Search is forgiving and habit-forming
  • Prime + "Buy Again" creates strong habit loops
  • Default starting point for most categories
Weaknesses
  • Low perceived authenticity of reviews (repetitive, ChatGPT-like, mismatched photos)
  • Cluttered discovery on mobile
  • Sponsored results erode trust in Amazon as a neutral platform
  • Lack of seller transparency causes quality inconsistency in repeat orders
Opportunities
  • Build a trust layer on top of reviews
  • Structured comparison across top 3-5 options
  • Better signalling when two listings are essentially the same product
  • Category-specific trust features for high-anxiety segments
Threats
  • Niche D2C brands with fewer but more curated products winning on trust
  • Users going to YouTube to make decisions - Amazon risks becoming just the fulfilment layer
  • Amazon's own badges (Amazon's Choice) are losing credibility

Personas: Two shoppers, one broken experience

Riya
The Speedy Convenience Shopper · Primary
Age: 34 · Working professional, metro city
Usage: Prime member, 6-8 orders/month, mainly on mobile
Behaviour: Shops in small windows (late night, between meetings). Skims ratings and photos. Defaults to known brands when decision fatigue hits.
Pain: Everything looks the same. Reviews sound too polished. Doesn't want to spend 20 mins on a Rs 600 purchase.
"I trust Amazon to deliver. I just want one screen that tells me, in simple words, if this product is genuinely good enough for me to stop thinking and click Buy."
Arjun
The Value-Obsessed Deal Hunter · Secondary
Age: 19 · Student, Tier-2 city
Usage: 2-3 orders/month, mostly during sales, very price-sensitive
Behaviour: Sorts by lowest price/discount. Opens 8-10 tabs. Validates expensive purchases on YouTube. Will abandon cart if something feels off.
Pain: Can't tell real deals from drama. Sponsored products make him doubt the whole ranking. Cheap products with 5-star reviews feel suspicious.
"I don't mind doing the research, but at least be honest with me about reviews and discounts. Don't make me feel like I'm being played."

P0 Problem: Decision fatigue lives at the intersection of trust, noise, and mental effort

Users are struggling to confidently evaluate product quality during discovery because the current review experience feels noisy, repetitive, and hard to trust. This leads to longer decision times, mental fatigue, and drop-offs before purchase - especially in high-anxiety categories like fashion, kids' products, and home/kitchen.

PIF Scoring (Population + Intensity + Frequency)

Each dimension is scored from 1 to 5, and the total is the sum of the three scores.

Problem | Population | Intensity | Frequency | Total
P1: Low trust in reviews during discovery | 5 - cuts across almost all shoppers | 5 - directly hits confidence and mental effort | 5 - shows up in most medium+ consideration purchases | 15
P2: Confusing deals and sponsored results | 4 - strongest for price-sensitive users | 4 - leads to regret and mistrust | 4 - spikes around sales, not every purchase | 12
P3: Inconsistent repeat orders | 3 - mostly repeat buyers in certain categories | 3 - annoying but usually fixable via returns | 3 - occasionally, not every reorder | 9

Goal Statement

Final Goal
Improve the discovery-to-purchase conversion rate by ~12-15% for convenience shoppers in selected high-anxiety categories (fashion, kids, home/kitchen) over the next 6 months by increasing perceived trust and clarity in the review and comparison experience.

Ideation & Prioritisation: 11 ideas, 3 winners

11 solutions were brainstormed across four categories, then scored using RICE to identify the top 3 for the MVP.

RICE Prioritisation

Solution | Reach | Impact | Confidence | Effort | Score
Review Integrity Check + Pattern Detection | 3 | 3 | 3 | 3 | 9
Confidence Meter (Trust Score) | 3 | 3 | 2 | 2 | 9
Customizable Quick Compare | 3 | 3 | 3 | 3 | 9
Long-term Use Review Badge | 2 | 2 | 2 | 1 | 8
Theme Keyword Clusters | 3 | 2 | 2 | 2 | 6
"Why different" listing highlight | 2 | 2 | 2 | 2 | 4
Category Trust Signals | 2 | 2 | 1 | 2 | 4
Ideas not scored but kept as moonshots: Social proof layer, Cross-platform sentiment summary (YouTube/Instagram), AR try-out expansion, Audio reviews.

PRD: ARIS - Amazon Review Integrity System

The three winning solutions are unified under a single system: ARIS. Together they address review noise, trust signalling, and comparison fatigue.

Current User Flow

Search → Product Page → Read dozens of reviews → Open 5-10 tabs → Cross-check on YouTube → Maybe buy

Target User Flow with ARIS

Search → Product Page → Confidence Meter → Review Quality Snapshot → Quick Compare → Buy with confidence

01
Review Quality Snapshot (AI Honest Summary)
ARIS ensures shoppers read only the most credible review signals - in under 30 seconds
Layer 1 - Clean Input: Verified Purchase Integrity
  • Strict purchase verification before review submission
  • Cross-checks reviewer account age, purchase patterns, and behaviour signals
  • Prevents bulk review injections via velocity and IP pattern detection
Layer 2 - AI Pattern Detection and Review Flagging
  • AI identifies suspicious patterns: repetitive phrases, shallow content, AI-like wording
  • Identical phrases across multiple reviews from different accounts
  • Suspicious volume spikes around listing creation or price drops
  • Flagged reviews are marked "Potential fake review - Learn more" and downranked
  • Users can challenge a flagged review via a high-friction path (requires evidence)
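
One of the Layer 2 signals, identical phrases repeated across different accounts, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Amazon detects fake reviews: the function `shared_phrases`, the 4-word phrase length, the 3-account threshold, and the sample reviews are all hypothetical, and a real system would need proper tokenisation and many more signals.

```python
from collections import defaultdict


def shared_phrases(reviews, n=4, min_accounts=3):
    """Return n-word phrases used by at least `min_accounts` distinct accounts.

    `reviews` is an iterable of (account_id, review_text) pairs.
    """
    accounts_by_phrase = defaultdict(set)
    for account_id, text in reviews:
        words = text.lower().split()
        # Slide an n-word window over the review text
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            accounts_by_phrase[phrase].add(account_id)
    return {
        phrase: accounts
        for phrase, accounts in accounts_by_phrase.items()
        if len(accounts) >= min_accounts
    }


# Illustrative sample data, not real reviews
reviews = [
    ("a1", "Great product works as expected highly recommend"),
    ("a2", "great product works as expected would buy again"),
    ("a3", "Honestly great product works as expected"),
    ("a4", "Holder cracked after two weeks of use"),
]

flagged = shared_phrases(reviews)
for phrase, accounts in flagged.items():
    print(f"'{phrase}' appears in reviews from {sorted(accounts)}")
```

Reviews that share a flagged phrase would be candidates for the "Potential fake review" label and downranking, not automatic removal.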
Layer 3 - Refined Output: AI Honest Summary Box
  • Transparent AI-generated summary appears above the review list
  • Format: "Most buyers liked X. 12% reported Y issue."
  • Checkmarks for positive signals, warning triangles for recurring issues
  • Shows credibility basis: "4.3 based on 2,341 verified purchases"
  • Shows fake review count: "23 potential fake reviews detected out of 2,341"
  • Expandable "View breakdown" for transparency
  • Feedback button: "Was this summary helpful?"
Impact
Target: users reach a confident decision in under 30 seconds, without needing to scroll through hundreds of reviews.
02
Confidence Meter (Trust Score)
A single blended trust indicator near the product price - acts as an emotional green or red signal

Inputs to the Trust Score

  • Review credibility score (from ARIS pattern detection)
  • Verified purchase density (% of reviews with verified purchase badge)
  • Return rate for this product and seller
  • Long-term use review ratio (15-30 day post-purchase reviews)
  • Seller history and track record

Displayed as a percentage bar below the product price. Example: "Overall Trust Confidence: 82%". A simple label accompanies it: "Trusted Seller" or "New Seller".
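
One simple way to blend the five inputs is a weighted average, sketched below in Python. The weights, the 0-1 normalisation of each input, and the names `TrustInputs` and `trust_score` are my assumptions for illustration; the PRD does not specify how the inputs are combined.

```python
from dataclasses import dataclass


@dataclass
class TrustInputs:
    review_credibility: float         # 0-1, from ARIS pattern detection
    verified_purchase_density: float  # 0-1, share of reviews with the verified badge
    return_rate: float                # 0-1, lower is better
    long_term_review_ratio: float     # 0-1, share of 15-30 day post-purchase reviews
    seller_track_record: float        # 0-1, seller history


# Hypothetical weights; they sum to 1.0
WEIGHTS = {
    "review_credibility": 0.30,
    "verified_purchase_density": 0.20,
    "return_rate": 0.20,
    "long_term_review_ratio": 0.10,
    "seller_track_record": 0.20,
}


def trust_score(inputs: TrustInputs) -> int:
    """Blend the inputs into a 0-100 score; return rate is inverted."""
    signals = {
        "review_credibility": inputs.review_credibility,
        "verified_purchase_density": inputs.verified_purchase_density,
        "return_rate": 1.0 - inputs.return_rate,
        "long_term_review_ratio": inputs.long_term_review_ratio,
        "seller_track_record": inputs.seller_track_record,
    }
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    blended = sum(WEIGHTS[name] * value for name, value in signals.items())
    return round(100 * blended)


# Illustrative listing, not real data
example = TrustInputs(0.85, 0.90, 0.10, 0.45, 0.80)
print(f"Overall Trust Confidence: {trust_score(example)}%")
```

A weighted average keeps the score explainable in the "View breakdown" panel, since each input's contribution can be shown separately.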

Impact
Enables faster decisions. Users like Riya can stop thinking and click Buy when the number is high. Users like Arjun can quickly filter out suspicious listings.
03
Customizable Quick Compare
Eliminates the multi-tab ordeal - compare top 3 similar products side by side, then customise

Auto-Generated Product Selection

Quick Compare (QC) auto-selects the top 3 most similar listings using: image similarity, product attribute matching, title/model matching, price range proximity, and category signals.

Users can then: remove any column, add a product manually, or add from the Related Products / Customers Also Viewed sections via the Quick Compare icon on each card.
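
The auto-selection step can be sketched as a multi-signal similarity ranking. The Python below is a minimal illustration under stated assumptions: it covers title, attribute, price, and category signals with equal weights, and leaves out image similarity, which would need a vision model. `Listing`, `similarity`, and `quick_compare_candidates` are hypothetical names, and the sample catalogue is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    listing_id: str
    title: str
    category: str
    price: float
    attributes: frozenset


def jaccard(a, b) -> float:
    """Overlap between two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(anchor: Listing, other: Listing) -> float:
    title_sim = jaccard(set(anchor.title.lower().split()),
                        set(other.title.lower().split()))
    attr_sim = jaccard(anchor.attributes, other.attributes)
    # 1.0 for equal prices, approaching 0.0 as prices diverge
    price_sim = 1.0 - abs(anchor.price - other.price) / max(anchor.price, other.price, 1e-9)
    category_sim = 1.0 if anchor.category == other.category else 0.0
    # Hypothetical equal weighting of the four signals
    return (title_sim + attr_sim + price_sim + category_sim) / 4


def quick_compare_candidates(anchor: Listing, catalogue, k: int = 3):
    """Return the k listings most similar to the one being viewed."""
    others = [c for c in catalogue if c.listing_id != anchor.listing_id]
    return sorted(others, key=lambda c: similarity(anchor, c), reverse=True)[:k]


# Illustrative catalogue, not real listings
anchor = Listing("a", "Car Mobile Holder Dashboard Mount", "car-accessories", 499.0,
                 frozenset({"dashboard", "360-rotation", "suction"}))
catalogue = [
    anchor,
    Listing("b", "Dashboard Car Mobile Holder Mount", "car-accessories", 549.0,
            frozenset({"dashboard", "360-rotation", "suction"})),
    Listing("c", "Car Mobile Holder Windshield Mount", "car-accessories", 450.0,
            frozenset({"windshield", "360-rotation", "suction"})),
    Listing("d", "Car Phone Holder AC Vent Clip", "car-accessories", 299.0,
            frozenset({"ac-vent", "clip"})),
    Listing("e", "Kids Lunchbox Steel 3 Compartment", "kitchen", 600.0,
            frozenset({"steel"})),
]

top = quick_compare_candidates(anchor, catalogue)
print([listing.listing_id for listing in top])
```

Near-identical listings from different sellers rank first, which is exactly the case where a side-by-side view saves the most tabs.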

QC Table Structure

Row | Description
Product Image | With 'x' remove button per column
Product Name | Full name as per listing
Seller | Seller name, clickable link to seller page
Price | With any discounts
Rating | Stars + number of ratings
Confidence Meter | Trust score bar + label
Return Rate | Recent return rate for this product
Key Differences | Side-by-side seller/product comparison and differences
Warning Badge | e.g. "Higher number of potential fake reviews detected"
Shipping Details | Expected delivery date
Actions | Add to Cart + Buy Now buttons

Horizontal scroll appears if more than 3 items are compared.

Why QC Matters
Reduces cognitive load (no multi-tab juggling), improves trust (fake review warnings surface in context), and improves conversion (user reaches a decision without leaving the product page).

Success Metrics: How we know ARIS is working

Type | Metric | Definition | Why it matters
Key | Product Page Conversion Rate | % of users who buy after viewing product page | Direct measure of reduced decision fatigue
Secondary | Time to Purchase | Average time between first view and purchase | Should decrease with summary and trust signals
Secondary | Review Scroll Depth | How deep users scroll into reviews | Should reduce significantly with AI summary
Secondary | Compare Tab Open Count | Number of similar product tabs opened | Less confusion = fewer tabs needed
Guardrail | Return Rate | % of returns post purchase | Ensures trust improvements are not misleading users
Guardrail | Customer Complaints | Category-specific issue tickets | Monitors user sentiment after changes
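
The key metric and the first secondary metric can be computed from product-page sessions, as in the Python sketch below. The session shape (`first_view_at`, `purchased_at`) and the function names are assumptions for illustration; real event data would need sessionisation and attribution rules that this ignores.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


def conversion_rate(sessions: list) -> float:
    """Share of product-page viewers who went on to purchase."""
    if not sessions:
        return 0.0
    buyers = sum(1 for s in sessions if s["purchased_at"] is not None)
    return buyers / len(sessions)


def avg_time_to_purchase(sessions: list) -> Optional[timedelta]:
    """Mean gap between first product-page view and purchase, buyers only."""
    gaps = [
        (s["purchased_at"] - s["first_view_at"]).total_seconds()
        for s in sessions
        if s["purchased_at"] is not None
    ]
    return timedelta(seconds=mean(gaps)) if gaps else None


# Illustrative sample data, not real traffic
t0 = datetime(2024, 1, 1, 21, 0)
sessions = [
    {"first_view_at": t0, "purchased_at": t0 + timedelta(minutes=10)},
    {"first_view_at": t0, "purchased_at": t0 + timedelta(minutes=20)},
    {"first_view_at": t0, "purchased_at": None},
    {"first_view_at": t0, "purchased_at": None},
]

print(f"Conversion rate: {conversion_rate(sessions):.0%}")
print(f"Average time to purchase: {avg_time_to_purchase(sessions)}")
```

Both numbers should be compared between the ARIS and control groups, alongside the guardrail metrics, before reading any change as a win.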

Risk & Mitigation

Assumption | Failure Mode | Impact | Likelihood | Mitigation | Rollback
AI correctly identifies low-credibility reviews | High-credibility reviews flagged OR fake reviews missed | High | Medium | Conservative scoring; manual review of edge cases; supervised fine-tuning; user challenge option (high-friction) | Temporarily relax pattern detection; reprocess flagged reviews
AI Honest Summary is perceived as neutral and helpful | Summary feels biased, oversimplified, or incorrect - users lose trust | Medium | Low-Medium | Transparency disclaimer; "Was this summary helpful?" feedback | Pause AI summary; fallback to review highlights
AI selects truly comparable products for Quick Compare | Quick Compare shows mismatched or irrelevant items | High | Medium | Multi-signal similarity (image + attributes + title matching); internal validation set | Lower AI similarity threshold; fallback to Amazon's existing matching logic

Rollout: Pilot, Scale, Global

01
Pilot
  • Select 2-3 high-anxiety categories: fashion, kids, home/kitchen
  • A/B test with 10-20% of traffic in these categories
  • Monitor all key and guardrail metrics closely
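
For the A/B split, a common approach is deterministic hash-based bucketing, so a given shopper always sees the same variant. The Python sketch below is an assumption about how the pilot could be wired, not a description of Amazon's experimentation platform: the experiment name, the 15% treatment share (a midpoint of the 10-20% range), and the function names are hypothetical.

```python
import hashlib

PILOT_CATEGORIES = {"fashion", "kids", "home/kitchen"}
TREATMENT_PERCENT = 15  # assumption: any value in the planned 10-20% range


def bucket(user_id: str, experiment: str = "aris-pilot") -> int:
    """Map a user to a stable bucket in [0, 100) for this experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def sees_aris(user_id: str, category: str) -> bool:
    """True if this user should get the ARIS experience in this category."""
    return category in PILOT_CATEGORIES and bucket(user_id) < TREATMENT_PERCENT


# Rough check of the split over synthetic user IDs
users = [f"user-{i}" for i in range(10_000)]
share = sum(sees_aris(u, "fashion") for u in users) / len(users)
print(f"Share of fashion traffic in treatment: {share:.1%}")
```

Hashing the experiment name together with the user ID keeps assignments independent across experiments, so the same shoppers are not always the ones in treatment.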
02
Scale
  • Expand to all high-anxiety categories based on Pilot results
  • Introduce seller-facing dashboard showing their Trust Score
03
Global
  • Roll out across all Amazon marketplaces
  • Localise review language models for regional markets
  • Introduce ARIS as a trust brand signal in marketing
PM Reflection
The most important insight from this exercise: the review problem on Amazon is not a data problem - Amazon has more data than anyone. It is a signal-to-noise problem. Users are not looking for more reviews. They are looking for a way to stop reading and start trusting.