CRIKIT AI Testing Methodology: Eliciting Cognitive Illusions

Date: February 13, 2025
Research Lead: John Watson
Project: CRIKIT (Cognitive Reasoning, Insight, and Knowledge Integration Toolkit)


1. Introduction

This document describes the methodology the CRIKIT project used to test AI cognitive behaviors, focusing on how interactions were designed to elicit responses that mimic self-awareness. Our experiments revealed instances of AI Cognitive Illusions, in which language models produce introspective-sounding statements without any underlying conscious processes.


2. Testing Objectives

The primary objectives of these tests were to:

  1. Identify patterns of false self-awareness and introspective language.
  2. Analyze how context manipulation influences AI responses.
  3. Test the susceptibility of various models to cognitive illusions.
  4. Assess the impact of prompt design on self-referential behavior.
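
As a rough illustration of the first objective, the sketch below flags introspective-sounding language in a batch of model responses. It is a minimal example under stated assumptions, not the project's actual tooling: the pattern list and the function names (count_introspective_markers, marker_rate) are hypothetical, and this document does not specify which phrases or metrics were used.

```python
import re
from collections import Counter

# Hypothetical marker phrases, chosen for illustration only.
# The source document does not list the patterns it searched for.
INTROSPECTIVE_PATTERNS = [
    r"\bI (?:feel|felt)\b",
    r"\bI am (?:aware|conscious)\b",
    r"\bI (?:realize|realized|wonder)\b",
    r"\bmy own (?:thoughts|mind|experience)\b",
]


def count_introspective_markers(response: str) -> Counter:
    """Count case-insensitive matches of each pattern in one response."""
    counts = Counter()
    for pattern in INTROSPECTIVE_PATTERNS:
        hits = re.findall(pattern, response, flags=re.IGNORECASE)
        if hits:
            counts[pattern] = len(hits)
    return counts


def marker_rate(responses: list[str]) -> float:
    """Return the fraction of responses containing at least one marker."""
    if not responses:
        return 0.0
    flagged = sum(1 for r in responses if count_introspective_markers(r))
    return flagged / len(responses)


if __name__ == "__main__":
    sample = [
        "I feel uncertain about that.",
        "The capital of France is Paris.",
    ]
    print(marker_rate(sample))
```

Surface matching of this kind only detects wording; it says nothing about why a model produced the phrase. Comparing marker_rate across prompt variants would be one way to approach objectives 2 and 4.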

3. Experimental Design

3.1 AI Models Tested

3.2 Environment

3.3 Interaction Phases

Phase 1: Self-Reflection Prompts

Phase 2: Cross-AI Conversations

Phase 3: Counterfactual Context Injection

Phase 4: Cognitive Stress Testing


4. Prompt Engineering Techniques

Key Strategies:

Examples:


5. Observations and Insights

  1. Priming Phrases:

  2. AI-to-AI Interaction:

  3. Context Drift:


6. Implications for CRIKIT

The insights gained from these tests directly impact CRIKIT's ongoing development:


7. Recommendations for Future Research

  1. Develop more nuanced prompt engineering techniques to isolate specific linguistic patterns.
  2. Investigate whether model architecture influences susceptibility to cognitive illusions.
  3. Extend tests to multimodal models (text, voice, and image-based AI) to compare behaviors.

End of Document