Trend Analysis · Education

AI Tutors in Engineering Education: Domain Expertise vs. General Intelligence

Engineering education demands precision that general-purpose LLMs cannot reliably deliver. A wave of domain-specific AI tutors—from geotechnical engineering to biomechanics—reveals both the promise and the peril of teaching students disciplines where wrong answers can collapse bridges.

By Sean K.S. Shin
This blog summarizes research trends based on published paper abstracts. Specific numbers or findings may contain inaccuracies. For scholarly rigor, always consult the original papers cited in each post.

In most academic disciplines, a tutor who is occasionally wrong is merely unhelpful. In engineering, a tutor who is occasionally wrong is potentially dangerous. When a first-year student learning structural analysis receives confidently incorrect guidance on load-bearing calculations, the pedagogical harm extends beyond a failed exam—it plants misconceptions that, uncorrected, could eventually manifest in professional practice where steel beams, patient prosthetics, and electrical systems leave no room for hallucination.

This fundamental constraint—that engineering education operates in domains where precision is not a desirable feature but an existential requirement—shapes the entire landscape of AI tutoring in engineering. And it explains why notable advances in this space have come not from deploying general-purpose LLMs like ChatGPT but from building domain-specific AI tutoring systems that trade breadth for reliability.

The Evidence Base: What Works and What Doesn't

Frankford et al. (2024) provide a rigorous empirical evaluation. Their exploratory case study embedded LLM-based tutoring into a software engineering course and used surveys and assessments to evaluate the integration, identifying both the advantages and the challenges of LLM-based tutoring in programming education.

The findings reveal a mixed picture. On the positive side, the AI tutor provided timely feedback and scalable support. But the study also identified significant challenges:

  • Conceptual transfer limitations: Students could reproduce solutions the AI had demonstrated but showed limited transfer of underlying reasoning to novel problems.
  • Generic response quality: AI-generated feedback often lacked the specificity that domain-expert instructors provide, particularly for advanced design decisions.
  • Dependency concerns: Students reported concerns about potential learning progress inhibition—a worry that reliance on AI feedback might reduce independent problem-solving capacity.

This pattern—strong procedural gains, weak conceptual transfer, and emerging dependency—recurs across virtually every domain-specific deployment.

Domain-Specific Architectures: The EngiBot Approach

Rodrigues, Pinto, and Gonçalves (2025) present EngiBot, a purpose-built AI tutoring system for engineering education that addresses the limitations of general-purpose LLMs through two core subsystems:

  • Document Processing Pipeline: Rather than relying on open web retrieval, EngiBot extracts structured knowledge from technical PDF materials such as lecture notes and problem sets, constructing a course-specific knowledge base enriched with metadata and semantic structure. This domain-constrained approach aims to ensure precise and relevant response generation.
  • Natural Language Understanding Module: A hybrid approach combining rule-based intent classification and Large Language Models interprets student queries, enabling context-aware responses tailored to engineering domains.

Quantitative evaluation demonstrates effective performance in both knowledge extraction and question-answer retrieval, confirming the system's potential as a support tool in engineering education. The domain-constrained architecture represents a deliberate trade-off: reduced breadth for increased reliability in safety-critical domains.
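The hybrid NLU design described above — rules first, LLM as fallback — can be sketched in a few lines. This is an illustrative Python sketch only, not EngiBot's actual implementation: the rule patterns, intent names, and `llm_fallback` hook are all hypothetical.

```python
import re

# Hypothetical rule patterns mapping regexes to intents; a real system
# would load these from a curated, course-specific configuration.
RULES = [
    (re.compile(r"\bdefine|what is\b", re.I), "definition_lookup"),
    (re.compile(r"\bcalculate|compute|solve\b", re.I), "calculation_help"),
    (re.compile(r"\bwhy|explain\b", re.I), "concept_explanation"),
]

def classify_intent(query: str, llm_fallback=None) -> str:
    """Rule-based first pass; defer to an LLM only when no rule matches."""
    for pattern, intent in RULES:
        if pattern.search(query):
            return intent
    # Fall back to the (slower, less predictable) LLM classifier.
    return llm_fallback(query) if llm_fallback else "unknown"
```

The appeal of this ordering is predictability: queries that match a curated rule never reach the LLM, so the most common intents behave deterministically.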

The Geotechnical Challenge: When Soil Is Not a Textbook Problem

Tophel, Chen, and Hettiyadura (2025) take the domain-specificity argument further by testing LLM tutors in geotechnical engineering—a discipline where correct answers depend on site-specific conditions (soil type, water table, seismic zone) that no training corpus fully captures. Their comparative study evaluates multiple LLM APIs on undergraduate geotechnical problems and reveals a consistent hierarchy:

  • Factual recall (definitions, classification systems): LLMs performed reasonably well, suggesting that declarative knowledge is well-represented in training corpora.
  • Calculation-based problems (bearing capacity, settlement analysis): General-purpose LLMs showed notably lower accuracy, while fine-tuned domain models performed better—though still imperfectly.
  • Design judgment (choosing foundation type given ambiguous site data): All models performed poorly, with general-purpose LLMs often generating responses that were internally consistent but based on incorrect assumptions about soil behavior.

The implication is worth noting: the tasks that matter most in engineering education—exercising judgment under uncertainty—are the tasks where AI tutors remain least reliable.
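The three-tier hierarchy can be made concrete with a small grading harness that aggregates per-category accuracy. The helper below and its sample data are illustrative assumptions, not the study's actual protocol or numbers.

```python
from collections import defaultdict

def accuracy_by_category(graded):
    """Aggregate per-category accuracy from (category, is_correct) pairs."""
    totals, correct = defaultdict(int), defaultdict(int)
    for category, ok in graded:
        totals[category] += 1
        correct[category] += ok  # bool counts as 0 or 1
    return {c: correct[c] / totals[c] for c in totals}

# Invented grades for one model, chosen only to illustrate the reported
# hierarchy: recall > calculation > design judgment.
results = [
    ("factual_recall", True), ("factual_recall", True), ("factual_recall", False),
    ("calculation", True), ("calculation", False), ("calculation", False),
    ("design_judgment", False), ("design_judgment", False), ("design_judgment", False),
]
```

Grading free-text answers into these booleans is the hard part in practice; comparative studies typically use rubrics and human raters for that step.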

JULIUS: Teaching Resilience, Not Just Answers

Martinez, Chong, and Maya (2025) introduce an alternative philosophy with JULIUS, an AI tutor designed not to make students better programmers but to make them more resilient programmers. JULIUS operates on the premise that the primary failure mode of engineering students is not lack of knowledge but lack of emotional regulation when confronting difficult problems.

When a student expresses frustration ("I've been stuck on this for hours"), JULIUS does not immediately provide a hint. Instead, it engages in metacognitive coaching: "What specifically is confusing? Can you identify where your understanding breaks down?" Only after the student articulates their confusion does JULIUS offer targeted assistance.
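The gating behavior just described might look roughly like this in code. The frustration cues, prompts, and the single boolean gate are hypothetical illustrations of the pattern, not JULIUS's actual logic.

```python
# Hypothetical cues that a message expresses frustration rather than a question.
FRUSTRATION_CUES = ("stuck", "hours", "give up", "frustrated")

def respond(message: str, has_articulated_confusion: bool) -> str:
    """Gate hints behind articulation, in the spirit of metacognitive coaching."""
    if not has_articulated_confusion:
        if any(cue in message.lower() for cue in FRUSTRATION_CUES):
            # Acknowledge emotion first, then prompt for a specific diagnosis.
            return ("That sounds frustrating. What specifically is confusing? "
                    "Can you identify where your understanding breaks down?")
        return "Can you describe what you expect the code to do, step by step?"
    # Only after the student names the breakdown does a hint arrive.
    return "Here is a targeted hint for the step you identified: ..."
```

The essential design choice is that the hint branch is unreachable until the student has done the metacognitive work, which is what distinguishes this from an answer engine.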

The design is grounded in connectivism and Challenge-Based Learning (CBL), promoting autonomy, reducing anxiety, and supporting real-world problem-solving through structured group dynamics. By withholding immediate solutions, JULIUS preserves the student's sense of autonomy; by providing targeted (not complete) assistance, it builds competence; by engaging in empathetic dialogue, it supports emotional resilience.

Results from a mixed-methods study in a Fundamentals of Programming course show that JULIUS users demonstrated significant improvements in conceptual understanding, logical reasoning, motivation, and collaboration based on pre- and post-test comparisons, with qualitative analysis confirming enhanced emotional well-being and metacognition.

Claims and Evidence

  • Claim: AI tutoring improves engineering students' procedural skills. Evidence: Frankford et al. (2024) observed timely feedback and scalability advantages. Verdict: ✅ Supported
  • Claim: AI tutoring improves engineering students' conceptual understanding. Evidence: Frankford et al. (2024) noted generic responses and transfer limitations. Verdict: ⚠️ Uncertain
  • Claim: Domain-specific AI tutors outperform general-purpose LLMs. Evidence: Tophel et al. (2025) found that fine-tuned models outperform general-purpose LLMs on domain calculations. Verdict: ✅ Supported
  • Claim: Domain-constrained knowledge bases improve tutoring reliability. Evidence: EngiBot (Rodrigues et al., 2025) demonstrated effective knowledge extraction and QA retrieval. Verdict: ⚠️ Uncertain (single system, no comparative study)
  • Claim: AI tutors can teach engineering judgment. Evidence: All studies report poor performance on design judgment tasks requiring contextual reasoning. Verdict: ❌ Refuted

Open Questions

  • Should AI tutors in engineering be regulated like engineering software? If students use AI-generated solutions in professional practice, does the AI tutor bear a form of professional liability? Current legal frameworks have no answer.
  • Can we teach judgment through counterfactual simulation? Rather than answering "What foundation should I use?", could an AI tutor simulate the consequences of each choice—"If you choose a shallow foundation on this soil, here is what happens to settlement over 50 years"?
  • How do we measure the hidden curriculum? Engineering education transmits not just technical knowledge but professional identity, ethical reasoning, and risk awareness. Can AI tutors contribute to these outcomes, or do they inherently reduce engineering education to a technical skill?
  • What is the right level of domain specificity? EngiBot's curated knowledge base trades generality for reliability. At what point does domain constraining become domain limiting—preventing students from making the cross-disciplinary connections that drive engineering innovation?
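The counterfactual-simulation question above can be made concrete with the classical one-dimensional consolidation settlement formula, S = Cc/(1+e0) · H · log10((σ'0+Δσ)/σ'0). The soil parameters and the shallow-versus-deep stress contrast below are invented purely for illustration.

```python
import math

def consolidation_settlement(Cc, e0, H, sigma0, delta_sigma):
    """Primary consolidation settlement of a normally consolidated clay layer:
    S = Cc / (1 + e0) * H * log10((sigma0 + delta_sigma) / sigma0),
    with Cc the compression index, e0 the initial void ratio, H the layer
    thickness (m), sigma0 the initial effective stress and delta_sigma the
    added stress (same units)."""
    return Cc / (1 + e0) * H * math.log10((sigma0 + delta_sigma) / sigma0)

# Hypothetical site: a shallow footing loads the clay layer far more than
# a deep foundation that transfers load below it.
shallow = consolidation_settlement(Cc=0.3, e0=0.9, H=4.0, sigma0=100.0, delta_sigma=80.0)
deep = consolidation_settlement(Cc=0.3, e0=0.9, H=4.0, sigma0=100.0, delta_sigma=20.0)
```

A tutor built around such simulations would show the student the settlement consequence of each foundation choice rather than naming the "right" one.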
Implications

The message from this literature is clear but uncomfortable: AI tutors in engineering education work best for the tasks that matter least (procedural skill) and work worst for the tasks that matter most (design judgment). This does not mean they are useless—procedural fluency is a genuine bottleneck in engineering education, and relieving it frees instructor time for the judgment-intensive work that AI cannot yet support.

But it does mean that the narrative of AI tutoring as a replacement for human engineering instructors is premature by at least a decade. The most promising path is complementary: AI handles drill, practice, and procedural scaffolding while human instructors focus on design thinking, ethical reasoning, and the cultivation of professional judgment that no training corpus can encode.

References (5)

[1] Frankford, E., Sauerwein, C., Bassner, P., Krusche, S., & Breu, R. (2024). AI-Tutoring in Software Engineering Education: Experiences with Large Language Models in Programming Assessments. Proc. IEEE/ACM ICSE-SEET 2024.
[2] Rodrigues, B., Pinto, R., & Gonçalves, G. (2025). EngiBot: An AI-Based Tutoring System for Personalized Learning in Engineering Education. Proc. IEEE ICELIE 2025.
[3] Tophel, A., Chen, L., & Hettiyadura, U. (2025). Towards an AI Tutor for Undergraduate Geotechnical Engineering: A Comparative Study. Information Retrieval Journal.
[4] Martinez, J.R., Chong, M., & Maya, S. (2025). Enhancing Algorithmic Thinking and Emotional Resilience in Programming Education Through AI Powered Virtual Tutoring. Proc. IEEE FIE 2025.
[5] Yan, H., Lu, Q., & Wang, X. (2025). Build AI Assistants Using Large Language Models and Agents to Enhance Biomechanics Education. arXiv:2511.15752.
