The question of how syntactic structure relates to semantic interpretation has occupied linguists since Montague first demonstrated that natural languages could be given the same rigorous semantic treatment as formal logical systems. The relationship between syntax and semantics—whether they constitute separate modules with a clean interface, deeply entangled systems, or something in between—remains unresolved. Recent work from both theoretical and computational perspectives is adding new dimensions to this longstanding debate.
The Research Landscape: Synthesis and Fragmentation
Monteza and Hermansyah (2025) provide a useful synthesis in their review paper, which draws together theoretical, empirical, computational, and cross-linguistic threads. Their central observation is that the interface question looks different depending on which linguistic tradition you start from: generativist approaches tend to treat syntax as primary, with semantics interpreting syntactic structures; cognitive-functional approaches see semantics as driving syntactic organization; and construction grammar blurs the boundary entirely.
The review highlights an underappreciated point: much of the disagreement about the interface stems not from empirical differences but from different definitions of what syntax and semantics are. If syntax is defined narrowly (phrase structure rules, movement operations), the interface appears relatively clean. If syntax is defined broadly (all structural regularities, including information structure and prosody), the boundary with semantics becomes diffuse.
Cross-Linguistic Evidence
Szabolcsi (2024) offers a selective but carefully argued set of cases where cross-linguistic data has been important to interface theory. Her examples include the role of Speaker and Addressee in grammar, mismatches between morphosyntactic form and semantic function, and the scopal behavior of quantifiers across languages.
A particularly instructive case involves quantifier scope. In English, "Every student read a book" is ambiguous: it can mean every student read the same book, or each read a different one. Many languages resolve this ambiguity syntactically—scope correlates with surface word order. But other languages (notably Hungarian, which Szabolcsi has studied extensively) show scope-word order mismatches that suggest the syntax-semantics mapping is not a simple surface-to-meaning correspondence. These cross-linguistic differences constrain which theories of the interface are viable: any adequate theory must accommodate both scope-transparent and scope-opaque languages.
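The two readings of "Every student read a book" correspond to two quantifier orderings in standard first-order notation:

```latex
% Surface scope (every > a): each student read a possibly different book
\forall x\,\big(\mathrm{student}(x) \rightarrow \exists y\,(\mathrm{book}(y) \wedge \mathrm{read}(x,y))\big)

% Inverse scope (a > every): one particular book was read by every student
\exists y\,\big(\mathrm{book}(y) \wedge \forall x\,(\mathrm{student}(x) \rightarrow \mathrm{read}(x,y))\big)
```

A scope-transparent language would pair each reading with a distinct surface order; the Hungarian facts show that this pairing cannot be taken for granted.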
The methodological implication is clear: the interface question cannot be settled by studying English alone. Data from typologically diverse languages—particularly those with free word order, rich morphology, or different scope-marking strategies—provides essential constraints.
The LLM Dimension
Kuczynski (2025) enters the debate from an unexpected angle, arguing that the success of large language models provides empirical support for classical theories of meaning, particularly the distinction between semantics and pragmatics. The argument goes roughly like this: LLMs achieve their linguistic competence by learning distributional patterns from text alone. If these distributional patterns are sufficient to approximate compositional semantic behavior, then something like compositional literal meaning must be a real property of language—not merely an artifact of formal theory.
This is a provocative claim, and it deserves careful scrutiny. The counterargument is straightforward: LLMs may achieve compositional-looking behavior through mechanisms that have nothing to do with compositionality as formal semanticists understand it. Statistical approximation of compositional outputs is not the same as compositional computation. Whether these different mechanisms matter—and for what purposes—is itself an open question.
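The distinction at issue can be made concrete. In compositional computation, word meanings are functions and sentence meaning is built by applying them to one another; nothing in the process consults string statistics. The following is a minimal sketch (all sets and names are illustrative, not drawn from any of the cited papers) of a generalized-quantifier treatment in the Montagovian style:

```python
# Toy model: a small domain of individuals and two set-denoting predicates.
DOMAIN = {"ann", "bob", "carol"}
STUDENTS = {"ann", "bob"}
SLEEPERS = {"ann", "bob"}

# Determiner meanings as generalized quantifiers: functions from two
# set-predicates (restrictor noun, verbal scope) to a truth value.
def every(restrictor, scope):
    return all(scope(x) for x in DOMAIN if restrictor(x))

def some(restrictor, scope):
    return any(scope(x) for x in DOMAIN if restrictor(x))

# Content-word meanings as characteristic functions of sets.
student = lambda x: x in STUDENTS
sleeps = lambda x: x in SLEEPERS

# "Every student sleeps": the sentence meaning is assembled purely by
# function application over the constituents -- this is compositional
# computation, as opposed to statistical approximation of its outputs.
print(every(student, sleeps))   # True: both students are sleepers
print(some(student, sleeps))    # True: at least one student sleeps
```

An LLM could reproduce the truth-value judgments above from distributional patterns without ever representing anything like these functions; that gap is exactly what the counterargument to Kuczynski turns on.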
Sattorova et al. (2025) provide a more grounded assessment of how computational linguistics has moved from rule-based syntactic analysis toward deeper semantic processing in the LLM era. Their observation is that while LLMs handle many syntax-related tasks well, they still struggle with tasks requiring genuine semantic understanding—negation scope, quantifier interactions, and metaphor processing among them. This pattern suggests that distributional learning captures some but not all aspects of the syntax-semantics mapping.
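Task-based assessments of this kind typically rest on minimal pairs that isolate one semantic contrast. As a hedged illustration (the templates and helper below are hypothetical, not the authors' materials), a negation-scope probe can pair a string that forces the wide-scope reading of negation with an ambiguous string whose intended reading is narrow scope:

```python
# Build minimal pairs probing negation scope. Each item pairs an
# unambiguous wide-scope-negation string with a string that is
# string-ambiguous but annotated with its intended narrow-scope reading.
def make_pairs(nouns, verbs):
    """nouns: list of singular nouns; verbs: list of (base, past) pairs."""
    pairs = []
    for n in nouns:
        for base, past in verbs:
            pairs.append({
                "neg_wide":   f"Not every {n} {past}.",      # unambiguous: not > every
                "neg_narrow": f"Every {n} did not {base}.",  # intended: every > not
            })
    return pairs

items = make_pairs(["student", "voter"], [("pass", "passed"), ("attend", "attended")])
print(len(items))  # 4 items: 2 nouns x 2 verbs
```

A model that tracks scope should treat the two members of each pair as non-equivalent; a model relying on surface overlap is likely to conflate them.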
Critical Analysis: Claims and Evidence
| Claim | Evidence | Verdict |
|---|---|---|
| Syntax and semantics are best understood as entangled rather than modular | Theoretical arguments + cross-linguistic scope data | ⚠️ Uncertain — depends heavily on how the modules are defined |
| Cross-linguistic data constrains interface theories | Szabolcsi's scope examples from Hungarian and other languages | ✅ Supported — different languages motivate different architectural assumptions |
| LLMs vindicate compositional semantics | Kuczynski's distributional learning argument | ⚠️ Uncertain — statistical approximation ≠ compositional computation |
| LLMs struggle with genuine semantic composition | Sattorova et al.'s task-based analysis | ✅ Supported — negation, quantifiers, metaphor remain challenging |
What the Disagreements Reveal
The disagreement between Kuczynski (who sees LLMs as evidence for classical compositional semantics) and the implicit conclusion from Sattorova et al. (whose findings suggest LLMs are limited in compositional semantics) is instructive. Both positions can be simultaneously correct: LLMs may vindicate the idea that compositional meaning is real (it leaves a distributional trace in text) while also demonstrating that distributional learning does not fully capture it (because composition requires more than pattern matching). The interface, in other words, may be real but not fully learnable from surface statistics alone.
Open Questions and Future Directions
What This Means for Your Research
For theoretical linguists, the message from this literature is that the interface question remains productively open. Neither radical modularity nor radical anti-modularity is well-supported; the interesting work lies in characterizing the specific ways structure and meaning interact.
For computational linguists, LLMs offer new tools for studying the interface—not as answers, but as probes. Their successes reveal which aspects of the mapping are distributionally recoverable; their failures reveal which aspects are not.
For typologists, this is a reminder that cross-linguistic data is not merely illustrative but constitutive of interface theory. The field needs more systematic typological work on scope, information structure, and the morphosyntax-semantics mapping.