
ARR as a Valuation Signal: How VCs Evaluate AI Startups (and Why the Metrics May Mislead)

AI startups like Midjourney and ElevenLabs report ARR growth from near zero to around $100M in under two years, but how reliable are these revenue signals? A critical analysis suggests that ARR in AI startups can conflate genuine product-market fit with API consumption spikes, free-tier conversions, and VC-subsidized growth.

By Sean K.S. Shin

This blog summarizes research trends based on published paper abstracts. Specific numbers or findings may contain inaccuracies. For scholarly rigor, always consult the original papers cited in each post.

Annual Recurring Revenue (ARR) has become the dominant valuation metric for venture-backed technology companies, particularly in the Software-as-a-Service (SaaS) sector. The logic is straightforward: recurring revenue indicates predictable, scalable demand that justifies high valuation multiples. For AI startups, ARR growth has been extraordinary. Companies like Midjourney, ElevenLabs, and various LLM API providers have reported ARR rising from near zero to nearly $100 million within approximately 20 months (the timeline comes from the ElevenLabs example cited by Ratnatunga). These numbers attract enormous VC investment and are used to justify multi-billion-dollar valuations. But how reliable is ARR as a signal of genuine business quality in the AI sector?

The Research Landscape: ARR Under Scrutiny

Ratnatunga (2025) provides a critical analysis of ARR as a growth metric in the VC ecosystem, with particular attention to AI startups. The analysis identifies several mechanisms through which AI startup ARR can be inflated relative to underlying business quality:

API consumption spikes: Many AI startups generate revenue through usage-based API pricing. A developer building a prototype may consume $10,000/month in API calls during active development and $100/month thereafter. ARR calculated during the development spike dramatically overstates sustainable revenue.

Free-to-paid conversion timing: Startups that offer generous free tiers can engineer ARR spikes by converting free users to paid plans through feature restrictions or trial expirations—creating a one-time revenue bump that is recorded as "recurring" but may not recur if converted users subsequently churn.

VC-subsidized consumption: Some VC-backed startups offer below-cost pricing to acquire customers rapidly, generating ARR that depends on continued VC subsidy rather than sustainable unit economics. When pricing normalizes, churn may dramatically reduce the ARR base.

Gross vs. net revenue: AI companies that use third-party LLM APIs (OpenAI, Anthropic) as backend infrastructure may report gross revenue (total customer payments) rather than net revenue (payments minus API costs), inflating the apparent size and margin of the business.
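The first and fourth mechanisms reduce to simple arithmetic. The sketch below annualizes the prototype figures from the text ($10,000/month during development, $100/month afterwards) and then applies a gross-to-net adjustment. The function names and the 50% API cost share are illustrative assumptions, not figures from Ratnatunga.

```python
def annualize(monthly_revenue: float) -> float:
    """Run-rate ARR: one month's revenue multiplied by 12."""
    return monthly_revenue * 12


def net_revenue(gross: float, api_cost_share: float) -> float:
    """Revenue left after third-party API costs are paid out."""
    return gross * (1 - api_cost_share)


spike_month = 10_000.0   # monthly spend during active development (from the text)
steady_month = 100.0     # monthly spend afterwards (from the text)

arr_at_spike = annualize(spike_month)
arr_at_steady = annualize(steady_month)
overstatement = arr_at_spike / arr_at_steady

# Hypothetical assumption: half of customer payments pass through
# to a third-party LLM provider.
net_arr_at_spike = net_revenue(arr_at_spike, api_cost_share=0.5)

print(f"ARR measured during the spike: ${arr_at_spike:,.0f}")
print(f"ARR measured at steady state:  ${arr_at_steady:,.0f}")
print(f"Overstatement factor:          {overstatement:.0f}x")
print(f"Spike ARR net of API costs:    ${net_arr_at_spike:,.0f}")
```

Under these assumptions, a run-rate snapshot taken during the development spike is 100 times the steady-state figure, so the timing of the measurement matters as much as the number itself.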

Ratnatunga argues that VC reliance on ARR creates a "circular startup ecosystem": startups optimize for ARR growth because VCs reward it, VCs invest based on ARR growth because it predicts future funding rounds, and future funding rounds validate the ARR-based valuation—creating a self-referential cycle that can persist until an external shock (market downturn, rate increases, competitive disruption) breaks the loop.

Revenue Composition Analysis

Zhao (2026) uses graph-based analysis to examine AI startup revenue composition, distinguishing between:

Core product revenue: Payments for the primary product or service, from customers who derive ongoing value.
Experimental revenue: Payments from enterprises and developers testing AI capabilities, with uncertain conversion to long-term usage.
Network-effect revenue: Revenue driven by platform lock-in, data network effects, or ecosystem dependencies—potentially more durable than product-quality-based revenue.

The analysis finds that for early-stage AI startups, experimental revenue often constitutes 40% to more than half of total ARR, and that only part of this revenue (from roughly 15% up to a meaningful fraction) converts to sustained usage. The implication is that headline ARR may overstate sustainable revenue by a factor of 2x or more.
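The link between these shares and the overstatement factor can be made explicit. If a share e of ARR is experimental and a fraction c of it converts, then sustainable ARR is roughly headline ARR times (1 - e) + e * c. The sketch below works through that arithmetic. It assumes, as a simplification, that all non-experimental revenue is retained; only the 40% and 15% lower bounds come from the text, and the other share values and function names are hypothetical.

```python
def sustainable_fraction(experimental_share: float, conversion_rate: float) -> float:
    """Share of headline ARR expected to persist.

    Simplifying assumption: non-experimental revenue is fully retained,
    and experimental revenue persists only at the conversion rate.
    """
    return (1 - experimental_share) + experimental_share * conversion_rate


def overstatement_factor(experimental_share: float, conversion_rate: float) -> float:
    """Headline ARR divided by sustainable ARR."""
    return 1 / sustainable_fraction(experimental_share, conversion_rate)


conversion = 0.15  # lower bound quoted in the text
for share in (0.40, 0.50, 0.60):  # 0.40 is from the text; the rest are hypothetical
    factor = overstatement_factor(share, conversion)
    print(f"experimental share {share:.0%}: headline ARR is {factor:.2f}x sustainable ARR")
```

Under this simplified model, the lower-bound inputs (40% experimental, 15% conversion) give a factor of about 1.5x. Reaching 2x requires an experimental share of roughly 59% or more at the same conversion rate, or additional churn in core revenue, which is consistent with the "more than half" end of the reported range.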

Early-Stage Valuation Challenges

Onyshchenko (2025) examines the broader challenge of valuing early-stage IT companies, noting that conventional valuation methods (DCF, comparable multiples, precedent transactions) are poorly suited to companies with unstable cash flows, evolving business models, and limited historical data. For AI startups, these challenges are amplified by:

Rapid technology obsolescence (today's differentiating model capability may be commoditized within months).
Uncertain competitive dynamics (open-source alternatives can emerge quickly).
Regulatory uncertainty (EU AI Act, copyright litigation, data privacy regulations).

The study proposes a structured approach combining multiple valuation methods with scenario analysis—but acknowledges that the fundamental uncertainty of AI startup trajectories resists reduction through any single framework.
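One way such a combination could be organized is a probability-weighted blend: value the company under each method within each scenario, average across methods, then weight by scenario probability. The sketch below is an illustration only. The scenario names, probabilities, method weights, and dollar values are all hypothetical and are not taken from Onyshchenko.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    probability: float
    method_values: dict[str, float]  # valuation in USD, keyed by method name


def blended_value(scenarios: list[Scenario], method_weights: dict[str, float]) -> float:
    """Probability-weighted average of method-weighted valuations."""
    if not math.isclose(sum(s.probability for s in scenarios), 1.0):
        raise ValueError("scenario probabilities must sum to 1")
    if not math.isclose(sum(method_weights.values()), 1.0):
        raise ValueError("method weights must sum to 1")
    total = 0.0
    for s in scenarios:
        # Blend the methods inside one scenario, then weight by its probability.
        within = sum(w * s.method_values[m] for m, w in method_weights.items())
        total += s.probability * within
    return total


# All figures below are hypothetical.
scenarios = [
    Scenario("capability commoditized", 0.5, {"dcf": 20e6, "multiples": 40e6}),
    Scenario("base case", 0.3, {"dcf": 80e6, "multiples": 120e6}),
    Scenario("breakout", 0.2, {"dcf": 300e6, "multiples": 500e6}),
]
weights = {"dcf": 0.5, "multiples": 0.5}

expected_value = blended_value(scenarios, weights)
print(f"Blended valuation: ${expected_value:,.0f}")
```

The single blended number is less informative than the spread across scenarios, which is the point the study concedes: no one framework removes the underlying uncertainty.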

Critical Analysis: Claims and Evidence

Claim	Evidence	Verdict
AI startup ARR can overstate sustainable revenue by 2x+	Ratnatunga + Zhao: structural analysis of revenue composition	⚠️ Uncertain: plausible argument, limited systematic data
VC creates a circular ARR-valuation ecosystem	Ratnatunga: conceptual framework	⚠️ Uncertain: consistent with behavioral VC literature but not empirically tested
Experimental revenue constitutes 40% to more than half of early AI startup ARR	Zhao: graph-based analysis	⚠️ Uncertain: methodology details needed for assessment
Conventional valuation methods are inadequate for AI startups	Onyshchenko: structured argument	✅ Supported: well-recognized challenge in VC
ARR is a useless metric for AI startups	None of the papers makes this extreme claim	❌ Refuted: ARR is informative but requires decomposition

What Better Metrics Might Look Like

The reviewed literature points toward several supplementary metrics that could improve AI startup evaluation:

Net revenue retention (NRR): Revenue from existing customers over time, capturing both expansion and churn. NRR > 120% indicates genuine product stickiness; NRR < 100% signals unsustainable growth.
API margin: Revenue minus third-party API costs, revealing the true margin captured by the startup rather than passed through to infrastructure providers.
Customer concentration: What share of ARR comes from the top 10 customers? High concentration amplifies churn risk.
Consumption velocity trend: Is per-customer usage increasing (product finding deeper adoption) or declining (initial experimentation fading)?
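The four metrics above can be computed from per-customer revenue and usage records. The following is a minimal sketch under stated assumptions: the function names, the dictionary-based input format, and the sample figures are hypothetical, and NRR is computed only over customers present at the start of the period.

```python
def net_revenue_retention(start_mrr: dict[str, float], end_mrr: dict[str, float]) -> float:
    """End-of-period revenue from the starting cohort divided by its starting revenue.

    Churned customers count as zero; customers acquired during the period are ignored.
    """
    base = sum(start_mrr.values())
    retained = sum(end_mrr.get(customer, 0.0) for customer in start_mrr)
    return retained / base


def api_margin(revenue: float, third_party_api_cost: float) -> float:
    """Revenue minus third-party API costs, as defined in the text."""
    return revenue - third_party_api_cost


def top_n_concentration(revenue_by_customer: dict[str, float], n: int = 10) -> float:
    """Share of total revenue contributed by the n largest customers."""
    values = sorted(revenue_by_customer.values(), reverse=True)
    return sum(values[:n]) / sum(values)


def usage_trend(monthly_usage: list[float]) -> float:
    """Least-squares slope of usage against month index (units per month)."""
    n = len(monthly_usage)
    mean_x = (n - 1) / 2
    mean_y = sum(monthly_usage) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(monthly_usage))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical sample data: monthly recurring revenue per customer.
start = {"a": 100.0, "b": 50.0, "c": 50.0}
end = {"a": 150.0, "b": 40.0, "d": 80.0}  # "c" churned, "d" is new

nrr = net_revenue_retention(start, end)
margin = api_margin(revenue=270.0, third_party_api_cost=120.0)
concentration = top_n_concentration(end, n=1)
slope = usage_trend([100.0, 80.0, 60.0, 40.0])

print(f"NRR: {nrr:.0%}, API margin: ${margin:,.0f}, "
      f"top-1 concentration: {concentration:.0%}, usage slope: {slope:+.1f}/month")
```

In this sample, expansion at one customer does not offset the churn and contraction elsewhere, so NRR falls below 100% even though total revenue grew through a new customer. Headline ARR alone would hide that pattern.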

Open Questions and Future Directions

Longitudinal ARR tracking: Can we build datasets tracking AI startup ARR from initial growth through to sustainability (or collapse) to empirically test the overstatement hypothesis?

VC due diligence evolution: How are sophisticated VCs adapting their due diligence processes for AI startups? Are they decomposing ARR into the components identified here?

Open-source disruption pricing: How does the availability of open-source AI models (LLaMA, Mistral, DBRX) affect the pricing power and ARR sustainability of proprietary AI startups?

Regulatory impact: How might the EU AI Act's compliance requirements affect AI startup cost structures and thus the relationship between ARR and profitability?

Correction dynamics: When AI startup valuations correct (as they did for some companies in 2023–2024), what distinguishes companies that survive from those that fail?

Implications for Researchers and Practitioners

For venture capitalists, the evidence argues for decomposing ARR into its component sources (core vs. experimental, gross vs. net, organic vs. subsidized) rather than treating headline ARR as an unqualified growth signal. For AI startup founders, the practical implication is that sustainable unit economics—not ARR growth rate—will determine long-term survival once the current funding environment normalizes.

For management researchers, the AI startup valuation challenge provides a fertile empirical context for studying information asymmetry, signaling, and herding behavior in venture capital markets—dynamics that have been theorized extensively but are difficult to observe in real time. The current AI investment cycle offers an unusually transparent window into these processes.

References

[1] Ratnatunga, J. (2025). ARR Growth Metric: Its use in Venture Capital and the Circular Startup Ecosystem. Open Access Journal, 1023223.

[2] Zhao, J. (2026). Graph-Based Deep Dive on AI Startup Revenue Composition and Venture Capital Network Effect. Research Journal, 4vg9t911.

[3] Onyshchenko, B. (2025). Specifics of investment valuation for early-stage companies in the IT sector. NaUKMA Economic Sciences, 10(1), 153–159.

[4] Hassel, J.-E., Clausen, T.H. & Rasmussen, E. (2025). Venture builders: new venture production in the entrepreneurship industry. Small Business Economics, 65, 01132.
