There is no submit button for AI answers. To get cited, your content has to clear three gates, in order: an assistant's crawler must be able to reach you, it must be able to read you in raw HTML, and your content has to be structured so an answer is easy to lift. Citations go to pages that are reachable, readable and quotable — in that order. You don't buy your way in; you earn it by being the cleanest source on the question someone asked.

What “getting cited” actually means

When someone asks ChatGPT, Claude, Gemini or Perplexity a question, the assistant answers in one of two ways. It either recalls what it absorbed about you during training, or it retrieves live pages at answer time — browsing or searching, reading what it finds, and composing a reply. A citation is when it names or links you as the source behind that reply.

You can't edit a model's training data, and you can't predict which questions you'll surface on. What you can control is whether you're a clean, retrievable, unambiguous source when an assistant goes looking, and that is what this guide is about. The good news: training recall and live retrieval reward the same things, so fixing the retrieval path makes you a better training candidate too.

The three gates (in order)

Most “how to rank in AI” advice jumps straight to schema and llms.txt. That's backwards. The gates are sequential — each one is pointless until the one before it is open.

  1. Reachable. The assistant's crawler (GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot) has to be allowed to fetch your pages, and Google-Extended, which is a robots.txt control token rather than a separate crawler, must not be disallowed. The trap is that you can be blocked at your CDN or WAF even when robots.txt looks fine, so the bot gets a 403 and you never appear. We cover how to check crawler access here.
  2. Readable. AI crawlers generally don't run JavaScript. If your content only appears after the page hydrates, the bot receives an empty shell: reachable, but nothing to read. View your page's source (not the inspector, which shows the DOM after scripts have run) and confirm the actual words are in the initial HTML.
  3. Quotable. Once you're reachable and readable, the question becomes whether your content is easy to lift as an answer: a direct answer near the top, question-shaped headings, and structured data that says what you are. This is where most sites that “do everything right” still lose.
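
For the robots.txt half of the first gate, a minimal sketch using Python's standard-library parser might look like this. The robots.txt content and the /pricing URL are made-up examples, and the stdlib parser's matching can differ in edge cases from how a particular crawler reads the file. It says nothing about CDN or WAF blocks, which only show up when you actually request the page.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the text above.
AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def crawler_access(robots_txt, url, agents=AI_CRAWLERS):
    """Return {agent: allowed?} for one URL, judged by robots.txt rules only."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt: blocks GPTBot everywhere, everyone else only from /private/.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""

report = crawler_access(SAMPLE_ROBOTS, "https://yourdomain.com/pricing")
for agent, allowed in report.items():
    print(f"{agent:16} {'allowed' if allowed else 'BLOCKED by robots.txt'}")
```

With this sample file, GPTBot is reported as blocked and the other four tokens as allowed. To use it on a real site, pass in the text of your own robots.txt instead of SAMPLE_ROBOTS.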

What makes content quotable

Assistants compose answers from passages they can extract cleanly. Write so the answer is impossible to miss:

  • Answer first. Open each section with a one- or two-sentence answer that stands on its own, then explain. Models lift the lede; if your answer is buried under three paragraphs of wind-up, it gets skipped.
  • Question-shaped headings. Use headings that match how people ask (“How do I…”, “What is…”). They map directly onto prompts.
  • FAQ blocks. A short question-and-answer section is the most-quoted format in AI answers — and marking it up as FAQPage JSON-LD is the single strongest structured-data signal at the retrieval layer.
  • Define your entity plainly. Say who you are, what you do and who you're for, in plain words, on the page. Don't make the model infer it from a clever tagline.
  • Lists, tables and concrete data. Comparisons, steps and numbers are easy to extract and cite. Vague prose is not.
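
As an illustration of the FAQ markup mentioned above, here is a small sketch that builds a schema.org FAQPage object in Python and wraps it in the script tag a server-rendered page would carry. The helper names faq_jsonld and as_script_tag are invented for this example, and the question text is borrowed from this guide's own FAQ.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data):
    """Serialize to the <script> element that belongs in the served HTML."""
    # Escape "</" so answer text can never close the script element early.
    body = json.dumps(data, indent=2, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


FAQ = [
    ("Can I pay to get cited by ChatGPT?",
     "No. There is no ad slot, submission form or pay-to-rank option for AI answers."),
    ("Do I need an llms.txt file to get cited?",
     "No. It helps, but it is not required and it is not the first thing to fix."),
]

faq = faq_jsonld(FAQ)
tag = as_script_tag(faq)
print(tag)
```

The questions and answers in the markup should match the visible FAQ text on the page, so generating both from the same list of pairs is one way to keep them in sync.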

Structured data that helps you get cited

Structured data (schema.org JSON-LD) is how you tell an assistant what a page is in a format it doesn't have to guess at. It has to live in the HTML the crawler actually receives, not injected by JavaScript after load. The ones that matter for citation:

  • Organization. Name, URL and logo, so engines can confirm who you are as an entity.
  • Organization sameAs. Links to your profiles elsewhere (LinkedIn, X, Crunchbase, Wikidata). This is how a model connects “the page” to “the entity it already knows,” which makes it far likelier to name you.
  • FAQPage. The strongest retrieval signal: Q&A is the format assistants quote most.
  • Article. For posts and guides, include author, datePublished and dateModified so your content reads as fresh and attributable.
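
A rough sketch of the Organization and Article objects described above, built in Python. Every value here (Example Co, the profile URLs, the headline, the author and the dates) is a placeholder invented for illustration, and the two helper functions are hypothetical names, not part of any library.

```python
import json
from datetime import date


def organization_jsonld(name, url, logo, same_as):
    """Organization with sameAs links to the entity's profiles elsewhere."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "sameAs": list(same_as),
    }


def article_jsonld(headline, author_name, published, modified):
    """Article with the attribution and freshness fields named above."""
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


# Placeholder values only.
org = organization_jsonld(
    name="Example Co",
    url="https://yourdomain.com",
    logo="https://yourdomain.com/logo.png",
    same_as=[
        "https://www.linkedin.com/company/example-co",
        "https://x.com/exampleco",
    ],
)
article = article_jsonld(
    headline="Example guide title",
    author_name="Jane Author",
    published=date(2025, 1, 15),
    modified=date(2025, 3, 2),
)

print(json.dumps(org, indent=2))
print(json.dumps(article, indent=2))
```

Each object would be emitted in its own application/ld+json script element in the HTML the server sends. The date check is a small guard against a stale dateModified creeping in when a template is copied.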

Be the entity, not just a page

Citation isn't only on-page. Assistants build a picture of you from across the web, so consistency compounds. Use the same name and description everywhere, link your profiles with sameAs, and earn mentions on sources models read often — well-known publications, directories, comparison pages, community threads. The more places describe you the same way, the more confidently an assistant will name you when it's relevant. An llms.txt file is a nice finishing touch here — a curated map of your key pages — but only after the three gates are open.
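
To make the "curated map" idea concrete, here is a small sketch that renders an llms.txt body. It assumes the commonly described layout of the llms.txt proposal (a title heading, a one-line summary as a blockquote, then sections of annotated links); the company name, URLs and descriptions are placeholders, and build_llms_txt is a hypothetical helper.

```python
def build_llms_txt(name, summary, sections):
    """Render an llms.txt body: title heading, blockquote summary, link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content only.
llms_txt = build_llms_txt(
    name="Example Co",
    summary="Example Co makes invoicing software for freelancers.",
    sections={
        "Key pages": [
            ("Pricing", "https://yourdomain.com/pricing", "Plans and what each includes"),
            ("FAQ", "https://yourdomain.com/faq", "Short answers to common questions"),
        ],
    },
)
print(llms_txt)
```

Using the same name and one-line description here as in your Organization markup and external profiles keeps the entity description consistent.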

How to check where you stand

You can probe the gates by hand. Request a page as the bot to confirm it's reachable:

curl -A "GPTBot" -I https://yourdomain.com
# 200 OK  -> reachable
# 403 / 401 / challenge page -> blocked upstream
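
The same probe can be repeated for several crawler tokens in one go. The following is a sketch under assumptions: it sends only the short token as the User-Agent, as the curl command does, so it catches user-agent rules but not blocks that depend on the crawler's IP address. Google-Extended is left out because it is a robots.txt token, not a request user agent. The function names are invented for this example.

```python
import urllib.error
import urllib.request

AGENTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]


def classify(status):
    """Map an HTTP status to the verdicts used in the curl comments above."""
    if 200 <= status < 300:
        return "reachable"
    if status in (401, 403):
        return "blocked upstream"
    return "inspect manually"  # challenge pages can come back under other codes


def probe(url, agent, timeout=10.0):
    """Send a HEAD request with the given User-Agent and return the status code."""
    request = urllib.request.Request(url, method="HEAD", headers={"User-Agent": agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # urllib raises on 4xx/5xx; the code is still the answer


def report(url):
    """Print one line per crawler token. Needs network access."""
    for agent in AGENTS:
        status = probe(url, agent)
        print(f"{agent:16} {status}  {classify(status)}")
```

Calling report("https://yourdomain.com") prints one verdict per token. A site that returns 200 to your browser but 403 to these tokens is being filtered somewhere in front of the origin.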

Then view the page source (not the inspector) and check your real content is in the HTML, and search it for application/ld+json to see whether your structured data is actually present server-side.
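
The manual source check can also be scripted. This is a minimal sketch that reads HTML exactly as served (no JavaScript is executed), collects the visible text and any application/ld+json blocks, and reports whether a phrase you expect is present. Both sample pages and the RawHtmlAudit and audit_raw_html names are made up for the example; fetching the HTML is left out, so feed it whatever your server returns.

```python
import json
from html.parser import HTMLParser


class RawHtmlAudit(HTMLParser):
    """Collect visible text and JSON-LD blocks from unrendered HTML."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.jsonld_blocks = []
        self._mode = None  # "jsonld", "skip" or None (ordinary text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            kind = (dict(attrs).get("type") or "").strip().lower()
            self._mode = "jsonld" if kind == "application/ld+json" else "skip"
            self._buffer = []
        elif tag == "style":
            self._mode = "skip"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._mode == "jsonld":
                self.jsonld_blocks.append("".join(self._buffer))
            self._mode = None

    def handle_data(self, data):
        if self._mode == "jsonld":
            self._buffer.append(data)
        elif self._mode is None:
            self.text_parts.append(data)


def audit_raw_html(html, must_contain):
    """Report which phrases and which JSON-LD @type values the raw HTML holds."""
    parser = RawHtmlAudit()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.text_parts).split()).lower()
    types = []
    for block in parser.jsonld_blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            types.append("INVALID JSON")
            continue
        items = data if isinstance(data, list) else [data]
        types.extend(str(item.get("@type")) for item in items if isinstance(item, dict))
    return {
        "phrases_found": {phrase: phrase.lower() in text for phrase in must_contain},
        "jsonld_types": types,
    }


# Two hypothetical pages: one server-rendered, one a JavaScript-only shell.
SERVER_RENDERED = """<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}</script>
</head><body><h1>What is Example Co?</h1><p>Example Co makes invoicing software.</p></body></html>"""
EMPTY_SHELL = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'

rendered = audit_raw_html(SERVER_RENDERED, ["invoicing software"])
shell = audit_raw_html(EMPTY_SHELL, ["invoicing software"])
print(rendered)
print(shell)
```

The server-rendered sample reports the phrase as found along with an Organization block, while the shell reports neither, which is the "reachable but nothing to read" case.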

Or do it in one step. AEOScan crawls your site the way the assistants do, runs 34 checks across six areas (AI Crawler Access, llms.txt, Structured Data, Content Structure, Technical Foundations and Agent Readiness) and asks ChatGPT, Claude, Gemini and Perplexity what they actually know about you, printing their unedited answers next to the checks that explain the gap. You get a score, the assistants' real words, and copy-paste fixes. Free, about 30 seconds, no signup.

The practical checklist

  1. Allow the AI crawlers in robots.txt and at your CDN/WAF.
  2. Confirm your content is in the raw HTML, with JavaScript off.
  3. Lead pages with a direct, self-contained answer.
  4. Add an FAQ section with FAQPage JSON-LD.
  5. Ship Organization JSON-LD with sameAs links.
  6. Keep names and descriptions consistent across the web.
  7. Add an llms.txt map once the above is done.
  8. Re-scan, read what the assistants say, and fix the top gap.

The bottom line

Getting cited by ChatGPT isn't a growth hack — it's the natural result of being the most reachable, readable and quotable source on a question. Open the three gates in order, write so the answer is easy to lift, and make it trivial for an assistant to know who you are. Do that and citations follow; skip the order and no amount of schema will save you.

Frequently asked questions

Can I pay to get cited by ChatGPT?

No. There is no ad slot, submission form or pay-to-rank option for AI answers. Assistants cite the sources they can reach, read and lift an answer from. You earn citations by being the cleanest, clearest source on the question — not by buying placement.

How long after I make changes will ChatGPT cite me?

It depends on how the answer is produced. When an assistant browses or retrieves live pages (Perplexity, ChatGPT with search, Gemini), it picks up your changes the next time it crawls you — usually days to a few weeks. When it answers from training data, the lag is far longer and outside your control. So fix the technical gates now; the live-retrieval surfaces reward you first.

What content format gets cited most?

Answer-first question-and-answer content. Lead each section with a direct, self-contained answer, use question-shaped headings, and add an FAQ block marked up as FAQPage JSON-LD. In citation research, FAQPage is the single strongest structured-data signal at the retrieval layer — Q&A is the format assistants quote most.

Does getting cited by ChatGPT help my Google ranking?

They are different systems, so there's no direct ranking transfer. But the fundamentals overlap heavily: a site that is crawlable, fast, server-rendered and well-structured tends to do well in both. Think of getting cited as Answer Engine Optimization (AEO), a sibling of SEO rather than a replacement.

Do I need an llms.txt file to get cited?

No — it helps, but it is not required and it is not the first thing to fix. Reachability (crawlers allowed), readability (content in raw HTML) and structure (clean JSON-LD) matter more. Add llms.txt once those are in place as a curated map for assistants.