Query Fan-Out Analysis for Screaming Frog
A custom JavaScript extraction that uses Google's Gemini AI to analyze how your content performs in Google's AI Mode search by predicting query fan-out patterns. Use a Gemini Flash model: some users have reported that the Pro API models don't work with this script, and the Flash models respond quickly.
What is Query Fan-Out?
Google's AI Mode doesn't just process your query: it explodes it into multiple sub-queries, searching across Knowledge Graphs and web indexes to synthesize comprehensive answers. This tool helps you understand and optimize for this new search paradigm.
The Problem
- Most websites only answer 30% of queries Google's AI generates from their content
- Traditional keyword optimization misses 70% of AI search opportunities
- No existing tools measure query fan-out coverage
The Solution
This script analyzes your pages to:
- Identify your primary entity
- Predict 8-10 sub-queries Google would generate
- Score your content coverage (Yes/Partial/No)
- Suggest content gaps to fill
- Predict follow-up questions
Prerequisites
- Screaming Frog SEO Spider (any version that supports Custom JavaScript)
- Google Gemini API Key (free tier available)
- Basic understanding of JavaScript (optional)
Installation
1. Get Your Gemini API Key
- Visit Google AI Studio
- Click "Create API key"
- Copy your API key
2. Set Up in Screaming Frog
- Open Screaming Frog
- Navigate to: Configuration > Custom > Custom JavaScript
- Click "Add"
- Configure:
- Name: Query Fan-Out Analysis
- Extraction: Function Value
- Extract Text: Paste the script below
3. Add Your API Key
Replace YOUR_GEMINI_API_KEY in the script with your actual API key:
const apiKey = 'YOUR_GEMINI_API_KEY'; // Replace this
The Script
// Gemini AI Query Fan-Out Detection for Screaming Frog
// Version 1.0
const apiKey = 'YOUR_GEMINI_API_KEY';
// Extract semantic chunks from page (layout-aware chunking)
function extractSemanticChunks() {
const chunks = [];
// Extract title and main heading
const title = document.title || '';
const h1 = document.querySelector('h1')?.textContent || '';
if (title || h1) {
chunks.push({
type: 'primary_topic',
content: `${title} ${h1}`.trim()
});
}
// Extract headings and their content (layout-aware chunking)
const headings = document.querySelectorAll('h2, h3');
headings.forEach(heading => {
let content = heading.textContent;
let sibling = heading.nextElementSibling;
let sectionContent = '';
while (sibling && !['H1', 'H2', 'H3'].includes(sibling.tagName)) {
if (sibling.textContent) {
sectionContent += ' ' + sibling.textContent;
}
sibling = sibling.nextElementSibling;
}
if (sectionContent.trim()) {
chunks.push({
type: 'section',
heading: content,
content: sectionContent.trim().substring(0, 500)
});
}
});
// Extract key lists and FAQs
document.querySelectorAll('ul, ol').forEach((list, idx) => {
if (idx < 5 && list.children.length > 2) {
chunks.push({
type: 'list',
content: Array.from(list.children).map(li => li.textContent).join(' | ').substring(0, 300)
});
}
});
// Extract schema.org data if present
const schemas = document.querySelectorAll('script[type="application/ld+json"]');
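schemas.forEach(schema => {
try {
const data = JSON.parse(schema.textContent);
// JSON-LD may be a single object, an array of objects, or wrapped in @graph
const items = Array.isArray(data) ? data : (data['@graph'] || [data]);
items.forEach(item => {
if (item && item['@type']) {
chunks.push({
type: 'structured_data',
content: `Type: ${item['@type']}, ${JSON.stringify(item).substring(0, 200)}`
});
}
});
} catch (e) {
// Skip JSON-LD blocks that fail to parse
}
});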
return chunks;
}
try {
const url = window.location.href;
const chunks = extractSemanticChunks();
// Create comprehensive prompt for Gemini
const prompt = `You are analyzing a webpage for Google's AI Mode query fan-out potential. Google's AI Mode decomposes user queries into multiple sub-queries to synthesize comprehensive answers.
URL: ${url}
SEMANTIC CHUNKS FROM PAGE:
${JSON.stringify(chunks, null, 2)}
Based on this content, perform the following analysis:
1. IDENTIFY PRIMARY ENTITY: What is the main ontological entity or topic of this page?
2. PREDICT FAN-OUT QUERIES: Generate 8-10 likely sub-queries that Google's AI might create when a user asks about this topic. Consider:
- Related queries (broader context)
- Implicit queries (unstated user needs)
- Comparative queries (alternatives, comparisons)
- Procedural queries (how-to aspects)
- Contextual refinements (budget, size, location specifics)
3. SEMANTIC COVERAGE SCORE: For each predicted query, assess if the page content provides information to answer it (Yes/Partial/No).
4. FOLLOW-UP QUESTION POTENTIAL: What follow-up questions would users likely ask after reading this content?
OUTPUT FORMAT:
PRIMARY ENTITY: [entity name]
FAN-OUT QUERIES:
• [Query 1] - Coverage: [Yes/Partial/No]
• [Query 2] - Coverage: [Yes/Partial/No]
...
FOLLOW-UP POTENTIAL:
• [Follow-up question 1]
• [Follow-up question 2]
...
COVERAGE SCORE: [X/10 queries covered]
RECOMMENDATIONS: [Specific content gaps to fill]`;
// Call Gemini API
const requestData = {
contents: [{
parts: [{
text: prompt
}]
}],
generationConfig: {
temperature: 0.3,
topK: 20,
topP: 0.9,
maxOutputTokens: 2048
}
};
const xhr = new XMLHttpRequest();
xhr.open('POST', `https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=${apiKey}`, false);
xhr.setRequestHeader('Content-Type', 'application/json');
xhr.send(JSON.stringify(requestData));
if (xhr.status === 200) {
const response = JSON.parse(xhr.responseText);
if (response.candidates && response.candidates[0] && response.candidates[0].content) {
let analysis = response.candidates[0].content.parts[0].text;
// Add chunking summary
let output = '=== GOOGLE AI MODE QUERY FAN-OUT ANALYSIS ===\n\n';
output += analysis;
output += '\n\n=== CONTENT CHUNKING SUMMARY ===\n';
output += `• Primary Topic Chunks: ${chunks.filter(c => c.type === 'primary_topic').length}\n`;
output += `• Section Chunks: ${chunks.filter(c => c.type === 'section').length}\n`;
output += `• List/FAQ Chunks: ${chunks.filter(c => c.type === 'list').length}\n`;
output += `• Structured Data: ${chunks.filter(c => c.type === 'structured_data').length > 0 ? 'Yes' : 'No'}\n`;
output += `• Total Semantic Chunks: ${chunks.length}`;
return seoSpider.data(output);
} else {
return seoSpider.error('Invalid Gemini response');
}
} else {
return seoSpider.error(`API Error: ${xhr.status}`);
}
} catch (error) {
return seoSpider.error(`Error: ${error.toString()}`);
}
Usage
- Start a crawl in Screaming Frog
- After crawling, open the Custom JavaScript tab (look for the name you saved the snippet under)
- Review the Query Fan-Out Analysis column
- Export results to CSV for bulk analysis
Understanding the Output
=== GOOGLE AI MODE QUERY FAN-OUT ANALYSIS ===
PRIMARY ENTITY: Sustainable E-commerce Marketing
FAN-OUT QUERIES:
• What makes marketing sustainable for online stores? - Coverage: Partial
• How much budget do small e-commerce businesses need? - Coverage: No
• Which eco-friendly marketing channels have best ROI? - Coverage: Yes
...
COVERAGE SCORE: 3/10 queries covered
RECOMMENDATIONS: Add budget guidelines, case studies, measurement frameworks
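For bulk work it can help to turn this text into structured fields. The following is a minimal sketch, not part of the script: `parseFanOutAnalysis` is a hypothetical helper, and it assumes the model followed the OUTPUT FORMAT from the prompt exactly, which is not guaranteed.

```javascript
// Hypothetical helper: parse the plain-text analysis into an object.
// Assumes the model kept to the OUTPUT FORMAT requested in the prompt.
function parseFanOutAnalysis(text) {
  const entityMatch = text.match(/PRIMARY ENTITY:\s*(.+)/);
  const scoreMatch = text.match(/COVERAGE SCORE:\s*(\d+)\s*\/\s*(\d+)/);

  // Query lines look like: "• <query> - Coverage: Yes|Partial|No"
  const queryPattern = /^[•\-*]\s*(.+?)\s+-\s+Coverage:\s*(Yes|Partial|No)\s*$/gm;
  const queries = [];
  let match;
  while ((match = queryPattern.exec(text)) !== null) {
    queries.push({ query: match[1], coverage: match[2] });
  }

  return {
    primaryEntity: entityMatch ? entityMatch[1].trim() : null,
    queries,
    covered: scoreMatch ? Number(scoreMatch[1]) : null,
    total: scoreMatch ? Number(scoreMatch[2]) : null
  };
}
```

Fields that cannot be found come back as `null` (or an empty `queries` array), so rows where the model drifted from the format are easy to spot.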
Interpreting Coverage Scores
- 0-3/10: Critical gaps - Major content opportunities
- 4-6/10: Average - Room for improvement
- 7-8/10: Good - Minor optimizations needed
- 9-10/10: Excellent - Comprehensive coverage
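The bands above can be expressed as a small lookup. This is an illustrative sketch only; `coverageBand` is a hypothetical name, and it assumes the score is reported out of 10 as in the prompt's output format.

```javascript
// Hypothetical helper: map a "covered out of 10" count to the bands listed above.
function coverageBand(covered) {
  if (!Number.isInteger(covered) || covered < 0 || covered > 10) {
    throw new RangeError('covered must be an integer from 0 to 10');
  }
  if (covered <= 3) return 'Critical gaps';
  if (covered <= 6) return 'Average';
  if (covered <= 8) return 'Good';
  return 'Excellent';
}
```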
Advanced Usage
Bulk Analysis
- Crawl your entire site or sitemap
- Export Custom Javascript data
- Sort by Coverage Score to prioritize optimization
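The score is embedded in free text, so sorting an export takes a little parsing first. Below is a rough sketch that assumes each exported row has already been loaded into an object with `url` and `analysis` fields; that row shape and the function names are assumptions, not something Screaming Frog produces directly.

```javascript
// Hypothetical helper: extract covered/total as a ratio, or null if absent.
function coverageRatio(analysis) {
  const m = /COVERAGE SCORE:\s*(\d+)\s*\/\s*(\d+)/.exec(analysis || '');
  if (!m || Number(m[2]) === 0) return null;
  return Number(m[1]) / Number(m[2]);
}

// Sort rows so the weakest coverage comes first; unparseable rows go last.
function sortByCoverage(rows) {
  return rows
    .map(row => ({ url: row.url, ratio: coverageRatio(row.analysis) }))
    .sort((a, b) => {
      if (a.ratio === null) return b.ratio === null ? 0 : 1;
      if (b.ratio === null) return -1;
      return a.ratio - b.ratio;
    });
}
```

Putting the lowest ratios first surfaces the pages with the largest gaps, and rows that returned an error instead of an analysis stay visible at the end rather than being dropped.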
Competitor Analysis
- Crawl competitor pages
- Compare coverage scores
- Identify gaps you can exploit
Content Strategy
Use fan-out queries to:
- Plan new content sections
- Optimize existing pages
- Create FAQ schemas
- Build topic clusters
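As one illustration of the FAQ schema idea, uncovered fan-out queries can be paired with answers you write and turned into schema.org `FAQPage` JSON-LD. This is a sketch under assumptions: `buildFaqSchema` and the `{ question, answer }` input shape are hypothetical, and the answers must come from your own content, not from the model output.

```javascript
// Hypothetical helper: build FAQPage JSON-LD from question/answer pairs.
function buildFaqSchema(faqs) {
  return {
    '@context': 'https://schema.org',
    '@type': 'FAQPage',
    mainEntity: faqs.map(({ question, answer }) => ({
      '@type': 'Question',
      name: question,
      acceptedAnswer: { '@type': 'Answer', text: answer }
    }))
  };
}

// Example usage with a placeholder answer.
const faqSchema = buildFaqSchema([
  {
    question: 'How much budget do small e-commerce businesses need?',
    answer: 'Placeholder: replace with the answer published on your page.'
  }
]);
const faqJsonLd = JSON.stringify(faqSchema, null, 2);
```

The resulting string is what would go inside a `script` tag of type `application/ld+json` on the page.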
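Customization Options
Use Gemini Pro for Deeper Analysis
Replace the API endpoint for more detailed results. Some users have reported that the Pro models don't work with this script, so switch back to Flash if requests fail: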
// Change from:
xhr.open('POST', `https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=${apiKey}`, false);
// To:
xhr.open('POST', `https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro:generateContent?key=${apiKey}`, false);
Adjust Temperature for Different Results
- Lower (0.1-0.3): More consistent, predictable queries
- Higher (0.5-0.7): More creative, diverse queries
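One way to make the temperature easy to change is to build the request body in a function. The sketch below mirrors the `requestData` object from the script; `buildRequestData` is a hypothetical wrapper, not something the script defines.

```javascript
// Hypothetical wrapper around the script's requestData object.
// The other generationConfig values are copied from the script unchanged.
function buildRequestData(prompt, temperature = 0.3) {
  if (typeof temperature !== 'number' || Number.isNaN(temperature) || temperature < 0) {
    throw new RangeError('temperature must be a non-negative number');
  }
  return {
    contents: [{ parts: [{ text: prompt }] }],
    generationConfig: {
      temperature,
      topK: 20,
      topP: 0.9,
      maxOutputTokens: 2048
    }
  };
}

// Example: a more exploratory run.
const creativeRequest = buildRequestData('Analyze this page...', 0.7);
```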
Modify Chunk Extraction
Add custom selectors for your site structure:
// Example: Extract FAQ sections
const faqs = document.querySelectorAll('.faq-item, [itemtype*="FAQPage"]');
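A possible way to turn that selector into chunks is sketched below. The `extractFaqChunks` name and the `'faq'` chunk type are assumptions; the chunking summary in the script does not count this type unless you add a line for it.

```javascript
// Hypothetical helper: collect FAQ blocks as chunks.
// `doc` is anything with querySelectorAll, normally the page's `document`.
function extractFaqChunks(doc, maxLength = 500) {
  const chunks = [];
  doc.querySelectorAll('.faq-item, [itemtype*="FAQPage"]').forEach(node => {
    // Collapse whitespace so the prompt is not padded with layout gaps
    const text = (node.textContent || '').replace(/\s+/g, ' ').trim();
    if (text) {
      chunks.push({ type: 'faq', content: text.substring(0, maxLength) });
    }
  });
  return chunks;
}
```

Inside `extractSemanticChunks()` the result could be appended with `chunks.push(...extractFaqChunks(document))` before the function returns.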
Troubleshooting
Common Issues
API Error 403
- Check your API key is correct
- Ensure API is enabled in Google Cloud Console
- Verify you haven't exceeded quotas
Empty Results
- Check if JavaScript rendering is enabled in Screaming Frog
- Verify the page has actual content
- Test API key in Google AI Studio first
Timeout Errors
- Reduce maxOutputTokens
- Use gemini-1.5-flash instead of pro
- Check your internet connection
API Limits
Free tier includes:
- 60 requests per minute
- 1,500 requests per day
- Input: 128k tokens
- Output: 8k tokens
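Since the script makes one API request per crawled page, these limits put a floor on how long a crawl must take. The arithmetic is sketched below using the figures listed above as defaults; actual quotas may differ by model and change over time, and `estimateCrawlBudget` is a hypothetical name.

```javascript
// Hypothetical helper: lower bounds on crawl duration at one request per page.
// Defaults are the free-tier figures quoted above; check your own quota.
function estimateCrawlBudget(pageCount, perMinute = 60, perDay = 1500) {
  return {
    minMinutes: Math.ceil(pageCount / perMinute),
    minDays: Math.ceil(pageCount / perDay)
  };
}

// Example: a 3,000-page site.
const budget = estimateCrawlBudget(3000);
```

For 3,000 pages this gives at least 50 minutes of requests and at least 2 days under the daily cap, so the daily limit is the binding one for larger sites.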
Performance Tips
- Start Small: Test on 10-20 key pages first
- Use Filters: Exclude non-content pages (tags, archives)
- Schedule Crawls: Stay within API limits
- Cache Results: Export and track changes over time
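To track changes over time, two exports can be compared once their scores are reduced to a URL-to-score map. The sketch below assumes that reduction has already happened; the map shape and the `diffCoverage` name are hypothetical.

```javascript
// Hypothetical helper: report URLs whose covered-query count changed
// between two exports. Each argument maps URL -> covered count.
function diffCoverage(previous, current) {
  return Object.keys(current)
    .filter(url => url in previous && previous[url] !== current[url])
    .map(url => ({
      url,
      before: previous[url],
      after: current[url],
      delta: current[url] - previous[url]
    }));
}

// Example: /a improved, /b is unchanged, /c is new and therefore skipped.
const changes = diffCoverage({ '/a': 3, '/b': 5 }, { '/a': 6, '/b': 5, '/c': 4 });
```

Because model output varies between runs, small deltas are better read as noise than as evidence that a content change worked.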
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Inspired by Google's query fan-out patents (US 2024/0289407 A1, WO2024064249A1)
- Built on insights from the SEO community's AI Mode research
- Special thanks to early testers and contributors
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Twitter/X: @yourhandle
Remember: The future of SEO is query networks, not keywords. Start analyzing your fan-out coverage today!
If this tool helped you, please star the repository and share your results!