Home
Softono
y

yifanfeng97

Professional software vendor delivering innovative solutions on the Softono platform. Specialized in both open-source and proprietary software development.

Total Products
1

Software by yifanfeng97

Hyper-Extract
Open Source

Hyper-Extract

<div align="center"> <a href="https://yifanfeng97.github.io/Hyper-Extract/latest/"> <picture> <source media="(prefers-color-scheme: dark)" srcset="docs/assets/logo/logo-horizontal-dark.svg"> <source media="(prefers-color-scheme: light)" srcset="docs/assets/logo/logo-horizontal.svg"> <img alt="Hyper-Extract Logo" src="docs/assets/logo/logo-horizontal.svg" width="600"> </picture> </a> <br/> <br/> **Smart Knowledge Extraction CLI** **Transform documents into structured knowledge with one command.** [๐Ÿ“– English Version](./README.md) ยท [ไธญๆ–‡็‰ˆ](./README_ZH.md) [![PyPI Version](https://img.shields.io/pypi/v/hyperextract)](https://pypi.org/project/hyperextract/) [![Python Version](https://img.shields.io/badge/python-3.11%2B-blue)](https://python.org) [![License](https://img.shields.io/badge/license-Apache%202.0-blue)](LICENSE) [![Status](https://img.shields.io/badge/status-active-success)]() [![Docs](https://img.shields.io/badge/docs-online-blue)](https://yifanfeng97.github.io/Hyper-Extract/latest/) <br/> > **"Stop reading. Start understanding."** > *"ๅ‘Šๅˆซๆ–‡ๆกฃ็„ฆ่™‘๏ผŒ่ฎฉไฟกๆฏไธ€็›ฎไบ†็„ถ"* <br/> <img src="docs/assets/hero.jpg" alt="Hero & Workflow" width="800" style="max-width: 100%;"> <br/> </div> Hyper-Extract is an intelligent, LLM-powered knowledge extraction and evolution framework. It radically simplifies transforming highly unstructured texts into persistent, predictable, and strongly-typed **Knowledge Abstracts**. It effortlessly extracts information into a wide spectrum of formatsโ€”ranging from simple **Collections** (Lists/Sets) and **Pydantic Models**, to complex **Knowledge Graphs**, **Hypergraphs**, and even **Spatio-Temporal Graphs**. ## โœจ Core Features | | | |:---|:---| | ๐Ÿ”ท **8 Knowledge Structures** | From simple Lists to advanced Graphs, Hypergraphs, and Spatio-Temporal Graphs | | ๐Ÿง  **10+ Extraction Engines** | GraphRAG, LightRAG, Hyper-RAG, KG-Gen, and more โ€” ready to use | | ๐Ÿ“ **80+ YAML Templates** | Zero-code extraction across Finance, Legal, Medical, TCM, Industry, and General domains | | ๐Ÿ”„ **Incremental Evolution** | Feed new documents anytime to expand and refine your knowledge base | ## ๐ŸŽฏ What Can You Do With It? <details> <summary><b>๐Ÿ“„ Researcher โ€” Turn papers into knowledge graphs</b></summary> <br> Feed a 20-page academic paper, get an interactive graph of key concepts, authors, and citations. ```bash he parse paper.pdf -t general/academic_graph -o ./paper_kb/ he show ./paper_kb/ ``` </details> <details> <summary><b>๐Ÿฆ Financial Analyst โ€” Extract entities from earnings reports</b></summary> <br> Automatically identify companies, executives, financial metrics, and their relationships from unstructured reports. ```bash he parse earnings.md -t finance/earnings_graph -o ./finance_kb/ he search ./finance_kb/ "What are the key risk factors?" ``` </details> <details> <summary><b>๐Ÿ”’ Local Deployment โ€” Keep data on-premise with vLLM</b></summary> <br> Run Qwen3.5-9B + bge-m3 locally via vLLM. No data leaves your machine. ```python from hyperextract import create_client llm, emb = create_client( llm="vllm:Qwen3.5-9B@http://localhost:8000/v1", embedder="vllm:bge-m3@http://localhost:8001/v1", api_key="dummy", ) ``` </details> ## ๐Ÿš€ Supported Platforms & Models Hyper-Extract relies on the LLM's structured output capability (`json_schema` or Function Calling). | Platform | Verified Models | |----------|-----------------| | **OpenAI** | gpt-4o, gpt-4o-mini, gpt-5 | | **้˜ฟ้‡Œไบ‘็™พ็‚ผ** | qwen-plus, qwen-turbo, deepseek-r1 | | **Local vLLM** | Qwen3.5-9B (GPTQ-Marlin) | **Embedding models** (semantic search) work with any OpenAI-compatible endpoint: `text-embedding-3-small`, `text-embedding-v4` (Bailian), `bge-m3` (local vLLM). > ๐Ÿ“– Full guide: [Provider System & Local Model Support](https://yifanfeng97.github.io/Hyper-Extract/latest/concepts/provider-system/) ## โšก 30-Second Quick Start ```bash # Install uv tool install hyperextract # Configure API key he config init -k YOUR_OPENAI_API_KEY # Extract knowledge from a document he parse examples/en/tesla.md -t general/biography_graph -o ./output/ -l en # Query it he search ./output/ "What are Tesla's major achievements?" # Visualize he show ./output/ ``` <details> <summary><b>๐Ÿ Python API</b> (click to expand)</summary> <br> ```bash uv pip install hyperextract ``` ```python from hyperextract import Template ka = Template.create("general/biography_graph") with open("examples/en/tesla.md") as f: result = ka.parse(f.read()) result.show() ``` > ๐Ÿ”— More examples: [examples/en](./examples/en/) </details> ## ๐Ÿ“ˆ Why Hyper-Extract? | Feature | GraphRAG | LightRAG | KG-Gen | ATOM | **Hyper-Extract** | | :------ | :------: | :------: | :----: | :--: | :---------------: | | Knowledge Graph | โœ… | โœ… | โœ… | โœ… | โœ… | | Temporal Graph | โœ… | โŒ | โŒ | โœ… | โœ… | | Spatial Graph | โŒ | โŒ | โŒ | โŒ | โœ… | | Hypergraph | โŒ | โŒ | โŒ | โŒ | โœ… | | Domain Templates | โŒ | โŒ | โŒ | โŒ | โœ… | | Interactive CLI | โœ… | โŒ | โŒ | โŒ | โœ… | | Multi-language | โœ… | โŒ | โŒ | โŒ | โœ… | ## ๐Ÿงฉ Supported Knowledge Structures From simple to complex โ€” pick the right structure for your data: <img src="docs/assets/autotypes.jpg" alt="Knowledge Structures Matrix" width="750" style="max-width: 100%;"> **Example โ€” AutoGraph visualization:** <img src="docs/assets/en_show.jpg" alt="AutoGraph Visualization" width="750" style="max-width: 100%; border-radius: 8px; box-shadow: 0 4px 12px rgba(0,0,0,0.1);"> <details> <summary><b>๐Ÿ“‹ What's under the hood? (Architecture & Templates)</b></summary> <br> Hyper-Extract follows a **three-layer architecture**: - **Auto-Types** โ€” 8 strongly-typed data structures (Model, List, Set, Graph, Hypergraph, Temporal Graph, Spatial Graph, Spatio-Temporal Graph) - **Methods** โ€” Extraction algorithms: KG-Gen, GraphRAG, LightRAG, Hyper-RAG, Cog-RAG, and more - **Templates** โ€” 80+ presets across 6 domains. Zero-code setup. <img src="docs/assets/arch.jpg" alt="Architecture" width="750" style="max-width: 100%;"> **Template example (Graph type):** ```yaml language: en name: Knowledge Graph type: graph tags: [general] description: 'Extract entities and their relationships.' output: entities: fields: - name: name type: str - name: type type: str - name: description type: str relations: fields: - name: source type: str - name: target type: str - name: type type: str identifiers: entity_id: name relation_id: '{source}|{type}|{target}' ``` - [Browse all 80+ templates](./hyperextract/templates/presets/) - [Create custom templates](./hyperextract/templates/DESIGN_GUIDE.md) </details> ## ๐Ÿ“š Documentation & Resources | Resource | Link | | :------- | :--- | | Full Documentation | [yifanfeng97.github.io/Hyper-Extract](https://yifanfeng97.github.io/Hyper-Extract/latest/) | | CLI Guide | [Command-line interface](https://yifanfeng97.github.io/Hyper-Extract/latest/cli/) | | Provider System | [Model compatibility & local deployment](https://yifanfeng97.github.io/Hyper-Extract/latest/concepts/provider-system/) | | Template Gallery | [80+ presets](./hyperextract/templates/presets/) | | Examples | [Working code](./examples/) | ## ๐Ÿค Contributing & License Contributions are welcome! Please submit [Issues](https://github.com/yifanfeng97/hyper-extract/issues) and [PRs](https://github.com/yifanfeng97/hyper-extract/pulls). Licensed under **Apache-2.0**. ## โญ Star History [![Star History Chart](https://api.star-history.com/svg?repos=yifanfeng97/hyper-Extract&type=Date)](https://star-history.com/#yifanfeng97/hyper-Extract&Date)

AI Tools Knowledge Bases & RAG
974 Github Stars