A comprehensive collection of examples demonstrating structured data extraction and JSON output generation using various language models and frameworks. This repository showcases different approaches to ensure LLMs return well-formatted, schema-compliant responses.
- Multiple Framework Support: Examples for LangChain, vLLM, Outlines, Ollama, and more
- Pydantic Integration: Type-safe structured outputs with validation
- Batch Processing: Efficient handling of multiple prompts
- Vision Model Support: Structured outputs from multimodal models
- Flexible Backends: Support for local models, API services, and GGUF formats
File | Description | Framework |
---|---|---|
Groq_Langchain.py |
Groq API integration with LangChain | LangChain + Groq |
Gemini_langchain.py |
Google Gemini API with guided decoding | LangChain + Gemini |
File | Description | Framework |
---|---|---|
vLLM.py |
Local vLLM server with JSON schema validation | vLLM |
vLLM_openai_client.py |
vLLM server via OpenAI-compatible client | vLLM + OpenAI Client |
File | Description | Use Case |
---|---|---|
ollama.py |
Direct Ollama chat API usage | Simple structured outputs |
OllamaLLM.py |
Single prompt processing | Individual requests |
OllamaLLM_MultiModel.py |
MultiModel with structured outputs | Individual requests with Image |
OllamaLLM_Batch_Processing.py |
Batch processing with Pydantic validation | High-throughput scenarios |
chatOllama.py |
Chat-based interface | Conversational structured outputs |
File | Description | Model Type |
---|---|---|
Outlines_for_transformers.py |
Transformer models with JSON generation | HuggingFace Transformers |
Outlines_for_GGUF.py |
GGUF models via llama_cpp backend | Quantized models |
Outlines_for_transformers_vision.py |
Vision-language models | Multimodal inputs |
Outlines_for_transformers_vision_batch.py |
Batch vision processing | High-volume multimodal |
from pydantic import BaseModel
from typing import Optional
class PersonInfo(BaseModel):
name: str
age: Optional[int] = None
# Use any of the provided scripts with this schema
- Vision Models:
Outlines_for_transformers_vision.py
requires PyTorch 2.4 specifically - GGUF Models: Ensure llama_cpp is properly installed for GGUF examples
- API Keys: Set appropriate environment variables for Groq and Gemini examples
- Data Extraction: Extract structured information from unstructured text
- API Responses: Ensure consistent JSON responses from LLMs
- Batch Processing: Process large datasets with structured outputs
- Multimodal Analysis: Extract structured data from images and text
- Validation: Type-safe outputs with automatic validation
Contributions are welcome! Feel free to:
- Add examples for new frameworks
- Improve existing implementations
- Add error handling and edge cases
- Enhance documentation
This project is open source. Please check individual dependencies for their licensing terms.