EPUB File Loader

Enhanced: Direct integration with Langfuse tracing

EPUB (Electronic Publication) is a free and open e-book standard originally developed by the International Digital Publishing Forum (IDPF) and now maintained by the W3C. This module loads EPUB files and turns their content into documents for use within your workflow.

The loader can:

  • Load single or multiple EPUB files
  • Support both base64-encoded files and files from storage
  • Extract content per chapter or per file
  • Process content with text splitters
  • Handle metadata extraction
  • Manage temporary file processing
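
The base64 and temporary-file handling listed above can be pictured roughly as follows. This is a minimal sketch and not the module's actual implementation: the helper name `withTempEpub`, the temporary file name, and the synchronous callback are all assumptions made for illustration. It uses only Node.js built-ins.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical helper (not part of the module): decode a base64 payload,
// write it to a temporary .epub file, hand the path to `handler`, and
// always remove the temporary directory afterwards.
// The handler is synchronous here; an async handler would need an
// async/await variant so cleanup waits for it to finish.
function withTempEpub<T>(base64Input: string, handler: (filePath: string) => T): T {
  // Accept either raw base64 or a "data:<mime>;base64,<payload>" data URI.
  const commaIndex = base64Input.indexOf(",");
  const payload =
    base64Input.startsWith("data:") && commaIndex !== -1
      ? base64Input.slice(commaIndex + 1)
      : base64Input;

  const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), "epub-loader-"));
  const tempFile = path.join(tempDir, "upload.epub");
  try {
    fs.writeFileSync(tempFile, Buffer.from(payload, "base64"));
    return handler(tempFile);
  } finally {
    // Runs on success and on error, so no temporary files are left behind.
    fs.rmSync(tempDir, { recursive: true, force: true });
  }
}
```

Putting the cleanup in `finally` is the design point: the temporary file disappears even when parsing throws.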

Inputs

Required Parameters

  • EPUB File: The EPUB file(s) to process (.epub extension required)
  • Usage: Choose between:
    • One document per chapter: Split content by chapters
    • One document per file: Process entire file as one document

Optional Parameters

  • Text Splitter: A text splitter to process the extracted content
  • Additional Metadata: JSON object with additional metadata
  • Omit Metadata Keys: Comma-separated list of metadata keys to omit
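
To illustrate how the two metadata options might interact, here is a small sketch. The function name `applyMetadataOptions`, the `LoaderDocument` type, and the choice to let additional metadata override existing keys before omission are assumptions for the example, not documented behavior of the module.

```typescript
// Assumed document shape, based on the metadata and pageContent fields
// named in the Outputs section.
interface LoaderDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: merge a JSON object into each document's metadata,
// then drop every key named in a comma-separated list.
function applyMetadataOptions(
  docs: LoaderDocument[],
  additionalMetadataJson: string,
  omitMetadataKeys: string
): LoaderDocument[] {
  const extra: Record<string, unknown> =
    additionalMetadataJson.trim().length > 0 ? JSON.parse(additionalMetadataJson) : {};

  // "source, title" becomes the set {"source", "title"}; empty entries are ignored.
  const omit = new Set(
    omitMetadataKeys
      .split(",")
      .map((key) => key.trim())
      .filter((key) => key.length > 0)
  );

  return docs.map((doc) => {
    // Assumption: additional metadata wins over keys extracted from the file.
    const merged: Record<string, unknown> = { ...doc.metadata, ...extra };
    const kept: Record<string, unknown> = {};
    for (const key of Object.keys(merged)) {
      if (!omit.has(key)) {
        kept[key] = merged[key];
      }
    }
    return { pageContent: doc.pageContent, metadata: kept };
  });
}
```

For example, passing `'{"project": "demo"}'` as additional metadata and `"source"` as the omit list would add a `project` key to every document and remove `source`.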

Outputs

  • Document: Array of document objects, each with a metadata object and a pageContent string
  • Text: A single string formed by concatenating the pageContent of all documents
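
The relationship between the two outputs can be sketched as below. The `LoaderDocument` type and the `documentsToText` helper are hypothetical, and the newline separator is an assumption, since the separator the module actually uses is not stated here.

```typescript
// Assumed shape of one entry in the Document output.
interface LoaderDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: derive the Text output from the Document output by
// joining every pageContent in order. Metadata is not part of the text.
function documentsToText(docs: LoaderDocument[], separator: string = "\n"): string {
  return docs.map((doc) => doc.pageContent).join(separator);
}
```

The Text output is convenient when a downstream step only needs raw text, while the Document output keeps metadata attached to each piece.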

Features

  • Multiple file processing
  • Chapter-level splitting
  • File-level processing
  • Storage integration
  • Metadata customization
  • Text splitting support
  • Temporary file handling
  • Error handling

Processing Modes

Per Chapter Mode

  • Creates separate documents for each chapter
  • Maintains chapter structure
  • Preserves chapter metadata
  • Better for detailed analysis

Per File Mode

  • Processes entire file as one document
  • Maintains overall structure
  • Simpler document organization
  • Better for overview analysis
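
The difference between the two modes can be shown with a short sketch. It assumes the chapters have already been extracted into a simple array; the `Chapter` type, the `buildDocuments` function, the mode names, and the metadata keys are illustrative assumptions rather than the module's real internals.

```typescript
// Assumed intermediate representation of an already-parsed EPUB chapter.
interface Chapter {
  title: string;
  text: string;
}

interface LoaderDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

type Usage = "perChapter" | "perFile";

// Hypothetical helper: turn extracted chapters into documents for either mode.
function buildDocuments(source: string, chapters: Chapter[], usage: Usage): LoaderDocument[] {
  if (usage === "perChapter") {
    // One document per chapter, each carrying its own chapter metadata.
    return chapters.map((chapter, index) => ({
      pageContent: chapter.text,
      metadata: { source, chapter: index + 1, title: chapter.title },
    }));
  }
  // One document for the whole file; chapter texts are joined in reading order.
  return [
    {
      pageContent: chapters.map((chapter) => chapter.text).join("\n\n"),
      metadata: { source },
    },
  ];
}
```

With three chapters, `"perChapter"` yields three documents and `"perFile"` yields one, which is why the first suits detailed analysis and the second suits an overview.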

Notes

  • Supports both local and storage-based files
  • Handles base64 encoded content
  • Automatically cleans up temporary files
  • Preserves document structure
  • Supports custom metadata addition
  • Reports errors for invalid files
  • Memory-efficient processing
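
As one example of what rejecting an invalid file could involve, the sketch below runs a cheap pre-check before any parsing. It relies on two facts: the loader requires the .epub extension, and an EPUB is packaged as a ZIP archive, whose first local file header starts with the bytes `50 4B 03 04`. The function name `looksLikeEpub` is hypothetical, and the module's real validation may differ or be stricter.

```typescript
// Hypothetical pre-check: returns true only when the file name ends in .epub
// and the content starts with the ZIP local file header signature "PK\x03\x04".
// Passing this check does not guarantee the archive is a well-formed EPUB.
function looksLikeEpub(fileName: string, bytes: Uint8Array): boolean {
  const hasExtension = fileName.toLowerCase().endsWith(".epub");
  const hasZipSignature =
    bytes.length >= 4 &&
    bytes[0] === 0x50 && // "P"
    bytes[1] === 0x4b && // "K"
    bytes[2] === 0x03 &&
    bytes[3] === 0x04;
  return hasExtension && hasZipSignature;
}
```

A check like this lets a loader fail fast with a clear message instead of surfacing a low-level parsing error.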