Json Lines File

Json Lines File Node

JSON Lines (JSONL) is a text format in which each line is a valid JSON value. This module loads and processes JSONL files, with support for pointer-based content extraction and dynamic metadata handling.

The loader can:

  • Load single or multiple JSONL files
  • Extract specific values using JSON pointers
  • Handle dynamic metadata extraction
  • Process content with text splitters
  • Support base64 encoded files
  • Handle file storage integration
  • Customize metadata extraction

Inputs

Required Parameters

  • JSONL File: The JSONL file(s) to process (.jsonl extension)
  • Pointer Extraction: JSON pointer to extract content (e.g., “key” for {"key": "value"})

Optional Parameters

  • Text Splitter: A text splitter to process the extracted content
  • Additional Metadata: JSON object with additional metadata
  • Omit Metadata Keys: Comma-separated list of metadata keys to omit
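One way the two metadata options could interact is sketched below. The function name `apply_metadata_options` and the order of operations (merge first, then omit) are assumptions for illustration, not the node's confirmed behavior.

```python
import json


def apply_metadata_options(metadata, additional_metadata="", omit_keys=""):
    """Merge extra metadata into a document's metadata, then drop omitted keys.

    Hypothetical helper: the merge-then-omit order is an assumption.
    """
    merged = dict(metadata)
    if additional_metadata:
        # "Additional Metadata" is given as a JSON object
        merged.update(json.loads(additional_metadata))
    # "Omit Metadata Keys" is a comma-separated list; ignore stray whitespace
    omitted = {key.strip() for key in omit_keys.split(",") if key.strip()}
    return {key: value for key, value in merged.items() if key not in omitted}


base = {"source": "data.jsonl", "line": 1, "pointer": "/key"}
result = apply_metadata_options(base, '{"project": "demo"}', "line, pointer")
print(result)
```

Here `result` keeps `source`, gains `project`, and loses `line` and `pointer`.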

Outputs

  • Document: Array of document objects containing metadata and pageContent
  • Text: Concatenated string from pageContent of documents
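The relationship between the two outputs can be illustrated with a short sketch. The newline separator and the helper name `documents_to_text` are assumptions, since the source does not say how the page contents are joined.

```python
def documents_to_text(documents, separator="\n"):
    """Join the pageContent of each document into one string.

    The separator is an assumption; the node may join differently.
    """
    return separator.join(doc["pageContent"] for doc in documents)


# Shape of the Document output: objects with pageContent and metadata
documents = [
    {"pageContent": "value1", "metadata": {"line": 1}},
    {"pageContent": "value2", "metadata": {"line": 2}},
]

text = documents_to_text(documents)
print(text)
```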

Features

  • JSON pointer extraction
  • Dynamic metadata handling
  • Text splitting support
  • Base64 file support
  • File storage integration
  • Error handling
  • Memory-efficient processing

JSON Pointer Extraction

Basic Example

For JSONL content:

{"key": "value1", "source": "file1.txt"}
{"key": "value2", "source": "file2.txt"}

With the pointer “key”, the loader extracts “value1” and “value2”, one value per line.
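The basic example can be reproduced with a minimal sketch. The function `extract_by_pointer` is hypothetical and handles only a single top-level key; it treats the leading slash as optional, which is an assumption.

```python
import json


def extract_by_pointer(jsonl_text, pointer):
    """Return the value at a top-level key for every JSONL line that has it."""
    # Accept both "key" and "/key" (assumption: the slash is optional)
    key = pointer[1:] if pointer.startswith("/") else pointer
    values = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)
        if isinstance(record, dict) and key in record:
            values.append(record[key])
    return values


sample = (
    '{"key": "value1", "source": "file1.txt"}\n'
    '{"key": "value2", "source": "file2.txt"}\n'
)
print(extract_by_pointer(sample, "key"))  # ['value1', 'value2']
```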

Dynamic Metadata

You can extract values from each line as metadata by mapping metadata keys to JSON pointers:

{
    "source": "/source",
    "custom": "/metadata/field"
}
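A mapping like this could be resolved per line as sketched here. `resolve_pointer` follows the RFC 6901 pointer syntax, while `build_metadata`, the sample record, and its `metadata.field` value are hypothetical and exist only for illustration.

```python
import json


def resolve_pointer(doc, pointer):
    """Resolve a JSON pointer such as "/metadata/field"; return None if missing."""
    if pointer == "":
        return doc  # the empty pointer refers to the whole document
    current = doc
    path = pointer[1:] if pointer.startswith("/") else pointer
    for token in path.split("/"):
        # Unescape per RFC 6901: "~1" is "/", "~0" is "~"
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, dict) and token in current:
            current = current[token]
        elif isinstance(current, list) and token.isdigit() and int(token) < len(current):
            current = current[int(token)]
        else:
            return None
    return current


def build_metadata(record, mapping):
    """Apply a {metadata key: JSON pointer} mapping to one parsed line."""
    return {name: resolve_pointer(record, ptr) for name, ptr in mapping.items()}


# Hypothetical line containing a nested "metadata" object
record = json.loads('{"key": "value1", "source": "file1.txt", "metadata": {"field": "abc"}}')
mapping = {"source": "/source", "custom": "/metadata/field"}
metadata = build_metadata(record, mapping)
print(metadata)
```

In this sketch a pointer that does not resolve yields `None` rather than raising, which is one possible policy among several.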

Document Structure

Each document contains:

  • pageContent: Content extracted using the pointer
  • metadata:
    • source: Original file path
    • line: Line number in the file
    • pointer: The JSON pointer used
    • Additional dynamic metadata
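The structure above might be modelled as follows. The `Document` class and `make_document` helper are illustrative only, and the 1-based line number and `/key` pointer form are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Document:
    """Illustrative stand-in for a loaded document."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def make_document(content: str, source: str, line: int, pointer: str,
                  extra: Optional[Dict[str, Any]] = None) -> Document:
    """Build a document with the standard metadata plus any dynamic fields."""
    metadata: Dict[str, Any] = {"source": source, "line": line, "pointer": pointer}
    metadata.update(extra or {})
    return Document(page_content=content, metadata=metadata)


doc = make_document("value1", "data.jsonl", 1, "/key", {"custom": "abc"})
print(doc)
```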

File Handling

Local Files

  • Direct file loading
  • Base64 encoded content
  • Multiple file support
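Handling base64 encoded content might look like the sketch below. It assumes the input is a plain base64 string of UTF-8 JSONL text; any wrapper format the node actually expects (such as a data URI prefix) is not covered, and `decode_base64_jsonl` is a hypothetical name.

```python
import base64
import json


def decode_base64_jsonl(encoded):
    """Decode base64 text and parse each non-empty line as JSON."""
    text = base64.b64decode(encoded).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Encode a small JSONL payload, then decode it again
encoded = base64.b64encode(b'{"key": "value1"}\n{"key": "value2"}\n').decode("ascii")
records = decode_base64_jsonl(encoded)
print(records)
```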

Storage Integration

  • File storage system support
  • Organization-based storage
  • Chatflow-based storage

Notes

  • One document per JSONL line
  • Invalid JSON lines are skipped
  • Memory-efficient processing
  • Error handling for invalid pointers
  • Support for nested JSON structures
  • Dynamic metadata extraction
  • Flexible output formats
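Several of these notes (one document per line, skipping invalid lines, memory-efficient processing) can be combined in a single generator. This is a sketch under stated assumptions, not the node's implementation: `iter_documents` is a hypothetical name, it handles only a top-level key, and it silently skips lines where the key is absent.

```python
import io
import json


def iter_documents(stream, key, source="<memory>"):
    """Yield one document per valid JSONL line, reading the stream lazily."""
    for line_number, raw in enumerate(stream, start=1):
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON lines are skipped
        if not isinstance(record, dict) or key not in record:
            continue  # assumption: lines without the key yield no document
        yield {
            "pageContent": str(record[key]),
            "metadata": {"source": source, "line": line_number},
        }


sample = io.StringIO('{"key": "value1"}\nnot json\n{"key": "value2"}\n')
docs = list(iter_documents(sample, "key"))
print(docs)
```

Because the function is a generator, only one line is held in memory at a time; the same code works with an open file handle in place of `io.StringIO`.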