ur_ingest_resource_path_match_rule (table) Content

Rule: ignore .git and node_modules paths
Namespace: default
Regex: /(\.git|node_modules)/
Flags: IGNORE_RESOURCE
Description: Ignore any entry with `/.git/` or `/node_modules/` in the path.
Created: 2025-10-24 12:25:28 by SYSTEM

Rule: images-metadata
Namespace: default
Regex: \.(?P<nature>jpg|jpeg|png|gif|bmp|tiff|svg|webp)$
Flags: CONTENT_ACQUIRABLE | CONTENT_ACQUIRABLE_METADATA
Nature: ?P<nature>
Description: Images with metadata extraction.
Created: 2025-10-24 12:25:28 by SYSTEM

Rule: jsonl-content-replace
Namespace: default
Regex: \.(?P<nature>jsonl)$
Flags: CONTENT_ACQUIRABLE | CONTENT_REPLACE_JSON_LINES
Nature: ?P<nature>
Description: JSONL files with content replacement.
Created: 2025-10-24 12:25:28 by SYSTEM

Rule: media-metadata
Namespace: default
Regex: \.(?P<nature>mp4|mp3)$
Flags: CONTENT_ACQUIRABLE | CONTENT_ACQUIRABLE_METADATA
Nature: ?P<nature>
Description: Media files with metadata extraction.
Created: 2025-10-24 12:25:28 by SYSTEM

Rule: pdf-docx-transform-metadata
Namespace: default
Regex: \.(?P<nature>pdf|docx)$
Flags: CONTENT_ACQUIRABLE | CONTENT_ACQUIRABLE_TRANSFORM_MARKITDOWN | CONTENT_ACQUIRABLE_METADATA
Nature: ?P<nature>
Description: PDF and DOCX documents with full transformation and metadata extraction.
Created: 2025-10-24 12:25:28 by SYSTEM

Rule: surveilr-SQL capturable executable
Namespace: default
Regex: surveilr-SQL
Flags: CAPTURABLE_EXECUTABLE | CAPTURABLE_SQL
Description: Any entry with surveilr-SQL in the path is treated as a capturable SQL executable, and the SQL it emits is allowed to execute.
Created: 2025-10-24 12:25:28 by SYSTEM

Rule: surveilr-[NATURE] style capturable executable
Namespace: default
Regex: surveilr\[(?P<nature>[^\]]*)\]
Flags: CAPTURABLE_EXECUTABLE
Nature: ?P<nature>
Description: Any entry with surveilr-[XYZ] in the path is treated as a capturable executable, with XYZ extracted as the nature.
Created: 2025-10-24 12:25:28 by SYSTEM

Rule: typical ingestion extensions
Namespace: default
Regex: \.(?P<nature>md|mdx|html|json|jsonc|puml|txt|toml|yml|xml|tap|csv|tsv|ssv|psv|tm7)$
Flags: CONTENT_ACQUIRABLE
Nature: ?P<nature>
Description: Ingest the content for text and structured data extensions. Assume the nature is the same as the extension.
Created: 2025-10-24 12:25:28 by SYSTEM
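
The rules above pair a regular expression with a set of flags and, where the pattern has a `nature` capture group, derive the nature from the matched text. The following is a minimal Python sketch of how such rules could be applied to a path. It is illustrative only and not surveilr's implementation: the `classify` helper, the first-match-wins evaluation order, the sample paths, and the backslash escapes in the patterns (`\.`, `\[`, `\]`) are all assumptions.

```python
import re

# A subset of the rules listed above as (name, regex, flags).
# ASSUMPTIONS: the backslash escapes and the list order are illustrative;
# how surveilr orders or combines matching rules is not stated in the table.
RULES = [
    ("ignore .git and node_modules paths",
     r"/(\.git|node_modules)/",
     "IGNORE_RESOURCE"),
    ("typical ingestion extensions",
     r"\.(?P<nature>md|mdx|html|json|jsonc|puml|txt|toml|yml|xml|tap|csv|tsv|ssv|psv|tm7)$",
     "CONTENT_ACQUIRABLE"),
    ("images-metadata",
     r"\.(?P<nature>jpg|jpeg|png|gif|bmp|tiff|svg|webp)$",
     "CONTENT_ACQUIRABLE | CONTENT_ACQUIRABLE_METADATA"),
    ("surveilr-[NATURE] style capturable executable",
     r"surveilr\[(?P<nature>[^\]]*)\]",
     "CAPTURABLE_EXECUTABLE"),
    ("surveilr-SQL capturable executable",
     r"surveilr-SQL",
     "CAPTURABLE_EXECUTABLE | CAPTURABLE_SQL"),
]


def classify(path):
    """Return (rule name, flag set, nature) for the first matching rule, else None.

    Hypothetical helper: first match wins, which is an assumption.
    """
    for name, pattern, flags in RULES:
        match = re.search(pattern, path)
        if match:
            # Flags are stored as a " | "-separated string; split into a set.
            flag_set = {flag.strip() for flag in flags.split("|")}
            # Rules without a `nature` group yield None here.
            nature = match.groupdict().get("nature")
            return name, flag_set, nature
    return None


if __name__ == "__main__":
    for sample in [
        "app/node_modules/pkg/index.json",
        "docs/readme.md",
        "scripts/report.surveilr[json].sh",
        "scripts/stats.surveilr-SQL.sh",
        "build/output.bin",
    ]:
        print(sample, "->", classify(sample))
```

Because the ignore rule is checked first in this sketch, a JSON file under `node_modules` is reported as `IGNORE_RESOURCE` rather than `CONTENT_ACQUIRABLE`; a different evaluation order would change that result.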