ur_ingest_resource_path_match_rule (table) Content
ignore .git and node_modules paths | default | /(\.git|node_modules)/ | IGNORE_RESOURCE | Ignore any entry with `/.git/` or `/node_modules/` in the path. | 2025-09-18 12:37:02 | 2025-09-18 12:40:35
typical ingestion extensions | default | \.(?P<nature>md|mdx|html|json|jsonc|puml|txt|toml|yml|xml|tap|csv|tsv|ssv|psv|tm7)$ | CONTENT_ACQUIRABLE | ?P<nature> | Ingest the content for text and structured data extensions (md, mdx, html, json, jsonc, puml, txt, toml, yml, xml, tap, csv, tsv, ssv, psv, tm7). Assume the nature is the same as the extension. | 2025-09-18 12:37:02 | 2025-09-18 12:40:35
surveilr-[NATURE] style capturable executable | default | surveilr\[(?P<nature>[^\]]*)\] | CAPTURABLE_EXECUTABLE | ?P<nature> | Any entry with `surveilr-[XYZ]` in the path will be treated as a capturable executable extracting `XYZ` as the nature | 2025-09-18 12:37:02 | 2025-09-18 12:40:35
surveilr-SQL capturable executable | default | surveilr-SQL | CAPTURABLE_EXECUTABLE | CAPTURABLE_SQL | Any entry with surveilr-SQL in the path will be treated as a capturable SQL executable and allow execution of the SQL | 2025-09-18 12:37:02 | 2025-09-18 12:40:35
pdf-docx-transform-metadata | default | \.(?P<nature>pdf|docx)$ | CONTENT_ACQUIRABLE | CONTENT_ACQUIRABLE_TRANSFORM_MARKITDOWN | CONTENT_ACQUIRABLE_METADATA | ?P<nature> | 1 | PDF and DOCX documents with full transformation and metadata extraction | 2025-09-18 12:37:02 | 2025-09-18 12:40:35
images-metadata | default | \.(?P<nature>jpg|jpeg|png|gif|bmp|tiff|svg|webp)$ | CONTENT_ACQUIRABLE | CONTENT_ACQUIRABLE_METADATA | ?P<nature> | 2 | Images with metadata extraction | 2025-09-18 12:37:02 | 2025-09-18 12:40:35
media-metadata | default | \.(?P<nature>mp4|mp3)$ | CONTENT_ACQUIRABLE | CONTENT_ACQUIRABLE_METADATA | ?P<nature> | 3 | Media files with metadata extraction | 2025-09-18 12:37:02 | 2025-09-18 12:40:35
jsonl-content-replace | default | \.(?P<nature>jsonl)$ | CONTENT_ACQUIRABLE | CONTENT_REPLACE_JSON_LINES | ?P<nature> | 4 | JSONL files for line-by-line JSON ingestion | 2025-09-18 12:37:02 | 2025-09-18 12:40:35 |
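
The rows above pair a regular expression with a set of flags and, in most cases, a `?P<nature>` marker that points at the named capture group. The following is a minimal sketch of how such rules could be evaluated against a file path. It is not surveilr's implementation: the `Rule` type, the `classify` function, the sample paths, and the first-match-wins ordering are all assumptions made for illustration, and only four of the rows are copied in.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    """Hypothetical in-memory form of one table row (not a surveilr type)."""
    name: str
    regex: str
    flags: str
    nature: Optional[str]  # "?P<nature>" means: take the named capture group


# Regexes, flags, and nature markers copied verbatim from four table rows.
RULES = [
    Rule("ignore .git and node_modules paths",
         r"/(\.git|node_modules)/", "IGNORE_RESOURCE", None),
    Rule("typical ingestion extensions",
         r"\.(?P<nature>md|mdx|html|json|jsonc|puml|txt|toml|yml|xml|tap|csv|tsv|ssv|psv|tm7)$",
         "CONTENT_ACQUIRABLE", "?P<nature>"),
    Rule("surveilr-[NATURE] style capturable executable",
         r"surveilr\[(?P<nature>[^\]]*)\]", "CAPTURABLE_EXECUTABLE", "?P<nature>"),
    Rule("jsonl-content-replace",
         r"\.(?P<nature>jsonl)$",
         "CONTENT_ACQUIRABLE | CONTENT_REPLACE_JSON_LINES", "?P<nature>"),
]


def classify(path: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (rule name, flags, nature) for the first rule whose regex matches.

    Assumption: rules are tried in list order and the first match wins.
    The table does not say how surveilr itself orders or combines rules.
    """
    for rule in RULES:
        match = re.search(rule.regex, path)
        if match is None:
            continue
        nature = rule.nature
        if nature == "?P<nature>":
            # Resolve the marker to the text captured by the named group.
            nature = match.group("nature")
        return rule.name, rule.flags, nature
    return None


if __name__ == "__main__":
    # Hypothetical sample paths, chosen to exercise each rule.
    for sample in ("docs/readme.md", "repo/.git/config",
                   "bin/check.surveilr[json].sh", "data/events.jsonl"):
        print(sample, "->", classify(sample))
```

Under these assumptions, the pattern in the third rule, as written, matches `surveilr[json]` with no hyphen before the bracket; a path containing `surveilr-[json]` does not match it.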