Linux File Search Query Formatter Model

Model Overview

This model is a Query Formatter trained on the Linux File Search Dataset.
It maps natural language file search queries into a structured JSON-like representation of file attributes based on a fixed schema.

Key Features:

Converts NL queries → structured tag–value pairs
Supports all schema attributes from the Linux File Search NLI dataset:
- File attributes (file_type, extension, size_kb, owner, group, permissions)
- Temporal attributes (created_year, modified_year)
- Semantic attributes (language, purpose, contains_text, is_executable, hidden)
- Path scope and generic tags (path_scope, important, autogenerated, obsolete, archived)
Outputs deterministic JSON suitable for safe post-processing into find commands or other Linux search engines

Intended Use

Recommended:

Formatting natural language queries into structured representations
Query-to-Structure pipelines for semantic file search
Integration with safe Linux CLI search tools (find, grep, fd)
Training downstream Q2I or NLI models

Not Recommended:

Direct command execution without validation
General-purpose conversation
Use outside Linux file systems without adaptation

Model Architecture

Type: Decoder-only (seq2seq transformer)
Input: Natural language query
Output: JSON-like structured representation (tag:value pairs)
Precision: bf16
Training Dataset: Linux File Search NLI (~3500 synthetic examples)
Training Objective: Map NL queries → structured schema attributes

Limitations

English-only queries
Linux-centric file system abstraction
Temporal reasoning limited to years
Logical operators may require post-processing
Does not execute commands

Safety Considerations

Outputs are structured representations, not shell commands
Any conversion to executable commands should be validated and sandboxed
Prevent execution of arbitrary system commands from model output

Downloads last month: 2

Safetensors

Model size

0.3B params

Tensor type

F32

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for software-si/gemma3-270m-query-formatter-linux-file-search-fp32

Base model

google/gemma-3-270m

Finetuned

google/gemma-3-270m-it

Finetuned

(1104)

this model

software-si
/

gemma3-270m-query-formatter-linux-file-search-fp32