Corrections made: - docs/architecture.md: Fix crate count from 9 to 8 (actual count) - docs/tools.md: Fix code_search supported languages (kotlin -> haskell, scheme, racket) - docs/CODE_SEARCH.md: Add missing Haskell and Scheme to supported languages list - docs/providers.md: Add complete Gemini provider documentation section - docs/configuration.md: Add Gemini configuration section The Gemini provider (crates/g3-providers/src/gemini.rs) was fully implemented but not documented. The code_search tool actually supports haskell and scheme (via tree-sitter) but documentation incorrectly listed kotlin. Agent: lamport
492 lines
10 KiB
Markdown
492 lines
10 KiB
Markdown
# g3 Code Search Guide
|
|
|
|
**Last updated**: January 2025
|
|
**Source of truth**: `crates/g3-core/src/code_search/`, `crates/g3-core/src/tool_definitions.rs`
|
|
|
|
## Purpose
|
|
|
|
g3 includes a syntax-aware code search tool powered by tree-sitter. Unlike text-based search (grep), it understands code structure and finds actual functions, classes, methods, and other constructs—ignoring matches in comments and strings.
|
|
|
|
## Why Use Code Search?
|
|
|
|
| Feature | grep/ripgrep | code_search |
|
|
|---------|--------------|-------------|
|
|
| Finds text in comments | ✅ | ❌ |
|
|
| Finds text in strings | ✅ | ❌ |
|
|
| Understands code structure | ❌ | ✅ |
|
|
| Finds function definitions | Regex needed | Native |
|
|
| Finds class hierarchies | ❌ | ✅ |
|
|
| Language-aware | ❌ | ✅ |
|
|
|
|
**Use code_search when**:
|
|
- Finding function/method definitions
|
|
- Finding class/struct declarations
|
|
- Searching for specific code constructs
|
|
- Need accurate results without false positives
|
|
|
|
**Use grep when**:
|
|
- Searching non-code files (logs, markdown)
|
|
- Simple string searches
|
|
- Searching comments or documentation
|
|
- Regex for text patterns
|
|
|
|
## Supported Languages
|
|
|
|
- Rust
|
|
- Python
|
|
- JavaScript
|
|
- TypeScript
|
|
- Go
|
|
- Java
|
|
- C
|
|
- C++
|
|
- Haskell
|
|
- Scheme
|
|
- Racket
|
|
|
|
## Basic Usage
|
|
|
|
```json
|
|
{"tool": "code_search", "args": {
|
|
"searches": [{
|
|
"name": "my_search",
|
|
"query": "(function_item name: (identifier) @name)",
|
|
"language": "rust"
|
|
}]
|
|
}}
|
|
```
|
|
|
|
### Parameters
|
|
|
|
| Parameter | Type | Required | Description |
|
|
|-----------|------|----------|-------------|
|
|
| `searches` | array | Yes | Array of search objects (max 20) |
|
|
| `max_concurrency` | integer | No | Parallel searches (default: 4) |
|
|
| `max_matches_per_search` | integer | No | Max matches (default: 500) |
|
|
|
|
### Search Object
|
|
|
|
| Field | Type | Required | Description |
|
|
|-------|------|----------|-------------|
|
|
| `name` | string | Yes | Label for this search |
|
|
| `query` | string | Yes | Tree-sitter query (S-expression) |
|
|
| `language` | string | Yes | Programming language |
|
|
| `paths` | array | No | Paths to search (default: current dir) |
|
|
| `context_lines` | integer | No | Lines of context (0-20, default: 0) |
|
|
|
|
## Query Syntax
|
|
|
|
Tree-sitter queries use S-expression syntax. The basic pattern is:
|
|
|
|
```
|
|
(node_type field: (child_type) @capture_name)
|
|
```
|
|
|
|
- `node_type`: The AST node to match
|
|
- `field`: Optional field name
|
|
- `child_type`: Type of child node
|
|
- `@capture_name`: Name for the captured node
|
|
|
|
## Common Query Patterns
|
|
|
|
### Rust
|
|
|
|
```lisp
|
|
;; All functions
|
|
(function_item name: (identifier) @name)
|
|
|
|
;; Async functions
|
|
(function_item (function_modifiers) name: (identifier) @name)
|
|
|
|
;; Structs
|
|
(struct_item name: (type_identifier) @name)
|
|
|
|
;; Enums
|
|
(enum_item name: (type_identifier) @name)
|
|
|
|
;; Impl blocks
|
|
(impl_item type: (type_identifier) @name)
|
|
|
|
;; Trait definitions
|
|
(trait_item name: (type_identifier) @name)
|
|
|
|
;; Macros
|
|
(macro_definition name: (identifier) @name)
|
|
|
|
;; Constants
|
|
(const_item name: (identifier) @name)
|
|
|
|
;; Static variables
|
|
(static_item name: (identifier) @name)
|
|
|
|
;; Type aliases
|
|
(type_item name: (type_identifier) @name)
|
|
|
|
;; Modules
|
|
(mod_item name: (identifier) @name)
|
|
```
|
|
|
|
### Python
|
|
|
|
```lisp
|
|
;; Functions
|
|
(function_definition name: (identifier) @name)
|
|
|
|
;; Async functions
|
|
(function_definition name: (identifier) @name) @fn
|
|
|
|
;; Classes
|
|
(class_definition name: (identifier) @name)
|
|
|
|
;; Methods (functions inside classes)
|
|
(class_definition
|
|
body: (block
|
|
(function_definition name: (identifier) @name)))
|
|
|
|
;; Decorators
|
|
(decorator) @decorator
|
|
|
|
;; Imports
|
|
(import_statement) @import
|
|
(import_from_statement) @import
|
|
```
|
|
|
|
### JavaScript / TypeScript
|
|
|
|
```lisp
|
|
;; Function declarations
|
|
(function_declaration name: (identifier) @name)
|
|
|
|
;; Arrow functions assigned to variables
|
|
(variable_declarator
|
|
name: (identifier) @name
|
|
value: (arrow_function))
|
|
|
|
;; Classes
|
|
(class_declaration name: (identifier) @name)
|
|
|
|
;; Methods
|
|
(method_definition name: (property_identifier) @name)
|
|
|
|
;; Exports
|
|
(export_statement) @export
|
|
|
|
;; Imports
|
|
(import_statement) @import
|
|
```
|
|
|
|
### Go
|
|
|
|
```lisp
|
|
;; Functions
|
|
(function_declaration name: (identifier) @name)
|
|
|
|
;; Methods
|
|
(method_declaration name: (field_identifier) @name)
|
|
|
|
;; Structs
|
|
(type_declaration
|
|
(type_spec name: (type_identifier) @name
|
|
type: (struct_type)))
|
|
|
|
;; Interfaces
|
|
(type_declaration
|
|
(type_spec name: (type_identifier) @name
|
|
type: (interface_type)))
|
|
```
|
|
|
|
### Java
|
|
|
|
```lisp
|
|
;; Classes
|
|
(class_declaration name: (identifier) @name)
|
|
|
|
;; Interfaces
|
|
(interface_declaration name: (identifier) @name)
|
|
|
|
;; Methods
|
|
(method_declaration name: (identifier) @name)
|
|
|
|
;; Constructors
|
|
(constructor_declaration name: (identifier) @name)
|
|
|
|
;; Fields
|
|
(field_declaration
|
|
declarator: (variable_declarator name: (identifier) @name))
|
|
```
|
|
|
|
### C / C++
|
|
|
|
```lisp
|
|
;; Functions
|
|
(function_definition
|
|
declarator: (function_declarator
|
|
declarator: (identifier) @name))
|
|
|
|
;; Structs (C)
|
|
(struct_specifier name: (type_identifier) @name)
|
|
|
|
;; Classes (C++)
|
|
(class_specifier name: (type_identifier) @name)
|
|
|
|
;; Namespaces (C++)
|
|
(namespace_definition name: (identifier) @name)
|
|
```
|
|
|
|
### Racket
|
|
|
|
Racket uses an S-expression grammar where code is represented as nested lists. The tree-sitter-racket parser represents most forms as `(list (symbol) ...)` nodes.
|
|
|
|
```lisp
|
|
;; Function definitions: (define (name args...) body)
|
|
(list . (symbol) @kw (#eq? @kw "define") . (list . (symbol) @name))
|
|
|
|
;; Variable definitions: (define name value)
|
|
(list . (symbol) @kw (#eq? @kw "define") . (symbol) @name)
|
|
|
|
;; Struct definitions: (struct name (fields...))
|
|
(list . (symbol) @kw (#eq? @kw "struct") . (symbol) @name)
|
|
|
|
;; Lambda expressions
|
|
(list . (symbol) @kw (#match? @kw "^(lambda|λ)$"))
|
|
|
|
;; Let bindings
|
|
(list . (symbol) @kw (#match? @kw "^(let|let\\*|letrec)$"))
|
|
|
|
;; Require statements
|
|
(list . (symbol) @kw (#eq? @kw "require"))
|
|
|
|
;; Provide statements
|
|
(list . (symbol) @kw (#eq? @kw "provide"))
|
|
|
|
;; Module definitions
|
|
(list . (symbol) @kw (#match? @kw "^module"))
|
|
|
|
;; Contracts
|
|
(list . (symbol) @kw (#eq? @kw "define/contract") . (list . (symbol) @name))
|
|
|
|
;; Macros
|
|
(list . (symbol) @kw (#match? @kw "^(define-syntax|define-syntax-rule)$") . (symbol) @name)
|
|
|
|
;; For loops
|
|
(list . (symbol) @kw (#match? @kw "^for"))
|
|
|
|
;; Match expressions
|
|
(list . (symbol) @kw (#eq? @kw "match"))
|
|
|
|
;; Class definitions
|
|
(list . (symbol) @kw (#match? @kw "^class"))
|
|
```
|
|
|
|
**Note**: The `.` (dot) in queries like `(list . (symbol))` means "first child" - it matches the symbol that appears immediately after the opening parenthesis.
|
|
|
|
### Scheme
|
|
|
|
Scheme uses similar patterns to Racket:
|
|
|
|
```lisp
|
|
;; Function definitions
|
|
(list . (symbol) @kw (#eq? @kw "define") . (list . (symbol) @name))
|
|
|
|
;; Lambda expressions
|
|
(list . (symbol) @kw (#eq? @kw "lambda"))
|
|
```
|
|
|
|
## Advanced Queries
|
|
|
|
### Wildcards
|
|
|
|
Use `_` to match any node:
|
|
|
|
```lisp
|
|
;; Any function with any name
|
|
(function_item name: (_) @name)
|
|
```
|
|
|
|
### Alternatives
|
|
|
|
Match multiple patterns:
|
|
|
|
```lisp
|
|
;; Functions or methods
|
|
[(function_item) (impl_item)] @item
|
|
```
|
|
|
|
### Predicates
|
|
|
|
Filter matches:
|
|
|
|
```lisp
|
|
;; Functions starting with "test_"
|
|
(function_item name: (identifier) @name
|
|
(#match? @name "^test_"))
|
|
|
|
;; Functions NOT starting with "_"
|
|
(function_item name: (identifier) @name
|
|
(#not-match? @name "^_"))
|
|
```
|
|
|
|
### Nested Matches
|
|
|
|
```lisp
|
|
;; Methods inside impl blocks
|
|
(impl_item
|
|
body: (declaration_list
|
|
(function_item name: (identifier) @method_name)))
|
|
```
|
|
|
|
## Batch Searches
|
|
|
|
Run multiple searches in parallel:
|
|
|
|
```json
|
|
{"tool": "code_search", "args": {
|
|
"searches": [
|
|
{
|
|
"name": "functions",
|
|
"query": "(function_item name: (identifier) @name)",
|
|
"language": "rust"
|
|
},
|
|
{
|
|
"name": "structs",
|
|
"query": "(struct_item name: (type_identifier) @name)",
|
|
"language": "rust"
|
|
},
|
|
{
|
|
"name": "tests",
|
|
"query": "(function_item name: (identifier) @name (#match? @name \"^test_\"))",
|
|
"language": "rust",
|
|
"paths": ["tests/"]
|
|
}
|
|
],
|
|
"max_concurrency": 4
|
|
}}
|
|
```
|
|
|
|
## Context Lines
|
|
|
|
Include surrounding code:
|
|
|
|
```json
|
|
{"tool": "code_search", "args": {
|
|
"searches": [{
|
|
"name": "functions",
|
|
"query": "(function_item name: (identifier) @name)",
|
|
"language": "rust",
|
|
"context_lines": 3
|
|
}]
|
|
}}
|
|
```
|
|
|
|
This shows 3 lines before and after each match.
|
|
|
|
## Path Filtering
|
|
|
|
Search specific directories:
|
|
|
|
```json
|
|
{"tool": "code_search", "args": {
|
|
"searches": [{
|
|
"name": "core_functions",
|
|
"query": "(function_item name: (identifier) @name)",
|
|
"language": "rust",
|
|
"paths": ["src/core", "src/lib.rs"]
|
|
}]
|
|
}}
|
|
```
|
|
|
|
## Output Format
|
|
|
|
Results include:
|
|
- File path
|
|
- Line number
|
|
- Matched code
|
|
- Context (if requested)
|
|
|
|
```
|
|
=== functions (15 matches) ===
|
|
|
|
src/lib.rs:42
|
|
fn process_request(req: Request) -> Response {
|
|
|
|
src/lib.rs:78
|
|
fn handle_error(err: Error) -> Result<()> {
|
|
|
|
src/utils.rs:15
|
|
fn format_output(data: &str) -> String {
|
|
```
|
|
|
|
## Tips
|
|
|
|
### Finding the Right Query
|
|
|
|
1. **Start simple**: Begin with basic node types
|
|
2. **Use AST explorer**: Understand your language's AST
|
|
3. **Iterate**: Refine queries based on results
|
|
|
|
### Performance
|
|
|
|
- **Limit paths**: Search specific directories when possible
|
|
- **Use concurrency**: Batch related searches
|
|
- **Set max_matches**: Prevent overwhelming output
|
|
|
|
### Debugging Queries
|
|
|
|
If a query returns no results:
|
|
1. Check language spelling (lowercase)
|
|
2. Verify node type names for your language
|
|
3. Start with simpler query, add constraints
|
|
4. Check if files exist in search paths
|
|
|
|
## Examples by Task
|
|
|
|
### Find all public functions in Rust
|
|
|
|
```json
|
|
{"tool": "code_search", "args": {
|
|
"searches": [{
|
|
"name": "public_fns",
|
|
"query": "(function_item (visibility_modifier) name: (identifier) @name)",
|
|
"language": "rust"
|
|
}]
|
|
}}
|
|
```
|
|
|
|
### Find all test functions
|
|
|
|
```json
|
|
{"tool": "code_search", "args": {
|
|
"searches": [{
|
|
"name": "tests",
|
|
"query": "(function_item name: (identifier) @name (#match? @name \"^test_\"))",
|
|
"language": "rust",
|
|
"paths": ["tests/"]
|
|
}]
|
|
}}
|
|
```
|
|
|
|
### Find all API endpoints (Python Flask)
|
|
|
|
```json
|
|
{"tool": "code_search", "args": {
|
|
"searches": [{
|
|
"name": "routes",
|
|
"query": "(decorated_definition (decorator) @dec (function_definition name: (identifier) @name))",
|
|
"language": "python"
|
|
}]
|
|
}}
|
|
```
|
|
|
|
### Find all React components
|
|
|
|
```json
|
|
{"tool": "code_search", "args": {
|
|
"searches": [{
|
|
"name": "components",
|
|
"query": "(function_declaration name: (identifier) @name (#match? @name \"^[A-Z]\"))",
|
|
"language": "javascript",
|
|
"paths": ["src/components/"]
|
|
}]
|
|
}}
|
|
```
|