Shuu12121 committed on
Commit
e08ecb2
·
verified ·
1 Parent(s): 31cf67a

Update README.md

Files changed (1)
  1. README.md +21 -1
README.md CHANGED
@@ -64,7 +64,7 @@ model = SentenceTransformer("Shuu12121/Owl-ph2-len2048")
64
 
65
  ### Training Dataset
66
 
67
- This model was trained on the **Owl corpus**, a dataset constructed for code search and code-text retrieval.
68
  The training set contains approximately **100,000 samples per language**, resulting in **800,640 training pairs** in total.
69
 
70
  ### Training Hyperparameters
@@ -72,3 +72,23 @@ The training set contains approximately **100,000 samples per language**, result
72
  * **Learning rate:** 1e-5
73
  * **Epochs:** 1
74
  * **Loss:** MultipleNegativesRankingLoss
 
64
 
65
  ### Training Dataset
66
 
67
+ This model was trained on the [**Owl corpus**](https://huggingface.co/collections/Shuu12121/codesearch-datasets), a dataset constructed for code search and code-text retrieval.
68
  The training set contains approximately **100,000 samples per language**, resulting in **800,640 training pairs** in total.
69
 
70
  ### Training Hyperparameters
 
72
  * **Learning rate:** 1e-5
73
  * **Epochs:** 1
74
  * **Loss:** MultipleNegativesRankingLoss
75
+
76
+ ## Integrations
77
+
78
+ ### Owl-CLI
79
+
80
+ This model is used as the embedding model in **[Owl-CLI](https://github.com/Shun0212/Owl-CLI)**, a command-line tool for semantic code search.
81
+
82
+ Owl-CLI indexes source code at the **function level**, generates dense embeddings using this model, and performs **vector similarity search** to retrieve relevant code for natural language queries.
83
+
84
+ Key features of Owl-CLI include:
85
+
86
+ - **Semantic code search** using dense embeddings
87
+ - **Function-level indexing** with file paths and line numbers
88
+ - **Automatic indexing** on first search
89
+ - **Differential embedding cache** to avoid re-embedding unchanged files
90
+ - **JSON output** for tool integration
91
+ - **MCP server support** for integration with AI coding agents (e.g., Claude Code)
92
+
93
+ Repository:
94
+ https://github.com/Shun0212/Owl-CLI
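
The pipeline described in the added Owl-CLI section (function-level indexing with file paths and line numbers, dense embeddings, then vector similarity search) can be illustrated with a minimal Python sketch. This is not Owl-CLI's actual implementation: the helper names `extract_functions`, `cosine`, and `search` are hypothetical, the sketch handles only Python source via the standard `ast` module, and it re-embeds everything on each call instead of using a cache. The embedding function is passed in as a parameter so that any model can be plugged in.

```python
import ast
import math
from typing import Callable, List, Sequence, Tuple

# One index entry: (file path, first line, last line, function source text).
Entry = Tuple[str, int, int, str]
# An embedder maps a list of texts to one vector per text (assumed interface).
Embedder = Callable[[List[str]], Sequence[Sequence[float]]]


def extract_functions(path: str, source: str) -> List[Entry]:
    """Collect every function definition in a Python source string."""
    entries: List[Entry] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            text = ast.get_source_segment(source, node) or ""
            entries.append((path, node.lineno, node.end_lineno, text))
    # Sort by starting line so the index order matches the file order.
    entries.sort(key=lambda entry: entry[1])
    return entries


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return float(dot / (norm_a * norm_b)) if norm_a and norm_b else 0.0


def search(
    query: str, entries: List[Entry], embed: Embedder, top_k: int = 3
) -> List[Tuple[float, Entry]]:
    """Rank indexed functions by similarity to a natural language query."""
    if not entries:
        return []
    vectors = embed([entry[3] for entry in entries])
    query_vector = embed([query])[0]
    scored = [(cosine(query_vector, v), e) for v, e in zip(vectors, entries)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Usage with the model from this README (downloads weights, so left commented):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("Shuu12121/Owl-ph2-len2048")
# entries = extract_functions("demo.py", open("demo.py").read())
# for score, (path, start, end, _) in search("read a file", entries, model.encode):
#     print(f"{score:.3f} {path}:{start}-{end}")
```

Passing the embedder as an argument keeps the indexing and ranking logic independent of the model, so the same functions work with `model.encode` or with a lightweight stand-in during testing.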