---
base_model:
- microsoft/codebert-base
datasets:
- devngho/the-stack-llm-annotations-v2
language:
- code
library_name: transformers
license: mit
metrics:
- f1
---
# devngho/code_edu_classifier-v3-microsoft_codebert-base

This model is [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base) with a classifier head added. It is designed to rate the educational value of code, similar to [HuggingFaceFW/fineweb-edu-classifier](https://huggingface.co/HuggingFaceFW/fineweb-edu-classifier) but focused on code. The training data is the [devngho/the-stack-llm-annotations-v2](https://huggingface.co/datasets/devngho/the-stack-llm-annotations-v2) dataset, which contains samples extracted from [bigcode/the-stack-dedup](https://huggingface.co/datasets/bigcode/the-stack-dedup) and scored with [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct).

This research was supported with Cloud TPUs from Google's TPU Research Cloud [(TRC)](https://sites.research.google/trc/about/). ⚡
- **Developed by:** devngho
- **Language(s):** code
- **License:** mit
- **Base model:** [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base)
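A minimal usage sketch with Transformers, for readers who want to score a snippet. It assumes the repository ships PyTorch-compatible weights and that `torch` is installed. The helper names `logits_to_score` and `score_code` are hypothetical, and the card does not state whether the head outputs one regression value or six class logits, so the helper handles both cases as an assumption.

```python
MODEL_ID = "devngho/code_edu_classifier-v3-microsoft_codebert-base"


def logits_to_score(logits):
    """Turn one example's logits into an integer score from 0 to 5.

    Assumption (not stated in the model card): a single logit is a
    regression output to round and clamp; several logits are per-class
    scores, so the index of the largest one is the predicted class.
    """
    if len(logits) == 1:
        return int(min(5, max(0, round(logits[0]))))
    return max(range(len(logits)), key=lambda i: logits[i])


def score_code(snippet, model_id=MODEL_ID):
    """Download the classifier and score a single code snippet."""
    # Imported lazily so logits_to_score works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # CodeBERT accepts at most 512 tokens, so longer files are truncated.
    inputs = tokenizer(snippet, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return logits_to_score(logits)
```

Calling `score_code("def add(a, b):\n    return a + b")` downloads the model on first use and returns an integer score. For batch filtering, load the tokenizer and model once rather than on every call.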
## Training details

- learning_rate: 3e-4 (cosine)
- warmup_ratio: 0.1
- batch_size: 2048 (512*4)
- optimizer: adamw (b1=0.9, b2=0.98, eps=1e-8, weight_decay=0.01)
- duration: 4h 41m
- steps: 6080
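The learning-rate settings above can be read as a schedule like the following sketch. The card only says "cosine" with a warmup ratio of 0.1, so the linear warmup shape and the decay to zero are assumptions, and `learning_rate` is a hypothetical helper rather than the training code.

```python
import math

PEAK_LR = 3e-4       # learning_rate from the list above
TOTAL_STEPS = 6080   # steps from the list above
WARMUP_RATIO = 0.1   # warmup_ratio from the list above


def learning_rate(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS,
                  warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given step (assumed: linear warmup, cosine decay to 0)."""
    warmup_steps = round(total_steps * warmup_ratio)  # 608 with the values above
    if step < warmup_steps:
        # Assumed linear ramp from 0 up to the peak rate.
        return peak_lr * step / warmup_steps
    # Cosine decay over the remaining steps, reaching 0 at total_steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
```

Under these assumptions the rate peaks at 3e-4 at step 608 and falls to zero at step 6080.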
## Training hardware

TPU v4-8
## Performance

```
Validation Report:
              precision    recall  f1-score   support

           0       0.80      0.06      0.10        72
           1       0.62      0.40      0.48       835
           2       0.61      0.62      0.61      2722
           3       0.48      0.72      0.58      1891
           4       0.62      0.02      0.05       623
           5       0.00      0.00      0.00         1

    accuracy                           0.55      6144
   macro avg       0.52      0.30      0.30      6144
weighted avg       0.58      0.55      0.52      6144

Confusion Matrix:
[[   4   36   30    2    0    0]
 [   1  330  464   40    0    0]
 [   0  157 1684  881    0    0]
 [   0    5  516 1361    9    0]
 [   0    0   71  537   15    0]
 [   0    0    0    1    0    0]]
```
When the task is reduced to a binary split (score of 3 or higher versus below 3), the F1 score is about 0.72.
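The binary figure can be checked against the confusion matrix above. This sketch assumes rows are true labels and columns are predictions (the row sums match the support column), and treats "3 or higher" as the positive class. `binary_f1` is a hypothetical helper.

```python
# Confusion matrix from the validation report (rows: true, columns: predicted).
CONFUSION = [
    [4, 36, 30, 2, 0, 0],
    [1, 330, 464, 40, 0, 0],
    [0, 157, 1684, 881, 0, 0],
    [0, 5, 516, 1361, 9, 0],
    [0, 0, 71, 537, 15, 0],
    [0, 0, 0, 1, 0, 0],
]


def binary_f1(confusion, threshold=3):
    """Collapse a multi-class confusion matrix into label >= threshold vs. below."""
    tp = fp = fn = 0
    for true_label, row in enumerate(confusion):
        for predicted_label, count in enumerate(row):
            actual_positive = true_label >= threshold
            predicted_positive = predicted_label >= threshold
            if actual_positive and predicted_positive:
                tp += count
            elif predicted_positive:
                fp += count
            elif actual_positive:
                fn += count
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = binary_f1(CONFUSION)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: precision=0.68 recall=0.76 f1=0.72
```

At this threshold recall exceeds precision, so a filter built on it keeps most of the samples labeled 3 or higher while admitting a fair number of lower-scored ones.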