tau-0.5B

Model Details

  • Model Name: tau-0.5B
  • Base Model: Qwen1.5-0.5B
  • Dataset: UltraTextbooks-2.0
  • Model Size: 0.5B parameters
  • Model Type: Language Model
  • Training Procedure: Further pre-training of Qwen1.5-0.5B on UltraTextbooks-2.0.

Model Use

tau-0.5B is a general-purpose language model with additional training in machine learning, mathematics, and coding. It can be used for natural language processing tasks such as:

  • Educational question answering
  • Text summarization
  • Content generation for educational purposes
  • Code understanding and generation
  • Mathematical problem solving

The model's exposure to the diverse content in the UltraTextbooks-2.0 dataset makes it particularly well-suited for applications in educational technology and research.
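
As a minimal sketch, the model can be loaded for text generation with the Hugging Face transformers library, assuming the repository ID M4-ai/tau-0.5B. The "Question:/Answer:" prompt layout and the helper names build_prompt and generate_answer are illustrative assumptions, not a documented prompt template for this model.

```python
MODEL_ID = "M4-ai/tau-0.5B"  # repository ID as listed on the model page


def build_prompt(question: str) -> str:
    """Format a question as plain text for a base (non-chat) model.

    The "Question:/Answer:" layout is an assumption, not a documented template.
    """
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 128) -> str:
    """Load the model and greedily decode a continuation of the prompt."""
    # Imported here so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling generate_answer("What is the derivative of x**2?") downloads the weights on first use and returns the generated continuation. Greedy decoding (do_sample=False) is chosen here only to make outputs repeatable; sampling settings are left to the user.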

Training Data

tau-0.5B was further pre-trained on the UltraTextbooks-2.0 dataset, which is an expanded version of the original UltraTextbooks dataset. UltraTextbooks-2.0 incorporates additional high-quality synthetic and human-written textbooks from various sources on the Hugging Face platform, with a focus on increasing the diversity of content in the domains of machine learning, mathematics, and coding.

For more details on the dataset, please refer to the UltraTextbooks-2.0 Dataset Card.

Performance and Limitations

See the Evaluation section for benchmark results. The model may still reproduce biases or inaccuracies present in the training data. Users are encouraged to critically evaluate its outputs and report any issues to support continued improvement.

Environmental Impact

The training of tau-0.5B required computational resources that contribute to the model's overall environmental impact. However, efforts were made to optimize the training process and minimize the carbon footprint.

Ethical Considerations

tau-0.5B was trained on a diverse dataset that may contain biases and inaccuracies. Users should be aware of these potential limitations and use the model responsibly. The model should not be used for tasks that could cause harm or discriminate against individuals or groups.

Evaluation

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| agieval_nous | N/A | none | 0 | acc | 0.2235 | ± 0.0434 |
| | | none | 0 | acc_norm | 0.2141 | ± 0.0498 |
| - agieval_aqua_rat | 1 | none | 0 | acc | 0.1417 | ± 0.0219 |
| | | none | 0 | acc_norm | 0.1535 | ± 0.0227 |
| - agieval_logiqa_en | 1 | none | 0 | acc | 0.2796 | ± 0.0176 |
| | | none | 0 | acc_norm | 0.3118 | ± 0.0182 |
| - agieval_lsat_ar | 1 | none | 0 | acc | 0.2000 | ± 0.0264 |
| | | none | 0 | acc_norm | 0.1696 | ± 0.0248 |
| - agieval_lsat_lr | 1 | none | 0 | acc | 0.2275 | ± 0.0186 |
| | | none | 0 | acc_norm | 0.2020 | ± 0.0178 |
| - agieval_lsat_rc | 1 | none | 0 | acc | 0.1487 | ± 0.0217 |
| | | none | 0 | acc_norm | 0.1561 | ± 0.0222 |
| - agieval_sat_en | 1 | none | 0 | acc | 0.2330 | ± 0.0295 |
| | | none | 0 | acc_norm | 0.2039 | ± 0.0281 |
| - agieval_sat_en_without_passage | 1 | none | 0 | acc | 0.2524 | ± 0.0303 |
| | | none | 0 | acc_norm | 0.1942 | ± 0.0276 |
| - agieval_sat_math | 1 | none | 0 | acc | 0.2227 | ± 0.0281 |
| | | none | 0 | acc_norm | 0.1682 | ± 0.0253 |

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| truthfulqa | 2 | none | 0 | acc | 0.3931 | ± 0.0143 |
| mmlu | N/A | none | 0 | acc | 0.3642 | ± 0.0040 |
| - humanities | N/A | none | 5 | acc | 0.3320 | ± 0.0068 |
| - formal_logic | 0 | none | 5 | acc | 0.2619 | ± 0.0393 |
| - high_school_european_history | 0 | none | 5 | acc | 0.4909 | ± 0.0390 |
| - high_school_us_history | 0 | none | 5 | acc | 0.4167 | ± 0.0346 |
| - high_school_world_history | 0 | none | 5 | acc | 0.4641 | ± 0.0325 |
| - international_law | 0 | none | 5 | acc | 0.5537 | ± 0.0454 |
| - jurisprudence | 0 | none | 5 | acc | 0.4167 | ± 0.0477 |
| - logical_fallacies | 0 | none | 5 | acc | 0.2638 | ± 0.0346 |
| - moral_disputes | 0 | none | 5 | acc | 0.3757 | ± 0.0261 |
| - moral_scenarios | 0 | none | 5 | acc | 0.2402 | ± 0.0143 |
| - philosophy | 0 | none | 5 | acc | 0.3794 | ± 0.0276 |
| - prehistory | 0 | none | 5 | acc | 0.3426 | ± 0.0264 |
| - professional_law | 0 | none | 5 | acc | 0.3103 | ± 0.0118 |
| - world_religions | 0 | none | 5 | acc | 0.2807 | ± 0.0345 |
| - other | N/A | none | 5 | acc | 0.4071 | ± 0.0088 |
| - business_ethics | 0 | none | 5 | acc | 0.4200 | ± 0.0496 |
| - clinical_knowledge | 0 | none | 5 | acc | 0.4491 | ± 0.0306 |
| - college_medicine | 0 | none | 5 | acc | 0.3873 | ± 0.0371 |
| - global_facts | 0 | none | 5 | acc | 0.3600 | ± 0.0482 |
| - human_aging | 0 | none | 5 | acc | 0.3498 | ± 0.0320 |
| - management | 0 | none | 5 | acc | 0.4854 | ± 0.0495 |
| - marketing | 0 | none | 5 | acc | 0.5470 | ± 0.0326 |
| - medical_genetics | 0 | none | 5 | acc | 0.4000 | ± 0.0492 |
| - miscellaneous | 0 | none | 5 | acc | 0.4291 | ± 0.0177 |
| - nutrition | 0 | none | 5 | acc | 0.4183 | ± 0.0282 |
| - professional_accounting | 0 | none | 5 | acc | 0.3582 | ± 0.0286 |
| - professional_medicine | 0 | none | 5 | acc | 0.3015 | ± 0.0279 |
| - virology | 0 | none | 5 | acc | 0.3494 | ± 0.0371 |
| - social_sciences | N/A | none | 5 | acc | 0.4075 | ± 0.0088 |
| - econometrics | 0 | none | 5 | acc | 0.2719 | ± 0.0419 |
| - high_school_geography | 0 | none | 5 | acc | 0.5000 | ± 0.0356 |
| - high_school_government_and_politics | 0 | none | 5 | acc | 0.4611 | ± 0.0360 |
| - high_school_macroeconomics | 0 | none | 5 | acc | 0.4051 | ± 0.0249 |
| - high_school_microeconomics | 0 | none | 5 | acc | 0.3908 | ± 0.0317 |
| - high_school_psychology | 0 | none | 5 | acc | 0.4239 | ± 0.0212 |
| - human_sexuality | 0 | none | 5 | acc | 0.3893 | ± 0.0428 |
| - professional_psychology | 0 | none | 5 | acc | 0.3399 | ± 0.0192 |
| - public_relations | 0 | none | 5 | acc | 0.4455 | ± 0.0476 |
| - security_studies | 0 | none | 5 | acc | 0.3510 | ± 0.0306 |
| - sociology | 0 | none | 5 | acc | 0.5174 | ± 0.0353 |
| - us_foreign_policy | 0 | none | 5 | acc | 0.5500 | ± 0.0500 |
| - stem | N/A | none | 5 | acc | 0.3276 | ± 0.0083 |
| - abstract_algebra | 0 | none | 5 | acc | 0.3000 | ± 0.0461 |
| - anatomy | 0 | none | 5 | acc | 0.2889 | ± 0.0392 |
| - astronomy | 0 | none | 5 | acc | 0.3487 | ± 0.0388 |
| - college_biology | 0 | none | 5 | acc | 0.3403 | ± 0.0396 |
| - college_chemistry | 0 | none | 5 | acc | 0.2600 | ± 0.0441 |
| - college_computer_science | 0 | none | 5 | acc | 0.3800 | ± 0.0488 |
| - college_mathematics | 0 | none | 5 | acc | 0.3300 | ± 0.0473 |
| - college_physics | 0 | none | 5 | acc | 0.2745 | ± 0.0444 |
| - computer_security | 0 | none | 5 | acc | 0.4300 | ± 0.0498 |
| - conceptual_physics | 0 | none | 5 | acc | 0.3447 | ± 0.0311 |
| - electrical_engineering | 0 | none | 5 | acc | 0.3931 | ± 0.0407 |
| - elementary_mathematics | 0 | none | 5 | acc | 0.3095 | ± 0.0238 |
| - high_school_biology | 0 | none | 5 | acc | 0.4161 | ± 0.0280 |
| - high_school_chemistry | 0 | none | 5 | acc | 0.2759 | ± 0.0314 |
| - high_school_computer_science | 0 | none | 5 | acc | 0.3100 | ± 0.0465 |
| - high_school_mathematics | 0 | none | 5 | acc | 0.3185 | ± 0.0284 |
| - high_school_physics | 0 | none | 5 | acc | 0.2517 | ± 0.0354 |
| - high_school_statistics | 0 | none | 5 | acc | 0.3009 | ± 0.0313 |
| - machine_learning | 0 | none | 5 | acc | 0.3036 | ± 0.0436 |
| medqa_4options | Yaml | none | 5 | acc | 0.2687 | ± 0.0124 |
| | | none | 5 | acc_norm | 0.2687 | ± 0.0124 |
| logieval | 0 | get-answer | 5 | exact_match | 0.3505 | ± 0.0120 |
| gsm8k_cot | 3 | strict-match | 8 | exact_match | 0.0690 | ± 0.0070 |
| | | flexible-extract | 8 | exact_match | 0.1365 | ± 0.0095 |

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| arc_easy | 1 | none | 25 | acc | 0.5981 | ± 0.0101 |
| | | none | 25 | acc_norm | 0.5939 | ± 0.0101 |
| arc_challenge | 1 | none | 25 | acc | 0.2688 | ± 0.0130 |
| | | none | 25 | acc_norm | 0.2969 | ± 0.0134 |
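
To help read the Value and Stderr columns, the sketch below turns a reported score into an approximate 95% interval and measures how far a score sits above random guessing. It assumes the standard errors can be treated with a normal approximation, and the helper names confidence_interval and z_above_chance are illustrative, not part of any evaluation tool. The 0.25 chance level relies on MMLU being a four-option multiple-choice benchmark.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation interval: value +/- z * stderr (z = 1.96 for ~95%)."""
    half_width = z * stderr
    return (value - half_width, value + half_width)


def z_above_chance(value: float, stderr: float, chance: float) -> float:
    """Number of standard errors by which a score exceeds the chance level."""
    return (value - chance) / stderr


# Scores copied from the evaluation results reported in this section.
arc_easy = (0.5981, 0.0101)  # 25-shot acc
mmlu = (0.3642, 0.0040)      # aggregate acc

low, high = confidence_interval(*arc_easy)
print(f"arc_easy 95% CI: [{low:.4f}, {high:.4f}]")

# Assumption: four answer options per question, so random guessing scores 0.25.
print(f"mmlu z-score vs. chance: {z_above_chance(*mmlu, chance=0.25):.2f}")
```

The same two helpers can be applied to any row. Subtasks with large standard errors (for example those near ± 0.05) have intervals wide enough that small differences between them should not be over-interpreted.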

Usage Rights

Make sure to read Qwen's license before using this model.
