TF Cookbook · Post-Training
Post-training techniques for open-source models on Nebius: RLHF, DPO, instruction tuning. End-to-end working examples.
tokenfactory
aicloud