Sabiá-4: Technical Report
2026
This technical report introduces Sabiá-4 and Sabiazinho-4, a new generation of language models focused on Brazilian Portuguese. The models were developed in four stages: continued pre-training on Portuguese and Brazilian legal corpora, long-context extension to 128K tokens, supervised fine-tuning, and preference alignment.