
MariTalk is an LLM-based chatbot trained to serve the needs of Brazilian users.

Use it for free at chat.maritaca.ai

Our Products

Our LLMs, specialized in Portuguese, are available through two products:
MariTalk API: models that run in our cloud
MariTalk Local: models that run on your own machine

MariTalk API

The MariTalk API lets you use our Sabiá-2 models, paying in proportion to the number of tokens sent (prompt) and generated.

Thanks to their specialized training, the Sabiá-2 models deliver higher quality at a lower price than our competitors.

Compare below the quality of our models, measured by performance on 64 Brazilian exams (Enem, Enade, Revalida, OAB, UNICAMP, USP, etc.), versus price:

[Chart: cost-benefit comparison of exam performance vs. price]

* Assuming US$1 = R$5.

Estimated split: of every 1 million tokens, 500 thousand are input (prompt) tokens and 500 thousand are output tokens.

One million tokens corresponds to approximately 700 pages of text in Portuguese.

For more details, see the Sabiá-2 blog post.

API Pricing

Sabiá-2 Small

R$ 1.00 per 1M input tokens
R$ 3.00 per 1M output tokens

R$ 20 in initial credits

8192-token context window

Low latency

Best cost-benefit

Rate limit: 150k input tokens/min, 50k output tokens/min

Sabiá-2 Medium

R$ 5.00 per 1M input tokens
R$ 15.00 per 1M output tokens

R$ 20 in initial credits

8192-token context window

Higher accuracy

Rate limit: 150k input tokens/min, 50k output tokens/min
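The prices above translate directly into a per-request cost. The sketch below is illustrative (the dictionary keys are placeholder names, not official API model identifiers):

```python
# List prices above, in R$ per 1M tokens. The keys here are
# placeholder names for illustration, not official model identifiers.
PRICES = {
    "sabia-2-small": {"input": 1.00, "output": 3.00},
    "sabia-2-medium": {"input": 5.00, "output": 15.00},
}

def cost_brl(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in R$ for the given token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: 1M tokens split 50/50 between input and output,
# matching the estimate used in the comparison chart above.
print(cost_brl("sabia-2-small", 500_000, 500_000))   # 2.0
print(cost_brl("sabia-2-medium", 500_000, 500_000))  # 10.0
```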

How to use

Use the MariTalk API through our Python library:

import maritalk

model = maritalk.MariTalk(key="insert your key here. Ex: '100088...'")

answer = model.generate("Quanto é 25 + 27?")

print(f"Resposta: {answer}")    # Should print something like "52."
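Production calls can hit transient network errors or the per-minute rate limits listed above. One way to handle this is a small exponential-backoff wrapper around any generate function; this is a generic sketch, not part of the maritalk library, and the retry policy is an assumption you should tune:

```python
import time

def generate_with_retry(generate_fn, prompt, max_attempts=3, base_delay=1.0):
    """Call generate_fn(prompt), retrying with exponential backoff on failure.

    generate_fn can be model.generate from the example above, or any
    callable that takes a prompt string and returns an answer.
    """
    for attempt in range(max_attempts):
        try:
            return generate_fn(prompt)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * 2 ** attempt)
```

Usage with the client above would be, for example, `answer = generate_with_retry(model.generate, "Quanto é 25 + 27?")`.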


MariTalk Local

Besides using our models via the API, you can also host them locally.

Your data never leaves your local machine; only information about model usage time is sent to our servers, for billing purposes.

Sabiá-2 Small

R$ 3.50 per hour

30 days free

8192-token context window

Low latency

Best cost-benefit

Requires a GPU with 24 GB of RAM

Sabiá-2 Medium

R$ 10.00 per hour

30 days free

8192-token context window

Higher accuracy

Requires a GPU with 80 GB of RAM
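The hourly Local prices and the per-token API prices can be compared directly: above a certain sustained throughput, a Local license becomes cheaper than the API. A rough break-even sketch, using only the list prices above and the 50/50 input/output split assumed in the API pricing chart:

```python
def api_cost_per_1m(input_price: float, output_price: float) -> float:
    """Blended R$ cost per 1M tokens, assuming a 50/50 input/output split."""
    return 0.5 * input_price + 0.5 * output_price

def breakeven_tokens_per_hour(local_price_per_hour: float, blended_per_1m: float) -> float:
    """Tokens/hour above which a Local license is cheaper than the API."""
    return local_price_per_hour / blended_per_1m * 1_000_000

small = breakeven_tokens_per_hour(3.50, api_cost_per_1m(1.00, 3.00))
medium = breakeven_tokens_per_hour(10.00, api_cost_per_1m(5.00, 15.00))
print(f"Sabiá-2 Small:  {small:,.0f} tokens/hour")   # 1,750,000
print(f"Sabiá-2 Medium: {medium:,.0f} tokens/hour")  # 1,000,000
```

This ignores the hardware cost of running Local, so treat it as a lower bound on the volume where Local pays off.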

How to use

Use MariTalk Local through our Python library:

import maritalk

# Create an instance of the MariTalkLocal client

client = maritalk.MariTalkLocal()

# Start the server with the specified license key.

# The executable will be downloaded to ~/bin/maritalk

client.start_server(license="00000-00000-00000-00000")

# Generate an answer to the question

response = client.generate("Quanto é 25 + 27?")

print(response["output"])    # Should print something like "52."
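Note that the Local client returns a dict (with the answer under "output") rather than a plain string like the API example. If you want to swap between the two backends behind one interface, a small adapter is enough; this sketch assumes only the two call shapes shown in the examples above:

```python
def make_generate(backend):
    """Wrap a MariTalk backend so it always returns a plain answer string.

    backend: an object with a .generate(prompt) method that returns either
    a string (as in the API example) or a dict with an "output" key
    (as in the Local example).
    """
    def generate(prompt: str) -> str:
        result = backend.generate(prompt)
        if isinstance(result, dict):
            return result["output"]
        return result
    return generate
```

Usage would then be identical for either backend, e.g. `generate = make_generate(client)` followed by `generate("Quanto é 25 + 27?")`.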

FAQ

  • How do I get access to MariTalk?
    To access MariTalk, go to chat.maritaca.ai and log in with an email address. If you are looking for information on how to use the MariTalk API, please visit the link https://github.com/maritaca-ai/maritalk-api for more information.
  • What are the models capable of?
    They have various abilities, such as: Providing detailed explanations on a variety of topics and assisting in learning new concepts; Answering questions according to the context of a conversation; Translating text to and from many different languages; Generating creative texts, such as stories, poems, dialogues, and much more.
  • Can I fine-tune the Maritaca models on my data?
    Not yet, but we will soon have this functionality available both in models served via API and in models that can be downloaded and used locally.
  • What is the difference between MariTalk and Sabiá models?
    MariTalk is a platform that serves various LLMs, including the Sabiá-2 small and medium models. If you want to learn more about the Sabiá-2 models, see this blog post. The Sabiá (1) models, on the other hand, are non-commercial models resulting from scientific research.
  • Which model is used in the MariTalk web chat?
    The model used in the web chat is indicated below the text box (e.g., "Sabiá-2-medium version 2024-03-01").
  • On what data were the models trained?
    MariTalk is trained on a large amount of text from the internet, mostly in Portuguese. However, the model also has reasonable capabilities in languages such as English and Spanish. 
  • How are the models trained?
    The models go through two stages of training. In the first, the models are trained in a self-supervised manner (e.g., predicting the next word in a document) on large amounts of text extracted, for example, from the web, books, etc. The second stage focuses on teaching the models to understand and follow specific instructions, as well as to produce safe responses, avoiding content that is offensive, dangerous, or in violation of ethical principles.
  • What is the cut-off date for the training data?
    The models were trained on data available up to mid-2023, so they are not aware of events or information that emerged after that date.
  • Are my data used for training? What is the data retention policy?
    MariTalk API: all data sent to our servers are immediately discarded after the output is generated. We only store token counts for billing purposes. MariTalk Local: your data never leaves the machine where it is being executed. The only communication with Maritaca servers is to verify that the provided license remains valid. Web Chatbot: as it is a free service, we may occasionally use the most frequent user questions to improve the models. 
  • What is the architecture and how many parameters do the models served by MariTalk API and Local have?
    The models are based on the Transformers architecture, but the number of parameters and exact architecture are information we keep confidential.
  • Do you have embedding services for retrieval augmented LLMs (RAG)?
    No, but we will include this service in our portfolio in the near future. If you are interested in integrating our LLMs with search systems, we recommend consulting this example with LangChain.
  • Do you plan to have models specialized in specific domains, such as legal, financial, health, etc.?
    Yes, we plan to serve models specialized in certain areas of knowledge. However, we do not yet have a date for their releases.
  • Which cloud is used by the MariTalk API?
    Our models run on GPUs in Oracle Cloud, Amazon AWS, and Google Cloud. The training is mostly done on TPUs in Google Cloud.
  • Why use MariTalk instead of OpenAI ChatGPT, Google Gemini, etc.?
    MariTalk stands out from competitors in two fundamental aspects. Firstly, it was specifically trained to understand the Portuguese language well, therefore, it performs better in tasks in this language. If your project or application demands specific knowledge of Brazil, MariTalk might be the ideal choice. Secondly, MariTalk Local allows for the download and local execution, on your own server or desktop, without the need to send your data to the cloud. This can be especially useful for entities dealing with sensitive data, such as hospitals and law firms, which need or are legally required to keep their data on their own servers.
  • What are the limitations of Maritaca AI models?
    Tasks that require logical reasoning and code writing are challenging for current models. In addition, the models can hallucinate, such as inventing facts or answering questions about events that never happened. These are open problems for the scientific community, but are progressively mitigated with each new version of the models.
  • When to use MariTalk API vs MariTalk Local?
    If your data cannot be transmitted outside of your local network, MariTalk Local is the ideal solution. This option is also recommended when the number of requests is high, as the investment in dedicated hardware and license tends to be more efficient. On the other hand, if the volume of requests is moderate, the API hosted by Maritaca AI may be more economically advantageous. This model allows us to share hardware costs with other customers, making it a more accessible option.
  • What support is included by Maritaca AI when purchasing a MariTalk Local license?
    The license includes support from the Maritaca AI team for installing the model in your local environment. If you do not have suitable hardware, we can assist in running the model on cloud provider servers with competitive prices, such as LambdaLabs, CoreWeave, DataCrunch, or on more popular providers like Oracle Cloud, Google Cloud, Amazon AWS, and Microsoft Azure. If you wish to purchase a server, we will guide you in the ideal configuration and indicate suppliers.
  • Upon purchasing a MariTalk Local license, will I have access to the model weights?
    Upon purchasing a license, you will have access to a version of the model that has been trained and adjusted to meet your specific needs in the Portuguese language. However, the model weights remain the property of Maritaca AI. This is done to protect the intellectual property of Maritaca AI. The license you purchase gives you the right to use the model, but does not give you access to the model weights.
  • How many replicas of MariTalk Local can I run simultaneously with one license?
    You can run multiple instances simultaneously per license, with the cost charged proportionally to the number of active instances. For example, double the amount will be charged if a license is run on two machines simultaneously.
  • Does the license allow me to run MariTalk Local in containers (e.g., Docker) or VMs?
    Yes, it is possible to run MariTalk Local in containers or virtual machines. In the documentation, we provide examples of how to run using Docker on different cloud providers: https://github.com/maritaca-ai/maritalk-api/blob/main/examples/local/docker.md
  • Can I run MariTalk Local models on CPU?
    Currently, we do not support CPUs, only Nvidia GPUs.
  • When activating the MariTalk Local license, is it tied to specific hardware?
    No. Users can change hardware as many times as they wish, without restrictions.
  • Can I cancel the MariTalk Local license at any time?
    Yes, and you will only be charged for the hours that the MariTalk Local server was on.
  • What solutions does your company offer for specific tasks, such as improving customer service through chatbots, drafting petitions, etc.?
    We do not provide customized solutions for specific client challenges. In this case, we recommend contacting a software integration company that can offer a service more aligned with your specific demands.
  • I have a project that involves LLMs. Could you help me execute it?
    As we are focused on improving our products, unfortunately, we do not have the time to engage in specific projects or customize our products.
  • I would like to train my company's staff on generative AI, LLMs, RAG, etc. Do you offer courses, workshops, etc.?
    As we are a team focused on improving our products, unfortunately, we do not have the time to offer training.
  • Where is Maritaca AI located?
    Maritaca AI is located in Campinas, in the state of São Paulo, but it is a hybrid company, with some of the team working remotely.
  • Is Maritaca affiliated with UNICAMP?
    Although part of our team has had or has some connection with UNICAMP, whether as students or researchers, Maritaca AI has no formal link with the university, nor was it incubated by it. However, we greatly benefit from being so close to the ecosystem of education, research, and development that shapes high-level professionals.