⚡️ Byte Latent Transformer: Patches Scale Better Than Tokens
The Byte Latent Transformer (BLT) is a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale, with significant improvements in inference efficiency and robustness.
🖥 Github: https://github.com/facebookresearch/blt
📕 Paper: https://arxiv.org/abs/2412.09871v1
🌟 Dataset: https://paperswithcode.com/dataset/mmlu
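A toy sketch (not the BLT implementation) of the core idea: operate on raw UTF-8 bytes grouped into patches instead of tokenizer tokens. The fixed patch size here is purely illustrative; BLT forms patches dynamically, using a small byte-level model's next-byte entropy to place patch boundaries.

# Toy illustration only, not the BLT code: group raw UTF-8 bytes into patches.
# BLT itself places patch boundaries dynamically from next-byte entropy.
def bytes_to_patches(text: str, patch_size: int = 4) -> list[list[int]]:
    raw = list(text.encode("utf-8"))  # byte IDs in [0, 255]
    return [raw[i:i + patch_size] for i in range(0, len(raw), patch_size)]

print(bytes_to_patches("Patches scale better than tokens"))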
@Machine_learn
PDF Math Translate
PDF scientific paper translation with preserved formats
Creator: Byaidu
Stars ⭐️: 5.1k
Forks: 375
https://github.com/Byaidu/PDFMathTranslate
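A hedged usage sketch: calling the tool from Python by shelling out to its CLI. This assumes the pdf2zh entry point installed with pip install pdf2zh, as described in the repository README; check the README for output naming and translation-service options.

import subprocess

# Assumption: the package installs a "pdf2zh" CLI (pip install pdf2zh).
# "paper.pdf" is a placeholder input file; see the repo README for options
# such as the translation service (Google/DeepL/Ollama/OpenAI) or output paths.
subprocess.run(["pdf2zh", "paper.pdf"], check=True)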
@Machine_learn
🀄 GuoFeng Webnovel: A Discourse-Level and Multilingual Corpus of Web Fiction
🖥 Github: https://github.com/longyuewangdcu/guofeng-webnovel
📕 Paper: https://arxiv.org/abs/2412.11732v1
🌟 Dataset: www2.statmt.org/wmt24/literary-trans
@Machine_learn
Only the 4th author slot on this joint project is still open.
Work starts on the 1st of Dey (Persian calendar). To collaborate, send a message to my ID.
@Raminmousa
⏩SmolLM2-1.7B
⏩SmolLM2-360M
⏩SmolLM2-135M
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B"
device = "cuda"  # for GPU usage, or "cpu" for CPU usage

# Load the tokenizer and model, and move the model to the target device
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

# Encode a prompt, generate a continuation, and decode it back to text
inputs = tokenizer.encode("Gravity is", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
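A hedged variant of the generate call, in case you want a longer or sampled continuation (the parameter values here are illustrative, not from the original post):

# Illustrative settings only; all values are assumptions, tune as needed.
outputs = model.generate(
    inputs,
    max_new_tokens=50,   # generate a longer continuation than the default
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))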
@Machine_learn
OpenAI's new o3 model is changing the game!
For a long time, ARC was seen as proof that AI models “can’t think.” The argument went: if they truly could, why do they perform so poorly on this benchmark?
Well, those days are over. The o3 model demonstrates not only the ability to think but also the capability to tackle tasks once considered out of reach.
👀 Check out the full breakdown of this breakthrough: https://arcprize.org/blog/oai-o3-pub-breakthrough
It might be time to rethink what AI can achieve. Looking forward to the release!
@Machine_learn
Probability, Random Processes, and Statistical Analysis: Applications to Communications, Signal Processing, Queueing Theory and Mathematical Finance
📕 Book
@Machine_learn