🌟 BioNeMo: A Framework for Developing AI Models for Drug Design.

NVIDIA BioNeMo2 Framework is a set of tools, libraries, and models for computational drug discovery and design.



▶️ Pre-trained models:

🟢 ESM-2 is a pre-trained bidirectional encoder (BERT-like) for amino acid sequences. BioNeMo2 includes checkpoints with 650M and 3B parameters;

🟢 Geneformer is a tabular count model that generates a dense representation of a cell's scRNA by learning co-expression patterns within single cells.
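
As a rough illustration of the BERT-like objective mentioned for ESM-2, the sketch below masks a fraction of the residues in an amino acid sequence and keeps the originals as prediction targets. It is not BioNeMo or ESM-2 code: the `mask_sequence` helper, the `<mask>` placeholder string, the 15% masking rate, and the example sequence are all assumptions chosen for illustration.

```python
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK = "<mask>"  # illustrative placeholder, not a confirmed tokenizer symbol


def mask_sequence(seq, mask_rate=0.15, seed=0):
    """Hide a fraction of residues; return (masked tokens, {position: residue})."""
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unexpected residues: {sorted(unknown)}")
    rng = random.Random(seed)
    residues = list(seq)
    # Mask at least one position so there is always something to predict.
    n_mask = max(1, round(len(residues) * mask_rate))
    positions = sorted(rng.sample(range(len(residues)), n_mask))
    targets = {i: residues[i] for i in positions}
    for i in positions:
        residues[i] = MASK
    return residues, targets


# Hypothetical example sequence (22 residues).
seq = "MKTAYIAKQRQISFVKSHFSRQ"
masked, targets = mask_sequence(seq)
print(" ".join(masked))
print(targets)
```

A bidirectional encoder is trained to recover the hidden residues in `targets` from the context on both sides of each masked position.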


▶️ Datasets:

🟠 CELLxGENE is a collection of publicly available single-cell datasets collected by the CZI (Chan Zuckerberg Initiative) with a total volume of 24 million cells;


🟠 UniProt is a database of clustered sets of protein sequences from UniProtKB, created on the basis of translated genomic data.



🟡 Project page
🟡 Documentation
🖥 GitHub

@Machine_learn
polyBERT: a chemical language model to enable fully machine-driven ultrafast polymer informatics

https://www.nature.com/articles/s41467-023-39868-6.pdf

@Machine_learn
NPGPT: Natural Product-Like Compound Generation with GPT-based Chemical Language Models


https://arxiv.org/pdf/2411.12886

@Machine_learn
2DMatGMM: An open-source robust machine learning platform for real-time detection and classification of 2D material flakes

🖥 GitHub: https://github.com/jaluus/2dmatgmm

📕 Paper: https://arxiv.org/abs/2412.09333v1

⭐️ Dataset: https://paperswithcode.com/task/instance-segmentation

@Machine_learn
🔺Only 4 days left until this project starts!🔺🔸
OASIS Alzheimer's Detection

Large-scale brain MRI dataset for deep neural network analysis

About Dataset
The dataset used is the OASIS MRI dataset (https://sites.wustl.edu/oasisbrains/), which consists of 80,000 brain MRI images. The images have been divided into four classes based on Alzheimer's progression. The dataset aims to provide a valuable resource for analyzing and detecting early signs of Alzheimer's disease.

To make the dataset accessible, the original .img and .hdr files were converted into NIfTI format (.nii) using FSL (FMRIB Software Library). The converted MRI images of 461 patients have been uploaded to a GitHub repository, where they can be accessed in multiple parts.
For the neural network training, 2D images were used as input. The brain images were sliced along the z-axis into 256 pieces, and slices ranging from 100 to 160 were selected from each patient. This approach resulted in a comprehensive dataset for analysis.
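
The slicing step described above can be sketched roughly as follows. This is a minimal illustration, not the dataset author's script: loading the .nii file is omitted and a random NumPy array stands in for a scan, the helper names are hypothetical, the 100 to 160 range is treated as half-open (60 slices), and the min-max scaling to 8-bit is an assumed way to prepare slices for .jpg export.

```python
import numpy as np


def select_axial_slices(volume, start=100, stop=160):
    """Return the 2D slices volume[:, :, z] for start <= z < stop."""
    if volume.ndim != 3:
        raise ValueError("expected a 3D volume")
    depth = volume.shape[2]
    if not (0 <= start < stop <= depth):
        raise ValueError(f"slice range {start}..{stop} outside depth {depth}")
    return [volume[:, :, z] for z in range(start, stop)]


def to_uint8(slice_2d):
    """Min-max scale a slice to 0..255 so it can be saved as an 8-bit image."""
    lo, hi = float(slice_2d.min()), float(slice_2d.max())
    if hi == lo:
        # Constant slice: avoid division by zero.
        return np.zeros(slice_2d.shape, dtype=np.uint8)
    return ((slice_2d - lo) / (hi - lo) * 255.0).astype(np.uint8)


# Stand-in for one scan: a small in-plane size, 256 slices along z.
rng = np.random.default_rng(0)
volume = rng.random((8, 8, 256)).astype(np.float32)

slices = select_axial_slices(volume)
images = [to_uint8(s) for s in slices]
print(len(images), images[0].shape, images[0].dtype)
```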

Patient classification was performed based on the provided metadata and Clinical Dementia Rating (CDR) values, resulting in four classes: demented, very mild demented, mild demented, and non-demented. These classes enable the detection and study of different stages of Alzheimer's disease progression.
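
A label mapping of this kind could look like the sketch below. The post does not give the exact thresholds, so the mapping is an assumption based on the usual CDR scale (0 = none, 0.5 = very mild, 1 = mild, 2 = moderate); in particular, assigning CDR 2 to the "demented" class is a guess, and `classify_cdr` is a hypothetical helper.

```python
# Assumed CDR-to-class mapping; not confirmed by the dataset description.
CDR_TO_CLASS = {
    0.0: "non-demented",
    0.5: "very mild demented",
    1.0: "mild demented",
    2.0: "demented",
}


def classify_cdr(cdr):
    """Map a Clinical Dementia Rating value to one of the four class names."""
    key = float(cdr)
    if key not in CDR_TO_CLASS:
        raise ValueError(f"no class defined for CDR={cdr!r}")
    return CDR_TO_CLASS[key]


for value in (0, 0.5, 1, 2):
    print(value, "->", classify_cdr(value))
```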

During the dataset preparation, the .nii MRI scans were converted to .jpg files. Although this conversion presented some challenges, the files were successfully processed using appropriate tools. The resulting dataset size is 1.3 GB.

@Machine_learn
Forwarded from Papers
Hello, we are looking for a third author for the paper below.

Title: Hybrid deep learning and machine learning frameworks for air quality prediction during the COVID-19 pandemic

journal: https://www.sciencedirect.com/journal/expert-systems-with-applications
IF: 7.5
In this paper we examined 26 ensemble and hybrid models for predicting air quality over 1-day, 3-day, and 7-day horizons. To take part in this paper, send a message to my ID.


@Raminmousa
@Machine_learn
https://www.tg-me.com/+SP9l58Ta_zZmYmY0
WIS Python programming course started in 2024.04

📖 GitHub

@Machine_learn
Large language models (LLMs): survey, technical frameworks, and future challenges

https://link.springer.com/content/pdf/10.1007/s10462-024-10888-y.pdf

@Machine_learn
Forwarded from Papers
Hello. As a continuation of our joint research, we plan to start working on LLMs from 1 Dey.
This work will be carried out under the supervision of Professor
Rex (Zhitao) Ying.
link: https://scholar.google.com.au/citations?user=6fqNXooAAAAJ&hl=en
We need 2 collaborators.

BioPars: a pre-trained biomedical large language model for Persian biomedical text mining.
1- Initial stage: collecting Persian biological texts from sources (...)
2- Preprocessing and cleaning the texts
3- Training the selected transformers
4- Using the trained embeddings in three tasks (...)
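
For the text-cleaning step in the list above, a minimal sketch of Persian text normalization might look like this. The post does not describe the project's actual preprocessing, so everything here is an assumption: the `normalize_persian` helper is hypothetical, and the chosen rules (mapping Arabic yeh and kaf to their Persian forms, dropping diacritics and tatweel, collapsing whitespace) are just common first steps for Persian corpora.

```python
import re
import unicodedata

# Arabic code points that often appear in Persian text, mapped to Persian forms.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u064A": "\u06CC",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06A9",  # Arabic kaf -> Persian keheh
})

# Arabic diacritics (U+064B..U+0652) and tatweel (U+0640).
DIACRITICS_AND_TATWEEL = re.compile("[\u064B-\u0652\u0640]")


def normalize_persian(text):
    """Apply a few simple, assumed normalization rules to Persian text."""
    text = unicodedata.normalize("NFC", text)
    text = text.translate(ARABIC_TO_PERSIAN)
    text = DIACRITICS_AND_TATWEEL.sub("", text)
    # Collapse runs of whitespace (including newlines) into single spaces.
    return re.sub(r"\s+", " ", text).strip()


raw = "\u0643\u062A\u0627\u0628  \u0639\u0644\u0645\u064A"  # Arabic kaf and yeh, double space
print(normalize_persian(raw))
```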
Those who would like to participate can let me know by 1 Dey.
The server costs $1.2 per hour, and about 2,000 hours are needed to train the language model. In addition to carrying out the tasks, the cost for each author position is as follows:
🔹Fourth author: $500
🔺Fifth author: $400
@Raminmousa
@Machine_learn
https://www.tg-me.com/+SP9l58Ta_zZmYmY0
📃 Large language models and their applications in bioinformatics

📎 Study the paper

@Machine_learn
⚡️ Byte Latent Transformer: Patches Scale Better Than Tokens

The Byte Latent Transformer (BLT) is a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale, with significant improvements in inference efficiency and robustness.
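
To make "byte-level" and "patches" concrete, the sketch below turns a string into its UTF-8 bytes and groups them into fixed-size patches. This is a deliberate simplification and not the BLT method: the paper groups bytes into dynamically sized patches, whereas the fixed `patch_size` and the helper names here are assumptions for illustration only.

```python
def to_bytes(text):
    """Encode text as a list of UTF-8 byte values (0..255); no tokenizer needed."""
    return list(text.encode("utf-8"))


def fixed_patches(byte_values, patch_size=4):
    """Group bytes into consecutive fixed-size patches (the last may be shorter)."""
    if patch_size < 1:
        raise ValueError("patch_size must be positive")
    return [byte_values[i:i + patch_size]
            for i in range(0, len(byte_values), patch_size)]


byte_values = to_bytes("héllo")  # "é" takes two bytes in UTF-8
patches = fixed_patches(byte_values)
print(byte_values)
print(patches)
```

Because the input vocabulary is just the 256 possible byte values, any string can be represented without an out-of-vocabulary case; the patching step then controls how many units the larger model has to process.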

🖥 GitHub: https://github.com/facebookresearch/blt

📕 Paper: https://arxiv.org/abs/2412.09871v1

🌟 Dataset: https://paperswithcode.com/dataset/mmlu

@Machine_learn
📃A Comprehensive Survey on Automatic Knowledge Graph Construction

📎 Study paper

@Machine_learn
🀄 GuoFeng Webnovel: A Discourse-Level and Multilingual Corpus of Web Fiction

🖥 GitHub: https://github.com/longyuewangdcu/guofeng-webnovel

📕 Paper: https://arxiv.org/abs/2412.11732v1

🌟 Dataset: www2.statmt.org/wmt24/literary-trans

@Machine_learn
2025/02/23 22:02:40