Machine learning books and papers

OmniParser for Pure Vision Based GUI Agent

1 Aug 2024 · Yadong Lu, Jianwei Yang, Yelong Shen, Ahmed Awadallah

The recent success of large vision language models shows great potential in driving the agent system operating on user interfaces. However, we argue that the power of multimodal models like GPT-4V as a general agent on multiple operating systems across different applications is largely underestimated due to the lack of a robust screen parsing technique capable of: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen. To fill these gaps, we introduce OmniParser, a comprehensive method for parsing user interface screenshots into structured elements, which significantly enhances the ability of #GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface. We first curated an interactable icon detection dataset using popular webpages and an icon description dataset. These datasets were utilized to fine-tune specialized models: a detection model to parse interactable regions on the screen and a caption model to extract the functional semantics of the detected elements. #OmniParser significantly improves GPT-4V's performance on the ScreenSpot benchmark. On the #Mind2Web and AITW benchmarks, OmniParser with screenshot-only input #outperforms the GPT-4V baselines that require additional information beyond the screenshot.

Paper: https://arxiv.org/pdf/2408.00203v1.pdf

Code: https://github.com/microsoft/omniparser

Dataset: ScreenSpot

@Machine_learn

2.0K views · 11:19

Competitive Programming with Large Reasoning Models
OpenAI

▪ link

@Machine_learn

2.3K views · 17:14

The Pandas Workshop (2022).pdf

28.9 MB

The Pandas Workshop: A comprehensive guide to using Python for data analysis with real-world case studies

@Machine_learn

2.0K views · edited 07:17

Enhance-A-Video: Better Generated Video for Free

11 Feb 2025 · Yang Luo, Xuanlei Zhao, Mengzhao Chen, Kaipeng Zhang, Wenqi Shao, Kai Wang, Zhangyang Wang, Yang You

DiT-based video generation has achieved remarkable results, but research into enhancing existing models remains relatively unexplored. In this work, we introduce a training-free approach to enhance the coherence and quality of DiT-based generated videos, named Enhance-A-Video. The core idea is enhancing the cross-frame correlations based on non-diagonal temporal attention distributions. Thanks to its simple design, our approach can be easily applied to most DiT-based video generation frameworks without any retraining or fine-tuning. Across various DiT-based video generation models, our approach demonstrates promising improvements in both temporal consistency and visual quality. We hope this research can inspire future explorations in video generation enhancement.

Paper: https://arxiv.org/pdf/2502.07508v1.pdf

Code: https://github.com/NUS-HPC-AI-Lab/Enhance-A-Video
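The abstract above builds on the non-diagonal entries of a temporal attention map, that is, how much each frame attends to the other frames rather than to itself. A minimal sketch of that one quantity follows, assuming the attention map is a square frames-by-frames matrix of weights. The function name `cross_frame_intensity` and the toy matrices are hypothetical, and this is not the authors' implementation (see the linked repository for that).

```python
def cross_frame_intensity(attn):
    """Mean of the off-diagonal entries of a square temporal attention map.

    attn[i][j] is assumed to be the attention weight from frame i to frame j.
    The diagonal (i == j) is self-attention within a frame and is skipped.
    """
    n = len(attn)
    if n < 2 or any(len(row) != n for row in attn):
        raise ValueError("attn must be a square matrix with at least 2 frames")
    off_diag = [attn[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(off_diag) / len(off_diag)


# Toy maps for three frames (illustrative values only).
uniform = [[1 / 3] * 3 for _ in range(3)]      # every frame attends to all frames equally
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]                   # every frame attends only to itself

print(cross_frame_intensity(uniform))
print(cross_frame_intensity(identity))
```

A map that concentrates weight on the diagonal yields a low value, while a map that spreads weight across frames yields a higher one. How that value is then used to adjust the attention output is left to the paper.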

@Machine_learn

2.1K views · 07:19

Bayesian Sample Inference

🖥 Github: https://github.com/martenlienen/bsi

📕 Paper: https://arxiv.org/abs/2502.07580

🌟 Dataset: https://paperswithcode.com/dataset/cifar-10

@Machine_learn

1.8K views · 08:33

Forwarded from Papers

Greetings,
We need one person to help us with the topic below (first-author position).

🔸 Title: Chronic kidney disease classification: Deep ensemble approach
Target conference:

⭐️ https://saiconference.com/IntelliSys

⚙️ Abstract: Chronic kidney disease (CKD) is a progressive disease that may lead to kidney failure, so early diagnosis is crucial for proper management. This condition has a high mortality rate, especially in developing countries. CKD is often overlooked because there are no apparent symptoms in the early stages. Meanwhile, early diagnosis and timely clinical intervention are essential to reduce the progression of the disease. CKD diagnosis using deep learning (DL) and feature selection (FS) methods can be a useful application of artificial intelligence (AI) in healthcare. DL algorithms can provide cost-effective and efficient computer-aided diagnosis (CAD) to assist physicians. DL models are based on automatic feature selection.
In some cases, manual feature extraction can improve the results before the network learning process. This study aims to present an ensemble deep-learning model for CKD classification. The proposed method used Deep Embedded Clustering (DEC) as a similarity feature. Also, latent features obtained from the Gaussian Mixture Model (GMM) process were used. The proposed method on UCI databases achieved an accuracy of 1.0 using the Synthetic Minority Over-Sampling technique (SMOTE).
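The abstract names SMOTE as the oversampling step. As a rough illustration of the idea, the sketch below creates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote_sample`, its parameters, and the toy points are hypothetical; this is a simplified stand-in, not the authors' pipeline or a library implementation.

```python
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Generate n_new synthetic points from a list of minority-class points.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy two-feature minority class (illustrative values only).
minority_points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (3.0, 3.0)]
for point in smote_sample(minority_points):
    print(point)
```

Because every synthetic point is an interpolation of two real minority points, it always stays inside the range spanned by the original minority samples.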

Contributors also cover part of the publication cost. The related work and introduction sections are also the contributor's responsibility.
@Raminmousa
Papers channel: https://www.tg-me.com/+SP9l58Ta_zZmYmY0

Saiconference

Intelligent Systems Conference (IntelliSys) | Machine Learning, Computer Vision & Artificial Intelligence Conference | AI Conference

SAI Conferences presents leading Artificial Intelligence, Machine Learning, and Computer Vision Conferences, drawing experts from 55+ countries. Join us at this AI Conference on 28-29 August 2025 at Amsterdam, The Netherlands for innovative AI insights and…

2.1K views · 12:10

preprints202502.0982.v1.pdf

1018.1 KB

PKG-LLM: A Framework for Predicting GAD and MDD Using Knowledge Graphs and Large Language Models in Cognitive Neuroscience

Ali Sarabadani, Hadis Taherinia, Niloufar Ghadiri, Ehsan Karimi Shahmarvandi, Ramin Mousa*

Abstract

Purpose: This research project has a single purpose: the construction and evaluation of PKG-LLM, a knowledge graph framework intended primarily for cognitive neuroscience. It also aims to improve predictions of relationships among neurological entities and to improve named entity recognition (NER) and relation extraction (RE) from large neurological datasets. Employing GPT-4 and expert review, we aim to demonstrate how this framework may outperform traditional models in precision, recall, and F1 score, and to provide key insights into possible future clinical and research applications in neuroscience.

Method: PKG-LLM was evaluated on two main tasks: relation extraction (RE) and named entity recognition (NER). Both tasks processed data and obtained performance metrics, such as precision, recall, and F1-score, using GPT-4. In addition, an expert review process was integrated, in which neurologists and domain experts reviewed the extracted relationships and entities, improving the final performance metrics. Comparative performance was reported against StrokeKG and Heart Failure KG. PKG-LLM was also evaluated on link prediction using metrics such as Mean Rank (MR), Mean Reciprocal Rank (MRR), and Precision at K (P@K), against other link prediction models, including TransE, RotatE, DistMult, ComplEx, ConvE, and HolmE.

Findings: PKG-LLM demonstrated competitive performance in both relation extraction and named entity recognition tasks. In its traditional form, PKG-LLM achieved a precision of 75.45%, recall of 78.60%, and F1-score of 76.89% in relation extraction, which improved to 82.34%, 85.40%, and 83.85% after expert review. In named entity recognition, the traditional model scored 73.42% precision, 76.30% recall, and 74.84% F1-score, improving to 81.55%, 84.60%, and 82.99% after expert review. For link prediction, PKG-LLM achieved an MRR of 0.396, P@1 of 0.385, and P@10 of 0.531, placing it in a competitive range compared to models like TransE, RotatE, and ConvE.

Conclusion: This study showed that PKG-LLM mainly outperformed the existing models once expert review was added to its relation extraction and named entity recognition tasks. Further, the model's competitive edge in link prediction lends credence to its capability in knowledge graph construction and refinement in the field of cognitive neuroscience. PKG-LLM's superiority over existing models and its ability to generate more accurate results with clinical relevance indicate that it is a significant tool to augment neuroscience research and clinical applications. The evaluation process entailed using GPT-4 and expert review. This approach ensures that the resulting knowledge graph is scientifically compelling and practically beneficial in more advanced cognitive neuroscience research.

Link: https://www.preprints.org/manuscript/202502.0982/v1
@Machine_learn
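For readers unfamiliar with the link-prediction metrics named in the abstract above (MR, MRR, P@K), here is a minimal sketch of how they are commonly computed from the rank of the correct entity for each test query. It assumes P@K means the fraction of queries whose correct entity appears in the top K, which may differ from the paper's exact definition. The function name `ranking_metrics` and the example ranks are hypothetical and are not the paper's data.

```python
def ranking_metrics(ranks, ks=(1, 10)):
    """Compute Mean Rank, Mean Reciprocal Rank, and top-K hit rates.

    ranks: 1-based rank of the correct entity for each test query.
    ks:    cut-offs for the top-K hit rate (reported here as P@K, an assumption).
    """
    if not ranks or any(r < 1 for r in ranks):
        raise ValueError("ranks must be a non-empty list of positive integers")
    n = len(ranks)
    metrics = {
        "MR": sum(ranks) / n,                    # lower is better
        "MRR": sum(1.0 / r for r in ranks) / n,  # higher is better, at most 1
    }
    for k in ks:
        metrics[f"P@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics


# Four hypothetical test queries whose correct entities ranked 1st, 2nd, 4th, 20th.
print(ranking_metrics([1, 2, 4, 20]))
```

MRR rewards correct answers near the top much more strongly than MR does, which is why the two can disagree when a few queries rank very poorly.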

2.7K views · edited 16:22

📃 Methods of decomposition theory and graph labeling in the study of social network structure

📎 Study the paper

@Machine_learn

1.8K views · 07:05

Mathematics of Backpropagation Through Time.

📕 Paper

@Machine_learn

1.8K views · 08:27

OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia

Large Language Models (LLMs) have made significant progress in various downstream tasks, inspiring the development of Speech Understanding Language Models (SULMs) to enable comprehensive speech-based interactions. However, most advanced SULMs are developed by the industry, leveraging large-scale datasets and computational resources that are not readily available to the academic community. Moreover, the lack of transparency in training details creates additional barriers to further innovation. In this study, we present OSUM, an Open Speech Understanding Model designed to explore the potential of training SULMs under constrained academic resources. The OSUM model combines a Whisper encoder with a Qwen2 LLM and supports a wide range of speech tasks, including speech recognition (ASR), speech recognition with timestamps (SRWT), vocal event detection (VED), speech emotion recognition (SER), speaking style recognition (SSR), speaker gender classification (SGC), speaker age prediction (SAP), and speech-to-text chat (STTC). By employing an ASR+X training strategy, OSUM achieves efficient and stable multi-task training by simultaneously optimizing ASR alongside target tasks. Beyond delivering strong performance, OSUM emphasizes transparency by providing openly available data preparation and training methodologies, offering valuable insights and practical guidance for the academic community. By doing so, we aim to accelerate research and innovation in advanced SULM technologies.

Paper: https://arxiv.org/pdf/2501.13306v2.pdf

Code: https://github.com/aslp-lab/osum

Datasets: LibriSpeech - IEMOCAP

@Machine_learn

1.7K views · 08:28

The Data Science Design Manual

📓 Book

@Machine_learn

1.8K views · 17:57

Forwarded from Papers

One of the good tools I have been able to develop is Stock Ai. It uses 360 indicators. Backtest reports for this tool are available in the videos below.

May 2024 :

https://youtu.be/aSS99lynMFQ?si=QSk8VVKhLqO_2Qi3

July 2014:

https://youtu.be/ThyZ0mZwsGk?si=FKPK7Hkz-mRx-752&t=209

We are therefore planning to write a paper on this work. Work on the paper will start on 20 Esfand.
Anyone who can help in any way can sign up before the paper starts.
Author positions 3 and 5 are still open.

@Raminmousa

2.4K views · 17:58

🔥 The Ultra-Scale Playbook:

🔗 playbook

@Machine_learn

2.2K views · 09:44

Forwarded from Github LLMs

Tutorial: Train your own Reasoning model with GRPO

📓 Tutorial

https://www.tg-me.com/deep_learning_proj

2.3K views · 09:44

Probability and Statistics for Machine Learning.pdf

17.9 MB

Probability and Statistics for Machine Learning

@Machine_learn

2.2K views · edited 07:18

Forwarded from Github LLMs

Slamming: Training a Speech Language Model on One GPU in a Day

19 Feb 2025 · Gallil Maimon, Avishai Elmakies, Yossi Adi

We introduce Slam, a recipe for training high-quality Speech Language Models (SLMs) on a single academic GPU in 24 hours. We do so through empirical analysis of model initialisation and architecture, synthetic training data, preference optimisation with synthetic data, and tweaking all other components. We empirically demonstrate that this training recipe also scales well with more compute, getting results on par with leading SLMs at a fraction of the compute cost. We hope these insights will make SLM training and research more accessible. In the context of SLM scaling laws, our results far outperform predicted compute-optimal performance, giving an optimistic view of #SLM feasibility. See code, data, models, and samples at https://pages.cs.huji.ac.il/adiyoss-lab/slamming .

Paper: https://arxiv.org/pdf/2502.15814v1.pdf

Code: https://github.com/slp-rl/slamkit

https://www.tg-me.com/deep_learning_proj

2.0K views · 15:51

Forwarded from Papers

YouTube

May 2024 Backtest Smart AI Signal Telegram Channel #telegram_to_mt4 #telegramsignals

2.2K views · 15:53

Ramadan Kareem ❤️
@Machine_learn

2.4K views · edited 15:44

Forwarded from Papers

Greetings. We need a third author for the paper below.

Status: revised 🔥

💠 Advancements in Deep Learning for predicting Drug-Lipid interactions in liposomal drug delivery

🔹 Abstract

Liposomal drug delivery systems have improved cancer therapeutics by enhancing drug stability, allowing selective tissue targeting, and reducing off-target effects. One of the main problems, however, is how to maximize drug-lipid interaction as well as develop personalized treatment alternatives. Traditional methods in computational biology, such as molecular dynamics simulations, are useful but have challenges in their scalability and cost of computation. This study focuses on the use of deep learning algorithms, Graph Neural Networks (GNNs), Attention Mechanisms, and Physics-Informed Neural Networks (PINNs) for the prediction and optimization of drug-lipid interactions in liposomal formulations. These models are much more advanced, can handle complex datasets with simplified models, and recognize complicated interaction patterns while adhering to the necessary physics involved in the problem. We highlight the practicality of these models in predicting encapsulation efficiency, drug release kinetics, and developing controlled drug delivery systems for cancer treatment through several case studies. Also, the application of transfer learning and meta-learning improves model transferability in different drug-lipid matrices, which is a step towards personalized medicine. Our results highlight that the combination of deep learning with experimental and clinical evidence enhances predictive performance and expands scope, thereby facilitating the formulation of more exact and individualized treatment modalities. Such an interdisciplinary approach can greatly improve treatment efficacy and expand the horizons of precision medicine in the field of nanomedicine.

Keywords: Liposomal drug delivery, Deep Learning models, Drug-Lipid interactions, Physics-Informed Neural Networks (PINNs), Encapsulation efficiency, Personalized medicine, Nanomedicine.

Journal: https://link.springer.com/journal/11831
Impact factor: 9.9
To place an order, message my ID.
@Raminmousa
@Paper4money
@Machine_learn

SpringerLink

Archives of Computational Methods in Engineering

Archives of Computational Methods in Engineering is a forum for disseminating the state of the art on research and advanced practice in computational ...

2.1K views · 11:55