📄 How natural language processing derived techniques are used on biological data: a systematic review
📎 Study the paper
@Machine_learn
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation
Paper: https://arxiv.org/pdf/2502.05179v1.pdf
Code: https://github.com/foundationvision/flashvideo
@Machine_learn
Forwarded from Papers
Greetings. For one of our research projects on wound image classification, we need a third collaborator. In addition to contributing to the work, this person must also cover part of the server costs.
Journal: https://www.nature.com/srep/
For coordination, you can contact me via my Telegram ID.
@Raminmousa
CapsF: Capsule Fusion for Extracting psychiatric stressors for suicide from Twitter
Mohammad Ali Dadgostarnia, Ramin Mousa, Saba Hesaraki, Mahdi Hemmasian
https://www.sciencedirect.com/science/article/pii/S294971912500010X
@Machine_learn
OmniParser for Pure Vision Based GUI Agent
1 Aug 2024 · Yadong Lu, Jianwei Yang, Yelong Shen, Ahmed Awadallah
The recent success of large vision-language models shows great potential in driving agent systems that operate on user interfaces. However, we argue that the power of multimodal models like GPT-4V as a general agent across multiple operating systems and applications is largely underestimated due to the lack of a robust screen parsing technique capable of: 1) reliably identifying interactable icons within the user interface, and 2) understanding the semantics of various elements in a screenshot and accurately associating the intended action with the corresponding region on the screen. To fill these gaps, we introduce OmniParser, a comprehensive method for parsing user interface screenshots into structured elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface. We first curated an interactable icon detection dataset using popular webpages and an icon description dataset. These datasets were used to fine-tune specialized models: a detection model to parse interactable regions on the screen and a caption model to extract the functional semantics of the detected elements. OmniParser significantly improves GPT-4V's performance on the ScreenSpot benchmark. On the Mind2Web and AITW benchmarks, OmniParser with screenshot-only input outperforms GPT-4V baselines that require additional information beyond the screenshot.
Paper: https://arxiv.org/pdf/2408.00203v1.pdf
Code: https://github.com/microsoft/omniparser
Dataset: ScreenSpot
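To make the two-stage parsing concrete, here is a minimal, hypothetical Python sketch of the pipeline the abstract describes: a detector proposes interactable regions, a captioner describes each region, and the structured element list is serialized into a grounding prompt for the vision-language model. The function and class names are illustrative stand-ins, and the model calls are stubbed with dummy data.

```python
# Hypothetical sketch of an OmniParser-style pipeline; model calls are stubbed.
from dataclasses import dataclass

@dataclass
class UIElement:
    box: tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    caption: str                            # functional description of the region

def detect_interactable_regions(screenshot) -> list[tuple]:
    # Placeholder for the fine-tuned icon-detection model.
    return [(10, 10, 50, 30), (60, 10, 120, 30)]

def caption_region(screenshot, box) -> str:
    # Placeholder for the fine-tuned caption model.
    return "button: submits the login form"

def parse_screenshot(screenshot) -> list[UIElement]:
    boxes = detect_interactable_regions(screenshot)
    return [UIElement(box, caption_region(screenshot, box)) for box in boxes]

def build_agent_prompt(elements: list[UIElement], task: str) -> str:
    # The structured element list lets the model ground its action in a region ID.
    lines = [f"[{i}] box={e.box} desc={e.caption}" for i, e in enumerate(elements)]
    return f"Task: {task}\nUI elements:\n" + "\n".join(lines) + "\nAnswer with an element ID and an action."

if __name__ == "__main__":
    elements = parse_screenshot(screenshot=None)  # dummy input
    print(build_agent_prompt(elements, "log in to the site"))
```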
@Machine_learn
The Pandas Workshop (2022).pdf
28.9 MB
The Pandas Workshop: A comprehensive guide to using Python for data analysis with real-world case studies
@Machine_learn
Enhance-A-Video: Better Generated Video for Free
11 Feb 2025 · Yang Luo, Xuanlei Zhao, Mengzhao Chen, Kaipeng Zhang, Wenqi Shao, Kai Wang, Zhangyang Wang, Yang You
DiT-based video generation has achieved remarkable results, but research into enhancing existing models remains relatively unexplored. In this work, we introduce a training-free approach to enhance the coherence and quality of DiT-based generated videos, named Enhance-A-Video. The core idea is enhancing the cross-frame correlations based on non-diagonal temporal attention distributions. Thanks to its simple design, our approach can be easily applied to most DiT-based video generation frameworks without any retraining or fine-tuning. Across various DiT-based video generation models, our approach demonstrates promising improvements in both temporal consistency and visual quality. We hope this research can inspire future explorations in video generation enhancement.
Paper: https://arxiv.org/pdf/2502.07508v1.pdf
Code: https://github.com/NUS-HPC-AI-Lab/Enhance-A-Video
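As a rough illustration of the core idea (not the authors' code), the sketch below treats the mean of the non-diagonal entries of a temporal attention map as a cross-frame intensity and uses it to rescale those entries at inference time. The enhance_weight hyperparameter and the row renormalization step are assumptions.

```python
# Minimal sketch: boost non-diagonal temporal attention, training-free.
import torch

def enhance_temporal_attention(attn: torch.Tensor, enhance_weight: float = 1.0) -> torch.Tensor:
    """attn: (frames, frames) softmaxed temporal attention for one token."""
    f = attn.shape[-1]
    off_diag = ~torch.eye(f, dtype=torch.bool)
    # Cross-frame intensity: average attention paid to *other* frames.
    cfi = attn[off_diag].mean() * f / (f - 1)
    scale = 1.0 + enhance_weight * cfi           # training-free scaling factor
    enhanced = attn.clone()
    enhanced[off_diag] = attn[off_diag] * scale  # amplify cross-frame correlations
    # Renormalize rows so each still sums to 1.
    return enhanced / enhanced.sum(dim=-1, keepdim=True)

if __name__ == "__main__":
    attn = torch.softmax(torch.randn(8, 8), dim=-1)
    out = enhance_temporal_attention(attn, enhance_weight=0.5)
    print(out.sum(dim=-1))  # each row still sums to 1
```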
@Machine_learn
Bayesian Sample Inference
🖥 Github: https://github.com/martenlienen/bsi
📕 Paper: https://arxiv.org/abs/2502.07580
🌟 Dataset: https://paperswithcode.com/dataset/cifar-10
@Machine_learn
Forwarded from Papers
Greetings.
We need one person to help us with the topic below (first author slot).
🔸 🔸 🔸 🔸 🔸 🔸 🔸 🔸 🔸
Title: Chronic kidney disease classification: Deep ensemble approach
Target conference:
⭐️ https://saiconference.com/IntelliSys
⚙️ Abstract: Chronic kidney disease (CKD) is a progressive disease that may lead to kidney failure, so early diagnosis is crucial for proper management. This condition has a high mortality rate, especially in developing countries. CKD is often overlooked because there are no apparent symptoms in the early stages. Meanwhile, early diagnosis and timely clinical intervention are essential to reduce the progression of the disease. CKD diagnosis using deep learning (DL) and feature selection (FS) methods can be a useful application of artificial intelligence (AI) in healthcare. DL algorithms can provide cost-effective and efficient computer-aided diagnosis (CAD) to assist physicians. DL models are based on automatic feature selection.
In some cases, manual feature extraction can improve the results before the network learning process. This study aims to present an ensemble deep-learning model for CKD classification. The proposed method used Deep Embedded Clustering (DEC) as a similarity feature. Also, latent features obtained from the Gaussian Mixture Model (GMM) process were used. The proposed method on UCI databases achieved an accuracy of 1.0 using the Synthetic Minority Over-Sampling technique (SMOTE).
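A hedged sketch of the pipeline the abstract outlines: SMOTE class balancing, GMM-derived latent features appended to the raw features, and a classifier on top. The synthetic data and the random-forest stand-in for the deep ensemble are placeholders (DEC is omitted for brevity), so this only illustrates the shape of the method, not the authors' implementation.

```python
# Illustrative sketch: SMOTE + GMM latent features + classifier.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 24))                                 # stand-in for UCI CKD features
y = (X[:, 0] + 0.5 * rng.normal(size=400) > 0.8).astype(int)   # imbalanced labels

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)        # balance the classes

# GMM responsibilities serve as latent similarity features, concatenated to inputs.
gmm = GaussianMixture(n_components=4, random_state=0).fit(X_res)
X_aug = np.hstack([X_res, gmm.predict_proba(X_res)])

X_tr, X_te, y_tr, y_te = train_test_split(X_aug, y_res, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)   # placeholder for the deep ensemble
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```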
Collaborators will also cover part of the publication fee. The related work and introduction sections are the collaborator's responsibility as well.
@Raminmousa
Papers channel: https://www.tg-me.com/+SP9l58Ta_zZmYmY0
preprints202502.0982.v1.pdf
1018.1 KB
PKG-LLM: A Framework for Predicting GAD and MDD Using Knowledge Graphs and Large Language Models in Cognitive Neuroscience
Ali Sarabadani, Hadis Taherinia, Niloufar Ghadiri, Ehsan Karimi Shahmarvandi, Ramin Mousa *
Abstract
Purpose: This research project has a single purpose: the construction and evaluation of PKG-LLM, a knowledge graph framework intended primarily for cognitive neuroscience. It also aims to improve predictions of relationships among neurological entities and to improve named entity recognition (NER) and relation extraction (RE) from large neurological datasets. Employing GPT-4 and expert review, we aim to demonstrate how this framework may outperform traditional models in precision, recall, and F1 score, intending to provide key insights into possible future clinical and research applications in the field of neuroscience. Method: PKG-LLM was evaluated on two main tasks: relation extraction (RE) and named entity recognition (NER). Both tasks processed data and obtained performance metrics (precision, recall, and F1-score) using GPT-4. In addition, an expert review process, in which neurologists and domain experts reviewed the extracted relationships and entities, improved the final performance metrics. Comparative performance was reported against StrokeKG and Heart Failure KG. PKG-LLM was also evaluated on link prediction in cognition using metrics such as Mean Rank (MR), Mean Reciprocal Rank (MRR), and Precision at K (P@K), against other link prediction models including TransE, RotatE, DistMult, ComplEx, ConvE, and HolmE. Findings: PKG-LLM demonstrated competitive performance in both relation extraction and named entity recognition tasks. In its traditional form, PKG-LLM achieved a precision of 75.45%, recall of 78.60%, and F1-score of 76.89% in relation extraction, which improved to 82.34%, 85.40%, and 83.85% after expert review. In named entity recognition, the traditional model scored 73.42% precision, 76.30% recall, and 74.84% F1-score, improving to 81.55%, 84.60%, and 82.99% after expert review. For link prediction, PKG-LLM achieved an MRR of 0.396, P@1 of 0.385, and P@10 of 0.531, placing it in a competitive range compared to models like TransE, RotatE, and ConvE. Conclusion: This study showed that PKG-LLM outperformed existing models on relation extraction and named entity recognition tasks once expert review was added. Further, the model's competitive edge in link prediction lends credence to its capability for knowledge graph construction and refinement in cognitive neuroscience. PKG-LLM's superiority over existing models and its ability to generate more accurate, clinically relevant results indicate that it is a significant tool for augmenting neuroscience research and clinical applications. The evaluation process used GPT-4 and expert review, ensuring that the resulting knowledge graph is scientifically compelling and practically beneficial for more advanced cognitive neuroscience research.
Link: https://www.preprints.org/manuscript/202502.0982/v1
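For readers unfamiliar with the link-prediction metrics quoted above, the snippet below computes MR, MRR, and P@K (interpreted here as the fraction of true entities ranked within the top K) from a list of ranks. The example ranks are made up; only the metric definitions follow from the text.

```python
# MR, MRR, and P@K from the rank each true triple receives among candidates.
def mean_rank(ranks: list[int]) -> float:
    return sum(ranks) / len(ranks)

def mean_reciprocal_rank(ranks: list[int]) -> float:
    return sum(1.0 / r for r in ranks) / len(ranks)

def precision_at_k(ranks: list[int], k: int) -> float:
    # Fraction of queries whose true entity ranks within the top k.
    return sum(r <= k for r in ranks) / len(ranks)

if __name__ == "__main__":
    ranks = [1, 3, 2, 15, 1, 40, 7]  # hypothetical ranks of the true entities
    print("MR  :", mean_rank(ranks))
    print("MRR :", mean_reciprocal_rank(ranks))
    print("P@1 :", precision_at_k(ranks, 1))
    print("P@10:", precision_at_k(ranks, 10))
```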
@Machine_learn
📃 Methods of decomposition theory and graph labeling in the study of social network structure
📎 Study the paper
@Machine_learn
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia
Large Language Models (LLMs) have made significant progress in various downstream tasks, inspiring the development of Speech Understanding Language Models (SULMs) to enable comprehensive speech-based interactions. However, most advanced SULMs are developed by industry, leveraging large-scale datasets and computational resources that are not readily available to the academic community. Moreover, the lack of transparency in training details creates additional barriers to further innovation. In this study, we present OSUM, an Open Speech Understanding Model designed to explore the potential of training SULMs under constrained academic resources. The OSUM model combines a Whisper encoder with a Qwen2 LLM and supports a wide range of speech tasks, including speech recognition (ASR), speech recognition with timestamps (SRWT), vocal event detection (VED), speech emotion recognition (SER), speaking style recognition (SSR), speaker gender classification (SGC), speaker age prediction (SAP), and speech-to-text chat (STTC). By employing an ASR+X training strategy, OSUM achieves efficient and stable multi-task training by simultaneously optimizing ASR alongside target tasks. Beyond delivering strong performance, OSUM emphasizes transparency by providing openly available data preparation and training methodologies, offering valuable insights and practical guidance for the academic community. By doing so, we aim to accelerate research and innovation in advanced SULM technologies.
Paper: https://arxiv.org/pdf/2501.13306v2.pdf
Code: https://github.com/aslp-lab/osum
Datasets: LibriSpeech - IEMOCAP
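A hedged toy sketch of the ASR+X strategy described above: each batch jointly optimizes an ASR objective and one target-task objective (here, emotion recognition as the "X") by summing their losses. The tiny GRU encoder and linear heads are placeholder stand-ins for the Whisper encoder and Qwen2 LLM, not the OSUM implementation.

```python
# Toy ASR+X multi-task step: ASR loss and target-task loss are summed.
import torch
import torch.nn as nn

class ToySULM(nn.Module):
    def __init__(self, dim=32, vocab=100, n_emotions=5):
        super().__init__()
        self.encoder = nn.GRU(16, dim, batch_first=True)  # stand-in for the Whisper encoder
        self.asr_head = nn.Linear(dim, vocab)             # stand-in for the LLM's ASR decoding
        self.task_head = nn.Linear(dim, n_emotions)       # the "X" head, e.g. SER

    def forward(self, speech):
        h, _ = self.encoder(speech)                       # (batch, frames, dim)
        return self.asr_head(h), self.task_head(h.mean(dim=1))

model = ToySULM()
speech = torch.randn(4, 50, 16)                 # (batch, frames, acoustic features)
asr_targets = torch.randint(0, 100, (4, 50))    # per-frame token targets (toy)
task_targets = torch.randint(0, 5, (4,))        # one emotion label per utterance

asr_logits, task_logits = model(speech)
ce = nn.CrossEntropyLoss()
loss = ce(asr_logits.transpose(1, 2), asr_targets) + ce(task_logits, task_targets)
loss.backward()                                 # one joint ASR+X update
print(float(loss))
```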
@Machine_learn
Forwarded from Papers
One of the good tools I have been able to develop is Stock Ai. It uses 360 indicators. Backtest reports for the tool are available in the videos below.
May 2024:
https://youtu.be/aSS99lynMFQ?si=QSk8VVKhLqO_2Qi3
July 2024:
https://youtu.be/ThyZ0mZwsGk?si=FKPK7Hkz-mRx-752&t=209
We are therefore going to write a paper on this work; writing begins on Esfand 20.
Anyone who can help in any way can sign up before the paper work starts.
The third and fifth author slots are still open.
@Raminmousa
Forwarded from Github LLMs
Tutorial: Train your own Reasoning model with GRPO
📓 Tutorial
https://www.tg-me.com/deep_learning_proj