Machine learning books and papers pinned «Only two days remain until this project starts. Anyone interested in collaborating, please message me: @Raminmousa»
Forwarded from Github LLMs
Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap
Large language models (LLMs) have not only revolutionized natural language processing but also extended their prowess to various domains, marking a significant stride towards artificial general intelligence. LLMs and evolutionary algorithms (EAs), despite differing in objectives and methodologies, share a common pursuit of applicability to complex problems. EAs can provide an optimization framework for further enhancing LLMs under black-box settings, empowering LLMs with flexible global search capacities. Conversely, the abundant domain knowledge inherent in LLMs can enable EAs to conduct more intelligent searches, and the text-processing and generative capabilities of LLMs aid in deploying EAs across a wide range of tasks. Based on these complementary advantages, this paper provides a thorough review and a forward-looking roadmap, categorizing the reciprocal inspiration into two main avenues: LLM-enhanced EA and EA-enhanced #LLM. Several integrated synergy methods are further introduced to exemplify the complementarity between LLMs and EAs in diverse scenarios, including code generation, software engineering, neural architecture search, and various generation tasks. As the first comprehensive review focused on EA research in the era of #LLMs, this paper provides a foundational stepping stone for understanding the collaborative potential of LLMs and EAs. The identified challenges and future directions offer guidance for researchers and practitioners seeking to unlock the full potential of this collaboration in advancing optimization and artificial intelligence.
Paper: https://arxiv.org/pdf/2401.10034v3.pdf
Code: https://github.com/wuxingyu-ai/llm4ec
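As a toy illustration of the "LLM-enhanced EA" direction the survey describes, the sketch below runs a plain evolutionary loop in which the variation operator is a stub: in a real system, `llm_mutate` would prompt a language model with the parent solution and its fitness and parse the suggested offspring. The target string, the operator, and all names here are illustrative assumptions, not taken from the paper.

```python
import random

random.seed(0)

TARGET = "hello world"
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def fitness(s: str) -> int:
    # Number of positions matching the target string.
    return sum(a == b for a, b in zip(s, TARGET))

def llm_mutate(parent: str) -> str:
    # Hypothetical stand-in for an LLM call: here it just edits one
    # random character; a real system would ask the model for a variant.
    i = random.randrange(len(parent))
    return parent[:i] + random.choice(ALPHABET) + parent[i + 1:]

def evolve(generations: int = 500, pop_size: int = 20) -> str:
    pop = ["".join(random.choice(ALPHABET) for _ in range(len(TARGET)))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]            # truncation selection (elitist)
        children = [llm_mutate(random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print(best, fitness(best))
```

Because the selection step is elitist, the best fitness never decreases; swapping the stub for a genuine LLM prompt is the only change needed to turn this into the black-box setting the survey discusses.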
https://www.tg-me.com/deep_learning_proj
Hello everyone, our new project has started.
The main goal of this project is to train a model-recommender for medical image classification problems that avoids retraining models; it is motivated by saving both training energy and training time. To this end, 5,000 papers in this area have been collected. Further details are available at the GitHub link.
Project Title: MedRec: Medical recommender system for image classification without retraining
Github: https://github.com/Ramin1Mousa/MedicalRec
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Impact factor: 20.8
Seven more people can join this project. Each person will need to review the data of roughly 400 papers, at about 5-10 minutes per paper. Participation fees for authorship positions:
🔹 2- 600$❌
🔺 3- 500$❌
💠 4- 400$✅
🔺 5- 300$✅
🔹 6- 200$❌
🔸 7- 200$❌
To participate, message me directly.
Only positions 4 and 5 remain!
@Raminmousa
Machine learning books and papers pinned «Hello everyone, our new project has started. The main goal of this project is to train a model-recommender for medical image classification that avoids retraining models; it is motivated by saving training energy and training time. To this…»
DeepSeek-V3 Technical Report
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture, which were thoroughly validated in #DeepSeek V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse, high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. Its training process is also remarkably stable: throughout the entire run we experienced no irrecoverable loss spikes and performed no rollbacks. The model checkpoints are available at https://github.com/deepseek-ai/DeepSeek-V3.
Paper: https://arxiv.org/pdf/2412.19437v1.pdf
Code: https://github.com/deepseek-ai/deepseek-v3
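For readers unfamiliar with MoE layers, here is a minimal, generic top-k routing sketch in NumPy: each token's router logits pick k experts, and their outputs are mixed with softmax weights. This is not DeepSeek's actual implementation (which uses DeepSeekMoE's fine-grained experts, MLA, and auxiliary-loss-free balancing); all shapes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k_routing(x, gate_w, experts, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x:       (tokens, d_model) token activations
    gate_w:  (d_model, n_experts) router weight matrix
    experts: list of (d_model, d_model) weight matrices, one per expert
    """
    logits = x @ gate_w                            # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]     # indices of the k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = topk[t]
        # Softmax over the selected experts' logits only.
        w = np.exp(logits[t, sel] - logits[t, sel].max())
        w /= w.sum()
        for weight, e in zip(w, sel):
            out[t] += weight * (x[t] @ experts[e])
    return out

d_model, n_experts, tokens = 8, 4, 3
x = rng.standard_normal((tokens, d_model))
gate_w = rng.standard_normal((d_model, n_experts))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
y = top_k_routing(x, gate_w, experts, k=2)
print(y.shape)  # (3, 8)
```

The point of the "37B of 671B activated" figure above is visible here: only k of the n_experts weight matrices are touched per token, so compute scales with k, not with total parameter count.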
@Machine_learn
Positions 4 and 5 on this project are still open. Anyone willing to collaborate, please message me.
@Raminmousa
📄 A comprehensive bibliometric analysis on social network anonymization: current approaches and future directions
📎 Study the paper
@Machine_learn
Title: Breast Cancer Ultrasound Image Segmentation Using Improved 3DUnet++
🔹 🔹 🔹 🔹 🔹 🔹 🔹 🔹
Author: @Raminmousa
🔹 🔹 🔹 🔹 🔹 🔹 🔹 🔹
Cite: https://doi.org/10.1016/j.wfumbo.2024.100068
ABSTRACT: Breast cancer is the most common cancer and the main cause of cancer-related deaths in women worldwide. Early detection reduces the number of deaths. Automated breast ultrasound (ABUS) is a new and promising screening method for examining the entire breast. Volumetric ABUS examination is time-consuming, and lesions may be missed during the examination; computer-aided cancer diagnosis in ABUS volumes is therefore highly desirable to assist physicians in breast cancer screening. In this research, we present 3D structures based on UNet, ResUNet, and UNet++ for the automatic detection of cancer in ABUS volumes, speeding up examination while providing high detection sensitivity with few false positives (FPs). The three approaches were evaluated on identical training and test data with comparable hyperparameters. Among them, UNet++ achieved the most acceptable results on both the classification and segmentation problems: on the dataset of the Tumor Segmentation, Classification, and Detection Challenge on Automated 3D Breast Ultrasound 2023 (TSCD-ABUS2023), it reached Accuracy=0.9911 and AUROC=0.9761 in classification and Dice=0.4930 in segmentation.
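The Dice score quoted above measures volumetric overlap between a predicted and a ground-truth binary mask. A minimal sketch (illustrative only, not the paper's evaluation code; the toy masks are made up):

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between two binary masks, given as flat 0/1 sequences.

    Dice = 2|P ∩ T| / (|P| + |T|); eps guards against empty masks.
    """
    inter = sum(p * t for p, t in zip(pred, target))
    return (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)

pred   = [0, 1, 1, 0, 1, 0]   # toy predicted mask (flattened)
target = [0, 1, 0, 0, 1, 1]   # toy ground-truth mask (flattened)
print(round(dice_coefficient(pred, target), 3))  # → 0.667
```

A Dice of 1.0 means perfect overlap, so the 0.4930 reported above indicates the segmentation task remains substantially harder than the classification task on this benchmark.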
#Accepted✅
@Machine_learn
MiniCPM-V: A GPT-4V Level MLLM on Your Phone
The recent surge of Multimodal Large Language Models (MLLMs) has fundamentally reshaped the landscape of #AI research and industry, shedding light on a promising path toward the next AI milestone. However, significant challenges remain that prevent MLLMs from being practical in real-world applications. The most notable is the huge cost of running an MLLM with a massive number of parameters and extensive computation. As a result, most MLLMs must be deployed on high-performing cloud servers, which greatly limits their use in mobile, offline, energy-sensitive, and privacy-protective scenarios. In this work, we present MiniCPM-V, a series of efficient #MLLMs deployable on end-side devices. By integrating the latest MLLM techniques in architecture, pretraining, and alignment, the latest MiniCPM-Llama3-V 2.5 has several notable features: (1) strong performance, outperforming GPT-4V-1106, Gemini Pro, and Claude 3 on OpenCompass, a comprehensive evaluation over 11 popular benchmarks; (2) strong #OCR capability and 1.8M-pixel high-resolution #image perception at any aspect ratio; (3) trustworthy behavior with low hallucination rates; (4) multilingual support for 30+ languages; and (5) efficient deployment on mobile phones. More importantly, MiniCPM-V can be viewed as a representative example of a promising trend: the model sizes needed for usable (e.g., GPT-4V-level) performance are rapidly decreasing, alongside fast growth in end-side computation capacity. Together these show that GPT-4V-level MLLMs deployed on end devices are becoming increasingly possible, unlocking a wider spectrum of real-world AI applications in the near future.
Paper: https://arxiv.org/pdf/2408.01800v1.pdf
Codes:
https://github.com/OpenBMB/MiniCPM-o
https://github.com/openbmb/minicpm-v
Datasets: Video-MME
@Machine_learn
Transformer²: Self-adaptive LLMs
Paper: https://arxiv.org/pdf/2501.06252v2.pdf
Code:
https://github.com/SakanaAI/self-adaptive-llms
https://github.com/codelion/adaptive-classifier
Datasets: GSM8K - HumanEval - MATH
MBPP - TextVQA - OK-VQA - ARC (AI2 Reasoning Challenge)
@Machine_learn
🧑🍳 New Cookbook guide: How to use the Usage API and Cost API to monitor your OpenAI usage
📚 Book
@Machine_learn
git clone https://github.com/DepthAnything/Video-Depth-Anything
cd Video-Depth-Anything
pip install -r requirements.txt
▪GitHub
▪Paper
▪Model Small
▪Model Large
▪Demo
@Machine_learn
Mathematics of Machine Learning.pdf
3.9 MB
📚 Mathematics of Machine Learning
👨🏻🏫 Philipp Christian Petersen
📝 Table of Contents:
● Language of Machine Learning
● ML Mathematical Framework
● Rademacher Complexities
● Rademacher Complexities Applications
● The Mysterious Machine
● Lower Bounds on Learning
● Model Selection
● Regression and Regularization
● Freezing Fritz
● Support Vector Machines
● Kernel Methods
● Nearest Neighbour
● Neural Networks
● Boosting
● Clustering
● Dimensionality Reduction
@Machine_learn
Hello everyone, our new project has started.
The main goal of this project is to train a model-recommender for medical image classification problems that avoids retraining models; it is motivated by saving both training energy and training time. To this end, 5,000 papers in this area have been collected. Further details are available at the GitHub link.
Project Title: MedRec: Medical recommender system for image classification without retraining
Github: https://github.com/Ramin1Mousa/MedicalRec
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Impact factor: 20.8
Seven more people can join this project. Each person will need to review the data of roughly 400 papers, at about 5-10 minutes per paper. Participation fees for authorship positions:
🔹 2- 600$
🔺 3- 500$
💠 4- 400$
🔺 5- 300$
🔹 6- 200$
🔸 7- 200$
To participate, message me directly.
🔹 The project starts on Saturday 🔹
@Raminmousa