Forwarded from Papers
Hello. Many friends who want to start writing a paper need a roadmap to get going. For that reason, we plan to offer consulting on the topics below. I will handle topic selection, idea selection, review of the paper's overall structure, and journal selection, and we will hold one session every week to review the work done so far. The consulting fee for each topic is 5 toman.

💠Medical Image

1- Alzheimer's disease classification
2- Wound image classification
3- Skin cancer classification
4- Breast cancer segmentation

💠Time series:
1- Crypto market price prediction: high-dimensional features
2- Crypto market illiquidity prediction: high-dimensional features
3- Air quality prediction
4- Network traffic prediction
5- Malware detection

💠Text mining
1- Large language models: systematic survey
2- Multi-domain sentiment analysis
3- Extracting psychiatric stressors for suicide from Twitter
To take part, you can contact me through my ID.
@Raminmousa
Machine learning books and papers pinned «Hello. Many friends who want to start writing a paper need a roadmap to get going. For that reason, we plan to offer consulting on the topics below. I will handle topic selection, idea selection, review of the paper's overall structure, and journal selection, and one session every week…»
📚 Linear Algebra for Computer Vision,
Robotics, and Machine Learning

👉 Book

@Machine_learn
ML Cheat Sheet.pdf
6.2 MB
📃 Cheat Sheet for Machine Learning

@Machine_learn
Detecting Backdoor Samples in Contrastive Language Image Pretraining

3 Feb 2025 · Hanxun Huang, Sarah Erfani, Yige Li, Xingjun Ma, James Bailey ·

Contrastive language-image pretraining (CLIP) has been found to be vulnerable to poisoning backdoor attacks, where the adversary can achieve an almost perfect attack success rate on CLIP models by poisoning only 0.01% of the training dataset. This raises security concerns about the current practice of pretraining large-scale models on unscrutinized web data using CLIP. In this work, we analyze the representations of backdoor-poisoned samples learned by CLIP models and find that they exhibit unique characteristics in their local subspace, i.e., their local neighborhoods are far sparser than those of clean samples. Based on this finding, we conduct a systematic study on detecting CLIP backdoor attacks and show that these attacks can be easily and efficiently detected by traditional density ratio-based local outlier detectors, whereas existing backdoor sample detection methods fail. Our experiments also reveal that an unintentional backdoor already exists in the original CC3M dataset and has been trained into a popular open-source model released by OpenCLIP. Based on our detector, one can clean up a million-scale web dataset (e.g., CC3M) efficiently within 15 minutes using 4 Nvidia A100 GPUs.
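
To make the "sparse local neighborhood" idea concrete, here is a minimal sketch of a density-ratio style local outlier score. It is not the authors' detector: it is a simplified relative of the Local Outlier Factor, run on synthetic 2-D points standing in for CLIP embeddings. The function names, the value of `k`, and the toy data are all assumptions made for illustration.

```python
import math
import random

def knn(points, i, k):
    """Return (distance, index) pairs for the k nearest neighbours of points[i]."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    dists.sort()
    return dists[:k]

def local_outlier_scores(points, k=5):
    """Simplified density-ratio score (an assumption, not the paper's method):
    a point's mean k-NN distance divided by the average of its neighbours'
    mean k-NN distances. Scores well above 1 mean the point sits in a
    neighbourhood sparser than the neighbourhoods around it."""
    neighbours = [knn(points, i, k) for i in range(len(points))]
    mean_dist = [sum(d for d, _ in nb) / k for nb in neighbours]
    scores = []
    for i, nb in enumerate(neighbours):
        ref = sum(mean_dist[j] for _, j in nb) / k
        scores.append(mean_dist[i] / ref if ref > 0 else 1.0)
    return scores

# Synthetic stand-ins: 200 "clean" points in one dense cluster, plus two
# isolated points playing the role of samples with sparse neighbourhoods.
random.seed(0)
clean = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
suspect = [(20.0, 20.0), (-20.0, 20.0)]
embeddings = clean + suspect

scores = local_outlier_scores(embeddings, k=5)
ranked = sorted(range(len(embeddings)), key=lambda i: scores[i], reverse=True)
top2 = set(ranked[:2])  # indices with the highest outlier scores
```

In this toy setup the two isolated points receive the highest scores. On real embeddings one would pick a threshold or a removal budget, and a library implementation with an approximate nearest-neighbour index would replace the quadratic search used here.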

Paper: https://arxiv.org/pdf/2502.01385v1.pdf

Code: https://github.com/HanxunH/Detect-CLIP-Backdoor-Samples

Datasets: Conceptual Captions - CC12M - RedCaps

@Machine_learn
Efficient Reasoning with Hidden Thinking



Chain-of-Thought (CoT) reasoning has become a powerful framework for improving complex problem-solving capabilities in Multimodal Large Language Models (MLLMs). However, the verbose nature of textual reasoning introduces significant inefficiencies. In this work, we propose Heima (as hidden llama), an efficient reasoning framework that leverages reasoning CoTs at hidden latent space. We design the Heima Encoder to condense each intermediate CoT into a compact, higher-level hidden representation using a single thinking token, effectively minimizing verbosity and reducing the overall number of tokens required during the reasoning process. Meanwhile, we design a corresponding Heima Decoder with traditional Large Language Models (LLMs) to adaptively interpret the hidden representations into variable-length textual sequences, reconstructing reasoning processes that closely resemble the original CoTs. Experimental results across diverse reasoning MLLM benchmarks demonstrate that the Heima model achieves higher generation efficiency while maintaining or even improving zero-shot task accuracy. Moreover, the effective reconstruction of multimodal reasoning processes with the Heima Decoder validates both the robustness and interpretability of our approach.

Paper: https://arxiv.org/pdf/2501.19201v1.pdf

Code: https://github.com/shawnricecake/heima

Datasets: MMBench - MM-Vet - MathVista - MMStar - HallusionBench



@Machine_learn
Forwarded from Papers
Hello.
We need a third and a fourth author to be listed on the paper below. We have been working on this paper for 8 months.

Title: Gaussian Mixture latent for Recurrent Neural Networks Basic deficiencies


Time series prediction analyzes patterns in past data to predict the future. Traditional machine learning algorithms, despite achieving impressive results, require manual feature selection. Automatic feature selection, combined with the notion of time built into deep recurrent networks, has led to more suitable solutions. In deep recurrent networks, the order in which features are presented leads to different results because training relies on backpropagation. Selecting the feature order is an NP-complete problem. This research aims to provide a solution that mitigates this problem...

Journal: Expert Systems with Applications
The fee is 500 dollars for the third author and 400 dollars for the fourth author. To sign up, contact me through my ID.
@Raminmousa
@Machine_learn
Forwarded from Papers
Hello. We need a second author for one of our papers. The submission window is from tonight until tomorrow night.

Time-series Forecasting of Bitcoin Prices and Illiquidity using High-dimensional Features: XGBoostLSTM Approach


Corresponding author: Ramin Mousa

Abstract: Liquidity is the ease of converting an asset into cash or another asset without loss, and is shown by the relationship between the time scale and the price scale of an investment. This article examines the relationship between Bitcoin's price prediction and illiquidity. Bitcoin Hash Rate information was collected in three different intervals, and three techniques of feature selection (FS), Filter, Wrapper, and Embedded, were used. Considering the regression nature of illiquidity prediction, an approach based on LSTM network and XGBoost was proposed. LSTM was used to extract time series features, and XGBoost was used to learn these features. The proposed LSTMXGBoost approach was evaluated in two modes: price prediction and illiquidity prediction. This approach achieved MAE 1.60 in the next-day forecast and MAE 3.46 in the next-day illiquidity forecast. In the cross-validation of the proposed approach on the FS approaches, the best result was obtained in the prediction by the filter approach and in the classification by the wrapper approach. These obtained results indicate that the presented models outperform the existing models in the literature. Examining the confusion matrices indicates that the two tasks of price prediction and illiquidity prediction have no correlation and harm each other.
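
As a rough illustration of two ingredients named in the abstract, the sketch below shows a filter-style feature selection step (ranking features by absolute Pearson correlation with the target) and the MAE metric. It is not the authors' pipeline: the abstract does not say which filter criterion was used, so the correlation ranking is an assumption, and the feature names and numbers are made up for the example. The LSTM and XGBoost stages are deliberately left out.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def filter_select(features, target, top_k):
    """Filter-style selection: score each feature independently of any model
    and keep the top_k by absolute correlation with the target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:top_k]

def mae(y_true, y_pred):
    """Mean absolute error: average of |y_true - y_pred|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy data; names and values are illustrative only.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "hash_rate": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly linear in the target
    "noise": [5.0, 1.0, 4.0, 2.0, 3.0],      # weakly related to the target
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],   # carries no information
}

selected = filter_select(features, target, top_k=1)
error = mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

A wrapper method would instead score feature subsets by retraining the downstream model on each candidate subset, which costs more but accounts for interactions between features.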

Keywords: illiquidity prediction, Bitcoin hash rate, hybrid model, price prediction, LSTMXGBoost
Submission journal
Journal: Financial Innovation (Springer)
IF: 6.5
Friends who work on time series can take part in this paper.

@Raminmousa
Data Science and Data Analytics (en).pdf
23.9 MB
DATA SCIENCE AND DATA ANALYTICS: OPPORTUNITIES AND CHALLENGES

Edited by
Amit Kumar Tyagi
#Book

@Machine_learn
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models

22 Feb 2024 · Yijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu, Omar Khattab, Monica S. Lam ·

We study how to apply large language models to write grounded and organized long-form articles from scratch, with comparable breadth and depth to Wikipedia pages. This underexplored problem poses new challenges at the pre-writing stage, including how to research the topic and prepare an outline prior to writing. We propose STORM, a writing system for the Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking. STORM models the pre-writing stage by (1) discovering diverse perspectives in researching the given topic, (2) simulating conversations where writers carrying different perspectives pose questions to a topic expert grounded on trusted Internet sources, (3) curating the collected information to create an outline. For evaluation, we curate FreshWiki, a dataset of recent high-quality Wikipedia articles, and formulate outline assessments to evaluate the pre-writing stage. We further gather feedback from experienced Wikipedia editors. Compared to articles generated by an outline-driven retrieval-augmented baseline, more of STORM's articles are deemed to be organized (by a 25% absolute increase) and broad in coverage (by 10%). The expert feedback also helps identify new challenges for generating grounded long articles, such as source bias transfer and over-association of unrelated facts.
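
The three pre-writing stages described above can be sketched as a small control-flow skeleton. This is not the STORM codebase or its API: `prewrite`, `discover`, `ask`, and `answer` are hypothetical names, and the toy functions at the bottom stand in for what would be LLM calls and retrieval against trusted sources.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (question, answer)

def prewrite(
    topic: str,
    discover: Callable[[str], List[str]],
    ask: Callable[[str, str, List[Turn]], str],
    answer: Callable[[str], str],
    rounds: int = 2,
) -> List[str]:
    """Hypothetical skeleton of the pre-writing stage; all callables are injected."""
    outline: List[str] = []
    for perspective in discover(topic):      # (1) discover perspectives
        history: List[Turn] = []
        for _ in range(rounds):              # (2) simulated writer/expert conversation
            question = ask(topic, perspective, history)
            history.append((question, answer(question)))
        # (3) curate: here, naively one section per perspective
        # and one bullet per question asked
        outline.append(f"# {perspective}")
        outline.extend(f"- {q}" for q, _ in history)
    return outline

# Toy stand-ins so the skeleton runs without any model or network access.
def toy_discover(topic: str) -> List[str]:
    return ["history", "applications"]

def toy_ask(topic: str, perspective: str, history: List[Turn]) -> str:
    return f"Q{len(history) + 1} about the {perspective} of {topic}?"

def toy_answer(question: str) -> str:
    return f"(sourced answer to: {question})"

outline = prewrite("capsule networks", toy_discover, toy_ask, toy_answer)
```

Passing the conversation history into `ask` is what lets each follow-up question build on earlier answers. A real curation step would merge and reorder the collected material rather than list questions verbatim.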

Paper: https://arxiv.org/pdf/2402.14207v2.pdf

Codes:
https://github.com/assafelovic/gpt-researcher
https://github.com/stanford-oval/storm



@Machine_learn
🖥 Competitive Programming with Large Reasoning Models

📚Article

@Machine_learn
Forwarded from Papers
One of the good tools I have managed to develop is Stock Ai. This tool uses 360 indicators. Back-test reports for the tool are available in the videos below.

May 2024 :

https://youtu.be/aSS99lynMFQ?si=QSk8VVKhLqO_2Qi3

July 2014:

https://youtu.be/ThyZ0mZwsGk?si=FKPK7Hkz-mRx-752&t=209

For this reason, we will try to write a paper on this work. Work on the paper will begin on 20 Esfand.
Friends who can help in any way can sign up until the paper starts.
@Raminmousa
CapsF_Capsule_Fusion_for_Extracting_Psychiatric_Stressors_for_Suicide.pdf
466.5 KB
CapsF: Capsule Fusion for Extracting Psychiatric Stressors for Suicide From Twitter

Authors:
Mohammad Ali Dadgostarnia, Ramin Mousa, Saba Hesaraki, Mahdi Hemmasian


Accepted
Journal: Natural Language Processing

@Machine_learn
Machine learning books and papers
CapsF_Capsule_Fusion_for_Extracting_Psychiatric_Stressors_for_Suicide.pdf
In this project we examined the stressors involved in suicide. This is the first work of its kind developed for the Persian language. Thank you to the friends who collaborated on this project ❤️
2025/02/22 21:57:08