Database Querying Using SQL.pdf
136.4 KB
Notes on SQL for data management and analysis, including queries and integration with R, from University of South Carolina.
❀2πŸ‘1
πŸ“š Data Science Riddle

A business team wants interpretable insights, not just predictions. What's the best model to start with?
Anonymous Quiz
33%
Random Forest
36%
Logistic Regression
13%
XGBoost
19%
Deep Neural Net
Top Data Science Tools By Function
❀3πŸ‘1
Forwarded from Cool GitHub repositories
lerobot

This is an end-to-end library for robot learning. It handles the entire pipeline from loading and processing robotics datasets to training policies and deploying them in simulation or on real hardware.

Creator:   huggingface
Stars ⭐️:  19,000
Forked by: 3,000

Github Repo:
https://github.com/huggingface/lerobot

#robotics #AI
βž–βž–βž–βž–βž–βž–βž–βž–βž–βž–βž–βž–βž–βž–    
Join @github_repositories_bds for more cool repositories. This channel belongs to @bigdataspecialist group
❀3
Descriptive Statistics and Exploratory Data Analysis.pdf
1 MB
Covers basic numerical and graphical summaries with practical examples, from University of Washington.
❀5πŸ‘2πŸ‘1
Relational DB Vs Graph DB by BigData Specialist.pdf
4.5 MB
This is our latest post from Instagram, saved as PDF.

It's a comprehensive breakdown (as always) explaining the difference between Relational DB and Graph DB in a fun and easy-to-grasp way.

⚠️ Spoiler alert: You will love it!

Here's our Instagram post: Relational DB Vs Graph DB
❀6πŸ‘2
Regression Analysis Cheatsheet
❀5
Linear Regression.pdf
834.6 KB
Covers basics of Linear Regression for modeling numerical data, including assumptions and applications in genetics, from University of Washington.
❀5
πŸ“š Data Science Riddle

In a real-world NLP project, your model performs poorly on new slang abbreviations. What's the fix?
Anonymous Quiz
7%
Add more layers
72%
Use contextual embeddings like BERT
13%
Tune dropout
9%
Increase token length
❀1
Top 6 Data Concepts
❀5
πŸ“š Data Science Riddle

A data engineer complains that your model training job is failing in production due to schema mismatch. What's the root fix?
Anonymous Quiz
13%
Cast data types in code
15%
Skip invalid rows
22%
Retrain with old schema
51%
Use a schema registry
K-Means Clustering
❀4
Covariance vs. Correlation: Same Family, Different Story

People use them interchangeably, but they measure different things.

Covariance tells you the direction of a relationship (positive or negative), but its magnitude depends on the units of the variables.
Correlation goes further: it tells you the strength, normalized between -1 and 1.

So while covariance might be 2345.67, correlation says 0.92: clear, interpretable, scale-free.
Covariance shows movement; correlation shows consistency.
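
To make the scale point concrete, here is a minimal sketch in plain Python. The height and weight numbers are made up for illustration, and the helper names `covariance` and `correlation` are our own, not from any library. Switching heights from metres to centimetres should multiply the covariance by 100 while leaving the correlation unchanged.

```python
import math

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: the same people measured in metres and in centimetres.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
heights_cm = [h * 100 for h in heights_m]
weights_kg = [52.0, 58.0, 69.0, 77.0, 88.0]

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
corr_m = correlation(heights_m, weights_kg)
corr_cm = correlation(heights_cm, weights_kg)

# Covariance changes with the unit; correlation does not.
print(f"covariance  (m vs kg):  {cov_m:.3f}")
print(f"covariance  (cm vs kg): {cov_cm:.3f}")
print(f"correlation (m vs kg):  {corr_m:.3f}")
print(f"correlation (cm vs kg): {corr_cm:.3f}")
```

Both covariances are positive, so they agree on direction, but only the correlation tells you how tight the relationship is without knowing the units.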
❀5πŸ‘1
πŸ“š Data Science Riddle

You're Processing a dataset with frequent schema evolution. Which format handles it most gracefully?
Anonymous Quiz
10%
ORC
14%
Avro
56%
CSV
20%
Parquet
❀3
Eigenvalues & Eigenvectors β€” Why PCA Actually Works

You’ve heard of PCA. But what’s really happening underneath?

PCA finds the directions (vectors) where your data varies the most.

Those directions are the eigenvectors of the covariance matrix, and the eigenvalues tell you how much variance each one captures.

You're basically rotating your data to find its "natural axes."

PCA isn't just compression; it's discovering how your data wants to be seen.
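
Here is a rough sketch of that idea for 2-D data, in plain Python with no libraries. The points are invented for illustration, and the helper names (`covariance_matrix_2d`, `eigen_symmetric_2x2`) are our own. It uses the closed-form eigendecomposition of a symmetric 2x2 matrix; for real datasets you would more likely reach for a linear algebra library.

```python
import math

def covariance_matrix_2d(points):
    """Return (sxx, sxy, syy) of the sample covariance matrix and the mean."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    return (sxx, sxy, syy), (mean_x, mean_y)

def eigen_symmetric_2x2(sxx, sxy, syy):
    """Eigenvalues (largest first) and unit eigenvectors of [[sxx, sxy], [sxy, syy]]."""
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    eigenvalues = (half_trace + radius, half_trace - radius)
    # Angle of the axis along which the variance is largest.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    return eigenvalues, (v1, v2)

def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

# Hypothetical 2-D data lying roughly along a diagonal.
points = [(2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1), (6.0, 6.0), (7.0, 6.8)]

(sxx, sxy, syy), (mean_x, mean_y) = covariance_matrix_2d(points)
eigenvalues, (v1, v2) = eigen_symmetric_2x2(sxx, sxy, syy)

# Rotate the centred data onto its "natural axes" (the eigenvectors).
rotated = [
    ((x - mean_x) * v1[0] + (y - mean_y) * v1[1],
     (x - mean_x) * v2[0] + (y - mean_y) * v2[1])
    for x, y in points
]

# The variance along each new axis should match the corresponding eigenvalue.
variance_pc1 = sample_variance([r[0] for r in rotated])
variance_pc2 = sample_variance([r[1] for r in rotated])
explained = [ev / sum(eigenvalues) for ev in eigenvalues]

print("first principal direction:", v1)
print("eigenvalues:", eigenvalues)
print("variance along new axes:", (variance_pc1, variance_pc2))
print("share of variance explained:", explained)
```

Keeping only the first rotated coordinate is where the "compression" comes in: you drop the axis whose eigenvalue says it carries little variance.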
❀6πŸ‘2
πŸ“š Data Science Riddle

Your spark job fails due to executor memory pressure. Most effective optimization?
Anonymous Quiz
14%
Broadcast variables
27%
Larger cluster
41%
More shuffle partitions
18%
Persist fewer objects
BigDataAnalytics-Lecture.pdf
10.2 MB
Notes on HDFS, MapReduce, YARN, Hadoop vs. traditional systems and much more... from Columbia University.
❀7
πŸ“š Data Science Riddle

You fit a forecasting model and residuals show increasing variance. What is needed?
Anonymous Quiz
21%
Differencing
47%
Smoothing
26%
Decomposition
7%
Box-Cox
4 Pillars of Data Science
AI vs Machine Learning vs Deep Learning vs Generative AI
❀5
2025/12/10 13:12:27