📚 Data Science Riddle

You're building a chatbot but it gives generic answers. What's the root issue?
Anonymous Quiz
10%
Model is too deep
66%
Training data lacks context
11%
Wrong loss function
13%
Poor tokenization
Cheatsheet: Imbalanced Data In Classification
The Data Analyst Cheatsheet
📚 Data Science Riddle

Model Accuracy improves after dropping half the features. Why?
Anonymous Quiz
13%
Model became smaller
70%
Overfitting reduced
12%
Data size shrank
6%
Training faster
Understanding the Forecast Statistics and Four Moments (4P).pdf
181.8 KB
Statistical Moments (M1, M2) for Data Analysis

Here are 5 curated PDFs diving into the mean (M1), variance (M2), and their applications in crafting research questions and sourcing data.
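
For a quick feel of the two moments before opening the PDFs, here is a minimal sketch using only Python's standard library. It assumes M2 means the population variance (dividing by n rather than n - 1); the helper name `first_two_moments` and the sample values are made up for illustration.

```python
import statistics


def first_two_moments(values):
    """Return (M1, M2): the mean and the population variance of `values`."""
    m1 = statistics.fmean(values)             # first moment: the mean
    m2 = statistics.pvariance(values, mu=m1)  # second central moment: variance
    return m1, m2


# Illustrative sample, not taken from the PDFs.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
m1, m2 = first_two_moments(sample)
print(m1, m2)  # 5.0 4.0
```

Swap `pvariance` for `statistics.variance` if you want the sample variance instead.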

A channel member requested resources on this topic and we delivered.

If you have a topic you want resources on, let us know and we'll make it happen!

@datascience_bds
Excel Vs SQL Vs Python
Basic SQL Commands
📚 Data Science Riddle

Why do we use Batch Normalization?
Anonymous Quiz
31%
Speeds up training
40%
Prevents overfitting
9%
Adds non-linearity
20%
Reduces dataset size
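
The usual textbook answer is that Batch Normalization speeds up and stabilizes training, with a mild regularizing side effect. As a rough sketch of what the layer computes, here is the training-time forward pass for a single feature in plain Python. The function name `batch_norm` and the default `gamma`, `beta` and `eps` values are illustrative assumptions; real frameworks also track running statistics for inference, which this sketch leaves out.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift it."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased (population) variance
    # eps keeps the division stable when the batch variance is close to zero.
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


# Illustrative activations for one feature over a batch of four examples.
activations = [10.0, 20.0, 30.0, 40.0]
normalized = batch_norm(activations)
print(normalized)
```

After the call, the values have roughly zero mean and unit variance, so the next layer sees inputs on a consistent scale whatever the raw activations looked like.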
❀3
LLM Cheatsheet
📚 Data Science Riddle

Your object detection model misses small objects. Easiest fix?
Anonymous Quiz
24%
Use larger input images
27%
Add more classes
34%
Reduce learning rate
15%
Train longer
🤖 AI that creates AI: ASI-ARCH finds 106 new SOTA architectures

ASI-ARCH is an experimental ASI that autonomously researches and designs neural nets. It hypothesizes, codes, trains, and tests models.

💡 Scale:
1,773 experiments → 20,000+ GPU-hours.
Stage 1 (20M params, 1B tokens): 1,350 candidates beat DeltaNet.
Stage 2 (340M params): 400 models → 106 SOTA winners.
Top 5 trained on 15B tokens vs Mamba2 & Gated DeltaNet.

📊 Results:
PathGateFusionNet: 48.51 avg (Mamba2: 47.84, Gated DeltaNet: 47.32).
BoolQ: 60.58 vs 60.12 (Gated DeltaNet).
Consistent gains across tasks.

🔍 Insights:
Prefers proven tools (gating, convs), refines them iteratively.
Ideas come from: 51.7% literature, 38.2% self-analysis, 10.1% originality.
SOTA share: self-analysis ↑ to 44.8%, literature ↓ to 48.6%.

@datascience_bds
🚀 Databricks Tip: REPLACE vs MERGE

When updating Delta tables, you've got two powerful options:

🔹 REPLACE TABLE … ON
📚 Like throwing away the entire library and rebuilding it.
- Drops the old table & recreates it.
- Schema + data = fully replaced.
- ⚡ Super fast but destructive (old data gone).
- ✅ Best for full refreshes or schema changes.

🔹 MERGE
📖 Like updating only the books that changed.
- Works row by row.
- Updates, inserts, or deletes specific records.
- Preserves unchanged data.
- ✅ Best for incremental updates or CDC (Change Data Capture).

⚖️ Key Difference
- REPLACE = Start fresh with a new table.
- MERGE = Surgically update rows without losing the rest.

👉 Rule of thumb:
Use REPLACE for full rebuilds,
Use MERGE for incremental upserts.
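
To make the difference concrete, here is a toy sketch in plain Python that models a table as a dict keyed by `id`. It only illustrates the semantics described above and is not Databricks or Delta Lake code; the helper names `replace_table` and `merge_table` and the sample rows are made up for this example.

```python
def replace_table(_old_rows, new_rows):
    """Full rebuild: ignore the old contents and keep only the new rows."""
    return {row["id"]: dict(row) for row in new_rows}


def merge_table(target, updates, deleted_ids=()):
    """Upsert: update matching ids, insert new ids, keep everything else."""
    merged = {key: dict(row) for key, row in target.items()}
    for row in updates:
        merged[row["id"]] = dict(row)  # update if matched, insert if not
    for key in deleted_ids:
        merged.pop(key, None)          # optional targeted deletes
    return merged


# Hypothetical "library" table and an incoming batch of changes.
table = {1: {"id": 1, "title": "Dune"}, 2: {"id": 2, "title": "Emma"}}
changes = [{"id": 2, "title": "Emma (2nd ed.)"}, {"id": 3, "title": "Ulysses"}]

replaced = replace_table(table, changes)
merged = merge_table(table, changes)
print(sorted(replaced))  # [2, 3]
print(sorted(merged))    # [1, 2, 3]
```

Row 1 survives only in the merged result, which is the "preserves unchanged data" point in practice.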

#Databricks #DeltaLake
2025/10/22 13:01:13