Data Science Weekly - Issue 645
Curated news, articles and jobs related to Data Science, AI, & Machine Learning
Data Science Articles & Videos
Seeing like a spreadsheet
You cannot really understand the transformation of the American economy over the last few decades without understanding the spreadsheet. This is a story about how a piece of software transformed the way that American businesses understood themselves, and how they were understood by others; how it enabled the rise of financial engineering and the entire apparatus of Wall Street dealmaking; how it helped reshape the American corporation from an organization that built things into an organization that optimized numbers; and how it offers us a lesson, and a warning, about how artificial intelligence will transform economic life…
Bayesian Statistics Future Relevance [Reddit]
I have been interested in Bayesian statistics for a long time and would like to do some research in it on an applied project. However, I was wondering how relevant it is/is going to be in the future? Genuinely asking as I have no idea. I am interested in doing some work in advanced Bayesian hierarchical models. Would doing more stuff in ML/AI be more beneficial for trying to get a job in industry or is Bayesian work going to be sought after?…
Machine Learning Visualized
A book of Jupyter notebooks that implement and mathematically derive machine learning algorithms from first principles. The output of each notebook is a visualization of the machine learning algorithm throughout its training phase, ultimately converging at its optimal weights…
AI badly needs a dose of skepticism - Some scientists are too eager to believe their own claims
I explain some of my reasons for being deeply skeptical about AI models that claim to understand DNA, genes, and genomes…
How to Build a Scalable Serverless Social Media Ingestion & Analytics Pipeline on AWS
In this post, we explain how to build a scalable, cost-efficient, and serverless data pipeline on AWS to ingest, process, and visualize social media data…
Regulating AI Agents
Promulgated prior to the development and widespread use of AI agents, the European Union’s AI Act faces significant obstacles in confronting the governance challenges arising from this transformative technology, such as performance failures in autonomous task execution, the risk of misuse of agents by malicious actors, and unequal access to the economic opportunities afforded by AI agents. We systematically analyze the EU AI Act’s response to these challenges, focusing on both the substantive provisions of the regulation and, crucially, the institutional frameworks that aim to support its implementation…
Autograd and Mutation
How does PyTorch autograd deal with mutation? In particular, what happens when a mutation occurs on a view, which aliases with some other tensor? In 2017, Sam Gross implemented support for in-place operations on views, but the details have never been described in plain English… until now…
genStats: Standardizing Experimentation Analysis at Scale
CarGurus runs hundreds of A/B tests annually to improve every aspect of our marketplace. Because the car-buying journey is complex and touches every part of our platform, we validate product changes through qualitative insights (consumer interviews) and quantitative experimentation (A/B tests)… genStats is an internal framework orchestrated in Python that standardizes A/B test analysis across CarGurus. It automates notebook generation, centralizes statistical logic, and packages domain-specific metrics into reusable “metric families,” addressing the limitations listed above. A Metric Family is a collection of standardized queries. Analysts contribute the queries for the metrics they specifically own. This approach codifies domain expertise, turning individual knowledge into a shared tool…
Modern Probability Modeling - A Tools Approach
These are notes for my class on probability models. In these notes, I walk through the concepts and computations that support modern probability modeling in political science using both maximum likelihood and Bayesian approaches…
bananarama
bananarama generates presentation images using Google Gemini. Define your images in a YAML configuration file with support for reference images and style defaults. Using Gemini to generate slide images usually means a tedious loop of copying prompts into a web UI, downloading images, tweaking, and repeating. bananarama makes this process reproducible: your prompts live in version-controlled YAML, and regenerating every image is a single function call. No more copy-paste, no more losing track of which prompt produced which image. It also generates all images in parallel, making it wicked fast to generate a full deck of images…
Learn Claude Code by doing, not reading.
11 interactive modules with terminal simulators, config builders, and quizzes. No setup required…
Is Hip-Hop in Decline? A Statistical Analysis
Today we’ll explore hip-hop’s meteoric rise and supposed fall, the genres gaining traction in recent years, and the forces that shape music popularity…
Introduction to Online Control
This text presents an introduction to an emerging paradigm in control of dynamical systems and differentiable reinforcement learning called online nonstochastic control. The new approach applies techniques from online convex optimization and convex relaxations to obtain new methods with provable guarantees for classical settings in optimal and robust control…
Whenever you're ready, 3 ways we can help:
Go deeper each week (paid subscription)
Get 3 additional posts per week designed to help you:
Statistics → understand the math behind ML
AI Agents → build with modern AI tools
Career → become more valuable at your job
Looking to get a job?
A practical guide to landing your first (or next) data science role, based on thousands of reader questions.
👉 Check out our “Get A Data Science Job” Course
Promote your organization/project/event to ~68,000 subscribers
Sponsor this newsletter and reach a highly engaged data science audience (30–35% open rate).
👉 Reply to this email to learn more
Thank you for joining us this week! :)
Stay Data Science-y!
All our best,
Hannah & Sebastian

