Data Science Weekly - Issue 628
Curated news, articles and jobs related to Data Science, AI, & Machine Learning
Last Week’s Poll:
Data Science Articles & Videos
Real-Time Anomaly Detection with Apache Flink
A critical payment service has started to fail, but only for a small percentage of users…By the time a report flags the revenue dip tomorrow morning, the damage will be done. What you need is a system that watches the data as it flows, understands what “normal” looks like, and flags a problem the instant it deviates…The goal is to move beyond analyzing historical reports and start catching these glitches as they occur. This post is a guide to building that system. We’ll show you how to use Apache Flink to perform real-time anomaly detection…

Recently we acquired another company whose entire data stack is modern cloud: Snowflake, AWS, Git, CI/CD, onboarding systems to the cloud, etc…I’m excited but also aware that the tech jump is huge…If you were in my shoes, how would you prepare for leading a modern cloud data engineering function? Any advice from people who moved from traditional ETL into cloud data engineering would be appreciated…
Capsule: Comprehensive Reproducibility Framework for R and Bioinformatics Workflows
Capsule is a comprehensive reproducibility framework specifically designed for bioinformatics and computational biology workflows. It automatically captures your entire analysis environment and generates everything needed to reproduce your research, from Docker containers to pipeline configurations…
The Q, K, V Matrices
At the core of the attention mechanism in LLMs are three matrices: Query, Key, and Value. These matrices are how transformers actually pay attention to different parts of the input. In this write-up, we will go through the construction of these matrices from the ground up…
Writing an abstract for a lightning talk
Lightning talks are generally 5-10 minutes. As the name implies, they are quick! A good lightning talk is not just your breakout talk condensed into a shorter time frame. You can’t simply deliver the same material faster, or the same material at a higher level, or the same material with a few bits left out…
Which Songs Do We Replay the Most? A Statistical Analysis
Which songs do we listen to on repeat? And how does binge-listening behavior change with age?…
Roger Peng: Sustaining data science in classrooms, code, and conversations
Michael, Hadley, and Wes welcome Roger Peng, professor of statistics and data science at UT Austin and co-host of Not So Standard Deviations. Together they trace Roger’s journey from early R adopter to pioneering online educator and prolific podcaster. The conversation ranges from the accidental rise of “data science” as a field, to the tension between research papers and software maintenance, to what makes for meaningful, lasting creative work…
State of AI: December 2025 newsletter
What you’ve got to know in AI from the last 4 weeks…Welcome to the latest issue of the State of AI, an editorialized newsletter that covers the key developments in AI policy, research, industry, and start-ups over the last month…
What does it mean to understand language?
Language understanding entails not just extracting the surface-level meaning of the linguistic input, but constructing rich mental models of the situation it describes. Here we propose that because processing within the brain’s core language system is fundamentally limited, deeply understanding language requires exporting information from the language system to other brain regions that compute perceptual and motor representations, construct mental models, and store our world knowledge and autobiographical memories…
How we built it: Real-time analytics for Stripe Billing
We’ve developed a new, real-time streaming analytics system for Stripe Billing. Now when customers use the Stripe Dashboard to explore and visualize subscription metrics such as monthly recurring revenue (MRR) growth, churn rates, trial conversion rates, and more, they’re getting data that reflects any new subscription activity with latency as low as 15 minutes…We’ll explore how we built each of these functionalities, the engineering challenges involved in them, and how they work together to create a fast, flexible, and reliable real-time analytics platform…

A data visualization library for Python that combines the grammar of graphics from ggplot2 with the interactivity of Plotly…
What worked for you for job search? [Reddit]
So I am trying to switch after 2 years of experience in DS. Not getting enough calls…Can anyone share some strategies that helped them getting interview calls?
Some Options for Fast Matrix Decompositions in R
This short post shows some different options for speeding up matrix decompositions in R, which can speed up a wide variety of common statistical functions…
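The blurb does not say which options the post covers, so the following is only a generic illustration of why the choice of decomposition matters, written in Python rather than R and not taken from the post. It fits a least-squares regression by solving the normal equations with a Cholesky factorization, which for a symmetric positive-definite matrix takes roughly half the arithmetic of a general LU factorization. The function names (cholesky, solve_cholesky, ols) are hypothetical, and in practice you would call a LAPACK-backed routine instead of hand-written loops.

```python
import math


def cholesky(a):
    """Return lower-triangular L such that L * L^T == a.

    Assumes `a` is a symmetric positive-definite matrix given as a list of rows.
    """
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(d)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


def solve_cholesky(a, b):
    """Solve a x = b for symmetric positive-definite a via two triangular solves."""
    lower = cholesky(a)
    n = len(b)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    # Back substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x


def ols(design, response):
    """Least-squares coefficients from the normal equations (X^T X) beta = X^T y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, response)) for i in range(p)]
    return solve_cholesky(xtx, xty)


if __name__ == "__main__":
    # Points on the line y = 1 + 2x; the first column is the intercept term.
    design = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
    response = [1.0, 3.0, 5.0, 7.0]
    print(ols(design, response))
```

The normal equations trade some numerical stability for speed, so a QR-based solve is usually the safer choice when the design matrix is ill-conditioned.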
Last Week's Newsletter's 3 Most Clicked Links
* Based on unique clicks.
** Please take a look at last week's issue #627 here.
Cutting Room Floor
Whenever you're ready, 2 ways we can help:
Looking to get a job? Check out our “Get A Data Science Job” Course
It is a comprehensive course that teaches you everything you need to know about getting a data science job, based on answers to thousands of reader emails like yours. The course has three sections: Section 1 covers how to get started, Section 2 covers how to assemble a portfolio to showcase your experience (even if you don’t have any), and Section 3 covers how to write your resume.
Promote yourself/organization to ~68,750 subscribers by sponsoring this newsletter. 30-35% weekly open rate.
Thank you for joining us this week! :)
Stay Data Science-y!
All our best,
Hannah & Sebastian

