ISSUE 456 · October 10, 2023

Tools & Code

PyTimeTK
PyTimeTK is a new package that aims to make time series analysis simpler, easier, and faster with Python. The package provides a variety of tools for working with time series data, including tools for data wrangling, plotting, data "augmenting", feature summarizing, date utils, and more. It is related to the R package timetk, by the same author.

FAµST
FAµST is a toolkit that factorizes any dense matrix as a product of sparse factors. Apply it to your operators to get reduced storage and faster matrix multiplication. It has been around for a while but was recently open-sourced.

Sponsored Link

Gretel trained an LLM on 500BN tokens of public tabular datasets to generate better synthetic data
You can use Gretel's Tabular LLM to create a new correlated dataset with the correct datatypes, based on an SQL schema or query, or to expand existing datasets by automatically adding columns and records or filling in gaps and null values. The LLM was trained on half a trillion tokens of public data and generates highly realistic tables while ensuring relationships between records remain consistent. It's even aware of regional differences (company names, emails, etc.). Request early access here

Posts & Tutorials

Autodiff Puzzles
This notebook contains a series of 20 self-contained puzzles for learning about derivatives in tensor libraries. While related in spirit, the puzzles are all fairly separate and can be done on their own.

Deploy Shinylive R App on GitHub Pages
Shinylive enables running Shiny applications in a web browser without needing a backend server. It was first introduced for Python in 2022, and it was recently announced for R during Posit Conf 2023 using WebR. This is a great step-by-step tutorial that shows how to deploy an R Shinylive app to GitHub Pages.
For Python, see this tutorial >>

Visualizations on Statistics and Signal Processing
An awesome collection of interactives that illustrate key concepts in statistics and signal processing.

A Beginner's Guide to Sequence Analytics in SQL
Sequence data refers to data that's arranged in a specific order. The ordering is important because it provides context that would be lost if the elements were considered individually or in a different order. This post is a practical introduction to analyzing sequence data using SQL.

Make your own space pictures with telescope data!
Pictures are just numbers on big grids, right? In his latest post, Randy Au explores the data that's produced by big telescopes like JWST. Here's how to access that data, how the data is structured, and ultimately, how to turn that data into pictures.

Trends

How AI tools could disrupt scientific publishing
For better or for worse, a world of AI-assisted writing and reviewing is going to transform the nature of the scientific paper. Here's how.

Career

Looking for Opportunities?
The Data Elixir Job Board has current listings for several data scientist positions, a senior data analyst, two ML Engineers, a staff research scientist, and a Director of AI with the Chan Zuckerberg Initiative. Positions are located around the globe and a couple are remote.

Was this email forwarded to you? Sign up here >>
Tuesday, October 10, 2023
Data Elixir - Issue 456