ISSUE 436 · May 23, 2023

Trends

Google "We Have No Moat & Neither Does OpenAI"
Google and OpenAI have enormous resources for training big models, but there are a lot of ways for smaller projects to thrive. For starters, data quality scales better than data size. This is a leaked internal Google document that's been out for two weeks already, but it keeps showing up in my inbox and it's a good read.

Sponsored Link

Webinar | Is the Modern Data Stack Enterprise Ready?
How should enterprise companies think about the modern data stack today? Join Mode's next webinar with folks from Gartner, VMware and Mode for a fireside chat on the topic. RSVP here.

Tutorials & Opinions

Linear Algebra
Gilbert Strang, the legendary math teacher at MIT, retired this week after teaching for 61 years. His Linear Algebra course, in particular, is worthwhile, and the lecture videos and course materials are still available online for free.

The Basics of Python Packaging in Early 2023
This how-to guide walks through modern Python packaging standards and explains how to decide which tools are most appropriate for you.

CLI & version control for ELT
Data teams can now fetch data from anywhere, send data anywhere, and transform data their way. Download open source Meltano.

Tools & Code

privateGPT
Here's a new Python package that gives you the power of LLMs for interacting with private and/or proprietary documents. privateGPT is 100% private and no data leaves your environment. It has a permissive Apache 2.0 license and is built with LangChain, GPT4All and LlamaCpp.

whyqd: simplicity, transparency, speed
whyqd is a Python-based toolkit that makes it easy to create schema-to-schema "crosswalks" for restructuring messy data to conform to a standardized metadata schema. Here's how that's useful.

Julia 1.9 Highlights
Julia 1.9 introduces significant improvements in time-to-first-execution, memory management, and features for package authors. Key highlights include native code caching, package extensions, heap snapshot capabilities, and an interactive thread pool for task prioritization.

Resources

Tidy Finance with R
This comprehensive text brings together all the tools the authors wished they had at the beginning of their graduate studies in finance. It features chapters on data organization, empirical asset pricing, linear models, portfolio optimization, and more. Free to read online.

The World Is Built On Probability
This two-part book introduces probability from the perspective that all modern life is based on probability. It is a thought-provoking book that was originally written in 1984 and was recently updated and released for the web. Read online or follow the links to download.

Data Visualization

Collection Space Navigator
The Collection Space Navigator is an interactive visualization tool for working with large image collections and their multidimensional representations. It provides customizable projections and a variety of filters that make it easy to get slices of high-dimensional space. There's a lot to explore here, including a paper, code, and a demo.
Tuesday, May 16, 2023
Data Elixir - Issue 436