
Thursday, 7 September 2023

XGBoost is the Secret of ML Energy

 

When an ML model has to deal with tabular data, XGBoost (Extreme Gradient Boosting) is often what energises its predictive performance and computational speed. XGBoost is a tree-based ensemble machine learning algorithm known for strong predictive performance. For finding patterns in a labelled dataset of features, it is often a better fit than LLMs.


XGBoost is worth considering for any supervised learning task with a substantial number of training examples. It is particularly effective when the dataset mixes categorical and numeric features, or when it contains only numeric features.
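To make "tree-based ensemble" concrete, here is a minimal sketch of plain gradient boosting with one-split trees (stumps) in pure Python. It is not XGBoost itself, which adds regularisation, second-order gradients, and many engineering optimisations. The toy table, the column meanings (age, a 0/1 membership flag), and all function names are assumptions made up for illustration.

```python
# Minimal gradient boosting for regression with depth-1 trees (stumps).
# Illustration only: this is NOT the XGBoost implementation.

def fit_stump(X, residuals):
    """Find the single feature/threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put rows on both sides
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    if best is None:
        # No valid split exists: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value

def fit_boosted(X, y, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a damped copy."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate

def predict_boosted(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy table: [age, is_member], where is_member is a
# categorical feature encoded as 0/1. The target is made-up spend.
X = [[22, 0], [25, 1], [35, 0], [40, 1], [28, 0], [50, 1], [33, 1], [45, 0]]
y = [0, 10, 5, 15, 0, 15, 15, 5]

model = fit_boosted(X, y)
mse_baseline = mean_squared_error(y, [sum(y) / len(y)] * len(y))
mse_boosted = mean_squared_error(y, [predict_boosted(model, row) for row in X])
print("baseline MSE:", mse_baseline)
print("boosted MSE:", mse_boosted)
```

Each stump is a weak learner on its own; the ensemble gets its accuracy from adding many of them, each correcting the errors left by the previous rounds.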


A question that often arises here is: why choose XGBoost when you have LLMs? In fact, data scientists who work with tabular data are deeply divided when it comes to choosing between XGBoost, LightGBM, and LLMs.


LLMs can classify tabular data with minimal preprocessing, though at the expense of time. Approaches for applying LLMs to tabular data, such as prompt engineering, are being explored but are still in the early stages of development.


Instead of relying solely on textual outputs, the focus is shifting towards using the internal embeddings generated by LLMs, known as latent structure embeddings. These embeddings can be integrated into traditional tabular models like XGBoost. 
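As a rough sketch of what "integrating embeddings into a tabular model" can look like, the snippet below appends an embedding vector to each row's ordinary features before the row goes to a tree-based model. `embed_text` is a hypothetical stand-in, not a real LLM API: it derives a deterministic pseudo-embedding from a hash so the example runs without any model. The column names and rows are also invented for illustration.

```python
import hashlib

def embed_text(text, dim=4):
    """Hypothetical stand-in for an LLM embedding call (assumption).

    A real pipeline would return a vector taken from the model's internal
    representations; here a hash gives a deterministic placeholder vector.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]

def build_feature_row(numeric_features, text_field, dim=4):
    """Concatenate ordinary tabular features with the text embedding."""
    return list(numeric_features) + embed_text(text_field, dim)

# Made-up rows: two numeric columns plus one free-text column.
rows = [
    {"age": 34, "income": 52000.0, "note": "asked about premium plan"},
    {"age": 51, "income": 87000.0, "note": "cancelled after trial"},
]

feature_matrix = [
    build_feature_row([r["age"], r["income"]], r["note"]) for r in rows
]
for row in feature_matrix:
    print(row)
```

The resulting matrix has one column per original feature plus one per embedding dimension, and could then be passed to a model such as XGBoost like any other numeric table.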


While Transformers have revolutionised generative AI, their primary strengths remain in handling unstructured and sequential data, as well as tasks involving intricate patterns. Combining LLM embeddings with tabular models such as XGBoost is a promising step towards more versatile and efficient machine learning models.


Read the full story here.




Toolkit for Ethical AI


Developing and deploying AI ethically and responsibly is of utmost importance, and a range of toolkits is available to assist in this endeavour.

Read the full story here.




Law-breaker Llama 2


With each new development, Llama 2 is breaking many laws and emerging as a distinctive base that people are using to train their own models. The latest example is TinyLlama.


A research assistant at the Singapore University of Technology and Design has started training TinyLlama, a 1.1-billion-parameter model inspired by Llama 2. His goal is to pre-train TinyLlama on a massive dataset of 3 trillion tokens. This ambitious goal goes against the Chinchilla scaling law, which says that for compute-optimal training of a Transformer-based language model, the number of parameters and the number of training tokens should scale in approximately equal proportions.
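To see how far the plan departs from Chinchilla, the arithmetic can be sketched as follows. The figure of roughly 20 training tokens per parameter is a commonly quoted approximation of the Chinchilla result, not a number stated in this newsletter, so treat it as an assumption.

```python
# Rule of thumb often quoted from the Chinchilla work: about 20 training
# tokens per parameter for compute-optimal training (approximate; assumption).
TOKENS_PER_PARAM = 20

params = 1.1e9          # TinyLlama parameter count
planned_tokens = 3e12   # planned pre-training tokens

compute_optimal_tokens = TOKENS_PER_PARAM * params
tokens_per_param = planned_tokens / params
overshoot = planned_tokens / compute_optimal_tokens

print(f"compute-optimal tokens: {compute_optimal_tokens:.3g}")  # about 2.2e10
print(f"planned tokens per parameter: {tokens_per_param:.0f}")  # about 2727
print(f"planned / compute-optimal: {overshoot:.0f}x")           # about 136x
```

Under that assumption, a 1.1-billion-parameter model would be compute-optimal at roughly 22 billion tokens, so the 3-trillion-token plan is more than a hundred times larger.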


Read the full story here.




OpenAI-backed Startups


At the Microsoft Build 2021 conference, OpenAI CEO Sam Altman introduced the OpenAI Startup Fund, initially planning to invest $100 million to support AI startups with a positive global impact. Subsequently, in May 2023, OpenAI secured $175 million for this fund.


The fund has backed seven startups across various sectors, including robotics, law, education, and more. Of these, two are available as ChatGPT plugins, while two are not yet operational. Major beneficiaries include Speak, a language-learning app that recently raised $16 million, and Descript, an AI-powered audio and video editing tool that has garnered significant investment. Other supported ventures include Mem, Harvey, Milo, 1X, and Charles AI. OpenAI aims to further AI development by investing in startups in countries like India and South Korea.


Read the full story here.

     

TAUSIF ALAM & AMIT RAJA NAIK

Wednesday, Sep 6, 2023

     
   


© 2023 Analytics India Magazine

   
