DeepSeek is not the only Chinese open source LLM making waves. In fact, the leaderboard is trending with many China-based open source LLMs, such as Tigerbot-70b-chat-v2, Yi-34B, Qwen-72B and the smaller Qwen-1.8B. Tigerbot-70b-chat-v2, developed by Tiger Research and available on GitHub and Hugging Face, is built on top of the Llama 2 70B architecture.
Chinese tech giant Alibaba has been a leader in the open source LLM category, riding on Qwen-72B, which was trained on 3 trillion tokens and supports a 32K context length. Its smaller version, Qwen-1.8B, can run on only 3GB of GPU memory. Qwen is also the name of the organisation building LLMs, large multimodal models (LMMs) and other AGI-related projects.
These Chinese open source developments have sent the US, which has been trying to slow China's AI growth through various sanctions, into a frenzy.
China's display of strength in open source underscores its self-sufficiency in AI development, challenging any notion of dependence on the US. China's rise to prominence in this field becomes more likely if the US continues to stifle its own open source ecosystem with regulations. For now, as China persists in open sourcing robust AI models, there appears to be no imminent threat.
Read the full story here.
All Roads Lead to TSMC
What’s the common thread between Apple, Qualcomm, NVIDIA, Google, Microsoft and AWS? They all go to one manufacturer for all their chip manufacturing needs — Taiwan Semiconductor Manufacturing Company or TSMC.
Apple, NVIDIA and Qualcomm are long-standing TSMC customers. TSMC has also added new clients such as Google, AWS and Microsoft, consolidating its position in the market. AWS recently announced new generations of its AWS-designed chips, AWS Graviton4 and AWS Trainium2, which will be manufactured by TSMC. Microsoft’s newly announced Microsoft Azure Maia 100 AI Accelerator and Azure Cobalt 100 CPU will also be manufactured by the Taiwanese company.
Read the full story here.
End-Game for Insta Influencers
Alibaba's Institute for Intelligent Computing has introduced 'Animate Anyone', a character animation technology that transforms static images into dynamic character videos. The framework, which uses diffusion models, tackles the challenges of maintaining temporal consistency and preserving detailed information during image-to-video conversion.
The approach incorporates ReferenceNet to merge features from reference images while preserving appearance intricacies. An efficient pose guide ensures controlled character movements, addressing controllability and continuity challenges. The technology's diffusion model, overcoming hurdles in preserving intricate details during transitions, could pose a threat to short-video content creators on platforms like Instagram and TikTok. A public release is anticipated as the team transitions from an academic prototype to a user-friendly version.
Read the full story here.
The Delay Further Delayed