The developer ecosystem anticipates that Llama 3 will receive an infusion of high-quality training data, perhaps akin to the data used for Phi 1.5, to catapult its performance to new heights. Excitement is high about a potential expansion of tokens and further exploration of the scaling laws. Another hot topic is the concept of Mixture-of-Architecture, a statistical approach poised to address the shortcomings of parametric architecture, possibly surpassing individual experts or submodels.
Llama 3 is also anticipated to introduce multimodal capabilities into the open-source arena. Meta stands ready to leverage the ecosystem of multimodal models that has grown up around it, a domain populated by the likes of mPLUG-Owl, LLaVA, MiniGPT-4, and BLIP-2, several of them built on LLaMA or its derivatives.
Besides fighting its own battle, LLaMA also has to meet the expectations of the open-source AI community, which now relies heavily on it. The open-source LLM leaderboard is filled with models fine-tuned on LLaMA: at least six of the top entries are LLaMA-based, including Uni-TianYan, FashionGPT, sheep-duck, Orca, and the GenZ models.
However, the Llama 3 delay has sent the open-source community into panic mode, because a late release would leave the community lagging behind. At this crucial stage, Meta can’t afford to say “better late than never”; it has to focus on “now or never”.
Read the full story here.
OpenAI’s Race to the Finish
Google has been hyping Gemini for a while now, but OpenAI recently stole the spotlight by revealing plans to integrate DALL-E 3 with ChatGPT Plus and ChatGPT Enterprise. This strategic move positions GPT-4 as the first functional multimodal model, generating both text and images, which is exactly what Gemini promised. Google responded by extending Bard's capabilities, allowing image uploads through Lens and incorporating Search images into responses in an attempt to go multimodal. However, OpenAI's DALL-E-integrated ChatGPT Plus, set to launch in October, poses a significant challenge.
OpenAI's move doesn't just affect Google. It also puts pressure on other text-to-image generation models such as Midjourney and Stable Diffusion, as DALL-E 3 has demonstrated its image generation prowess.
Read the full story here.
Pixels at War
Which is better, DALL-E or Midjourney? The discussion has once again taken centre stage with the announcement that DALL-E 3 will be integrated into ChatGPT Plus and ChatGPT Enterprise. Users who tested DALL-E 3 found that it outperformed Midjourney, offering superior image quality and prompt coherence. Because DALL-E 3 accepts simple text prompts, users can hold a natural conversation with the chatbot to arrive at precise image output. In contrast, Midjourney, while versatile and feature-rich for image creation and editing, is primarily Discord-based, which makes it harder to access.
Read the full story here.
Microsoft’s Cutting-edge Tool