Alibaba’s Qwen Unveils Advanced AI Models Capable of Device Control

Alibaba’s Qwen team introduces Qwen2.5-VL AI models, showcasing capabilities in text and image analysis, video comprehension, and device control, outperforming major competitors in various benchmarks.

Big news from the AI world: Alibaba’s Qwen team just dropped the Qwen2.5-VL series. Think of it as the Swiss Army knife of AI models—it’s got skills in text and image analysis, video understanding, and even bossing around your PC and phone. You can take it for a spin in the Qwen Chat app or grab it from Hugging Face. And get this—it’s been outshining heavyweights like OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 2.0 Flash in tests. Not too shabby, right?

What can it do? Well, besides making sense of charts and pulling data from those pesky scanned documents, it can watch long videos without getting bored (unlike some of us). It’s also got an eye for spotting intellectual property in movies, TV shows, and products—which kinda makes you wonder what it’s been binge-watching during training. Just remember, it plays by China’s internet rules, steering clear of anything that doesn’t vibe with socialist values.

Here’s where it gets wild: Qwen2.5-VL can operate software on your devices. There’s even a video of it booking a flight on Booking.com like it’s no big deal. Sure, it might stumble on a Linux desktop (who doesn’t?), but the potential is huge. Most of the series ships under fairly permissive licenses, though the top-tier Qwen2.5-VL-72B model asks the biggest companies to get Alibaba’s permission before using it commercially. Because, you know, with great power comes great paperwork.
