artificial intelligence

AI Startup Hugging Face is Building Small LMs for ‘Next Stage Robotics’


AI startup Hugging Face envisions that small—not large—language models will be used for applications including “next stage robotics,” its Co-Founder and Chief Science Officer Thomas Wolf said.

“We want to deploy models in robots that are smarter, so we can start having robots that are not only on assembly lines, but also in the wild,” Wolf said while speaking at Web Summit in Lisbon today. But that goal, he said, requires low latency. “You cannot wait two seconds so that your robots understand what’s happening, and the only way we can do that is through a small language model,” Wolf added.

Small language models “can do a lot of the tasks we thought only large models could do,” Wolf said, adding that they can also be deployed on-device. “If you think about this kind of game changer, you can have them running on your laptop,” he said. “You can have them running even on your smartphone in the future.”

Ultimately, he envisions small language models running “in almost every tool or appliance that we have, just like today, our fridge is connected to the internet.”

The firm released its SmolLM language model earlier this year. “We are not the only one,” said Wolf, adding that, “Almost every open source company has been releasing smaller and smaller models this year.”

He explained that, “For a lot of very interesting tasks that we need that we could automate with AI, we don’t need to have a model that can solve the Riemann conjecture or general relativity.” Instead, simple tasks such as data wrangling, image processing and speech can be performed using small language models, with corresponding benefits in speed.

The performance of a 1 billion parameter Llama model this year is “equivalent, if not better than, the performance of a 10 billion parameters model of last year,” he said. “So you have a 10 times smaller model that can reach roughly similar performance.”

“A lot of the knowledge we discovered for our large language model can actually be translated to smaller models,” Wolf said. He explained that the firm trains them on “very specific data sets” that are “slightly simpler, with some form of adaptation that’s tailored for this model.”

Those adaptations include “very tiny, tiny neural nets that you put inside the small model,” he said. “And you have an even smaller model that you add into it and that specializes,” a process he likened to “putting a hat for a specific task that you’re gonna do. I put my cooking hat on, and I’m a cook.”
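Wolf did not name a specific technique, but the “hat” he describes resembles what is commonly called an adapter. As a rough illustration only, the sketch below assumes a low-rank adapter: a frozen base layer is shared across tasks, and a small pair of matrices is added on top for one task. The class name, task name, and all numbers are hypothetical and do not come from Hugging Face.

```python
from typing import Dict, List, Optional, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class AdaptedLinear:
    """A frozen base layer plus optional low-rank task adapters ("hats")."""

    def __init__(self, base_weight: Matrix) -> None:
        self.base_weight = base_weight  # shared weights, never modified
        self.adapters: Dict[str, Tuple[Matrix, Matrix]] = {}

    def add_adapter(self, task: str, down: Matrix, up: Matrix) -> None:
        # down has shape (rank x in_features), up has shape (out_features x rank)
        self.adapters[task] = (down, up)

    def forward(self, x: Vector, task: Optional[str] = None) -> Vector:
        y = matvec(self.base_weight, x)
        if task is not None:
            # Put the task's "hat" on: add the small correction up @ (down @ x)
            down, up = self.adapters[task]
            delta = matvec(up, matvec(down, x))
            y = [a + b for a, b in zip(y, delta)]
        return y


# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
layer = AdaptedLinear([[1.0, 0.0], [0.0, 1.0]])
layer.add_adapter("cooking", down=[[1.0, 1.0]], up=[[0.5], [-0.5]])

plain = layer.forward([2.0, 4.0])                    # base model only
cooking = layer.forward([2.0, 4.0], task="cooking")  # base model + "cooking hat"
```

The toy layer is too small to show any savings. At a more realistic size, a 1,000 by 1,000 layer holds 1,000,000 weights, while a rank-8 adapter for it adds only 8 × 1,000 + 1,000 × 8 = 16,000, which is why several such “hats” can be stored and swapped cheaply.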

In the future, Wolf said, the AI space will split across two main trends.

“On the one hand, we’ll have this huge frontier model that will keep getting bigger, because the ultimate goal is to do things that human cannot do, like new scientific discoveries,” he said of LLMs. On the other hand, the long tail of AI applications will see the technology “embedded a bit everywhere, like we have today with the internet.”

Edited by Stacy Elliott.

Generally Intelligent Newsletter

A weekly AI journey narrated by Gen, a generative AI model.




artificial intelligence

Zuckerberg Knowingly Used Pirated Data to Train Meta AI, Authors Allege


Mark Zuckerberg approved using pirated books to train Meta AI, even after his own team warned the material was illegally obtained, a group of authors allege in a recent court filing.

The allegations come from a copyright infringement lawsuit filed by a group of authors including the comedian Sarah Silverman, Christopher Golden, and Richard Kadrey in a California federal court in July 2023. The group claimed Meta misused their books to train its Llama LLM, and they’re asking for damages and an injunction to stop Meta from using their works. The judge in the case dismissed most of the authors’ claims in November of that same year, but these recent allegations may breathe new life into the legal dispute.

“Meta’s CEO, Mark Zuckerberg, approved Meta’s use of the LibGen dataset notwithstanding concerns within Meta’s AI executive team (and others at Meta) that LibGen is ‘a dataset we know to be pirated,’” lawyers for the plaintiffs said in a Wednesday filing. Despite these red flags, the lawsuit alleges that, “after escalation,” Zuckerberg gave the green light for Meta’s AI team to proceed with using the controversial dataset.

Representatives for Meta did not immediately respond to Decrypt’s request for comment.

LibGen, short for Library Genesis, is an online platform that provides free access to books, academic papers, articles, and other written publications without properly abiding by copyright laws. It operates as a “shadow library,” offering these materials without authorization from publishers or copyright holders. It currently hosts over 33 million books and over 85 million articles.

The lawsuit alleges Meta tried to keep this under wraps until the last possible moment. Just two hours before the fact discovery deadline on December 13, 2024, the company dumped what plaintiffs describe as “some of the most incriminating internal documents it has produced to date.”

Meta’s own engineers seemed uncomfortable with the plan, according to statements in court filings. The group of authors allege internal messages show Meta engineers hesitated to download the pirated material, with one noting that “torrenting from a [Meta-owned] corporate laptop doesn’t feel right (smile emoji).” Nevertheless, they proceeded to not only download the books but also systematically strip out copyright information to prepare them for AI training, the lawsuit claims.

The latest filings in the lawsuit paint a picture of a company fully aware of the risks: One internal memo warned that “media coverage suggesting we have used a dataset we know to be pirated, such as LibGen, may undermine our negotiating position with regulators.” Yet Meta went ahead anyway, both downloading and distributing (or “seeding”) the pirated content through torrenting networks by January 2024, according to the lawsuit.

When questioned about these activities in a deposition, Zuckerberg appeared to distance himself from the decision, testifying that such piracy would raise “lots of red flags” and “seems like a bad thing.”

The court documents also suggest that Meta’s approach to handling copyrighted information paid more attention to model training than copyright rules. According to the filing, one engineer “filtered […] copyright lines and other data out of LibGen to prepare a CMI-stripped version of it to train Llama.” This systematic removal of copyright information could strengthen the authors’ claims that Meta knowingly tried to hide its use of pirated materials.

The revelations come at a crucial time for Meta’s AI ambitions. The company has been pushing hard to compete with OpenAI and Google in the AI space, with Llama 3.2 being the most popular open source LLM, and Meta AI being a solid free competitor to ChatGPT with similar features.

Most of these AI companies are facing legal battles over questionable practices in training their large language models. Meta was already sued by another group of authors for copyright infringement, OpenAI is facing several lawsuits for training its LLMs on copyrighted material, and Anthropic is also facing accusations from authors and songwriters.

More broadly, tech entrepreneurs and creators have been at odds ever since generative AI exploded in popularity. There are currently dozens of lawsuits against AI companies for knowingly using copyrighted material to train their models. As with most things on the bleeding edge, we’ll have to wait and see what the courts have to say about it all.

artificial intelligence

The Most Eye-Catching and Absurd AI Products Unveiled at CES 2025 So Far


As CES 2025 unfolds, one thing is clear—artificial intelligence is everywhere.

From TVs to vacuum cleaners, consumer electronics companies are racing to showcase new AI features, sometimes shoehorned into their products.

Some of these AI-powered products are impressive, while others stretch the meaning of “artificial intelligence” to its limits.

Here’s a look at some of the most eye-catching and occasionally absurd AI-powered products at CES so far.

The Roborock Saros Z70: The robot vacuum claw machine

What It Does:

Developed by Roborock, the Saros Z70 is a robot vacuum equipped with a robotic arm that moves small objects out of the way while cleaning. It has a charging station and can lift up to 300 grams, or 0.66 pounds.

Why It’s Absurd:

It’s cool, but the Saros Z70 is limited by its size: the claw is only good for small, lightweight objects such as small toys, lightweight shoes and socks. Beyond that, what’s the point?

Unless the Saros Z70 can carefully deposit those objects in a basket like a carnival claw machine, you’re left with slightly rearranged clutter. Fun? Yes. Practical? That’s debatable.

The SwitchBot K20+ Pro: The Swiss Army Knife of robot vacuums

What It Does:

The SwitchBot K20+ Pro is another autonomous household robot. It isn’t just a vacuum—it’s an all-in-one home helper.

This robot can carry a humidifier, maneuver between rooms, and even collect floating pet hair from the air. Need to cool down at night? Attach a fan. Want a drink delivered? Add a shelf and let it roll your snacks around.

Why It Stands Out:

Its versatility is impressive. Unlike standard robot vacuums, the K20+ Pro feels more like a quirky butler on wheels.

It’s playful and genuinely useful—if you’re into the idea of your vacuum multitasking as a drink coaster.

Samsung Vision AI: AI for your TV

What It Does:

Samsung’s Vision AI is part of its evolving “SmartThings” ecosystem.

Samsung’s Smart TVs now integrate AI to recognize their surroundings, adjust to user preferences, and offer generative AI features like creating digital art for wallpapers and screen savers and providing real-time subtitle translation during live broadcasts.

Why It’s Absurd:

While the tech sounds fancy, AI-generated wallpapers and live translation feel more like marketing gimmicks than necessities.

Plus, the more connected your TV is to other smart appliances, the bigger the cybersecurity risk. Do we really need another entry point for hackers in our living rooms, this time powered by AI?

Omnia Smart Mirror: Your reflection and health hub

What It Does:

Making the rounds at CES, the Omnia Smart Mirror by Withings is a smart mirror that provides AI-driven insights and tracks health metrics. The Omnia Smart Mirror also acts as a smart scale, heart rate monitor, and AI assistant in one, offering real-time health data directly from your reflection.

Why It Stands Out:

The Omnia Smart Mirror stands out by reimagining the mirror as a health tool, similar to the Tonal workout station, where personal health metrics are clearly displayed. Adding to its appeal is the option to track weight, cardio, body composition, and sleep patterns…if it ever launches.

LeafyPod: The self-watering planter that thinks for you

What It Does:

LeafyPod is an AI-powered, self-watering smart planter that makes plant care effortless.

The LeafyPod is equipped with sensors that monitor soil moisture, light, temperature, and humidity, and it automatically adjusts watering schedules to suit your plant’s needs.

Why It Stands Out:

By automating plant care, LeafyPod will appeal to those who want green spaces but lack a green thumb. It ensures plants receive optimal care without constant attention.

The LeafyPod’s water reservoir can hold enough water to last up to four weeks, and a mobile app lets users monitor their plants and the surrounding environment.

AFEELA by Sony Honda Mobility: The intelligent EV

What It Does:

A collaboration between Sony and Honda, the Afeela is an electric car that blends advanced AI and sensor technology to elevate the driving experience.

Equipped with 40 sensors, including cameras, LiDAR, radar, and ultrasonic units, the Afeela offers automated driving assistance and immersive in-car entertainment.

Why It Stands Out:

One of its most distinctive features is in the cabin, where the driver can control in-car functions using natural voice prompts with the Afeela “Personal Agent” and receive activity suggestions.

Views and maps on the onboard display use Epic Games’ Unreal Engine, which hints at future features that could see the Afeela becoming not only a driving experience but also an entertainment hub.

While this is only a small sample of the innovations being unveiled at CES, it shows the AI arms race is still very much alive and well.

Edited by Sebastian Sinclair


AI

‘Hype Cycle’ To Last Another Four Months for This Altcoin Sector, According to Real Vision Analyst Jamie Coutts


Real Vision’s chief digital assets analyst Jamie Coutts says that a nascent but soaring crypto sector could continue its upward trend for a few more months.

Coutts tells his 32,100 followers on the social media platform X that he thinks crypto artificial intelligence (AI) agents will continue to perform well in the coming months.

Crypto AI agents are protocols built to autonomously perform tasks on behalf of users such as interacting with blockchains and decentralized finance (DeFi) platforms, trading and managing portfolios.

Says Coutts,

“The last big crypto hype cycle was from November 2020 to May 2021, around six months. Subsectors like DeFi, NFTs (non-fungible tokens) around six-12 months.

Interest in AI agents in crypto took off in November 2024. Based on history, this trend is expected to last at least another four months, but probably longer.

AI agents are not like the others – they unlock potential for every established and new use case.”

Source: Jamie Coutts/X

The Real Vision analyst, however, says that crypto AI agents could face a severe correction after reaching the cycle top.

“There will be many scams (tread carefully/position size), and as with every hype cycle, the dump will be massive, but I suspect this move still has a way to go.”

According to the cryptocurrency data aggregator CoinGecko, some of the AI-focused crypto projects that rank among the top 100 digital assets by market cap include Artificial Superintelligence Alliance (FET), Virtuals Protocol (VIRTUAL) and ai16z (AI16Z).

Artificial Superintelligence Alliance is a merger of various decentralized AI platforms whose goal is to speed up the advancement of decentralized Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI).

Virtuals Protocol is a platform that aims to enable the co-ownership of AI agents.

Meanwhile, the ai16z crypto project is designed to leverage AI-driven insights to direct investments in blockchain projects.
