🤖 Multi-Label Text Classification with Scikit-LLM
🦾Plus: 🚨 U.S. Bans Mythos & Fable

Hey folks! Let’s get into Big Data and AI craziness…
In today's edition: Here’s a roundup of sharp shifts in agents, models, policy, and capital 👇
🧾10 ChatGPT Projects case studies Cheat Sheet
🔄LLM agents still break when the world keeps changing
💼UC Berkeley tests whether AI agents can do real paid work
🚨China Drops Open-Source GLM-5.2 and Crushes Benchmarks
🎓 AI Just Made 12,000 Degrees Useless
💡 AI Tutorial: How to use fewer tokens when dealing with PDFs in Claude
🤖 AI Tools and Data Tools to check out

In this article, you will learn how to perform multi-label text classification using large language models and the scikit-LLM library, without labeled training data or complex model training. It covers what multi-label classification is and why it matters for nuanced text analysis, and how to set up and configure scikit-LLM with a free, open-source LLM from Groq for zero-shot inference.
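To make the "multi-label" idea concrete, here is a minimal sketch in plain Python. It does not call scikit-LLM or any model. The label set, the review text, and the predicted labels are all hypothetical, chosen only to show that one text can carry several labels at once and how that set is often encoded as a 0/1 vector.

```python
# Hypothetical candidate labels, for illustration only.
CANDIDATE_LABELS = ["billing", "shipping", "product quality", "customer service"]


def to_multi_hot(predicted, candidates):
    """Encode a set of predicted labels as a 0/1 vector aligned with `candidates`."""
    unknown = set(predicted) - set(candidates)
    if unknown:
        raise ValueError(f"labels not in candidate set: {sorted(unknown)}")
    return [1 if label in predicted else 0 for label in candidates]


review = "The package arrived late and support never answered my emails."

# Assumption: this is the kind of label set a zero-shot classifier might return.
predicted_labels = {"shipping", "customer service"}

vector = to_multi_hot(predicted_labels, CANDIDATE_LABELS)
print(review)
print(vector)  # one position per candidate label; several can be 1 at once
```

A single-label classifier would have to pick only one of these categories. The multi-label setup keeps both, which is why it suits nuanced text such as reviews or support tickets.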
The docs platform Anthropic, HubSpot, and Coinbase trust
Your documentation is a product decision. The companies building the most-used developer tools in the world chose Mintlify because great docs reduce support load, accelerate adoption, and convert evaluators into customers. Backed by a16z and Salesforce Ventures, Mintlify powers 20,000+ companies with AI-native documentation that keeps pace with your product — without pulling engineers off the roadmap to maintain it.

Here are 10 curated, hands-on ChatGPT projects to boost data science workflows across ML, NLP, and full-stack development, including links to full project details.

EvoArena from National University of Singapore, MIT, University of Washington, and others tests agents in dynamic environments where tasks, tools, and preferences evolve over time. It introduces EvoMem, a patch-based memory system that tracks how the environment changes. Key result: current agents average only 39.6% accuracy, showing that reliable agents need memory that can update with the real world.
How do companies like Netflix, Airbnb, and DoorDash apply machine learning to improve their products and processes? We put together a database of 200 case studies from 64 companies that share practical ML use cases and lessons from designing ML systems.

Agents’ Last Exam from UC Berkeley and Dawn Song’s team introduces a benchmark for long-horizon, real-world tasks with clear outcomes. It covers 1,000+ tasks across 13 industry clusters, built with input from 250+ industry experts. Big finding: current AI agents still struggle hard, with the toughest tier showing an average full pass rate below 1%.

Two days ago, the US banned Anthropic's Claude Fable 5. The immediate response? China’s Zhipu AI stepped up and released GLM-5.2 as a fully open-source model, and it is already dominating the charts. Today, GLM-5.2 hit #1 on Bridgebench, scoring a perfect 100.0 on BS and 42.8 on Reasoning, officially beating the now-banned Fable 5. It runs at a blistering 300 tokens per second at just 1/10th the cost of its closed-source rivals.
Fix that. Live. With Clay + HubSpot.
Defining your ICP on vibes is a pipeline killer. In Build Your GTM Alpha, Clay + HubSpot for Startups walk you through a live build. Real prospect list. Real enrichment. Real outreach sequence. You don't leave with a plan. You leave with outbound running. June 18. 11am ET / 4pm GMT.
👨‍💻 Data Tools, Libraries
kit (GitHub Repo)
kit is a toolkit for codebase mapping, symbol extraction, code search, and building LLM-powered developer tools, agents, and workflows. It can be used to build things like code reviewers, code generators, and IDEs.
Styleframe (Website)
Styleframe's powerful TypeScript CSS API helps developers compose design systems in minutes.
Sintra: The AI "dream team" for overwhelmed operators. Access 12+ specialized agents that handle marketing, sales, ops, and much more for you - no prompting experience required.
AI News:

Anthropic just pulled its two most powerful AI models, the newly released Mythos and Fable 5, worldwide after the Trump administration ordered it to block all foreign access — citing a reported jailbreak the company called minor. The U.S. implemented an “export control directive” requiring Anthropic to remove access for all non-U.S. citizens, even those within the country.
Your creative brief is due Friday. Viktor wrote it Tuesday.
Tell him the campaign. Viktor pulls last quarter's performance from Meta and TikTok, scrapes competitor ads, drafts the brief, posts it for review. You edit, he ships the creative requests to your designer. Inside Slack.

College majors might not survive the AI shift, and China just proved it. China revoked or suspended 12,200 degree programs and added 10,200 across four years. Saturated arts and language fields gave way to AI, robotics and brain computer majors. The redesign ties degrees straight to Beijing's future industries strategy.

Elon Musk became the world's first trillionaire on Friday after shares of SpaceX rose 20% from their initial public offering price of $135. Musk was already the world's richest person, claiming the title from Jeff Bezos in 2021 with a net worth of over $185 billion. His worth is now equivalent to more than 3% of the US gross domestic product. It is five million times as large as that of the typical US family.

Satya Nadella calls a frontier without an ecosystem unstable and sees a new cognitive loop forming between people and machines. Run private evaluations and reinforcement learning on your own data, he says, and a model keeps the judgment your people built. Swap in a newer generalist model and the veteran expertise stays. He calls that loop a hill climbing machine that sharpens with every use.

Meta employees reportedly don't consider the company a happy place to work, especially given the seemingly endless layoffs it has carried out over the last few years. The company's Applied AI team is reportedly on the verge of revolt. One of the group's live-streamed, employee-only presentations was interrupted earlier this week by an expletive-laden meltdown targeting a senior Meta AI executive.
AI Tutorial
How to use fewer tokens when dealing with PDFs in Claude
You do not need any extra tools to reduce tokens in Claude. A simple method using what you already have can clean your PDF and make it easier to process. Here is how:

Step 1: Open the PDF and copy the text.
Step 2: Paste it into Google Docs.
Step 3: Click on “File”.
Step 4: Select “Download”, then choose Markdown (.md).
Step 5: Upload the Markdown file to Claude and start prompting.
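If you prefer a script to the Google Docs round trip, the same idea (strip layout noise before uploading) can be sketched in a few lines of Python. This is not part of the method above, only an illustration under stated assumptions: the function names are hypothetical, and the 4-characters-per-token figure is a rough rule of thumb, not Claude's actual tokenizer.

```python
import re


def clean_pdf_text(raw: str) -> str:
    """Remove common layout noise from text copied out of a PDF."""
    text = raw.replace("\r\n", "\n")
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Drop spaces that sit directly before or after a newline.
    text = re.sub(r" *\n *", "\n", text)
    # Keep at most one blank line between paragraphs.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def rough_token_estimate(text: str) -> int:
    """Assumption: about 4 characters per token. A heuristic only."""
    return len(text) // 4


raw = "Big   data   pipe-\nline\n\n\n\nNext   page  "
cleaned = clean_pdf_text(raw)
print(repr(cleaned))
print(rough_token_estimate(raw), "->", rough_token_estimate(cleaned))
```

One caveat: the hyphen rule also joins words that are legitimately hyphenated and happen to break at a line end, so skim the output before relying on it.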
🔥Top AI tools to increase productivity:
Thor.ai captures the blockers, decisions, and commitments that never make it into your tools — straight from Slack threads and meetings
AirRankPilot helps local businesses get discovered by Google and AI tools like ChatGPT and Perplexity
InboxKit is your all-in-one platform for building, managing, and scaling cold-email infrastructure.
CCPayment is a cryptocurrency payment platform that lets merchants accept payments and send payouts
HuePress is a SaaS platform providing therapy-grade, high-quality printable coloring pages
Creatives Takeover is an AI platform that turns raw founder ideas into business plans
Pixel‑exact AI image generator composing scenes on fixed canvases
View our database of all the best AI tools for your needs: aitoolsup.com
Have cool resources to share? Submit AI tool
A.I. Generated Image of the Day
👀 Volcano in your drawing room

SPONSOR US
Get your product in front of Big Data & AI enthusiasts
😮 The Marketing Channel You've Probably Slept On:
Haven't tried newsletter sponsorships yet? You are missing out on a HUGE ROI+ customer acquisition channel. I know because dozens of advertisers keep coming back for more... Run a test campaign with us and see for yourself 👉 Get in touch today.