The AI Training Hub
The line between search engines and AI has vanished. Whether you are teaching a chatbot how to reason or helping a search engine understand human intent, these platforms offer the most flexible remote work available.
Disclosure: This page contains affiliate links to services I trust. If you make a purchase through these links, I may earn a small commission at no extra cost to you.
Success Toolkit: Before You Apply
High-paying AI platforms are now “expert-heavy.” They use automated filters to find top-tier writers and specialists, so a polished application matters. I recommend these specialized Fiverr services.
Tier 1: LLM & Chatbot Training
Stellar AI
Personal Experience (Nov 2024 – Present): Stellar AI is a top-paying platform for AI training. The work goes far beyond simple categorization; projects involve complex reasoning, adversarial prompting, and training “agentic” AI to navigate computer interfaces. In my generalist experience, tasks include building structured environments, executing multi-program workflows, and debugging AI logic. It’s highly engaging work, and top performers are frequently promoted to Reviewer roles.
The Flexibility Advantage: Crucially, Stellar features a pausable task timer. As a full-time caregiver, this makes it arguably the most viable independent contracting work available for my situation; it is equally perfect for stay-at-home parents or anyone dealing with limited, frequently interrupted time. You can safely pause midway through a complex task if life interrupts, which is a rare feature in this industry.
The Onboarding Process: Initial platform onboarding varies by track: the generalist “skills match” heavily tests creative writing, while experts take specialized domain exams. Either way, your application can sit unreviewed for months. Once you are accepted, however, project-specific assessments are graded much faster, sometimes in just days.
Work Availability & Gaps: Because Stellar is a newer platform than the industry giants, it doesn’t yet have a continuous, overlapping client pipeline. The result is a “feast or famine” cycle. Over my first 17 months, I experienced multiple 3-month stretches with little to no work. Generalist work dried up in late January 2025 and has been unpredictable since, favoring domain experts such as coders.
Paid Project Assessments: Unlike most platforms, Stellar pays a flat rate for taking project-specific qualification tests (note: the initial skills test is unpaid). Project scopes can vary wildly; a massive new generalist project (March 2026) is a complete anomaly, combining multiple workflows with a notoriously grueling test that can take 10 or more hours to complete. In response to rater feedback, they have simplified the exam and are paying a stipend for the attempt regardless of the outcome.
DataAnnotation.tech
Personal Experience (Sept 2023 – Aug 2024): I earned over $10,000 here while managing full-time caregiving duties. Work involves RLHF (Reinforcement Learning from Human Feedback), providing A/B comparative evaluations of AI chatbots to teach them “what a good, safe, and accurate answer looks like.” You rank responses based on reasoning, logic, and multi-criteria rubrics, identifying logic weaknesses and providing detailed written feedback.
Onboarding & Qualifications: Unlike platforms that just automatically assign you tasks, DA relies on a dedicated “Qualifications” dashboard. To access new project pools, you must voluntarily take these tests as they appear. Some pay your hourly rate, while others are unpaid. However, for many projects, you simply read the comprehensive guidelines provided and dive straight into the work.
The Un-Pausable Timers: A critical note on flexibility: the task timer is not pausable. The stated timer is a strict maximum limit, not a suggestion. If you are frequently interrupted and regularly get close to the maximum allowed time for tasks, you risk being flagged during a routine worker purge, as they analyze patterns of efficiency over time.
Strict Monitoring & Silent Firings: The platform has incredibly strict, zero-tolerance monitoring. Because iPad browsers frequently reload background tabs (deleting your unsaved work) and I was managing constant interruptions, I drafted answers in an external Notes app and pasted them in. Ultimately, I was let go, likely because pushing the time limit made me a target during a routine purge, or because pasting my own text triggered anti-bot detectors. Crucially, they never explicitly warn you against drafting in external apps, leading many to make this innocent mistake unknowingly. If flagged, you face a permanent, silent firing (the “empty dashboard”) with no explanation and no appeal process.
Outlier (Scale AI)
Outlier aggressively hires experts to grade AI reasoning. While the pay can be the highest in the industry, work is strictly project-based and subject to “Empty Queue” (EQ) droughts. They recently migrated to Discourse and use biometric tracking for high-security tasks.
Read Full Aggregated Review →
⚠️ CRITICAL: The Vendor “Ban-Hammer”
Many Tier 2 & 3 companies are vendors for the same systems. Applying to the same system through two different companies is grounds for a permanent ban.
UHRS System
Vendors: Telus, Appen, OneForma, Datavio.
Rule: Only ONE UHRS account is allowed globally. Duplicate accounts result in a permanent ban.
TryRating System
Vendors: Telus (Maps), WeLocalize (Search).
Rule: Separate from UHRS. You can often work one UHRS and one TryRating role simultaneously.
Tier 2: Long-Term Evaluation Roles
Telus Digital (Formerly Lionbridge)
Telus is the “stable giant” of search evaluation. Unlike Tier 1 platforms that act like gig apps, Telus often hires for specific roles like **Search Quality Evaluator** or **Map Quality Analyst**. These are frequently W-2 positions with strict hourly limits (usually 20 hrs/week).
The 3-Part Exam: This is the most infamous test in the industry. It covers Theoretical Knowledge, Page Quality, and Needs Met ratings. It is unpaid, open-book, and typically takes 10-15 hours to complete across a week. If you fail, you may get one retake, but re-application is often barred for months.
WeLocalize
WeLocalize is the direct competitor to Telus for the “TryRating” client (Apple/Google). If you find Telus’s onboarding too slow, WeLocalize often moves faster, specializing in ad-quality evaluation and localized search intent.
RWS (Formerly Moravia)
RWS is a massive global vendor that focuses heavily on localization and high-end data annotation. They are a great secondary option for those in regions where Telus or DataAnnotation have limited openings.
Tier 3: Project & Task Aggregators
OneForma (Centific)
OneForma is a “marketplace” for data work. You don’t have a single role; instead, you apply to a dashboard of hundreds of projects. These range from simple photo collection to long-term UHRS search evaluation. High volume, but requires constant testing to stay active.
Datavio.ai
Datavio is a smaller “All-in-One” player similar to OneForma but with less project variety. They are primarily a vendor for UHRS (Microsoft) tasks. Keep this as a backup if you get waitlisted elsewhere.
Appen (Now CrowdGen)
Once the industry leader, Appen has recently pivoted to its “CrowdGen” model. It remains a high-volume aggregator for micro-tasks and transcription, though project stability has dropped recently after losing several major tech contracts.
Specialized Boards & International
Handshake AI (joinhandshake.com/ai)
Handshake is the best “Strategic Shortcut” for students. Not only can you find AI jobs from companies like Telus here, but **Handshake hires its own internal AI Content teams**. It’s the only place where you can apply to work directly for the board’s own data quality initiatives.
Remoter.me (Image Annotation Specialty)
Remoter.me is a major player for **Bounding Box** work (training self-driving cars and medical AI). While they currently exclude the US, they are a vital resource for readers in Europe, LATAM, and Asia who want specialized visual annotation work.