To begin with, AI now tackles tasks once reserved for doctors, lawyers, consultants, and even analysts. For example, companies like Mercor design assignments in law, medicine, finance and management that mimic real high-value work. They pay experienced professionals to create those tasks and to judge AI output against professional standards. Mercor rolled out 200 “knowledge work” tasks to test whether models can do jobs that pay real money. To build each scenario, it hired professionals with 7+ years’ experience, people who once worked at top banks, hospitals and firms. These experts aim to push AI past simple pattern matching and into nuanced reasoning.
In initial results, GPT-4o scored about 35.9 percent on those tasks. GPT-5 then surged to 64.2 percent, a leap of roughly 28 percentage points but still short of human-level performance. Models fully aced only two tasks, both relying mainly on basic reasoning or information lookup. Many of the other assignments demand judgment calls, domain knowledge and adaptability.
Even if a model hits a perfect score on the Mercor benchmark, it might still fail in real life. The tests emphasize fixed deliverables; they don’t capture messy, open-ended work or the creative intuition humans bring. Creating the prompts that guide the model often requires more effort than doing the work itself.
AI’s growth into professional domains signals major disruption ahead. Tasks once thought safe from automation are under threat. Businesses may deploy hybrid teams, where humans validate or enhance AI output. Role definitions will change, not disappear.
Data annotation and content curation remain foundational in this transition. Large language models depend on vast curated datasets, and annotators still ground the training process, albeit now in more specialized roles.
Ultimately, the question isn’t whether AI will replace certain jobs but how humans will adapt. Those who can frame problems, ensure ethical use, interpret AI output, and handle ambiguity will stay ahead. The rise of AI in white-collar work demands rethinking how we train talent, reward expertise, and design collaboration between human and machine.