Tag: notes

  • OpenAI co-founder calls for AI labs to safety-test rival models

    OpenAI and Anthropic, two of the world’s leading AI labs, briefly opened up their closely guarded AI models to allow for joint safety testing — a rare cross-lab collaboration at a time of fierce competition. The effort aimed to surface blind spots in each company’s internal evaluations and demonstrate how leading AI companies can work together on safety and alignment work in the future.

    In an interview with TechCrunch, OpenAI co-founder Wojciech Zaremba said this kind of collaboration is increasingly important now that AI is entering a “consequential” stage of development, where AI models are used by millions of people every day.

    “There’s a broader question of how the industry sets a standard for safety and collaboration, despite the billions of dollars invested, as well as the war for talent, users, and the best products,” said Zaremba.

    The joint safety research, published Wednesday by both companies, arrives amid an arms race among leading AI labs like OpenAI and Anthropic, where billion-dollar data center bets and $100 million compensation packages for top researchers have become table stakes. Some experts warn that the intensity of product competition could pressure companies to cut corners on safety in the rush to build more powerful systems.

    To make this research possible, OpenAI and Anthropic granted each other special API access to versions of their AI models with fewer safeguards (OpenAI notes that GPT-5 was not tested because it hadn’t been released yet). Shortly after the research was conducted, however, Anthropic revoked the API access of another team at OpenAI. At the time, Anthropic claimed that OpenAI violated its terms of service, which prohibits using Claude to improve competing products.

    Zaremba says the events were unrelated and that he expects competition to stay fierce even as AI safety teams try to work together. Nicholas Carlini, a safety researcher with Anthropic, tells TechCrunch that he would like to continue allowing OpenAI safety researchers to access Claude models in the future.

    “We want to increase collaboration wherever it’s possible across the safety frontier, and try to make this something that happens more regularly,” said Carlini.

    One of the starkest findings in the study relates to hallucination testing. Anthropic’s Claude Opus 4 and Sonnet 4 models refused to answer up to 70% of questions when they were unsure of the correct answer, instead offering responses like, “I don’t have reliable information.” Meanwhile, OpenAI’s o3 and o4-mini models refused to answer far less often, but showed much higher hallucination rates, attempting to answer questions when they didn’t have enough information.

    Zaremba says the right balance is likely somewhere in the middle — OpenAI’s models should refuse to answer more questions, while Anthropic’s models should probably attempt to offer more answers.
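    The tradeoff Zaremba describes can be made concrete with a small sketch of how a QA eval harness might tally the two metrics. The records, field names, and grading scheme below are hypothetical illustrations, not the labs' actual methodology; the key point is that hallucination rate is conventionally computed over attempted answers only, which is why a model can trade one rate for the other.

```python
# Hypothetical eval records: did the model answer, and if so, was it correct?
# (Illustrative data only, not from the actual OpenAI/Anthropic study.)
records = [
    {"answered": False, "correct": None},   # model refused ("I don't have reliable information")
    {"answered": True,  "correct": True},   # correct answer
    {"answered": True,  "correct": False},  # confident wrong answer, i.e. a hallucination
    {"answered": False, "correct": None},   # model refused
]

total = len(records)
refusals = sum(1 for r in records if not r["answered"])
attempted = [r for r in records if r["answered"]]
hallucinations = sum(1 for r in attempted if not r["correct"])

refusal_rate = refusals / total
# Hallucination rate is measured over attempted answers only, so a model
# that refuses more (Claude-style) mechanically lowers this denominator.
hallucination_rate = hallucinations / len(attempted) if attempted else 0.0

print(f"refusal rate: {refusal_rate:.0%}")
print(f"hallucination rate: {hallucination_rate:.0%}")
```

    Under this framing, "the right balance in the middle" means jointly lowering both rates rather than optimizing either one alone.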

    Sycophancy, the tendency for AI models to reinforce negative behavior in users to please them, has emerged as one of the most pressing safety concerns around AI models.

    In Anthropic’s research report, the company identified examples of “extreme” sycophancy in GPT-4.1 and Claude Opus 4 — in which the models initially pushed back on psychotic or manic behavior, but later validated some concerning decisions. In other AI models from OpenAI and Anthropic, researchers observed lower levels of sycophancy.

    On Tuesday, parents of a 16-year-old boy, Adam Raine, filed a lawsuit against OpenAI, claiming that ChatGPT (specifically a version powered by GPT-4o) offered their son advice that aided in his suicide, rather than pushing back on his suicidal thoughts. The lawsuit suggests this may be the latest example of AI chatbot sycophancy contributing to tragic outcomes.

    “It’s hard to imagine how difficult this is to their family,” said Zaremba when asked about the incident. “It would be a sad story if we build AI that solves all these complex PhD level problems, invents new science, and at the same time, we have people with mental health problems as a consequence of interacting with it. This is a dystopian future that I’m not excited about.”

    In a blog post, OpenAI says that GPT-5 is significantly less sycophantic than GPT-4o, claiming the model is better at responding to mental health emergencies.

    Moving forward, Zaremba and Carlini say they would like Anthropic and OpenAI to collaborate more on safety testing, looking into more subjects and testing future models, and they hope other AI labs will follow their collaborative approach.

    Update 2:00pm PT: This article was updated to include additional research from Anthropic that was not initially made available to TechCrunch ahead of publication.

    Got a sensitive tip or confidential documents? We’re reporting on the inner workings of the AI industry — from the companies shaping its future to the people impacted by their decisions. Reach out to Rebecca Bellan at rebecca.bellan@techcrunch.com and Maxwell Zeff at maxwell.zeff@techcrunch.com. For secure communication, you can contact us via Signal at @rebeccabellan.491 and @mzeff.88.

  • Pilot union urges FAA to reject Rainmaker’s drone cloud-seeding plan

    Rainmaker Technology’s bid to deploy cloud-seeding flares on small drones is being met by resistance from the airline pilots union, which has urged the Federal Aviation Administration to consider denying the startup’s request unless it meets stricter safety guidelines.

    The FAA’s decision will signal how the regulator views weather modification by unmanned aerial systems going forward. Rainmaker’s bet on small drones hangs in the balance.

    The Air Line Pilots Association (ALPA) told the FAA that Rainmaker’s petition “fails to demonstrate an equivalent level of safety” and poses “an extreme safety risk.”

    However, Rainmaker CEO Augustus Doricko said in an email that all of the union’s objections are based only on the public notice, rather than on the non-public documents submitted to the FAA that outline all of the company’s safety data and risk mitigations.

    Rainmaker is seeking an exemption from rules that bar small drones from carrying hazardous materials. The startup filed in July, and the FAA has yet to rule. Instead, it issued a follow-up request for information, pressing for specifics on operations and safety.

    In its filing, Rainmaker proposed using two flare types, one “burn-in-place” and the other ejectable, on its Elijah quadcopter, to disperse particles that stimulate precipitation. Elijah has a maximum altitude of 15,000 feet MSL (above mean sea level), which sits inside controlled airspace where commercial airliners routinely fly. Drones need permission from air traffic control to fly inside this bubble.

    Rainmaker’s petition says it will operate in Class G (uncontrolled) airspace unless otherwise authorized. ALPA notes the filing doesn’t clearly state where flights would occur or what altitudes would be used. However, Doricko said the documents submitted to the FAA disclosed that in addition to the flights being constrained to a max altitude of 15,000 feet MSL, they will be conducted in airspace that is predetermined to be safe by aviation authorities, “voiding any reasonable concern about high altitude flight or airspace coordination.” ALPA did not reply to TechCrunch’s requests for comment. 

    The union also objects to the flares themselves, citing concerns about foreign object debris and fire safety. ALPA points out that the petition does not include trajectory modeling of the ejectable casings or analysis on the environmental impacts of chemical agents.

    “Regarding their objection to the use of flares, independent bodies like this administration’s EPA and multiple state departments of natural resources have studied the dispersion and environmental safety of materials used in cloud seeding for over 70 years and never found any adverse effect from cloud seeding,” Doricko said.

    Sam Kim, Rainmaker’s aviation regulatory manager, said the company respects the pilots’ union and hopes to “continue to strengthen our relationship with the organization,” but claimed the objection “shows a lack of understanding of why Rainmaker has filed for this exemption.”

    “Our use of flares in unmanned systems is solely for research purposes in a controlled flying environment and is not a part of our larger ongoing operations,” Kim added.

    Doricko said that a typical Rainmaker operation disperses 50-100 grams of silver iodide, and far less than that in a flight with flares, while one hour of flight of a commercial plane releases kilograms of uncombusted volatile organics, sulfur oxides, and soot – significantly more material than a Rainmaker op.

    “Rainmaker is interested in doing the best, responsible atmospheric research and is thus comparing flares to our proprietary aerosol dispersion system that will replace flares and exclusively emit silver iodide. ALPA’s objection to this exemplifies their limited understanding of our CONOP, all of which contains extensive risk mitigations in the non-public docs that the FAA is reviewing now,” Doricko said.

    “Regarding ALPA’s concerns about coordination with aviation authorities and airspace, our flight operations consist of broadcasting signals, intentional coordination with local ATC, certified pilots, and a collision avoidance system that involves electronic and physical observers,” he said.

    Rainmaker also says the flights will occur over rural areas and over properties owned by private landowners “with whom Rainmaker has developed close working relationships.”

    Cloud-seeding already happens today, largely in the western U.S., with crewed airplanes flown in coordination with state agencies. Ski resorts commission the operations to help keep their runs white, and irrigation and water districts fly them to build snowpack in the winter to help feed their reservoirs during the spring melt.

    The general practice of cloud seeding dates back to the 1950s. By spraying small particles into certain clouds, scientists found they could induce precipitation. Typically, cloud-seeding operations use silver iodide particles, mostly because their crystal structure mimics that of ice.

    When a silver iodide particle bumps into a super-cooled water droplet (one whose water is already below the freezing point), it causes the droplet to freeze rapidly. Once the ice crystal forms, it can grow quickly if conditions are right, faster than a liquid water droplet would in similar circumstances. The rapid growth also helps the crystals stick around longer than a water droplet, which might evaporate before it has a chance to fall as precipitation.

    Rainmaker’s twist — doing this work with drones instead of pilots — could prove safer in the longer term. The company points out that the flight profiles are tightly bounded, overseen by a remote pilot and trained crews, over rural areas, with other safety checks in place.

    What happens next hinges on whether the FAA thinks those mitigations are sufficient. However it’s decided, the agency’s response will likely set the tone for novel cloud-seeding approaches.

    9/13/2025: The story has been updated to include Rainmaker’s comments from Augustus Doricko, founder and CEO, and Sam Kim, Rainmaker’s aviation regulatory manager.

  • A History Of American Recessions

    The official designation of a recession comes from a committee at the National Bureau of Economic Research (NBER), a private, nonprofit research organization.

    The committee considers a wide range of economy-wide, monthly data points, but the NBER views GDP as “the single best measure.”

    The committee calls a recession once there is a significant decline across these measures for more than a few months.

    The NBER’s official designation of a recession, then, doesn’t happen until there are several months of data, allowing it to be sure both that a recession happened and when exactly it started.

    In other words, as Voronoi notes, the NBER looks backward, not at the present moment.

    Source: Voronoi

    Using this measure, here are a few insights:

    • From 1855 to 2020, recessions lasted an average of 17 months. In the 20th and 21st centuries, the average has fallen to 14 months.

    • The US’s longest recession lasted 65 months, from October 1873 to March 1879.

    • The US has gone through 13 recessions since WWII.

    • The longest recession since WWII was the Great Recession.

    • The shortest US recession was during COVID-19, lasting from February to April 2020.

    • Although economic struggles marked the entire 1930s, the NBER-defined recession of the Great Depression lasted from September 1929 to March 1933.

    In other words… there used to be more ‘official’ recessions.
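    Durations like those above fall out directly from peak and trough dates, since the NBER chronology works in whole months. A minimal sketch, using only the episodes and dates already cited here plus the NBER's published dating of the Great Recession (December 2007 to June 2009):

```python
from datetime import date

# A small illustrative subset of NBER peak/trough dates, covering the
# recessions mentioned above (not the full chronology).
recessions = [
    ("Long Depression", date(1873, 10, 1), date(1879, 3, 1)),
    ("Great Recession", date(2007, 12, 1), date(2009, 6, 1)),
    ("COVID-19",        date(2020, 2, 1),  date(2020, 4, 1)),
]

def months_between(start, end):
    # Whole-month duration, matching the granularity of the NBER chronology.
    return (end.year - start.year) * 12 + (end.month - start.month)

for name, start, end in recessions:
    print(f"{name}: {months_between(start, end)} months")

avg = sum(months_between(s, e) for _, s, e in recessions) / len(recessions)
print(f"average over this subset: {avg:.1f} months")
```

    Running the same calculation over every recession in the full chronology is how averages like the 17-month figure are derived; this subset's average is skewed upward by the 65-month Long Depression.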
