The new problem for American AI startups: plenty of money, but not enough data
Article source: Silicon Publishing
Author: Lynn Yang
According to a new report from The Wall Street Journal, generative AI startups that are raising billions of dollars may already be failing if they don't have the right data.
Brad Svrluga, co-founder and general partner of venture capital firm Primary Venture Partners, noted:
In other words: **when building the actual model has become akin to a commodity that can be purchased, the real value becomes the data.** Having the right data is perhaps more important than ever.
(one)
The logic here is this: many AI startups currently hope to build niche AI models for specialized fields such as finance or healthcare, but because they lack brand recognition and public trust, it is not easy for them to obtain training datasets for vertical industries.
In this regard, large companies may have an advantage, **because they have already won the trust of large customers in how they handle data.**
For example, according to the Wall Street Journal report, Ernst & Young holds a large amount of transaction data from around the world, and generative AI startups come knocking on its door every day. But EY Global is worried: what will happen if it uses its own proprietary data to train external models?
“Who owns this data? When we train a model, what is our access to this model? How else can others use this model? The data is part of the intellectual property that we bring.” EY Global pointed out.
One countermeasure to this kind of IP problem: startups can train a separate model for each customer, using only that customer's data.
For example, TermSheet uses this strategy to build its Ethan product, a generative AI model that can answer industry questions for real estate developers, brokers and investors. But Roger Smith, CEO of TermSheet, also said that even when customers agree to this approach, it still takes some education and convincing.
**In addition, cybersecurity concerns are another reason why large client companies are reluctant to choose startups.**
For example, Tracey Daniels, chief data officer of financial services company Truist, said that on data security the company trusts larger vendors, so it is exploring generative AI applications only with large technology vendors rather than startups.
**Third, in some cases large customers in vertical industries even require generative AI startups to hand over huge sums of money or company equity.**
For example, Veesual, a generative AI company that generates images of people trying on clothes, initially trained on public images from the Internet, but its attempts to get big retailers to hand over their data to improve the model failed for these reasons.
**Fourth, in some cases it is technically difficult to achieve.**
For example, PatentPal, a generative AI startup that helps law firms draft patent applications, has trained its model on published patent applications. It has the opportunity to keep training the model on encrypted or anonymized feedback from real customers, which would make its tools even more accurate. But this process is complex, because the feedback must be kept separate from highly sensitive and confidential data, including trade secrets.
**At the same time, however, the race among generative AI startups has heated up.**
As a result, there has been increasing pressure on generative AI startups to secure access to more data in certain niche markets.
Adam Struck, founder and managing partner of Struck Capital, noted: **Startups are racing against each other to secure more data in certain niche markets.**
“If you believe there is a proprietary dataset, you want to get it before they do, and then, negotiate exclusivity. In that sense, it becomes almost an arms race,” he said.
(two)
Interestingly, this situation makes me think: **what the market really seems to lack is a public marketplace for trading data.**
In fact, back in 2018 or perhaps 2017, a friend of mine at Netflix, the American streaming media company, told me about his startup idea: to build a public data marketplace. However, there is still no suitable product form for it, including a way to get companies to hand over their data voluntarily.
From this perspective, one piece of news from the past two days deserves close attention: OpenAI is considering launching a marketplace.
Notably, this comes after ChatGPT's plug-in plan nearly failed. According to US media reports:
OpenAI is considering launching a marketplace to allow customers to sell their customized AI models to other companies. In other words: This marketplace will provide businesses with a way to access cutting-edge large language models and host fine-tuned versions of OpenAI models built by customers. ...
The remainder of this article mainly addresses the following questions:
Why is OpenAI considering launching a marketplace?
Could this marketplace open up data sharing and data transactions between companies?