The new problem for American AI startups: plenty of money, but not enough data

2023-07-06 11:32:56

Article source: Silicon Publishing

Author: Lynn Yang

Over time, the place where value is created in the US artificial intelligence industry has quietly shifted.

According to a new report from The Wall Street Journal, generative AI startups that are raising billions of dollars may already be failing if they don't have the right data.

Brad Svrluga, co-founder and general partner of venture capital firm Primary Venture Partners, noted:

"We've seen a lot of companies that may be looking for great AI applications, but they don't have access to the data that would allow them to build powerful applications, let alone proprietary data that could help them."

In other words: **when building the model itself has become akin to buying a commodity, the real value shifts to the data.** Having the right data is perhaps more important than ever.

(one)

The logic here is this: many AI startups currently hope to build niche AI models for specialized fields such as finance or healthcare, but because they lack brand recognition and an established reputation, it is not easy for them to obtain training datasets for vertical industries.

In this regard, large companies may have an advantage, **because they have already won the trust of large customers in how they handle data.**

For example, according to the Wall Street Journal report, Ernst & Young holds a large amount of transaction data from around the world, and generative AI startups come knocking on its door every day. But EY Global is worried: what happens if it uses its own proprietary data to train external models?

“Who owns this data? When we train a model, what is our access to this model? How else can others use this model? The data is part of the intellectual property that we bring.” EY Global pointed out.

One countermeasure to IP concerns like these: startups can train a separate model for each customer, using only that customer's data.

For example, TermSheet uses this strategy for its Ethan product, a generative AI model that can answer industry questions for real estate developers, brokers and investors. But Roger Smith, CEO of TermSheet, also said that even when customers agree to this approach, it still takes some education and convincing.

**In addition, cybersecurity concerns are another reason major client companies are reluctant to choose startups.**

For example, Tracey Daniels, chief data officer of financial services company Truist, said that when it comes to data security, the company trusts larger suppliers, so it is exploring generative AI applications only with large technology suppliers rather than startups.

**Third, in some cases large customers in vertical industries will even demand huge sums of money or company equity from generative AI startups.**

For example, Veesual, a generative AI company that generates images of people trying on clothes, initially trained on public images from the Internet, but it failed for this reason when trying to get big retailers to hand over their data to improve the model.

**Fourth, in some cases it is technically difficult to achieve.**

For example, PatentPal, a generative AI startup that helps law firms draft patent applications, was trained on published patent applications. It has the opportunity to continue training its models on encrypted or anonymized feedback from real customers, making its tools even more accurate. But this process is complex, because the feedback must be kept separate from highly sensitive and confidential data, including trade secrets.

**At the same time, however, the race among generative AI startups has heated up.**

In terms of capital, according to PitchBook data quoted by the Wall Street Journal, venture capital funding for generative AI startups grew from $4.8 billion in all of 2022 to $12.7 billion in the first five months of this year.

As a result, there has been increasing pressure on generative AI startups to secure access to more data in certain niche markets.

Adam Struck, founder and managing partner of Struck Capital, noted: **Startups are racing against each other to secure more data in certain niche markets.**

“If you believe there is a proprietary dataset, you want to get it before they do, and then, negotiate exclusivity. In that sense, it becomes almost an arms race,” he said.

(two)

Interestingly, this situation makes me think: **the market really does seem to lack a public marketplace for trading data.**

In fact, back in 2018, or perhaps as early as 2017, a friend of mine at Netflix, the American streaming media company, told me about his startup idea: building a public data marketplace. However, no one has yet found a suitable product form, including a way to get companies to hand over their data voluntarily.

From this perspective, one piece of news from the past two days deserves close attention: OpenAI is considering launching a marketplace.

Notably, this comes after ChatGPT's plug-in plan nearly failed. According to US media reports:

OpenAI is considering launching a marketplace to allow customers to sell their customized AI models to other companies. In other words: This marketplace will provide businesses with a way to access cutting-edge large language models and host fine-tuned versions of OpenAI models built by customers. ...

The remainder of this article addresses the following questions:

Why is OpenAI considering launching a marketplace?
Could this marketplace open up data sharing and trading between companies?
